[Diagram: Amazon Echo → Listen → Alexa → Web Server → Alexa → Speak]

How?

Simple `POST` webhooks over HTTPS using JSON payloads

No seriously, how?

  1. Write your skill app (more on this later)
  2. Define skill in Amazon Developer Portal
    • Specify an "invocation" name
    • Define expected "intents" and "slots"
    • (Possibly create custom "slots")
    • Specify sample utterances
  3. Get certified, and enjoy!

Defining Your Skill

Amazon Developer Portal

Invocation Names

What users will say to "invoke" your app.

For example:
"Alexa, ask connect tech where the after party is Friday"

Must be multiple words,
no copyright infringing,
no proper names,
no confusion with built-in skills,
...

Read the Alexa Documentation!

Intents and Slots

What is the user trying to do?
And what data does the app need to do it?

Intents and Slots


{
  "intents": [
    {
      "intent": "StopIntent"
    },
    {
      "intent": "Schedule",
      "slots": [
        {
          "name": "Day",
          "type": "AMAZON.DATE"
        }
      ]
    }
  ]
}
    

Intents and Slots

Where is this file?

Copied and pasted into the developer portal.
(Under "Interaction Model")

ಠ_ಠ

Built-In Intents and Slot Types

Custom Slots


{
  "intents": [
    {
      "intent": "FavoriteColor",
      "slots": [
        {
          "name": "Color",
          "type": "COLOR_SLOT"
        }
      ]
    }
  ]
}
    

Custom Slot Values


red
orange
yellow
green
blue
violet
black
white
navy blue
royal purple
    

Custom Slots

Where in my code do I put those values?

Copy and paste them into the developer portal.
(Under "Interaction Model")

ಠ_ಠ

Pro Tip

Add a "config" directory to your application:

  • intents.json
  • slots.txt
  • ...

Custom Slot Values

There is a 50,000 entry limit across ALL slots in your skill!

Sample Utterances

Alexa, ask connect tech where the after party is Friday

Alexa, ask connect tech where Friday's party is

Sample Utterances

Remember your intents!


{
  "intent": "Schedule",
  "slots": [
    {
      "name": "Day",
      "type": "AMAZON.DATE"
    }
  ]
}
    

Sample Utterances


IntentName  the phrase with any {SlotNames} embedded
    

Schedule  about the after party on {Day}
Schedule  when the after party is on {Day}
Schedule  about {Day} after party
Schedule  when {Day} party is
...
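
Since the utterances and the intent schema live in separate text boxes in the portal, it is easy for them to drift apart. Here is a minimal sketch of a local sanity check, assuming you keep both in your own repo. The function name `check_utterance` and the `INTENT_SCHEMA` constant are made up for illustration; this is not an Amazon tool.

```python
import re

# Hypothetical local copy of the intent schema shown in the slides.
INTENT_SCHEMA = {
    "intents": [
        {"intent": "StopIntent"},
        {"intent": "Schedule", "slots": [{"name": "Day", "type": "AMAZON.DATE"}]},
    ]
}

# Matches {SlotName} placeholders embedded in a phrase.
SLOT_PATTERN = re.compile(r"\{(\w+)\}")


def check_utterance(line, schema):
    """Return a list of problems found in one 'IntentName phrase' line."""
    parts = line.split(None, 1)
    if len(parts) < 2:
        return ["expected 'IntentName phrase'"]
    intent_name, phrase = parts
    slots_by_intent = {
        intent["intent"]: {slot["name"] for slot in intent.get("slots", [])}
        for intent in schema["intents"]
    }
    if intent_name not in slots_by_intent:
        return [f"unknown intent: {intent_name}"]
    known = slots_by_intent[intent_name]
    return [
        f"unknown slot: {name}"
        for name in SLOT_PATTERN.findall(phrase)
        if name not in known
    ]
```

Run it over every line of your utterances file before pasting into the portal; an empty list means the line matches the schema.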
    

Sample Utterances

Where do I put these?

Copy and paste them into the developer portal.
(Under "Interaction Model")

ಠ_ಠ

(There is a ~200,000 character limit.)

Back to the beginning

Writing your skill app

Writing your skill app

  1. Accept POST request
  2. Verify request using Amazon certs, etc
  3. Process JSON request data
  4. Respond with JSON document

Wait, what language am I writing in?

Doesn't matter!

So long as you can accept POST requests over HTTPS.
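
To make "accept POST, process JSON, respond with JSON" concrete, here is a rough sketch as a plain WSGI app in Python. The names `app` and `handle_alexa_request` are placeholders, the reply is hard-coded, and request verification is left out. HTTPS is assumed to be handled by whatever sits in front of it.

```python
import json


def handle_alexa_request(body):
    """Placeholder skill logic: always answer with a fixed sentence."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": "Hello from my skill"},
            "shouldEndSession": True,
        },
    }


def app(environ, start_response):
    """WSGI entry point: accept a POST, parse the JSON, reply with JSON."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = environ["wsgi.input"].read(length)
    # Request verification would go here, before the body is trusted.
    try:
        body = json.loads(raw)
    except ValueError:
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"invalid JSON"]
    payload = json.dumps(handle_alexa_request(body)).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json;charset=UTF-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

For local experiments this could be served with `wsgiref.simple_server.make_server("", 8000, app)`; any other language with an HTTP server would do the same job.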

Verifying Requests

Alexa Documentation
  1. Server must use SSL/TLS with a certificate from a trusted CA
  2. Check `SignatureCertChainUrl` header validity
  3. Retrieve cert file from header URL
  4. Check cert for validity (PEM-encoded X.509)
  5. Extract public key from cert file
  6. Decode encrypted `Signature` header (base64)
  7. Use public key to decrypt signature and retrieve hash
  8. Compare hash to SHA-1 hash of entire raw request body
  9. Check timestamp of request, reject if older than 150 sec
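
Two of these steps need nothing beyond the standard library, so here is a hedged sketch of step 2 (URL check) and step 9 (timestamp check). The URL rules (`https`, host `s3.amazonaws.com`, path under `/echo.api/`, port 443 if given) are my reading of Amazon's documentation, so treat them as an assumption and check the docs. Steps 3 through 8 need a crypto library and are not shown. The function names are made up.

```python
import posixpath
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

MAX_AGE = timedelta(seconds=150)  # tolerance from step 9


def is_valid_cert_url(url):
    """Step 2: check the SignatureCertChainUrl header before fetching it."""
    parts = urlparse(url)
    try:
        port = parts.port
    except ValueError:  # port present but not a valid number
        return False
    # Normalize the path so "/echo.api/../x" cannot sneak past the prefix test.
    path = posixpath.normpath(parts.path)
    return (
        parts.scheme == "https"
        and (parts.hostname or "") == "s3.amazonaws.com"
        and path.startswith("/echo.api/")
        and port in (None, 443)
    )


def is_fresh(timestamp, now=None):
    """Step 9: accept only timestamps within 150 seconds of 'now'."""
    sent = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    sent = sent.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return abs(now - sent) <= MAX_AGE
```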

ಠ_ಠ

Verifying Requests

Just use a library:

The Request Body


{
    "session": { ... },
    "request": { ... },
    "version": "1.0"
}
    

The Request - "session"


{
  "session": {
    "sessionId": "SessionId.6a4789.....",
    "application": { "applicationId": "amzn1.ask.skill.2ec93....." },
    "attributes": {
      "someSessionDataThing": "jordan"
    },
    "user": { "userId": "amzn1.ask.account.AFP3ZWK564FDOQ6....." },
    "new": true
  },
  "request": { ... },
  "version": "1.0"
}
    

The Request - "request"


{
  "session": { ... },
  "request": {
    "type": "IntentRequest",
    "requestId": "EdwRequestId.ba6dbc3f.....",
    "locale": "en-US",
    "timestamp": "2017-09-21T09:15:27Z",
    "intent": {
      "name": "Schedule",
      "slots": {
        "Day": { "name": "Day", "value": "2017-09-22" }
      }
    }
  },
  "version": "1.0"
}
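
Pulling the intent name and slot values out of that body is mostly dictionary digging. A small sketch, with `intent_and_slots` as a made-up helper name; it assumes the request shape shown above and tolerates missing keys.

```python
def intent_and_slots(body):
    """Return (intent_name, {slot_name: value}) for an IntentRequest body.

    Returns (None, {}) for any other request type.
    """
    request = body.get("request", {})
    if request.get("type") != "IntentRequest":
        return None, {}
    intent = request.get("intent", {})
    slots = intent.get("slots") or {}
    # A slot the user did not fill may have no "value" key at all.
    values = {name: slot.get("value") for name, slot in slots.items()}
    return intent.get("name"), values
```

For the request above this gives `"Schedule"` and a `Day` value of `"2017-09-22"`.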
    

Request Types


{
  "session": { ... },
  "request": {
    "type": "IntentRequest",
    ...
  },
  "version": "1.0"
}
    
  • LaunchRequest
  • IntentRequest
  • SessionEndedRequest

LaunchRequest

No explicit intent

Example: "Alexa, launch connect tech"

Think about what you should say back!

SessionEndedRequest

User cancels action, times out, or there is an error.

Your skill app MUST NOT RESPOND to this request... at all.

ಠ_ಠ
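
One way to handle the three request types is a lookup table keyed on `request.type`. This is only a sketch: the handler names and the spoken text are invented, and returning `None` is this sketch's own convention for "send nothing back".

```python
def on_launch(request, session):
    # No intent yet, so greet the user and say what the skill can do.
    return "Welcome to Connect Tech. Ask me about the schedule."


def on_intent(request, session):
    return f"You asked for the {request['intent']['name']} intent."


def on_session_ended(request, session):
    # Clean up only; no speech is produced for this request type.
    return None


HANDLERS = {
    "LaunchRequest": on_launch,
    "IntentRequest": on_intent,
    "SessionEndedRequest": on_session_ended,
}


def dispatch(body):
    """Route a parsed request body to the handler for its type."""
    request = body["request"]
    handler = HANDLERS.get(request["type"])
    if handler is None:
        raise ValueError(f"unsupported request type: {request['type']}")
    return handler(request, body.get("session", {}))
```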

Streaming Audio Requests

AudioPlayer and PlaybackController requests

(I don't know much on those, so we'll move on.)

Responding to Intents

The Response


{
  "version": "1.0",
  "response": { ... },
  "sessionAttributes": {
    "someSessionDataThing": "jordan"
  }
}
    

The Response


{
  "version": "1.0",
  "response": {
    "outputSpeech": { ... },
    "card": { ... },
    "shouldEndSession": true
  },
  "sessionAttributes": { ... }
}
    

The Response - Output Speech


{
  "version": "1.0",
  "response": {
    "outputSpeech": {
      "type": "SSML",
      "ssml": "<speak>The party is on Friday at 7:30 p.m.!</speak>"
    },
    "card": { ... },
    "shouldEndSession": true
  },
  "sessionAttributes": { ... }
}
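
Building that document by hand gets old quickly, so a tiny helper is worth having. A minimal sketch; `build_response` and its parameters are names chosen for this example, and it covers only the fields shown on these slides.

```python
def build_response(speech, ssml=False, end_session=True, session_attributes=None):
    """Assemble a response body with speech output and session data."""
    if ssml:
        output = {"type": "SSML", "ssml": speech}
    else:
        output = {"type": "PlainText", "text": speech}
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": output,
            "shouldEndSession": end_session,
        },
        # Whatever goes here comes back in session.attributes next request.
        "sessionAttributes": session_attributes or {},
    }
```

Pass `end_session=False` when the skill has asked a question and expects the user to answer.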
    

SSML?

Speech Synthesis Markup Language
(a W3C Recommendation from 2004)

Read Amazon's docs on it.

(You can also use "PlainText")

SSML

  • <speak> (root)
  • <emphasis level="strong">
  • <audio src="..."> (some restrictions)
  • <break time="2s"/>
  • <phoneme alphabet="ipa" ph="pɪˈkɑːn">pecan</phoneme>
    (vs "pi.kæn")
  • <say-as interpret-as="cardinal">5</say-as> ("five")
  • <say-as interpret-as="ordinal">5</say-as> ("fifth")
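
SSML is XML, so any user-supplied text has to be escaped before it goes between the tags. Here is a small sketch of string helpers that do this; the helper names (`text`, `say_as`, `pause`, `speak`) are invented for this example and cover only a few of the tags above.

```python
from xml.sax.saxutils import escape, quoteattr


def text(value):
    """Plain words, with &, < and > escaped for XML."""
    return escape(value)


def say_as(value, interpret_as):
    """Wrap a value in <say-as>, e.g. interpret_as='ordinal'."""
    return f"<say-as interpret-as={quoteattr(interpret_as)}>{escape(value)}</say-as>"


def pause(seconds):
    """A silent break of the given whole number of seconds."""
    return f'<break time="{int(seconds)}s"/>'


def speak(*parts):
    """Join already-escaped parts under the <speak> root element."""
    return "<speak>" + "".join(parts) + "</speak>"
```

For example, `speak(text("Doors & drinks at "), say_as("5", "cardinal"), pause(2))` yields one `<speak>` document with the ampersand escaped.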

The Response - Cards

[Image: Alexa app card example]

The Response - Cards


{
  "version": "1.0",
  "response": {
    "outputSpeech": { ... },
    "card": {
      "title": "Friday After Party",
      "text": "This is going to be cray cray",
      "image": {
        "smallImageUrl": "https://s3.amazonaws.com/connect-tech/party.png",
        "largeImageUrl": "https://s3.amazonaws.com/connect-tech/party.png"
      },
      "type": "Standard"
    },
    "shouldEndSession": true
  },
  "sessionAttributes": { ... }
}
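
The card is just another dictionary inside `response`. A short sketch of a builder for the "Standard" card shown above; `standard_card` is a hypothetical helper, and reusing one URL for both image sizes is a shortcut taken here, mirroring the slide.

```python
def standard_card(title, text, image_url=None):
    """Build a 'Standard' card; the image block is added only if a URL is given."""
    card = {"type": "Standard", "title": title, "text": text}
    if image_url:
        card["image"] = {
            "smallImageUrl": image_url,
            "largeImageUrl": image_url,
        }
    return card
```

Attach the result under `response["card"]` in the response body.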
    

Whew... that was a lot.

I know... and there's more.

(Dialogs/Conversations, AudioPlayer, Echo Show, Smart Home skills, Reprompts, ...)

Some Considerations...

  • Use ANY back end!
  • Server must verify ALL requests
  • No way to speak without request (no server push)
  • No easy way to accept arbitrary length/format input
  • Must get skill certified for public use

Testing your skill

  • Amazon Developer Portal
  • Amazon Echo device
  • Echosim.io

Getting Certified

  1. policy checks
  2. security checks
  3. functional checks
  4. voice interface and UX tests

How long will it take?

¯\_(ツ)_/¯