Testing SSML on Alexa vs Google Home/DialogFlow

Both Alexa and the Google Home device support Speech Synthesis Markup Language (SSML).
For Alexa – see this SSML documentation, and for Google Home/DialogFlow, see this SSML documentation. SSML was created by the W3C’s Voice Browser working group.

Even though the Alexa doc says you need to wrap the SSML in the <speak> tag, I never used it there. Perhaps this is because they have a separate flag that states you are using SSML, as shown in this example:

<pre>
"outputSpeech": {
    "type": "SSML",
    "ssml": "<speak>This output speech uses SSML.</speak>"
}
</pre>

Actually, the more probable reason is that I’m using the alexa-sdk for Node.js, and I think it treats all speech output as SSML and adds the <speak> wrapper for me.
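To make that concrete, here is a minimal sketch of what building the response by hand might look like if no SDK were adding the wrapper. The function name `buildAlexaResponse` is hypothetical and this is not the SDK's actual code; it only shows the `<speak>` root being added in one place so the caller never has to type it.

```javascript
// Hypothetical helper (not part of alexa-sdk): builds a minimal Alexa
// response envelope by hand, adding the <speak> root element itself.
function buildAlexaResponse(ssmlBody, endSession = true) {
  return {
    version: "1.0",
    response: {
      outputSpeech: {
        type: "SSML",
        // The caller passes only the inner markup; the wrapper is added here.
        ssml: `<speak>${ssmlBody}</speak>`
      },
      shouldEndSession: endSession
    }
  };
}

const reply = buildAlexaResponse("This output speech uses SSML.");
console.log(JSON.stringify(reply, null, 2));
```

If an SDK does something like this internally, that would explain why leaving out the tag in my own strings never caused a problem on Alexa.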

The main features of SSML that I’ve used are 1) playing short mp3 sound files, and 2) adding pauses to make some of the phrases sound more natural. But you can also change the pitch, speed, and do other cool and interesting things with SSML.
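As a rough illustration of those two features, the sketch below builds an SSML string with a pause (`<break>`) followed by a sound file (`<audio>`). The helper names and the example.com URL are placeholders of mine, not anything from my real app.

```javascript
// Escape characters that are special in XML so plain text cannot break the markup.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// <break time="..."/> inserts a pause of the given length in milliseconds.
function pause(ms) {
  return `<break time="${ms}ms"/>`;
}

// <audio src="..."/> plays a short sound file from the given URL.
function audio(url) {
  return `<audio src="${escapeXml(url)}"/>`;
}

// Placeholder URL; substitute the location of a real mp3 file.
const ssml =
  "<speak>" +
  escapeXml("Listen to this word.") +
  pause(500) +
  audio("https://example.com/audio/word.mp3") +
  "</speak>";

console.log(ssml);
```

Escaping the plain text matters because a stray `<` or `&` in a phrase would otherwise make the SSML invalid.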

Unfortunately, the testing tool in DialogFlow will not read or pronounce the SSML. To get that to work you have to use the “Action Simulator” (https://console.actions.google.com/). (Of course, you have to create your app first.)

My Hebrew Vocabulary for Alexa app re-purposes some of the 2700 mp3 files that I created for my earlier Hebrew products. Now I was trying to test my first sample program with Google Actions and DialogFlow. I’m waiting on my actual “Google Home Smart Speaker” to arrive later this week, so I wanted to test the SSML online somehow.

DialogFlow has a “play” button (a clickable word) by the text, but when you click it, it will only read the text inside the <audio> tags rather than playing the mp3 file. That text is there so you can provide your own error message (for example, if the mp3 is not formatted properly or is missing). Notice that it also did not show the full audio “src” filename.

Below is an example of a couple of my tests. I had it programmed with an intent called “test word #”, where # refers to an item in an array of Hebrew words that I have in a JSON file. You can see how word 300 didn’t exist, and how 400 did.
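The lookup behind that intent can be sketched roughly like this. The `words` object, its `audioUrl` field, and the URL are hypothetical stand-ins for my JSON file, which is not shown in this post, and I use an object keyed by number here only to keep the example short.

```javascript
// Hypothetical stand-in for the JSON word file (real structure not shown in the post).
const words = {
  400: { audioUrl: "https://example.com/audio/word400.mp3" }
};

// Returns the SSML for a "test word #" request.
function testWordSsml(wordNumber) {
  const entry = words[wordNumber];
  if (!entry) {
    // No such word: answer with plain speech instead of an audio clip.
    return `<speak>Sorry, I could not find word ${wordNumber}.</speak>`;
  }
  // The text inside <audio> is the fallback that is read out
  // if the mp3 is missing or cannot be played.
  return (
    `<speak><audio src="${entry.audioUrl}">` +
    `The audio for word ${wordNumber} could not be played.` +
    `</audio></speak>`
  );
}

console.log(testWordSsml(300));
console.log(testWordSsml(400));
```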

Before I wrapped my response with the <speak> tag, it was ignoring the SSML and reading out the less-than and greater-than signs and everything else in the phrase. Notice the last two entries by my Google profile picture. One shows the audio/SSML, and that was before I used the <speak> tag. I changed my webhook code and re-uploaded it (I’m running it from an AWS Lambda function as a ‘webhook’). I repeated the exact same intent (“test word 400”), and this time it provided the spoken response. I didn’t have to click anything; it just said the proper response.
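A small guard like the one below would avoid that mistake. This is only a sketch: `ensureSpeak` and `buildDialogflowResponse` are hypothetical names, and the response shape assumes the Dialogflow V2 webhook format for Actions on Google, which may not match the API version your agent uses.

```javascript
// Wrap a response in <speak> unless it already starts with that root element.
function ensureSpeak(text) {
  const trimmed = text.trim();
  return /^<speak[\s>]/.test(trimmed) ? trimmed : `<speak>${trimmed}</speak>`;
}

// Assumed Dialogflow V2 webhook response shape for Actions on Google.
function buildDialogflowResponse(ssmlBody, displayText) {
  return {
    payload: {
      google: {
        expectUserResponse: true,
        richResponse: {
          items: [
            {
              simpleResponse: {
                ssml: ensureSpeak(ssmlBody),
                displayText: displayText
              }
            }
          ]
        }
      }
    }
  };
}

// Placeholder URL; the markup is wrapped exactly once either way.
const body = '<audio src="https://example.com/audio/word400.mp3">Audio unavailable.</audio>';
console.log(JSON.stringify(buildDialogflowResponse(body, "Word 400"), null, 2));
```

Because the guard leaves already-wrapped markup alone, the same helper can be shared by code paths that do and do not add the tag themselves.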

Alexa lets you test SSML online in a somewhat roundabout way. You have to go to Developer.Amazon.com and create your Alexa Skill. You have to create at least a dummy skill to use the SSML “Voice Simulator”, because it’s found on the left “Test” tab under your skill. (Note: The “Test” tab disappears after you submit your Alexa Skill for certification.)

I’m also offering my Voice App development services for Google Home and Alexa (http://irvingseoexpert.com/services/alexa-skill-development-dallas/). Check out the link for details.
