How to Create an On-Demand Feature for Your Alexa Skill

Submitted by Dan on Fri, 07/02/2021 - 13:18

As 2020 was coming to a close and 2021 was just around the corner, I was finishing development of a new version of our skill for the Alexa smart speaker. It was a total rewrite, and it changed the skill from being able to perform only a single intent (play our live stream) to providing a dialog-based menu of many intents to choose from. With the new skill, listeners can choose to play our live stream, play our HD2 live stream, play the latest newscast recording, or play an on-demand program. (One day I would like to add another intent: make a donation.)

Of all the new intents, the on-demand feature is the one people are most curious about. And the curiosity is not only from our listeners. Other Alexa developers are showing interest in how I was able to put this together. This post is aimed at those who would like to develop an on-demand intent for an Alexa skill. I used the Node.js development environment in AWS Lambda, so that is what I will discuss, but the concepts should apply to Python and Java as well.

On-demand is the ability to listen to or watch content at your command. It's not on a schedule; it plays when you tell it to. In our case, the on-demand content is anything else that we have recordings of in our Content Management System (we're currently using a custom Drupal install). That includes podcasts (like KPR Presents and Conversations), on-air shows (like Retro Cocktail Hour and Classics Live), and our collection of live studio recordings.

And because a voice interface is limited, I decided to make only the latest recording of each show available for listening. Listeners will just have to go to our website to catch an older episode.

Thankfully, I had already put all these shows into podcast RSS form so that they can be fed to our mobile app. So it was a no-brainer decision to get the mp3s from the RSS XML.

Here are the basic steps:

  1. Determine which RSS feed we need
  2. Download it
  3. Parse through the XML to get the mp3 URL
  4. Return an audioPlayer response with the mp3 URL

In Node.js, steps 2 and 3 will require adding third-party node packages. Setting up this environment was actually the most time-consuming task I had to deal with. Fully explaining this process is out of the scope of this post, but my solution was to set up a local node application and then use npm to add the packages:

danny$  npm install request
danny$  npm install xml2js
danny$  npm install alexa-sdk

I then installed the AWS command line utility on my workstation so that I could push my local copy to Lambda. I also set up versioning in Lambda. That is also out of the scope of this article, but I wrote about how to do that here.

So with that out of the way, I could begin development of the on-demand feature. In the Alexa Developer Console, I set up an intent that I called playOnDemandIntent, turned on auto delegation, and created a slot to hold the name of the program. Here are the screenshots:



With the interaction model all set up, I could start writing the code that handles the response.

First thing I did was define a data structure for the programs and shows we were making available on Alexa:

const onDemandObjects = [
    {
        title: 'film music fridays',
        ssml: 'film music fridays',
        feed: '/widgets/podcasts/fmf.xml',
    },
    {
        title: 'retro cocktail hour',
        ssml: 'retro cocktail hour',
        feed: '/widgets/podcasts/rch.xml',
    },
    {
        title: 'live studio performances',
        ssml: 'live studio performances',
        feed: '/widgets/podcasts/live-performances.xml',
    },
    {
        title: 'conversations',
        ssml: 'conversations',
        feed: '/widgets/podcasts/conversations.xml',
    },
    {
        title: 'when experts attack',
        ssml: 'when experts attack',
        feed: '/widgets/podcasts/whenexpertsattack.xml',
    },
    {
        title: 'kpr presents',
        ssml: 'kpr pri zents',
        feed: '/widgets/podcasts/kpr-presents.xml',
    },
    {
        title: 'commentaries',
        ssml: 'commentaries',
        feed: '/widgets/podcasts/commentaries.xml',
    }
];

If I ever want to add or remove an on-demand program, all I have to do is edit this structure and deploy a new version in Lambda.

Also, notice that each entry includes a property holding SSML for the show's title. (No, SSML doesn't stand for Smart Speaker Markup Language, although it could. It actually stands for Speech Synthesis Markup Language.) This proved useful when Alexa needed a little help with the pronunciation, as with KPR Presents.

Next up: identify the name of the show that the user wants to listen to. This is trickier than it may seem because Alexa does not always "hear" what the user actually said. For example, you might tell Alexa that you want to listen to UPR, but Alexa might record you as saying "you pee are". That might be the extreme case, but I still found it necessary to make a structure that maps wrong titles to their correct titles.

        // get the name of the program they gave to Alexa:
        var programTitle = this.event.request.intent.slots.programNameSlot.value;

        console.log('playing on-demand program: ' + programTitle);

        // correct any problems with the title string
        var titleReplacements = {
            // wrong title : correct title
            'k. p. r. presents':'kpr presents',
            'KPR presents':'kpr presents',
            'kay pee are presents':'kpr presents',
            'kay pee our presents':'kpr presents',
            'live performances':'live studio performances',
            'studio performances':'live studio performances',
            'live studio performance':'live studio performances',
            'live studio':'live studio performances',
            'performances':'live studio performances',
            'jazz':'live studio performances',
            'classical':'live studio performances',
            'live':'live studio performances',
            'metro cocktail hour':'retro cocktail hour',
            'metro cocktail':'retro cocktail hour',
            'retro cocktail':'retro cocktail hour',
            'cocktail hour':'retro cocktail hour',
            'experts attack':'when experts attack',
            'film music friday':'film music fridays',
            'film music Fridays':'film music fridays',
            'film music Friday':'film music fridays',
        };
        if (programTitle in titleReplacements) {
            programTitle = titleReplacements[programTitle];
        }
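A possible refinement, not part of the handler shown in this post: several entries in the table differ only in capitalization or punctuation, so normalizing the slot value before the lookup would let one key cover several variants. The sketch below assumes that approach. The names normalizeTitle, correctTitle, and sampleReplacements are hypothetical. It also uses hasOwnProperty instead of the in operator, because in also matches inherited names such as "constructor".

```javascript
// Hypothetical helper: lower-case the slot value, drop periods, and collapse whitespace,
// so "K. P. R.  Presents" and "k p r presents" produce the same lookup key.
function normalizeTitle(raw) {
    if (typeof raw !== 'string') {
        return ''; // the slot can be empty if Alexa did not capture a value
    }
    return raw.toLowerCase().replace(/\./g, '').replace(/\s+/g, ' ').trim();
}

// Hypothetical helper: map a misheard title to the correct one, or return the normalized title.
function correctTitle(raw, replacements) {
    var key = normalizeTitle(raw);
    return Object.prototype.hasOwnProperty.call(replacements, key) ? replacements[key] : key;
}

// a trimmed-down replacement table; keys must already be in normalized form
var sampleReplacements = {
    'k p r presents': 'kpr presents',
    'film music friday': 'film music fridays'
};

console.log(correctTitle('K. P. R. Presents', sampleReplacements)); // kpr presents
console.log(correctTitle('Film Music Friday', sampleReplacements)); // film music fridays
console.log(correctTitle('conversations', sampleReplacements));     // conversations
```

With this in place, the capitalized duplicates in the table (such as 'film music Fridays') would no longer need their own entries.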

Now we can use the show objects that were defined earlier, each of which contains a property for the feed URL.

        // get the show they want to hear
        var showObjs = onDemandObjects.filter(function(x){
          if(x.title == programTitle) return true;
        });
        if (showObjs.length < 1) {
          var programs = getOnDemandTitlesSSML();
          this.response.speak("I couldn't find a program called " + programTitle + ". Available programs are " + programs + ". To try again, tell me to play on demand.").listen(this.t('LAUNCH_MESSAGE_REPROMPT'));
          this.emit(':responseReady');
          return; // stop here; otherwise showObjs[0] is undefined and showObj.feed below throws
        }

        var showObj = showObjs[0];
        var feedURL = 'https://kansaspublicradio.org' + showObj.feed;

Notice that I used a little helper function that put the available program names into SSML to return in the Alexa response if the user needed some help with the title name. I used the array reduce() function to make it a little more elegant. Here is that function:

// make a SSML list of program titles
var getOnDemandTitlesSSML = function() {
    return onDemandObjects.reduce(function (total, obj, index, array) {
        if (index == 1) {
            return total.ssml + ', ' + obj.ssml;
        } else if (index+1 == array.length) {
            return total + ', and ' + obj.ssml;
        } else {
            return total + ', ' + obj.ssml;
        }
    });
};
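One thing to keep in mind about the reduce() version: because no initial value is passed, it depends on the length of the array. With two entries it leaves out the "and", and with a single entry reduce() returns the object itself rather than a string. Here is a sketch of an alternative that handles any length, assuming the same ssml property on each object. The name toSpokenList is hypothetical.

```javascript
// Hypothetical alternative: build "a, b, and c" from the ssml property of each object.
function toSpokenList(objects) {
    var names = objects.map(function (obj) { return obj.ssml; });
    if (names.length <= 1) {
        return names.join(''); // '' for an empty list, or the single name
    }
    if (names.length === 2) {
        return names[0] + ' and ' + names[1];
    }
    // three or more: comma-separate all but the last, then add ", and <last>"
    return names.slice(0, -1).join(', ') + ', and ' + names[names.length - 1];
}

console.log(toSpokenList([{ ssml: 'conversations' }, { ssml: 'kpr pri zents' }, { ssml: 'commentaries' }]));
// conversations, kpr pri zents, and commentaries
```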

Next up: make an http request for the rss xml feed and save it to memory so it can be parsed. We'll use the npm request package for this.

        // get the XML file
        var request = require("request");
        const options = {
            url: feedURL,
            timeout: 15000
        };

        request.get(options, (error, response, xml) => {
            // notice that, instead of normal anonymous function, the arrow function expression is used: () => {}
            // this is because it allows us to use the 'this' binding of the parent function inside the lambda function
            // https://stackoverflow.com/questions/20279484/how-to-access-the-correct-this-inside-a-callback

            console.log('http error:', error); // Print the error if one occurred
            console.log('http statusCode:', response && response.statusCode); // Print the response status code if a response was received

            /*
             * DO SOMETHING WITH THE xml VARIABLE HERE
             */
        });
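A note for anyone starting fresh: the request package was deprecated in 2020. It still works, but the same download can be done with Node's built-in http and https modules. The sketch below is one way to do that, not the code this skill runs. The name fetchFeed is hypothetical, the 15 second default mirrors the timeout used above, and redirects are not followed.

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: download a URL and resolve with the response body as a string.
function fetchFeed(feedURL, timeoutMs = 15000) {
    return new Promise((resolve, reject) => {
        // pick the module that matches the URL scheme
        const client = feedURL.startsWith('https:') ? https : http;
        const req = client.get(feedURL, { timeout: timeoutMs }, (res) => {
            if (res.statusCode !== 200) {
                res.resume(); // discard the body so the socket is released
                reject(new Error('unexpected status code: ' + res.statusCode));
                return;
            }
            res.setEncoding('utf8');
            let body = '';
            res.on('data', (chunk) => { body += chunk; });
            res.on('end', () => resolve(body));
        });
        // the 'timeout' event only reports inactivity; the request has to be destroyed by hand
        req.on('timeout', () => req.destroy(new Error('request timed out')));
        req.on('error', reject);
    });
}
```

Inside the intent handler it could be used as fetchFeed(feedURL).then((xml) => { ... }), with an arrow function so that this still refers to the handler.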

The xml variable (the third argument to the callback, which holds the response body) contains the XML for the RSS feed. We'll need to parse through this to find the latest podcast episode and then extract the mp3 URL from that episode. We'll use the xml2js npm module to help with this.

            // now get the url of the mp3 for the first episode by parsing through the xml
            var parseString = require('xml2js').parseString;
            parseString(xml, (err, result) => {
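                // asynchronous callback function nested inside of an asynchronous callback function.... 8)

                console.log('xml parse error: ' + err);
                //console.log('result:' + JSON.stringify(result));
                var onDemandMp3 = '';
                try {
                    // the first <item> is the latest episode; its <enclosure> url attribute is the mp3
                    onDemandMp3 = result.rss.channel[0].item[0].enclosure[0].$.url;
                } catch (e) {
                    // the download or parse failed, or the feed has no episodes; leave onDemandMp3 empty
                    console.log('could not find an enclosure url: ' + e);
                }
                console.log('is this it? ' + onDemandMp3);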

                /*
                 * DO SOMETHING WITH THE onDemandMp3 HERE
                 */
            });
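The long property chain follows the shape xml2js produces with its default options: child elements become arrays, and attributes are collected under a $ key. To make that shape visible, here is a small sketch that uses a hand-written object in place of a parsed feed. The URL is a placeholder and getLatestMp3Url is a hypothetical helper that returns an empty string when any level is missing.

```javascript
// Hypothetical helper: return the enclosure url of the first <item>, or '' if any level is missing.
function getLatestMp3Url(result) {
    var channel = result && result.rss && result.rss.channel && result.rss.channel[0];
    var item = channel && channel.item && channel.item[0];
    var enclosure = item && item.enclosure && item.enclosure[0];
    return (enclosure && enclosure.$ && enclosure.$.url) || '';
}

// hand-written stand-in for what xml2js returns for a one-episode feed (placeholder URL)
var parsed = {
    rss: {
        channel: [{
            title: ['Example Show'],
            item: [{
                title: ['Episode 1'],
                enclosure: [{ $: { url: 'https://example.org/episode1.mp3', type: 'audio/mpeg' } }]
            }]
        }]
    }
};

console.log(getLatestMp3Url(parsed));             // https://example.org/episode1.mp3
console.log(getLatestMp3Url({ rss: {} }) === ''); // true
```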

Finally, we have the mp3 url and we can build the Alexa response:

                // build the response
                if (onDemandMp3) {
                    // success
                    var words = "Now playing the latest recording of " + showObj.ssml;
                    this.response.speak(words).audioPlayerPlay("REPLACE_ALL", onDemandMp3, "1", null, 0); //(behavior, url, token, expectedPreviousToken, offsetInMilliseconds)
                    this.emit(':responseReady');
                } else {
                    // not found
                    console.log('ERROR! mp3 url not found in the xml')
                    var words = "I'm sorry but there was a problem. You can try again by telling me to play on demand, or you can try another command.";
                    this.response.speak(words).listen(this.t('LAUNCH_MESSAGE_REPROMPT'));
                    this.emit(':responseReady');
                }
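For reference, audioPlayerPlay() adds an AudioPlayer.Play directive to the response. The sketch below builds the same kind of directive by hand so the fields are visible. It assumes the field names documented for the AudioPlayer interface, buildPlayDirective is a hypothetical helper, and the URL is a placeholder. Note that Alexa requires the stream URL to be served over HTTPS, which the helper checks.

```javascript
// Hypothetical helper: build an AudioPlayer.Play directive for a single stream.
function buildPlayDirective(url, token) {
    if (!/^https:\/\//i.test(url)) {
        throw new Error('Alexa only plays audio served over HTTPS: ' + url);
    }
    return {
        type: 'AudioPlayer.Play',
        playBehavior: 'REPLACE_ALL', // stop whatever is playing and clear the queue
        audioItem: {
            stream: {
                url: url,
                token: token,           // an opaque string you choose to identify this stream
                offsetInMilliseconds: 0 // start from the beginning
                // expectedPreviousToken is omitted; it applies only to the ENQUEUE behavior
            }
        }
    };
}

console.log(JSON.stringify(buildPlayDirective('https://example.org/episode1.mp3', '1'), null, 2));
```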

ALL TOGETHER NOW - the intent handler function looks like this:

    'playOnDemandIntent': function() {

        // autodelegate must be on

        // get the name of the program they gave to Alexa:
        var programTitle = this.event.request.intent.slots.programNameSlot.value;

        console.log('playing on demand program: ' + programTitle);

        // correct any problems with the title string
        var titleReplacements = {
            'k. p. r. presents':'kpr presents',
            'KPR presents':'kpr presents',
            'kay pee are presents':'kpr presents',
            'kay pee our presents':'kpr presents',
            'live performances':'live studio performances',
            'studio performances':'live studio performances',
            'live studio performance':'live studio performances',
            'live studio':'live studio performances',
            'performances':'live studio performances',
            'jazz':'live studio performances',
            'classical':'live studio performances',
            'live':'live studio performances',
            'metro cocktail hour':'retro cocktail hour',
            'metro cocktail':'retro cocktail hour',
            'retro cocktail':'retro cocktail hour',
            'cocktail hour':'retro cocktail hour',
            'experts attack':'when experts attack',
            'film music friday':'film music fridays',
            'film music Fridays':'film music fridays',
            'film music Friday':'film music fridays',
        };
        if (programTitle in titleReplacements) {
            programTitle = titleReplacements[programTitle];
        }

        // get the show they want to hear
        var showObjs = onDemandObjects.filter(function(x){
          if(x.title == programTitle) return true;
        });
        if (showObjs.length < 1) {
          var programs = getOnDemandTitlesSSML();
          this.response.speak("I couldn't find a program called " + programTitle + ". Available programs are " + programs + ". To try again, tell me to play on demand.").listen(this.t('LAUNCH_MESSAGE_REPROMPT'));
          this.emit(':responseReady');
          return; // stop here; otherwise showObjs[0] is undefined and showObj.feed below throws
        }

        var showObj = showObjs[0];
        var feedURL = 'https://kansaspublicradio.org' + showObj.feed;

        // get the XML file
        var request = require("request");
        const options = {
            url: feedURL,
            timeout: 15000
        };

        request.get(options, (error, response, xml) => {
            // notice that, instead of normal anonymous function, the arrow function expression is used: () => {}
            // this is because it allows us to use the 'this' binding of the parent function inside the lambda function
            // https://stackoverflow.com/questions/20279484/how-to-access-the-correct-this-inside-a-callback

            console.log('http error:', error); // Print the error if one occurred
            console.log('http statusCode:', response && response.statusCode); // Print the response status code if a response was received

            // now get the url of the mp3 for the first episode by parsing through the xml
            var parseString = require('xml2js').parseString;
            parseString(xml, (err, result) => {
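                // asynchronous callback function nested inside of an asynchronous callback function.... 8)

                console.log('xml parse error: ' + err);
                //console.log('result:' + JSON.stringify(result));
                var onDemandMp3 = '';
                try {
                    // the first <item> is the latest episode; its <enclosure> url attribute is the mp3
                    onDemandMp3 = result.rss.channel[0].item[0].enclosure[0].$.url;
                } catch (e) {
                    // the download or parse failed, or the feed has no episodes; leave onDemandMp3 empty
                    console.log('could not find an enclosure url: ' + e);
                }
                console.log('is this it? ' + onDemandMp3);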

                // build the response
                if (onDemandMp3) {
                    // success
                    var words = "Now playing the latest recording of " + showObj.ssml;
                    this.response.speak(words).audioPlayerPlay("REPLACE_ALL", onDemandMp3, "1", null, 0); //(behavior, url, token, expectedPreviousToken, offsetInMilliseconds)
                    this.emit(':responseReady');
                } else {
                    // not found
                    console.log('ERROR! mp3 url not found in the xml')
                    var words = "I'm sorry but there was a problem. You can try again by telling me to play on demand, or you can try another command.";
                    this.response.speak(words).listen(this.t('LAUNCH_MESSAGE_REPROMPT'));
                    this.emit(':responseReady');
                }
            });
        });
    },

After all of this work, I will admit: this is not the most efficient way to do this. And in fact, there is over a second of lag time between telling Alexa what on-demand content you want to listen to, and when the response is returned and it begins playing.

It would be better and faster if the XML files could be routinely downloaded and saved to storage with automation, and then the intent response could just grab those files instead of downloading them to memory each time. But in Lambda, which is a "serverless" environment, I just don't know if or how that would work. Perhaps that operation could be done on an external server and pushed to the Lambda function via cron and the aws-cli tools.
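One option that stays inside Lambda: variables declared outside the handler generally survive between invocations while the execution environment stays warm, so a feed could be kept in memory for a few minutes instead of being downloaded on every request. The sketch below works under that assumption and has not been tried in this skill. The name createFeedCache, the five-minute lifetime, and the URL are hypothetical, and a cold start would still pay the full download cost.

```javascript
// Hypothetical helper: a tiny time-limited cache keyed by feed URL.
// `now` is injectable so the expiry logic can be exercised without waiting.
function createFeedCache(ttlMs, now) {
    now = now || Date.now;
    var entries = new Map();
    return {
        get: function (url) {
            var entry = entries.get(url);
            if (entry && now() - entry.savedAt < ttlMs) {
                return entry.xml;
            }
            return null; // missing or expired
        },
        set: function (url, xml) {
            entries.set(url, { xml: xml, savedAt: now() });
        }
    };
}

// declared at module level (outside the handler) so it persists across warm invocations
var feedCache = createFeedCache(5 * 60 * 1000);

// inside the handler, check the cache before making the HTTP request
var url = 'https://example.org/feed.xml'; // placeholder URL
var xml = feedCache.get(url);
if (xml === null) {
    xml = '<rss></rss>'; // stand-in for the downloaded feed body
    feedCache.set(url, xml);
}
console.log(feedCache.get(url) === xml); // true
```

The trade-off is that a newly published episode may not be picked up until the cached copy expires.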

So for now my solution is working well, even if a bit slow. But if you have any suggestions, feel free to reach out to me at my email address - dmantyla@ku.edu - or through the npr+friends slack channel. Thanks for reading!

-Danny