Saturday, December 8, 2018

How Can You Build Your Own AI Assistant Using

If you are interested in learning more about AI, see our screencast Microsoft Cognitive Services and the Text Analytics API.

The world of artificially intelligent assistants is growing: Siri, Cortana, Alexa, OK Google, Facebook M, Bixby. All the big players in technology have their own. However, many developers do not realize that building your own AI assistant is quite easy! You can customize it to your own needs, your own IoT connected devices, and your own custom APIs. The sky's the limit.

Note: This article was updated in 2017 to reflect recent changes.

Earlier, in 2016, I wrote a guide on five simple ways to build artificial intelligence, where I covered a few simple options for building an AI assistant. In this article, I want to look at one particular service that makes it incredibly easy to get a fully featured AI assistant with very little initial setup.

Building an AI Assistant

This post is one of a series of articles aimed at helping you get a simple personal assistant running with the API. The series is:

  1. How to Create Your Own AI Assistant Using (this one!).
  2. Customizing Your Assistant with Intent and Context.
  3. Empowering Your Assistant with Entities.
  4. How to Connect Your Assistant to the IoT.

What Is This Service?
It is a service that allows developers to build speech-to-text, natural language processing, artificially intelligent systems that you can train with your own custom functionality. It has a range of existing knowledge bases, called "domains", that systems built with it can understand automatically, and these are what we will focus on in this article. Domains provide a knowledge base covering encyclopedic knowledge, language translation, weather and more. In future articles, I will cover some of the more advanced aspects of the service that allow you to personalize your assistant further.

Getting Started
To begin, go to the service's website and click either the "Start Free" button or the "Sign up free" button in the top right corner.

You are then taken to a registration form, which is very simple: enter your name, email and password and click "Sign up". If you would rather avoid another set of login credentials, you can also use the buttons on the right to sign up with an existing account, such as your Google account.

Since the service was purchased by Google, it has fully migrated to using Google Accounts to log in. So if you are new to it, you will need to sign in with your Google Account:

Click "Allow" on the screen that follows to grant access to your Google account:

You will also need to read and agree to the Terms of Service:

Once signed up, you will be taken directly to the interface where you can build your virtual AI assistant. Each assistant that you create and teach specific skills to is called an "agent". So, to begin, create your first agent by clicking the "Create Agent" button on the left-hand side:

You may need to authorize again to grant additional permissions for your Google Account. This is normal and OK! Click "Authorize" to continue:

And allow:

On the next screen, enter your agent's details, which include the following:

Name: This is for your own reference, to tell agents apart in the interface. You can call the agent whatever you like: either a person's name (I have chosen Barry) or a name that represents the task it helps with (such as light-controller).

Description: A human-readable description so that you can remember what the agent is responsible for. This is optional, and may not be needed if your agent's name is self-explanatory.

Language: The language in which the agent works. Once selected, it cannot be changed, so choose wisely! For this tutorial, choose English, because English has access to the most domains. You can see which domains are available for each language in the languages table in the documentation.

Timezone: As you'd expect, this is the timezone for your agent. Chances are it will have already detected your current timezone.

It will also automatically set up a Google Cloud Platform project for your agent, so you do not need to do anything in this regard; it's all automatic! It is good to know this is happening, though: if you do a lot of testing and create many agents, many Google Cloud Platform projects are being created too, which you may want to clean up at some point.

Once you have entered your agent's settings, click "Save" next to the agent's name to save everything:

The Test Console:
Once your agent has been created, you can test it with the test console on the right. You can enter queries at the top and send them to your agent, which will show you what would be returned after it hears those statements. Enter a question like "How are you?" and see what it returns. Your results should appear below it:

If you scroll down on the right-hand side of the results, you'll see more details on how the service interpreted your request (as seen in the above screenshot). Below that, there is a button called "Show JSON". Click it to see what the API will return. This opens the JSON viewer and shows you a JSON response that looks like this:

{
  "id": "21345678",
  "timestamp": "2017-05-12T08:04:49.031Z",
  "lang": "en",
  "result": {
    "source": "agent",
    "resolvedQuery": "How are you?",
    "action": "input.unknown",
    "actionIncomplete": false,
    "parameters": {},
    "contexts": [],
    "metadata": {
      "intentId": "6320071",
      "webhookUsed": "false",
      "webhookForSlotFillingUsed": "false",
      "intentName": "Default Fallback Intent"
    },
    "fulfillment": {
      "speech": "Sorry, can you say that again?",
      "messages": [
        {
          "type": 0,
          "speech": "Sorry, could you say that again?"
        }
      ]
    },
    "score": 1
  },
  "status": {
    "code": 200,
    "errorType": "success"
  },
  "sessionId": "243c"
}
As you will see, your agent does not know how to respond! Right now, it is not really "intelligent" artificial intelligence: it still needs the intelligence bit to be added in. The input.unknown value in the action field tells you that it is not sure how to proceed. Above, it is returning the message "Sorry, can you say that again?", which is one of its default fallbacks. Rather than telling the human that it does not understand, it just asks them to say it again, over and over. This is not ideal, and I would prefer to change it to something that makes it clearer when the bot does not understand. If you are picky about this sort of thing too and want to change what it says here, you can find it on the "Intents" page by clicking the "Default Fallback Intent" item there.
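
To make those fields concrete, here is a minimal sketch of how an app might pick the useful parts out of a response shaped like the JSON shown earlier. The function name summarizeResponse and the isFallback flag are illustrative names of my own, not part of the service's SDK, and the field paths are assumed from the sample response only.

```javascript
// Hypothetical helper: pulls out the fields discussed above from a response
// object shaped like the sample JSON. Field paths are assumed from that sample.
function summarizeResponse(response) {
  var result = response.result || {};
  var metadata = result.metadata || {};
  var fulfillment = result.fulfillment || {};
  var action = result.action || "";
  return {
    action: action,
    intentName: metadata.intentName || "",
    speech: fulfillment.speech || "",
    // "input.unknown" is the action shown when the agent falls back
    isFallback: action === "input.unknown"
  };
}

// A cut-down copy of the sample response from the test console
var sample = {
  result: {
    action: "input.unknown",
    metadata: { intentName: "Default Fallback Intent" },
    fulfillment: { speech: "Sorry, can you say that again?" }
  }
};

console.log(summarizeResponse(sample));
```

A check like isFallback is one way an app could show its own "I did not understand" message instead of repeating the default fallback text.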

A note for those who used the service a while ago (or saw it in action): you may have been expecting it to have a bit more available out of the box. Previously, it was able to answer queries like "Who is Steve Jobs?" by default. This is no longer the case! You need to add your own integrations with third-party APIs to take action and source information. The service provides the sentence parsing and interpretation side of things.

Adding Small Talk

There is one bit of default functionality you can add that gives your bot a small hint of intelligence: the "Small Talk" feature. This provides a range of answers to commonly asked questions, including "How are you?" from above. However, it is not turned on by default. To turn it on, go to the "Small Talk" menu item on the left and click "Enable".

Once enabled, if you scroll down, you can see a range of categories of common small talk phrases. Find the "Hello/Goodbye" section and click on it to expand it. Add some different responses to the question "How are you?" and then click "Save" at the top right. After adding phrases, you will see a percentage figure next to the "Hello/Goodbye" section, showing how much of your chatbot you have customized.

If you then go to the test console and ask it "How are you?" again, it should now answer with one of the responses you entered!

If it does not respond correctly, check that you actually clicked "Save" before having a go! It does not save automatically.

Ideally, you will want to customize as many of your small talk responses as you can: this is what gives your bot a more unique personality. You can choose the tone and structure of its responses. Is it a grumpy chatbot that hates humans talking to it? Is it a chatbot obsessed with cats? Or maybe a chatbot that answers in pre-teen internet/text talk? You decide!

Now that you have at least a few small talk elements working, your agent is ready to be integrated into your own web app interface. To do this, you will need to get your API keys to give yourself remote access to your agent.

Finding Your API Keys:

The API keys you need are on the agent's settings page. To find it, click the cog icon next to your agent's name. On the page that appears, copy the "Client access token" and paste it somewhere safe. That is what we will need to make queries to the service's API:

The Code:
If you want to see the working code and play with it, it is available on GitHub. Feel free to use it and expand on the idea for your own AI personal assistant.

If you would like to try it out, I have Barry running online. Enjoy!

Connecting to the Service Using JavaScript:
You currently have a working personal assistant running somewhere in the service's cloud. Now you need a way to talk to your personal assistant from your own interface. The service has a range of platform SDKs that work with Android, iOS, web apps, Unity, Cordova, C++ and more. You can even integrate it into a Slack bot or Facebook Messenger bot! For this example, you will use HTML and JavaScript to make a simple personal assistant web app. My demo builds on the concepts shown in the service's HTML + JS gist.

Your app will do the following:

  • Accept a written command in an input field, submitting that command when you press the Enter key.
  • Or, using the HTML5 Speech Recognition API (this only works in Google Chrome 25 and above), if the user clicks "Speak", they can speak their commands and have them written into the input field automatically.
  • Once the command has been received, you can use jQuery to submit an AJAX POST request to the service. It will return its knowledge as a JSON object, just as you saw above in the test console.
  • You will read in that JSON using JavaScript and display the results in your web app.
  • If available, your web app will also use the Web Speech API (available in Google Chrome 33 and above) to respond to you verbally.

The entire web app is available at the link above. Feel free to refer to it to see how I have styled things and structured the HTML. In this article I will mainly explain how each piece fits together, focusing on the SDK side of things. I will also briefly point out which bits use the HTML5 Speech Recognition API and the Web Speech API.

Your JavaScript contains the following variables:
var accessToken = "YOURACCESSTOKEN",
    baseUrl = "",
    $speechInput,  // the <input> element
    $recBtn,       // the recording <button> element
    recognition,   // holds the webkitSpeechRecognition instance while listening
    messageRecording = "Recording...",
    messageCouldntHear = "I could not hear you, could you say that again?",
    messageInternalError = "Oh no, there has been an internal server error",
    messageSorry = "I'm sorry, I don't have the answer to that yet.";
Here is what each of these is for:

  • accessToken. This is the API key that you copied from the interface. It gives you permission to access the SDK and also says which agent you are accessing. I want to access Barry, my personal agent.
  • baseUrl. This is the base URL for all calls to the SDK. If a new version of the SDK comes out, you can update the URL here.
  • $speechInput. This stores your <input> element so that you can access it in your JavaScript.
  • $recBtn. This stores your <button> element, which users click when they would rather speak to the web app than type.
  • recognition. You store your webkitSpeechRecognition() functionality in this variable. This is for the HTML5 Speech Recognition API.
  • messageRecording, messageCouldntHear, messageInternalError and messageSorry. These are the messages to show when the app is recording the user's voice, cannot hear their voice, has an internal error, and when your agent does not understand. You store these as variables so that you can easily change them at the top of your script, and also so that you can specify which ones you do not want the app to speak aloud later on.

In these lines of code, look for when the user presses the Enter key in the input field. When that happens, run the send() function to send the data off to the service:

$speechInput.keypress(function(event) {
  if (event.which == 13) { send(); } // 13 is the key code for Enter
});
After this, watch for the user clicking the recording button to ask the app to listen to them (or to stop listening if it was already listening). If they click it, run the switchRecognition() function to switch from recording to not recording and vice versa:

$recBtn.on("click", function(event) { switchRecognition(); });
Finally, for your initial jQuery setup, you set up a button at the bottom right of your screen that shows and hides the JSON response. This is just to keep things clean: most of the time you will not want to see the JSON data, but every now and then, if something unexpected happens, you can click this button to toggle whether the JSON is visible or not:

$(".debug__btn").on("click", function() {
  $(this).next().toggle(); // assumption: the JSON panel is the button's next sibling
  return false;
});

Using the HTML5 Speech Recognition API:

As mentioned above, you will use the HTML5 Speech Recognition API to listen to the user and transcribe what they say. It currently works only in Google Chrome.

Our startRecognition() function looks like this:

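function startRecognition() {
  recognition = new webkitSpeechRecognition();
  // Runs when recording starts: tell the user and update the button
  recognition.onstart = function(event) { respond(messageRecording); updateRec(); };
  recognition.onresult = function(event) {
    recognition.onend = null; // a result arrived, so skip the "could not hear" handler
    var text = "";
    for (var i = event.resultIndex; i < event.results.length; ++i) {
      text += event.results[i][0].transcript;
    }
    setInput(text); // fill the input field and run send()
    stopRecognition();
  };
  // Only reached without a result: the user was not understood
  recognition.onend = function() { respond(messageCouldntHear); stopRecognition(); };
  recognition.lang = "en-US";
  recognition.start();
}

This runs the HTML5 Speech Recognition API. It all uses functions within webkitSpeechRecognition(). Here are a few pointers for what is going on: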

  • recognition.onstart. Runs when recording from the user's microphone begins. You use the respond() function to tell the user that you are listening to them. I will cover the respond() function in more detail soon. updateRec() switches the text of your recording button between "Pause" and "Speak".

  • recognition.onresult. Runs when you have a result from the voice recognition. You parse the result and set your text field to use that result via setInput() (this function just adds the text to the input field and then runs your send() function).

  • recognition.onend. Runs when the voice recognition ends. You set this to null in recognition.onresult to prevent it from running if you get a successful result. This way, if recognition.onend runs, you know the voice recognition API has not understood the user. If the function does run, you respond to the user to tell them you did not hear them correctly.

  • recognition.lang. Sets the language you are looking for. In the demo's case, it is looking for US English.

  • recognition.start(). Starts that whole process!
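
The transcript-building loop inside recognition.onresult can be tried without a browser. Below is a minimal sketch, assuming an event object with the same resultIndex and results shape the handler reads; the helper name collectTranscript and the mock event are hypothetical and only there for illustration.

```javascript
// Hypothetical helper: joins the transcripts of every new result in a
// speech recognition event, starting from event.resultIndex.
function collectTranscript(event) {
  var text = "";
  for (var i = event.resultIndex; i < event.results.length; ++i) {
    // Each result is a list of alternatives; index 0 is the first one
    text += event.results[i][0].transcript;
  }
  return text;
}

// A mock event standing in for what the browser would pass to onresult
var mockEvent = {
  resultIndex: 1,
  results: [
    [{ transcript: "ignored earlier result" }],
    [{ transcript: "how are " }],
    [{ transcript: "you" }]
  ]
};

console.log(collectTranscript(mockEvent)); // prints "how are you"
```

Starting at resultIndex rather than 0 means results that were already handled are not added to the query a second time.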

Your stopRecognition() function is much simpler. It stops your recognition and sets it to null. Then, it updates the button to show that you are no longer recording:
function stopRecognition() {
  if (recognition) { recognition.stop(); }
  recognition = null;
  updateRec(); // show that you are no longer recording
}
switchRecognition() toggles whether you are starting or stopping recognition by checking the recognition variable. This lets your button toggle recognition on and off:
function switchRecognition() {
  if (recognition) { stopRecognition(); } // already listening, so stop
  else { startRecognition(); }            // not listening, so start
}
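
The same on/off toggle can be sketched in a form that runs outside the browser, which makes the state handling easier to see. This is only an illustration under my own assumptions: createRecognitionToggle and the fake recognizer are hypothetical names, and the real demo keeps its state in the global recognition variable instead.

```javascript
// Hypothetical sketch: the same start/stop toggle, but with the recognizer
// passed in so it can be exercised without a browser.
function createRecognitionToggle(createRecognizer) {
  var recognition = null;
  return {
    isRecording: function() { return recognition !== null; },
    toggle: function() {
      if (recognition) {
        // Currently listening: stop and forget the recognizer
        recognition.stop();
        recognition = null;
      } else {
        // Not listening: create a recognizer and start it
        recognition = createRecognizer();
        recognition.start();
      }
      return recognition !== null;
    }
  };
}

// A fake recognizer that just records which methods were called
var calls = [];
var recordingToggle = createRecognitionToggle(function() {
  return {
    start: function() { calls.push("start"); },
    stop: function() { calls.push("stop"); }
  };
});

recordingToggle.toggle(); // starts recording
recordingToggle.toggle(); // stops recording
console.log(calls.join(", ")); // prints "start, stop"
```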

Communicating With the Service

To send your query to the service, you use the send() function, which looks like this:

function send() {
  var text = $speechInput.val();
  $.ajax({
    type: "POST",
    url: baseUrl + "query",
    contentType: "application/json; charset=utf-8",
    dataType: "json",
    headers: {
      "Authorization": "Bearer " + accessToken
    },
    data: JSON.stringify({query: text, lang: "en", sessionId: "runbarry"}),

    success: function(data) {
      prepareResponse(data);
    },
    error: function() {
      respond(messageInternalError);
    }
  });
}

This is a typical AJAX POST request to the service using jQuery. You make sure that you are sending JSON data and expecting JSON data back. You also need to set the Authorization header so that it carries your API key for the service. You send your data in the format {query: text, lang: "en", sessionId: "runbarry"} and wait for the response.
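
To see exactly what goes over the wire, the request can be built as a plain object before it is sent. The sketch below is illustrative only: buildQueryRequest is a hypothetical helper, and the base URL passed in is a placeholder rather than the service's real address.

```javascript
// Hypothetical helper: builds the settings object for the POST request
// described above without sending it.
function buildQueryRequest(baseUrl, accessToken, text, sessionId) {
  return {
    type: "POST",
    url: baseUrl + "query",
    contentType: "application/json; charset=utf-8",
    dataType: "json",
    // The access token travels in the Authorization header
    headers: { "Authorization": "Bearer " + accessToken },
    // The body is a JSON string holding the query, language and session
    data: JSON.stringify({ query: text, lang: "en", sessionId: sessionId })
  };
}

// Placeholder base URL and token, for illustration only
var request = buildQueryRequest("https://example.invalid/v1/", "YOURACCESSTOKEN", "How are you?", "runbarry");

console.log(request.url);
console.log(request.data);
```

In the browser, an object like this could be passed to jQuery's $.ajax() once success and error callbacks have been added to it.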

When you receive a response, you run prepareResponse(). In this function, you format the JSON string that you will put into the debug section of your web app, and you take out the result.fulfillment.speech part of the response, which gives you your assistant's text response. You display each message via respond() and debugRespond():
function prepareResponse(val) {
  var debugJSON = JSON.stringify(val, undefined, 2),
      // speech sits under result.fulfillment in the JSON response shown earlier
      spokenResponse = val.result.fulfillment.speech;
  respond(spokenResponse);
  debugRespond(debugJSON);
}

Your debugRespond() function puts the text into your field for the JSON response:
// Assumption: "#response" is the element that displays the JSON
function debugRespond(val) { $("#response").text(val); }
Your respond() function has a few more steps:
function respond(val) {
  if (val == "") {
    val = messageSorry;
  }

  if (val !== messageRecording) {
    var msg = new SpeechSynthesisUtterance();
    var voices = window.speechSynthesis.getVoices();
    msg.voiceURI = "native";
    msg.text = val;
    msg.lang = "en-US";
    window.speechSynthesis.speak(msg);
  }
  // Finally, display the response text on the page
}

To begin with, you check whether the response value is empty. If so, you set it to say that it is not sure of the answer to that question, as the service has not returned a valid response to you:
if (val == "") {
  val = messageSorry;
}
If you do have a message to output and it is not the one saying that you are recording, then you use the Web Speech API to say the message out loud using the SpeechSynthesisUtterance object. I found that without setting voiceURI and lang, my browser's default voice was German! This made its speech hard to understand. To actually speak the message, you use the window.speechSynthesis.speak(msg) function:
if (val !== messageRecording) {
  var msg = new SpeechSynthesisUtterance();
  msg.voiceURI = "native";
  msg.text = val;
  msg.lang = "en-US";
  window.speechSynthesis.speak(msg);
}
Note: It is important not to have it speak the "Recording..." text: if you do, the microphone will pick up that speech and add it to the recorded query.
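
The two decisions made in respond() (falling back to the "sorry" message, and staying silent for the recording notice) can be separated from the browser speech call so they are easy to check. This is a sketch under my own assumptions; planResponse and the messages object are hypothetical names, not part of the demo.

```javascript
// Hypothetical helper: works out what text to show and whether to speak it.
function planResponse(val, messages) {
  // An empty response means the agent had nothing valid to say
  var text = val === "" ? messages.sorry : val;
  return {
    text: text,
    // Never speak the "Recording..." notice, or the microphone would hear it
    shouldSpeak: text !== messages.recording
  };
}

var messages = {
  recording: "Recording...",
  sorry: "I'm sorry, I don't have the answer to that yet."
};

console.log(planResponse("", messages));
console.log(planResponse("Recording...", messages));
console.log(planResponse("I'm great!", messages));
```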

Finally, display your response box and add that text to it so that the user can read it too.

Hosting Your Web Interface:
For the best results, you may need to host it on an HTTPS-enabled web server. Your requests to the service are made over HTTPS, so it is much better to also host your web interface on HTTPS. If you are just looking to use this as a prototype and do not have an HTTPS-secured web server readily available, try a service that can host code snippets containing both front-end and back-end (Node.js) code.

As an example, I have Barry hosted this way too. Hosting is completely free at the moment! It is a great service and I highly recommend giving it a go.

If you want to make this a bigger project, consider either Let's Encrypt for a free SSL/TLS certificate or purchasing one from your web host.

In Action:
If you run the web app using my styles from the GitHub repo, it looks something like this:

If you ask it a question by clicking "Speak" and saying "How are you?", it initially shows that you are recording:

(When you click that button, you may need to give Chrome permission to access your microphone. This will apparently happen every time unless you serve the page over HTTPS.)

It then responds visually (and speaks it too, which is difficult to show in a screenshot) like so:

You can also click the button at the bottom right to see the JSON response the service gave you, just in case you would like to debug the result:

If you seem to be mainly getting the "I could not hear you, could you say that again?" message, check your microphone permissions in your browser. If you are loading the page locally (for example, if your address bar starts with file:///), Chrome does not seem to give any access to the microphone at all, and thus you will end up with this error no matter what! You will need to host it somewhere. (Try the hosting approach described above.)
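
If you want the app to warn about this case rather than fail silently, one option is to check the page's address up front. The snippet below is only a sketch: isLocalFilePage is a hypothetical helper, and it tests nothing more than whether an address uses the file: protocol.

```javascript
// Hypothetical helper: flags page addresses loaded from the local file
// system, where Chrome was found not to give microphone access.
function isLocalFilePage(address) {
  return new URL(address).protocol === "file:";
}

console.log(isLocalFilePage("file:///Users/me/barry/index.html")); // prints true
console.log(isLocalFilePage("https://example.com/barry/"));        // prints false
```

In the browser, the address to check would be window.location.href.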

Personally, I'm not a fan of some of the default phrases in small talk, like this one:

I have customized a bunch of those in the settings we looked at earlier. For example, I found this small talk statement in the list quite awkward and decided to customize it like so:

So get out there and customize your own chatbot! Make it unique and have fun!

Having Issues?

I found that occasionally, if the Web Speech API tried to say something too long, Chrome's speech stopped working. If this happens for you, close the tab and open a new one to try again.

Conclusion:
As I'm sure you can see, the service is a really simple way to get a chatbot-style AI personal assistant up and running.

Want to keep developing your bot? There is more that can be done: the full series is written up on allinonedownload.

If you build your own personal assistant using the service, I would love to hear about it! Did you name yours Barry too? What questions have you set up for it? Let me know in the comments below.
