Programmable Voice

Programmable Voice in Fonoster allows you to control the flow of a phone call using a set of verbs. Verbs work in conjunction with the VoiceServer to create a voice application.

Overview

The following is an example of how to create an application in Fonoster using the SDK:

create-app.js

const SDK = require("@fonoster/sdk");

const client = new SDK.Client({ accessKeyId: "WO000000000000000000000000000000" });

const appConfig = {
  name: "Custom Voice App",
  type: "EXTERNAL",
  endpoint: "welcome.demo.fonoster.local", // Built-in demo application
  speechToText: {
    productRef: "stt.deepgram",
    config: {
      languageCode: "en-US"
    }
  },
  textToSpeech: {
    productRef: "tts.deepgram",
    config: {
      voice: "aura-asteria-en"
    }
  }
}

// Authenticate with an API key, then create the application
client.loginWithApiKey("AP0eerv2g7qow3e950k7twu4rvydcunq3k", "fNc...")
  .then(() => new SDK.Applications(client).createApplication(appConfig))
  .catch(console.error);

The example above creates a new voice application using the SDK. The application uses Deepgram for both speech-to-text and text-to-speech, with the “aura-asteria-en” voice for text-to-speech.

So far, however, we have only told Fonoster the speech configuration and where the application lives, which is what the endpoint property represents.

You also need to run a VoiceServer using your application’s logic.

The Voice Server

The VoiceServer works similarly to an Express server. It accepts requests and returns responses. The VoiceServer processes verbs and executes the desired actions.

An example of running a VoiceServer in Fonoster:

voice-server.js

const VoiceServer = require("@fonoster/voice").default;

new VoiceServer().listen(async (req, response) => {
  // Verbs go here
  await response.answer();
  await response.say("Hello World!");
  await response.hangup();
});

As with Express, you can use the request object to access information about the call. For example, you can access the caller’s phone number with req.callerNumber.
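As a minimal sketch of reading the request object, the handler below builds its greeting from req.callerNumber. The greetingFor helper is hypothetical and not part of Fonoster; only req.callerNumber and the answer, say, and hangup verbs come from the examples above.

```javascript
// Hypothetical helper (not part of Fonoster): builds a greeting
// from the caller's number, with a fallback when it is missing.
function greetingFor(req) {
  const caller = req.callerNumber;
  return caller ? `Hello, caller ${caller}.` : "Hello!";
}

// Handler with the same (req, response) shape passed to VoiceServer.listen().
async function handler(req, response) {
  await response.answer();
  await response.say(greetingFor(req));
  await response.hangup();
}

console.log(greetingFor({ callerNumber: "+15555550100" }));
```

To try it against a real call, the handler could be passed to new VoiceServer().listen(handler).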

Verbs

Verbs are the building blocks of a voice application. They are used to control the flow of a phone call. Verbs are executed in the order they are called.

Here is a list of the available verbs in Fonoster:

Answer - Accepts an incoming call
Hangup - Closes the call
Play - Takes a URL with a media file and streams the sound back to the calling party
PlayDtmf - Takes a DTMF sequence and plays it back to the calling party
Say - Takes a text, synthesizes the text into audio, and streams back the result
Gather - Waits for DTMF or speech events and returns the result
SGather - Returns a stream for future DTMF and speech results
Stream - Creates a bidirectional stream to send and receive audio from a caller
Dial - Passes the call to an Agent or a number on the PSTN
Record - Records the voice of the calling party and saves the audio on the Storage sub-system
Mute - Tells the channel to stop sending media, effectively muting the channel
Unmute - Tells the channel to allow media flow

The Answer verb should be the first verb in your application, so run any setup code before calling it. Similarly, the Hangup verb should be the last verb in your application.
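The ordering rule can be checked without placing a call. The sketch below swaps the real response object for a hypothetical stand-in that only records which verbs were called. The createRecordingResponse and demo functions are illustrative and not part of Fonoster.

```javascript
// Stand-in for the response object a VoiceServer passes to the handler.
// It only records verb names and arguments; it does not talk to Fonoster.
function createRecordingResponse() {
  const calls = [];
  const record = (verb) => async (...args) => {
    calls.push({ verb, args });
  };
  return {
    calls,
    answer: record("answer"),
    say: record("say"),
    hangup: record("hangup")
  };
}

// Same shape as the handler passed to VoiceServer.listen().
async function handleCall(req, response) {
  // Setup code runs here, before Answer.
  const message = "Hello World!";
  await response.answer(); // first verb
  await response.say(message);
  await response.hangup(); // last verb
}

// Runs the handler against the stand-in and returns the verb order.
async function demo() {
  const response = createRecordingResponse();
  await handleCall({}, response);
  return response.calls.map((call) => call.verb);
}

demo().then((verbs) => console.log(verbs.join(" -> ")));
// prints "answer -> say -> hangup"
```

Because each verb is awaited, the next one only starts once the previous one has resolved, which is what keeps the recorded order the same as the order in the source.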

Speech settings

Programmable Voice applications support a variety of speech-to-text and text-to-speech vendors. The speechToText and textToSpeech objects allow you to define the speech-to-text and text-to-speech engines to use.

You can mix and match vendors to suit your needs. For example, you can use Deepgram for speech-to-text and Google for text-to-speech. Please check the Speech Vendors section for more information on configuring speech-to-text and text-to-speech.
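As a sketch of mixing vendors, the snippet below builds the speech portion of an application config from two independent choices. The buildSpeechConfig helper is hypothetical. The "tts.google" product reference and the voice name are assumptions that follow the naming pattern of the Deepgram example, so confirm the exact values in the Speech Vendors section.

```javascript
// Hypothetical helper: returns the speechToText/textToSpeech part
// of an application config from separate vendor choices.
function buildSpeechConfig({ stt, tts }) {
  return {
    speechToText: {
      productRef: stt.productRef,
      config: { languageCode: stt.languageCode }
    },
    textToSpeech: {
      productRef: tts.productRef,
      config: { voice: tts.voice }
    }
  };
}

const speech = buildSpeechConfig({
  // Taken from the create-app.js example.
  stt: { productRef: "stt.deepgram", languageCode: "en-US" },
  // Assumption: illustrative values, not confirmed by this page.
  tts: { productRef: "tts.google", voice: "en-US-Neural2-F" }
});

const mixedAppConfig = {
  name: "Custom Voice App",
  type: "EXTERNAL",
  endpoint: "welcome.demo.fonoster.local",
  ...speech
};

console.log(JSON.stringify(mixedAppConfig, null, 2));
```

The resulting object has the same shape as appConfig in create-app.js and could be passed to createApplication in its place.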

Exposing the VoiceServer with Ngrok

During development, you can use Ngrok to expose your VoiceServer to the internet. Ngrok creates a secure tunnel to your local machine. This allows you to test your voice application without deploying it to a server.

To use Ngrok, install it on your machine and run the following command:

ngrok tcp 50061

Replace 50061 with the port your VoiceServer is running on. Ngrok will print a forwarding address that you can use to reach your VoiceServer from the internet.
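For TCP tunnels, Ngrok reports the forwarding address in the form tcp://host:port. The sketch below strips the scheme to leave a host:port string. The ngrokAddressToEndpoint helper is hypothetical, and it is an assumption that the application's endpoint property accepts a host:port value, so verify that against the SDK documentation.

```javascript
// Hypothetical helper: converts an Ngrok TCP forwarding address
// such as "tcp://0.tcp.ngrok.io:12345" into "0.tcp.ngrok.io:12345".
function ngrokAddressToEndpoint(address) {
  const match = /^tcp:\/\/([^/:]+):(\d+)$/.exec(address.trim());
  if (!match) {
    throw new Error(`Unexpected Ngrok TCP address: ${address}`);
  }
  const [, host, port] = match;
  return `${host}:${port}`;
}

// The address below is a placeholder; use the one Ngrok prints for your tunnel.
console.log(ngrokAddressToEndpoint("tcp://0.tcp.ngrok.io:12345"));
// prints "0.tcp.ngrok.io:12345"
```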

See NPM for details

For full documentation, please visit NPM.
