Giving a Voice to Rasa with Botium Speech

Voice platforms like Alexa and Google Assistant make it easy to build your own voice experience without digging into audio processing, because everything is part of the platform. But what if you would rather host the solution yourself, running an assistant on your own website, in your own infrastructure?

The goal of this article is to show how you can build your own voice platform using the Open Source tools Rasa and Botium Speech Processing.

An awesome isometric visualization

Rasa is a developer-friendly and extensible chatbot building tool for self-hosting. Botium Speech Processing is a unified, developer-friendly API to the best available free and Open Source Speech-To-Text and Text-To-Speech services. Let’s combine the two, but first let’s have a quick look at the architecture.

Architecture

  1. User speaks into a microphone
  2. A Speech-To-Text service transcribes the speech into text (Botium Speech Processing)
  3. An NLU engine extracts information from the text (Rasa)
  4. A dialogue engine builds a text response (Rasa)
  5. A Text-To-Speech service converts the text response into speech (Botium Speech Processing)
  6. User listens to the audio
Picture is from this Rasa blog post
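
To make the data flow in the six steps concrete, here is a minimal Python sketch of a single voice turn. It is not code from Rasa or Botium Speech Processing: the names SpeechToText, DialogueEngine, TextToSpeech, VoiceTurn and handle_voice_turn are hypothetical, and the three stages are passed in as plain callables so that any service can be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type aliases for the three pluggable stages.
SpeechToText = Callable[[bytes], str]        # steps 1-2: audio in, transcript out
DialogueEngine = Callable[[str], List[str]]  # steps 3-4: transcript in, text replies out
TextToSpeech = Callable[[str], bytes]        # step 5: text in, audio out


@dataclass
class VoiceTurn:
    transcript: str      # what the Speech-To-Text stage understood
    replies: List[str]   # text responses from the bot
    audio: List[bytes]   # one audio clip per text response (step 6 plays these)


def handle_voice_turn(recording: bytes, stt: SpeechToText,
                      bot: DialogueEngine, tts: TextToSpeech) -> VoiceTurn:
    """Run one user utterance through the pipeline described above."""
    transcript = stt(recording)
    replies = bot(transcript)
    audio = [tts(reply) for reply in replies]
    return VoiceTurn(transcript, replies, audio)
```

In the setup built in this article, the stt and tts callables would wrap Botium Speech Processing and the bot callable would wrap Rasa.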

Installation Steps

So let’s get to the fun part.

Prerequisites

Here is what you need to have available on your workstation:

  • Git client
  • Docker and Docker-Compose

Launch Botium Speech Processing Service

Botium Speech Processing comes with a reasonable default configuration for its Speech-To-Text and Text-To-Speech engines.

Both engines are free and Open Source, a good match for getting started with voice technologies, and among the best free voice tools available.

Launching it can be done with a few command line calls.

$ git clone https://github.com/codeforequity-at/botium-speech-processing.git
$ cd botium-speech-processing
$ docker-compose up -d

Depending on network speed and hardware this step can take a while.

Pointing your browser to http://localhost will show the API explorer for Botium Speech Processing.
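
Once the API explorer is reachable, you can also call the service from code. The following Python sketch builds a Text-To-Speech request URL and downloads the result. The endpoint layout /api/tts/<language>?text=... is an assumption here, as are the helper names tts_url and synthesize, so please confirm the exact paths and parameters in the API explorer before relying on them.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BOTIUM_SPEECH_URL = "http://localhost"  # base URL used in this article


def tts_url(text: str, language: str = "en",
            base_url: str = BOTIUM_SPEECH_URL) -> str:
    """Build a Text-To-Speech URL.

    Assumed layout (verify in the API explorer): GET /api/tts/<language>?text=<text>
    """
    query = urlencode({"text": text})
    return f"{base_url.rstrip('/')}/api/tts/{quote(language)}?{query}"


def synthesize(text: str, out_path: str = "tts.wav") -> None:
    """Download the synthesized audio to a file (needs the running service)."""
    with urlopen(tts_url(text)) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

Calling synthesize("hello world") should then write an audio file next to your script, provided the service is up and the assumed endpoint matches.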

Setup Rasa

We will use Sara, the Rasa Demo Bot, as an example.

You can find first-hand information in the GitHub repository.

I prefer to use Docker instead of installing everything locally. So you can use these command line calls to download the Rasa demo bot and run a first training:

$ git clone https://github.com/RasaHQ/rasa-demo.git
$ cd rasa-demo
$ docker run --rm -v "$(pwd)":/app rasa/rasa:latest-full train --domain domain.yml --data data/core data/nlu --out models/dialogue --augmentation 0

Depending on network speed and hardware this step can take a while.

Place this docker-compose.yml file into the Rasa folder:

version: '3.0'
services:
  rasa:
    image: rasa/rasa:latest-full
    ports:
      - 5005:5005
    volumes:
      - ./:/app
    environment:
      RASA_DUCKLING_HTTP_URL: http://rasa-duckling:8000
    command: run --model models/dialogue --endpoints endpoints.yml
  rasa-actions:
    build:
      context: .
    ports:
      - 5055:5055
  rasa-duckling:
    image: rasa/duckling
    ports:
      - 8000:8000

In the file endpoints.yml change the actions endpoint url from http://localhost:5055/webhook to http://rasa-actions:5055/webhook. Now launch the Rasa service:

$ docker-compose up -d

The Rasa service is now waiting for connections.
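
Since the containers can need some time after startup before Rasa accepts requests, a small readiness check can be handy. The Python sketch below polls the server root on the published port 5005 until it answers with HTTP 200. The helper names fetch_status and wait_until_up, the retry count and the delay are illustrative assumptions.

```python
import time
from typing import Callable
from urllib.request import urlopen

RASA_URL = "http://localhost:5005"  # port published in docker-compose.yml


def fetch_status(url: str) -> int:
    """Return the HTTP status code for url, or 0 if the request fails."""
    try:
        with urlopen(url, timeout=5) as response:
            return response.status
    except OSError:  # covers URLError, HTTPError and connection resets
        return 0


def wait_until_up(url: str = RASA_URL, attempts: int = 30, delay: float = 2.0,
                  fetch: Callable[[str], int] = fetch_status,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll url until it answers with HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try
    return False
```

The fetch and sleep parameters exist only so the loop can be exercised without a running server; in normal use wait_until_up() with the defaults is enough.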

Add Voice Capabilities to Rasa

The Botium Speech Processing GitHub repository includes a custom connector, based on the Rasa built-in Socket.io connector, which adds Speech-To-Text and Text-To-Speech capabilities to Rasa.

First, clone the repository and copy the connectors folder to the Rasa folder:

$ git clone https://github.com/codeforequity-at/botium-speech-processing.git
$ cd botium-speech-processing
$ cp -R connectors <rasa-dir>

In the file connectors/rasa/credentials.yml, there is a sample configuration for the Rasa custom connector.

You can either use this file directly or copy the configuration of the botium.SocketIOVoiceInput connector to your existing Rasa credentials.yml.

Change the file to point to your local workstation for speech processing (it also starts a REST connector for convenience and other tests):

botium.SocketIOVoiceInput:
  socketio_path: /socket.io
  user_message_evt: user_uttered
  bot_message_evt: bot_uttered
  session_persistence: false
  botium_speech_url: http://localhost
  botium_speech_apikey:
  botium_speech_language: en
  botium_speech_voice: dfki-poppy-hsmm
rest:
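
One detail worth checking in this configuration: when Rasa runs inside a Docker container, as in this article, localhost inside that container refers to the container itself and not to your workstation. A botium_speech_url of http://localhost may therefore not reach the speech service from there, and your workstation's IP address or hostname may be needed instead. The following Python sketch flags such loopback URLs; the function name points_to_loopback is hypothetical and not part of the connector.

```python
import ipaddress
from urllib.parse import urlsplit


def points_to_loopback(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name), so it cannot be classified here.
        return False
```

If points_to_loopback returns True for the configured URL, it is worth testing from inside the Rasa container whether the speech service is actually reachable.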

Then, change the docker-compose.yml file for Rasa to use this connector.

version: '3.0'
services:
  rasa:
    image: rasa/rasa:latest-full
    ports:
      - 5005:5005
    volumes:
      - ./:/app
    environment:
      PYTHONPATH: "/app/connectors/rasa:/app"
      RASA_DUCKLING_HTTP_URL: http://rasa-duckling:8000
    command: run --cors "*" --credentials /app/connectors/rasa/credentials.yml --enable-api --model models/dialogue --endpoints endpoints.yml
  rasa-actions:
    build:
      context: .
    ports:
      - 5055:5055
  rasa-duckling:
    image: rasa/duckling
    ports:
      - 8000:8000

Restart Rasa to apply the changes to your Docker containers.

$ docker-compose up -d
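
Because the credentials file also enables the REST connector, you can verify that the bot answers in plain text before involving any audio. The Python sketch below posts a message to Rasa's REST webhook at /webhooks/rest/webhook and collects the text replies. The helper names build_request, reply_texts and ask, and the sender id "tester", are illustrative assumptions.

```python
import json
from typing import Any, Dict, List
from urllib.request import Request, urlopen

REST_WEBHOOK = "http://localhost:5005/webhooks/rest/webhook"


def build_request(message: str, sender: str = "tester") -> Request:
    """Create the POST request expected by Rasa's REST channel."""
    body = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return Request(REST_WEBHOOK, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


def reply_texts(replies: List[Dict[str, Any]]) -> List[str]:
    """Keep only the text parts of the bot replies (skips images, buttons, ...)."""
    return [reply["text"] for reply in replies if "text" in reply]


def ask(message: str) -> List[str]:
    """Send one message to the running bot and return its text replies."""
    with urlopen(build_request(message)) as response:
        return reply_texts(json.loads(response.read().decode("utf-8")))
```

With the containers running, print(ask("hello")) should show the bot's greeting; the exact wording depends on the trained demo bot.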

Testing

There is a simple test client based on the Rasa Voice Interface available in the Botium Speech Processing project.

In the connectors/rasa/client directory, change the Rasa endpoint in the docker-compose.yml file:

version: '3'
services:
  frontend:
    build:
      context: .
      args:
        RASA_ENDPOINT: http://localhost:5005
        RASA_PATH: /socket.io
        PUBLIC_PATH: /
    image: botium/botium-speech-rasa-voice
    restart: always
    ports:
      - 4700:8080

Then launch the website with “docker-compose up -d” and access the web interface at http://localhost:4700 to chat with your Rasa chatbot.

The voice interface in action

Now it is time to turn on your microphone and speakers and have a chat with Rasa!

Co-Founder and CTO Botium🤓 — Guitarist 🎸 — 3xFather 🐣
