Conversational AI Made Easy: Developing an ML FAQ Model Demo from Scratch Using Rasa and Docker

In today’s fast-paced digital era, conversational AI chatbots have emerged as a game-changer in delivering efficient and personalized user interactions. These artificially intelligent virtual assistants are designed to mimic human conversations, providing users with quick and relevant responses to queries.

A crucial aspect of building successful chatbots is the ability to handle frequently asked questions (FAQs) seamlessly. FAQs form a significant portion of user queries in various domains, such as customer support, e-commerce, and information retrieval. Being able to provide accurate and prompt answers to common questions not only improves user satisfaction but also frees up human agents to focus on more complex tasks. 

In this article, we’ll look at how to use the open source Rasa framework along with Docker to build and deploy a containerized, conversational AI chatbot.

Dark purple background with docker and rasa logos

Meet Rasa 

To tackle the challenge of FAQ handling, developers turn to sophisticated technologies like Rasa, an open source conversational AI framework. Rasa offers a comprehensive set of tools and libraries that empower developers to create intelligent chatbots with natural language understanding (NLU) capabilities. With Rasa, you can build chatbots that understand user intents, extract relevant information, and provide contextual responses based on the conversation flow.

Rasa allows developers to build and deploy conversational AI chatbots and provides a flexible architecture and powerful NLU capabilities (Figure 1).

Illustration of rasa architecture, showing connections between rasa sdk,  dialogue policies, nlu pipeline, agent, input/output channels, etc.
Figure 1: Overview of Rasa.

Rasa is a popular choice for building conversational AI applications, including chatbots and virtual assistants, for several reasons. For example, Rasa is an open source framework, which means it is freely available for developers to use, modify, and contribute to. It provides a flexible and customizable architecture that gives developers full control over the chatbot’s behavior and capabilities. 

Rasa’s NLU capabilities allow you to extract intent and entity information from user messages, enabling the chatbot to understand and respond appropriately. Rasa supports different language models and machine learning (ML) algorithms for accurate and context-aware language understanding. 

Rasa also incorporates ML techniques to train and improve the chatbot’s performance over time. You can train the model using your own training data and refine it through iterative feedback loops, resulting in a chatbot that becomes more accurate and effective with each interaction. 

Additionally, Rasa can scale to handle large volumes of conversations and can be extended with custom actions, APIs, and external services. This capability allows you to integrate additional functionalities, such as database access, external API calls, and business logic, into your chatbot.

Why containerizing Rasa is important

Containerizing Rasa brings several important benefits to the development and deployment process of conversational AI chatbots. Here are four key reasons why containerizing Rasa is important:

1. Docker provides a consistent and portable environment for running applications.

By containerizing Rasa, you can package the chatbot application, its dependencies, and runtime environment into a self-contained unit. This approach allows you to deploy the containerized Rasa chatbot across different environments, such as development machines, staging servers, and production clusters, with minimal configuration or compatibility issues. 

Docker simplifies the management of dependencies for the Rasa chatbot. By encapsulating all the required libraries, packages, and configurations within the container, you can avoid conflicts with other system dependencies and ensure that the chatbot has access to the specific versions of libraries it needs. This containerization eliminates the need for manual installation and configuration of dependencies on different systems, making the deployment process more streamlined and reliable.

2. Docker ensures the reproducibility of your Rasa chatbot’s environment.

By defining the exact dependencies, libraries, and configurations within the container, you can guarantee that the chatbot will run consistently across different deployments. 

3. Docker enables seamless scalability of the Rasa chatbot.

With containers, you can easily replicate and distribute instances of the chatbot across multiple nodes or servers, allowing you to handle high volumes of user interactions. 

4. Docker provides isolation between the chatbot and the host system and between different containers running on the same host.

This isolation ensures that the chatbot’s dependencies and runtime environment do not interfere with the host system or other applications. It also allows for easy management of dependencies and versioning, preventing conflicts and ensuring a clean and isolated environment in which the chatbot can operate.

Building an ML FAQ model demo application

By combining the power of Rasa and Docker, developers can create an ML FAQ model demo that excels in handling frequently asked questions. The demo can be trained on a dataset of common queries and their corresponding answers, allowing the chatbot to understand and respond to similar questions with high accuracy.

In this tutorial, you’ll learn how to build an ML FAQ model demo using Rasa and Docker. You’ll set up a development environment to train the model and then deploy the model using Docker. You will also see how to integrate a WebChat UI frontend for easy user interaction. Let’s jump in.

Getting started

The following key components are essential to completing this walkthrough:

Illustration showing connections between local folder, app folder, preinstalled rasa dependencies, etc.
Figure 2: Docker container with pre-installed Rasa dependencies and Volume mount point.

Deploying an ML FAQ demo app is a simple process involving the following steps:

  • Clone the repository
  • Set up the configuration files. 
  • Initialize Rasa.
  • Train and run the model. 
  • Bring up the WebChat UI app. 

We’ll explain each of these steps below.

Cloning the project

To get started, you can clone the repository by running the following command:
docker-ml-faq-rasa % tree -L 2
├── Dockerfile-webchat
├── actions
│   ├──
│   ├── __pycache__
│   └──
├── config.yml
├── credentials.yml
├── data
│   ├── nlu.yml
│   ├── rules.yml
│   └── stories.yml
├── docker-compose.yaml
├── domain.yml
├── endpoints.yml
├── index.html
├── models
│   ├── 20230618-194810-chill-idea.tar.gz
│   └── 20230619-082740-upbeat-step.tar.gz
└── tests
    └── test_stories.yml

6 directories, 16 files

Before we move to the next step, let’s look at each of the files one by one.

File: domain.yml

This file describes the chatbot’s domain and includes crucial data including intents, entities, actions, and answers. It outlines the conversation’s structure, including the user’s input, and the templates for the bot’s responses.

version: "3.1"

  - greet
  - goodbye
  - affirm
  - deny
  - mood_great
  - mood_unhappy
  - bot_challenge

  - text: "Hey! How are you?"

  - text: "Here is something to cheer you up:"
    image: ""

  - text: "Did that help you?"

  - text: "Great, carry on!"

  - text: "Bye"

  - text: "I am a bot, powered by Rasa."

  session_expiration_time: 60
  carry_over_slots_to_new_session: true

As shown previously, this configuration file includes intents, which represent the different types of user inputs the bot can understand. It also includes responses, which are the bot’s predefined messages for various situations. For example, the bot can greet the user, provide a cheer-up image, ask if the previous response helped, express happiness, say goodbye, or mention that it’s a bot powered by Rasa.

The session configuration sets the expiration time for a session (in this case, 60 seconds) and specifies whether the bot should carry over slots (data) from a previous session to a new session.

File: nlu.yml

The NLU training data are defined in this file. It includes input samples and the intents and entities that go with them. The NLU model, which connects user inputs to the right actions, is trained using this data.

version: "3.1"

- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - moin
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon

- intent: goodbye
  examples: |
    - cu
    - good by
    - cee you later
    - good night
    - bye
    - goodbye
    - have a nice day
    - see you around
    - bye bye
    - see you later

- intent: affirm
  examples: |
    - yes
    - y
    - indeed
    - of course
    - that sounds good
    - correct

- intent: deny
  examples: |
    - no
    - n
    - never
    - I don't think so
    - don't like that
    - no way
    - not really

- intent: mood_great
  examples: |
    - perfect
    - great
    - amazing
    - feeling like a king
    - wonderful
    - I am feeling very good
    - I am great
    - I am amazing
    - I am going to save the world
    - super stoked
    - extremely good
    - so so perfect
    - so good
    - so perfect

- intent: mood_unhappy
  examples: |
    - my day was horrible
    - I am sad
    - I don't feel very well
    - I am disappointed
    - super sad
    - I'm so sad
    - sad
    - very sad
    - unhappy
    - not good
    - not very good
    - extremly sad
    - so saad
    - so sad

- intent: bot_challenge
  examples: |
    - are you a bot?
    - are you a human?
    - am I talking to a bot?
    - am I talking to a human?

This configuration file defines several intents, which represent different types of user inputs that the chatbot can recognize. Each intent has a list of examples, which are example phrases or sentences that users might type or say to express that particular intent.

You can customize and expand upon this configuration by adding more intents and examples that are relevant to your chatbot’s domain and use cases.

File: stories.yml

This file is used to define the training stories, which serve as examples of user-chatbot interactions. A series of user inputs, bot answers, and the accompanying intents and entities make up each story.

version: "3.1"


- story: happy path
  - intent: greet
  - action: utter_greet
  - intent: mood_great
  - action: utter_happy

- story: sad path 1
  - intent: greet
  - action: utter_greet
  - intent: mood_unhappy
  - action: utter_cheer_up
  - action: utter_did_that_help
  - intent: affirm
  - action: utter_happy

- story: sad path 2
  - intent: greet
  - action: utter_greet
  - intent: mood_unhappy
  - action: utter_cheer_up
  - action: utter_did_that_help
  - intent: deny
  - action: utter_goodbye

The stories.yml file you provided contains a few training stories for a Rasa chatbot. These stories represent different conversation paths between the user and the chatbot. Each story consists of a series of steps, where each step corresponds to an intent or an action.

Here’s a breakdown of steps for the training stories in the file:

Story: happy path

  • User greets with an intent: greet
  • Bot responds with an action: utter_greet
  • User expresses a positive mood with an intent: mood_great
  • Bot acknowledges the positive mood with an action: utter_happy

Story: sad path 1

  • User greets with an intent: greet
  • Bot responds with an action: utter_greet
  • User expresses an unhappy mood with an intent: mood_unhappy
  • Bot tries to cheer the user up with an action: utter_cheer_up
  • Bot asks if the previous response helped with an action: utter_did_that_help
  • User confirms that it helped with an intent: affirm
  • Bot acknowledges the confirmation with an action: utter_happy

Story: sad path 2

  • User greets with an intent: greet
  • Bot responds with an action: utter_greet
  • User expresses an unhappy mood with an intent: mood_unhappy
  • Bot tries to cheer the user up with an action: utter_cheer_up
  • Bot asks if the previous response helped with an action: utter_did_that_help
  • User denies that it helped with an intent: deny
  • Bot says goodbye with an action: utter_goodbye

These training stories are used to train the Rasa chatbot on different conversation paths and to teach it how to respond appropriately to user inputs based on their intents.

File: config.yml

The configuration parameters for your Rasa project are contained in this file.

# The config recipe.
recipe: default.v1

# The assistant project unique identifier
# This default value must be replaced with a unique assistant name within your deployment
assistant_id: placeholder_default

# Configuration for Rasa NLU.
language: en

# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See for more information.
#   - name: WhitespaceTokenizer
#   - name: RegexFeaturizer
#   - name: LexicalSyntacticFeaturizer
#   - name: CountVectorsFeaturizer
#   - name: CountVectorsFeaturizer
#     analyzer: char_wb
#     min_ngram: 1
#     max_ngram: 4
#   - name: DIETClassifier
#     epochs: 100
#     constrain_similarities: true
#   - name: EntitySynonymMapper
#   - name: ResponseSelector
#     epochs: 100
#     constrain_similarities: true
#   - name: FallbackClassifier
#     threshold: 0.3
#     ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See for more information.
#   - name: MemoizationPolicy
#   - name: RulePolicy
#   - name: UnexpecTEDIntentPolicy
#     max_history: 5
#     epochs: 100
#   - name: TEDPolicy
#     max_history: 5
#     epochs: 100
#     constrain_similarities: true

The configuration parameters for your Rasa project are contained in this file. Here is a breakdown of the configuration file:

1. Assistant ID:

  • Assistant_id: placeholder_default
  • This placeholder value should be replaced with a unique identifier for your assistant.

2. Rasa NLU configuration:

  • Language: en
    • Specifies the language used for natural language understanding.
  • Pipeline:
    • Defines the pipeline of components used for NLU processing.
    • The pipeline is currently commented out, and the default pipeline is used.
    • The default pipeline includes various components like tokenizers, featurizers, classifiers, and response selectors.
    • If you want to customize the pipeline, you can uncomment the lines and adjust the pipeline configuration.

3. Rasa core configuration:

  • Policies:
    • Specifies the policies used for dialogue management.
    • The policies are currently commented out, and the default policies are used.
    • The default policies include memoization, rule-based, and TED (Transformer   Embedding Dialogue) policies
    • If you want to customize the policies, you can uncomment the lines and adjust the policy configuration.


The custom actions that your chatbot can execute are contained in this file. Retrieving data from an API, communicating with a database, or doing any other unique business logic are all examples of actions.

# This files contains your custom actions which can be used to run
# custom Python code.
# See this guide on how to implement these action:

# This is a simple example for a custom action which utters "Hello World!"

# from typing import Any, Text, Dict, List
# from rasa_sdk import Action, Tracker
# from rasa_sdk.executor import CollectingDispatcher
# class ActionHelloWorld(Action):
#     def name(self) -> Text:
#         return "action_hello_world"
#     def run(self, dispatcher: CollectingDispatcher,
#             tracker: Tracker,
#             domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
#         dispatcher.utter_message(text="Hello World!")
#         return []

Explanation of the code:

  • The ActionHelloWorld class extends the Action class provided by the rasa_sdk.
  • The name method defines the name of the custom action, which in this case is “action_hello_world”.
  • The run method is where the logic for the custom action is implemented.
  • Within the run method, the dispatcher object is used to send a message back to the user. In this example, the message sent is “Hello World!”.
  • The return [] statement indicates that the custom action has completed its execution.

File: endpoints.yml

The endpoints for your chatbot are specified in this file, including any external services or the webhook URL for any custom actions.  

Initializing Rasa

This command initializes a new Rasa project in the current directory ($(pwd)):

docker run  -p 5005:5005 -v $(pwd):/app rasa/rasa:3.5.2 init --no-prompt

It sets up the basic directory structure and creates essential files for a Rasa project, such as config.yml, domain.yml, and data/nlu.yml. The -p flag maps port 5005 inside the container to the same port on the host, allowing you to access the Rasa server. <IMAGE>:3.5.2 refers to the Docker image for the specific version of Rasa you want to use.

Training the model

The following command trains a Rasa model using the data and configuration specified in the project directory:

docker run -v $(pwd):/app rasa/rasa:3.5.2 train --domain domain.yml --data data --out models

The -v flag mounts the current directory ($(pwd)) inside the container, allowing access to the project files. The --domain domain.yml flag specifies the domain configuration file, --data data points to the directory containing the training data, and --out models specifies the output directory where the trained model will be saved.

Running the model

This command runs the trained Rasa model in interactive mode, enabling you to test the chatbot’s responses:

docker run -v $(pwd):/app rasa/rasa:3.5.2 shell

The command loads the trained model from the models directory in the current project directory ($(pwd)). The chatbot will be accessible in the terminal, allowing you to have interactive conversations and see the model’s responses.

Verify Rasa is running:

curl localhost:5005
Hello from Rasa: 3.5.2

Now you can send the message and test your model with curl:

curl --location 'http://localhost:5005/webhooks/rest/webhook' \
--header 'Content-Type: application/json' \
--data '{
 "sender": "Test person",
 "message": "how are you ?"}'

Running WebChat

The following command deploys the trained Rasa model as a server accessible via a WebChat UI:

docker run -p 5005:5005 -v $(pwd):/app rasa/rasa:3.5.2 run -m models --enable-api --cors "*" --debug

The -p flag maps port 5005 inside the container to the same port on the host, making the Rasa server accessible. The -m models flag specifies the directory containing the trained model. The --enable-api flag enables the Rasa API, allowing external applications to interact with the chatbot. The --cors "*" flag enables cross-origin resource sharing (CORS) to handle requests from different domains. The --debug flag enables debug mode for enhanced logging and troubleshooting.

docker run -p 8080:80 harshmanvar/docker-ml-faq-rasa:webchat

Open http://localhost:8080 in the browser (Figure 3).

Screenshot of sample chatbot conversation in browser.
Figure 3: WebChat UI.

Defining services using a Compose file

Here’s how our services appear within a Docker Compose file:

    image: rasa/rasa:3.5.2
      - 5005:5005
      - ./:/app
    command: run -m models --enable-api --cors "*" --debug
    image: harshmanvar/docker-ml-faq-rasa:webchat 
      context: .
      dockerfile: Dockerfile-webchat
      - 8080:80

Your sample application has the following parts:

  • The rasa service is based on the rasa/rasa:3.5.2 image. 
  • It exposes port 5005 to communicate with the Rasa API. 
  • The current directory (./) is mounted as a volume inside the container, allowing the Rasa project files to be accessible. 
  • The command run -m models --enable-api --cors "*" --debug starts the Rasa server with the specified options.
  • The webchat service is based on the harshmanvar/docker-ml-faq-rasa:webchat image. It builds the image using the Dockerfile-webchat file located in the current context (.). Port 8080 on the host is mapped to port 80 inside the container to access the webchat interface.

You can clone the repository or download the docker-compose.yml file directly from GitHub.

Bringing up the container services

You can start the WebChat application by running the following command:

docker compose up -d --build

Then, use the docker compose ps command to confirm that your stack is running properly. Your terminal will produce the following output:

docker compose ps
NAME                            IMAGE                                  COMMAND                  SERVICE             CREATED             STATUS              PORTS
docker-ml-faq-rassa-rasa-1      harshmanvar/docker-ml-faq-rasa:3.5.2   "rasa run -m models …"   rasa                6 seconds ago       Up 5 seconds>5005/tcp
docker-ml-faq-rassa-webchat-1   docker-ml-faq-rassa-webchat            "/docker-entrypoint.…"   webchat             6 seconds ago       Up 5 seconds>80/tcp

Viewing the containers via Docker Dashboard

You can also leverage the Docker Dashboard to view your container’s ID and easily access or manage your application (Figure 4):

Screenshot showing running containers in docker dashboard.
Figure 4: View containers with Docker Dashboard.


Congratulations! You’ve learned how to containerize a Rasa application with Docker. With a single YAML file, we’ve demonstrated how Docker Compose helps you quickly build and deploy an ML FAQ Demo Model app in seconds. With just a few extra steps, you can apply this tutorial while building applications with even greater complexity. Happy developing.

Check out Rasa on DockerHub.


0 thoughts on "Conversational AI Made Easy: Developing an ML FAQ Model Demo from Scratch Using Rasa and Docker"