Optimizing Deep Learning Workflows: Leveraging Stable Diffusion and Docker on WSL 2
https://www.docker.com/blog/stable-diffusion-and-docker-on-wsl2/

Deep learning has revolutionized the field of artificial intelligence (AI) by enabling machines to learn and generate content that mimics human-like creativity. One advancement in this domain is Stable Diffusion, a text-to-image model released in 2022.

Stable Diffusion has gained significant attention for its ability to generate highly detailed images conditioned on text descriptions, thereby opening up new possibilities in areas such as creative design, visual storytelling, and content generation. With its open source nature and accessibility, Stable Diffusion has become a go-to tool for many researchers and developers seeking to harness the power of deep learning. 

In this article, we will explore how to optimize deep learning workflows by leveraging Stable Diffusion alongside Docker on WSL 2, enabling seamless and efficient experimentation with this cutting-edge technology.

In this comprehensive guide, we will walk through the process of setting up the Stable Diffusion WebUI Docker project, which includes enabling WSL 2 and installing Docker Desktop. You will learn how to download the required code from GitHub and initialize it using Docker Compose.

The guide provides instructions on adding additional models and managing the system, covering essential tasks such as reloading the UI and determining the ideal location for saving image output. Troubleshooting steps and tips for monitoring hardware and GPU usage are also included, ensuring a smooth and efficient experience with Stable Diffusion WebUI (Figure 1).

Screenshot of Stable Diffusion WebUI showing five different cat images.
Figure 1: Stable Diffusion WebUI.

Why use Docker Desktop for Stable Diffusion?

In the realm of image-based generative AI, setting up an effective execution and development environment on a Windows PC can present particular challenges. These challenges arise due to differences in software dependencies, compatibility issues, and the need for specialized tools and frameworks. Docker Desktop emerges as a powerful solution to tackle these challenges by providing a containerization platform that ensures consistency and reproducibility across different systems.

By leveraging Docker Desktop, we can create an isolated environment that encapsulates all the necessary components and dependencies required for image-based generative AI workflows. This approach eliminates the complexities associated with manual software installations, conflicting library versions, and system-specific configurations.

Using Stable Diffusion WebUI

The Stable Diffusion WebUI is a browser interface that is built upon the Gradio library, offering a convenient way to interact with and explore the capabilities of Stable Diffusion. Gradio is a powerful Python library that simplifies the process of creating interactive interfaces for machine learning models.

Setting up the Stable Diffusion WebUI environment can be a tedious and time-consuming process, requiring multiple steps for environment construction. However, a convenient solution is available in the form of the Stable Diffusion WebUI Docker project, which eliminates the need for manual setup by providing a preconfigured, containerized environment.

If you’re using Windows and have Docker Desktop installed, you can effortlessly build and run the environment using the docker-compose command. You don’t have to worry about preparing libraries or dependencies beforehand because everything is encapsulated within the container.

You might wonder whether there are any problems because it’s running in a container. I was anxious before I started using it, but I haven’t had any particular problems so far. The images, models, variational autoencoders (VAEs), and other generated data are shared (bind mounted) with my Windows machine, so I can exchange files simply by dragging them in Explorer or in the Files tab of the target container in Docker Desktop.

The most trouble I had was when I disabled an extension without backing it up and, in a moment, blew away about 50GB of data that I had spent half a day training. (This is a joke!)

Architecture

I’ve compiled a relatively simple procedure to start with Stable Diffusion using Docker Desktop on Windows. 

Prerequisites:

  • Windows 10 Pro, 21H2 Build 19044.2846
  • 16GB RAM
  • NVIDIA GeForce RTX 2060 SUPER
  • WSL 2 (Ubuntu)
  • Docker Desktop 4.18.0 (104112)

Setup with Docker Compose

We will use the AUTOMATIC1111 WebUI to run Stable Diffusion this time. The environment will be built using Docker Compose. The main components are shown in Figure 2.

Illustration of execution environment using Automatic1111 showing Host, portmapping, Docker image, bind mount information, etc.
Figure 2: Configuration built using Docker Compose.

The configuration of Docker Compose is defined in docker-compose.yml. We are using a Compose extension called x-base_service to describe the major components common to each service.

To start, there are settings for bind mounts between the host and the container, including /data, which holds the models, and /output, where generated images are written. Then, we make the container recognize the GPU by loading the NVIDIA driver.

Furthermore, the auto service runs AUTOMATIC1111, the WebUI for Stable Diffusion, inside the container using the sd-auto image. Because the common service settings include a port mapping (TCP 7860) between the host and the container, the browser on the host can reach the WebUI running inside the container.
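
The following is a simplified sketch of how such a Compose file can be structured. It is not the full file from the repository, and the exact fields there may differ:

# Simplified sketch of the docker-compose.yml structure (not the full file)
x-base_service: &base_service
  ports:
    - "7860:7860"        # host:container port mapping for the WebUI
  volumes:
    - ./data:/data       # models, VAEs, and other inputs
    - ./output:/output   # generated images
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

services:
  auto:
    <<: *base_service
    profiles: ["auto"]
    image: sd-auto:51    # tag as shown later in this guide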

Getting Started

Prerequisite

WSL 2 must be activated and Docker Desktop installed.

On the first run, about 12GB of data, including the Stable Diffusion 1.5 models, is downloaded. The WebUI cannot be used until this download is complete. Depending on your connection, the first startup may take a long time.

Downloading the code

First, download the Stable Diffusion WebUI Docker code from GitHub. If you download it as a ZIP, click Code > Download ZIP and the stable-diffusion-webui-docker-master.zip file will be downloaded (Figure 3). 

Unzip the file in a convenient location. When you expand it, you will find a folder named stable-diffusion-webui-docker-master. Open the command line or similar and run the docker compose command inside it.

 Screenshot showing Stable Diffusion WebUI Docker being downloaded as ZIP file from GitHub.
Figure 3: Downloading the configuration for Docker Compose from the repository.

Or, if you have an environment where you can use Git, such as Git for Windows, it’s quicker to download it as follows:

git clone https://github.com/AbdBarho/stable-diffusion-webui-docker.git

In this case, the folder name is stable-diffusion-webui-docker. Change into it with cd stable-diffusion-webui-docker.

Supplementary information for those familiar with Docker

If you just want to get started, you can skip this section.

By default, the timezone is UTC. To adjust the time displayed in the logs and the date of the directory generated under output/txt2img to Japan time, add TZ=Asia/Tokyo to the environment variables of the auto service. Specifically, add it to the environment: section as follows.

auto: &automatic
    <<: *base_service
    profiles: ["auto"]
    build: ./services/AUTOMATIC1111
    image: sd-auto:51
    environment:
      - CLI_ARGS=--allow-code --medvram --xformers --enable-insecure-extension-access --api
      - TZ=Asia/Tokyo

Tasks at first startup

The rest of the process is as described in the GitHub documentation. Inside the folder where the code is expanded, run the following command:

docker compose --profile download up --build

After the command runs, the log of a container named webui-docker-download-1 will be displayed on the screen. For a while, the download will run as follows, so wait until it is complete:

webui-docker-download-1  | [DL:256KiB][#4561e1 1.4GiB/3.9GiB(36%)][#42c377 1.4GiB/3.9GiB(37%)]

If the process ends successfully, it will be displayed as exited with code 0 and returned to the original prompt:

…(snip)
webui-docker-download-1  | https://github.com/xinntao/Real-ESRGAN/blob/master/LICENSE 
webui-docker-download-1  | https://github.com/xinntao/ESRGAN/blob/master/LICENSE 
webui-docker-download-1  | https://github.com/cszn/SCUNet/blob/main/LICENSE 
webui-docker-download-1 exited with code 0

If a code other than 0 comes out like the following, the download process has failed:

webui-docker-download-1  | 42c377|OK  |   426KiB/s|/data/StableDiffusion/sd-v1-5-inpainting.ckpt 
webui-docker-download-1  | 
webui-docker-download-1  | Status Legend: 
webui-docker-download-1  | (OK):download completed.(ERR):error occurred. 
webui-docker-download-1  | 
webui-docker-download-1  | aria2 will resume download if the transfer is restarted. 
webui-docker-download-1  | If there are any errors, then see the log file. See '-l' option in help/man page for details. 
webui-docker-download-1 exited with code 24

In this case, run the command again and check whether it ends successfully. Once it finishes successfully, run the command to start the WebUI. 

Note: The following command starts AUTOMATIC1111’s WebUI with GPU support:

docker compose --profile auto up --build

When you run the command, loading the model on the first startup may take a few minutes. The process may look frozen at output like the following, but that’s okay:

webui-docker-auto-1  | LatentDiffusion: Running in eps-prediction mode
webui-docker-auto-1  | DiffusionWrapper has 859.52 M params.

If you wait for a while, the log will flow, and the following URL will be displayed:

webui-docker-auto-1  | Running on local URL:  http://0.0.0.0:7860

The WebUI is now ready. If you open http://127.0.0.1:7860 in the browser, you can see the WebUI. Once open, select an appropriate model from the top left of the screen, write some text in the text field, and select the Generate button to start generating images (Figure 4).

Screenshot showing text input and large orange "generate" button for creating images.
Figure 4: After selecting the model, input the prompt and generate the image.

After you click, the button switches to Interrupt and Skip controls. Wait until the process finishes (Figure 5).

Screenshot of gray "interrupt" and "skip" buttons.
Figure 5: Waiting until the image is generated.

At this time, the image-generation log appears in the terminal you are using, and you can see the same output in the container’s log in Docker Desktop (Figure 6).

Screenshot of progress log, showing 100% completion.
Figure 6: 100% indicates that the image generation is complete.

When the status reaches 100%, the generation of the image is finished, and you can check it on the screen (Figure 7).

Screenshot of Stable Diffusion WebUI showing "space cat" as text input with image of gray and white cat on glowing purple background.
Figure 7: After inputting “Space Cat” in the prompt, a cat image was generated at the bottom right of the screen.

The created images are automatically saved in the output/txt2img/date folder directly under the directory where you ran the docker compose command.

To stop the launched WebUI, press Ctrl+C in the terminal that is running the docker compose command.

Gracefully stopping... (press Ctrl+C again to force)
Aborting on container exit...
[+] Running 1/1
 ? Container webui-docker-auto-1  Stopped                                                     11.4s
canceled

When the process ends successfully, you will be able to run the command again. To use the WebUI again after restarting, re-run the docker compose command:

docker compose --profile auto up --build

To see the operating hardware status, use the task manager to look at the GPU status (Figure 8).

Screenshot of Windows Task Manager showing GPU status.
Figure 8: From the Performance tab of the Windows Task Manager, you can monitor the processing of CUDA and similar tasks on the GPU.

To check whether the GPU is visible from inside the container, run the nvidia-smi command via docker exec or from the container’s terminal in Docker Desktop.
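
For example, using the container name from this walkthrough, the check can be run directly from the host:

docker exec -it webui-docker-auto-1 nvidia-smi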

root@e37fcc5a5810:/stable-diffusion-webui# nvidia-smi 
Mon Apr 17 07:42:27 2023 
+---------------------------------------------------------------------------------------+ 
| NVIDIA-SMI 530.41.03              Driver Version: 531.41       CUDA Version: 12.1     | 
|-----------------------------------------+----------------------+----------------------+ 
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC | 
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. | 
|                                         |                      |               MIG M. | 
|=========================================+======================+======================| 
|   0  NVIDIA GeForce RTX 2060 S...    On | 00000000:01:00.0  On |                  N/A | 
| 42%   40C    P8                6W / 175W|   2558MiB /  8192MiB |      2%      Default | 
|                                         |                      |                  N/A | 
+-----------------------------------------+----------------------+----------------------+ 
+---------------------------------------------------------------------------------------+ 
| Processes:                                                                            | 
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory | 
|        ID   ID                                                             Usage      | 
|=======================================================================================| 
|    0   N/A  N/A       149      C   /python3.10                               N/A      | 
+---------------------------------------------------------------------------------------+

Adding models and VAEs

If you download a model that is not included from the beginning, place files with extensions such as .safetensors in stable-diffusion-webui-docker\data\StableDiffusion. In the case of VAEs, place .ckpt files in stable-diffusion-webui-docker\data\VAE.
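
For example, from a Windows command prompt, a downloaded model can be copied into place like this (the source file name is only a placeholder):

REM The source file name below is just a placeholder for a model you downloaded
copy my-downloaded-model.safetensors stable-diffusion-webui-docker\data\StableDiffusion\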

If you’re using Docker Desktop, you can browse and manage the files of the webui-docker-auto-1 container from its Files tab, so you can also drag files in through Docker Desktop.

Figure 9 shows the Docker Desktop screen. MOUNT in the Note column indicates that the folder is shared with the container from the Windows host side.

Screenshot of Docker Desktop showing blue MOUNT indicator under Note column.
Figure 9: From the Note column, you can see whether the folder is mounted or has been modified.

Now, after placing the file, select the Reload UI link shown in the footer of the WebUI (Figure 10).

Screenshot of Stable Diffusion WebUI showing Reload UI option.
Figure 10: By clicking Reload UI, the WebUI settings are reloaded.

When you select Reload UI, the system will show a loading screen, and the browser connection will be cut off. When you reload the browser, the model and VAE files are automatically loaded. To remove a model, delete the model file from data\StableDiffusion.

Conclusion

With Docker Desktop, image generation with the latest generative AI tooling is easier than ever. Typically, a lot of time and effort is required just to set up the environment, but Docker Desktop removes that complexity. If you’re interested, why not give generative AI a try? Enjoy!

Conversational AI Made Easy: Developing an ML FAQ Model Demo from Scratch Using Rasa and Docker
https://www.docker.com/blog/developing-using-rasa-and-docker/

In today’s fast-paced digital era, conversational AI chatbots have emerged as a game-changer in delivering efficient and personalized user interactions. These artificially intelligent virtual assistants are designed to mimic human conversations, providing users with quick and relevant responses to queries.

A crucial aspect of building successful chatbots is the ability to handle frequently asked questions (FAQs) seamlessly. FAQs form a significant portion of user queries in various domains, such as customer support, e-commerce, and information retrieval. Being able to provide accurate and prompt answers to common questions not only improves user satisfaction but also frees up human agents to focus on more complex tasks. 

In this article, we’ll look at how to use the open source Rasa framework along with Docker to build and deploy a containerized, conversational AI chatbot.

Meet Rasa 

To tackle the challenge of FAQ handling, developers turn to sophisticated technologies like Rasa, an open source conversational AI framework. Rasa offers a comprehensive set of tools and libraries that empower developers to create intelligent chatbots with natural language understanding (NLU) capabilities. With Rasa, you can build chatbots that understand user intents, extract relevant information, and provide contextual responses based on the conversation flow.

Rasa allows developers to build and deploy conversational AI chatbots and provides a flexible architecture and powerful NLU capabilities (Figure 1).

 Illustration of Rasa architecture, showing connections between Rasa SDK,  Dialogue Policies, NLU Pipeline, Agent, Input/output channels, etc.
Figure 1: Overview of Rasa.

Rasa is a popular choice for building conversational AI applications, including chatbots and virtual assistants, for several reasons. For example, Rasa is an open source framework, which means it is freely available for developers to use, modify, and contribute to. It provides a flexible and customizable architecture that gives developers full control over the chatbot’s behavior and capabilities. 

Rasa’s NLU capabilities allow you to extract intent and entity information from user messages, enabling the chatbot to understand and respond appropriately. Rasa supports different language models and machine learning (ML) algorithms for accurate and context-aware language understanding. 

Rasa also incorporates ML techniques to train and improve the chatbot’s performance over time. You can train the model using your own training data and refine it through iterative feedback loops, resulting in a chatbot that becomes more accurate and effective with each interaction. 

Additionally, Rasa can scale to handle large volumes of conversations and can be extended with custom actions, APIs, and external services. This capability allows you to integrate additional functionalities, such as database access, external API calls, and business logic, into your chatbot.

Why containerizing Rasa is important

Containerizing Rasa brings several important benefits to the development and deployment process of conversational AI chatbots. Here are four key reasons why containerizing Rasa is important:

1. Docker provides a consistent and portable environment for running applications.

By containerizing Rasa, you can package the chatbot application, its dependencies, and runtime environment into a self-contained unit. This approach allows you to deploy the containerized Rasa chatbot across different environments, such as development machines, staging servers, and production clusters, with minimal configuration or compatibility issues. 

Docker simplifies the management of dependencies for the Rasa chatbot. By encapsulating all the required libraries, packages, and configurations within the container, you can avoid conflicts with other system dependencies and ensure that the chatbot has access to the specific versions of libraries it needs. This containerization eliminates the need for manual installation and configuration of dependencies on different systems, making the deployment process more streamlined and reliable.

2. Docker ensures the reproducibility of your Rasa chatbot’s environment.

By defining the exact dependencies, libraries, and configurations within the container, you can guarantee that the chatbot will run consistently across different deployments. 

3. Docker enables seamless scalability of the Rasa chatbot.

With containers, you can easily replicate and distribute instances of the chatbot across multiple nodes or servers, allowing you to handle high volumes of user interactions. 

4. Docker provides isolation between the chatbot and the host system and between different containers running on the same host.

This isolation ensures that the chatbot’s dependencies and runtime environment do not interfere with the host system or other applications. It also allows for easy management of dependencies and versioning, preventing conflicts and ensuring a clean and isolated environment in which the chatbot can operate.

Building an ML FAQ model demo application

By combining the power of Rasa and Docker, developers can create an ML FAQ model demo that excels in handling frequently asked questions. The demo can be trained on a dataset of common queries and their corresponding answers, allowing the chatbot to understand and respond to similar questions with high accuracy.

In this tutorial, you’ll learn how to build an ML FAQ model demo using Rasa and Docker. You’ll set up a development environment to train the model and then deploy the model using Docker. You will also see how to integrate a WebChat UI frontend for easy user interaction. Let’s jump in.

Getting started

The key components essential to completing this walkthrough are shown in Figure 2.

 Illustration showing connections between Local folder, app folder, preinstalled Rasa dependencies, etc.
Figure 2: Docker container with pre-installed Rasa dependencies and Volume mount point.

Deploying an ML FAQ demo app is a simple process involving the following steps:

  • Clone the repository.
  • Set up the configuration files. 
  • Initialize Rasa.
  • Train and run the model. 
  • Bring up the WebChat UI app. 

We’ll explain each of these steps below.

Cloning the project

To get started, you can clone the repository by running the following command:

git clone https://github.com/dockersamples/docker-ml-faq-rasa.git
docker-ml-faq-rasa % tree -L 2
.
├── Dockerfile-webchat
├── README.md
├── actions
│   ├── __init__.py
│   ├── __pycache__
│   └── actions.py
├── config.yml
├── credentials.yml
├── data
│   ├── nlu.yml
│   ├── rules.yml
│   └── stories.yml
├── docker-compose.yaml
├── domain.yml
├── endpoints.yml
├── index.html
├── models
│   ├── 20230618-194810-chill-idea.tar.gz
│   └── 20230619-082740-upbeat-step.tar.gz
└── tests
    └── test_stories.yml

6 directories, 16 files

Before we move to the next step, let’s look at each of the files one by one.

File: domain.yml

This file describes the chatbot’s domain and includes crucial data such as intents, entities, actions, and responses. It outlines the conversation’s structure, including the user’s inputs and the templates for the bot’s responses.

version: "3.1"

intents:
  - greet
  - goodbye
  - affirm
  - deny
  - mood_great
  - mood_unhappy
  - bot_challenge

responses:
  utter_greet:
  - text: "Hey! How are you?"

  utter_cheer_up:
  - text: "Here is something to cheer you up:"
    image: "https://i.imgur.com/nGF1K8f.jpg"

  utter_did_that_help:
  - text: "Did that help you?"

  utter_happy:
  - text: "Great, carry on!"

  utter_goodbye:
  - text: "Bye"

  utter_iamabot:
  - text: "I am a bot, powered by Rasa."

session_config:
  session_expiration_time: 60
  carry_over_slots_to_new_session: true

As shown previously, this configuration file includes intents, which represent the different types of user inputs the bot can understand. It also includes responses, which are the bot’s predefined messages for various situations. For example, the bot can greet the user, provide a cheer-up image, ask if the previous response helped, express happiness, say goodbye, or mention that it’s a bot powered by Rasa.

The session configuration sets the expiration time for a session (in this case, 60 seconds) and specifies whether the bot should carry over slots (data) from a previous session to a new session.

File: nlu.yml

The NLU training data is defined in this file. It includes example inputs along with the intents and entities that go with them. This data is used to train the NLU model, which maps user messages to the right intents.

version: "3.1"

nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - moin
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon

- intent: goodbye
  examples: |
    - cu
    - good by
    - cee you later
    - good night
    - bye
    - goodbye
    - have a nice day
    - see you around
    - bye bye
    - see you later

- intent: affirm
  examples: |
    - yes
    - y
    - indeed
    - of course
    - that sounds good
    - correct

- intent: deny
  examples: |
    - no
    - n
    - never
    - I don't think so
    - don't like that
    - no way
    - not really

- intent: mood_great
  examples: |
    - perfect
    - great
    - amazing
    - feeling like a king
    - wonderful
    - I am feeling very good
    - I am great
    - I am amazing
    - I am going to save the world
    - super stoked
    - extremely good
    - so so perfect
    - so good
    - so perfect

- intent: mood_unhappy
  examples: |
    - my day was horrible
    - I am sad
    - I don't feel very well
    - I am disappointed
    - super sad
    - I'm so sad
    - sad
    - very sad
    - unhappy
    - not good
    - not very good
    - extremly sad
    - so saad
    - so sad

- intent: bot_challenge
  examples: |
    - are you a bot?
    - are you a human?
    - am I talking to a bot?
    - am I talking to a human?

This configuration file defines several intents, which represent different types of user inputs that the chatbot can recognize. Each intent has a list of examples, which are example phrases or sentences that users might type or say to express that particular intent.

You can customize and expand upon this configuration by adding more intents and examples that are relevant to your chatbot’s domain and use cases.
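
For example, to push the bot toward FAQ handling, you could add an intent of your own. The intent name and examples below are illustrative and not part of the sample project; you would also add a matching utter_ response in domain.yml and reference both in a rule or story before retraining:

# Hypothetical FAQ intent (not part of the sample project)
- intent: ask_what_is_docker
  examples: |
    - what is docker?
    - can you explain docker?
    - tell me about docker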

File: stories.yml

This file is used to define the training stories, which serve as examples of user-chatbot interactions. A series of user inputs, bot answers, and the accompanying intents and entities make up each story.

version: "3.1"

stories:

- story: happy path
  steps:
  - intent: greet
  - action: utter_greet
  - intent: mood_great
  - action: utter_happy

- story: sad path 1
  steps:
  - intent: greet
  - action: utter_greet
  - intent: mood_unhappy
  - action: utter_cheer_up
  - action: utter_did_that_help
  - intent: affirm
  - action: utter_happy

- story: sad path 2
  steps:
  - intent: greet
  - action: utter_greet
  - intent: mood_unhappy
  - action: utter_cheer_up
  - action: utter_did_that_help
  - intent: deny
  - action: utter_goodbye

The stories.yml file contains a few training stories for the Rasa chatbot. These stories represent different conversation paths between the user and the chatbot. Each story consists of a series of steps, where each step corresponds to an intent or an action.

Here’s a breakdown of steps for the training stories in the file:

Story: happy path

  • User greets with an intent: greet
  • Bot responds with an action: utter_greet
  • User expresses a positive mood with an intent: mood_great
  • Bot acknowledges the positive mood with an action: utter_happy

Story: sad path 1

  • User greets with an intent: greet
  • Bot responds with an action: utter_greet
  • User expresses an unhappy mood with an intent: mood_unhappy
  • Bot tries to cheer the user up with an action: utter_cheer_up
  • Bot asks if the previous response helped with an action: utter_did_that_help
  • User confirms that it helped with an intent: affirm
  • Bot acknowledges the confirmation with an action: utter_happy

Story: sad path 2

  • User greets with an intent: greet
  • Bot responds with an action: utter_greet
  • User expresses an unhappy mood with an intent: mood_unhappy
  • Bot tries to cheer the user up with an action: utter_cheer_up
  • Bot asks if the previous response helped with an action: utter_did_that_help
  • User denies that it helped with an intent: deny
  • Bot says goodbye with an action: utter_goodbye

These training stories are used to train the Rasa chatbot on different conversation paths and to teach it how to respond appropriately to user inputs based on their intents.
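
The data/rules.yml file listed in the repository tree plays a similar role for fixed behaviors that should always apply regardless of conversation history. As an illustration built from the intents and responses shown above (not necessarily the exact contents of the sample project), a rule that always answers a bot challenge could look like this:

version: "3.1"

rules:
# Illustrative rule built from the intents and responses shown above
- rule: Answer whenever the user challenges the bot
  steps:
  - intent: bot_challenge
  - action: utter_iamabot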

File: config.yml

The configuration parameters for your Rasa project are contained in this file.

# The config recipe.
# https://rasa.com/docs/rasa/model-configuration/
recipe: default.v1

# The assistant project unique identifier
# This default value must be replaced with a unique assistant name within your deployment
assistant_id: placeholder_default

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en

pipeline:
# # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
# # If you'd like to customize it, uncomment and adjust the pipeline.
# # See https://rasa.com/docs/rasa/tuning-your-model for more information.
#   - name: WhitespaceTokenizer
#   - name: RegexFeaturizer
#   - name: LexicalSyntacticFeaturizer
#   - name: CountVectorsFeaturizer
#   - name: CountVectorsFeaturizer
#     analyzer: char_wb
#     min_ngram: 1
#     max_ngram: 4
#   - name: DIETClassifier
#     epochs: 100
#     constrain_similarities: true
#   - name: EntitySynonymMapper
#   - name: ResponseSelector
#     epochs: 100
#     constrain_similarities: true
#   - name: FallbackClassifier
#     threshold: 0.3
#     ambiguity_threshold: 0.1

# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
# # No configuration for policies was provided. The following default policies were used to train your model.
# # If you'd like to customize them, uncomment and adjust the policies.
# # See https://rasa.com/docs/rasa/policies for more information.
#   - name: MemoizationPolicy
#   - name: RulePolicy
#   - name: UnexpecTEDIntentPolicy
#     max_history: 5
#     epochs: 100
#   - name: TEDPolicy
#     max_history: 5
#     epochs: 100
#     constrain_similarities: true

Here is a breakdown of the configuration file:

1. Assistant ID:

  • Assistant_id: placeholder_default
  • This placeholder value should be replaced with a unique identifier for your assistant.

2. Rasa NLU configuration:

  • Language: en
    • Specifies the language used for natural language understanding.
  • Pipeline:
    • Defines the pipeline of components used for NLU processing.
    • The pipeline is currently commented out, and the default pipeline is used.
    • The default pipeline includes various components like tokenizers, featurizers, classifiers, and response selectors.
    • If you want to customize the pipeline, you can uncomment the lines and adjust the pipeline configuration.

3. Rasa core configuration:

  • Policies:
    • Specifies the policies used for dialogue management.
    • The policies are currently commented out, and the default policies are used.
    • The default policies include memoization, rule-based, and TED (Transformer Embedding Dialogue) policies.
    • If you want to customize the policies, you can uncomment the lines and adjust the policy configuration.

File: actions.py

The custom actions that your chatbot can execute are contained in this file. Retrieving data from an API, communicating with a database, or doing any other unique business logic are all examples of actions.

# This files contains your custom actions which can be used to run
# custom Python code.
#
# See this guide on how to implement these action:
# https://rasa.com/docs/rasa/custom-actions


# This is a simple example for a custom action which utters "Hello World!"

# from typing import Any, Text, Dict, List
#
# from rasa_sdk import Action, Tracker
# from rasa_sdk.executor import CollectingDispatcher
#
#
# class ActionHelloWorld(Action):
#
#     def name(self) -> Text:
#         return "action_hello_world"
#
#     def run(self, dispatcher: CollectingDispatcher,
#             tracker: Tracker,
#             domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
#
#         dispatcher.utter_message(text="Hello World!")
#
#         return []

Explanation of the code:

  • The ActionHelloWorld class extends the Action class provided by the rasa_sdk.
  • The name method defines the name of the custom action, which in this case is “action_hello_world”.
  • The run method is where the logic for the custom action is implemented.
  • Within the run method, the dispatcher object is used to send a message back to the user. In this example, the message sent is “Hello World!”.
  • The return [] statement indicates that the custom action has completed its execution.

File: endpoints.yml

The endpoints for your chatbot are specified in this file, including any external services or the webhook URL for any custom actions.  
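
The full file is not reproduced here, but in a newly initialized Rasa project the action server entry is typically present and commented out until you run one. A minimal sketch looks like this:

# Sketch of a typical action server entry (commented out by default)
# action_endpoint:
#   url: "http://localhost:5055/webhook"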

Initializing Rasa

This command initializes a new Rasa project in the current directory ($(pwd)):

docker run  -p 5005:5005 -v $(pwd):/app rasa/rasa:3.5.2 init --no-prompt

It sets up the basic directory structure and creates essential files for a Rasa project, such as config.yml, domain.yml, and data/nlu.yml. The -p flag maps port 5005 inside the container to the same port on the host, allowing you to access the Rasa server. rasa/rasa:3.5.2 refers to the Docker image for the specific version of Rasa you want to use.

Training the model

The following command trains a Rasa model using the data and configuration specified in the project directory:

docker run -v $(pwd):/app rasa/rasa:3.5.2 train --domain domain.yml --data data --out models

The -v flag mounts the current directory ($(pwd)) inside the container, allowing access to the project files. The --domain domain.yml flag specifies the domain configuration file, --data data points to the directory containing the training data, and --out models specifies the output directory where the trained model will be saved.

Running the model

This command runs the trained Rasa model in interactive mode, enabling you to test the chatbot’s responses:

docker run -v $(pwd):/app rasa/rasa:3.5.2 shell

The command loads the trained model from the models directory in the current project directory ($(pwd)). The chatbot will be accessible in the terminal, allowing you to have interactive conversations and see the model’s responses.

Verify that Rasa is responding (note that this check requires the Rasa server to be running with port 5005 published, as in the run command shown in the next section):

curl localhost:5005
Hello from Rasa: 3.5.2

Now you can send the message and test your model with curl:

curl --location 'http://localhost:5005/webhooks/rest/webhook' \
--header 'Content-Type: application/json' \
--data '{
 "sender": "Test person",
 "message": "how are you ?"}'

Running WebChat

The following command deploys the trained Rasa model as a server accessible via a WebChat UI:

docker run -p 5005:5005 -v $(pwd):/app rasa/rasa:3.5.2 run -m models --enable-api --cors "*" --debug

The -p flag maps port 5005 inside the container to the same port on the host, making the Rasa server accessible. The -m models flag specifies the directory containing the trained model. The --enable-api flag enables the Rasa API, allowing external applications to interact with the chatbot. The --cors "*" flag enables cross-origin resource sharing (CORS) to handle requests from different domains. The --debug flag enables debug mode for enhanced logging and troubleshooting.

Next, in a separate terminal, start the WebChat UI container:

docker run -p 8080:80 harshmanvar/docker-ml-faq-rasa:webchat

Open http://localhost:8080 in the browser (Figure 3).

Screenshot of sample chatbot conversation in browser.
Figure 3: WebChat UI.

Defining services using a Compose file

Here’s how our services appear within a Docker Compose file:

services:
  rasa:
    image: rasa/rasa:3.5.2
    ports:
      - 5005:5005
    volumes:
      - ./:/app
    command: run -m models --enable-api --cors "*" --debug
  webchat:
    image: harshmanvar/docker-ml-faq-rasa:webchat 
    build:
      context: .
      dockerfile: Dockerfile-webchat
    ports:
      - 8080:80

Your sample application has the following parts:

  • The rasa service is based on the rasa/rasa:3.5.2 image. 
  • It exposes port 5005 to communicate with the Rasa API. 
  • The current directory (./) is mounted as a volume inside the container, allowing the Rasa project files to be accessible. 
  • The command run -m models --enable-api --cors "*" --debug starts the Rasa server with the specified options.
  • The webchat service is based on the harshmanvar/docker-ml-faq-rasa:webchat image. It builds the image using the Dockerfile-webchat file located in the current context (.). Port 8080 on the host is mapped to port 80 inside the container to access the webchat interface.

You can clone the repository or download the docker-compose.yml file directly from GitHub.

Bringing up the container services

You can start the WebChat application by running the following command:

docker compose up -d --build

Then, use the docker compose ps command to confirm that your stack is running properly. Your terminal will produce the following output:

docker compose ps
NAME                            IMAGE                                  COMMAND                  SERVICE             CREATED             STATUS              PORTS
docker-ml-faq-rassa-rasa-1      harshmanvar/docker-ml-faq-rasa:3.5.2   "rasa run -m models …"   rasa                6 seconds ago       Up 5 seconds        0.0.0.0:5005->5005/tcp
docker-ml-faq-rassa-webchat-1   docker-ml-faq-rassa-webchat            "/docker-entrypoint.…"   webchat             6 seconds ago       Up 5 seconds        0.0.0.0:8080->80/tcp

Viewing the containers via Docker Dashboard

You can also leverage the Docker Dashboard to view your container’s ID and easily access or manage your application (Figure 4):

Screenshot showing running containers in Docker dashboard.
Figure 4: View containers with Docker Dashboard.

Conclusion

Congratulations! You’ve learned how to containerize a Rasa application with Docker. With a single YAML file, we’ve demonstrated how Docker Compose helps you quickly build and deploy an ML FAQ Demo Model app in seconds. With just a few extra steps, you can apply this tutorial while building applications with even greater complexity. Happy developing.

Check out Rasa on Docker Hub.

Full-Stack Reproducibility for AI/ML with Docker and Kaskada
https://www.docker.com/blog/full-stack-reproducibility-for-ai-ml-with-docker-kaskada/

Docker is used by millions of developers to optimize the setup and deployment of development environments and application stacks. As artificial intelligence (AI) and machine learning (ML) become key components of many applications, the core benefits of Docker are now doing more heavy lifting and accelerating the development cycle.

Gartner predicts that “by 2027, over 90% of new software applications that are developed in the business will contain ML models or services as enterprises utilize the massive amounts of data available to the business.”

This article, written by our partner DataStax, outlines how Kaskada, open source, and Docker are helping developers optimize their AI/ML efforts.

Introduction

As a data scientist or machine learning practitioner, your work is all about experimentation. You start with a hunch about the story your data will tell, but often you’ll only find an answer after false starts and failed experiments. The faster you can iterate and try things, the faster you’ll get to answers. In many cases, the insights gained from solving one problem are applicable to other related problems. Experimentation can lead to results much faster when you’re able to build on the prior work of your colleagues.

But there are roadblocks to this kind of collaboration. Without the right tools, data scientists waste time managing code dependencies, resolving version conflicts, and repeatedly going through complex installation processes. Building on the work of colleagues can be hard due to incompatible environments — the dreaded “it works for me” syndrome.

Enter Docker and Kaskada, which offer a similar solution to these different problems: a declarative language designed specifically for the problem at hand and an ecosystem of supporting tools (Figure 1).

Illustration showing representation of Dockerfile defining steps to build a reproducible dev environment.
Figure 1: Dockerfile defines the build steps.

Docker provides the Dockerfile format, which describes the exact steps needed to build a reproducible development environment, along with an ecosystem of tools for working with containers (Docker Hub, Docker Desktop, Kubernetes, etc.). With Docker, data scientists can package their code and dependencies into an image that can run as a container on any machine, eliminating the need for complex installation processes and ensuring that colleagues can work with the exact same development environment.

With Kaskada, data scientists can compute and share features as code and use those throughout the ML lifecycle — from training models locally to maintaining real-time features in production. The computations required to produce these datasets are often complex and expensive because standard tools like Spark have difficulty reconstructing the temporal context required for training real-time ML models.

Kaskada solves this problem by providing a way to compute features — especially those that require reasoning about time — and sharing feature definitions as code. This approach allows data scientists to collaborate with each other and with machine learning engineers on feature engineering and reuse code across projects. Increased reproducibility dramatically speeds cycle times to get models into production, increases model accuracy, and ultimately improves machine learning results.

Example walkthrough

Let’s see how Docker and Kaskada improve the machine learning lifecycle by walking through a simplified example. Imagine you’re trying to build a real-time model for a mobile game and want to predict an outcome, for example, whether a user will pay for an upgrade.

Setting up your experimentation environment

To begin, start a Docker container that comes preinstalled with Jupyter and Kaskada:

docker run --rm -p 8888:8888 kaskadaio/jupyter
open <jupyter URL from logs> 

This step instantly gives you a reproducible development environment to work in, but you might want to customize this environment. Additional development tools can be added by creating a new Dockerfile using this image as the “base image”:

# Dockerfile
FROM kaskadaio/jupyter

COPY requirements.txt ./
RUN pip install -r requirements.txt

In this example, you started with Jupyter and Kaskada, copied over a requirements file, and installed all the dependencies in it. You now have a new Docker image that you can use as a data science workbench and share across your organization: anyone with this Dockerfile can reproduce the same environment by building and running it.

docker build -t experimentation_env .
docker run --rm -p 8888:8888 experimentation_env

The power of Docker comes from the fact that you’ve created a file that describes your environment and now you can share this file with others.
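
Beyond sharing the Dockerfile itself, the built image can also be pushed to a registry so colleagues can pull it directly. The repository name below is only an example:

# "your-org/experimentation-env" is a placeholder repository name
docker tag experimentation_env your-org/experimentation-env:latest
docker push your-org/experimentation-env:latest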

Training your model

Inside a new Jupyter notebook, you can begin the process of exploring solutions to the problem — predicting purchase behavior. To begin, you’ll create tables to organize the different types of events produced by the imaginary game.

%pip install kaskada
%load_ext fenlmagic

# Import paths below assume the Kaskada Python client
from kaskada.api.session import LocalBuilder
from kaskada import table

session = LocalBuilder().build()

table.create_table(
    table_name = "GamePlay",
    time_column_name = "timestamp",
    entity_key_column_name = "user_id",
)
table.create_table(
    table_name = "Purchase",
    time_column_name = "timestamp",
    entity_key_column_name = "user_id",
)

table.load(
    table_name = "GamePlay",
    file = "historical_game_play_events.parquet",
)
table.load(
    table_name = "Purchase",
    file = "historical_purchase_events.parquet",
)

Kaskada is easy to install and use locally. After installing, you’re ready to start creating tables and loading event data into them. Kaskada’s vectorized engine is built for high-performance local execution, and, in many cases, you can start experimenting on your data locally, without the complexity of managing distributed compute clusters.

Kaskada’s query language was designed to make it easy for data scientists to build features and training examples directly from raw event data. A single query can replace complex ETL and pre-aggregation pipelines, and Kaskda’s unique temporal operations unlock native time travel for building training examples “as of” important times in the past.

%% fenl --var training

# Create views derived from the source tables
let GameVictory = GamePlay | when(GamePlay.won)
let GameDefeat = GamePlay | when(not GamePlay.won)

# Compute some features as inputs to our model
let features = {
  loss_duration: sum(GameVictory.duration),
  purchase_count: count(Purchase),
}

# Observe our features at the time of a player's second victory
let example = features
  | when(count(GameDefeat, window=since(GameVictory)) == 2)
  | shift_by(hours(1))

# Compute a target value
# In this case comparing purchase count at prediction and label time
let target = count(Purchase) > example.purchase_count

# Combine feature and target values computed at the different times
in extend(example, {target})

In the preceding example, you first apply filtering to the events, build simple features, observe them at the points in time when your model will be used to make predictions, and then combine the features with the value you want to predict, computed an hour later. Kaskada lets you describe all these operations “from scratch,” starting with raw events and ending with an ML training dataset.

from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing

X = training.dataframe[['loss_duration']]
y = training.dataframe['target']

scaler = preprocessing.StandardScaler().fit(X)
X_scaled = scaler.transform(X)

model = LogisticRegression(max_iter=1000)
model.fit(X_scaled, y)

Kaskada’s query language makes it easy to write an end-to-end transformation from raw events to a training dataset.
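
Once trained, the same scaler and model can be applied to newly computed feature rows to score users. In this sketch, new_features stands in for a DataFrame produced by re-running the feature query on fresh events:

# new_features is assumed to be a DataFrame with the same feature columns
new_X = new_features[['loss_duration']]
purchase_probability = model.predict_proba(scaler.transform(new_X))[:, 1]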

Conclusion

Docker and Kaskada enable data scientists and ML engineers to solve real-time ML problems quickly and reproducibly. With Docker, you can manage your development environment with ease, ensuring that your code runs the same way on every machine. With Kaskada, you can collaborate with colleagues on feature engineering and reuse queries as code across projects. Whether you’re working independently or as part of a team, these tools can help you get answers faster and more efficiently than ever before.

Get started with Kaskada’s official images on Docker Hub.

Effortlessly Build Machine Learning Apps with Hugging Face’s Docker Spaces
https://www.docker.com/blog/build-machine-learning-apps-with-hugging-faces-docker-spaces/

The Hugging Face Hub is a platform that enables collaborative open source machine learning (ML). The hub works as a central place where users can explore, experiment, collaborate, and build technology with machine learning. On the hub, you can find more than 140,000 models, 50,000 ML apps (called Spaces), and 20,000 datasets shared by the community.

Using Spaces makes it easy to create and deploy ML-powered applications and demos in minutes. Recently, the Hugging Face team added support for Docker Spaces, enabling users to create any custom app they want by simply writing a Dockerfile.

Another great thing about Spaces is that once you have your app running, you can easily share it with anyone around the world. 🌍

This guide will step through the basics of creating a Docker Space, configuring it, and deploying code to it. We’ll show how to build a basic FastAPI app for text generation that will be used to demo the google/flan-t5-small model, which can generate text given input text. Models like this are used to power text completion in all sorts of apps. (You can check out a completed version of the app at Hugging Face.)

Prerequisites

To follow along with the steps presented in this article, you’ll need to be signed in to the Hugging Face Hub — you can sign up for free if you don’t have an account already.

Create a new Docker Space 🐳

To get started, create a new Space as shown in Figure 1.

Screenshot of Hugging Face Spaces, showing "Create new Space" button in upper right.
Figure 1: Create a new Space.

Next, you can choose any name you prefer for your project, select a license, and use Docker as the software development kit (SDK) as shown in Figure 2. 

Spaces provides pre-built Docker templates like Argilla and Livebook that let you quickly start your ML projects using open source tools. If you choose the “Blank” option, that means you want to create your Dockerfile manually. Don’t worry, though; we’ll provide a Dockerfile to copy and paste later. 😅

Screenshot of Spaces interface where you can add name, license, and select an SDK.
Figure 2: Adding details for the new Space.

When you finish filling out the form and click on the Create Space button, a new repository will be created in your Spaces account. This repository will be associated with the new space that you have created.

Note: If you’re new to the Hugging Face Hub 🤗, check out Getting Started with Repositories for a nice primer on repositories on the hub.

Writing the app

Ok, now that you have an empty space repository, it’s time to write some code. 😎

The sample app will consist of the following three files:

  • requirements.txt — Lists the dependencies of a Python project or application
  • app.py — A Python script where we will write our FastAPI app
  • Dockerfile — Sets up our environment, installs requirements.txt, then launches app.py

To follow along, create each file shown below via the web interface. To do that, navigate to your Space’s Files and versions tab, then choose Add file > Create a new file (Figure 3). Note that, if you prefer, you can also use Git.

Screenshot showing selection of "Create a new file" under "Add file" dropdown menu.
Figure 3: Creating new files.

Make sure that you name each file exactly as we have done here. Then, copy the contents of each file from the listings below and paste them into the corresponding file in the editor. After you have created and populated all the necessary files, commit each new file to your repository by clicking on the Commit new file to main button.

Listing the Python dependencies 

It’s time to list all the Python packages and their specific versions that are required for the project to function properly. The contents of the requirements.txt file typically include the name of the package and its version number, which can be specified in a variety of formats such as exact version numbers, version ranges, or compatible versions. The file lists FastAPI, requests, and uvicorn for the API along with sentencepiece, torch, and transformers for the text-generation model.

fastapi==0.74.*
requests==2.27.*
uvicorn[standard]==0.17.*
sentencepiece==0.1.*
torch==1.11.*
transformers==4.*

Defining the FastAPI web application

The following code defines a FastAPI web application that uses the transformers library to generate text based on user input. The app itself is a simple single-endpoint API. The /generate endpoint takes in text and uses a transformers pipeline to generate a completion, which it then returns as a response.

To give folks something to see, we reroute FastAPI’s interactive Swagger docs from the default /docs endpoint to the root of the app. This way, when someone visits your Space, they can play with it without having to write any code.

from fastapi import FastAPI
from transformers import pipeline

# Create a new FastAPI app instance
app = FastAPI()

# Initialize the text generation pipeline
# This function will be able to generate text
# given an input.
pipe = pipeline("text2text-generation", 
model="google/flan-t5-small")

# Define a function to handle the GET request at `/generate`
# The generate() function is defined as a FastAPI route that takes a
# string parameter called text. The function generates text based on the
# input using the pipeline() object, and returns a JSON response
# containing the generated text under the key "output"
@app.get("/generate")
def generate(text: str):
    """
    Using the text2text-generation pipeline from `transformers`, generate text
    from the given input text. The model used is `google/flan-t5-small`, which
    can be found [here](<https://huggingface.co/google/flan-t5-small>).
    """
    # Use the pipeline to generate text from the given input text
    output = pipe(text)
    
    # Return the generated text in a JSON response
    return {"output": output[0]["generated_text"]}

Writing the Dockerfile

In this section, we will write a Dockerfile that sets up a Python 3.9 environment, installs the packages listed in requirements.txt, and starts a FastAPI app on port 7860.

Let’s go through this process step by step:

FROM python:3.9

The preceding line specifies that we’re going to use the official Python 3.9 Docker image as the base image for our container. This image is available on Docker Hub and contains everything needed to run Python 3.9.

WORKDIR /code

This line sets the working directory inside the container to /code. This is where we’ll copy our application code and dependencies later on.

COPY ./requirements.txt /code/requirements.txt

The preceding line copies the requirements.txt file from our local directory to the /code directory inside the container. This file lists the Python packages that our application depends on.

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

This line uses pip to install the packages listed in requirements.txt. The --no-cache-dir flag tells pip to not use any cached packages, the --upgrade flag tells pip to upgrade any already-installed packages if newer versions are available, and the -r flag specifies the requirements file to use.

RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user \
	PATH=/home/user/.local/bin:$PATH

These lines create a new user named user with a user ID of 1000, switch to that user, and then set the home directory to /home/user. The ENV command sets the HOME and PATH environment variables. PATH is modified to include the .local/bin directory in the user’s home directory so that any binaries installed by pip will be available on the command line. Refer to the documentation to learn more about user permissions.

WORKDIR $HOME/app

This line sets the working directory inside the container to $HOME/app, which is /home/user/app.

COPY --chown=user . $HOME/app

The preceding line copies the contents of our local directory into the /home/user/app directory inside the container, setting the owner of the files to the user that we created earlier.

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]

This line specifies the command to run when the container starts. It starts the FastAPI app using uvicorn and listens on port 7860. The --host flag specifies that the app should listen on all available network interfaces, and the app:app argument tells uvicorn to look for the app object in the app module in our code.

Here’s the complete Dockerfile:

# Use the official Python 3.9 image
FROM python:3.9

# Set the working directory to /code
WORKDIR /code

# Copy the current directory contents into the container at /code
COPY ./requirements.txt /code/requirements.txt

# Install requirements.txt 
RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

# Set up a new user named "user" with user ID 1000
RUN useradd -m -u 1000 user
# Switch to the "user" user
USER user
# Set home to the user's home directory
ENV HOME=/home/user \
	PATH=/home/user/.local/bin:$PATH

# Set the working directory to the user's home directory
WORKDIR $HOME/app

# Copy the current directory contents into the container at $HOME/app setting the owner to the user
COPY --chown=user . $HOME/app

# Start the FastAPI app on port 7860, the default port expected by Spaces
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]

Once you commit this file, your space will switch to Building, and you should see the container’s build logs pop up so you can monitor its status. 👀

If you want to double-check the files, you can find all the files at our app Space.

Note: For a more basic introduction on using Docker with FastAPI, you can refer to the official guide from the FastAPI docs.

Using the app 🚀

If all goes well, your space should switch to Running once it’s done building, and the Swagger docs generated by FastAPI should appear in the App tab. Because these docs are interactive, you can try out the endpoint by expanding the details of the /generate endpoint and clicking Try it out! (Figure 4).

Screenshot of FastAPI showing "Try it out!" option on the right-hand side.
Figure 4: Trying out the app.
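
You can also call the endpoint programmatically. As a rough example, the snippet below queries the /generate route with Python’s requests library; swap in your own Space URL, or use http://localhost:7860 if you’re running the container locally:

import requests

# Replace the URL with your Space's address, or keep localhost for a local container
url = "http://localhost:7860/generate"
response = requests.get(url, params={"text": "How old is the universe?"})
print(response.json()["output"])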

Conclusion

This article covered the basics of creating a Docker Space, building and configuring a basic FastAPI app for text generation that uses the google/flan-t5-small model. You can use this guide as a starting point to build more complex and exciting applications that leverage the power of machine learning.

If you’re interested in learning more about Docker templates and seeing curated examples, check out the Docker Examples page. There you’ll find a variety of templates to use as a starting point for your own projects, as well as tips and tricks for getting the most out of Docker templates. Happy coding!

Docker and Hugging Face Partner to Democratize AI https://www.docker.com/blog/docker-and-hugging-face-partner-to-democratize-ai/ Thu, 23 Mar 2023 17:43:09 +0000 https://www.docker.com/?p=41645 Today, Hugging Face and Docker are announcing a new partnership to democratize AI and make it accessible to all software engineers. Hugging Face is the most used open platform for AI, where the machine learning (ML) community has shared more than 150,000 models; 25,000 datasets; and 30,000 ML apps, including Stable Diffusion, Bloom, GPT-J, and open source ChatGPT alternatives. These apps enable the community to explore models, replicate results, and lower the barrier of entry for ML — anyone with a browser can interact with the models.

Docker is the leading toolset for easy software deployment, from infrastructure to applications. Docker is also the leading platform for software teams’ collaboration.

Docker and Hugging Face partner so you can launch and deploy complex ML apps in minutes. With the recent support for Docker on Hugging Face Spaces, folks can create any custom app they want by simply writing a Dockerfile. What’s great about Spaces is that once you’ve got your app running, you can easily share it with anyone worldwide! 🌍 Spaces provides an unparalleled level of flexibility and enables users to build ML demos with their preferred tools — from MLOps tools and FastAPI to Go endpoints and Phoenix apps.

Spaces also come with pre-defined templates of popular open source projects for members that want to get their end-to-end project in production in a matter of seconds with just a few clicks.

Screen showing options to select the Space SDK, with Docker and 3 templates selected.

Spaces enable easy deployment of ML apps in all environments, not just on Hugging Face. With “Run with Docker,” millions of software engineers can access more than 30,000 machine learning apps and run them locally or in their preferred environment.

Screen showing app options, with Run with Docker selected.

“At Hugging Face, we’ve worked on making AI more accessible and more reproducible for the past six years,” says Clem Delangue, CEO of Hugging Face. “Step 1 was to let people share models and datasets, which are the basic building blocks of AI. Step 2 was to let people build online demos for new ML techniques. Through our partnership with Docker Inc., we make great progress towards Step 3, which is to let anyone run those state-of-the-art AI models locally in a matter of minutes.”

You can also discover popular Spaces in the Docker Hub and run them locally with just a couple of commands.

To get started, read Effortlessly Build Machine Learning Apps with Hugging Face’s Docker Spaces. Or try Hugging Face Spaces now.

How to Develop and Deploy a Customer Churn Prediction Model Using Python, Streamlit, and Docker https://www.docker.com/blog/how-to-develop-and-deploy-a-customer-churn-prediction-model-using-python-streamlit-and-docker/ Thu, 25 Aug 2022 14:21:50 +0000 https://www.docker.com/?p=36870

Customer churn is a million-dollar problem for businesses today. The SaaS market is becoming increasingly saturated, and customers can choose from plenty of providers. Retention and nurturing are challenging. Online businesses view customers as churn when they stop purchasing goods and services. Customer churn can depend on industry-specific factors, yet some common drivers include lack of product usage, contract tenure, and cheaper prices elsewhere.

Limiting churn strengthens your revenue streams. Businesses and marketers must predict and prevent customer churn to remain sustainable. The best way to do so is by knowing your customers. And spotting behavioral patterns in historical data can help immensely with this. So, how do we uncover them? 

Applying machine learning (ML) to customer data helps companies develop focused customer-retention programs. For example, a marketing department could use an ML churn model to identify high-risk customers and send promotional content to entice them. 

To enable these models to make predictions with new data, knowing how to package a model as a user-facing, interactive application is essential. In this blog, we’ll take an ML model from a Jupyter Notebook environment to a containerized application. We’ll use Streamlit as our application framework to build UI components and package our model. Next, we’ll use Docker to publish our model as an endpoint. 

Docker containerization helps make this application hardware-and-OS agnostic. Users can access the app from their browser through the endpoint, input customer details, and receive a churn probability in a fraction of a second. If a customer’s churn score exceeds a certain threshold, that customer may receive targeted push notifications and special offers. The diagram below puts this into perspective: 

Streamlit Docker Diagram

Why choose Streamlit?

Streamlit is an open source, Python-based framework for building UIs and powerful ML apps from a trained model. It’s popular among machine learning engineers and data scientists as it enables quick web-app development — requiring minimal Python code and a simple API. This API lets users create widgets using pure Python without worrying about backend code, routes, or requests. It provides several components that let you build charts, tables, and different figures to meet your application’s needs. Streamlit also utilizes models that you’ve saved or pickled into the app to make predictions.
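
As a taste of how little code a Streamlit UI needs, here is a minimal, stand-alone example (unrelated to the churn model) with a title, an input widget, and reactive output:

import streamlit as st

# Save as any filename, e.g. hello.py, and run with: streamlit run hello.py
st.title("Hello, Streamlit")
name = st.text_input("Customer name:")
if st.button("Greet"):
    st.success(f"Hello, {name}!")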

Conversely, alternative frameworks like FastAPI, Flask, and Shiny require a strong grasp of HTML/CSS to build interactive, frontend apps. Streamlit is the fastest way to build and share data apps. The Streamlit API is minimal and extremely easy to understand. Minimal changes to your underlying Python script are needed to create an interactive dashboard.

Getting Started

git clone https://github.com/dockersamples/customer-churnapp-streamlit

Key Components

  • An IDE or text editor 
  • Python 3.6+ 
  • PIP (or Anaconda)
  • Not required but recommended: An environment-management tool such as pipenv, venv, virtualenv, or conda
  • Docker Desktop

Before starting, install Python 3.6+. Afterwards, follow these steps to install all libraries required to run the model on your system. 

Our project directory structure should look like this:

$ tree
.
├── Churn_EDA_model_development.ipynb
├── Churn_model_metrics.ipynb
├── Dockerfile
├── Pipfile
├── Pipfile.lock
├── WA_Fn-UseC_-Telco-Customer-Churn.csv
├── train.py
├── requirements.txt
├── README.md
├── images
│   ├── churndemo.gif
│   ├── icone.png
│   └── image.png
├── model_C=1.0.bin
└── stream_app.py

Install project dependencies in a virtual environment 

We’ll use the Pipenv library to create a virtual Python environment and install the dependencies required to run Streamlit. The Pipenv tool automatically manages project packages through the Pipfile as you install or uninstall them. It also generates a Pipfile.lock file, which helps produce deterministic builds and creates a snapshot of your working environment. Follow these steps to get started.

1) Enter your project directory

cd customer-churnapp-streamlit

2) Install Pipenv

pip install pipenv

3) Install the dependencies

pipenv install

4) Enter the pipenv virtual environment

pipenv shell

After completing these steps, you can run scripts from your virtual environment! 

Building a simple machine-learning model

Machine learning uses algorithms and statistical models. These analyze historical data and make inferences from patterns without any explicit programming. Ultimately, the goal is to predict outcomes based on incoming data. 

In our case, we’re creating a model from historical customer data to predict which customers are likely to leave. Since we need to classify customers as either churn or no-churn, we’ll train a simple-yet-powerful classification model. Our model uses logistic regression on a telecom company’s historical customer dataset. This set tracks customer demographics, tenure, monthly charges, and more. However, one key question is also answered: did the customer churn? 

Logistic regression estimates an event’s probability based on a given dataset of independent variables. Since the outcome is a probability, the dependent variable is bounded between 0 and 1. The model will undergo multiple iterations and calculate best-fit coefficients for each variable. This quantifies just how much each impacts churn. With these coefficients, the model can assign churn likelihood scores between 0 and 1 to new customers. Someone who scores a 1 is extremely likely to churn. Someone with a 0 is incredibly unlikely to churn. 
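
To make that concrete, here’s a toy calculation (the numbers are made up, not taken from our dataset) showing how a weighted sum of features is squashed into a churn probability between 0 and 1 by the sigmoid function:

import numpy as np

# Made-up coefficients for two features: a month-to-month contract flag and tenure
weights = np.array([0.8, -0.05])
bias = -0.5
customer = np.array([1.0, 12.0])      # month-to-month contract, 12 months of tenure

score = weights @ customer + bias
churn_probability = 1 / (1 + np.exp(-score))   # sigmoid squashes the score into (0, 1)
print(round(float(churn_probability), 3))      # ~0.43 for this customer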

Python has great libraries like Pandas, NumPy, and Matplotlib that support data analytics. Open-source frameworks like Scikit Learn have pre-built wrappers for various ML models. We’ll use their API to train a logistic-regression model. To understand how this basic churn prediction model was born, refer to Churn_EDA_model_development.ipynb. ML models require many attempts to get right. Therefore, we recommend using a Jupyter notebook or an IDE. 

In a nutshell we performed the below steps to create our churn prediction model:

  1. Initial data preparation 
    • Perform sanity checks on data types and column names 
    • Make data type corrections if needed 
  2. Data and feature understanding 
    • Check the distribution of numerical features
    • Check the distinct values of categorical features 
    • Check the target feature distribution 
  3. Exploratory data analysis 
    • Handle missing values 
    • Handle outliers 
    • Understand correlations and identify spurious ones 
  4. Feature engineering and importance 
    • Analyze churn rate and risk scores across different cohorts and feature groups 
    • Calculate mutual information 
    • Check feature correlations 
  5. Encoding categorical features and scaling numerical features 
    • Convert categorical features into numerical values using Scikit-Learn’s helper function: Dictionary Vectoriser 
    • Scale numerical features to standardize them into a fixed range 
  6. Model training 
    • Select an appropriate ML algorithm 
    • Train the model with custom parameters 
  7. Model evaluation 
    • Refer to Churn_model_metrics.ipynb 
    • Use different metrics to evaluate the model like accuracy, confusion table, precision, recall, ROC curves, AUROC, and cross-validation.
  8. Repeat steps 6 and 7 for different algorithms and model hyperparameters, then select the best-fit model.

It’s best practice to automate the training process using a Python script. Each time we choose to retrain the model with a new parameter or a new dataset, we can execute this script and save the resulting model. 

Check out train.py to explore how to package a model into a script that automates model training! 
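
For reference, here’s a simplified sketch of what a training script along the lines of train.py can look like; the column handling and parameters are illustrative, so check the repository for the full version:

import pandas as pd
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the Telco churn dataset and build a binary target
df = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")
df.columns = df.columns.str.lower()
# totalcharges is read as text in the raw CSV; coerce it to a number
df["totalcharges"] = pd.to_numeric(df["totalcharges"], errors="coerce").fillna(0)
y = (df["churn"].str.lower() == "yes").astype(int)

# Treat each row as a dict of features; DictVectorizer one-hot encodes the categorical ones
records = df.drop(columns=["churn", "customerid"]).to_dict(orient="records")
dv = DictVectorizer(sparse=False)
X = dv.fit_transform(records)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=1)

model = LogisticRegression(C=1.0, max_iter=1000)
model.fit(X_train, y_train)
print("validation accuracy:", model.score(X_val, y_val))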

Once we uncover the best-fit model, we must save it to reuse it later without running any of the above training code scripts. Let’s get started.

Save the model

In machine learning, we save trained models in a file and restore them to compare each with other models. We can also test them using new data. The save process is called Serialization, while restoration is called Deserialization.

We use a helper Python library called Pickle to save the model. The Pickle module implements a fundamental, yet powerful, algorithm for serializing and de-serializing a Python object structure. 

You can also use the following functions: 

  • pickle.dump serializes an object hierarchy using dump().
  • pickle.load deserializes a data stream via the load() function.

We’ve chosen Pickle since it supports models created using the Scikit-Learn framework and offers great loading performance. Similar training frameworks like Tensorflow and Keras have their own built-in libraries for saving models, which are designed to perform well with their architectures. 

Dump the Model and Dictionary Vectorizer

import pickle

with open('model_C=1.0.bin', 'wb') as f_out:
    pickle.dump((dict_vectorizer, model), f_out)
    # The `with` statement closes the file automatically when the block exits

We just saved a binary file named model_C=1.0.bin containing both the dict_vectorizer (used for one-hot encoding) and the logistic regression model as a single tuple.

Create a new Python file

Now, we’ll create a stream_app.py script that defines both our app layout and the backend logic that activates when users interact with different UI components. Crucially, this file is reusable with any model.

This is just an example. We strongly recommend exploring more components and design options from the Streamlit library. If you’re skilled in HTML and JavaScript, you can create your own Streamlit components that grant you more control over your app’s layout. 

First, import the required libraries:

import pickle
import streamlit as st
import pandas as pd
from PIL import Image

Next, you’ll need to load the same binary file we saved earlier to deserialize the model and dictionary vectorizer.

model_file = 'model_C=1.0.bin'

with open(model_file, 'rb') as f_in:
    dv, model = pickle.load(f_in)

The following code snippet loads the images and displays them on your screen. The st.image portion helps display an image on the frontend:

image = Image.open('images/icone.png') 
image2 = Image.open('images/image.png')
  
st.image(image,use_column_width=False)

To display items in the sidebar, you’ll need the following code snippet:

add_selectbox = st.sidebar.selectbox("How would you like to predict?",
("Online", "Batch"))

st.sidebar.info('This app is created to predict Customer Churn')
st.sidebar.image(image2)

Streamlit’s sidebar renders a vertical, collapsible bar where users can select the type of model scoring they want to perform — like batch scoring (predictions for multiple customers) or online scoring (for single customers). We also add text and images to decorate the sidebar. 

The following code helps you display the main title:

st.title("Predicting Customer Churn")

When the user selects the ‘Online’ option, you can display input widgets to collect customer details and generate predictions:

if add_selectbox == 'Online':
  
		gender = st.selectbox('Gender:', ['male', 'female'])
		seniorcitizen= st.selectbox(' Customer is a senior citizen:', [0, 1])
		partner= st.selectbox(' Customer has a partner:', ['yes', 'no'])
		dependents = st.selectbox(' Customer has  dependents:', ['yes', 'no'])
		phoneservice = st.selectbox(' Customer has phoneservice:', ['yes', 'no'])
		multiplelines = st.selectbox(' Customer has multiplelines:', ['yes', 'no', 'no_phone_service'])
		internetservice= st.selectbox(' Customer has internetservice:', ['dsl', 'no', 'fiber_optic'])
		onlinesecurity= st.selectbox(' Customer has onlinesecurity:', ['yes', 'no', 'no_internet_service'])
		onlinebackup = st.selectbox(' Customer has onlinebackup:', ['yes', 'no', 'no_internet_service'])
		deviceprotection = st.selectbox(' Customer has deviceprotection:', ['yes', 'no', 'no_internet_service'])
		techsupport = st.selectbox(' Customer has techsupport:', ['yes', 'no', 'no_internet_service'])
		streamingtv = st.selectbox(' Customer has streamingtv:', ['yes', 'no', 'no_internet_service'])
		streamingmovies = st.selectbox(' Customer has streamingmovies:', ['yes', 'no', 'no_internet_service'])
		contract= st.selectbox(' Customer has a contract:', ['month-to-month', 'one_year', 'two_year'])
		paperlessbilling = st.selectbox(' Customer has a paperlessbilling:', ['yes', 'no'])
		paymentmethod= st.selectbox('Payment Option:', ['bank_transfer_(automatic)', 'credit_card_(automatic)', 'electronic_check' ,'mailed_check'])
		tenure = st.number_input('Number of months the customer has been with the current telco provider :', min_value=0, max_value=240, value=0)
		monthlycharges= st.number_input('Monthly charges :', min_value=0, max_value=240, value=0)
		totalcharges = tenure*monthlycharges
		output= ""
		output_prob = ""
		input_dict={
				"gender":gender ,
				"seniorcitizen": seniorcitizen,
				"partner": partner,
				"dependents": dependents,
				"phoneservice": phoneservice,
				"multiplelines": multiplelines,
				"internetservice": internetservice,
				"onlinesecurity": onlinesecurity,
				"onlinebackup": onlinebackup,
				"deviceprotection": deviceprotection,
				"techsupport": techsupport,
				"streamingtv": streamingtv,
				"streamingmovies": streamingmovies,
				"contract": contract,
				"paperlessbilling": paperlessbilling,
				"paymentmethod": paymentmethod,
				"tenure": tenure,
				"monthlycharges": monthlycharges,
				"totalcharges": totalcharges
			}
          
		if st.button("Predict"):
			X = dv.transform([input_dict])
			y_pred = model.predict_proba(X)[0, 1]
			churn = y_pred >= 0.5
			output_prob = float(y_pred)
			output = bool(churn)
		st.success('Churn: {0}, Risk Score: {1}'.format(output, output_prob))

Your app’s frontend leverages Streamlit’s input widgets like select box, slider, and number input. Users interact with these widgets by entering values. Input data is then packaged into a Python dictionary. The backend — which handles the prediction score computation logic — is defined inside the st.button layer and awaits the user trigger. When this happens, the dictionary is passed to the dictionary vectorizer which performs encoding for categorical features and makes it consumable for the model. 

Streamlit passes any transformed inputs to the model and calculates the churn prediction score. Using the threshold of 0.5, the churn score is converted into a binary class. The risk score and churn class are returned to the frontend via Streamlit’s success component. This displays a success message. 

To display the file upload button when the user selects “Batch” from the sidebar, the following code snippet might be useful:

if add_selectbox == 'Batch':
		file_upload = st.file_uploader("Upload csv file for predictions", type=["csv"])
		if file_upload is not None:
			data = pd.read_csv(file_upload)
			# Encode each uploaded row, then score the whole batch at once
			X = dv.transform(data.to_dict(orient='records'))
			y_pred = model.predict_proba(X)[:, 1]
			churn = y_pred >= 0.5
			st.write(pd.DataFrame({'churn_probability': y_pred, 'churn': churn}))

When the user wants to batch score customers, the page layout will dynamically change to match this selection. Streamlit’s file uploader component will display a related widget. This prompts the user to upload a CSV file, which is then read using the pandas library and processed by the dictionary vectorizer and model. The prediction scores are displayed on the frontend using st.write.

The above application skeleton is wrapped within a main function in the below script. Running the script invokes the main function. Here’s how that final script looks:

import pickle
import streamlit as st
import pandas as pd
from PIL import Image
model_file = 'model_C=1.0.bin'


with open(model_file, 'rb') as f_in:
    dv, model = pickle.load(f_in)


def main():
	image = Image.open('images/icone.png') 
	image2 = Image.open('images/image.png')
	st.image(image,use_column_width=False) 
	add_selectbox = st.sidebar.selectbox(
	"How would you like to predict?",
	("Online", "Batch"))
	st.sidebar.info('This app is created to predict Customer Churn')
	st.sidebar.image(image2)
	st.title("Predicting Customer Churn")
	if add_selectbox == 'Online':
		gender = st.selectbox('Gender:', ['male', 'female'])
		seniorcitizen= st.selectbox(' Customer is a senior citizen:', [0, 1])
		partner= st.selectbox(' Customer has a partner:', ['yes', 'no'])
		dependents = st.selectbox(' Customer has  dependents:', ['yes', 'no'])
		phoneservice = st.selectbox(' Customer has phoneservice:', ['yes', 'no'])
		multiplelines = st.selectbox(' Customer has multiplelines:', ['yes', 'no', 'no_phone_service'])
		internetservice= st.selectbox(' Customer has internetservice:', ['dsl', 'no', 'fiber_optic'])
		onlinesecurity= st.selectbox(' Customer has onlinesecurity:', ['yes', 'no', 'no_internet_service'])
		onlinebackup = st.selectbox(' Customer has onlinebackup:', ['yes', 'no', 'no_internet_service'])
		deviceprotection = st.selectbox(' Customer has deviceprotection:', ['yes', 'no', 'no_internet_service'])
		techsupport = st.selectbox(' Customer has techsupport:', ['yes', 'no', 'no_internet_service'])
		streamingtv = st.selectbox(' Customer has streamingtv:', ['yes', 'no', 'no_internet_service'])
		streamingmovies = st.selectbox(' Customer has streamingmovies:', ['yes', 'no', 'no_internet_service'])
		contract= st.selectbox(' Customer has a contract:', ['month-to-month', 'one_year', 'two_year'])
		paperlessbilling = st.selectbox(' Customer has a paperlessbilling:', ['yes', 'no'])
		paymentmethod= st.selectbox('Payment Option:', ['bank_transfer_(automatic)', 'credit_card_(automatic)', 'electronic_check' ,'mailed_check'])
		tenure = st.number_input('Number of months the customer has been with the current telco provider :', min_value=0, max_value=240, value=0)
		monthlycharges= st.number_input('Monthly charges :', min_value=0, max_value=240, value=0)
		totalcharges = tenure*monthlycharges
		output= ""
		output_prob = ""
		input_dict={
				"gender":gender ,
				"seniorcitizen": seniorcitizen,
				"partner": partner,
				"dependents": dependents,
				"phoneservice": phoneservice,
				"multiplelines": multiplelines,
				"internetservice": internetservice,
				"onlinesecurity": onlinesecurity,
				"onlinebackup": onlinebackup,
				"deviceprotection": deviceprotection,
				"techsupport": techsupport,
				"streamingtv": streamingtv,
				"streamingmovies": streamingmovies,
				"contract": contract,
				"paperlessbilling": paperlessbilling,
				"paymentmethod": paymentmethod,
				"tenure": tenure,
				"monthlycharges": monthlycharges,
				"totalcharges": totalcharges
			}
		if st.button("Predict"):
           
			X = dv.transform([input_dict])
			y_pred = model.predict_proba(X)[0, 1]
			churn = y_pred >= 0.5
			output_prob = float(y_pred)
			output = bool(churn)
 
		st.success('Churn: {0}, Risk Score: {1}'.format(output, output_prob))

	if add_selectbox == 'Batch':

		file_upload = st.file_uploader("Upload csv file for predictions", type=["csv"])
		if file_upload is not None:
			data = pd.read_csv(file_upload)
			# Encode each uploaded row, then score the whole batch at once
			X = dv.transform(data.to_dict(orient='records'))
			y_pred = model.predict_proba(X)[:, 1]
			churn = y_pred >= 0.5
			st.write(pd.DataFrame({'churn_probability': y_pred, 'churn': churn}))


if __name__ == '__main__':
	main()

You can download the complete script from our Dockersamples GitHub page.

Execute the script

streamlit run stream_app.py

View your Streamlit app

You can now view your Streamlit app in your browser by navigating to http://localhost:8501 (the address Streamlit prints in your terminal).

Containerizing the Streamlit app with Docker

Let’s explore how to easily run this app within a Docker container, using a Docker Official image. First, you’ll need to download Docker Desktop. Docker Desktop accelerates the image-building process while making useful images more discoverable. Complete this installation process once your download is finished.

Docker uses a Dockerfile to specify each image’s “layers.” Each layer stores important changes stemming from the base image’s standard configuration. Create an empty Dockerfile in your Streamlit project:

touch Dockerfile

Next, use your favorite text editor to open this Dockerfile. We’re going to build out this new file piece by piece. To start, let’s define a base image:

FROM python:3.8.12-slim

It’s now time to ensure that the latest pip modules are installed:

RUN /usr/local/bin/python -m pip install --upgrade pip

Next, let’s quickly create a directory to house our image’s application code. This is the working directory for your application:

WORKDIR /app

The following COPY instruction copies the requirements file from the host machine to the container image:

COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt

The EXPOSE instruction tells Docker that your container is listening on the specified network ports at runtime:

EXPOSE 8501

Finally, copy the rest of your application code into the working directory and create an ENTRYPOINT to make your image executable:

COPY . .
ENTRYPOINT ["streamlit", "run"]
CMD ["stream_app.py"]

After assembling each piece, here’s your complete Dockerfile:

FROM python:3.8.12-slim
RUN /usr/local/bin/python -m pip install --upgrade pip
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
EXPOSE 8501
COPY . .
ENTRYPOINT ["streamlit", "run"]
CMD ["stream_app.py"]

Build your image

docker build -t customer_churn .

Run the app

docker run -d -p 8501:8501 customer_churn

View the app within Docker Desktop

You can do this by navigating to the Containers interface, which lists your running application as a named container.

Access the app

First, select your app container in the list. This opens the Logs view. Click the button with a square icon (with a slanted arrow) located next to the Stats pane. This opens your app in your browser.

Alternatively, you can hover over your container in the list and click that icon once the righthand toolbar appears.

Develop and deploy your next machine learning model, today

Congratulations! You’ve successfully explored how to build and deploy customer churn prediction models using Streamlit and Docker. With a single Dockerfile, we’ve demonstrated how easily you can build an interactive frontend and deploy this application in seconds. 

With just a few extra steps, you can use this tutorial to build applications with much greater complexity. You can make your app more useful by implementing push-notification logic in the app — letting the marketing team send promotional emails to high-churn customers on the fly. Happy coding.

Build and Deploy a Retail Store Items Detection System Using No-Code AI Vision at the Edge https://www.docker.com/blog/build-retail-store-items-detection-system-no-code-ai/ Tue, 16 Aug 2022 14:00:45 +0000 https://www.docker.com/?p=36552

Low-code and no-code platforms have risen sharply in popularity over the past few years. These platforms let users with little or no knowledge of coding build apps 20x faster with minimal coding. They’ve even evolved to a point where they’ve become indispensable tools for expert developers. Such platforms are highly visual and follow a user-friendly, modular approach. Consequently, you need to drag and drop software components into place — all of which are visually represented — to create an app.

Node-RED is a low-code programming tool for event-driven applications. It lets you wire together hardware devices, APIs, and online services in new and interesting ways. Node-RED provides a browser-based flow editor that makes it easy to wire together flows using the wide range of nodes in the palette, and you can deploy a flow to its runtime with a single click. You can also create JavaScript functions within the editor using a rich text editor. Finally, Node-RED ships with a built-in library that lets you save useful, reusable functions, templates, or flows.

Node-RED’s lightweight runtime is built upon Node.js, taking full advantage of Node’s event-driven, non-blocking model. This lets it run at the edge of the network on low-cost hardware like the Raspberry Pi, as well as in the cloud. With over 225,000 modules in Node’s package repository, it’s easy to extend the range of palette nodes and add new capabilities. The flows created in Node-RED are stored as JSON, which is easily imported and exported for sharing purposes. An online flow library lets you publish your best flows publicly.

Users have downloaded our Node-RED Docker Official Image over 100 million times from Docker Hub. What’s driving this significant download rate? There’s an ever-increasing demand for Docker containers to streamline development workflows, while giving Node-RED developers the freedom to innovate with their choice of project-tailored tools, application stacks, and deployment environments. Our Node-RED Official Image also supports multiple architectures like amd64, arm32v6, arm32v7, arm64v8, and s390x.

Why is containerizing Node-RED important?

The Node-RED Project has a huge community of third-party nodes available for installation. Also, note that the community doesn’t generally recommend using an odd-numbered Node version. This advice is tricky for new users, since they might end up fixing Node compatibility issues.

Running your Node-RED app in a Docker container lets users get started quickly with sensible defaults and customization via environmental variables. Users no longer need to worry about compatibility issues. Next, Docker enables users to build, share, and run containerized Node-RED applications — made accessible for developers of all skill levels.

Building your application

In this tutorial, you’ll learn how to build a retail store items detection system using Node-RED. First, you’ll set up Node-RED manually on an IoT Edge device without Docker. Second, you’ll learn how to run it within a Docker container via a one-line command. Finally, you’ll see how Docker containers help you build and deploy this detection system using Node-RED. Let’s jump in.

Hardware components

Software Components

Preparing your Seeed Studio reComputer and development environment

For this demonstration, we’re using a Seeed Studio reComputer. The Seeed Studio reComputer J1010 is powered by the Jetson Nano development kit. It’s a small, powerful, palm-sized computer that makes modern AI accessible to embedded developers. It’s built around the NVIDIA Jetson Nano system-on-module (SoM) and designed for edge AI applications.

Wire it up

Plug your WiFi adapter/Ethernet cable, Keyboard/Mouse, and USB camera into the reComputer system and turn it on using the power cable. Follow the steps to perform initial system startup.

USB camera connected to the reComputer system

Before starting, make sure you have Node installed in your system. Then, follow these steps to set up Node-RED on your Edge device.

Installing Node.js

Ensure that you have the latest stable version of Node.js installed in your system.

curl -fsSL https://deb.nodesource.com/setup_16.x | sudo -E bash -
sudo apt-get install -y nodejs

Verify Node.js and npm versions

The above installer will install both Node.js and npm. Let’s verify that they’re installed properly:

# check Node.js version
nodejs -v
v16.16.0
# check npm version
npm -v
8.11.0

Installing Node-RED

To install Node-RED, you can use the npm command that comes with Node.js:

sudo npm install -g --unsafe-perm node-red

changed 294 packages, and audited 295 packages in 17s

38 packages are looking for funding
run `npm fund` for details

found 0 vulnerabilities

Running Node-RED

Use the node-red command to start Node-RED in your terminal:

node-red
27 Jul 15:08:36 - [info]

Welcome to Node-RED
===================

27 Jul 15:08:36 - [info] Node-RED version: v3.0.1
27 Jul 15:08:36 - [info] Node.js  version: v16.16.0
27 Jul 15:08:36 - [info] Linux 4.9.253-tegra arm64 LE
27 Jul 15:08:37 - [info] Loading palette nodes
27 Jul 15:08:38 - [info] Settings file  : /home/ajetraina/.node-red/settings.js
27 Jul 15:08:38 - [info] Context store  : 'default' [module=memory]
27 Jul 15:08:38 - [info] User directory : /home/ajetraina/.node-red
27 Jul 15:08:38 - [warn] Projects disabled : editorTheme.projects.enabled=false
27 Jul 15:08:38 - [info] Flows file     : /home/ajetraina/.node-red/flows.json
27 Jul 15:08:38 - [info] Creating new flow file
27 Jul 15:08:38 - [warn]

---------------------------------------------------------------------
Your flow credentials file is encrypted using a system-generated key.

If the system-generated key is lost for any reason, your credentials
file will not be recoverable, you will have to delete it and re-enter
your credentials.

You should set your own key using the 'credentialSecret' option in
your settings file. Node-RED will then re-encrypt your credentials
file using your chosen key the next time you deploy a change.
---------------------------------------------------------------------

27 Jul 15:08:38 - [info] Server now running at http://127.0.0.1:1880/
27 Jul 15:08:38 - [warn] Encrypted credentials not found
27 Jul 15:08:38 - [info] Starting flows
27 Jul 15:08:38 - [info] Started flows

You can then access the Node-RED editor by navigating to http://localhost:1880 in your browser.

The log output shares some important pieces of information:

  • Installed versions of Node-RED and Node.js
  • Any errors encountered while trying to load the palette nodes
  • The location of your Settings file and User Directory
  • The name of the flow file currently being used

Node-RED welcome screen

Node-RED consists of a Node.js-based runtime that provides a web address for accessing the flow editor. You create your application in the browser by dragging nodes from the palette into a workspace. From there, you start wiring them together. With one click, Node-RED deploys your application back to the runtime, where it runs.

Running Node-RED in a Docker container

The Node-RED Official Image is based on our Node.js Alpine Linux images, in order to keep them as slim as possible. Run the following command to create and mount a named volume called node_red_data to the container’s /data directory. This will allow us to persist any flow changes.

docker run -it -p 1880:1880 -v node_red_data:/data --name mynodered nodered/node-red

You can now access the Node-RED editor via http://localhost:1880 or http://<ip_address_Jetson>:1880.

Building and running your retail store items detection system

To build a fully functional retail store items detection system, follow these next steps.

Write the configuration files

We must define a couple of files that add Node-RED configurations — such as custom themes and custom npm packages.

First, create an empty folder called “node-red-config”:

mkdir node-red-config

Change your directory to node-red-config and run the following command to set up a new npm package.

npm init

This utility will walk you through the package.json file creation process. It only covers the most common items, and tries to guess sensible defaults.

{
  "name": "node-red-project",
  "description": "A Node-RED Project",
  "version": "0.0.1",
  "private": true,
  "dependencies": {
    "@node-red-contrib-themes/theme-collection": "^2.2.3",
    "node-red-seeed-recomputer": "git+https://github.com/Seeed-Studio/node-red-seeed-recomputer.git"
  }
}

Create a file called settings.js inside the node-red-config folder and enter the following content. This file defines Node-RED server, runtime, and editor settings. We’ll mainly change the editor settings. For more information about individual settings, refer to the documentation.

module.exports = {

    flowFile: 'flows.json',

    flowFilePretty: true,

    uiPort: process.env.PORT || 1880,

    logging: {
        console: {
            level: "info",
            metrics: false,
            audit: false
        }
    },

    exportGlobalContextKeys: false,

    externalModules: {
    },

    editorTheme: {
        theme: "midnight-red",

        page: {
            title: "reComputer Flow Editor"
        },
        header: {
            title: "  Flow Editor<br/>",
            image: "/data/seeed.webp", // or null to remove image
        },

        palette: {
        },

        projects: {
            enabled: false,
            workflow: {
                mode: "manual"
            }
        },

        codeEditor: {
            lib: "ace",
            options: {
                theme: "vs",
            }
        }
    },

    functionExternalModules: true,

    functionGlobalContext: {
    },

    debugMaxLength: 1000,

    mqttReconnectTime: 15000,

    serialReconnectTime: 15000,

}

You can download this image and put it under the node-red-config folder. This image file’s location is defined inside the settings.js file we just created.

Write the script

Create an empty file by running the following command:

touch docker-ubuntu.sh

In order to print colored output, let’s first define a few colors in the shell script. These will be reflected in the script’s output when you execute it later:

IBlack='\033[0;90m'       # Black
IRed='\033[0;91m'         # Red
IGreen='\033[0;92m'       # Green
IYellow='\033[0;93m'      # Yellow
IBlue='\033[0;94m'        # Blue
IPurple='\033[0;95m'      # Purple
ICyan='\033[0;96m'        # Cyan
IWhite='\033[0;97m'       # White

The sudo command allows a normal user to run a command with elevated privileges so that they can perform certain administrative tasks. Because this script runs multiple tasks that require administrative privileges, it’s recommended to check that you’re running it as root or via sudo.

if ! [ $(id -u) = 0 ] ; then
echo "$0 must be run as sudo user or root"
exit 1
fi

The reComputer for Jetson is sold with 16 GB of eMMC. This ready-to-use hardware has Ubuntu 18.04 LTS and NVIDIA JetPack 4.6 installed, so the remaining user space available is about 2 GB. This could be a significant obstacle to using the reComputer for training and deployment in some projects. Hence, it’s sometimes important to remove unnecessary packages and libraries. This code snippet confirms that you have enough storage to install all included packages and Docker images.

If you have the required storage space, it’ll continue to the next section. Otherwise, the installer will ask if you want to free up some device space. Typing “y” for “yes” will delete unnecessary files and packages to clear some space.

storage=$(df   | awk '{ print  $4  } ' | awk 'NR==2{print}' )
#if storage > 3.8G
if [ $storage -gt 3800000 ] ; then
echo -e "${IGreen}Your storage space left is $(($storage /1000000))GB, you can install this application."
else
echo -e "${IRed}Sorry, you don't have enough storage space to install this application. You need about 3.8GB of storage space."
echo -e "${IYellow}However, you can regain about 3.8GB of storage space by performing the following:"
echo -e "${IYellow}-Remove unnecessary packages (~100MB)"
echo -e "${IYellow}-Clean up apt cache (~1.6GB)"
echo -e "${IYellow}-Remove thunderbird, libreoffice and related packages (~400MB)"
echo -e "${IYellow}-Remove cuda, cudnn, tensorrt, visionworks and deepstream samples (~800MB)"
echo -e "${IYellow}-Remove local repos for cuda, visionworks, linux-headers (~100MB)"
echo -e "${IYellow}-Remove GUI (~400MB)"
echo -e "${IYellow}-Remove Static libraries (~400MB)"
echo -e "${IRed}So, please agree to uninstall the above. Press [y/n]"
read yn
if [ $yn = "y" ] ; then
echo "${IGreen}starting to remove the above-mentioned"
# Remove unnecessary packages, clean apt cache and remove thunderbird, libreoffice
apt update
apt autoremove -y
apt clean
apt remove thunderbird libreoffice-* -y

# Remove samples
rm -rf /usr/local/cuda/samples \
/usr/src/cudnn_samples_* \
/usr/src/tensorrt/data \
/usr/src/tensorrt/samples \
/usr/share/visionworks* ~/VisionWorks-SFM*Samples \
/opt/nvidia/deepstream/deepstream*/samples

# Remove local repos
apt purge cuda-repo-l4t-*local* libvisionworks-*repo -y
rm /etc/apt/sources.list.d/cuda*local* /etc/apt/sources.list.d/visionworks*repo*
rm -rf /usr/src/linux-headers-*

# Remove GUI
apt-get purge gnome-shell ubuntu-wallpapers-bionic light-themes chromium-browser* libvisionworks libvisionworks-sfm-dev -y
apt-get autoremove -y
apt clean -y

# Remove Static libraries
rm -rf /usr/local/cuda/targets/aarch64-linux/lib/*.a \
/usr/lib/aarch64-linux-gnu/libcudnn*.a \
/usr/lib/aarch64-linux-gnu/libnvcaffe_parser*.a \
/usr/lib/aarch64-linux-gnu/libnvinfer*.a \
/usr/lib/aarch64-linux-gnu/libnvonnxparser*.a \
/usr/lib/aarch64-linux-gnu/libnvparsers*.a

# Remove additional 100MB
apt autoremove -y
apt clean
else
exit 1
fi
fi

This code snippet checks if the required software (curl, docker, nvidia-docker2, and Docker Compose) is installed:

apt update

if ! [ -x "$(command -v curl)" ]; then
apt install curl
fi

if ! [ -x "$(command -v docker)" ]; then
apt install docker
fi

if ! [ -x "$(command -v nvidia-docker)" ]; then
apt install nvidia-docker2
fi

if ! [ -x "$(command -v docker-compose)" ]; then
curl -SL https://files.seeedstudio.com/wiki/reComputer/compose.tar.bz2  -o /tmp/compose.tar.bz2
tar xvf /tmp/compose.tar.bz2 -C /usr/local/bin
chmod +x  /usr/local/bin/docker-compose
fi

Next, you need to create a node-red directory under $HOME and copy all of your Node-RED configuration files into it, as shown in the snippet below:

mkdir -p $HOME/node-red
cp node-red-config/*  $HOME/node-red

The below snippet allows the script to bring up container services using Docker Compose CLI:

docker compose --file docker-compose.yaml  up -d

Note: You’ll see how to create a Docker Compose file in the next section.

Within the script, let’s specify the command to install a custom Node-RED theme package with three Node-RED blocks corresponding to video input, detection, and video view. We’ll circle back to these nodes later.

docker exec node-red-contrib-ml-node-red-1 bash -c "cd /data && npm install"

Finally, the below command embedded in the script allows you to restart the node-red-contrib-ml-node-red-1 container to implement your theme changes:

docker restart node-red-contrib-ml-node-red-1

Lastly, save the script as docker-ubuntu.sh.

Define your services within a Compose file

Create an empty file by running the following command inside the same directory as docker-ubuntu.sh:

touch docker-compose.yaml

Add the following lines to your docker-compose.yaml file. These specify which services Docker should start concurrently at application launch:

services:
  node-red:
    image: nodered/node-red:3.0.1
    restart: always
    network_mode: "host"
    volumes:
      - "$HOME/node-red:/data"
    user: "0"
  dataloader:
    image: baozhu/node-red-dataloader:v1.2
    restart: always
    runtime: nvidia
    network_mode: "host"
    privileged: true
    devices:
      - "/dev:/dev"
      - "/var/run/udev:/var/run/udev"
  detection:
    image: baozhu/node-red-detection:v1.2
    restart: always
    runtime: nvidia
    network_mode: "host"

Your application has the following parts:

  • Three services backed by Docker images: your Node-RED node-red app, dataloader, and detection
  • The dataloader service container broadcasts an OpenCV video stream (either from a USB webcam or an IP camera with RTSP) using the Pub/Sub messaging pattern on port 5550; a minimal sketch of this pattern follows this list. Note that you need to pass privileged: true to allow the service container to access USB camera devices.
  • The detection service container will grab the above video stream and perform inference using TensorRT implementation of YOLOv5. This is an object-detection algorithm that can identify objects in real-time.
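
To make the Pub/Sub idea concrete, here is a minimal, hypothetical Python sketch of a publisher that reads frames from a USB camera with OpenCV and broadcasts them on port 5550. It assumes the pyzmq and opencv-python packages and is only an illustration of the pattern; the actual dataloader image ships its own implementation.

# Hypothetical illustration of the Pub/Sub pattern used by the dataloader service.
# Assumes `pip install pyzmq opencv-python`; the real image may differ.
import cv2
import zmq

context = zmq.Context()
publisher = context.socket(zmq.PUB)
publisher.bind("tcp://*:5550")      # same port the dataloader broadcasts on

cap = cv2.VideoCapture(0)           # local USB camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    _, encoded = cv2.imencode(".jpg", frame)   # compress the frame
    publisher.send(encoded.tobytes())          # broadcast to any subscriber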

Execute the script

Open your terminal and run the following command:

sudo ./docker-ubuntu.sh

It’ll take approximately 2-3 minutes for these scripts to execute completely.

View your services

Once your script is executed, you can verify that your container services are up and running:

docker compose ps

CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS                        PORTS     NAMES
e487c20eb87b   baozhu/node-red-dataloader:v1.2   "python3 python/pub_…"   48 minutes ago   Up About a minute                       retail-store-items-detection-nodered-dataloader-1
4441bc3c2a2c   baozhu/node-red-detection:v1.2    "python3 python/yolo…"   48 minutes ago   Up About a minute                       retail-store-items-detection-nodered-detection-1
dd5c5e37d60d   nodered/node-red:3.0.1            "./entrypoint.sh"        48 minutes ago   Up About a minute (healthy)             retail-store-items-detection-nodered-node-red-1

Visit http://127.0.0.1:1880/ to access the app.

seeed studio node red

You’ll find built-in nodes (video input, detection, and video view) available in the palette:

seeed studio flow detection

Let’s try to wire nodes by dragging them one-by-one from your palette into a workspace. First, let’s drag video input from the palette to the workspace. Double-click “Video Input” to view the following properties, and select “Local Camera”.

Note: We choose a local camera here to grab the video stream from the connected USB webcam. However, you can also grab the video stream from an IP camera via RTSP.

seeed studio video input

You’ll see that Node-RED selects the “COCO dataset” model by default:

seeed studio object detection

Next, drag Video View from the palette to the workspace. If you double-click on Video View, you’ll discover that msg.payload is already chosen for you under the Property section.

seeed studio video view

Wire up the nodes

Once you have all the nodes placed in the workspace, it’s time to wire nodes together. Nodes are wired together by pressing the left-mouse button on a node’s port, dragging to the destination node and releasing the mouse button (as shown in the following screenshot).

seeed studio video flow

Click Deploy at the top-right corner to start the deployment process. By now, you should be able to see the detection process working as Node-RED detects items.

seeed studio image detection flow

Conclusion

The ultimate goal of modernizing software development is to deliver high-value software to end users even faster. Low-code technology like Node-RED and Docker help us achieve this by accelerating the time from ideation to software delivery. Docker helps accelerate the process of building, running, and sharing modern AI applications.

Docker Official Images help you develop your own unique applications — no matter what tech stack you’re accustomed to. With one YAML file, we’ve demonstrated how Docker Compose helps you easily build Node-RED apps. We can even take Docker Compose and develop real-world microservices applications. With just a few extra steps, you can apply this tutorial while building applications with much greater complexity. Happy coding!

References:

How to Train and Deploy a Linear Regression Model Using PyTorch – Part 1 https://www.docker.com/blog/how-to-train-and-deploy-a-linear-regression-model-using-pytorch-part-1/ Thu, 16 Jun 2022 14:00:29 +0000 https://www.docker.com/?p=34094 Python is one of today’s most popular programming languages and is used in many different applications. The 2021 StackOverflow Developer Survey showed that Python remains the third most popular programming language among developers. In GitHub’s 2021 State of the Octoverse report, Python took the silver medal behind Javascript.

Thanks to its longstanding popularity, developers have built many popular Python frameworks and libraries like Flask, Django, and FastAPI for web development.

However, Python isn’t just for web development. It powers libraries and frameworks like NumPy (Numerical Python), Matplotlib, scikit-learn, PyTorch, and others which are pivotal in engineering and machine learning. Python is arguably the top language for AI, machine learning, and data science development. For deep learning (DL), leading frameworks like TensorFlow, PyTorch, and Keras are Python-friendly.

We’ll introduce PyTorch and how to use it for a simple problem like linear regression. We’ll also provide a simple way to containerize your application. Also, keep an eye out for Part 2 — where we’ll dive deeply into a real-world problem and deployment via containers. Let’s get started.

What is PyTorch?

A Brief History and Evolution of PyTorch

Torch debuted in 2002 as a deep-learning library developed in the Lua language. Building on the Torch library, Soumith Chintala and Adam Paszke (both from Meta) developed PyTorch in 2016. Since then, developers have flocked to it: PyTorch was the third-most-popular framework per the 2021 StackOverflow Developer Survey and the most loved DL library among developers. PyTorch is also the DL framework of choice for Tesla, Uber, Microsoft, and over 7,300 other organizations.

PyTorch enables tensor computation with GPU acceleration, plus deep neural networks built on a tape-based autograd system. We’ll briefly break these terms down, in case you’ve just started learning about these technologies.

  • A tensor, in a machine learning context, refers to an n-dimensional array.
  • A tape-based autograd means that PyTorch uses reverse-mode automatic differentiation, a mathematical technique for computing derivatives (or gradients) efficiently on a computer (see the short example after this list).
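
To see what the tape-based autograd does in practice, here is a tiny example (not part of the tutorial code) that records a computation on a tensor and replays it in reverse to obtain the gradient:

import torch

# Autograd records the operations applied to x and replays them in reverse
# to compute dy/dx (reverse-mode automatic differentiation).
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x      # y = x^2 + 2x
y.backward()            # computes dy/dx = 2x + 2
print(x.grad)           # tensor(8.)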

Since diving into these mathematics might take too much time, check out these links for more information:

PyTorch is a vast library and contains plenty of features for various deep learning applications. To get started, let’s evaluate a use case like linear regression.

What is Linear Regression?

Linear Regression is one of the most commonly used mathematical modeling techniques. It models a linear relationship between two variables. This technique helps determine correlations between two variables, or predict the value of the dependent variable based on a particular value of the independent variable.

In machine learning, linear regression often applies to prediction and forecasting applications. You can solve it analytically, typically without needing any DL framework. However, this is a good way to understand the PyTorch framework and kick off some analytical problem-solving.

Numerous books and web resources address the theory of linear regression. We’ll cover just enough theory to help you implement the model. We’ll also explain some key terms. If you want to explore further, check out the useful resources at the end of this section.

Linear Regression Model

You can represent a basic linear regression model with the following equation:

Y = mX + bias

What does each portion represent?

  • Y is the dependent variable, also called a target or a label.
  • X is the independent variable, also called a feature(s) or co-variate(s).
  • bias is also called offset.
  • m refers to the weight or “slope.”

These terms are often interchangeable. The dependent and independent variables can be scalars or tensors.

The goal of the linear regression is to choose weights and biases so that any prediction for a new data point — based on the existing dataset — yields the lowest error rate. In simpler terms, linear regression is finding the best possible curve (line, in this case) to match your data distribution.

Loss Function

A loss function is an error function that expresses the error (or loss) between real and predicted values. A very popular way to measure loss is the mean squared error, which we’ll also use.

Gradient Descent Algorithms

Gradient descent is a class of optimization algorithms that tries to solve the problem (either analytically or using deep learning models) by starting from an initial guess of weights and bias. It then iteratively reduces errors by updating weights and bias values with successively better guesses.

A simplified approach uses the derivative of the loss function to minimize the loss. The derivative is the slope of the mathematical curve, and we’re attempting to reach the bottom of it, hence the name gradient descent. The stochastic gradient method samples smaller batches of data to compute updates, which is computationally cheaper than passing the entire dataset at each iteration.
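
As a quick illustration of the idea (separate from the PyTorch code later in this post), here is a plain NumPy sketch that fits y = mx + b by repeatedly stepping the parameters against the gradient of the mean squared error; the data and learning rate are made up for the example:

import numpy as np

# Minimal gradient descent on a synthetic linear dataset (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 4.2 + rng.normal(scale=0.01, size=100)

m, b, lr = 0.0, 0.0, 0.1
for _ in range(200):
    y_hat = m * x + b
    grad_m = 2 * np.mean((y_hat - y) * x)   # d(MSE)/dm
    grad_b = 2 * np.mean(y_hat - y)         # d(MSE)/db
    m -= lr * grad_m
    b -= lr * grad_b

print(m, b)   # converges close to 2.0 and 4.2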

To learn more about this theory, the following resources are helpful:

Linear Regression with Pytorch

Now, let’s talk about implementing a linear regression model using PyTorch. The script shown in the steps below is main.py — which resides in the GitHub repository and is forked from the “Dive Into Deep learning” example repository. You can find code samples within the pytorch directory.

For our regression example, you’ll need the following:

  • Python 3
  • PyTorch module (pip install torch) installed on your system
  • NumPy module (pip install numpy) installed
  • Optionally, an editor (VS Code is used in our example)

Problem Statement

As mentioned previously, linear regression is analytically solvable. We’re using deep learning to solve this problem since it helps you quickly get started and easily check the validity of your training setup by comparing the learned parameters against the known values used to generate the data.

We’ll attempt the following using Python and PyTorch:

  • Creating synthetic data where we’re aware of weights and bias
  • Using the PyTorch framework and built-in functions for tensor operations, dataset loading, model definition, and training

We don’t need a validation set for this example since we already have the ground truth. We’d assess our results by measuring the error against the weights and bias values used while creating our synthetic data.

Step 1: Import Libraries and Namespaces

For our simple linear regression, we’ll import the torch library in Python. We’ll also add some specific namespaces from our torch import. This helps create cleaner code:


# Step 1: import libraries and namespaces
import torch
from torch.utils import data
# `nn` is an abbreviation for neural networks
from torch import nn

Step 2: Create a Dataset

For simplicity’s sake, this example creates a synthetic dataset that aims to form a linear relationship between two variables with some bias.

i.e. y = mx + bias + noise


#Step 2: Create Dataset
#Define a function to generate noisy data
def synthetic_data(m, c, num_examples):
    """Generate y = mX + bias(c) + noise"""
    X = torch.normal(0, 1, (num_examples, len(m)))
    y = torch.matmul(X, m) + c
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape((-1, 1))

true_m = torch.tensor([2, -3.4])
true_c = 4.2
features, labels = synthetic_data(true_m, true_c, 1000)

Here, we use the built-in PyTorch function torch.normal to return a tensor of normally distributed random numbers. We also use the torch.matmul function to multiply tensor X with tensor m, and then add a small amount of normally distributed noise to y.
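
If you want to sanity-check the shapes involved, a quick snippet like the following (not part of main.py) prints them for a handful of samples:

import torch

# Illustrative shape check for the synthetic data operations above
X = torch.normal(0, 1, (5, 2))        # 5 samples, 2 features
m = torch.tensor([2.0, -3.4])
y = torch.matmul(X, m) + 4.2          # one value per sample
print(X.shape, y.shape)               # torch.Size([5, 2]) torch.Size([5])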

The dataset looks like this when visualized using a simple scatter plot:

scatterplot

The code to create the visualization can be found in this GitHub repository.

Step 3: Read the Dataset and Define Small Batches of Data

#Step 3: Read dataset and create small batch
#define a function to create a data iterator. Input is the features and labels from synthetic data
# Output is iterable batched data using torch.utils.data.DataLoader
def load_array(data_arrays, batch_size, is_train=True):
    """Construct a PyTorch data iterator."""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)

batch_size = 10
data_iter = load_array((features, labels), batch_size)

next(iter(data_iter))

Here, we use the PyTorch functions to read and sample the dataset. TensorDataset stores the samples and their corresponding labels, while DataLoader wraps an iterable around the TensorDataset for easier access.

The iter function creates a Python iterator, while next obtains the first item from that iterator.

Step 4: Define the Model

PyTorch offers pre-built models for different cases. For our case, a single-layer, feed-forward network with two inputs and one output layer is sufficient. The PyTorch documentation provides details about the nn.Linear implementation.

The model also requires the initialization of weights and biases. In the code, we initialize the weights using a Gaussian (normal) distribution with a mean value of 0, and a standard deviation value of 0.01. The bias is simply zero.


#Step 4: Define model & initialization
# Create a single-layer feed-forward network with 2 inputs and 1 output.
net = nn.Linear(2, 1)

#Initialize model params
net.weight.data.normal_(0, 0.01)
net.bias.data.fill_(0)

Step 5: Define the Loss Function

The loss function is defined as the mean squared error. The loss function tells you how far the data points are from the regression line:


#Step 5: Define loss function
# mean squared error loss function
loss = nn.MSELoss()

Step 6: Define an Optimization Algorithm

For optimization, we’ll use the built-in stochastic gradient descent method.
The lr parameter stands for learning rate and determines the size of the update step during training.


#Step 6: Define optimization algorithm
# implements a stochastic gradient descent optimization method
trainer = torch.optim.SGD(net.parameters(), lr=0.03)

Step 7: Training

For training, we’ll use the complete training data for n epochs (five in our case), iterating over minibatches of features and their corresponding labels. For each minibatch, we’ll do the following:

  • Compute predictions and calculate the loss
  • Calculate gradients by running the backpropagation
  • Update the model parameters
  • Compute the loss after each epoch

# Step 7: Training
# Use the complete training data for n epochs, iteratively using minibatch features and corresponding labels
# For each minibatch:
#   Compute predictions by calling net(X) and calculate the loss l
#   Calculate gradients by running the backpropagation
#   Update the model parameters using the optimizer
#   Compute the loss after each epoch and print it to monitor progress
num_epochs = 5
for epoch in range(num_epochs):
    for X, y in data_iter:
        l = loss(net(X), y)
        trainer.zero_grad()  # sets gradients to zero
        l.backward()  # back propagation
        trainer.step()  # parameter update
    l = loss(net(features), labels)
    print(f'epoch {epoch + 1}, loss {l:f}')

Results

Finally, compute the errors by comparing the true values with the trained model parameters. A low error value is desirable. You can compute the results with the following code snippet:


#Results
m = net.weight.data
print('error in estimating m:', true_m - m.reshape(true_m.shape))
c = net.bias.data
print('error in estimating c:', true_c - c)

When you run your code, the terminal window outputs the following:

python3 main.py 
features: tensor([1.4539, 1.1952]) 
label: tensor([3.0446])
epoch 1, loss 0.000298
epoch 2, loss 0.000102
epoch 3, loss 0.000101
epoch 4, loss 0.000101
epoch 5, loss 0.000101
error in estimating m: tensor([0.0004, 0.0005])
error in estimating c: tensor([0.0002])

As you can see, the loss shrinks quickly over the epochs, and the final errors in the estimated weights and bias are very small.

Containerizing the Script

In the previous example, we had to install multiple Python packages just to run a simple script. Containers, meanwhile, let us easily package all dependencies into an image and run an application.

We’ll show you how to quickly and easily Dockerize your script. Part 2 of the blog will discuss containerized deployment in greater detail.

Containerize the Script

Containers help you bundle together your code, dependencies, and libraries needed to run applications in an isolated environment. Let’s tackle a simple workflow for our linear regression script.

We’ll achieve this using Docker Desktop, which builds images from Dockerfiles. A Dockerfile specifies an image’s overall contents.

First, specify a Python base image (version 3.10) for our example:

FROM python:3.10

Next, we’ll install the numpy and torch dependencies needed to run our code:

RUN apt update && apt install -y python3-pip
RUN pip3 install numpy torch

Afterwards, we’ll need to place our main.py script into a directory:

COPY main.py app/

Finally, the CMD instruction defines the command the container runs on startup. In our case, we’ll run our main.py script:

CMD ["python3", "app/main.py" ]

Our complete Dockerfile is shown below, and exists within this GitHub repo:

FROM python:3.10
RUN apt update && apt install -y python3-pip
RUN pip3 install numpy torch
COPY main.py app/
CMD ["python3", "app/main.py" ]

Build the Docker Image

Now that we have every instruction that Docker Desktop needs to build our image, we’ll follow these steps to create it:

  1. In the GitHub repository, our sample script and Dockerfile are located in a directory called pytorch. From the repo’s home folder, we can enter cd deeplearning-docker/pytorch to access the correct directory.
  2. Our Docker image is named linear_regression. To build your image, run the docker build -t linear_regression . command (note the trailing dot, which sets the build context to the current directory).

Run the Docker Image

Now that we have our image, we can run it as a container with the following command:

docker run linear_regression

This command will create a container and execute the main.py script. Once we run the container, it’ll re-print the loss and estimates. The container will automatically exit after executing these commands. You can view your container’s status in Docker Desktop’s Containers view.


Docker Desktop shows us that linear_regression executed the commands and exited successfully.

We can view our error estimates via the terminal or directly within Docker Desktop. I used a Docker extension called Logs Explorer to view my container’s output.

Alternatively, you may also experiment using the Docker image that we created in this blog.


The results from running the script on my system and inside the container are comparable.

To learn more about using containers with Python, visit these helpful links:

Want to learn more about PyTorch theories and examples?

We took a very tiny peek into the world of Python, PyTorch, and deep learning. However, many resources are available if you’re interested in learning more. Here are some great starting points:

Additionally, endless free and paid courses exist on websites like YouTube, Udemy, Coursera, and others.

Stay tuned for more!

In this blog, we’ve introduced PyTorch and linear regression, and we’ve used the PyTorch framework to solve a very simple linear regression problem. We’ve also shown a very simple way to containerize your PyTorch application.

But, we have much, much more to discuss on deployment. Stay tuned for our follow-up blog — where we’ll tackle the ins and outs of deep-learning deployments! You won’t want to miss this one.

Topic Spotlight: Here’s What You Can Expect at DockerCon 2022 https://www.docker.com/blog/what-you-can-expect-at-dockercon-2022/ Wed, 13 Apr 2022 19:02:47 +0000 https://www.docker.com/?p=33090

With less than one month to go before DockerCon 2022, we’re excited to unveil one of our most immersive agendas yet. We’re also planning to offer multiple tracks throughout the day — allowing you to jump between topics that grab your attention.

DockerCon 2022 is hosted virtually and will be streamed live. If you haven’t registered, we encourage you to join us May 9th and 10th for two free days of concentrated learning, skill sharpening, collaboration, and engagement with fellow developers.


DockerCon brings together developers, Docker Community Leaders, Docker Captains, and our partners to boost your understanding of cloud-native development. But, we’re also excited to see how you’ve incorporated Docker into your projects. We want to help every developer discover more about Docker, learn how to conquer common development challenges, and excel within their respective roles.

That said, what’s in store? Follow along as we highlight what’s new this year and showcase a few can’t-miss topics.

What’s new at DockerCon 2022

We’re keeping things fresh and interesting by having our presenters connect with the audience and participate in live chats throughout the event. Accordingly, you’ll have the opportunity to chat with your favorite presenters. Here’s what you can look forward to (including some cool announcements):

Day-Zero Pre-Event Workshop

May 9th — from 7:00 a.m. to 10:00 a.m. PDT, and later from 4:00 p.m. to 7:00 p.m. PDT.

We want developers of all experience levels to get up and running with Docker as quickly as possible. Our instructor-led course will outline how to build, share, and run your applications using Docker containers. However, even developers with some Docker experience may learn some useful tips. Through hands-on instruction, you’ll discover that harnessing Docker is simple, approachable, and enjoyable.

We’re also introducing early learner demo content. Stay tuned for useful code samples, repo access, and even information on useful extensions.

Engaging Sessions

DockerCon 2022 opens with a fun, pre-show countdown filled with games and challenges — plus a live keynote. You’ll then be free to explore each session:

  • Mainstage – Live stream with engaging, center-stage talks on industry trends, new features, and team-building, plus panel sessions — with live hosts to guide you through your DockerCon experience.
  • Discover – Development tips and ways to incorporate tech stacks with Docker
  • Learn – Walkthroughs for deploying production environments, discussing best practices, and harnessing different programming languages
  • Excel – Detailed guidance on using containers, Docker, and applying workflows to emerging use cases
  • Blackbelt – Code-heavy, demo-driven walkthroughs of advanced technologies, Docker containerization, and building integrated application environments

DockerCon 2022 will also include a number of demos and chats about industry trends. You’ll get plenty of news and keep current on today’s exciting tech developments.

Each session spans 15 to 60 minutes. Feel free to move between virtual rooms as often as you’d like! You can view our complete agenda here, and all sessions are tagged for language and topic.

Exciting Announcements and Highlights

While DockerCon 2022 will feature a diverse topic lineup — all of which are immensely valuable to the community — we’d like to briefly showcase some topics that we find particularly noteworthy.

Introducing Docker SBOM

While we’re quick to implement containers and software tools to accelerate application development, we often don’t know each package’s contents in great detail. That can be a problem when using unverified, third-party images — or in any instance where component transparency is desirable.

Accordingly, we’ll be presenting our Software Bill of Materials (SBOM): a new way to generate lists of all container image contents. This is possible with a simple CLI command. The SBOM is useful from a security standpoint, yet it also allows you to better understand how your applications come together. Follow along as we show you how to summon your specific Docker SBOM, and why that information is so useful.

Using Python and Docker for Data Science and Scientific Computing

If you’re someone with a strong interest in machine learning, data science, and data-centric computing, you won’t want to miss this one. Researchers and professionals who work with big data daily have to perform a number of resource-intensive tasks. Data analysis is taxing, and restricting those processes to hardware that can only be scaled vertically is a massive challenge.

Scalable, containerized applications can solve this problem. If you want to learn more about optimizing your image builds, bolstering security, and improving Docker-related workflows, we encourage you to swing by this session.

From Legacy to Kubernetes

For newer developers and even experienced developers, setting up Kubernetes effectively can be a challenge. Configuration takes time and can feel like a chore. You have to memorize plenty of components, and understand how each impacts your implementation over time.

This presentation will show you how to spin up Kubernetes using Docker Desktop — our GUI-based platform for building, shipping, and deploying containerized applications. Built atop Docker Engine, one standout of Docker Desktop’s feature set is the ability to create a single, local Kubernetes cluster in one click.

Follow along as we dive into the basics of Kubernetes, moving legacy apps into containers, and implementing security best practices.

Register Today!

We couldn’t be more thrilled to kick off DockerCon 2022. It’s our largest event of the year! More importantly, however, DockerCon allows you to meet and interact with tens of thousands of other developers. There’s plenty of conversation to be had, and we guarantee that you’ll learn a thing or two along the way. We’ll feature some informative sessions, casual chats, and even unveil a few surprises!

Registering for DockerCon is quick, easy, and free. Please visit our registration page to sign up and learn more about the awesome things coming at DockerCon 2022!

How to Deploy GPU-Accelerated Applications on Amazon ECS with Docker Compose https://www.docker.com/blog/deploy-gpu-accelerated-applications-on-amazon-ecs-with-docker-compose/ Tue, 16 Feb 2021 17:00:00 +0000 https://www.docker.com/blog/?p=27442

Many applications can take advantage of GPU acceleration, in particular resource-intensive Machine Learning (ML) applications. The development time of such applications may vary based on the hardware of the machine we use for development. Containerization will facilitate development due to reproducibility and will make the setup easily transferable to other machines. Most importantly, a containerized application is easily deployable to platforms such as Amazon ECS, where it can take advantage of different hardware configurations.

In this tutorial, we discuss how to develop GPU-accelerated applications in containers locally and how to use Docker Compose to easily deploy them to the cloud (the Amazon ECS platform). We make the transition from the local environment to a cloud effortless, the GPU-accelerated application being packaged with all its dependencies in a Docker image, and deployed in the same way regardless of the target environment.

Requirements

In order to follow this tutorial, we need the following tools installed locally:

For deploying to a cloud platform, we rely on the new Docker Compose implementation embedded into the Docker CLI binary. Therefore, when targeting a cloud platform we are going to run docker compose commands instead of docker-compose. For local commands, both implementations of Docker Compose should work. If you find a missing feature that you use, report it on the issue tracker.

Sample application

Keep in mind that what we want to showcase is how to structure and manage a GPU-accelerated application with Docker Compose, and how we can deploy it to the cloud. We do not focus on GPU programming or the AI/ML algorithms, but rather on how to structure and containerize such an application to facilitate portability, sharing, and deployment.

For this tutorial, we rely on sample code provided in the Tensorflow documentation to simulate a GPU-accelerated translation service that we can orchestrate with Docker Compose. The original code is documented at https://www.tensorflow.org/tutorials/text/nmt_with_attention. For this exercise, we have reorganized the code so that we can easily manage it with Docker Compose.

This sample uses the Tensorflow platform, which can automatically use GPU devices if they are available on the host. Next, we will discuss how to organize this sample into services to containerize them easily, and what the challenges are when we run such a resource-intensive application locally.

Note: The sample code to use throughout this tutorial can be found here. It needs to be downloaded locally to exercise the commands we are going to discuss.

1. Local environment

Let’s assume we want to build and deploy a service that can translate simple sentences to a language of our choice. For such a service, we need to train an ML model to translate from one language to another and then use this model to translate new inputs. 

Application setup

We choose to separate the phases of the ML process in two different Compose services:

  • A training service that trains a model to translate between two languages (includes the data gathering, preprocessing and all the necessary steps before the actual training process).
  • A translation service that loads a model and uses it to `translate` a sentence.

This structure is defined in the docker-compose.dev.yaml file from the downloaded sample application, which has the following content:

docker-compose.dev.yaml

services:

 training:
   build: backend
   command: python model.py
   volumes:
     - models:/checkpoints

 translator:
   build: backend
   volumes:
     - models:/checkpoints
   ports:
     - 5000:5000

volumes:
 models:

We want the training service to train a model to translate from English to French and to save this model to a named volume models that is shared between the two services. The translator service has a published port to allow us to query it easily.

Deploy locally with Docker Compose

The reason for starting with the simplified compose file is that it can be deployed locally whether a GPU is present or not. We will see later how to add the GPU resource reservation to it.

Before deploying, rename the docker-compose.dev.yaml to docker-compose.yaml to avoid setting the file path with the flag -f for every compose command.

To deploy the Compose file, all we need to do is open a terminal, go to its base directory and run:

$ docker compose up
The new 'docker compose' command is currently experimental.
To provide feedback or request new features please open
issues at https://github.com/docker/compose-cli
[+] Running 4/0
 ⠿ Network "gpu_default"  Created                               0.0s
 ⠿ Volume "gpu_models"    Created                               0.0s
 ⠿ gpu_translator_1       Created                               0.0s
 ⠿ gpu_training_1         Created                               0.0s
Attaching to gpu_training_1, gpu_translator_1
...
translator_1  |  * Running on http://0.0.0.0:5000/ (Press CTRL+C
to quit)
...
HTTP/1.1" 200 -
training_1    | Epoch 1 Batch 0 Loss 3.3540
training_1    | Epoch 1 Batch 100 Loss 1.6044
training_1    | Epoch 1 Batch 200 Loss 1.3441
training_1    | Epoch 1 Batch 300 Loss 1.1679
training_1    | Epoch 1 Loss 1.4679
training_1    | Time taken for 1 epoch 218.06381964683533 sec
training_1    | 
training_1    | Epoch 2 Batch 0 Loss 0.9957
training_1    | Epoch 2 Batch 100 Loss 1.0288
training_1    | Epoch 2 Batch 200 Loss 0.8737
training_1    | Epoch 2 Batch 300 Loss 0.8971
training_1    | Epoch 2 Loss 0.9668
training_1    | Time taken for 1 epoch 211.0763041973114 sec
...
training_1    | Checkpoints saved in /checkpoints/eng-fra
training_1    | Requested translator service to reload its model,
response status: 200
translator_1  | 172.22.0.2 - - [18/Dec/2020 10:23:46] 
"GET /reload?lang=eng-fra 

Docker Compose deploys a container for each service and attaches us to their logs which allows us to follow the progress of the training service.

Every 10 cycles (epochs), the training service requests the translator to reload its model from the last checkpoint. If the translator is queried before the first training phase (10 cycles) is completed, we should get the following message. 

$ curl -d "text=hello" localhost:5000/
No trained model found / training may be in progress...

From the logs, we can see that each training cycle is resource-intensive and may take very long (depending on parameter setup in the ML algorithm).

The training service runs continuously and checkpoints the model periodically to a named volume shared between the two services. 
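The logs above show the training service issuing a GET /reload request against the translator after saving a checkpoint. As a rough, hypothetical sketch of that notification step (the real implementation is in the sample repository; we assume the requests package is installed and that the translator is reachable by its Compose service name, translator, on port 5000), the training side might do something like this:

# Hypothetical sketch of the checkpoint notification (see the sample repo for the real code).
# Assumes the `requests` package and a translator service reachable at http://translator:5000.
import requests

def notify_translator(lang="eng-fra"):
    """Ask the translator service to reload the latest checkpoint from /checkpoints."""
    response = requests.get(f"http://translator:5000/reload?lang={lang}")
    print("Requested translator service to reload its model, "
          f"response status: {response.status_code}")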

$ docker ps -a
CONTAINER ID   IMAGE            COMMAND                  CREATED          STATUS                     PORTS                    NAMES
f11fc947a90a   gpu_training     "python model.py"        14 minutes ago   Up 54 minutes                   gpu_training_1                           
baf147fbdf18   gpu_translator   "/bin/bash -c 'pytho..." 14 minutes ago   Up 54 minutes              0.0.0.0:5000->5000/tcp   gpu_translator_1

We can now query the translator service which uses the trained model:

$ curl -d "text=hello" localhost:5000/
salut !
$ curl -d "text=I want a vacation" localhost:5000/
je veux une autre . 
$ curl -d "text=I am a student" localhost:5000/
je suis etudiant .

Keep in mind that, for this exercise, we are not concerned about the accuracy of the translation but how to set up the entire process following a service approach that will make it easy to deploy with Docker Compose.

During development, we may have to re-run the training process and evaluate it each time we tweak the algorithm. This is a very time-consuming task if we do not use development machines built for high performance.

An alternative is to use on-demand cloud resources. For example, we could use cloud instances hosting GPU devices to run the resource-intensive components of our application. Running our sample application on a machine with access to a GPU will automatically switch to train the model on the GPU. This will speed up the process and significantly reduce the development time.

The first step to deploy this application to some faster cloud instances is to pack it as a Docker image and push it to Docker Hub, from where we can access it from cloud instances.

Build and Push images to Docker Hub

During the deployment with compose up, the application is packed as a Docker image which is then used to create the containers. We need to tag the built images and push them to Docker Hub.

A simple way to do this is by setting the image property for services in the Compose file. Previously, we had only set the build property for our services; we had no image defined. Docker Compose requires at least one of these properties to be defined in order to deploy the application.

We set the image property following the pattern <account>/<name>:<tag>, where the tag is optional (it defaults to latest). As an example, we use the Docker Hub account ID myhubuser and the application name gpudemo. Edit the Compose file and set the image property for the two services as below:

docker-compose.yaml

services:

 training:
   image: myhubuser/gpudemo
   build: backend
   command: python model.py
   volumes:
     - models:/checkpoints

 translator:
   image: myhubuser/gpudemo
   build: backend
   volumes:
     - models:/checkpoints
   ports:
     - 5000:5000

volumes:
 models:

To build the images run:

$ docker compose build
The new 'docker compose' command is currently experimental. To
provide feedback or request new features please open issues
 at https://github.com/docker/compose-cli
[+] Building 1.0s (10/10) FINISHED 
 => [internal] load build definition from Dockerfile
0.0s 
=> => transferring dockerfile: 206B
...
 => exporting to image
0.8s 
 => => exporting layers    
0.8s  
 => => writing image sha256:b53b564ee0f1986f6a9108b2df0d810f28bfb209
4743d8564f2667066acf3d1f
0.0s
 => => naming to docker.io/myhubuser/gpudemo

$ docker images | grep gpudemo
myhubuser/gpudemo  latest   b53b564ee0f1   2 minutes ago 
  5.83GB   

Notice the image has been named according to what we set in the Compose file.

Before pushing this image to Docker Hub, we need to make sure we are logged in. For this we run:

$ docker login
...
Login Succeeded

Push the image we built:

$ docker compose push
Pushing training (myhubuser/gpudemo:latest)...
The push refers to repository [docker.io/myhubuser/gpudemo]
c765bf51c513: Pushed
9ccf81c8f6e0: Layer already exists
...
latest: digest: sha256:c40a3ca7388d5f322a23408e06bddf14b7242f9baf7fb
e7201944780a028df76 size: 4306

The image pushed is public unless we set it to private in Docker Hub’s repository settings. The Docker documentation covers this in more detail.

With the image stored in a public image registry, we will look now at how we can use it to deploy our application on Amazon ECS and how we can use GPUs to accelerate it.

2. Deploy to Amazon ECS for GPU-acceleration

To deploy the application to Amazon ECS, we need to have credentials for accessing an AWS account and to have Docker CLI set to target the platform.

Let’s assume we have a valid set of AWS credentials that we can use to connect to AWS services. We now need to create an ECS Docker context to redirect all Docker CLI commands to Amazon ECS.

Create an ECS context

To create an ECS context run the following command:

$ docker context create ecs cloud
? Create a Docker context using:  [Use arrows to move, type
to filter]
> AWS environment variables 
  An existing AWS profile
  A new AWS profile

This prompts users with 3 options, depending on their familiarity with the AWS credentials setup.

For this exercise, to skip the details of AWS credential setup, we choose the first option. This requires us to have the AWS_ACCESS_KEY and AWS_SECRET_KEY set in our environment when running Docker commands that target Amazon ECS.

We can now run Docker commands and set the context flag for all commands targeting the platform, or we can switch it to be the context in use to avoid setting the flag on each command.

Set Docker CLI to target ECS

Set the context we created previously as the context in use by running:

$ docker context use cloud

$ docker context ls
NAME                TYPE                DESCRIPTION                               DOCKER ENDPOINT               KUBERNETES ENDPOINT   ORCHESTRATOR
default             moby                Current DOCKER_HOST based configuration   unix:///var/run/docker.sock                         swarm
cloud *             ecs                 credentials read from environment

Starting from here, all the subsequent Docker commands are going to target Amazon ECS. To switch back to the default context targeting the local environment, we can run the following:

$ docker context use default

For the following commands, we keep ECS context as the current context in use. We can now run a command to check we can successfully access ECS.

$ AWS_ACCESS_KEY="*****" AWS_SECRET_KEY="******" docker compose ls
NAME                                STATUS 

Before deploying the application to Amazon ECS, let’s have a look at how to update the Compose file to request GPU access for the training service. This blog post describes a way to define GPU reservations. In the next section, we cover the new format supported in the local compose and the legacy docker-compose.

Define GPU reservation in the Compose file

Tensorflow can make use of NVIDIA GPUs with CUDA compute capabilities to speed up computations. To reserve NVIDIA GPUs, we edit the docker-compose.yaml that we defined previously and add the deploy property under the training service as follows:

...
 training:
   image: myhubuser/gpudemo
   command: python model.py eng-fra
   volumes:
     - models:/checkpoints
   deploy:
     resources:
       reservations:
         memory: 32Gb
         devices:
         - driver: nvidia
           count: 2
           capabilities: [gpu]
...

For this example we defined a reservation of 2 NVIDIA GPUs and 32GB memory dedicated to the container. We can tweak these parameters according to the resources of the machine we target for deployment. If our local dev machine hosts an NVIDIA GPU, we can tweak the reservation accordingly and deploy the Compose file locally.  Ensure you have installed the NVIDIA container runtime and set up the Docker Engine to use it before deploying the Compose file.
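To confirm that TensorFlow actually sees the reserved GPUs from inside the training container, a quick check like the following can help (an illustrative snippet, not part of the sample application):

# List the GPU devices TensorFlow can see inside the container (illustrative check)
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))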

We focus in the next part on how to make use of GPU cloud instances to run our sample application.

Note: We assume the image we pushed to Docker Hub is public. If so, there is no need to authenticate in order to pull it (unless we exceed the pull rate limit). For images that need to be kept private, we need to define the x-aws-pull_credentials property with a reference to the credentials to use for authentication. Details on how to set it can be found in the documentation.

Deploy to Amazon ECS

Export the AWS credentials to avoid setting them for every command.

$ export AWS_ACCESS_KEY="*****" 
$ export AWS_SECRET_KEY="******"

When deploying the Compose file, Docker Compose will also reserve an EC2 instance with GPU capabilities that satisfies the reservation parameters. In the example we provided, we ask to reserve an instance with 32GB of memory and 2 NVIDIA GPUs, and Docker Compose matches this reservation with an instance type that satisfies the requirement. Before setting the reservation property in the Compose file, we recommend checking the Amazon GPU instance types and setting your reservation accordingly. Ensure you are targeting an Amazon region that contains such instances.

WARNING: Aside from ECS containers, we will have a `g4dn.12xlarge` EC2 instance reserved. Before deploying to the cloud, check the Amazon documentation for the resource cost this will incur.

To deploy the application, we run the same command as in the local environment.

$ docker compose up     
[+] Running 29/29
 ⠿ gpu                 CreateComplete          423.0s  
 ⠿ LoadBalancer        CreateComplete          152.0s
 ⠿ ModelsAccessPoint   CreateComplete            6.0s
 ⠿ DefaultNetwork      CreateComplete            5.0s
...
 ⠿ TranslatorService   CreateComplete          205.0s
 ⠿ TrainingService     CreateComplete          161.0s

Check the status of the services:

$ docker compose ps
NAME                                        SERVICE             STATE               PORTS
task/gpu/3311e295b9954859b4c4576511776593   training            Running             
task/gpu/78e1d482a70e47549237ada1c20cc04d   translator          Running             gpu-LoadBal-6UL1B4L7OZB1-d2f05c385ceb31e2.elb.eu-west-3.amazonaws.com:5000->5000/tcp

Query the exposed translator endpoint. We notice the same behaviour as in the local deployment (the model reload has not been triggered yet by the training service).

$ curl -d "text=hello" gpu-LoadBal-6UL1B4L7OZB1-d2f05c385ceb31e2.elb.eu-west-3.amazonaws.com:5000/
No trained model found / training may be in progress...

Check the logs for the GPU devices TensorFlow detected. We can easily identify the 2 GPU devices we reserved, and see that the training is almost 10X faster than our CPU-based local training.

$ docker compose logs
...
training    | 2021-01-08 20:50:51.595796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
training    | pciBusID: 0000:00:1c.0 name: Tesla T4 computeCapability: 7.5
training    | coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
...
training    | 2021-01-08 20:50:51.596743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 1 with properties: 
training    | pciBusID: 0000:00:1d.0 name: Tesla T4 computeCapability: 7.5
training    | coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
...

training      | Epoch 1 Batch 300 Loss 1.2269
training      | Epoch 1 Loss 1.4794
training      | Time taken for 1 epoch 42.98397183418274 sec
...
training      | Epoch 2 Loss 0.9750
training      | Time taken for 1 epoch 35.13995909690857 sec
...
training      | Epoch 9 Batch 0 Loss 0.1375
...
training      | Epoch 9 Loss 0.1558
training      | Time taken for 1 epoch 32.444278955459595 sec
...
training      | Epoch 10 Batch 300 Loss 0.1663
training      | Epoch 10 Loss 0.1383
training      | Time taken for 1 epoch 35.29659080505371 sec
training      | Checkpoints saved in /checkpoints/eng-fra
training      | Requested translator service to reload its model, response status: 200.

The training service runs continuously and triggers the model reload on the translation service every 10 cycles (epochs). Once the translation service has been notified at least once, we can stop and remove the training service and release the GPU instances at any time we choose. 

We can easily do this by removing the service from the Compose file:

services:
 translator:
   image: myhubuser/gpudemo
   build: backend
   volumes:
     - models:/checkpoints
   ports:
     - 5000:5000
volumes:
 models:

and then run docker compose up again to update the running application. This will apply the changes and remove the training service.

$ docker compose up      
[+] Running 0/0
 ⠋ gpu                  UpdateInProgress User Initiated    
 ⠋ LoadBalancer         CreateComplete      
 ⠋ ModelsAccessPoint    CreateComplete     
...
 ⠋ Cluster              CreateComplete     
 ⠋ TranslatorService    CreateComplete   

We can list the running services to confirm that the training service has been removed and only the translator remains:

$ docker compose ps
NAME                                        SERVICE             STATE               PORTS
task/gpu/78e1d482a70e47549237ada1c20cc04d   translator          Running             gpu-LoadBal-6UL1B4L7OZB1-d2f05c385ceb31e2.elb.eu-west-3.amazonaws.com:5000->5000/tcp

Query the translator:

$ curl -d "text=hello" gpu-LoadBal-6UL1B4L7OZB1-d2f05c385ceb31e2.elb.eu-west-3.amazonaws.com:5000/
salut ! 

To remove the application from Amazon ECS run:

$ docker compose down

Summary

We discussed how to set up a resource-intensive ML application so that it is easily deployable in different environments with Docker Compose. We also showed how to define GPU reservations in a Compose file and how to deploy the application on Amazon ECS.

Resources:
