This article was automatically translated from the original Turkish version.
Docker Image and Container Concepts

Docker is an open-source platform used to run software applications in isolated, portable containers. Unlike virtual machines, containers are isolated at the operating system level and require fewer resources. Docker provides significant advantages in artificial intelligence (AI) projects in particular, by enhancing portability, reproducibility, and dependency management. For instance, Docker resolves common issues such as "environment reproducibility" and dependency conflicts in machine learning workflows, making collaboration and deployment easier. This means that code running without issues in one location can be executed identically on other systems using a Docker image.
In addition to these concepts, a Dockerfile is a text file that defines how an image is built. Commands such as FROM, RUN, and COPY in the Dockerfile construct the image layer by layer.
Dependency and Environment Challenges in AI Projects

Dependency conflicts are a common issue in AI projects. Different projects may require various components—such as Python, CUDA, and C/C++ libraries—in different versions. For example, one model may be developed using TensorFlow 2, while another requires PyTorch, scikit-learn, specific GPU drivers, or OpenCV. These discrepancies lead to the recurring question: "Why does code that works in one environment fail in another?" Docker resolves these problems by providing each project with its own isolated container environment. Within the container, all dependencies and environment settings—such as the Python version—are fixed. This ensures consistency across development, testing, and production environments.
Commonly encountered problems include:

- Python version mismatches between development and production machines
- Conflicting library versions (e.g., the dependency trees of TensorFlow and PyTorch)
- Incompatible combinations of CUDA toolkit and GPU driver versions
- Code that "works on my machine" but fails in another environment
Docker resolves these issues at the image level. For example, all Python packages required by a project can be pinned in a requirements.txt or environment.yml file and installed into the image. For applications requiring CUDA, NVIDIA's pre-built CUDA-based images can be used (e.g., nvidia/cuda:12.8.1-cudnn). This keeps the dependency chain consistent on every build. As a result, the guarantee of "same code, same results" is achieved across different machines.
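As an illustration, a pinned requirements.txt might look like the following; the package names and version numbers here are placeholder assumptions, not taken from any particular project:

```text
# requirements.txt — every dependency pinned to an exact version
numpy==1.26.4
scikit-learn==1.4.2
torch==2.2.2
```

Pinning exact versions inside the image is what makes rebuilds reproducible: every build resolves to the same dependency set.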
Example: Dockerfile for a Python Machine Learning Project

A sample Dockerfile for a simple Python-based machine learning project is shown below. In this example, the official Python image is used; first, required packages are installed, followed by copying the project files and executing them:
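The following is a sketch consistent with the description in this article; the exact base-image tag (python:3.11-slim) and the train.py entry point are assumptions based on the surrounding text:

```dockerfile
# Base image: the official Python image (tag is an assumption)
FROM python:3.11-slim

# Working directory inside the image
WORKDIR /app

# Install dependencies first so this layer is cached across rebuilds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the project files
COPY . .

# Default command executed when a container starts
CMD ["python", "train.py"]
```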
The following steps are performed in this Dockerfile:

- FROM pulls the official Python base image.
- WORKDIR sets /app as the working directory inside the image.
- COPY requirements.txt and RUN pip install install the project's Python dependencies.
- COPY . . copies the remaining project files into the image.
- CMD defines the default command (python train.py) that runs when a container starts.
docker build and docker run Commands

Building the image: After the Dockerfile is prepared, the image is built using the following command in the directory containing the file:
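Using the image name that appears later in this article, the build command would look like this:

```bash
# Build the image from the Dockerfile in the current directory
docker build -t benim-yz-projem:latest .

# Optionally, force a clean build that ignores cached layers
docker build --no-cache -t benim-yz-projem:latest .
```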
Here, the -t flag assigns a name and tag to the image (e.g., benim-yz-projem:latest). The period (.) sets the build context to the current directory, where Docker looks for the Dockerfile by default. The --no-cache flag forces a clean build, ignoring previously cached layers so that dependencies are re-resolved and remain up to date.
Running the container: To launch a container from the built image:
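With the image and container names used in this article, the command would be:

```bash
docker run -d --name yz-container benim-yz-projem
```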
This command starts a container in detached mode (-d) from the benim-yz-projem image and names it yz-container. If port mapping is required, it can be used as follows:
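A sketch of the command with the port and volume mappings used in this article:

```bash
docker run -d --name yz-container \
  -p 8888:8888 \
  -v "$(pwd)":/app \
  benim-yz-projem
```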
In this example, the -p flag publishes port 8888 of the container on port 8888 of the host machine (the format is host:container), while the -v flag mounts the current host directory into the /app directory inside the container.
When the container is started with docker run, the CMD instruction (e.g., python train.py) is executed automatically, so tasks such as training or inference begin as soon as the container starts.