Local LLM Messenger: Chat with GenAI on Your iPhone

In this post, we want to share another winning project from last year’s Docker AI/ML Hackathon. This time we dive into Local LLM Messenger, an honorable mention winner created by Justin Garrison.

Developers are pushing the boundaries to bring the power of artificial intelligence (AI) to everyone. One exciting approach involves integrating Large Language Models (LLMs) with familiar messaging platforms like Slack and iMessage. This isn’t just about convenience; it’s about transforming these platforms into launchpads for interacting with powerful AI tools.

Imagine this: You need a quick code snippet or help brainstorming solutions to a coding problem. With LLMs integrated into your messaging app, you can chat with your AI assistant directly in a familiar interface. No more complex commands or clunky interfaces: just a natural conversation to unlock the power of AI.

Integrating with messaging platforms can be a time-consuming task, especially for macOS users. That’s where Local LLM Messenger (LoLLMM) steps in, offering a streamlined solution for connecting with your AI via iMessage. 

What makes LoLLM Messenger unique?

The following demo, which was submitted to the AI/ML Hackathon, provides an overview of LoLLM Messenger (Figure 1).

Figure 1: Demo of LoLLM Messenger as submitted to the AI/ML Hackathon.

The LoLLM Messenger bot allows you to send iMessages to Generative AI (GenAI) models running directly on your computer. This approach eliminates the need for complex setups and cloud services, making it easier for developers to experiment with LLMs locally.

Key features of LoLLM Messenger

LoLLM Messenger includes impressive features that make it stand out among similar projects:

Local execution: Runs on your computer, eliminating the need for cloud-based services and ensuring data privacy.

Scalability: Handles multiple AI models simultaneously, allowing users to experiment with different models and switch between them easily.

User-friendly interface: Offers a simple and intuitive interface, making it accessible to users of all skill levels.

Integration with Sendblue: Integrates seamlessly with Sendblue, enabling users to send iMessages to the bot and receive responses directly in their inbox.

Support for ChatGPT: Supports the GPT-3.5 Turbo and DALL-E 2 models, providing users with access to powerful AI capabilities.

Customization: Allows users to customize the bot’s behavior by modifying the available commands and integrating their own AI models.

How does it work?

The architecture diagram shown in Figure 2 provides a high-level overview of the components and interactions within the LoLLM Messenger project. It illustrates how the main application, AI models, messaging platform, and external APIs work together to enable users to send iMessages to AI models running on their computers.

Figure 2: Overview of the components and interactions in the LoLLM Messenger project.

By leveraging Docker, Sendblue, and Ollama, LoLLM Messenger offers a seamless and efficient solution for those seeking to explore AI models without the need for cloud-based services. LoLLM Messenger utilizes Docker Compose to manage the required services. 

Docker Compose simplifies the process by handling the setup and configuration of multiple containers, including the main application, ngrok (for creating a secure tunnel to the public internet), and Ollama (a local server for running open LLMs).

Technical stack

The LoLLM Messenger tech stack includes:

Lollmm service: This service is responsible for running the main application. It handles incoming iMessages, processes user requests, and interacts with the AI models. The lollmm service communicates with the Ollama service, which runs local models for text generation.

Ngrok: This service exposes the main application’s port 8000 to the internet using ngrok. It runs the ngrok Alpine image and forwards tunnel traffic to port 8000. The service is set to run in host network mode.

Ollama: This service runs the Ollama server, which hosts local models for text generation. It listens on port 11434 and mounts a volume from ./run/ollama to /home/ollama. The service can be configured to deploy with GPU resources, ensuring that it can utilize an NVIDIA GPU if available.

Sendblue: The project integrates with Sendblue to handle iMessages. You can set up Sendblue by adding your API Key and API Secret in the app/.env file and adding your phone number as a Sendblue contact.

Getting started

To get started, ensure that you have installed and set up the following components:

Install the latest Docker Desktop.

Register for Sendblue at https://app.sendblue.co/auth/login.

Create an ngrok account at https://dashboard.ngrok.com/signup and get an authtoken.

Clone the repository

Open a terminal window and run the following command to clone this sample application:

git clone https://github.com/dockersamples/local-llm-messenger

You should now have the following files in your local-llm-messenger directory:

.
├── LICENSE
├── README.md
├── app
│   ├── Dockerfile
│   ├── Pipfile
│   ├── Pipfile.lock
│   ├── default.ai
│   ├── log_conf.yaml
│   └── main.py
├── docker-compose.yaml
├── img
│   ├── banner.png
│   ├── lasers.gif
│   └── lollm-demo-1.gif
├── justfile
└── test
    ├── msg.json
    └── ollama.json

4 directories, 15 files

The main.py script under the app/ directory uses the FastAPI framework to create a web server for an AI-powered messaging application. The script interacts with OpenAI models and an Ollama endpoint to generate responses, and it uses Sendblue’s API for sending messages.

The script first imports necessary libraries, including FastAPI, requests, logging, and other required modules.

from dotenv import load_dotenv
import os, requests, time, openai, json, logging
from pprint import pprint
from typing import Union, List

from fastapi import FastAPI
from pydantic import BaseModel

from sendblue import Sendblue

This section sets up configuration variables, such as API keys, callback URL, Ollama API endpoint, and maximum context and word limits.

SENDBLUE_API_KEY = os.environ.get("SENDBLUE_API_KEY")
SENDBLUE_API_SECRET = os.environ.get("SENDBLUE_API_SECRET")
openai.api_key = os.environ.get("OPENAI_API_KEY")
OLLAMA_API = os.environ.get("OLLAMA_API_ENDPOINT", "http://ollama:11434/api")
# could also use request.headers.get('referer') to do dynamically
CALLBACK_URL = os.environ.get("CALLBACK_URL")
MAX_WORDS = os.environ.get("MAX_WORDS")

Next, the script configures logging, setting the log level to INFO and creating a file handler that writes messages to a file named app.log.
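Based on that description, the logging setup looks roughly like the following sketch. The INFO level and the app.log file name come from the description; the logger name and format string are assumptions for illustration:

```python
import logging

# A sketch of the logging configuration described above. The INFO level and
# the app.log file name come from the article; the logger name and format
# string are illustrative assumptions.
logger = logging.getLogger("lollm")
logger.setLevel(logging.INFO)

file_handler = logging.FileHandler("app.log")
file_handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s")
)
logger.addHandler(file_handler)

logger.info("logging configured")
```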

It then defines various functions for interacting with the AI models, managing context, sending messages, handling callbacks, and executing slash commands.

def set_default_model(model: str):
    try:
        with open("default.ai", "w") as f:
            f.write(model)
        return
    except IOError:
        logger.error("Could not open file")
        exit(1)

def get_default_model() -> str:
    try:
        with open("default.ai") as f:
            default = f.readline().strip("\n")
        if default != "":
            return default
        else:
            set_default_model("llama2:latest")
            return ""
    except IOError:
        logger.error("Could not open file")
        exit(1)

def validate_model(model: str) -> bool:
    available_models = get_model_list()
    if model in available_models:
        return True
    else:
        return False

def get_ollama_model_list() -> List[str]:
    available_models = []

    tags = requests.get(OLLAMA_API + "/tags")
    all_models = json.loads(tags.text)
    for model in all_models["models"]:
        available_models.append(model["name"])
    return available_models

def get_openai_model_list() -> List[str]:
    return ["gpt-3.5-turbo", "dall-e-2"]

def get_model_list() -> List[str]:
    ollama_models = []
    openai_models = []
    all_models = []
    if "OPENAI_API_KEY" in os.environ:
        # print(openai.Model.list())
        openai_models = get_openai_model_list()

    ollama_models = get_ollama_model_list()
    all_models = ollama_models + openai_models
    return all_models

DEFAULT_MODEL = get_default_model()

if DEFAULT_MODEL == "":
    # This is probably the first run so we need to install a model
    if "OPENAI_API_KEY" in os.environ:
        print("No default model set. openai is enabled. using gpt-3.5-turbo")
        DEFAULT_MODEL = "gpt-3.5-turbo"
    else:
        print("No model found and openai not enabled. Installing llama2:latest")
        pull_data = '{"name": "llama2:latest","stream": false}'
        try:
            pull_resp = requests.post(OLLAMA_API + "/pull", data=pull_data)
            pull_resp.raise_for_status()
        except requests.exceptions.HTTPError as err:
            raise SystemExit(err)
        set_default_model("llama2:latest")
        DEFAULT_MODEL = "llama2:latest"

if validate_model(DEFAULT_MODEL):
    logger.info("Using model: " + DEFAULT_MODEL)
else:
    logger.error("Model " + DEFAULT_MODEL + " not available.")
    logger.info(get_model_list())

    pull_data = '{"name": "' + DEFAULT_MODEL + '","stream": false}'
    try:
        pull_resp = requests.post(OLLAMA_API + "/pull", data=pull_data)
        pull_resp.raise_for_status()
    except requests.exceptions.HTTPError as err:
        raise SystemExit(err)

def set_msg_send_style(received_msg: str):
    """Will return a style for the message to send based on matched words in received message"""
    celebration_match = ["happy"]
    shooting_star_match = ["star", "stars"]
    fireworks_match = ["celebrate", "firework"]
    lasers_match = ["cool", "lasers", "laser"]
    love_match = ["love"]
    confetti_match = ["yay"]
    balloons_match = ["party"]
    echo_match = ["what did you say"]
    invisible_match = ["quietly"]
    gentle_match = []
    loud_match = ["hear"]
    slam_match = []

    received_msg_lower = received_msg.lower()
    if any(x in received_msg_lower for x in celebration_match):
        return "celebration"
    elif any(x in received_msg_lower for x in shooting_star_match):
        return "shooting_star"
    elif any(x in received_msg_lower for x in fireworks_match):
        return "fireworks"
    elif any(x in received_msg_lower for x in lasers_match):
        return "lasers"
    elif any(x in received_msg_lower for x in love_match):
        return "love"
    elif any(x in received_msg_lower for x in confetti_match):
        return "confetti"
    elif any(x in received_msg_lower for x in balloons_match):
        return "balloons"
    elif any(x in received_msg_lower for x in echo_match):
        return "echo"
    elif any(x in received_msg_lower for x in invisible_match):
        return "invisible"
    elif any(x in received_msg_lower for x in gentle_match):
        return "gentle"
    elif any(x in received_msg_lower for x in loud_match):
        return "loud"
    elif any(x in received_msg_lower for x in slam_match):
        return "slam"
    else:
        return
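The long if/elif chain above can be expressed more compactly as a keyword-to-style lookup table. The following is an equivalent sketch, not the project’s code; the keyword lists mirror the ones in set_msg_send_style (the empty gentle and slam lists are omitted, since they can never match):

```python
# Keyword-to-style table equivalent to the if/elif chain above.
# Ordering matters: dicts preserve insertion order, so the first
# matching style wins, just as in the original chain.
STYLE_KEYWORDS = {
    "celebration": ["happy"],
    "shooting_star": ["star", "stars"],
    "fireworks": ["celebrate", "firework"],
    "lasers": ["cool", "lasers", "laser"],
    "love": ["love"],
    "confetti": ["yay"],
    "balloons": ["party"],
    "echo": ["what did you say"],
    "invisible": ["quietly"],
    "loud": ["hear"],
}

def set_msg_send_style(received_msg: str):
    """Return a Sendblue send style based on keywords in the message."""
    received_msg_lower = received_msg.lower()
    for style, keywords in STYLE_KEYWORDS.items():
        if any(word in received_msg_lower for word in keywords):
            return style
    return None
```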

Two classes, Msg and Callback, are defined to represent the structure of incoming messages and callback data. The code also includes various functions and classes to handle different aspects of the messaging platform, such as setting default models, validating models, interacting with the Sendblue API, and processing messages. It also includes functions to handle slash commands, create messages from context, and append context to a file.

class Msg(BaseModel):
    accountEmail: str
    content: str
    media_url: str
    is_outbound: bool
    status: str
    error_code: int | None = None
    error_message: str | None = None
    message_handle: str
    date_sent: str
    date_updated: str
    from_number: str
    number: str
    to_number: str
    was_downgraded: bool | None = None
    plan: str

class Callback(BaseModel):
    accountEmail: str
    content: str
    is_outbound: bool
    status: str
    error_code: int | None = None
    error_message: str | None = None
    message_handle: str
    date_sent: str
    date_updated: str
    from_number: str
    number: str
    to_number: str
    was_downgraded: bool | None = None
    plan: str

def msg_openai(msg: Msg, model="gpt-3.5-turbo"):
    """Sends a message to openai"""
    message_with_context = create_messages_from_context("openai")

    # Add the user's message and system context to the messages list
    messages = [
        {"role": "user", "content": msg.content},
        {"role": "system", "content": "You are an AI assistant. You will answer in haiku."},
    ]

    # Convert JSON strings to Python dictionaries and add them to messages
    messages.extend(
        [
            json.loads(line)  # Convert each JSON string back into a dictionary
            for line in message_with_context
        ]
    )

    # Send the messages to the OpenAI model
    gpt_resp = client.chat.completions.create(
        model=model,
        messages=messages,
    )

    # Append the system context to the context file
    append_context("system", gpt_resp.choices[0].message.content)

    # Send a message to the sender
    msg_response = sendblue.send_message(
        msg.from_number,
        {
            "content": gpt_resp.choices[0].message.content,
            "status_callback": CALLBACK_URL,
        },
    )

    return

def msg_ollama(msg: Msg, model=None):
    """Sends a message to the ollama endpoint"""
    if model is None:
        logger.error("Model is None when calling msg_ollama")
        return  # Optionally handle the case more gracefully

    ollama_headers = {"Content-Type": "application/json"}
    ollama_data = (
        '{"model":"' + model +
        '", "stream": false, "prompt":"' +
        msg.content +
        " in under " +
        str(MAX_WORDS) +  # Make sure MAX_WORDS is a string
        ' words"}'
    )
    ollama_resp = requests.post(
        OLLAMA_API + "/generate", headers=ollama_headers, data=ollama_data
    )
    response_dict = json.loads(ollama_resp.text)
    if ollama_resp.ok:
        send_style = set_msg_send_style(msg.content)
        append_context("system", response_dict["response"])
        msg_response = sendblue.send_message(
            msg.from_number,
            {
                "content": response_dict["response"],
                "status_callback": CALLBACK_URL,
                "send_style": send_style,
            },
        )
    else:
        msg_response = sendblue.send_message(
            msg.from_number,
            {
                "content": "I'm sorry, I had a problem processing that question. Please try again.",
                "status_callback": CALLBACK_URL,
            },
        )
    return
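One caveat with msg_ollama: the request body is assembled by string concatenation, which produces invalid JSON if the prompt contains quotes or backslashes. A safer sketch using json.dumps follows; the field names match what the script sends to /api/generate, while the helper function name is ours:

```python
import json

# Build the same payload msg_ollama sends to Ollama's /api/generate,
# but let json.dumps handle quoting and escaping. The helper name is
# illustrative, not part of the project's code.
def build_ollama_payload(model: str, prompt: str, max_words: int) -> str:
    return json.dumps({
        "model": model,
        "stream": False,
        "prompt": f"{prompt} in under {max_words} words",
    })

payload = build_ollama_payload("llama2:latest", 'Say "hi"', 25)
```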

Navigate to the app/ directory and create a new file for your environment variables:

touch .env

Then add the following values, replacing the placeholders with your own keys:

SENDBLUE_API_KEY=your_sendblue_api_key
SENDBLUE_API_SECRET=your_sendblue_api_secret
OLLAMA_API_ENDPOINT=http://host.docker.internal:11434/api
OPENAI_API_KEY=your_openai_api_key

Next, add the ngrok authtoken to the Docker Compose file. You can get the authtoken from the ngrok dashboard.

services:
  lollm:
    build: ./app
    # command:
    #   - sleep
    #   - 1d
    ports:
      - 8000:8000
    env_file: ./app/.env
    volumes:
      - ./run/lollm:/run/lollm
    depends_on:
      - ollama
    restart: unless-stopped
    network_mode: "host"
  ngrok:
    image: ngrok/ngrok:alpine
    command:
      - "http"
      - "8000"
      - "--log"
      - "stdout"
    environment:
      - NGROK_AUTHTOKEN=2i6iXXXXXXXXhpqk1aY1
    network_mode: "host"
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - ./run/ollama:/home/ollama
    network_mode: "host"

Running the application stack

Next, you can run the application stack, as follows:

$ docker compose up

You will see output similar to the following:

[+] Running 4/4
✔ Container local-llm-messenger-ollama-1 Create… 0.0s
✔ Container local-llm-messenger-ngrok-1 Created 0.0s
✔ Container local-llm-messenger-lollm-1 Recreat… 0.1s
! lollm Published ports are discarded when using host network mode 0.0s
Attaching to lollm-1, ngrok-1, ollama-1
ollama-1 | 2024/06/20 03:14:46 routes.go:1011: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_TMPDIR:]"
ollama-1 | time=2024-06-20T03:14:46.308Z level=INFO source=images.go:725 msg="total blobs: 0"
ollama-1 | time=2024-06-20T03:14:46.309Z level=INFO source=images.go:732 msg="total unused blobs removed: 0"
ollama-1 | time=2024-06-20T03:14:46.309Z level=INFO source=routes.go:1057 msg="Listening on [::]:11434 (version 0.1.44)"
ollama-1 | time=2024-06-20T03:14:46.309Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama2210839504/runners
ngrok-1 | t=2024-06-20T03:14:46+0000 lvl=info msg="open config file" path=/var/lib/ngrok/ngrok.yml err=nil
ngrok-1 | t=2024-06-20T03:14:46+0000 lvl=info msg="open config file" path=/var/lib/ngrok/auth-config.yml err=nil
ngrok-1 | t=2024-06-20T03:14:46+0000 lvl=info msg="starting web service" obj=web addr=0.0.0.0:4040 allow_hosts=[]
ngrok-1 | t=2024-06-20T03:14:46+0000 lvl=info msg="client session established" obj=tunnels.session
ngrok-1 | t=2024-06-20T03:14:46+0000 lvl=info msg="tunnel session started" obj=tunnels.session
ngrok-1 | t=2024-06-20T03:14:46+0000 lvl=info msg="started tunnel" obj=tunnels name=command_line addr=http://localhost:8000 url=https://94e1-223-185-128-160.ngrok-free.app
ollama-1 | time=2024-06-20T03:14:48.602Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cuda_v11]"
ollama-1 | time=2024-06-20T03:14:48.603Z level=INFO source=types.go:71 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="7.7 GiB" available="3.9 GiB"
lollm-1 | INFO: Started server process [1]
lollm-1 | INFO: Waiting for application startup.
lollm-1 | INFO: Application startup complete.
lollm-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
ngrok-1 | t=2024-06-20T03:16:58+0000 lvl=info msg="join connections" obj=join id=ce119162e042 l=127.0.0.1:8000 r=[2401:4900:8838:8063:f0b0:1866:e957:b3ba]:54384
lollm-1 | OLLAMA API IS http://host.docker.internal:11434/api
lollm-1 | INFO: 2401:4900:8838:8063:f0b0:1866:e957:b3ba:0 - "GET / HTTP/1.1" 200 OK

If you’re testing it on a system without an NVIDIA GPU, then you can skip the deploy attribute of the Compose file. 

Watch the output for your ngrok endpoint. In our case, it shows: https://94e1-223-185-128-160.ngrok-free.app/

Next, append /msg to the ngrok URL; in our case: https://94e1-223-185-128-160.ngrok-free.app/msg

Then, add it under the webhooks URL section on Sendblue and save it (Figure 3). The ngrok service exposes the lollmm service on port 8000 and provides a secure tunnel to the public internet.

The ngrok service logs show that the web service has started, a client session has been established, and the tunnel session is up. The service authenticates with the ngrok authtoken specified in the Compose file, which is required to use ngrok. Overall, the ngrok service is running correctly and is able to establish a secure tunnel to the lollmm service.

Figure 3: Adding ngrok authentication token to webhooks.

Ensure that there are no error logs when you run the ngrok container (Figure 4).

Figure 4: Checking the logs for errors.

Ensure that the LoLLM Messenger container is actively up and running (Figure 5).

Figure 5: Ensure the LoLLM Messenger container is running.

The logs show that the Ollama service has opened the specified port (11434) and is listening for incoming connections. The logs also indicate that the Ollama service has mounted the ./run/ollama directory from the host machine to the /home/ollama directory within the container.

Overall, the Ollama service is running correctly and is ready to provide AI models for inference.

Testing the functionality

To test the functionality of the lollm service, you first need to add your contact number to the Sendblue dashboard. Then you should be able to send messages to the Sendblue number and observe the responses from the lollmm service (Figure 6).

Figure 6: Testing functionality of lollm service.

The Sendblue platform will send HTTP requests to the /msg endpoint of your lollmm service, and your lollmm service will process these requests and return the appropriate responses.

The lollmm service is set up to listen on port 8000.

The ngrok tunnel is started and provides a public URL, such as https://94e1-223-185-128-160.ngrok-free.app.

The lollmm service receives HTTP requests from the ngrok tunnel, including GET requests to the root path (/) and other paths, such as /favicon.ico, /predict, /mdg, and /msg.

The lollmm service responds to these requests with appropriate HTTP status codes, such as 200 OK for successful requests and 404 Not Found for requests to paths that do not exist.

The ngrok tunnel logs the join connections, indicating that clients are connecting to the lollmm service through the ngrok tunnel.
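To exercise the webhook without iMessage, you can also post a test payload straight to the /msg endpoint. The following sketch assumes the stack from docker compose up is listening on port 8000; the payload fields mirror the Msg model shown earlier, and every value is illustrative, not real Sendblue data:

```python
import json
import urllib.request

# Build a payload shaped like the Msg model from main.py.
# All values here are illustrative placeholders.
def build_test_msg(content: str, from_number: str) -> dict:
    return {
        "accountEmail": "you@example.com",
        "content": content,
        "media_url": "",
        "is_outbound": False,
        "status": "RECEIVED",
        "message_handle": "test-handle",
        "date_sent": "2024-06-20T03:14:46Z",
        "date_updated": "2024-06-20T03:14:46Z",
        "from_number": from_number,
        "number": from_number,
        "to_number": "+15555550199",
        "was_downgraded": False,
        "plan": "blue",
    }

def post_msg(url: str, payload: dict) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example (with the stack running):
#   post_msg("http://localhost:8000/msg", build_test_msg("/list", "+15555550100"))
```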

Figure 7: Sending requests and receiving responses.

The first time you chat with the LLM by typing /list (Figure 7), you can check the logs as shown:

ngrok-1 | t=2024-07-09T02:34:30+0000 lvl=info msg="join connections" obj=join id=12bd50a8030b l=127.0.0.1:8000 r=18.223.220.3:44370
lollm-1 | OLLAMA API IS http://host.docker.internal:11434/api
lollm-1 | INFO: 18.223.220.3:0 - "POST /msg HTTP/1.1" 200 OK
ngrok-1 | t=2024-07-09T02:34:53+0000 lvl=info msg="join connections" obj=join id=259fda936691 l=127.0.0.1:8000 r=18.223.220.3:36712
lollm-1 | INFO: 18.223.220.3:0 - "POST /msg HTTP/1.1" 200 OK

Next, let’s install the codellama model by typing /install codellama:latest (Figure 8).

Figure 8: Installing the `codellama` model.

You can see the following container logs once you set the default model to codellama:latest as shown:

ngrok-1 | t=2024-07-09T03:39:23+0000 lvl=info msg="join connections" obj=join id=026d8fad5c87 l=127.0.0.1:8000 r=18.223.220.3:36282
lollm-1 | setting default model
lollm-1 | INFO: 18.223.220.3:0 - "POST /msg HTTP/1.1" 200 OK

The lollmm service is running correctly and can handle HTTP requests from the ngrok tunnel. You can use the ngrok tunnel URL to test the functionality of the lollmm service by sending HTTP requests to the appropriate paths (Figure 9).

Figure 9: Testing the messaging functionality.

Conclusion

LoLLM Messenger is a valuable tool for developers and enthusiasts looking to push the boundaries of LLM integration within messaging apps. It allows developers to craft custom chatbots for specific needs, add real-time sentiment analysis to messages, or explore entirely new AI features in their messaging experience.

To get started, you can explore the LoLLM Messenger project on GitHub and discover the potential of local LLMs.

Learn more

Subscribe to the Docker Newsletter. 

Read the AI/ML Hackathon collection.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Docker Desktop 4.32: Beta Releases of Compose File Viewer, Terminal Shell Integration, and Volume Backups to Cloud Providers

In this post:

Compose File Viewer (Beta)

Terminal integration (Beta)

Enterprise-grade Volume Backup to cloud providers (Beta) 

Docker Desktop MSI Installer and new login enforcement alternatives (Early Access)

Docker Desktop 4.32 includes a series of powerful enhancements designed to streamline data workflows and elevate user productivity. The latest Docker Desktop release enhances the experience across development teams of all sizes and emphasizes our commitment to providing a secure hybrid development platform that enables efficient building, sharing, and running of innovative applications anywhere. 

Key features of the Docker Desktop 4.32 release include: 

Improving developer’s experience with Compose File Viewer (Beta)

Enhancing developer productivity with Terminal in Docker Desktop (Beta)

Simplifying data management with Volume Backups to Cloud Providers (Beta) 

Streamlining administration with Docker Desktop’s MSI Installer (Early Access) 

Compose File Viewer (Beta) now available

Launched as a Beta to a limited set of customers in the Docker Desktop 4.31 release, Compose File Viewer has now been rolled out to all customers. Users can now see enhanced instructions for setting up Compose Watch when launching the viewer from the Compose CLI.

Configuring multi-container applications can be complex, so Compose File Viewer helps developers see their Docker Compose configuration file in Docker Desktop, with information about each section a click away. This makes it simpler for developers to get oriented with basic Compose concepts and learn to set up Compose Watch, making it easier to sync code changes into running containers.

Check out this new File Viewer through the View Configuration option in the Compose command line or by viewing a Compose stack in the Containers tab, then clicking the View Configuration button.

These enhancements are another step forward as we continue improving Compose to help you get the benefits of containerized development faster.

Terminal experience in Docker Desktop  (Beta)

We are excited to introduce the new terminal feature in Docker Desktop. This enhancement integrates a terminal directly within the Docker Desktop GUI, enabling seamless transitions between CLI and GUI interactions within a single window. By incorporating a terminal shell into the Docker Desktop interface, we significantly reduce the friction associated with context switching for developers. 

This functionality is designed to streamline workflows, accelerate delivery times, and enhance overall developer productivity.

Figure 2: Terminal integrated in Docker Desktop.

Enterprise-grade Volume Backup to cloud providers (Beta) 

We are pleased to announce the release of an advanced Beta feature for interacting with volumes data within Docker Desktop. Building on our previously introduced Volumes Backup & Share functionalities, we are now introducing the capability to back up volumes to multiple cloud providers. 

With a Docker Business subscription, users can seamlessly back up their volumes to various cloud storage services, including AWS, Azure, and GCP. 

This new Volume Backup to cloud providers feature represents the latest enhancement in our ongoing efforts to streamline data management capabilities within Docker Desktop.

Figure 3: Quickly export data to external cloud storage.

Docker Desktop MSI Installer and new login enforcement alternatives (Early Access)

We have made it easier to enforce login for your organization and deploy using the MSI Installer, available for early access. These key enhancements aim to streamline administration, improve security, and enhance the user experience for Docker Business subscribers.

Docker is committed to helping enterprises of all sizes with enhanced Docker sign-in enforcement across Windows and macOS to help increase user logins, simplify administration, and reduce learning curves for IT administrators.

The Docker Desktop MSI Installer helps with mass deployments and customizations with standardized silent install parameters. 

Figure 4: Where to download the new MSI Installer in the Docker Admin Console.

Although these updates are currently available only for early access, they reflect Docker’s commitment to simplifying deployment and streamlining administration for organizations of all sizes. With more of these administrative offerings becoming available soon, we encourage IT teams and administrators to start planning for these changes to enhance their Docker experience.

Conclusion 

The Docker Desktop 4.32 release brings significant improvements aimed at streamlining workflows and boosting productivity for development teams of all sizes. With features like the Compose File Viewer, Terminal integration, and volume backups to cloud providers, Docker Desktop continues to simplify and enhance the developer experience. The new MSI Installer for easier administration also underlines our commitment to streamlining administration.

We look forward to seeing how these enhancements will help you build, share, and run innovative applications more effectively.

Learn more

Authenticate and update to receive your subscription level’s newest Docker Desktop features.

New to Docker? Create an account.

Subscribe to the Docker Newsletter.

Source: https://blog.docker.com/feed/

How an AI Assistant Can Help Configure Your Project’s Git Hooks

This ongoing Docker Labs GenAI series will explore the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real-time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing things as open source so you can play, explore, and hack with us, too.

Can an AI assistant help configure your project’s Git hooks? 

Git hooks can save manual work during repetitive tasks, such as authoring commit messages or running checks before committing an update. But they can also be hard to configure, and are project dependent. Can generative AI make Git hooks easier to configure and use?

Simple prompts

From a high level, the basic prompt is simple:

How do I set up git hooks?

Although this includes no details about the actual project, the response from many foundation models is still useful. If you run this prompt in ChatGPT, you’ll see that the response contains details about how to use the .git/hooks folder, hints about authoring hook scripts, and practical pointers on what to learn next. However, the advice is general; it has not been grounded in your project.

Project context

Your project itself is an important source of information for an assistant. Let’s start by providing information about types of source code in a project. Fortunately, there are plenty of existing tools for extracting project context, and these tools are often already available in Docker containers. 

For example, here’s an image that will analyze any Git repository and return a list of languages being used. Let’s update our prompt with this new context.

How do I set up git hooks?

{{# linguist }}
This project contains code from the language {{ language }} so if you have any
recommendations pertaining to {{ language }}, please include them.
{{/linguist}}

In this example, we use Mustache templates to bind the output of our “linguist” analysis into this prompt.

The response from an LLM-powered assistant will change dramatically. Armed with specific advice about what kinds of files might be changed, the LLM will generate sample scripts and make suggestions about specific tools that might be useful for the kinds of code developed in this project. It might even be possible to cut and paste code out of the response to try setting up hooks yourself. 

The pattern is quite simple. We already have tools to analyze projects, so let’s plug these in locally and give the LLM more context to make better suggestions (Figure 1).

Figure 1: Adding tools to provide context for LLM.
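The binding step itself is simple enough to sketch in plain Python: take the language list that a linguist-style analyzer produces and expand the template section once per language. This is a hand-rolled stand-in for the Mustache rendering described above, and the analyzer output format is an assumption:

```python
# A minimal stand-in for the Mustache-style binding described above:
# expand one block of prompt text per detected language.
BASE_PROMPT = "How do I set up git hooks?\n"

SECTION = (
    "This project contains code from the language {language} so if you "
    "have any recommendations pertaining to {language}, please include them.\n"
)

def render_prompt(languages: list[str]) -> str:
    """Append one grounded section to the base prompt per language."""
    prompt = BASE_PROMPT
    for language in languages:
        prompt += SECTION.format(language=language)
    return prompt

print(render_prompt(["Go", "Python"]))
```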

Expertise

Generative AI also offers new opportunities for experts to contribute knowledge that AI assistants can leverage to become even more useful. For example, we have learned that pre-commit can be helpful to organize the set of tools used to implement Git hooks. 

To represent this learning, we add this prompt:

When configuring git hooks, our organization uses a tool called
[pre-commit](https://github.com/pre-commit/pre-commit).

There’s also a base configuration that we have found useful in all projects. We also add that to our assistant’s knowledge base.

If a user wants to configure git hooks, use this template which will need to be written to pre-commit-config.yaml
in the root of the user's project.

Start with the following code block:

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.3.0
    hooks:
      - id: check-yaml
      - id: trailing-whitespace
      - id: check-merge-conflict
  - repo: https://github.com/jorisroovers/gitlint
    rev: main
    hooks:
      - id: gitlint
  - repo: local
    hooks:
```

Finally, as we learn about new tools that are useful for certain projects, we describe this information. For example, as an expert, I might want to suggest that teams using Golang include a particular linting tool in the Git hooks configuration.

If we detect `Go` in the project, add the following hook to the hooks entry in the `local` repo entry.

```yaml
- id: golangci-lint
  name: golang cli
  entry: golangci/golangci-lint
  files: ".go$"
```

With these additions, the response from our assistant becomes precise. We have found that our assistant can now write hook scripts and complete YAML configuration files that are project-specific and ready to copy directly into a project.

Somewhat surprisingly, the assistant can now also recommend tools not mentioned explicitly in our prompts, as long as they follow the same syntax established for other tools. Using our examples as guidance, the LLM suggests new tools but still configures them within our suggested framework and syntax.

Most importantly, the response from the assistant is now not only actionable to the developer, saving them time, but it is also specific enough that we could pass the response to a simple agent to take the action automatically.

Adding tools

For this example, the only tool we really need is a file-writer. The change to our prompt is to add one instruction to go ahead and write the configuration into the project.

Write the final yaml content to our project at the path pre-commit-config.yaml. Write both the `pre-commit` and `commit-message` scripts to `.git/hooks` and make them executable.

Besides the prompt, there is another crucial step that we are skipping at this point. The assistant must be told that it is capable of writing content into files. However, this is really just a registration step. 

The important thing is that we can give our agent the tools it needs to perform tasks. In doing so, the response from the LLM undergoes a transition: instead of text output, the LLM responds with instructions for our agent. If we’re using an OpenAI function call, we’ll see a request that looks something like the following JSON. It’s not meant to be read by us, of course; it’s an instruction to the agent that knows how to update your project for you.

{
  "id": "call_4LCo0CQqCHCGGZea3qlaTg5h",
  "type": "function",
  "function": {
    "name": "write_file",
    "arguments": "{\n  \"path\": \"pre-commit-config.yaml\",\n  \"content\": \"repos:\n  - repo: http://github.com/pre-commit/pre-commit-hooks\n    rev: v2.3.0\n    hooks:\n    - id: check-yaml\n    - id: trailing-whitespace\n    - id: check-merge-conflict\n  - repo: https://github.com/jorisroovers/gitlint\n    rev: main\n    hooks:\n    - id: gitlint\n  - repo: local\n    hooks:\n    - id: markdownlint\n      name: markdown linter\n      entry: markdownlint/markdownlint\n      files: \"\.md$\"\n    - id: python-black\n      name: python black formatter\n      entry: black\n      files: \"\.py$\"\"\n}"
  }
}
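The agent side of this exchange can be sketched in a few lines of Python. This is a hypothetical dispatcher, not the project's actual implementation: it reads the function name and the JSON-encoded arguments from a call like the one above and routes them to a local `write_file` tool. The `TOOLS` registry and `dispatch` helper are our own illustrative names.

```python
import json
from pathlib import Path

def write_file(path: str, content: str) -> None:
    """Tool implementation: write the generated content into the project."""
    Path(path).write_text(content)

# Registry mapping function names the LLM may emit to local tools.
TOOLS = {"write_file": write_file}

def dispatch(call: dict) -> None:
    """Execute one OpenAI-style function call emitted by the LLM."""
    fn = TOOLS[call["function"]["name"]]
    # The arguments arrive as a JSON string, not a parsed object.
    args = json.loads(call["function"]["arguments"])
    fn(**args)

call = {
    "id": "call_example",
    "type": "function",
    "function": {
        "name": "write_file",
        "arguments": json.dumps(
            {"path": "pre-commit-config.yaml", "content": "repos: []\n"}
        ),
    },
}
dispatch(call)
```

A real agent would validate the path and content before writing, but the core loop is exactly this: look up the tool, decode the arguments, apply them.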

A more sophisticated version of the file-writer function might communicate with an editor agent capable of presenting recommended file changes to a developer using native IDE concepts, like editor quick-fixes and hints. In other words, tools can help generative AI to meet developers where they are. And the answer to the question:

How do I set up git hooks?

becomes, “Let me just show you.”

Docker as tool engine

The tools mentioned in the previous sections have all been delivered as Docker containers.  One goal of this work has been to verify that an assistant can bootstrap itself starting from a Docker-only environment. Docker is important here because it has been critical in smoothing over many of the system/environment gaps that LLMs struggle with. 

We have observed that a significant barrier to activating even simple local assistants is the complexity of managing a safe and reliable environment for running these tools. Therefore, we are constraining ourselves to use only tools that can be lazily pulled from public registries.

For AI assistants to transform how we consume tools, we believe that both tool distribution and knowledge distribution are key factors. In the above example, we can see how LLM responses can be transformed by tools from unactionable and vague to hyper-project-focused and actionable. The difference is tools.

To follow along with this effort, check out the GitHub repository for this project.

Learn more

Subscribe to the Docker Newsletter.

Read the Docker Labs GenAI series.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Quelle: https://blog.docker.com/feed/

Docker Best Practices: Choosing Between RUN, CMD, and ENTRYPOINT

Docker’s flexibility and robustness as a containerization tool come with a complexity that can be daunting. Multiple methods are available to accomplish similar tasks, and users must understand the pros and cons of the available options to choose the best approach for their projects.

One confusing area concerns the RUN, CMD, and ENTRYPOINT Dockerfile instructions. In this article, we will discuss the differences between these instructions and describe use cases for each.

RUN

The RUN instruction is used in Dockerfiles to execute commands that build and configure the Docker image. These commands are executed during the image build process, and each RUN instruction creates a new layer in the Docker image. For example, if you create an image that requires specific software or libraries installed, you would use RUN to execute the necessary installation commands.

The following example shows how to instruct the Docker build process to update the apt cache and install Apache during an image build:

RUN apt update && apt -y install apache2

RUN instructions should be used judiciously to keep the image layers to a minimum, combining related commands into a single RUN instruction where possible to reduce image size.

CMD

The CMD instruction specifies the default command to run when a container is started from the Docker image. If no command is specified during the container startup (i.e., in the docker run command), this default is used. CMD can be overridden by supplying command-line arguments to docker run.

CMD is useful for setting default commands and easily overridden parameters. It is often used in images as a way of defining default run parameters and can be overridden from the command line when the container is run. 

For example, by default, you might want a web server to start, but users could override this to run a shell instead:

CMD ["apache2ctl", "-DFOREGROUND"]

Users can start the container with docker run -it <image> /bin/bash to get a Bash shell instead of starting Apache.  

ENTRYPOINT

The ENTRYPOINT instruction sets the default executable for the container. It is similar to CMD but, unlike CMD, it is not overridden by the command-line arguments passed to docker run. Instead, any command-line arguments are appended to the ENTRYPOINT command.

Note: Use ENTRYPOINT when you need your container to always run the same base command, and you want to allow users to append additional commands at the end. 

ENTRYPOINT is particularly useful for turning a container into a standalone executable. For example, suppose you are packaging a custom script that requires arguments (e.g., “my_script extra_args”). In that case, you can use ENTRYPOINT to always run the script process (“my_script”) and then allow the image users to specify the “extra_args” on the docker run command line. You can do the following:

ENTRYPOINT ["my_script"]

Combining CMD and ENTRYPOINT

The CMD instruction can be used to provide default arguments to an ENTRYPOINT if it is specified in the exec form. This setup allows the entry point to be the main executable and CMD to specify additional arguments that can be overridden by the user.

For example, you might have a container that runs a Python application where you always want to use the same application file but allow users to specify different command-line arguments:

ENTRYPOINT ["python", "/app/my_script.py"]
CMD ["--default-arg"]

Running docker run myimage --user-arg executes python /app/my_script.py --user-arg.
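Docker's rules for combining these instructions can be modeled as a small function. The following Python sketch is an illustration of the documented behavior, not Docker's implementation; the name `effective_argv` is our own. It computes the argv a container runs from the image's exec-form ENTRYPOINT, its CMD, and any arguments given to docker run.

```python
def effective_argv(entrypoint, cmd, run_args, entrypoint_override=None):
    """Return the argv a container will execute.

    Rules modeled here:
    - Arguments passed to `docker run` replace CMD entirely.
    - An exec-form ENTRYPOINT is always prepended, unless replaced
      with the --entrypoint flag.
    """
    ep = [entrypoint_override] if entrypoint_override else (entrypoint or [])
    tail = run_args if run_args else (cmd or [])
    return ep + tail

# ENTRYPOINT ["python", "/app/my_script.py"] with CMD ["--default-arg"]:
print(effective_argv(["python", "/app/my_script.py"], ["--default-arg"], []))
# ['python', '/app/my_script.py', '--default-arg']

# docker run myimage --user-arg replaces the CMD:
print(effective_argv(["python", "/app/my_script.py"], ["--default-arg"], ["--user-arg"]))
# ['python', '/app/my_script.py', '--user-arg']
```

Working through a few cases with a helper like this is a quick way to predict what a given Dockerfile and docker run invocation will actually execute.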

The following table provides an overview of these commands and use cases.

Command description and use cases

CMD: Defines the default executable of a Docker image; it can be overridden by docker run arguments. Use case: utility images that allow users to pass different executables and arguments on the command line.

ENTRYPOINT: Defines the default executable; it can be overridden only with the --entrypoint flag to docker run. Use case: images built for a specific purpose where overriding the default executable is not desired.

RUN: Executes commands to build image layers. Use case: building an image.

What is PID 1 and why does it matter?

In the context of Unix and Unix-like systems, including Docker containers, PID 1 refers to the first process started during system boot. All other processes are then started by PID 1, which in the process tree model is the parent of every process in the system. 

In Docker containers, the process that runs as PID 1 is crucial, because it is responsible for managing all other processes inside the container. Additionally, PID 1 is the process that reviews and handles signals from the Docker host. For example, a SIGTERM into the container will be caught and processed by PID 1, and the container should gracefully shut down.

When commands are executed in Docker using the shell form, a shell process (/bin/sh -c) typically becomes PID 1. However, the shell does not forward these signals to the application process, potentially leading to unclean shutdowns of the container. In contrast, when using the exec form, the command runs directly as PID 1 without involving a shell, which allows it to receive and handle signals directly.

This behavior ensures that the container can gracefully stop, restart, or handle interruptions, making the exec form preferable for applications that require robust and responsive signal handling.
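As a rough illustration of what "handling signals as PID 1" means, here is a plain Python sketch (not Docker-specific code) of the kind of SIGTERM handler a well-behaved main process is expected to install. The `shutting_down` flag stands in for whatever cleanup a real server would perform.

```python
import os
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    """What a well-behaved PID 1 should do: catch SIGTERM and clean up."""
    global shutting_down
    shutting_down = True  # in a real server: stop accepting work, flush, exit

signal.signal(signal.SIGTERM, handle_sigterm)

# Simulate `docker stop` delivering SIGTERM to the main process:
os.kill(os.getpid(), signal.SIGTERM)
print("graceful shutdown started:", shutting_down)
```

When the shell form wraps your process, this handler never runs in your application, because the signal stops at the /bin/sh parent.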

Shell and exec forms

In the previous examples, we used two ways to pass arguments to the RUN, CMD, and ENTRYPOINT instructions. These are referred to as shell form and exec form. 

Note: The key visual difference is that the exec form is passed as a comma-delimited array of commands and arguments with one argument/command per element. Conversely, shell form is expressed as a string combining commands and arguments. 

Each form has implications for executing commands within containers, influencing everything from signal handling to environment variable expansion. The following table provides a quick reference guide for the different forms.

Shell and exec form reference

Shell form: Takes the form <INSTRUCTION> <COMMAND>. Example: CMD echo TEST or ENTRYPOINT echo TEST.

Exec form: Takes the form <INSTRUCTION> ["EXECUTABLE", "PARAMETER"]. Example: CMD ["echo", "TEST"] or ENTRYPOINT ["echo", "TEST"].

In the shell form, the command is run in a subshell, typically /bin/sh -c on Linux systems. This form is useful because it allows shell processing (like variable expansion, wildcards, etc.), making it more flexible for certain types of commands (see this shell scripting article for examples of shell processing). However, it also means that the process running your command isn’t the container’s PID 1, which can lead to issues with signal handling because signals sent by Docker (like SIGTERM for graceful shutdowns) are received by the shell rather than the intended process.

The exec form does not invoke a command shell. This means the command you specify is executed directly as the container’s PID 1, which is important for correctly handling signals sent to the container. Additionally, this form does not perform shell expansions, so it’s more secure and predictable, especially for specifying arguments or commands from external sources.
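The variable-expansion difference is easy to demonstrate outside of Docker. In this Python sketch, the first call mimics the shell form (the command line is interpreted by /bin/sh -c, so $HOME expands) and the second mimics the exec form (argv is executed directly, so $HOME stays a literal string). It assumes a Unix-like system with /bin/sh and an echo binary on the PATH.

```python
import subprocess

# Shell form analogue: the command line is interpreted by /bin/sh -c.
shell_out = subprocess.run(
    ["/bin/sh", "-c", "echo $HOME"], capture_output=True, text=True
).stdout.strip()

# Exec form analogue: argv is executed directly; no shell, no expansion.
exec_out = subprocess.run(
    ["echo", "$HOME"], capture_output=True, text=True
).stdout.strip()

print("shell form:", shell_out)  # the expanded home directory
print("exec form:", exec_out)    # the literal string "$HOME"
```

The same split explains the signal behavior: in the first case your process is a child of the shell, in the second it is the process itself.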

Putting it all together

To illustrate the practical application and nuances of Docker’s RUN, CMD, and ENTRYPOINT instructions, along with the choice between shell and exec forms, let’s review some examples. These examples demonstrate how each instruction can be utilized effectively in real-world Dockerfile scenarios, highlighting the differences between shell and exec forms. 

Through these examples, you’ll better understand when and how to use each directive to tailor container behavior precisely to your needs, ensuring proper configuration, security, and performance of your Docker containers. This hands-on approach will help consolidate the theoretical knowledge we’ve discussed into actionable insights that can be directly applied to your Docker projects.

RUN instruction

For RUN, used during the Docker build process to install packages or modify files, choosing between shell and exec form can depend on the need for shell processing. The shell form is necessary for commands that require shell functionality, such as pipelines or file globbing. However, the exec form is preferable for straightforward commands without shell features, as it reduces complexity and potential errors.

# Shell form, useful for complex scripting
RUN apt-get update && apt-get install -y nginx

# Exec form, for direct command execution
RUN ["apt-get", "update"]
RUN ["apt-get", "install", "-y", "nginx"]

CMD and ENTRYPOINT

These instructions control container runtime behavior. Using exec form with ENTRYPOINT ensures that the container’s main application handles signals directly, which is crucial for proper startup and shutdown behavior.  CMD can provide default parameters to an ENTRYPOINT defined in exec form, offering flexibility and robust signal handling.

# ENTRYPOINT with exec form for direct process control
ENTRYPOINT ["httpd"]

# CMD provides default parameters, can be overridden at runtime
CMD ["-D", "FOREGROUND"]

Signal handling and flexibility

Using ENTRYPOINT in exec form and CMD to specify parameters ensures that Docker containers can handle operating system signals gracefully, respond to user inputs dynamically, and maintain secure and predictable operations. 

This setup is particularly beneficial for containers that run critical applications needing reliable shutdown and configuration behaviors. The following table shows key differences between the forms.

Key differences between shell and exec

Form: Shell form commands are written without [] brackets and are run by the container’s shell (e.g., /bin/sh -c); exec form commands are written with [] brackets and are run directly, not through a shell.

Variable substitution: The shell form inherits environment variables from the shell, such as $HOME and $PATH; the exec form does not inherit shell environment variables but behaves the same for ENV instruction variables.

Shell features: The shell form supports sub-commands, piping output, chaining commands, I/O redirection, and so on; the exec form does not support shell features.

Signal trapping and forwarding: Most shells do not forward process signals to child processes; the exec form directly traps and forwards signals like SIGINT.

Usage with ENTRYPOINT: The shell form can cause issues with signal forwarding; the exec form is recommended due to better signal handling.

CMD as ENTRYPOINT parameters: Not possible with the shell form; with the exec form, if the first item in the array is not a command, all items are used as parameters for the ENTRYPOINT.

Figure 1 provides a decision tree for using RUN, CMD, and ENTRYPOINT in building a Dockerfile.

Figure 1: Decision tree — RUN, CMD, ENTRYPOINT.

Figure 2 shows a decision tree to help determine when to use exec form or shell form.

Figure 2: Decision tree — exec vs. shell form.

Examples

The following section will walk through the high-level differences between CMD and ENTRYPOINT. In these examples, the RUN command is not included, given that the only decision to make there is easily handled by reviewing the two different formats.

Test Dockerfile

# syntax=docker/dockerfile:1.3-labs
# The parser directive above must be the first line of the Dockerfile;
# it selects the 1.3-labs syntax, which supports heredocs.

# Use the Ubuntu 20.04 image as the base image
FROM ubuntu:20.04

# Run the following commands inside the container:
# 1. Update the package lists for upgrades and new package installations
# 2. Install the apache2-utils package (which includes the 'ab' tool)
# 3. Remove the package lists to reduce the image size
#
# This is all run in a HEREDOC; see
# https://www.docker.com/blog/introduction-to-heredocs-in-dockerfiles/
# for more details.
#
RUN <<EOF
apt-get update;
apt-get install -y apache2-utils;
rm -rf /var/lib/apt/lists/*;
EOF

# Set the default command
CMD ab

First build

We will build this image and tag it as ab.

$ docker build -t ab .

[+] Building 7.0s (6/6) FINISHED docker:desktop-linux
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 730B 0.0s
=> [internal] load metadata for docker.io/library/ubuntu:20.04 0.4s
=> CACHED [1/2] FROM docker.io/library/ubuntu:20.04@sha256:33a5cc25d22c45900796a1aca487ad7a7cb09f09ea00b779e 0.0s
=> [2/2] RUN <<EOF (apt-get update;…) 6.5s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:99ca34fac6a38b79aefd859540f88e309ca759aad0d7ad066c4931356881e518 0.0s
=> => naming to docker.io/library/ab

Run with CMD ab

Without any arguments, we get a usage block as expected.

$ docker run ab
ab: wrong number of arguments
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
-n requests Number of requests to perform
-c concurrency Number of multiple requests to make at a time
-t timelimit Seconds to max. to spend on benchmarking
This implies -n 50000
-s timeout Seconds to max. wait for each response
Default is 30 seconds
<-- SNIP -->

However, if I run ab and include a URL to test, I initially get an error:

$ docker run --rm ab https://jayschmidt.us
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "https://jayschmidt.us": stat https://jayschmidt.us: no such file or directory: unknown.

The issue here is that the string supplied on the command line — https://jayschmidt.us — is overriding the CMD instruction, and that is not a valid command, resulting in an error being thrown. So, we need to specify the command to run:

$ docker run --rm ab ab https://jayschmidt.us/
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking jayschmidt.us (be patient)…..done

Server Software: nginx
Server Hostname: jayschmidt.us
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-ECDSA-AES256-GCM-SHA384,256,256
Server Temp Key: X25519 253 bits
TLS Server Name: jayschmidt.us

Document Path: /
Document Length: 12992 bytes

Concurrency Level: 1
Time taken for tests: 0.132 seconds
Complete requests: 1
Failed requests: 0
Total transferred: 13236 bytes
HTML transferred: 12992 bytes
Requests per second: 7.56 [#/sec] (mean)
Time per request: 132.270 [ms] (mean)
Time per request: 132.270 [ms] (mean, across all concurrent requests)
Transfer rate: 97.72 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 90 90 0.0 90 90
Processing: 43 43 0.0 43 43
Waiting: 43 43 0.0 43 43
Total: 132 132 0.0 132 132

Run with ENTRYPOINT

In this run, we remove the CMD ab instruction from the Dockerfile, replace it with ENTRYPOINT ["ab"], and then rebuild the image.

This is similar to but different from the CMD instruction: when you use ENTRYPOINT, you cannot override the command unless you use the --entrypoint flag on the docker run command. Instead, any arguments passed to docker run are treated as arguments to the ENTRYPOINT.

$ docker run --rm ab "https://jayschmidt.us/"
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking jayschmidt.us (be patient)…..done

Server Software: nginx
Server Hostname: jayschmidt.us
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-ECDSA-AES256-GCM-SHA384,256,256
Server Temp Key: X25519 253 bits
TLS Server Name: jayschmidt.us

Document Path: /
Document Length: 12992 bytes

Concurrency Level: 1
Time taken for tests: 0.122 seconds
Complete requests: 1
Failed requests: 0
Total transferred: 13236 bytes
HTML transferred: 12992 bytes
Requests per second: 8.22 [#/sec] (mean)
Time per request: 121.709 [ms] (mean)
Time per request: 121.709 [ms] (mean, across all concurrent requests)
Transfer rate: 106.20 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 91 91 0.0 91 91
Processing: 31 31 0.0 31 31
Waiting: 31 31 0.0 31 31
Total: 122 122 0.0 122 122

What about syntax?

In the example above, we use ENTRYPOINT ["ab"] syntax to wrap the command we want to run in square brackets and quotes. However, it is possible to specify ENTRYPOINT ab (without quotes or brackets).

Let’s see what happens when we try that.

$ docker run --rm ab "https://jayschmidt.us/"
ab: wrong number of arguments
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
-n requests Number of requests to perform
-c concurrency Number of multiple requests to make at a time
-t timelimit Seconds to max. to spend on benchmarking
This implies -n 50000
-s timeout Seconds to max. wait for each response
Default is 30 seconds
<-- SNIP -->

Your first thought will likely be to re-run the docker run command as we did for CMD ab above, giving both the executable and the argument:

$ docker run --rm ab ab "https://jayschmidt.us/"
ab: wrong number of arguments
Usage: ab [options] [http[s]://]hostname[:port]/path
Options are:
-n requests Number of requests to perform
-c concurrency Number of multiple requests to make at a time
-t timelimit Seconds to max. to spend on benchmarking
This implies -n 50000
-s timeout Seconds to max. wait for each response
Default is 30 seconds
<-- SNIP -->

This is because a shell-form ENTRYPOINT runs under /bin/sh -c, and arguments passed to docker run are not appended to it, which is why ab never sees the URL; the ENTRYPOINT itself can only be overridden by explicitly adding the --entrypoint flag to the docker run command. The takeaway is to always use the exec form of ENTRYPOINT when you want to force the use of a given executable in the container when it is run.

Wrapping up: Key takeaways and best practices

The decision-making process involving the use of RUN, CMD, and ENTRYPOINT, along with the choice between shell and exec forms, showcases Docker’s intricate nature. Each command serves a distinct purpose in the Docker ecosystem, impacting how containers are built, operate, and interact with their environments. 

By selecting the right command and form for each specific scenario, developers can construct Docker images that are more reliable, secure, and optimized for efficiency. This level of understanding and application of Docker’s commands and their formats is crucial for fully harnessing Docker’s capabilities. Implementing these best practices ensures that applications deployed in Docker containers achieve maximum performance across various settings, enhancing development workflows and production deployments.

Learn more

CMD

ENTRYPOINT

RUN

Stay updated on the latest Docker news! Subscribe to the Docker Newsletter.

Get the latest release of Docker Desktop.

New to Docker? Get started.

Have questions? The Docker community is here to help.


How to Run Hugging Face Models Programmatically Using Ollama and Testcontainers

Hugging Face now hosts more than 700,000 models, with the number continuously rising. It has become the premier repository for AI/ML models, catering to both general and highly specialized needs.

As the adoption of AI/ML models accelerates, more application developers are eager to integrate them into their projects. However, the entry barrier remains high due to the complexity of setup and lack of developer-friendly tools. Imagine if deploying an AI/ML model could be as straightforward as spinning up a database. Intrigued? Keep reading to find out how.

Introduction to Ollama and Testcontainers

Recently, Ollama announced support for running models from Hugging Face. This development is exciting because it brings the rich ecosystem of AI/ML components from Hugging Face to Ollama end users, who are often developers. 

Testcontainers libraries already provide an Ollama module, making it straightforward to spin up a container with Ollama without needing to know the details of how to run Ollama using Docker:

import org.testcontainers.ollama.OllamaContainer;

var ollama = new OllamaContainer("ollama/ollama:0.1.44");
ollama.start();

These lines of code are all that is needed to have Ollama running inside a Docker container effortlessly.

Running models in Ollama

By default, Ollama does not include any models, so you need to download the one you want to use. With Testcontainers, this step is straightforward by leveraging the execInContainer API provided by Testcontainers:

ollama.execInContainer("ollama", "pull", "moondream");

At this point, you have the moondream model ready to be used via the Ollama API. 

Excited to try it out? Hold on for a bit. This model is running in a container, so what happens if the container dies? Will you need to spin up a new container and pull the model again? Ideally not, as these models can be quite large.

Thankfully, Testcontainers makes it easy to handle this scenario, by providing an easy-to-use API to commit a container image programmatically:

public void createImage(String imageName) {
var ollama = new OllamaContainer("ollama/ollama:0.1.44");
ollama.start();
ollama.execInContainer("ollama", "pull", "moondream");
ollama.commitToImage(imageName);
}

This code creates an image from the container with the model included. In subsequent runs, you can create a container from that image, and the model will already be present. Here’s the pattern:

var imageName = "tc-ollama-moondream";
var ollama = new OllamaContainer(DockerImageName.parse(imageName)
.asCompatibleSubstituteFor("ollama/ollama:0.1.44"));
try {
ollama.start();
} catch (ContainerFetchException ex) {
// If image doesn't exist, create it. Subsequent runs will reuse the image.
createImage(imageName);
ollama.start();
}

Now, you have a model ready to be used, and because it is running in Ollama, you can interact with its API:

var image = getImageInBase64("/whale.jpeg");
String response = given()
.baseUri(ollama.getEndpoint())
.header(new Header("Content-Type", "application/json"))
.body(new CompletionRequest("moondream:latest", "Describe the image.", Collections.singletonList(image), false))
.post("/api/generate")
.getBody().as(CompletionResponse.class).response();

System.out.println("Response from LLM " + response);
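Independent of the Java client shown above, the wire format is straightforward. The following Python sketch builds the JSON body that a completion request posts to Ollama's /api/generate endpoint; the `completion_request` helper is our own illustrative name, while the field names (model, prompt, images, stream) follow the Ollama REST API.

```python
import json

def completion_request(model: str, prompt: str, images=None, stream=False) -> str:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    body = {"model": model, "prompt": prompt, "stream": stream}
    if images:
        # Base64-encoded images, for multimodal models such as moondream.
        body["images"] = images
    return json.dumps(body)

payload = completion_request("moondream:latest", "Describe the image.",
                             images=["<base64 image data>"])
print(payload)
```

Any HTTP client can then POST this payload to ollama.getEndpoint() + "/api/generate" and read the response field from the returned JSON.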

Using Hugging Face models

The previous example demonstrated using a model already provided by Ollama. However, with the ability to use Hugging Face models in Ollama, your available model options have now expanded by thousands. 

To use a model from Hugging Face in Ollama, you need a GGUF file for the model. Currently, there are 20,647 models available in GGUF format. How cool is that?

The steps to run a Hugging Face model in Ollama are straightforward, but we’ve simplified the process further by scripting it into a custom OllamaHuggingFaceContainer. Note that this custom container is not part of the default library, so you can copy and paste the implementation of OllamaHuggingFaceContainer and customize it to suit your needs.

To run a Hugging Face model, do the following:

public void createImage(String imageName, String repository, String model) {
var hfModel = new OllamaHuggingFaceContainer.HuggingFaceModel(repository, model);
var huggingFaceContainer = new OllamaHuggingFaceContainer(hfModel);
huggingFaceContainer.start();
huggingFaceContainer.commitToImage(imageName);
}

By providing the repository name and the model file as shown, you can run Hugging Face models in Ollama via Testcontainers. 

You can find an example using an embedding model and an example using a chat model on GitHub.

Customize your container

One key strength of using Testcontainers is its flexibility in customizing container setups to fit specific project needs by encapsulating complex setups into manageable containers. 

For example, you can create a custom container tailored to your requirements. Here’s an example of TinyLlama, a specialized container for spinning up the DavidAU/DistiLabelOrca-TinyLLama-1.1B-Q8_0-GGUF model from Hugging Face:

public class TinyLlama extends OllamaContainer {

private final String imageName;

public TinyLlama(String imageName) {
super(DockerImageName.parse(imageName)
.asCompatibleSubstituteFor("ollama/ollama:0.1.44"));
this.imageName = imageName;
}

public void createImage(String imageName) {
var ollama = new OllamaContainer("ollama/ollama:0.1.44");
ollama.start();
try {
ollama.execInContainer("apt-get", "update");
ollama.execInContainer("apt-get", "upgrade", "-y");
ollama.execInContainer("apt-get", "install", "-y", "python3-pip");
ollama.execInContainer("pip", "install", "huggingface-hub");
ollama.execInContainer(
"huggingface-cli",
"download",
"DavidAU/DistiLabelOrca-TinyLLama-1.1B-Q8_0-GGUF",
"distilabelorca-tinyllama-1.1b.Q8_0.gguf",
"--local-dir",
"."
);
ollama.execInContainer(
"sh",
"-c",
String.format("echo '%s' > Modelfile", "FROM distilabelorca-tinyllama-1.1b.Q8_0.gguf")
);
ollama.execInContainer("ollama", "create", "distilabelorca-tinyllama-1.1b.Q8_0.gguf", "-f", "Modelfile");
ollama.execInContainer("rm", "distilabelorca-tinyllama-1.1b.Q8_0.gguf");
ollama.commitToImage(imageName);
} catch (IOException | InterruptedException e) {
throw new ContainerFetchException(e.getMessage());
}
}

public String getModelName() {
return "distilabelorca-tinyllama-1.1b.Q8_0.gguf";
}

@Override
public void start() {
try {
super.start();
} catch (ContainerFetchException ex) {
// If image doesn't exist, create it. Subsequent runs will reuse the image.
createImage(imageName);
super.start();
}
}
}

Once defined, you can easily instantiate and utilize your custom container in your application:

var tinyLlama = new TinyLlama("example");
tinyLlama.start();
String response = given()
.baseUri(tinyLlama.getEndpoint())
.header(new Header("Content-Type", "application/json"))
.body(new CompletionRequest(tinyLlama.getModelName() + ":latest", List.of(new Message("user", "What is the capital of France?")), false))
.post("/api/chat")
.getBody().as(ChatResponse.class).message.content;
System.out.println("Response from LLM " + response);

Note how all the implementation details are hidden inside the TinyLlama class; the end user doesn’t need to know how to install the model into Ollama, what GGUF is, or that getting huggingface-cli requires pip install huggingface-hub.

Advantages of this approach

Programmatic access: Developers gain seamless programmatic access to the Hugging Face ecosystem.

Reproducible configuration: All configuration, from setup to lifecycle management is codified, ensuring reproducibility across team members and CI environments.

Familiar workflows: By using containers, developers familiar with containerization can easily integrate AI/ML models, making the process more accessible.

Automated setups: Provides a straightforward clone-and-run experience for developers.

This approach leverages the strengths of both Hugging Face and Ollama, supported by the automation and encapsulation provided by the Testcontainers module, making powerful AI tools more accessible and manageable for developers across different ecosystems.

Conclusion

Integrating AI models into applications need not be a daunting task. By leveraging Ollama and Testcontainers, developers can seamlessly incorporate Hugging Face models into their projects with minimal effort. This approach not only simplifies development environment setup but also ensures reproducibility and ease of use. With the ability to programmatically manage models and containerize them for consistent environments, developers can focus on building innovative solutions without getting bogged down by complex setup procedures.

The combination of Ollama’s support for Hugging Face models and Testcontainers’ robust container management capabilities provides a powerful toolkit for modern AI development. As AI continues to evolve and expand, these tools will play a crucial role in making advanced models accessible and manageable for developers across various fields. So, dive in, experiment with different models, and unlock the potential of AI in your applications today.

Stay current on the latest Docker news. Subscribe to the Docker Newsletter.

Learn more

Visit the Testcontainers website.

Get started with Testcontainers Cloud by creating a free account.

Read LLM Everywhere: Docker for Local and Hugging Face Hosting.

Learn how to Effortlessly Build Machine Learning Apps with Hugging Face’s Docker Spaces.

Get the latest release of Docker Desktop.

Source: https://blog.docker.com/feed/

Using Generative AI to Create Runnable Markdown

This ongoing GenAI Docker Labs series will explore the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real-time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing things as open source so you can play, explore, and hack with us, too.

Generative AI (GenAI) is changing how we interact with tools. Today, we might experience this predominantly through the use of new AI-powered chat assistants, but there are other opportunities for generative AI to improve the life of a developer.

When developers start working on a new project, they need to get up to speed on the tools used in that project. A common convention is to document these practices in a project README.md and to version that documentation along with the project.

Can we use generative AI to generate this content? We want this content to represent best practices for how tools should be used in general but, more importantly, how tools should be used in this particular project.

We can think of this as a kind of conversation between developers, agents representing tools used by a project, and the project itself. Let’s look at this for the Docker tool itself.

Generating Markdown in VSCode

For this project, we have written a VSCode extension that adds one new command called “Generate a runbook for this project.” Figure 1 shows it in action:

Figure 1: VSCode extension to generate a runbook.

This approach combines prompts written by tool experts with knowledge about the project itself. This combined context improves the LLM’s ability to generate documentation (Figure 2).

Figure 2: This approach combines expert prompts with knowledge about the project itself.

Although we’re illustrating this idea on a tool that we know very well (Docker!), the idea of generating content in this manner is quite generic. The prompts we used for getting started with the Docker build, run, and compose are available from GitHub. There is certainly an art to writing these prompts, but we think that tool experts have the right knowledge to create prompts of this kind, especially if AI assistants can then help them make their work easier to consume.

There is also an essential point here. If we think of the project as a database from which we can retrieve context, then we’re effectively giving an LLM the ability to retrieve facts about the project. This allows our prompts to depend on local context. For a Docker-specific example, we might want to prompt the AI not to talk about Compose if the project has no compose.yaml file. 

“I am not using Docker Compose in this project.”

That turns out to be a transformative user prompt if it’s true. This is what we’d normally learn through a conversation. However, there are certain project details that are always useful. This is why having our assistants right there in the local project can be so helpful.
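A sketch of what such a project-fact check could look like, reduced to plain shell (the extension gathers this context inside the editor; the file names checked here are just the common Compose defaults, so treat this as an illustration only):

```shell
# Hypothetical context probe: only include Compose prompts when the
# project actually contains a Compose file.
if [ -f compose.yaml ] || [ -f docker-compose.yml ]; then
  echo "Compose detected: include Compose prompts"
else
  echo "No Compose file: skip Compose prompts"
fi
```

The LLM never has to guess: the retrieved fact becomes part of the prompt, which is exactly the “I am not using Docker Compose in this project” signal described above.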

Runnable Markdown

Although Markdown files are mainly for reading, they often contain runnable things. LLMs converse with us in text that often contains code blocks that represent actual runnable commands. And, in VSCode, developers use the embedded terminal to run commands against the currently open project. Let’s short-circuit this interaction and make commands runnable directly from these Markdown runbooks.

In the current extension, we’ve added a code action to every code block that contains a shell command so that users can launch that command in the embedded terminal. During our exploration of this functionality, we have found that treating the Markdown file as a kind of REPL (read-eval-print loop) can help to refine the output from the LLM and improve the final content. Figure 3 shows what this looks like in action:

Figure 3: Adding code to allow users to launch the command in the embedded terminal.
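The extraction step itself is conceptually simple. A rough shell approximation of what “find every runnable shell block in this Markdown file” means (the extension implements this inside VSCode; this one-liner is only an illustration):

```shell
# Print the contents of every sh/bash/shell fenced code block in README.md.
awk '/^```(sh|bash|shell)$/ {run=1; next} /^```$/ {run=0} run' README.md
```

Each extracted line can then be offered to the user as a command to launch in the embedded terminal.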

Markdown extends your editor

In the long run, nobody is going to navigate to a Markdown file in order to run a command. However, we can treat these Markdown files as scripts that create commands for the developer’s edit session. We can even let developers bind them to keystrokes (e.g., type ,b to run the build code block from your project runbook).

In the end, this is just the AI Assistant talking to itself. The Assistant recommends a command. We find the command useful. We turn it into a shortcut. The Assistant remembers this shortcut because it’s in our runbook, and then makes it available whenever we’re developing this project.

Figure 4: The Assistant in action.

Figure 4 shows a real feedback loop between the Assistant, the generated content, and the developer that is actually running these commands. 

As developers, we tend to vote with our keyboards. If this command is useful, let’s make it really easy to run! And if it’s useful for me, it might be useful for other members of my team, too.

The GitHub repository and install instructions are ready for you to try today.

For more, see this demo: VSCode Walkthrough of Runnable Markdown from GenAI.

Subscribe to Docker Navigator to stay current on the latest Docker news.

Learn more

Subscribe to the Docker Newsletter.

Read Docker, Putting the AI in Containers.

Read the AI Trends Report 2024: AI’s Growing Role in Software Development.


ReadMeAI: An AI-powered README Generator for Developers

This post was written in collaboration with Docker AI/ML Hackathon participants Gitanshu Sankhla and Vijay Barma.

In this AI/ML Hackathon post, we’ll share another interesting winning project from last year’s Docker AI/ML Hackathon. This time, we will dive into ReadMeAI, one of the honorable mention winners. 

For many developers, planning and writing code is the most enjoyable part of the process. It’s where creativity meets logic, and lines of code transform into solutions. Although some developers find writing documentation equally fulfilling, crafting clear and concise code instructions isn’t for everyone.

Imagine you’re a developer working on a complex project with a team. You just pushed your final commit with a sigh of relief, but the clock is ticking on your deadline. You know that clear documentation is crucial. Your teammates need to understand your code’s intricacies for smooth integration, but writing all that documentation feels like another project entirely, stealing precious time from bug fixes and testing. That’s where ReadMeAI, an AI-powered README generator, fits in. 

What makes ReadMeAI unique?

The following demo, which was submitted to the AI/ML Hackathon, provides an overview of ReadMeAI (Figure 1).

Figure 1: Demo of ReadMeAI as submitted to the AI/ML Hackathon.

The ReadMeAI tool allows users to upload a code file and describe their project. The tool generates Markdown code, which can be edited in real-time using a code editor, and the changes are previewed instantly.

The user interface of ReadMeAI is designed to be clean and modern, making the application easy to use for all users.

Benefits of ReadMeAI include:

Effortless documentation: Upload your code, provide a brief description, and let ReadMeAI seamlessly generate a comprehensive Markdown file for your README.

Seamless collaboration: ReadMeAI promotes well-structured READMEs with essential sections, making it easier for your team to understand and contribute to the codebase, fostering smoother collaboration.

Increased efficiency: Stop wasting time on boilerplate documentation. ReadMeAI automates the initial draft of your README, freeing up valuable developer time for coding, testing, and other crucial project tasks.

Use cases include:

API documentation kick-off: ReadMeAI provides a solid foundation for your API documentation. It generates an initial draft outlining API endpoints, parameters, and expected responses. This jumpstarts your process and lets you focus on the specifics of your API’s functionality.

Rapid prototyping and documentation: During rapid prototyping, functionality often takes priority over documentation. ReadMeAI bridges this gap. It quickly generates a basic README with core information, allowing developers to have documentation in place while focusing on building the prototype.

Open source project kick-off: ReadMeAI can jumpstart the documentation process for your open source project. Simply provide your codebase and a brief description, and ReadMeAI generates a well-structured README file with essential sections like installation instructions, usage examples, and contribution guidelines. This saves you time and ensures consistent documentation across your projects.

Focus on what you do best — coding. Let ReadMeAI handle the rest.

How does it work?

ReadMeAI converts code and description into a good-looking README file. Users can upload code files and describe their code in a few words, and ReadMeAI will generate Markdown code for your README. You will get a built-in editor to format your README according to your needs, and then you can download your README in Markdown and HTML format. 

Figure 2 shows an overview of the ReadMeAI architecture.

Figure 2: Architecture of the ReadMeAI tool displaying frontend and backend.

Technical stack

The ReadMeAI tech stack includes:

Node.js: A JavaScript runtime that executes the server-side logic and interactions.

Express: A popular Node.js framework that handles routing, middleware, and request handling.

Google PaLM API: Google’s Pathways Language Model (PaLM) is a 540-billion parameter transformer-based large language model. It is used in the ReadMeAI project to generate a Markdown README based on the uploaded code and user description.

Embedded JavaScript (EJS): A templating engine that allows you to render and add dynamic content to the HTML on the server side.

Cascading Style Sheets (CSS): Adds styling to the generated Markdown content.

JavaScript: Adds interactivity to the front end, handles client-side logic, and communicates with the server side.

AI integration and markdown generation

The AI integration is handled by the controllers/app.js file (shown below), specifically in the postApp function. The uploaded code and user description are passed to the AI integration, which uses the Google PaLM API to generate a Markdown README. 

The Markdown generation is implemented in the postApp function. The AI-generated Markdown is converted into HTML for preview using the showdown library.

const fs = require('fs');
const path = require('path');

const showdown = require('showdown');
const multer = require('multer');
const zip = require('express-zip');

const palmApi = require('../api/fetchPalm');

// showdown converter
const converter = new showdown.Converter();
converter.setFlavor('github');

// getting template
let template;
fs.readFile('./data/template.txt', 'utf8', (err, data) => {
  if (err) {
    console.error(err);
    return;
  }
  template = data;
});

// getting '/'
exports.getApp = (req, res) => {
  res.render('home', {
    pageTitle: 'ReadMeAI – Home'
  });
};

exports.getUpload = (req, res) => {
  res.render('index', {
    pageTitle: 'ReadMeAI – Upload'
  });
};

// controller to generate the README from incoming data
exports.postApp = (req, res) => {
  let html, dt;
  const code = req.file.filename;
  const description = req.body.description;

  try {
    dt = fs.readFileSync(`uploads/${code}`, 'utf8');
  } catch (err) {
    console.error('read error', err);
  }

  palmApi.getData(template, dt, description)
    .then(data => {
      html = converter.makeHtml(data);
      res.render('editor', {
        pageTitle: 'ReadMeAI – Editor',
        html: html,
        md: data
      });
      // deleting files from upload folder
      fs.unlink(`uploads/${code}`, (err) => {
        if (err) {
          console.error(err);
          return;
        }
        console.log('File deleted successfully');
      });
    })
    .catch(err => console.log('error occurred', err));
};

exports.postDownload = (req, res) => {
  const html = req.body.html;
  const md = req.body.markdown;

  const mdFilePath = path.join(__dirname, '../downloads/readme.md');
  const htmlFilePath = path.join(__dirname, '../downloads/readme.html');

  fs.writeFile(mdFilePath, md, (err) => {
    if (err) console.error(err);
    else console.log('Created md file successfully');
  });

  fs.writeFile(htmlFilePath, html, (err) => {
    if (err) console.error(err);
    else console.log('Created html file successfully');
  });

  res.zip([
    { path: mdFilePath, name: 'readme.md' },
    { path: htmlFilePath, name: 'readme.html' }
  ]);
};

The controller functions (getApp, getUpload, postApp, postDownload) handle the incoming requests and interact with the AI integration, markdown generator, and views. After generating the Markdown content, the controllers pass the generated content to the appropriate views.

These controller functions are then exported and used in the routes defined in the routes/app.js file.

Views 

The views are defined in the views/ directory. The editor.ejs file is an Embedded JavaScript (EJS) file that is responsible for rendering the editor view. It is used to generate HTML markup that is sent to the client.

<%- include('includes/head.ejs') %>
<!-- google fonts -->
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Material+Symbols+Outlined:opsz,wght,FILL,GRAD@24,400,0,0" />
<!-- stylesheets -->
<link rel="stylesheet" href="/css/edistyles.css">
<link rel="stylesheet" href="/css/output.css">

</head>
<body>
  <header class="header-nav">
    <h1 class="logo">ReadMeAI</h1>
    <div class="light-container">
      <div class="phone">
        <span class="material-symbols-outlined" id="rotate-item">phone_iphone</span>
      </div>
      <div class="tubelight">
        <div class="bulb"></div>
      </div>
    </div>
  </header>
  <main class="main">
    <div class="mobile-container">
      <p>Sorry, but the editor is disabled on mobile devices; it's best experienced on a PC or tablet</p>
      …..
      <button class="btn-containers" id="recompile">
        <span class="material-symbols-outlined">bolt</span>
      </button>
    </header>
    <textarea name="textarea" id="textarea" class="sub-container output-container container-markdown"><%= md %></textarea>
  </div>
  …..
  <!-- showdown cdn -->
  <script src="https://cdnjs.cloudflare.com/ajax/libs/showdown/2.1.0/showdown.min.js" integrity="sha512-LhccdVNGe2QMEfI3x4DVV3ckMRe36TfydKss6mJpdHjNFiV07dFpS2xzeZedptKZrwxfICJpez09iNioiSZ3hA==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
  <!-- ionicons cdn -->
  <script type="module" src="https://unpkg.com/ionicons@7.1.0/dist/ionicons/ionicons.esm.js"></script>
  <script nomodule src="https://unpkg.com/ionicons@7.1.0/dist/ionicons/ionicons.js"></script>

  <script src="/scripts/edi-script.js"></script>
  <script src="/scripts/tubelightBtn.js"></script>
</body>

Rendering the view

The controllers render the appropriate views with the generated content or serve API responses. The editor.ejs view is rendered with the generated Markdown content (html: html, md: data).

exports.postApp = (req, res) => {
  // ...
  // Generate Markdown content
  // ...

  res.render('editor', {
    pageTitle: 'ReadMeAI – Editor',
    html: html,
    md: data
  });
};

When the postApp function is called, the palmApi.getData function fetches data from the PaLM API based on the template, the uploaded code, and the provided description. Once the data is fetched, the converter.makeHtml function converts the Markdown content to HTML.

The res.render function is then used to render the editor view with the generated HTML content and Markdown content. The editor.ejs view should have the necessary code to display the HTML content and Markdown content in the desired format.

This approach allows for the dynamic generation of README content based on the uploaded code and the provided template. The generated HTML content then gets rendered into the web page for the user to view.
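As a toy illustration of what this Markdown-to-HTML step does (this is not the showdown library, just the simplest possible example of the kind of mapping showdown performs for every Markdown construct):

```shell
# A level-1 Markdown heading becomes an <h1> element.
echo '# ReadMeAI' | sed -E 's|^# (.*)$|<h1>\1</h1>|'
# prints: <h1>ReadMeAI</h1>
```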

Sending the response 

The rendered view is sent as a response to the client by the res.render function, which both renders the view and sends the resulting HTML. This process ensures that the generated Markdown content is dynamically rendered into a web page using the provided template, and the web page is then sent as a response to the client.

Getting started

To get started, ensure that you have installed the latest version of Docker Desktop.

Clone the repository

Open a terminal window and run the following command to clone the sample application:

git clone https://github.com/Gitax18/ReadMeAI

You should now have the following files in your ReadMeAI directory:

ReadMeAI
├── CONTRIBUTING.md
├── Dockerfile
├── LICENSE
├── README.md
├── api
│   └── fetchPalm.js
├── controllers
│   └── app.js
├── data
│   ├── output.md
│   └── template.txt
├── downloads
│   ├── readme.html
│   └── readme.md
├── package-lock.json
├── package.json
├── public
│   ├── css
│   │   ├── edistyles.css
│   │   ├── home.css
│   │   ├── index.css
│   │   └── output.css
│   ├── images
│   │   ├── PaLM_API_Graphics-02.width-1200.format-webp.webp
│   │   ├── logos
│   │   │   ├── dh.png
│   │   │   ├── dp.png
│   │   │   └── gh.png
│   │   ├── pre.png
│   │   └── vscode.jpg
│   └── scripts
│       ├── edi-script.js
│       ├── home.js
│       ├── index.js
│       └── tubelightBtn.js
├── routes
│   └── app.js
├── server.js
├── uploads
│   ├── 1699377702064#Gradient.js
│   └── important.md
└── views
    ├── 404.ejs
    ├── editor.ejs
    ├── home.ejs
    ├── includes
    │   └── head.ejs
    └── index.ejs

14 directories, 35 files

Understanding the project directory structure

Here’s an overview of the project directory structure and the purpose of each folder and file:

api/: Contains code to connect to third-party APIs, such as Google PaLM 2.

controllers/: Includes all the business logic for handling POST/GET requests.

views/: Contains files for rendering on the client side.

data/: Holds the ‘template’ for the output and ‘output.md’ for the generated markdown.

public/: Contains client-side CSS and scripts.

routes/: Manages routes and calls the respective controller functions for each route.

uploads/: Temporarily stores files received from the client side, which are deleted once the session ends.

server.js: The main Express server file, executed when starting the server.

Dockerfile: Contains the script to containerize the project.

Building the app

Run the following command to build the application:

docker build -t readmeai .

Run the app:

docker run -d -p 3333:3333 readmeai

You will see log output similar to the following:

> readme-ai-generator@1.0.0 start
> node server.js

server is listening at http://localhost:3333

Figure 3: Docker dashboard listing the running ReadMeAI container.

Alternatively, you can pull and run the ReadMeAI Docker image directly from Docker Hub (Figure 3) using the following command:

docker run -it -p 3333:3333 gitax18/readmeai

You should be able to access the application at http://localhost:3333 (Figure 4).

Figure 4: The landing page of the ReadMeAI tool.

Select Explore and upload your source code file by selecting Click to upload file (Figure 5).

Figure 5: The Main UI page that allows users to upload their project file.

Once you finish describing your project, select Generate (Figure 6).

Figure 6: Uploading the project file and creating a brief description of the code/project.

ReadMeAI utilizes Google’s Generative Language API to create draft README files based on user-provided templates, code snippets, and descriptions (Figure 7).

Figure 7: Initial output from ReadMeAI. The built-in editor makes minor changes simple.

What’s next?

ReadMeAI was inspired by a common problem faced by developers: the time-consuming and often incomplete task of writing project documentation. ReadMeAI was developed to streamline the process, allowing developers to focus more on coding and less on documentation. The platform transforms code and brief descriptions into comprehensive, visually appealing README files with ease.

We are inspired by the ingenuity of ReadMeAI, particularly in solving a fundamental issue in the developer community. 

Looking ahead, the creators plan to enhance ReadMeAI with features like GitHub integration, custom templates, and improved AI models such as Llama. By adopting newer technologies and architectures, they plan to make ReadMeAI even more powerful and efficient.

Join us on this journey to improve ReadMeAI, making it an indispensable tool for developers worldwide.

Learn more

Subscribe to the Docker Newsletter.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.


Understanding the Docker USER Instruction

In the world of containerization, security and proper user management are crucial aspects that can significantly affect the stability and security of your applications. The USER instruction in a Dockerfile is a fundamental tool that determines which user will execute commands both during the image build process and when running the container. By default, if no USER is specified, Docker will run commands as the root user, which can pose significant security risks. 

In this blog post, we will delve into the best practices and common pitfalls associated with the USER instruction. Additionally, we will provide a hands-on demo to illustrate the importance of these practices. Understanding and correctly implementing the USER instruction is vital for maintaining secure and efficient Docker environments. Let’s explore how to manage user permissions effectively, ensuring that your Docker containers run securely and as intended.

Docker Desktop 

The commands and examples provided are intended for use with Docker Desktop, which includes Docker Engine as an integrated component. Running these commands on Docker Community Edition (standalone Docker Engine) is possible, but your output may not match that shown in this post. The blog post How to Check Your Docker Installation: Docker Desktop vs. Docker Engine explains the differences and how to determine what you are using.

UID/GID: A refresher

Before we discuss best practices, let’s review UID/GID concepts and why they are important when using Docker. This relationship factors heavily into the security aspects of these best practices.

Linux and other Unix-like operating systems identify each discrete user by a numeric identifier called a UID (user ID). Groups are identified by a GID (group ID), another numeric identifier. These numeric identifiers are mapped to the text strings used for the username and groupname, but the numeric identifiers are what the system uses internally.

The operating system uses these identifiers to manage permissions and access to system resources, files, and directories. A file or directory has ownership settings including a UID and a GID, which determine which user and group have access rights to it. Users can be members of multiple groups, which can complicate permissions management but offers flexible access control.
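You can see that ownership is stored numerically, not by name, with a quick experiment (GNU coreutils stat is shown here; the flags differ on macOS):

```shell
# A freshly created file is owned by your numeric UID/GID; the names
# printed by %U/%G are looked up from those numbers.
f=$(mktemp)
stat -c 'owner=%u group=%g' "$f"   # numeric identifiers
stat -c 'owner=%U group=%G' "$f"   # the same identifiers resolved to names
```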

In Docker, these concepts of UID and GID are preserved within containers. When a Docker container is run, it can be configured to run as a specific user with a designated UID and GID. Additionally, when mounting volumes, Docker respects the UID and GID of the files and directories on the host machine, which can affect how files are accessed or modified from within the container. This adherence to Unix-like UID/GID management helps maintain consistent security and access controls across both the host and containerized environments. 

Groups

Unlike USER, there is no GROUP instruction in the Dockerfile syntax. Instead, you specify the group (by groupname or GID) after the user (by username or UID), separated by a colon. For example, to run a command as the automation user in the ci group, you would write USER automation:ci in your Dockerfile.

If you do not specify a GID, the list of groups that the user account is configured as part of is used. However, if you do specify a GID, only that GID will be used. 
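The id command makes this distinction visible: it reports the primary GID and the supplementary group list separately, which mirrors the difference between USER name (all configured groups) and USER name:gid (only that group):

```shell
id -g    # primary GID only
id -G    # primary GID plus all supplementary GIDs
```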

Current user

Because Docker Desktop uses a virtual machine (VM), the UID/GID of your user account on the host (Linux, Mac, Windows HyperV/WSL2) will almost certainly not have a match inside the Docker VM.

You can always check your UID/GID by using the id command. For example, on my desktop, I am UID 503 with a primary GID of 20:

$ id
uid=503(jschmidt) gid=20(staff) groups=20(staff),<--SNIP-->

Best practices

Use a non-root user to limit root access

As noted above, by default Docker containers will run as UID 0, or root. This means that if the Docker container is compromised, the attacker will have host-level root access to all the resources allocated to the container. By using a non-root user, even if the attacker manages to break out of the application running in the container, they will have limited permissions if the container is running as a non-root user. 

Remember, if you don’t set a USER in your Dockerfile, the user will default to root. Always explicitly set a user, even if it’s just to make it clear who the container will run as.

Specify user by UID and GID

Usernames and groupnames can easily be changed, and different Linux distributions can assign different default values to system users and groups. By using a UID/GID you can ensure that the user is consistently identified, even if the container’s /etc/passwd file changes or is different across distributions. For example:

USER 1001:1001

Create a specific user for the application

If your application requires specific permissions, consider creating a dedicated user for your application in the Dockerfile. This can be done using the RUN command to add the user. 

Note that when we are creating a user and then switching to that user within our Dockerfile, we do not need to use the UID/GID because they are being set within the context of the image via the useradd command. Similarly, you can add a user to a group (and create a group if necessary) via the RUN command.

Ensure that the user you set has the necessary privileges to run the commands in the container. For instance, a non-root user might not have the necessary permissions to bind to ports below 1024. For example:

RUN useradd -ms /bin/bash myuser
USER myuser

Switch back to root for privileged operations

If you need to perform privileged operations in the Dockerfile after setting a non-root user, you can switch to the root user and then switch back to the non-root user once those operations are complete. This approach adheres to the principle of least privilege; only tasks that require administrator privileges are run as an administrator. Note that it is not recommended to use sudo for privilege elevation in a Dockerfile. For example:

USER root
RUN apt-get update && apt-get install -y some-package
USER myuser

Combine USER with WORKDIR

As noted above, the UID/GID used within a container applies both inside the container and on the host system. This leads to two common problems:

Switching to a non-root user and not having permissions to read or write to the directories you wish to use (for example, trying to create a directory under / or trying to write in /root).

Mounting a directory from the host system and switching to a user who does not have permission to read/write to the directory or files in the mount.

For example:

USER root
RUN mkdir /app && chown ubuntu:ubuntu /app
USER ubuntu
WORKDIR /app

Example

The following example shows you how the UID and GID behave in different scenarios depending on how you write your Dockerfile. Both examples provide output that shows the UID/GID of the running Docker container. If you are following along, you need to have a running Docker Desktop installation and a basic familiarity with the docker command.

Standard Dockerfile

Most people take this approach when they first begin using Docker; they go with the defaults and do not specify a USER.

# Use the official Ubuntu image as the base
FROM ubuntu:20.04

# Print the UID and GID
CMD sh -c "echo 'Inside Container:' && echo 'User: $(whoami) UID: $(id -u) GID: $(id -g)'"

Dockerfile with USER

This example shows how to create a user with a RUN command inside a Dockerfile and then switch to that USER.

# Use the official Ubuntu image as the base
FROM ubuntu:20.04

# Create a custom user with UID 1234 and GID 1234
RUN groupadd -g 1234 customgroup && \
    useradd -m -u 1234 -g customgroup customuser

# Switch to the custom user
USER customuser

# Set the workdir
WORKDIR /home/customuser

# Print the UID and GID
CMD sh -c "echo 'Inside Container:' && echo 'User: $(whoami) UID: $(id -u) GID: $(id -g)'"

Build the two images with:

$ docker build -t default-user-image -f Dockerfile1 .
$ docker build -t custom-user-image -f Dockerfile2 .

Default Docker image

Let’s run our first image, the one that does not provide a USER instruction. As you can see, the UID and GID are 0/0, so the container is running as the superuser, root. There are two things at work here. First, we are not defining a USER in the Dockerfile, so Docker defaults to the superuser. But how does the container get superuser privileges if my account is not a superuser account? This is because the Docker Engine runs with root permissions, so containers that are built to run as root inherit those permissions from the Docker Engine.

$ docker run --rm default-user-image
Inside Container:
User: root UID: 0 GID: 0

Custom Docker image

Let’s try to fix this — we really don’t want Docker containers running as root. So, in this version, we explicitly set the UID and GID for the user and group. Running this container, we see that our user is set appropriately.

$ docker run --rm custom-user-image
Inside Container:
User: customuser UID: 1234 GID: 1234

Enforcing best practices

Enforcing best practices in any environment can be challenging, and the best practices outlined in this post are no exception. Docker understands that organizations are continually balancing security and compliance against innovation and agility, and it is working on ways to help with that effort. Our Enhanced Container Isolation (ECI) offering, part of Hardened Docker Desktop, was designed to address the problematic aspects of having containers running as root.

Enhanced Container Isolation mechanisms, such as user namespaces, help segregate and manage privileges more effectively. User namespaces isolate security-related identifiers and attributes, such as user IDs and group IDs, so that a root user inside a container does not map to the root user outside the container. This feature significantly reduces the risk of privileged escalations by ensuring that even if an attacker compromises the container, the potential damage and access scope remain confined to the containerized environment, dramatically enhancing overall security.

Additionally, Docker Scout can be leveraged on the user desktop to enforce policies not only around CVEs, but around best practices — for example, by ensuring that images run as a non-root user and contain mandatory LABELs.

Staying secure

Through this demonstration, we’ve seen the practical implications and benefits of configuring Docker containers to run as a non-root user, which is crucial for enhancing security by minimizing potential attack surfaces. As demonstrated, Docker inherently runs containers with root privileges unless specified otherwise. This default behavior can lead to significant security risks, particularly if a container becomes compromised, granting attackers potentially wide-ranging access to the host or Docker Engine.

Use custom user and group IDs

The use of custom user and group IDs showcases a more secure practice. By explicitly setting UID and GID, we limit the permissions and capabilities of the process running inside the Docker container, reducing the risks associated with privileged user access. The UID/GID defined inside the Docker container does not need to correspond to any actual user on the host system, which provides additional isolation and security.

User namespaces

Although this post extensively covers the USER instruction in Docker, another approach to secure Docker environments involves the use of namespaces, particularly user namespaces. User namespaces isolate security-related identifiers and attributes, such as user IDs and group IDs, between the host and the containers. 

With user namespaces enabled, Docker can map the user and group IDs inside a container to non-privileged IDs on the host system. This mapping ensures that even if a container’s processes break out and gain root privileges within the Docker container, they do not have root privileges on the host machine. This additional layer of security helps to prevent the escalation of privileges and mitigate potential damage, making it an essential consideration for those looking to bolster their Docker security framework further. Docker’s ECI offering leverages user namespaces as part of its security framework.
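
On a Linux host running Docker Engine directly, user namespace remapping can be enabled in the daemon configuration. A minimal sketch of /etc/docker/daemon.json follows; the `default` value tells the daemon to create and use the `dockremap` user and the subordinate ID ranges defined in /etc/subuid and /etc/subgid:

```json
{
  "userns-remap": "default"
}
```

The daemon must be restarted for the setting to take effect. Note that Docker Desktop users don't configure this by hand; ECI applies this kind of isolation for you.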

Conclusion

When deploying containers, especially in development environments or on Docker Desktop, consider the aspects of container configuration and isolation outlined in this post. Implementing the enhanced security features available in Docker Business, such as Hardened Docker Desktop with Enhanced Container Isolation, can further mitigate risks and ensure a secure, robust operational environment for your applications.

Learn more

Read the Dockerfile reference guide.

Get the latest release of Docker Desktop.

Explore Docker Guides.

New to Docker? Get started.

Subscribe to the Docker Newsletter.

Source: https://blog.docker.com/feed/

How to Measure DevSecOps Success: Key Metrics Explained

DevSecOps involves the integration of security throughout the entire software development and delivery lifecycle, representing a cultural shift where security is a collective responsibility for everyone building software. By embedding security at every stage, organizations can identify and resolve security issues earlier in the development process rather than during or after deployment.

Organizations adopting DevSecOps often ask, “Are we making progress?” To answer this, it’s crucial to implement metrics that provide clear insights into how an organization’s security posture evolves over time. Such metrics allow teams to track progress, pinpoint areas for improvement, and make informed decisions to drive continuous improvement in their security practices. By measuring the changing patterns in key indicators, organizations can better understand the impact of DevSecOps and make data-driven adjustments to strengthen their security efforts. 

Organizations commonly have many DevSecOps metrics to draw from. In this blog post, we explore two foundational metrics for assessing DevSecOps success. 

Key DevSecOps metrics

1. Number of security vulnerabilities over time

Vulnerability analysis is a foundational practice for any organization embarking on a software security journey. This metric tracks the volume of security vulnerabilities identified in a system or software project over time. It helps organizations spot trends in vulnerability detection and remediation, signaling how promptly security gaps are being remediated or mitigated. It can also be an indicator of the effectiveness of an organization's vulnerability management initiatives and their adoption, both of which are crucial to reducing the risk of cyberattacks and data breaches.

2. Compliance with security policies

Many industries are subject to cybersecurity frameworks and regulations that require organizations to maintain specific security standards. Policies provide a way for organizations to codify the rules for producing and using software artifacts. By tracking policy compliance over time, organizations can verify consistent adherence to established security requirements and best practices, promoting a unified approach to software development.

The above metrics are a good starting point for most organizations looking to measure the impact of their DevSecOps activities. The next step, once these metrics are implemented, is to invest in an observability system that enables relevant stakeholders, such as security engineering, to easily consume the data. 

DevSecOps insights with Docker Scout

Organizations interested in evaluating their container images against these metrics can get started in a few simple steps with Docker Scout. The Docker Scout web interface provides a comprehensive dashboard for CISOs, security teams, and software developers, offering an overview of vulnerability trends and policy compliance status (Figure 1). The web interface is a one-stop shop where users can drill down into specific images for deeper investigations and customize out-of-the-box policies to meet their specific needs.

Figure 1: Docker Scout dashboard.

Furthermore, the Docker Scout metrics exporter is a powerful addition to the Docker Scout ecosystem, bringing vulnerability and policy compliance metrics into existing monitoring systems. This HTTP endpoint enables users to configure Prometheus-compatible tools to scrape Docker Scout data, allowing organizations to integrate with popular observability tools like Grafana and Datadog to achieve centralized security observability. 
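
As a sketch, a Prometheus scrape job for the exporter might look like the following. The organization name and access token are placeholders, and the endpoint path shown is an assumption; take the exact path and authentication details from the Docker Scout metrics exporter documentation:

```yaml
scrape_configs:
  - job_name: docker-scout
    scheme: https
    metrics_path: /v1/exporter/org/<YOUR_ORG>/metrics   # illustrative; verify against the Docker Scout docs
    static_configs:
      - targets: ["api.scout.docker.com"]
    authorization:
      type: Bearer
      credentials: <YOUR_DOCKER_ACCESS_TOKEN>   # token with access to Docker Scout data
```

Once scraped, the same data can feed Grafana or Datadog dashboards like the ones shown below.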

Figures 2 and 3 show two sample Grafana dashboards illustrating the vulnerability trends and policy compliance insights that Docker Scout can provide.

Figure 2: Grafana Dashboard — Policy compliance.

Figure 2 displays a dashboard that illustrates the compliance posture for each policy configured within a Docker Scout organization. This visualization shows the proportion of images in a stream that comply with the defined policies. At the top of the dashboard, you can see the current compliance rate for each policy, while the bottom section shows compliance trends over the past 30 days.

Figure 3 shows a second Grafana dashboard illustrating the number of vulnerabilities by severity over time within a given stream. In this example, you can see notable spikes across all vulnerabilities, indicating the need for deeper investigation and prioritized remediation.

Figure 3: Grafana Dashboard — Vulnerabilities by severity trends.

Conclusion

Docker Scout metrics exporter is designed to help security engineers improve containerized application security posture in an operationally efficient way. To get started, follow the instructions in the documentation; they will get you up and running with the current public release of the metrics exporter. 

Our product team is always open to feedback on social channels such as X and Slack and is looking for ways to evolve the product to align with our customers’ use cases.

Learn more

Visit the Docker Scout product page.

Looking to get up and running? Use our Docker Scout quickstart guide.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Subscribe to the Docker Newsletter.


New Beta Feature: Deep Dive into GitHub Actions Docker Builds with Docker Desktop

We’re excited to announce the beta release of a new feature for inspecting GitHub Actions builds directly in Docker Desktop 4.31. 

Centralized CI/CD environments, such as GitHub Actions, are popular and useful, giving teams a single location to build, test, and verify their deployments. However, remote builds, such as those in GitHub Actions, offer little visibility into what is happening inside your Docker builds, so developers often need additional builds and debugging steps to locate the root cause of issues. This makes diagnosing and resolving build problems challenging. 

To help, we’re introducing enhancements in GitHub Actions Summary and Docker Desktop to provide a deeper understanding of your Docker builds in GitHub Actions.

Get a high-level view with Docker Build Summary in GitHub Actions 

We now provide Docker Build Summary, a GitHub Actions job summary that aggregates and reports build information. The Docker Build Summary offers additional details about your builds in GitHub Actions, including a high-level summary of performance metrics, such as build duration and cache utilization (Figure 1). Users of docker/build-push-action and docker/bake-action will automatically receive Docker Build Summaries. 

Key benefits

Identify build failures: Immediate access to error details eliminates the need to sift through logs.

Performance metrics: See exactly how long the Docker Build stage took and assess if it met expectations.

Cache utilization: View the percentage of the build that used the cache to identify performance impacts.

Configuration details: Access information on build inputs to understand what ran during build time.

Figure 1: Animated view of Docker Build Summary in GitHub Actions, showing Build details, including Build status, error message, metrics, Build inputs, and more.

If further investigation is needed, we package your build results in a .dockerbuild archive file. This file can be imported to the Build View in Docker Desktop, providing comprehensive build details, including timings, dependencies, logs, and traces.

Import and inspect GitHub Actions Builds in Docker Desktop

Initially announced last year, the Build View in Docker Desktop now supports importing the dockerbuild archive from GitHub Actions, providing greater insight into your Docker builds. 

In Docker Desktop, navigate to the Builds View tab and use the new Import Builds button. Select the .dockerbuild file you downloaded to access all the details about your remote build as if you ran it locally (Figure 2). 

Figure 2: Animated view of Docker Desktop, showing steps to navigate to the Builds View tab and use the new Import Builds button.

You can view in-depth information about your build execution, including error lines in your Dockerfile, build timings, cache utilization, and OpenTelemetry traces. This comprehensive view helps diagnose complex builds efficiently.

For example, you can see the stack trace right next to the Dockerfile command that is causing the issues, which is useful for understanding the exact step and attributes that caused the error (Figure 3).

Figure 3: Inspecting a build error in Builds View.

You can even see the commit and source information for the build and easily locate who made the change for more help in resolving the issue, along with other useful info you need for diagnosing even the most complicated builds (Figure 4). 

Figure 4: Animated view of Docker Desktop showing info for inspecting an imported build, such as source details, build timing, dependencies, configuration, and more.

Enhance team collaboration

We aim to enhance team collaboration by letting you share Docker builds and work on them together, optimizing the build experience for your team. These .dockerbuild archives are self-contained and don't expire, making them perfect for collaboration. Share the .dockerbuild file via Slack or email, or attach it to GitHub issues or Jira tickets to preserve context for when your team investigates.

Get started

To start using Docker Build Summary and the .dockerbuild archive in Docker Desktop, update your Docker Build GitHub Actions configuration to use:

uses: docker/build-push-action@v6

uses: docker/bake-action@v5
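
For context, a minimal workflow using the updated build-push-action might look like this sketch (the trigger, image tag, and repository layout are illustrative):

```yaml
name: ci
on:
  push:
    branches: ["main"]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      # v6 generates the Docker Build Summary for the job automatically
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          tags: myorg/myapp:latest
```

No extra configuration is needed for the summary itself; bumping the action version is enough.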

Then, update to Docker Desktop 4.31 to inspect build archives from GitHub Actions. Learn more in the documentation.

We are incredibly excited about these new features, which will help you and your team diagnose and resolve build issues quickly. Please try them out and let us know what you think!

Learn more

Subscribe to the Docker Newsletter.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.
