Docker Hardened Images: crafted by humans, protected by AI

At Docker, we are building our hardened images to exacting standards. That means carefully crafting them by hand, because humans are still the best security architects. That said, we understand the value of AI and deploy it as an extra set of eyes at critical junctures in our Docker Hardened Image (DHI) build process. With this dual approach, our users get better coverage and more secure products. Humans aren’t perfect, but AI can help them become better. Here’s a recent example of how the AI tools we use for DHI caught a bug, made a product more secure, and contributed back to the community.

How the AI guardrail stepped in

Our upstream release automation opened a routine version bump for nginx-exporter. As part of our release process, the DHI AI guardrail fetched the upstream diffs that the bump would pull in and scanned them with language-aware checks. It spotted a classic logic inversion in the exporter’s new proxy-protocol path and immediately blocked our PR from auto-merging. I reproduced the issue, confirmed the diagnosis, and submitted a small fix upstream. Because the guardrail held the line, customers never saw the bug.
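
To make the bug class concrete, here is a hypothetical Python sketch of a logic inversion, the kind of pattern the guardrail flags. It is purely illustrative and is not the actual upstream exporter code (which is written in Go); the function name and header format are invented for the example.

# Hypothetical example of a logic inversion: the success and failure
# branches are swapped, so valid input is rejected and invalid input passes.
def accept_proxy_header(header: str) -> bool:
    is_valid = header.startswith("PROXY ")
    if is_valid:          # BUG: condition is inverted; should be "if not is_valid"
        return False      # valid headers get rejected
    return True           # malformed headers slip through

A language-aware check can spot this shape, a validity flag whose branches reject the valid case, even when a human reviewer skims past it.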

That sequence matters. A normal dependency update would have sailed through and shipped a regression. Instead, Docker’s AI guardrail acted like a bouncer at the door: it looked at what was about to enter, recognized a risk pattern, and stopped it from coming in and causing trouble. A human Docker engineer then read the flagged issue, proved the failure, and sent the fix. The fix was accepted by the project, improving their code base. So this fix not only improved DHI, it also improved the project for everyone.

While the fix pull request was being reviewed and approved by the upstream maintainers, our DHI build pipeline applied the patch and shipped the new, patched version to our customers, making sure they never saw the regression in their environments.

The AI-assisted DHI process

Ironically, standard AI coding assistants didn’t spot the flaw. This is why having our own internal AI guardrails is so critical. They provide the extra layer of support and the specialization that is hard to get from general-purpose coding assistants.

This is how we want AI to show up in our pipeline. It is not a replacement for engineering judgment, but a force multiplier for safety. The guardrail focuses on a narrow set of high-leverage issues that can cause big problems, such as inverted error checks, ignored failures, resource mishandling, and suspicious contributor activity.

The example also shows the value of layered safeguards. We don’t just accept an upstream version and hope for the best. The AI guardrail scrutinizes what changed. Our policy treats high-confidence findings as a hard stop. Humans then verify, reproduce the behavior, and apply the smallest correct patch. Only then does the release move forward. The best security is proactive, not reactive. 

As I alluded to above, there’s an important broader open source benefit to our AI guardrails. DHI depends on hundreds of community projects, some of which are downloaded millions or even billions of times per month. When the DHI AI guardrail surfaces an issue, our default is to fix it upstream rather than carry a private patch. That keeps our images clean, reduces long-term maintenance, and gives every downstream user a better baseline. It also helps the upstream projects, which benefit from our AI anomaly detection, our human judgment, and our subsequent code fixes.

So back to the title. DHI is crafted by engineers, but it is protected by AI. Our guardrail is an active and continuously improving AI that reads upstream diffs with context, recognizes risky patterns, scores confidence, and blocks merges that don’t pass muster. Well-applied AI helps humans work better, with faster reviews, tighter patches, and fewer regressions.

This is the partnership we’re optimizing for. Humans set intent, exercise design judgment, and ship features. AI enforces discipline at critical gates. Best of all, every catch feeds the model signals for future scans, so protections improve as our ecosystem evolves. Collectively, this adds to the security of the entire open source ecosystem. 

It’s a win for Docker, for our customers, and the community.
Source: https://blog.docker.com/feed/

Join Us in Rebooting the Docker Model Runner Community!

We’re thrilled to announce that we’re breathing new life into the Docker Model Runner community, and we want you to be a part of it! Our goal is to make it easier than ever for you to contribute, collaborate, and help shape the future of running AI models with Docker.

From a Limited Beta to a Universe of Possibilities

When we first announced Docker Model Runner, it was in its beta phase, exclusively available on Docker Desktop and limited to Apple and Nvidia hardware. We received a ton of valuable feedback, and we’ve been hard at work making it more accessible and powerful.

Today, we’re proud to say that Docker Model Runner is now Generally Available (GA) and can be used in all versions of Docker! But that’s not all. We’ve added Vulkan support, which means you can now run your models on virtually any GPU. This is a huge leap forward, and it’s all thanks to the incredible potential we see in this project and the community that surrounds it.

Making Contributions a Breeze

We’ve listened to your feedback about the contribution process, and we’ve made some significant changes to make it as smooth as possible.

To start, we’ve consolidated all the repositories into a single, unified home. This makes it much easier to find everything you need in one place.

We have also invested a lot of effort in updating our documentation for contributors. Whether you’re a seasoned open-source veteran or a first-time contributor, you’ll find the information you need to get started.

Your Mission, Should You Choose to Accept It

The success of Docker Model Runner depends on you, our amazing community. We’re calling on you to help us make this project the best it can be. Here’s how you can get involved:

Star our repository: Show your support and help us gain visibility by starring the Docker Model Runner repo.

Fork and Contribute: Have an idea for a new feature or a bug fix? Fork the repository, make your changes, and submit a pull request. We’re excited to see what you come up with!

Spread the word: Tell your friends, colleagues, and anyone else who might be interested in running AI models with Docker.

We’re incredibly excited about this new chapter for Docker Model Runner, and we can’t wait to see what we can build together. Let’s get to work!

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Docker Model Runner with a simple hello GenAI application

Source: https://blog.docker.com/feed/

Build a Multi-Agent System in 5 Minutes with cagent

Models are advancing quickly. GPT-5, Claude Sonnet, Gemini. Each release gives us more capabilities. But most real work isn’t solved by a single model.

Developers are realizing they need a system of agents: different types of agents working together to accomplish more complex tasks. For example, a researcher to find information, a writer to summarize, a planner to coordinate, and a reviewer to check accuracy.

The challenge is that today, building a multi-agent system is harder than it should be. Context doesn’t flow cleanly between agents. Tools require custom integration. Sharing with a teammate means sending instructions and hoping they can re-create your setup.

That’s the problem cagent solves.

In this blog, we’ll walk you through the basics, how to create a multi-agent AI system in minutes, and how cagent makes this possible. 

What’s a multi-agent system?

A multi-agent system is a coordinated group of AI agents that collaborate to complete complex tasks. Using cagent, you can build and run these systems declaratively, no complex wiring or reconfiguration needed.

Meet cagent: The best (and open source) way to build multi-agent systems

Figure 1: cagent workflow for multi-agent orchestration. 

cagent is an open-source tool for building agents and a part of Docker’s growing ecosystem of AI tools. 

Instead of writing glue code to wire up models, tools, and workflows, describe an agent (or a team of agents) in a single YAML file:

Which model the agent uses (OpenAI, Anthropic, Gemini, or a local one)

What its role or instructions are

Which tools it can use (like GitHub, search, or the filesystem)

And, if needed, which sub-agents it delegates to

This turns agents into portable, reproducible artifacts you can run anywhere and share with anyone. 

Multi-agent challenges that cagent is solving

Create, run, and share multi-agent AI systems more easily with cagent.

Orchestrate agents (and sub-agents) more easily – Define roles and delegation (sub-agents). cagent manages calls and context.

Let agents use tools with guardrails – Grant capabilities with MCP: search, GitHub, files, databases. Each agent gets only the tools you list and is auditable.

Use (and swap) models – OpenAI, Anthropic, Gemini, or local models through Docker Model Runner. Swap providers without rewriting your system.

Treat agents like artifacts – Package, version, and share agents like containers.

How to build a multi-agent system with Docker cagent

Here’s what that looks like in practice.

Step 1: Define your multi-agent system

version: "2"

agents:
root:
model: anthropic/claude-sonnet-4-0
instruction: |
Break down a user request.
Ask the researcher to gather facts, then pass them to the writer.
sub_agents: ["researcher", "writer"]

researcher:
model: openai/gpt-5-mini
description: Agent to research and gather information.
instruction: Collect sources and return bullet points with links.
toolsets:
– type: mcp
ref: docker:duckduckgo

writer:
model: dmr/ai/qwen3
description: Agent to summarize notes.

instruction: Write a concise, clear summary from the researcher’s notes.

Step 2: Run the YAML file

cagent run team.yaml

The coordinator delegates, the researcher gathers, and the writer drafts. You now have a functioning team of agents.

Step 3: Share it on Docker Hub

cagent push ./team.yaml org/research-writer

Now, anyone on your team can run the exact same setup with:

cagent run docker.io/org/research-writer

That’s a full multi-agent workflow, built and shared in under 5 minutes.

First principles: Why cagent works

These principles keep cagent an easy-to-use and customizable multi-agent runtime to orchestrate AI agents.

Declarative > imperative. Multi-agent systems are mostly wiring: roles, tools, and topology. YAML keeps that wiring declarative, making it easy to define, read, and review.

Agents as artifacts. Agents become portable artifacts you can pull, pin, and trust.

Small surface area. A thin runtime that does one job well: coordinate agents.

What developers are building with cagent

Developers are already exploring different multi-agent use cases with cagent. Here are some examples:

1. PR and issue triaging

Collector reads PRs/issues, labels, failing checks

Writer drafts comments or changelogs

Coordinator enforces rules, routes edge cases

2. Research summarizing

Researcher finds and cites sources

Writer produces a clean summary

Reviewer checks for hallucinations and tone

3. Knowledge routing

Router classifies requests

KB agent queries internal docs

Redactor strips PII before escalation

Each one starts the same way: a YAML file and an idea. And they can be pushed to a registry and run by anyone.

Get started

cagent gives you the fastest path forward to build multi-agent systems. It’s open-source, easy to use, and built for the way developers already work. Define your agents, run them locally, and share them, all in a few lines of YAML.

YAML in, agents out.

Run the following to get started:

brew install cagent
cagent new
cagent run agent.yaml

Learn more

Get the technical details from our cagent documentation. 

We’d love to hear what you think. Join us in the Docker Community Slack. 

Dive into more topics about AI and Docker. 

Subscribe to the Docker Navigator Newsletter.

Source: https://blog.docker.com/feed/

100% Transparency and Five Pillars

How to Do Hardened Images (and Container Security) Right

Container security is understandably a hot topic these days, with more and more workloads running atop this mainstay of the cloud native landscape. While I might be biased because I work at Docker, it is safe to say that containers are the dominant form factor for running applications today. Equally important, the next generation of applications focused on AI are already running on containers. Because the world runs on containers, getting container security right is of paramount importance.

I am sad to say that most organizations who claim to be delivering container security are not. Particularly troubling are the growing ranks of hardened image providers who claim to be providing highly secure containers but are missing important components of what makes a container secure. Granted, we have a strong opinion on container security. We run the world’s largest repository and infrastructure for container hosting and management. And to be clear, our company’s future fate depends on the continued perception that containers are secure. So we have real skin in this game. 

The Essential Elements of Container Security

All of this being said, as the lead security engineer at Docker, and someone with a very long history with containers, I want to lay down our vision for container security. That vision is actually uncomplicated. There are five essential ingredients of maximum container security and hardened images. Those ingredients are:

Minimal Attack Surface: A proper hardened image only includes absolutely necessary software in the container. This means stripping out the majority of libraries, agents, and modules that may deliver useful functionality but are put into software distributions by default and add both complexity and CVE exposure. Our hardening process on average eliminates over 98% of the CVE exposure of a container. 

A 100% Complete Software Bill of Materials (SBOM): This is the baseline and must be 100% complete (per CISA guidance) with no minimum depth. A complete SBOM provides an accurate inventory including direct dependencies, transitive dependencies, and explicit relationships. SBOMs must be fully verifiable back to source through open standards like SPDX or CycloneDX, standard component identifiers like PURLs, and honest gap disclosure.

Verifiable Build Provenance establishes chain of custody from source code to deployed artifact. SLSA Build Level 3 provenance provides non-falsifiable attestations about what was built, where, and by what process. If you don’t know how or where it was built and who built it, you can’t be sure it’s not tainted.

Standardized Exploitability Assessment clarifies which vulnerabilities affect specific deployment contexts. OpenVEX provides machine-readable statements about vulnerability status, enabling auditors and compliance tools to process assessments independently and properly leverage SBOMs. VEX statement transparency and interoperability make container security viable and allow teams to focus only on real risks.

Cryptographic Verification proves authenticity and integrity. Modern approaches like Sigstore and Cosign enable signing with public verification, allowing anyone to verify signatures without proprietary infrastructure. The signature and provenance chain must be transparent and easy to produce or query.

100% Transparency to Bind These Pillars Together. All of the above five elements must be transparent, not just in what they produce but in how they produce attestations, evidence, and any data or statements. This means using public sources for vulnerability intelligence (National Vulnerability Database or NVD, distribution security feeds, language ecosystem advisories, GitHub Security Advisories) with visible synchronization cadence. When CVEs listed in the KEV (Known Exploited Vulnerabilities) catalog appear, transparency ensures alignment without negotiation. This means making the CVE selection process and criteria public and allowing users to see the process. This means making the SBOM creation process transparent so users can understand how the manifests are built. Ultimately, radical transparency transforms security from a trust exercise into a verification process where you can prove your posture, auditors can validate your evidence, and customers can independently assess your claims.

Of course, container security also extends to the container runtime, which must execute containers to the highest security standards, and to continuous observability and enforcement of organizational policies across the entire container lifecycle. I’ll cover Docker’s activities in this area in a later post.

Why You Need to Verify All Vendor Claims on “Hardened Images”

For enterprises looking to better secure containers, I want to be very, very clear. Any “hardened” container image that cannot meet these requirements is a lie. Unfortunately, a number of hardened image vendors cannot meet these requirements. Here are some of the problems we have seen with competitors’ hardened images that our users and customers have brought us for comparison:

SBOMs that don’t pass the sniff test: A Node server with no npm packages is an oxymoron. Yet, that’s what we saw. Did they rewrite Node.js to remove any need for npm? I don’t think so. This means they left key elements out of their SBOMs.

SBOMs missing transitive dependencies: CISA guidance is clear. Every SBOM must contain 100% of all dependencies. Not including them may be convenient because it hand waves the problem of securing those dependencies. But it’s not right.

Proprietary and opaque CVE designation: A vendor doesn’t get to decide whether a CVE is relevant and what its severity level is. That’s what public, transparent CVE feeds are for. Any vendor that won’t reveal their exact methodology or process for CVE assessment and provide it, on demand, is hiding something.

Incomplete SLSA Build claims: SLSA Build Level 3 is binary. You either meet the requirements or you do not. Calling a build “transitional” is the same as checking the “no” box.

Why We’re Flipping the Table (and Resetting Expectations) on Container Security

It’s not news to say that supply chain attacks on the open source ecosystem are out of control. The smartest Black Hat minds in the world at the most advanced APTs are laser-focused on compromising supply chains because these are among the best ways to compromise entire ecosystems. Supply chain attacks can expose a huge swath of organizations to critical breaches leading to data exfiltration, ransomware and extortion, and espionage. Because we sit at a central position in the container ecosystem, we are also exposed any time the container supply chain is compromised. 

That’s why I’m writing this post. Docker has designed our hardened images explicitly to deliver on all five of the core pillars while also providing 100% transparency into process, inputs and outputs. I want to make it very easy for any platform, team, security team, CISO, or even CEO or business leader to be able to ask the right questions to determine whether their container security posture is valid, and whether the hardened images they are buying are actually delivering on their promise. (As a side note, container security is so important that we also think hardened images should be affordable to all. That’s why we’re now offering them at extremely reasonable prices, making them accessible even to two-person startups.) 

Container security is not hard. Container security is not rocket science. Container security is about radical transparency, honesty, and doing right for your users. In a perfect world, everyone would be doing container security the right way, and every organization would have easy access to rock-solid containers that are properly hardened by default and completely transparent. 

In this perfect world, Docker as a company is better off, the users are better off, the enterprises are better off, and the world is better off. Frankly, our competitors are also better off and their products are better. That’s a good thing. This is more than a sales pitch or an engineering rant. I guess you can call it a mission. Making the technology world safer is of fundamental importance and that’s the outcome we seek.
Source: https://blog.docker.com/feed/

Docker Model Runner on the new NVIDIA DGX Spark: a new paradigm for developing AI locally

We’re thrilled to bring NVIDIA DGX™ Spark support to Docker Model Runner. The new NVIDIA DGX Spark delivers incredible performance, and Docker Model Runner makes it accessible. With Model Runner, you can easily run and iterate on larger models right on your local machine, using the same intuitive Docker experience you already trust.

In this post, we’ll show how DGX Spark and Docker Model Runner work together to make local model development faster and simpler, covering the unboxing experience, how to set up Model Runner, and how to use it in real-world developer workflows.

What is NVIDIA DGX Spark

NVIDIA DGX Spark is the newest member of the DGX family: a compact, workstation-class AI system powered by the Grace Blackwell GB10 Superchip that delivers incredible performance for local model development. Designed for researchers and developers, it makes prototyping, fine-tuning, and serving large models fast and effortless, all without relying on the cloud.

Here at Docker, we were fortunate to get a preproduction version of the DGX Spark. And yes, it’s every bit as impressive in person as it looks in NVIDIA’s launch materials.

Why Run Local AI Models and How Docker Model Runner and NVIDIA DGX Spark Make It Easy 

Many of us at Docker and across the broader developer community are experimenting with local AI models. Running locally has clear advantages:

Data privacy and control: no external API calls; everything stays on your machine

Offline availability: work from anywhere, even when you’re disconnected

Ease of customization: experiment with prompts, adapters, or fine-tuned variants without relying on remote infrastructure

But there are also familiar tradeoffs:

Local GPUs and memory can be limiting for large models

Setting up CUDA, runtimes, and dependencies often eats time

Managing security and isolation for AI workloads can be complex

This is where DGX Spark and Docker Model Runner (DMR) shine. DMR provides an easy and secure way to run AI models in a sandboxed environment, fully integrated with Docker Desktop or Docker Engine. When combined with the DGX Spark’s NVIDIA AI software stack and large 128GB unified memory, you get the best of both worlds: plug-and-play GPU acceleration and Docker-level simplicity.

Unboxing NVIDIA DGX Spark

The device arrived well-packaged, sleek, and surprisingly small, resembling more a mini-workstation than a server.

Setup was refreshingly straightforward: plug in power, network, and peripherals, then boot into NVIDIA DGX OS, which includes NVIDIA drivers, CUDA, and AI software stack pre-installed.

Once on the network, enabling SSH access makes it easy to integrate the Spark into your existing workflow.

This way, the DGX Spark becomes an AI co-processor for your everyday development environment, augmenting, not replacing, your primary machine.

Getting Started with Docker Model Runner on NVIDIA DGX Spark

Installing Docker Model Runner on the DGX Spark is simple and can be done in a matter of minutes.

1. Verify Docker CE is Installed

DGX OS comes with Docker Engine (CE) preinstalled. Confirm you have it:

docker version

If it’s missing or outdated, install it by following the regular Ubuntu installation instructions.

2. Install the Docker Model CLI Plugin

The Model Runner CLI is distributed as a Debian package via Docker’s apt repository. Once the repository is configured (see linked instructions above) install via the following commands:

sudo apt-get update
sudo apt-get install docker-model-plugin

Or use Docker’s handy installation script:

curl -fsSL https://get.docker.com | sudo bash

You can confirm it’s installed with:

docker model version

3. Pull and Run a Model

Now that the plugin is installed, let’s pull a model from the Docker Hub AI Catalog. For example, the Qwen 3 Coder model:

docker model pull ai/qwen3-coder

The Model Runner container will automatically expose an OpenAI-compatible endpoint at:

http://localhost:12434/engines/v1

You can verify it’s live with a quick test:

# Test via API

curl http://localhost:12434/engines/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"ai/qwen3-coder","messages":[{"role":"user","content":"Hello!"}]}'

# Or via CLI
docker model run ai/qwen3-coder

GPUs are allocated to the Model Runner container via nvidia-container-runtime and the Model Runner will take advantage of any available GPUs automatically. To see GPU usage:

nvidia-smi

4. Architecture Overview

Here’s what’s happening under the hood:

[ DGX Spark Hardware (GPU + Grace CPU) ]

             │

     (NVIDIA Container Runtime)

             │

     [ Docker Engine (CE) ]

             │

     [ Docker Model Runner Container ]

             │

     OpenAI-compatible API :12434

The NVIDIA Container Runtime bridges the NVIDIA GB10 Grace Blackwell Superchip drivers and Docker Engine, so containers can access CUDA directly. Docker Model Runner then runs inside its own container, managing the model lifecycle and providing the standard OpenAI API endpoint. (For more info on Model Runner architecture, see this blog).

From a developer’s perspective, you interact with models much like any other Dockerized service — docker model pull, list, inspect, and run all work out of the box.

Using Local Models in Your Daily Workflows

If you’re using a laptop or desktop as your primary machine, the DGX Spark can act as your remote model host. With a few SSH tunnels, you can both access the Model Runner API and monitor GPU utilization via the DGX dashboard, all from your local workstation.

1. Forward the DMR Port (for Model Access)

To access the DGX Spark via SSH, first set up an SSH server:

sudo apt install openssh-server
sudo systemctl enable --now ssh

Run the following command to access Model Runner via your local machine. Replace user with the username you configured when you first booted the DGX Spark and replace dgx-spark.local with the IP address of the DGX Spark on your local network or a hostname configured in /etc/hosts. 

ssh -N -L localhost:12435:localhost:12434 user@dgx-spark.local

This forwards the Model Runner API from the DGX Spark to your local machine. Now, in your IDE, CLI tool, or app that expects an OpenAI-compatible API, just point it to:

http://localhost:12435/engines/v1

Set the model name (e.g. ai/qwen3-coder) and you’re ready to use local inference seamlessly.
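
For example, a minimal Python sketch that sends a chat request through the forwarded endpoint might look like the following. It uses the openai client package, which is an assumption on our part rather than part of the original setup, and the "not-needed" API key is just a placeholder since Model Runner does not require authentication.

from openai import OpenAI

# Point the client at the SSH-forwarded Docker Model Runner endpoint.
client = OpenAI(base_url="http://localhost:12435/engines/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="ai/qwen3-coder",
    messages=[{"role": "user", "content": "Write a Python one-liner that reverses a string."}],
)
print(response.choices[0].message.content)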

2. Forward the DGX Dashboard Port (for Monitoring)

The DGX Spark exposes a lightweight browser dashboard showing real-time GPU, memory, and thermal stats, usually served locally at:

http://localhost:11000

You can forward it through the same SSH session or a separate one:

ssh -N -L localhost:11000:localhost:11000 user@dgx-spark.local

Then open http://localhost:11000 in your browser on your main workstation to monitor the DGX Spark performance while running your models.

This combination makes the DGX Spark feel like a remote, GPU-powered extension of your development environment. Your IDE or tools still live on your laptop, while model execution and resource-heavy workloads happen securely on the Spark.

Example application: Configuring Opencode with Qwen3-Coder

Let’s make this concrete.

Suppose you use OpenCode, an open-source, terminal-based AI coding agent.

Once your DGX Spark is running Docker Model Runner with ai/qwen3-coder pulled and the port is forwarded, you can configure OpenCode by adding the following to ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "dmr": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Docker Model Runner",
      "options": {
        "baseURL": "http://localhost:12435/engines/v1"   // DMR’s OpenAI-compatible base
      },
      "models": {
        "ai/qwen3-coder": { "name": "Qwen3 Coder" }
      }
    }
  },
  "model": "ai/qwen3-coder"
}

Now run opencode and select Qwen3 Coder with the /models command.

That’s it! Completions and chat requests will be routed through Docker Model Runner on your DGX Spark, meaning Qwen3-Coder now powers your agentic development experience locally.

You can verify that the model is running by opening http://localhost:11000 (the DGX dashboard) to watch GPU utilization in real time while coding.

This setup lets you:

Keep your laptop light while leveraging the DGX Spark GPUs

Experiment with custom or fine-tuned models through DMR

Stay fully within your local environment for privacy and cost-control

Summary

Running Docker Model Runner on the NVIDIA DGX Spark makes it remarkably easy to turn powerful local hardware into a seamless extension of your everyday Docker workflow. You install one plugin and use familiar Docker commands (docker model pull, docker model run). You get full GPU acceleration through NVIDIA’s container runtime. You can forward both the model API and monitoring dashboard to your main workstation for effortless development and visibility.

This setup bridges the gap between developer productivity and AI infrastructure, giving you the speed, privacy, and flexibility of local execution with the reliability and simplicity Docker provides. As local model workloads continue to grow, the DGX Spark + Docker Model Runner combo represents a practical, developer-friendly way to bring serious AI compute to your desk — no data center or cloud dependency required.

Learn more:

Read the official announcement of DGX Spark launch on NVIDIA newsroom

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo. Docker Model Runner is open-source, and we welcome collaboration and contributions from the community! Star, fork and contribute.

Source: https://blog.docker.com/feed/

How to Add MCP Servers to Claude Code with Docker MCP Toolkit

AI coding assistants have evolved from simple autocomplete tools into full development partners. Yet even the best of them, like Claude Code, can’t act directly on your environment. Claude Code can suggest a database query, but can’t run it. It can draft a GitHub issue, but can’t create it. It can write a Slack message, but can’t send it. You’re still copying, pasting, and context-switching between tools.

That’s where Model Context Protocol (MCP) and Docker MCP Toolkit come in. MCP connects Claude Code to your real tools, databases, repositories, browsers, and APIs, while Docker MCP Toolkit makes setup effortless and secure. We recently added Claude Code as a client that you can easily enable with one click in Docker Desktop.

In this guide, you’ll learn how to:

Set up Claude Code and connect it to Docker MCP Toolkit.

Configure the Atlassian MCP server for Jira integration.  

Configure the GitHub MCP server to access repository history and run git commands.

Configure the Filesystem MCP server to scan and read your local codebase.

Automate tech debt tracking by converting 15 TODO comments into tracked Jira tickets.

See how Claude Code can query git history, categorize issues, and create tickets — all without leaving your development environment.

With more than 200 pre-built, containerized MCP servers, one-click deployment in Docker Desktop, and automatic credential handling, developers can connect Claude Code to trusted environments in minutes — not hours. No dependency issues, no manual configuration, just a consistent, secure workflow across Mac, Windows, and Linux.

Why Claude Code and Docker MCP Toolkit work better together 

While MCP provides the protocol, Docker MCP Toolkit makes it practical. Without containerization, setting up MCP servers means managing Node.js versions, Python dependencies, credentials in plaintext config files, and different configurations for every developer’s machine. The setup that should take 2 minutes takes 2-6 hours per developer.

Docker MCP Toolkit eliminates this friction:

200+ pre-built MCP servers in the catalog

One-click deployment through Docker Desktop

Secure credential management via OAuth or encrypted storage

Consistent configuration across Mac, Windows, and Linux

Automatic updates when new server versions release

We built Docker MCP Toolkit to meet developers where they are. If you’re using Claude Code, you should be able to connect it to your tools without wrestling with infrastructure.

Setting up Claude Code in Docker MCP Toolkit

Prerequisites

Install Docker Desktop 4.40 or later

Enable MCP Toolkit

Step 1. Install Claude Code

To install Claude Code, run the following command:

# Install Claude Code
curl -fsSL https://claude.ai/install.sh | sh

# Verify installation
claude --version # Should show 2.0.5+

Step 2. Connect Claude Code to Docker MCP Toolkit

Option 1: One-Click Connection (Recommended)

Open Docker Desktop

Navigate to MCP Toolkit in the sidebar

Click the Clients tab

Find “Claude Code” in the list.

Click Connect

Docker Desktop automatically configures the MCP Gateway connection.

Option 2: Manual Command Line Setup

If you prefer a command-line setup or need to configure a specific project:

Navigate to your project folder in the terminal

Run this command:

docker mcp client connect claude-code

You’ll see output like this:

=== Project-wide MCP Configurations (/your/project/path) ===
● claude-code: connected
MCP_DOCKER: Docker MCP Catalog (gateway server) (stdio)
● cursor: no mcp configured
● vscode: no mcp configured
You might have to restart 'claude-code'.

The connected status confirms Claude Code is linked to the Docker MCP Gateway.

What’s happening under the hood?

The connection command creates a .mcp.json file in your project directory:

{
  "mcpServers": {
    "MCP_DOCKER": {
      "command": "docker",
      "args": ["mcp", "gateway", "run"],
      "type": "stdio"
    }
  }
}

This configuration tells Claude Code to use Docker’s MCP Gateway for all MCP server access. The gateway handles routing to your containerized servers.

Step 3. Restart Claude Code

# Exit Claude Code if running, then restart
claude code

Step 4. Verify the Connection

Inside Claude Code, type /mcp to see available MCP servers.

You should see the Docker MCP Gateway listed, which provides access to all enabled MCP servers. The /MCP_DOCKER tools indicate a successful connection. As you enable more MCP servers in Docker Desktop, they’ll appear here automatically.

First Run: What to Expect

When you start Claude Code for the first time after connecting to Docker MCP Toolkit, you’ll see a prompt about the new MCP server:

New MCP server found in .mcp.json: MCP_DOCKER

MCP servers may execute code or access system resources. All tool calls require approval.
Learn more in the MCP documentation (https://docs.claude.com/s/claude-code-mcp).

❯ 1. Use this and all future MCP servers in this project
2. Use this MCP server
3. Continue without using this MCP server

Enter to confirm · Esc to reject

Choose Option 1 (recommended). This configures your project to automatically use Docker MCP Toolkit and any MCP servers you enable in Docker Desktop. You won’t need to approve MCP servers individually each time.

After confirming, you’ll see the Claude Code home screen:

Claude Code v2.0.5

Welcome back!

Sonnet 4.5 · API Usage Billing
/…/your/project/path

Tips for getting started
Run /init to create a CLAUDE.md file with…
Run /terminal-setup to set up terminal in…
Use claude to help with file analysis, ed…
Be as specific as you would with another …

Recent activity
No recent activity

You’re now ready to use Claude Code with MCP servers from Docker Desktop.

Real-World Demo: TODO-to-Ticket Automation

Now that you’ve connected Claude Code to Docker MCP Toolkit, let’s see it in action with a practical example. We’ll automatically convert TODO comments in a real codebase into tracked Jira tickets — complete with git history, priority categorization, and proper linking.

Configuring the required MCP Servers

For this automation, we’ll orchestrate three MCP servers:

Filesystem MCP – to scan your codebase and read source files

GitHub MCP – to run git blame and extract author information

Atlassian (Jira) MCP – to create and manage Jira issues

We’ll walk through enabling and configuring all three MCP servers. 

What makes this realistic?

Uses actual codebase (catalog-service-node) 

Extracts git blame info to identify code authors 

Categorizes by business priority using keyword analysis 

Creates properly formatted Jira issues with context 

Links back to exact file/line numbers for easy navigation

Time investment: 

Manual process: ~20-30 minutes 

Automated with Claude Code + MCP: ~2 minutes total 

Let’s walk through it step-by-step.

1. Configure the Atlassian MCP Server

In Docker Desktop → MCP Toolkit → Catalog:

Search “Atlassian”

Click + Add

Go to Configuration tab

Add your Atlassian credentials:

atlassian.jira.url: https://yourcompany.atlassian.net

atlassian.jira.username: your email

API tokens in the Secrets section

Important notes:

For Atlassian API authentication, the “username” is always your Atlassian account email address, which you use together with the API token for basic authentication

Click Start Server

As shown in the screenshot, the Atlassian MCP provides 37 tools, including:

jira_create_issue – Create Jira issues

jira_add_comment – Add comments

jira_batch_create_issues – Bulk create

And many more Jira operations

For this demonstration, I created a new JIRA project called “TODO Demo” with a project key “TD”.

2. Configure GitHub MCP Server

The GitHub MCP server supports two authentication methods. We recommend OAuth for the easiest setup.

Option A: OAuth Authentication (Recommended – Easiest)

Open Docker Desktop → MCP Toolkit → Catalog

Search for “GitHub”

Find GitHub Official and click + Add

Go to the Configuration tab

Select OAuth as the authentication method

Click the “authorize with the GitHub OAuth provider” link

You’ll be redirected to GitHub to authorize the connection

After authorization, return to Docker Desktop

Click Start Server

Advantage: No manual token creation needed. Authorization happens through GitHub’s secure OAuth flow.

Option B: Personal Access Token (PAT)

If you prefer to use a Personal Access Token or need more granular control:

Step 1: Create a GitHub Personal Access Token

Go to GitHub.com and sign in to your account

Click your profile picture in the top-right corner

Select “Settings”

Scroll down to “Developer settings” in the left sidebar

Click on “Personal access tokens” → “Tokens (classic)”

Click “Generate new token” → “Generate new token (classic)”

Give your token a descriptive name like “Docker MCP GitHub Access”

Select the following scopes (permissions):

repo (Full control of private repositories)

workflow (if you need workflow actions)

read:org (if you need organization access)

Click “Generate token” and copy the token immediately (you won’t see it again!)

Step 2: Configure in Docker Desktop

In Docker Desktop → MCP Toolkit → Catalog

Find GitHub Official and click + Add

Go to the Configuration tab

Select Personal Access Token as the authentication method

Paste your token in the provided field

Click Start Server

Or via CLI:

docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_YOUR_TOKEN_HERE

This gives users the flexibility to choose the method that works best for their workflow, with OAuth being the simpler path for most users.

3. Configure Filesystem MCP Server

The Filesystem MCP server allows Claude Code to read files from your local system. You need to specify which directories it can access.

Step 1: Enable Filesystem MCP Server

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Filesystem”

Find Filesystem (Reference) and click + Add

Step 2: Configure Allowed Paths

Go to the Configuration tab

Under filesystem.paths, add the directories Claude Code should access

For this demo, add your project directory (e.g., /Users/your_username/ or your project path)

You can add multiple paths by clicking the + button

Click Save

Click Start Server

Important: Only grant access to directories you’re comfortable with Claude Code reading. The Filesystem MCP server is scoped to these specific paths for security.

Run the Automation

Clone the repository

git clone https://github.com/ajeetraina/catalog-service-node

Change directory to catalog-service-node and run the following command:

claude code

Paste this instruction into Claude Code:

Scan this codebase for all TODO and FIXME comments.
For each one:
1. Extract the comment and surrounding code context (5 lines before/after)
2. Use git blame to identify who wrote it and when
3. Determine priority based on keywords:
– High: "race condition", "data loss", "security", "failure", "crash"
– Medium: "performance", "slow", "cache", "optimization"
– Low: "documentation", "refactor", "cleanup"
4. Create a Jira issue with:
– Project: TD
– Issue Type: Task
– Summary: Extract from the TODO/FIXME comment
– Description: Include code context and explanation
– Priority: Based on categorization above (use additional_fields: {"priority": {"name": "High"}})
– Labels: ["tech-debt"]
– Add a comment with link to exact GitHub file and line number

Provide a summary showing:
– Total TODOs/FIXMEs found
– Breakdown by priority (High/Medium/Low)
– List of created Jira issues with links

Watch Claude Code Work

Claude Code will orchestrate multiple MCP servers to complete this task:

Filesystem MCP to scan your src/ directory

GitHub MCP to run git blame and get author info

Atlassian MCP to create Jira issues in your TODO Demo project

Here’s what the output looks like:

I'll scan the codebase for TODO and FIXME comments, analyze them, and create Jira issues for tracking.

⏺ Search(pattern: "(TODO|FIXME|XXX|HACK):", output_mode: "content")
⎿ Found 22 lines (ctrl+o to expand)

⏺ Now let me read each file to get the full context around the TODO/FIXME comments (excluding git hooks):

⏺ Read(src/services/StorageService.js)
⎿ Read 70 lines (ctrl+o to expand)

⏺ Read(src/services/InventoryService.js)
⎿ Read 55 lines (ctrl+o to expand)

⏺ Read(src/services/ProductService.js)
⎿ Read 117 lines (ctrl+o to expand)

⏺ Read(src/services/PublisherService.js)
⎿ Read 67 lines (ctrl+o to expand)

⏺ Read(src/index.js)
⎿ Read 100 lines (ctrl+o to expand)

Next, Claude Code runs git blame to identify who created each TODO:

⏺ Now let me run git blame on each file to identify authors and dates:

⏺ Bash(git blame -L 1,15 src/services/StorageService.js)
⎿ 4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 1) // FIXME: No error handling for S3 upload failures
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 2) // Failed uploads leave products without images
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 3) // Need to rollback product creation or retry upload
… +12 lines (ctrl+o to expand)

⏺ Bash(git blame -L 1,15 src/services/InventoryService.js)
⎿ 4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 1) // FIXME: No retry logic for inventory service failures
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 2) // When WireMock/external service is down, requests fail immediately
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 3) // Need exponential backoff retry mechanism
… +12 lines (ctrl+o to expand)

⏺ Bash(git blame -L 1,13 src/services/ProductService.js)
⎿ 4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 1)
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 2) // TODO: Add pagination support for product listing
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 3) // Currently returns all products – will cause performance issues with large
… +11 lines (ctrl+o to expand)

⏺ Bash(git blame -L 1,16 src/services/PublisherService.js)
⎿ 4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 1) // FIXME: Kafka publish failures are silently ignored
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 2) // If Kafka is unavailable, product updates aren't tracked
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 3) // Need dead letter queue or retry mechanism
… +13 lines (ctrl+o to expand)

⏺ Bash(git blame -L 1,12 src/index.js)
⎿ 4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 1) // TODO: Add request rate limiting
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 2) // API has no throttling, vulnerable to abuse
4bdb4454 (Ajeet Raina 2025-10-03 20:08:01 +0530 3) // Need per-IP rate limiting
… +9 lines (ctrl+o to expand)

⏺ Now let me categorize by priority and create Jira issues. First, let me get the GitHub remote URL to include in the Jira issues:

⏺ Bash(git remote get-url origin)
⎿ https://github.com/ajeetraina/catalog-service-node

Claude Code then categorizes issues by priority and creates Jira tickets:

Now I'll create Jira issues for all the TODO/FIXME comments. Based on my analysis:

HIGH Priority (contains keywords: race condition, data loss, security, failure, crash):
1. Inventory sync race condition
2. S3 upload failures (data loss potential)
3. Kafka publish failures (data loss)
4. Kafka acknowledgment handling (data loss)
5. Error responses expose internal details (security)
6. Inventory service failures

MEDIUM Priority (performance, slow, cache, optimization):
1. Cache inventory data (performance)
2. Add pagination support (performance)
3. Image resizing/optimization (performance)
4. Batch publishing (performance)

LOW Priority (documentation, refactor, cleanup):
1. LocalStack configuration not documented
2. Product deletion doesn't clean up S3 images (cleanup)
3. No validation for duplicate SKUs
4. Request rate limiting
5. Health check endpoint
6. Event schema versioning
7. Multiple images per product
8. Search functionality
9. Inventory low-stock alerts

In under 2 minutes, Claude Code has:

Scanned the entire codebase

Identified 15+ TODO/FIXME comments

Retrieved git history for each one

Categorized them by business priority

Created properly formatted Jira issues with full context

Linked each issue back to the exact file and line number

What would have taken 20-30 minutes of manual work is now automated and consistent. This Jira automation is just one example. Here’s how MCP transforms other common development workflows:

Workflow Comparison: Before vs. After MCP

Task: Debug Checkout Failures

Before MCP (Manual):

1. Ask Claude for SQL query
2. Copy query to database client
3. Run query and copy results
4. Paste results back to Claude
5. Get analysis
6. Ask Claude to draft GitHub issue
7. Manually create issue in GitHub
8. Notify team in Slack

After MCP (Automated):

You: “Why are checkouts failing? Investigate and create a GitHub issue.”

Claude Code: Queries production database, finds 23% payment timeouts, identifies root cause as connection pool exhaustion, creates GitHub issue #1847, posts to #backend-alerts on Slack. Result: Done. Issue link provided.

Time saved: ~15 min → ~2 min

Task: Investigate Performance Issue

Before MCP (Manual):

1. Check multiple monitoring dashboards
2. Export slow query logs
3. Analyze locally
4. Document findings in Google Doc
5. Create Jira ticket manually
6. Add links between doc and ticket
7. Notify team

After MCP (Automated):

You: “Why are API response times spiking?”

Claude Code: Queries the slow_queries log, finds 127 queries taking >2s, identifies a missing index on users.email, creates a migration file, opens a PR with the fix, and posts a summary to #backend.

Time saved: ~20 min → ~3 min

Task: Code Review for Security

Before MCP (Manual):

1. Review PR manually
2. Run security scanner separately
3. Document findings in notepad
4. Post review comments one by one
5. Create tracking ticket for issues
6. Update security dashboard

After MCP (Automated):

You: “Review PR #234 for security issues”

Claude Code: Analyzes 12 changed files, finds a hardcoded API key (config.js:47) and a SQL injection risk in query building, posts inline review comments with fixes, creates security ticket SEC-445, and updates the security tracking board.

Time saved: ~25 min → ~4 min

From 15 minutes of context switching to 2-3 minutes of continuous flow.

Conclusion

You’ve just seen how Docker MCP Toolkit transforms Claude Code from an isolated coding assistant into an integrated development partner. What used to require copying between terminals, web browsers, and documentation now happens in one continuous flow. 

Next steps:

Explore the 220+ MCP servers in the Docker MCP catalog

Connect Claude Code to your databases, APIs, and tools

Share your setup with your team for consistent workflows

The future of development isn’t about switching between tools — it’s about tools that work together seamlessly. Docker MCP Toolkit makes that future available today.

Learn more

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers

Download Docker Desktop to get started with the MCP Toolkit: Run MCP servers easily and securely

Check out our MCP Horror Stories series to see common MCP security pitfalls and how you can avoid them

Source: https://blog.docker.com/feed/

LoRA Explained: Faster, More Efficient Fine-Tuning with Docker

Fine-tuning a language model doesn’t have to be daunting. In our previous post on fine-tuning models with Docker Offload and Unsloth, we walked through how to train small, local models efficiently using Docker’s familiar workflows. This time, we’re narrowing the focus.

Instead of asking a model to be good at everything, we can specialize it: teaching it a narrow but valuable skill, like consistently masking personally identifiable information (PII) in text. Thanks to techniques like LoRA (Low-Rank Adaptation), this process is not only feasible on modest resources, it’s fast and efficient.

Even better, with Docker’s ecosystem the entire fine-tuning pipeline (training, packaging, and sharing) becomes approachable. You don’t need a bespoke ML setup or a research lab workstation. You can iterate quickly, keep your workflow portable, and publish results for others to try with the same Docker commands you already know.

In this post, I’ll walk through a hands-on fine-tuning experiment: adapting the Gemma 3 270M model into a compact assistant capable of reliably masking PII.

What’s Low-Rank Adaptation (LoRA)?

Fine-tuning starts with a pre-trained model, one that has already learned the general structure and patterns of language.

Instead of updating all of its weights through full fine-tuning (which would consume massive amounts of compute and risk catastrophic forgetting, where the model loses its prior knowledge), we can use a more efficient method called LoRA (Low-Rank Adaptation).

LoRA allows us to teach the model new tasks or behaviors without overwriting what it already knows, by adding small, trainable adapter layers while keeping the base model frozen.

How does LoRA work?

At a high level, LoRA works like this:

Freeze the base model: The model’s original weights (its core knowledge of language) remain unchanged.

Add adapter layers: Small, trainable “side modules” are inserted into specific parts of the model. These adapters learn only the new behavior or skill you want to teach.

Train efficiently: During fine-tuning, only the adapter parameters are updated. The rest of the model stays static, which dramatically reduces compute and memory requirements.
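
To see why this is cheap, here is a small illustrative sketch (the layer dimensions are made up for the example, not taken from Gemma 3) comparing the parameters a full weight update would touch against a rank-16 LoRA adapter on a single weight matrix:

import numpy as np

d_out, d_in, r = 640, 640, 16        # example layer size and LoRA rank (illustrative)
W = np.zeros((d_out, d_in))          # frozen base weight: never updated
A = np.random.randn(r, d_in) * 0.01  # trainable adapter, shape (r, d_in)
B = np.zeros((d_out, r))             # trainable adapter, shape (d_out, r), starts at zero

# The effective weight used at inference time is the base plus the low-rank update.
W_effective = W + B @ A

print(f"full fine-tuning would update {W.size:,} parameters")    # 409,600
print(f"LoRA updates only {A.size + B.size:,} parameters (~5%)")  # 20,480

Only A and B receive gradients during training, which is what keeps compute and memory requirements low.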

LoRA experiment: Fine-tune Gemma 3 270M to mask PII

For this experiment, the model already knows how to read, write, and follow instructions. Our job is simply to teach it the specific pattern we care about, for example:

“Given some text, replace PII with standardized placeholders while leaving everything else untouched.”

The fine-tuning process consists of four steps:

Prepare the dataset

Prepare LoRA adapter

Train the model

Export the resulting model

Figure 1: Four steps of fine-tuning with LoRA

In this example, we use Supervised Fine-Tuning (SFT): each training example pairs raw text containing PII with its correctly redacted version. Over many such examples, the model internalizes the pattern and learns to generalize the redaction rules.

The quality of the dataset is critical, the cleaner and more representative your dataset, the better your fine-tuned model will perform.

Before we dive into the steps, it’s crucial to understand Chat Templates.

Understanding Chat Templates

When you send a request like below to Gemma 3 270M, the model doesn’t see this JSON structure directly.

"messages": [
{
"role": "user",
"content": "Mask all PII in the following text. Replace each entity with the exact UPPERCASE label in square brackets (e.g., [PERSON], [EMAIL], [PHONE], [USERNAME], [ADDRESS], [CREDIT_CARD], [TIME], etc.). Preserve all non-PII text, whitespace, ' ' and punctuation exactly. Return ONLY the redacted text. Text: This is an example of text that contains some data. The author of this text is Ignacio López Luna, but everybody calls him Ignasi. His ID number is 123456789. He has a son named Arnau López, who was born on 21-07-2021"
}
]

Instead, the input is transformed into a chat-formatted prompt with special tokens:

<start_of_turn>user Mask all PII in the following text. Replace each entity with the exact UPPERCASE label in square brackets (e.g., [PERSON], [EMAIL], [PHONE], [USERNAME], [ADDRESS], [CREDIT_CARD], [TIME], etc.). Preserve all non-PII text, whitespace, ' ' and punctuation exactly. Return ONLY the redacted text. Text: This is an example of text that contains some data. The author of this text is Ignacio López Luna, but everybody calls him Ignasi. His ID number is 123456789. He has a son named Arnau López, who was born on 21-07-2021<end_of_turn>

Notice how the message has been rewrapped and extra tokens like <start_of_turn> and <end_of_turn> have been inserted. These tokens are part of the model’s chat template, the standardized structure it expects at inference time.

Different models use different templates. For example, Gemma uses <start_of_turn> markers, while other models might rely on <bos> or others.

This is exactly why the first step is “Prepare the dataset.” When fine-tuning, you must format your training data with the same chat template that the model will use during inference. This alignment ensures the fine-tuned model is robust, because it has been trained on data that looks exactly like what it will encounter in production.

Prepare the dataset: Teaching through examples

The dataset is the bridge between general-purpose language ability and task-specific expertise. Each example is a demonstration of what we want the model to do: a prompt with raw text containing PII, and a response showing the redacted version.

In the script, this is how the original dataset is formatted using the model's chat template (see the apply_chat_template call):

# Imports as used in the full training script
import json
from datasets import Dataset
from unsloth import FastModel

max_seq_length = 2048
model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=max_seq_length,
    load_in_4bit=False,
    load_in_8bit=False,
    full_finetuning=False,
)

with open("pii_redaction_train.json", "r", encoding="utf-8") as f:
    data = json.load(f)

ds = Dataset.from_list(data)

def to_text(ex):
    resp = ex["response"]
    if not isinstance(resp, str):
        resp = json.dumps(resp, ensure_ascii=False)
    msgs = [
        {"role": "user", "content": ex["prompt"]},
        {"role": "assistant", "content": resp},
    ]
    return {
        "text": tokenizer.apply_chat_template(
            msgs, tokenize=False, add_generation_prompt=False
        )
    }

dataset = ds.map(to_text, remove_columns=ds.column_names)

You can print a few of the formatted pairs to see what they look like:

for i in range(3):
    print(dataset[i]["text"])
    print("=" * 80)

An example of a dataset entry:

<bos><start_of_turn>user
Mask all PII in the following text. Replace each entity with the exact UPPERCASE label in square brackets (e.g., [PERSON], [EMAIL], [PHONE], [USERNAME], [ADDRESS], [CREDIT_CARD], [TIME], etc.). Preserve all non-PII text, whitespace, and punctuation exactly. Return ONLY the redacted text.

Text:
<p>My child faozzsd379223 (DOB: May/58) will undergo treatment with Dr. faozzsd379223, office at Hill Road. Our ZIP code is 28170-6392. Consult policy M.UE.227995. Contact number: 0070.606.322.6244. Handle transactions with 6225427220412963. Queries? Email: faozzsd379223@outlook.com.</p><end_of_turn>
<start_of_turn>model
<p>My child [USERNAME_2] (DOB: [DATEOFBIRTH_1]) will undergo treatment with Dr. [USERNAME_1], office at [STREET_1]. Our ZIP code is [ZIPCODE_1]. Consult policy M.UE.227995. Contact number: [TELEPHONENUM_1]. Handle transactions with [CREDITCARDNUMBER_1]. Queries? Email: [EMAIL_1].</p><end_of_turn>

Prepare LoRA adapter: Standing on the shoulders of a base model

Instead of starting from a blank slate, we begin with Gemma-3 270M-IT, a small but capable instruction-tuned model. By loading both the weights and the tokenizer, we get not just a model that understands text, but also the exact rules it uses to split and reconstruct sentences.

Fine-tuning isn’t reinventing language, it’s layering task-specific expertise on top of a foundation that already knows how to read and write.

For that, we’ll use the LoRA technique. 

Why we use LoRA

Training a large language model from scratch, or even fully fine-tuning one, is extremely costly because it means adjusting billions of parameters.

The good news is that you usually don't need to change every weight to teach the model a new skill.

That’s where LoRA comes in. Instead of re-training the entire model, LoRA adds a few small, extra components, like “add-ons.” When we fine-tune the model, we only adjust these add-ons, while the main model stays the same.

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                                # rank of the low-rank update matrices
    lora_alpha=32,                       # scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"], # attach adapters to the attention projections
    lora_dropout=0.05,
)

# Wrap the model loaded earlier; only the adapter weights remain trainable.
model = get_peft_model(model, lora_config)

These few lines tell the model: keep your parameters frozen, but learn through a small set of low-rank adapters. That’s why fine-tuning is efficient and affordable.
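To see just how small that trainable set is, you can count the parameters that actually receive gradients. This is a minimal sketch that only assumes the PEFT-wrapped model variable from the snippet above:

# Compare the number of trainable (LoRA) parameters with the frozen base weights.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} of {total:,} parameters ({100 * trainable / total:.2f}%)")

Only a tiny fraction of the weights is updated, which is exactly why LoRA fine-tuning fits on modest hardware.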

Train the model: Fine-tuning in practice

With the dataset ready and LoRA adapters in place, the actual training looks like classic supervised learning.

Feed in the input (a user prompt).

Compare the model’s output with the expected response.

Adjust the adapter weights to minimize the difference.

from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    eval_dataset = None, # Can set up evaluation!
    args = SFTConfig(
        dataset_text_field = "text",
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 1, # Use GA to mimic batch size!
        warmup_steps = 5,
        num_train_epochs = 1, # Set this for 1 full training run.
        # max_steps = 100,
        learning_rate = 5e-5, # Reduce to 2e-5 for long training runs
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none", # Use this for WandB etc
    ),
)

trainer_stats = trainer.train()

Over many iterations, the model internalizes the rules of PII masking, learning not only to replace emails with [EMAIL] but also to preserve punctuation, whitespace, and all non-PII content exactly as instructed.

What’s important here is that fine-tuning doesn’t overwrite the model’s general capabilities. The model still knows how to generate coherent text; we’re just biasing it toward one more skill.

Export the resulting model: Merging weights

Once training finishes, we have a base model plus a set of LoRA adapters. That’s useful for experimentation, but for deployment we often prefer a single consolidated model.

By merging the adapters back into the base weights, we produce a standalone checkpoint that behaves just like the original model, except it now has PII masking expertise built in.

model.save_pretrained_merged("result", tokenizer, save_method = "merged_16bit")
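As a quick sanity check (a sketch that is not part of the original script; the sample text is invented), the merged checkpoint in the result folder can be loaded with plain transformers and asked to redact something:

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("result")
merged = AutoModelForCausalLM.from_pretrained("result")

messages = [{
    "role": "user",
    "content": "Mask all PII in the following text. Return ONLY the redacted text. "
               "Text: You can reach Jane Doe at jane.doe@example.com.",
}]

# Apply the same chat template used during training, then generate.
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = merged.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens, i.e. the redacted text.
print(tok.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))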

Try and share your model

After fine-tuning, the next natural step is to try your model in action and, if it works well, share it with others. With Docker Model Runner, you can package your fine-tuned model, push it to Docker Hub, and make it instantly runnable anywhere. No messy setup, no GPU-specific headaches, just a familiar Docker workflow for distributing and testing AI models.

So once your adapters are trained and merged, don’t stop there: run the model, publish it, and let others try it too. In the previous post, I showed how easy it is to do that step-by-step.

Fine-tuning makes your model specialized, but Docker makes it accessible and shareable. Together, they turn small local models from curiosities into practical tools ready to be used, and reused, by the community.

We’re building this together!

Docker Model Runner is a community-friendly project at its core, and its future is shaped by contributors like you. If you find this tool useful, please head over to our GitHub repository. Show your support by giving us a star, fork the project to experiment with your own ideas, and contribute. Whether it’s improving documentation, fixing a bug, or adding a new feature, every contribution helps. Let’s build the future of model deployment together!

Learn more

Learn how to fine-tune local models with Docker Offload and Unsloth

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Model Runner with a simple hello GenAI application


Unlocking Local AI on Any GPU: Docker Model Runner Now with Vulkan Support

Running large language models (LLMs) on your local machine is one of the most exciting frontiers in AI development. At Docker, our goal is to make this process as simple and accessible as possible. That’s why we built Docker Model Runner, a tool to help you download and run LLMs with a single command.

Until now, inferencing with Model Runner was limited to CPU, NVIDIA GPUs (via CUDA), and Apple Silicon (via Metal). Today, we’re thrilled to announce a major step forward in democratizing local AI: Docker Model Runner now supports Vulkan!

This means you can now leverage hardware acceleration for LLM inferencing on a much wider range of GPUs, including integrated GPUs and those from AMD, Intel, and other vendors that support the Vulkan API.

Why Vulkan Matters: AI for Everyone’s GPU

So, what’s the big deal about Vulkan?

Vulkan is a modern, cross-platform graphics and compute API. Unlike CUDA, which is specific to NVIDIA GPUs, or Metal, which is for Apple hardware, Vulkan is an open standard that works across a huge range of graphics cards. This means if you have a modern GPU from AMD, Intel, or even an integrated GPU on your laptop, you can now get a massive performance boost for your local AI workloads.

By integrating Vulkan (thanks to our underlying llama.cpp engine), we’re unlocking GPU-accelerated inferencing for a much broader community of developers and enthusiasts. More hardware, more speed, more fun!

Getting Started: It Just Works

The best part? You don’t need to do anything special to enable it. We believe in convention over configuration. Docker Model Runner automatically detects compatible Vulkan hardware and uses it for inferencing. If a Vulkan-compatible GPU isn’t found, it seamlessly falls back to CPU.

Ready to give it a try? Just run the following command in your terminal:

docker model run ai/gemma3

This command will:

Pull the Gemma 3 model.

Detect if you have a Vulkan-compatible GPU with the necessary drivers installed.

Run the model, using your GPU to accelerate the process.

It’s that simple. You can now chat with a powerful LLM running directly on your own machine, faster than ever.

Join Us and Help Shape the Future of Local AI!

Docker Model Runner is an open-source project, and we’re building it in the open with our community. Your contributions are vital as we expand hardware support and add new features. Head over to our GitHub repository to get involved: https://github.com/docker/model-runner. Please star the repo to show your support, fork it to experiment, and consider contributing back with your own improvements.

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Model Runner with a simple hello GenAI application


From the Captain’s Chair: Pradumna Saraf

Docker Captains are leaders from the developer community who are both experts in their field and passionate about sharing their Docker knowledge with others. “From the Captain’s Chair” is a blog series where we get a closer look at one Captain to learn more about them and their experiences.

Today, we are interviewing Pradumna Saraf. He is an Open Source Developer with a passion for DevOps. He is also a Golang developer and loves educating people through social media and blogs about various DevOps tools like Docker, GitHub Actions, Kubernetes, etc. He has been a Docker Captain since 2024.

Can you share how you first got involved with Docker?

If I remember correctly, I was learning about databases, more specifically, MongoDB. Until that time, I had no idea there was something called Docker. I was trying to find a way to get the database up and running locally, and then I came to know from a YouTube video about how Docker is the most common and efficient way for running these kinds of applications locally, and then I skipped learning about databases and dived deep into learning Docker.

What inspired you to become a Docker Captain?

The community. Docker has always been working towards making the developer life easier and listening to the community and users, whether it’s an open source offering or an enterprise, and I wanted to be part of this community. Before even joining the Captains program, I was advocating for Docker by sharing my learning via social media, blogs, etc, and educating people because I was passionate and really loved the potential of Docker. Becoming a Captain felt natural, as I was already doing the stuff, so it was great to get the recognition.

What are some of your personal goals for the next year?

Writing more technical content, of course! Also, giving more in-person talks at international conferences. I also want to get back to contributing and helping open source projects grow.

If you weren’t working in tech, what would you be doing instead?

That’s an interesting question. I love tech. It’s hard to imagine my life without tech because getting into it was not a decision; it was a passion that was inside of me before I could spell technology. But still, if I were not in tech, I might be a Badminton or a Golf player.

Can you share a memorable story from collaborating with the Docker community?

Yes, there was a meetup in Docker Bangalore, India, where Ajeet (DevRel at Docker), a good friend of mine, and I collaborated, and he invited me to deliver a talk on Docker extensions. It was really nice meeting the community, having conversations over pizza about how various people and companies are using Docker in their workflow and bottlenecks.

What’s your favorite Docker product or feature right now, and why?

I am really biased towards Docker Compose. My favourite feature right now is being able to define models in a Docker Compose YAML file and start/stop an AI model with the same Docker Compose commands. Apart from that, I really like the standalone Docker Model Runner (DMR).

Can you walk us through a tricky technical challenge you solved recently?

I was working on an authorization project, where I was verifying users with the right set of permissions and letting them access the resource, and interestingly, Docker had a key role in that project. The role of Docker was a Policy Decision Point (PDP), which was running inside a container and listening to external requests, and was responsible for validating if the entity/user/request is authorized to access the particular resource with the right permissions. This was a particularly unique application of Docker, where I used it as a decision point. Docker made it easy to run, keeping it separate from the main app and making it scalable with almost zero downtime. It showed Docker can also be used for important services like authorization.

What’s one Docker tip you wish every developer knew?

Using multi-stage builds. It helps keep your images small, clean, secure, and production-ready. It’s such a simple thing, but it can make a huge difference. I have seen an image go from 1.7 GB to under 100 MB. Bonus: It will also make your pull and push faster, saving CI cost and making your overall deployment faster.

If you could containerize any non-technical object in real life, what would it be and why?

My age. I’d containerize age so I could choose how old I want to be. If I want to feel young, I will run Docker with an image with the age version of 20, and if I want to think more mature, I will run Docker with an image with the age version of 40.

Where can people find you online? (talks, blog posts, or open source projects, etc.)

People can find me on social media platforms like Twitter (X), LinkedIn, BlueSky, Threads, etc. For my open source work, people can find me on GitHub. I have many Docker-related projects. Apart from that, if people are more into blogs and conferences, they can find me on my blog and Sessionize profile. Or just Google “Pradumna Saraf”.

Rapid Fire Questions

Cats or Dogs?

Cats

Morning person or night owl?

Night Owl

Favorite comfort food?

Dosa

One word friends would use to describe you?

Helpful

A hobby you picked up recently?

Learning more about aircraft and the aviation industry.

Powered by Docker: How Open Source Genius Cut Entropy Debt with Docker MCP Toolkit and Claude Desktop

This is part of the Powered by Docker series, where we feature use cases and success stories from Docker partners and practitioners. This story was contributed by Ryan Wanner. Ryan has more than fifteen years of experience as an entrepreneur, three years in the AI space developing software, and is the founder of Open Source Genius.

Open Source Genius is a start-up that helps organizations navigate an AI-powered future by building practical, human-centered AI systems. In early 2025, OSG had a good problem: demand. With multiple ventures ramping up, they were shipping more AI solutions and generating more documentation, processes, and assets to manage. But as decks, PDFs, and SOPs piled up, so did the time-sucking chaos: files were misnamed, folders got messy, and teams were constantly hunting for the “real” version of a doc in Slack threads.

OSG called it entropy debt. Instead of hiring ops or layering on another SaaS tool, they built something smarter: an AI-powered file automation agent using Docker’s MCP Toolkit and Claude Desktop. It saves the team two hours per person every week! 

Here’s how they did it and what they gained.

The Challenge

OSG’s mission is ambitious – to create over $100M in small business value by 2027 through modular solutions deployed across multiple ventures. 

With active projects at Hideout Golf Club, Music City Energy Spa, MHCP USA, and more, the team was generating critical documents constantly: mission/vision frameworks, brand guardrails, operational SOPs (like their comprehensive 4-T Resource Manager tracking Talent, Tech, Tangible Assets, and Tides), business one-pagers, pitch decks, and reference materials. Each venture needed its own organized knowledge base, but the manual overhead was becoming a bottleneck.

This created what the team called entropy debt—time lost to: 

Renaming files to match ever‑shifting conventions 

Drag‑and‑dropping assets into ad‑hoc folders 

Ping‑ponging links in Slack to find “the latest” version

OSG’s team was drowning in files. With 15-30 documents landing in downloads folders daily across multiple business ventures and departments, the chaos was constant. Updated pitch decks, revised SOPs, new brand guidelines—each one needed attention, but nobody had time to properly organize them.

The team faced three equally frustrating options:

Take time to manually rename and organize—pulling people away from higher-value work

Delegate the task to someone else—adding to someone’s already-full plate

Use the search function and hope it works—often ending in Slack messages like “Can someone send me the latest Brand Guardrails?”

OSG tried various organizational systems, but they all required discipline and consistency across a distributed team. The team realized that if they had to build a system robust enough to delegate the work effectively, they might as well build one that solved the problem with automation instead.

With more and more time and productivity being lost, the founders wanted a repeatable, low-overhead fix, and one that required no extra SaaS fees or a dedicated ops hire.

The Solution — The Entropy Agent Inside OSG

Enter the solution – OSG built the Entropy Agent in Claude Desktop and hosted supporting MCP servers on Docker Desktop. 

Why did OSG choose Docker MCP Catalog and Toolkit? 

As an existing Docker practitioner, the OSG Team was already familiar with the platform and had it running locally. When Docker announced MCP support, it was a natural fit—no new approvals needed, no cloud dependencies, and all processing stays on their local machines. For a team that values “In God we trust; all others bring data,” the ability to maintain complete control over sensitive operational documents—tracking everything from employee scorecards to $4.2M in real estate holdings to proprietary business methodologies—was non-negotiable.

The OSG team chose the Docker MCP Catalog and Toolkit for its simple, secure approach to discovering, running, and managing MCP servers. True to their value of “Think Big, Act Small,” the setup process took under an hour from discovery to first successful file routing. With one-click integration into Claude Desktop, there’s no need for manual configuration files or complex networking—each MCP server is ready to go out of the box.

The architecture mirrors the diagram shared by the founders:

Claude Desktop AI – Classifies incoming files, validates they belong in the BOS, routes commands, and syncs updates.

Docker MCP Toolkit – Runs an MCP Gateway Server that exposes callable services. The first service available is the GitHub MCP Server. In the next phase, additional MCP Servers for Notion, Slack, and Google Drive will be introduced.

GitHub BOS – Nine‑domain folder framework (Foundation through Governance) where every validated file lands with a custom.md explainer.

Current Operational Flow

With the Entropy Agent in place, the workflow is simple: the operator drops a file into the local Entropy Inbox, and the agent takes care of the rest, classifying, renaming, and updating the correct BOS folder automatically.

Receive: The operator drops a file (e.g., updated Brand Guardrails) into an “Entropy Inbox” folder on their local machine.

Classify & Validate: Claude identifies the file type, confirms it matches one of the four foundational categories, and renames it using OSG’s convention.

Route: Claude sends a command through the MCP Gateway to the GitHub MCP Server.

Sync: The GitHub MCP server places the file into the correct BOS folder, commits, and pushes to the private repo.

Context: Claude appends or updates the custom.md in that folder so humans know why the file matters.

Results

OSG implemented the Entropy agent using Docker and Claude in less than an hour. Since then, the team has been saving an estimated two hours per person every week. 

Now files are automatically renamed, sorted, and organized into their BOS framework with zero manual effort. When the marketing consultant needs brand guardrails for a campaign, when the founder needs the latest 4-T Resource Manager for a board meeting, or when a virtual assistant needs to onboard a new contractor—the files are always in the same place, named the same way, every time. No searching. No guessing. No wasted time.

The clarity this creates goes beyond just finding files. It means anyone on the team—from engineers to consultants—can confidently access the exact information they need to do their best work, regardless of which venture or project they’re supporting.

OSG set up this first internal agent using the Docker MCP stack in under an hour. It wasn’t perfect, but it worked, and that kind of speed to value is huge. Further, the founder also appreciated the psychological difference. The downloads folder used to be a mess; now everything lives in a predictable structure. That clarity extends beyond just file management. As they build out more systems and internal agents, having everything organized has made their stack easier to navigate and far more AI-friendly.

OSG estimates they’re saving at least two hours a week per person just by cutting down on file-related tasks that weren’t really part of anyone’s job. But more importantly, the way this stack supports context-aware workflows is helping them build toward something bigger: a truly AI-centric organization.

Before OSG started using the Docker MCP stack, a lot of their internal communication was just about locating things. “Where’s that file?” “Who has the latest version?” Those kinds of questions were constant. Now, they’re mostly gone. If something isn’t where they expect it to be, they know it’s either not created yet—or someone needs to build it. That small shift alone has made a noticeable impact on how the OSG team works together.

Pain point, pre-agent vs. post-agent:

Manual renaming & filing: from 100% human to 0%, fully automated.

Duplicate file versions: from common to rare, with a single canonical copy in BOS.

“Where’s that doc?” pings: from daily to fewer than one per week.

Lessons Learned

OSG’s experience shows that small, targeted automations can deliver big results, especially when built on a flexible, familiar platform like Docker. Other learnings include:

Choose a platform that’s flexible, secure, and familiar like Docker. Docker’s MCP Toolkit gave OSG the freedom to test, iterate, and refine their agent without wrestling with dependency conflicts or complex runtime setup. Because MCP servers run in isolated containers, OSG sidestepped the common pitfalls of environment inconsistencies and version conflicts that plague traditional Node.js or Python installations—just run the container, and it works. 

Security was non-negotiable for OSG. With sensitive operational documents—from employee scorecards to $4.2M in real estate holdings—the team needed more than basic access controls. Docker’s container-based approach provides built-in isolation, OAuth support, and secure credential storage, eliminating the risks of plaintext environment variables and full host access that plague many MCP tools. Docker also addresses emerging threats unique to MCP servers, like Tool Poisoning and Tool Rug Pulls, by leveraging its trusted position as both a provider of secure content and secure runtimes. For teams looking to build AI agents, picking a platform like Docker that combines security, simplicity, and extensibility is critical—it removes friction from the development process and lets you focus on solving real problems, not wrestling with infrastructure.

Single Canonical Home Beats Infinite Folders. A GitHub‑based BOS eliminates debate about “real” vs “working” files. When there’s only one authoritative location for each document, version confusion disappears and team confidence goes up. This architectural decision—paired with automated file routing—transformed how the entire team accesses information.

Modular MCP Servers Future‑Proof the Stack. Starting with the GitHub MCP server kept the system light and focused. As Docker expands the MCP ecosystem, OSG can easily add new integrations—Slack digests, Notion knowledge bases, Google Drive syncs—without refactoring their core workflow. Because each MCP server runs as an independent container, adding functionality is as simple as enabling a new service. This modular architecture means the agent can grow with the business.

Small Automation, Big Return. A few hundred lines of Claude prompts and MCP calls replaced repetitive, low-value tasks across every project. The lesson: you don’t need massive infrastructure to see meaningful impact. Start with one painful workflow, automate it completely, and build from there. The psychological shift alone—from “where’s that file?” to “if it’s not there, we haven’t built it yet”—has transformed how the team works together.

Conclusion

OSG’s Entropy Agent shows that you don’t need a heavyweight platform to kill entropy debt. A targeted Claude workflow plus one MCP server delivered immediate, measurable relief—and laid groundwork for richer automation down the road. Docker’s MCP solution makes it easy to extend your agents with MCP capabilities: no complex setup, just scalable, secure, and solid performance out of the box!

Learn more

Visit the Open Source Genius website and sister site to learn more about their services or reach out directly to Ryan. 

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers

Download Docker Desktop to get started with the MCP Toolkit: Run MCP servers easily and securely
