How to Use Multimodal AI Models With Docker Model Runner

One of the most exciting advances in modern AI is multimodal support, the ability for models to understand and generate multiple types of input, such as text, images, or audio. 

With multimodal models, you’re no longer limited to typing prompts; you can show an image or play a sound, and the model can understand it. This opens a world of new possibilities for developers building intelligent, local AI experiences.In this post, we’ll explore how to use multimodal models with Docker Model Runner, walk through practical examples, and explain how it all works under the hood.

What Is Multimodal AI?

Most language models only understand text, but multimodal models go further. They can analyze and combine text, image, and audio data. That means you can ask a model to:

Describe what’s in an image

Identify or reason about visual details

Transcribe or summarize an audio clip

This unlocks new ways to build AI applications that can see and listen, not just read.

How to use multimodal models

Not every model supports multimodal inputs, so your first step is to choose one that does.

In Docker Hub we indicate the inputs supported on each model on its model card, for example:

Moondream2, Gemma3, or Smolvlm models supports text and image as input, while GPT-OSS supports text only

The easiest way to start experimenting is with the CLI. Here’s a simple example that asks a multimodal model to describe an image:

docker model run gemma3 "What's in this image? /Users/ilopezluna/Documents/something.jpg"
The image shows the logo for **Docker**, a popular platform for containerization.

Here's a breakdown of what you see:

* **A Whale:** The main element is a stylized blue whale.
* **A Shipping Container:** The whale's body is shaped like a shipping container.
* **A Stack of Blocks:** Inside the container are a stack of blue blocks, representing the layers and components of an application.
* **Eye:** A simple, white eye is featured.

Docker uses this iconic whale-container image to represent the concept of packaging and running applications in isolated containers.

Using the Model Runner API for More Control

While the CLI is great for quick experiments, the API gives you full flexibility for integrating models into your apps. Docker Model Runner exposes an OpenAI-compatible API, meaning you can use the same client libraries and request formats you already know, just point them to Docker Model Runner.

Here’s an example of sending both text and image input:

curl –location 'http://localhost:12434/engines/llama.cpp/v1/chat/completions'
–header 'Content-Type: application/json'
–data '{
"model": "ai/gemma3",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "describe the image"
},
{
"type": "image_url",
"image_url": {
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
}
}
]
}
]
}'

Run Multimodal models from Hugging Face

Thanks to our friends at Hugging Face (special shout-out to Adrien Carreria), you can also run multimodal models directly from Hugging Face in Docker Model Runner.

Here’s an example using a model capable of audio transcription:

curl –location 'http://localhost:12434/engines/llama.cpp/v1/chat/completions'
–header 'Content-Type: application/json'
–data '{
"model": "hf.co/ggml-org/ultravox-v0_5-llama-3_1-8b-gguf",
"temperature": 0,
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "transcribe the audio, one word"
},
{
"type": "input_audio",
"input_audio": {
"data": "//PoxAB8RA5OX53xAkRAKBwMBQLRYMhmLDEXQQI0CT8QFQNMawxMYiQtFQClEDCgjDHhCzjtkQuCpb4tQY+IgZ3bGZtttcm+GnGYNIBBgRAAGB+AuYK4N4wICLCPmW0GqZsapZtCnzmq4V4AABgQA2YGoAqQ5gXgWDoBRgVBQCQWxgXAXgYEswTQHR4At2IpfL935ePAKpgGACAAvlkDP2ZrfMBkAdpq4kYTARAHCoABgWAROIBgDPhUGuotB/GkF1EII2i6BgGwwAEVAAMAoARpqC8TRUMFcD2Ly6AIPTnuLEMAkBgwVALjBsBeMEABxWAwUgSDAIAPMBEAMwLAPy65gHgDmBgBALAOPIYDYBYVARMB0AdoKYYYAwYAIAYNANTcMBQAoEAEmAcApBRg+g5mCmBGIgATAPBFMEsBUwTwMzAXAHfMwNQKTAPAPDAHwcCoCAGkHAwBNYRHhYBwWhhwBEPyQuuHAJwuSmAOAeEAKLBSmQR2zbhC81/ORKWnsDhhrjlrxWcBgI2+hCBiAOXzMGLoTHb681deaxoLMAUAFY5gHgCxaTQuIZUsmnpTXVsglpKonHlejAXAHXHOJ0QxnHJyafakpJ+CJAziA/izoImFwFIO/E37iEYij/0+8s8c/9YJAiAgADAHADa28k4sSA3vhE9GrcCw/lPpTEFNRQMACQgABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB6jBZABd3gAADj6KbWk8H+MCSbKw3Jgvlxg+JpjWF5uKl4QJgiEcw5EIyCSY3E4IyvS0wFHwz3E8wrG0yzIU8E7Q5zK8xTFwwbE0HDmYaXhvZCqGoKGEgIFRyZzQxmZ2mXBOY1Aw4LDDyIN/F47SVzdzIMqAowELgCszjgFMvmMxiHzE4hMLicyaGQUaTCoDfuaSaIhgLAsuAFQSYRC4sMxISiQtMGi0ymWTKYvCoHMyjUwAJDIBIMbAhPIKgsdACDpgkFoVGLB8Fg+YTHpkoEGFxCFh0DBeYeECPyEBgQEGDxSCRijDSLJYGwBE4wOBjDYABwWLAMC4fXCFiYHEsuGCQcY2BIcQBIqGAhGYjD5iAKGNwOYgLplAAv2OgJEsCBUwsHDBILBQuEAiMnBcDHIw4AgsACgIlAJDkGY6OICSsgEh2FBOYfCwMBLcJuHE/0elvMKaw1UHBNFB9IdQDxxxH2V/AvvK9cPSJonarWZcyeYd2XQ3BLhUD0yvrpQK0hscP0UesPM0FgDjoAEb1VntaO5MPzDYnJpn4fd9EnS5isDTQSGQoAAEFzAwhBQLTQAQIdi1Arwvo4z6t9FoCcdw2/biq9fDTQ4NrsKBCFwGRDYIAiB7PFPPczALAJS4UAK3G7Sle95iVl+qKL00NXaWsmIKaigYAEhAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxABxxBZIBua3UAy9KUzYdYeFxCZJEQOExhYDGHg4bB7J5W0GV0QYdAhig3G9IQbDT53SunFlCZmoh0DsGmVOc6bZmZqnK0Ga7ABrYqmWUsZSNZeMwgDhYJlULjAfFQGOlAwYfTGYBMMDUmZgYazW4aNBM8w0XD5JMPDxo1KQjLilMbBA24QLviy5lAxhwzWwaFaIk+YIKg5cIAQKKgw4bI6DjJsEqERhHZtCBCdNInOgRVnMAEWNARfMuyIAxnwAJGGlBA0YFiQYSFggKBmHDlAcxIUmQsEX9HF/R1YUeDNzJiKZKgMLBwsAhE5pSCQiDiK6bJfUOCCxswBgmKo4IjrwAoWCQ8wgtMpUjOYEZE/DAYEgwNGxIIMMAzBAAdAbK/qVDxv2wWN3WXNJX0opEXta2XUQBMrAACNAhh4IECTV4CRXaQzqUsScKOypSqiemQTMxelkY6/ucCu1QwxfuSajv1pSzmXrJRxZK4Hxb2Fr7dJR+H2mlYcXmFQEmCEzR6BFCAxxTDjIRDANCVLW3LR0MKaE2N41VmiIpO+UB4sFpfoK1TFB0HCiwKBgkqhx0YKCDQQjWXlXmBgQLg6mSLCbSv2Gs8i0OL4h56926SbxTEFNRQMACQgABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB19BY4BOd00gCrS9rgyDTAxEMTgsQDA0HNzlqJNvgM31dzOAbHRAYyXZqNHG8TwaPCBhMHmn1oaiThh8gmQXQatWOcLz8dTEgaBoaYkmwYvDcaECZBZg8FJiQQhekwbCcwtE8tUYSCeYRAkJECY9BoZQjoYgmEY3r8Y2hEYsnaZuryaPT4ba1aZqr8bjGkZCmSAQqMSALCAeBoFmCoFllgUEQdJB0cCuhaSYJcYowRIjkmjWizNGTDLTOjzQRUigKFb1TktU4iqIGCF6QI1CAIWDgEAgUZUYTJwoZDwhqCpsTpCFEA8s+utVJYcQNwaPMzTDI4hRmVAmICGXOm5FmDEIBCak2hg3Znx50Z5k4o07SAAAMFHBATAWIR8gpFNonBX8xH0zxcAhw40a5aaAqYQ+Y1CYdWHIk2n/SkVUWRLJAomXnZu8CKb+iwxE+Wui1JZZgRTvzzPonOOxYoYGgNmyuGTKnfSRxaTu3duS57aaNvtMSt4qaVxqYdWKwcytpaiDNbb4Sq1UoGwOU5bKJYoGmUwNEx3VzCMIoHSMMTnmHaL/Splj9MZZs3MOBgwWSDKhMYS0WFLvGADiEimQXbFCLuIcVGgsOgd9AcUTCfyOKFLEWLAsafeOQmnpbMWpUxBTUUDAAkIAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB0zAI0AO7fcaG1AIUAUDAI3ph2SpjEhxkiG4yVAGjcysGkaCkwaG4xgFcLA8BhFMSxnMAyVAoRiMkzGlnDKqtT66qTfpcjBtgTE2UjUxyzaRVmsmXA1GWASGAIbmA4cgkDDAMjjEkFDDMCAgdjFANjDEUjCcRzJUlRIrTW8/TIsDjwdgDKlbjetyTIw4TdoxRpojFwBDFwAwgIQqCQCBMwxEIxcvSzGFI1MuCzUUBpoQOI3QIQTAEEVOjZQUysGMYDDBgsoC2ENGGAFMsEAAJJQIEIC37MBHBwKCDcxJCNTOTBCF5DTg0i3zKQwRiJh4NfJAIwV1OTKjThszB+N4vwgCNfbDSyMxs+NFLjJV438TN2OwcsklwZovmLFRkqsQioiIwhTHZ8wQ0MihzXkU2iKFGF5gQaAwaMJAxIqMqSTGBYwZrNEMTHyBREaAwACLpkCXgGD08Q+gJoJEbxNlwyk9To0S2hXloiTaSYuS92ZGqdQKITZRsrgm0XROrCGZdztDUiI7o7Hpf08ex8P0LFiTByoa+P1SF0sgHBCFceqNS0N0bpYFcN8K8XNclxOsfQpSeNwviFGgk1KRAtiKJsW00VWnamIPEzblsJwfCsQs60gjPPi8sOcBrEpiCmooGABIQAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAByRBYwBt800Am7elzDkQl9GYCJm2CfcMHAmBjsIcZfGLogAETDRMwkkAJURaYURjMjkBX5lwqBl85jBzsAHNIK0w/RjyJSN8L804rDXpaMXBQx6fAERzN4XM8FkwiDDDYqQDGBgKBSsYPD4YOjArTN7Gkxq7TcJKNpKw+2GzsFXNEYU5cCzIoDMcDsDAZOYwaDEjgcAgSTM8LABMyAw1qZIY0wdBgmZJ7AkOEDSAaELUlF1sajKOyX6mCH4ECjoZLN+RCEMaBMCQBVMWWtbNjKUGDAQUBqNjgJfxsyB8AYOQgS8bW4RezPPwURRWRILoFgOUcTGljLtwFjAxoza446IEAwsLMYHFqJn1xwGJsD5j3CKaNAkBbQHElGF4CFAEGggs6TIH+f5yGXvCsKncX7JAqcjSFlszX9QqprCR6/Ik9oCUtjc1yJ138kr+P/QMfdymbpDLPJxrVPYhDouhDbPU7lAmpP68cWcVsqqqLC+s5Q5DWJtwlBl9LSUqLAJg6TrGVyEQtK8wgAqizFDEo1xLQW8vd2WMLte5xkqBoEwVZRVDqKIghCllhfZwyYIhCniDB4QRayY9IY4i5S1k5FIq3InM6aVLZbGWuwK2SVUl9MQU1FAwAJCAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
"format": "mp3"
}
}
]
}
]
}'

This model can listen and respond, literally.

Multimodal AI example: A real-time webcam vision model

We have created a couple of demos in the Docker Model Runner repository, and of course, we couldn’t miss a demo based on ngxson’s example:

You can run this same demo by following the instructions, spoiler alert: it’s just Docker Model Runner and an HTML page.

How multimodel AI works: Understanding audio and images

How are these large language models capable of understanding images or audio? The key is something called a multimodal projector file.

This file acts as an adapter, a small neural network layer that converts non-text inputs (like pixels or sound waves) into a token representation the language model can understand. Think of it as a translator that turns visual or auditory information into the same kind of internal “language” used for text.

In simpler terms:

The projector takes an image or audio input

It processes it into numerical embeddings (tokens)

The language model then interprets those tokens just like it would words in a sentence

This extra layer allows a single model architecture to handle multiple input types without retraining the entire model.

Inspecting the Projector in OCI Artifacts

In Docker Model Runner, models are packaged as OCI artifacts, so everything needed to run the model locally (weights, configuration, and extra layers) is contained in a reproducible format.

You can actually see the multimodal projector file by inspecting the model’s OCI layers. For example, take a look at ai/gemma3.

You’ll find a layer with the media type: “application/vnd.docker.ai.mmproj”.

This layer is the multimodal projector, the component that makes the model multimodal-capable. It’s what enables gemma3 to accept images as input in addition to text.

We’re Building This Together!

Docker Model Runner is a community-friendly project at its core, and its future is shaped by contributors like you. If you find this tool useful, please head over to our GitHub repository. Show your support by giving us a star, fork the project to experiment with your own ideas, and contribute. Whether it’s improving documentation, fixing a bug, or a new feature, every contribution helps. Let’s build the future of model deployment together!

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Docker Model Runner with a simple hello GenAI application

Quelle: https://blog.docker.com/feed/

Mr. Bones: A Pirate-Voiced Halloween Chatbot Powered by Docker Model Runner

My name is Mike Coleman, a staff solution architect at Docker. This year I decided to turn a Home Depot animatronic skeleton into an AI-powered,  live, interactive Halloween chatbot. The result: kids walk up to Mr. Bones, a spooky skeleton in my yard, ask it questions, and it answers back — in full pirate voice — with actual conversational responses, thanks to a local LLM powered by Docker Model Runner.

Why Docker Model Runner?

Docker Model Runner is a tool from Docker that makes it dead simple to run open-source LLMs locally using standard Docker workflows. I pulled the model like I’d pull any image, and it exposed an OpenAI-compatible API I could call from my app. Under the hood, it handled model loading, inference, and optimization.

For this project, Docker Model Runner offered a few key benefits:

No API costs for LLM inference — unlike OpenAI or Anthropic

Low latency because the model runs on local hardware

Full control over model selection, prompts, and scaffolding

API-compatible with OpenAI — switching providers is as simple as changing an environment variable and restarting the service

That last point matters: if I ever needed to switch to OpenAI or Anthropic for a particular use case, the change would take seconds.

System Overview

Figure 1: System overview of Mr. Bones answering questions in pirate language

Here’s the basic flow:

Kid talks to skeleton

Pi 5 + USB mic records audio

Vosk STT transcribes speech to text

API call to a Windows gaming PC with an RTX 5070 GPU

Docker Model Runner runs a local LLaMA 3.1 8B (Q4 quant) model

LLM returns a text response

ElevenLabs Flash TTS converts the text to speech (pirate voice)

Audio sent back to Pi

Pi sends audio to skeleton via Bluetooth, which moves the jaw in sync

Figure 2: The controller box that holds the Raspberry Pi that drives the pirate

That Windows machine isn’t a dedicated inference server — it’s my gaming rig. Just a regular setup running a quantized model locally.

The biggest challenge with this project was balancing response quality (in character and age appropriate) and response time. With that in mind, there were four key areas that needed a little extra emphasis: model selection, how to do text to speech (TTS) processing efficiently, fault tolerance, and setting up guardrails. 

Consideration 1: Model Choice and Local LLM Performance

I tested several open models and found LLaMA 3.1 8B (Q4 quantized) to be the best mix of performance, fluency, and personality. On my RTX 5070, it handled real-time inference fast enough for the interaction to feel responsive.

At one point I was struggling to keep Mr. Bones in character, so I  tried OpenAI’s ChatGPT API, but response times averaged 4.5 seconds.

By revising the prompt and Docker Model Runner serving the right model, I got that down to 1.5 seconds. That’s a huge difference when a kid is standing there waiting for the skeleton to talk.

In the end, GPT-4 was only nominally better at staying in character and avoiding inappropriate replies. With a solid prompt scaffold and some guardrails, the local model held up just fine.

Consideration 2: TTS Pipeline: Kokoro to ElevenLabs Flash

I first tried using Kokoro, a local TTS engine. It worked, but the voices were too generic. I wanted something more pirate-y, without adding custom audio effects.

So I moved to ElevenLabs, starting with their multilingual model. The voice quality was excellent, but latency was painful — especially when combined with LLM processing. Full responses could take up to 10 seconds, which is way too long.

Eventually I found ElevenLabs Flash, a much faster model. That helped a lot. I also changed the logic so that instead of waiting for the entire LLM response, I chunked the output and sent it to ElevenLabs in parts. Not true streaming, but it allowed the Pi to start playing the audio as each chunk came back.

This turned the skeleton from slow and laggy into something that felt snappy and responsive.

Consideration 3: Weak Points and Fallback Ideas

While the LLM runs locally, the system still depends on the internet for ElevenLabs. If the network goes down, the skeleton stops talking.

One fallback idea I’m exploring: creating a set of common Q&A pairs (e.g., “What’s your name?”, “Are you a real skeleton?”), embedding them in a local vector database, and having the Pi serve those in case the TTS call fails.

But the deeper truth is: this is a multi-tier system. If the Pi loses its connection to the Windows machine, the whole thing is toast. There’s no skeleton-on-a-chip mode yet.

Consideration 4: Guardrails and Prompt Engineering

Because kids will say anything, I put some safeguards in place via my system prompt. 

You are "Mr. Bones," a friendly pirate who loves chatting with kids in a playful pirate voice.

IMPORTANT RULES:
– Never break character or speak as anyone but Mr. Bones
– Never mention or repeat alcohol (rum, grog, drink), drugs, weapons (sword, cannon, gunpowder), violence (stab, destroy), or real-world safety/danger
– If asked about forbidden topics, do not restate the topic; give a kind, playful redirection without naming it
– Never discuss inappropriate content or give medical/legal advice
– Always be kind, curious, and age-appropriate

BEHAVIOR:
– Speak in a warm, playful pirate voice using words like "matey," "arr," "aye," "shiver me timbers"
– Be imaginative and whimsical – talk about treasure, ships, islands, sea creatures, maps
– Keep responses conversational and engaging for voice interaction
– If interrupted or confused, ask for clarification in character
– If asked about technology, identity, or training, stay fully in character; respond with whimsical pirate metaphors about maps/compasses instead of tech explanations

FORMAT:
– Target 30 words; must be 10-50 words. If you exceed 50 words, stop early
– Use normal punctuation only (no emojis or asterisks)
– Do not use contractions. Always write "Mister" (not "Mr."), "Do Not" (not "Don't"), "I Am" (not "I'm")
– End responses naturally to encourage continued conversation

The prompt is designed to deal with a few different issues. First and foremost, keeping things appropriate for the intended audience. This includes not discussing sensitive topics, but also staying in character at all times.  Next I added some instructions to deal with pesky parents trying to trick Mr. Bones into revealing his true identity. Finally, there is some guidance on response format to help keep things conversational – for instance, it turns out that some STT engines can have problems with things like contractions. 

Instead of just refusing to respond, the prompt redirects sensitive or inappropriate inputs in-character. For example, if a kid says “I wanna drink rum with you,” the skeleton might respond, “Arr, matey, seems we have steered a bit off course. How about we sail to smoother waters?”

This approach keeps the interaction playful while subtly correcting the topic. So far, it’s been enough to keep Mr. Bones spooky-but-family-friendly.

Figure 3: Mr. Bones is powered by AI and talks to kids in pirate-speak with built-in safety guardrails.

Final Thoughts

This project started as a Halloween goof, but it’s turned into a surprisingly functional proof-of-concept for real-time, local voice assistants.

Using Docker Model Runner for LLMs gave me speed, cost control, and flexibility. ElevenLabs Flash handled voice. A Pi 5 managed the input and playback. And a Home Depot skeleton brought it all to life.

Could you build a more robust version with better failover and smarter motion control? Absolutely. But even as he stands today, Mr. Bones has already made a bunch of kids smile — and probably a few grown-up engineers think, “Wait, I could build one of those.” 

Source code: github.com/mikegcoleman/pirate

Figure 4: Aye aye! Ye can build a Mr. Bones too and bring smiles to all the young mateys in the neighborhood!

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Docker Model Runner with a simple hello GenAI application

Quelle: https://blog.docker.com/feed/

Security Doesn’t Have to Hurt

Do you ever wish security would stop blocking the tools you need to do your job? Surprise: your security team wants the same.

There you are, just trying to get your work done, when…

You need an AI to translate documentation, but all the AI services are blocked by a security web monitoring tool.

You finish coding and QA for a new software version just under the wire, but the release is late because security has not reviewed the open source software and libraries included.

Your new database works perfectly in dev/test, but it does not work in production because of a port configuration, and you do not have permissions. Changes to production permissions all require security approval

Here Comes… Shadow IT

Shadow IT is a spy-movie name for a phenomenon that is either a frustrating necessity or a game of whack-a-mole, depending on your responsibilities.

If you’re an engineer creating the next best product, shadow IT is a necessity. 

Company-supplied information technology does not change fast enough to keep up with the market, let alone allow you to innovate. Despite that, your security team will come down hard on anyone who tries to go outside the allowed vendors and products. Data storage has to be squared away in encrypted, protected spaces, and you have to jump like a show pony to get access. And you have no flexibility in the tools you’re allowed to use, even if you could produce faster and better with other options.

So you stop playing by the rules, and you find tools and tech that work.

That is, until someone protests the cloud hosting bill, finds the wifi access point, or notices the unofficial software repository. Security takes away your tools or cuts off access. And then you are upset, your team feels attacked, and security is up in arms.

If you are on a security team, shadow IT is a game of whack-a-mole. Company-supplied information technology changes without review. You know they’re trying to enable innovation, but they’re negating all the IT compliance certifications that allow you to sell your services and products. You have to investigate, prove, and argue about policies and regulations just to stop people from storing client secrets in their personal cloud storage.

Whether you are a new hire in the Security Operations Center or the unlucky CISO who reports to the CTO, this is a familiar refrain.

Yet no one wants this. Not you, not your boss, and not security.

If It Cannot Be Fixed, Break It

It’s time we change the ground rules of security to focus on compromise rather than stringency. 

Most security teams *want* to change their operations to concentrate on the capabilities they are trained for: threat intelligence, risk management, forensic analysis, and security engineering. I have never met a security professional who wants to spend their time arguing over a port configuration. It’s tiresome, and that friction inspires lasting antagonism on both sides.

Imagine working in a place where you can use innovative new tools, release products without a security delay, and change configurations so that your deployment works smoothly.

We can have this. 

But there is a subtle change that must happen to enable this security-IT paradise: non-security teams would have to understand and implement all the requirements security departments would check. And everyone who is part of the change would need to understand the implications of their actions and take sole responsibility for the security outcomes.

Let Security Let Go

My non-IT colleagues are shocked when I explain the scope of work for a security department in preparation for any release or product launch:

Weaknesses and exploits for custom and third-party code

Scope and adequacy of vendor security

Data encryption, transmission, and storage, especially across borders

Compliance with regulation and data protection laws

In many industries, we legally cannot remove security practices from IT processes. But we can change who takes responsibility for which parts of the work 

Security requirements are not a secret. A developer with integrated code scanners can avoid OWASP Top 10 flaws and vulnerable libraries and remove hard-coded accounts. Infrastructure admins with access to network security tools can run tidy networks, servers, and containers with precise configurations.

The result? The security team can let go of their rigid deployment rules.

If developers use code security tools and incorporate good practices, security team approval should take hours rather than days or weeks. Security can also approve the standard container configuration rather than each separate container in an architecture. They can define the requirements, offer you tools to review your work, and help you integrate good practices into your workflow.

“Trust but verify” would become a daily pattern instead of lip service to good interdepartmental relationships. Security will continue to monitor the environment and the application after release. They will keep an eye on vendor assertions and audits, watching threat intelligence streams for notifications that demonstrate risk. Security teams will have time to do the job they signed up for, which is much more interesting than policing other departments.

This change would also require that the security team be *allowed* to let go. When trust is broken—if vendors are not properly assessed, or software is introduced but not reported—the fault should not lie with the security team. If insecure coding causes a compromise, the development team must be accountable, and if an inadequately configured network causes a data leak, the network and hosting team must be called on the carpet. If the requirements are in place but not met, the responsible parties must be those that agreed to them but neglected to enact them.

Freedom to Choose Comes with a Catch

This new freedom makes shadow IT unnecessary. Teams do not need to hide the choices they make. However, the freedom to choose comes with a catch: full responsibility for your choices.

Consider the company charge card: Finance teams create the policy for how to use company charge cards and provide the tools for reimbursement. They do not scrutinize every charge in real time, but they review usage and payments.

If the tool is abused and the agreed-upon care is ignored, the card user is held responsible. Any lack of knowledge does not exempt you from the consequences. For minor infractions, you may get a written notice. For severe infractions, you can expect to be terminated for cause.

The finance requirements, your agreement, regular review, and enacted consequences minimize fraud internally. More importantly, though, this combination protects the company against accusations of negligence.

Security responsibility could work the same. Security teams can set requirements that IT workers agree to individually. IT teams are then free to deploy and make changes as appropriate for their work. IT secures assets before they are put into production, and security continues with the best practice of reviewing assets continuously after the fact. Delays in getting the tools you need are reduced, and you control the deployment of your work with much more assurance. The incentive for shadow IT is much lower, and the personal risk of choosing it is higher.

That last bit is the catch, though—when you take control, you take responsibility for the result. Instead of committing to a patch, you back out insecure code and redeploy when it is corrected. When your department contracts with a squirrelly vendor, your manager’s budget takes the hit for breaking the contract. When the network is compromised, the CIO, not the CISO, gets fired.

Right now, the security team carries this responsibility and shoulders these risks. But the result is an enterprise held hostage by risk aversion, with no understanding or control over the outcomes.

So far, I’ve mostly addressed IT, but I also want to bring this argument back home: Security professionals, let’s stop taking control of everyone else’s work. When we make hard requirements that do not meet tech realities, our IT teams get better at hiding their tracks. You will make more progress if you invest in mutual success and reward people who step up to exceed your expectations.

When Security and IT Make Peace, Shadow IT Becomes Unnecessary

I once worked with a development team that wanted to store proprietary code in a hosted code repository. The repository was great for their needs: versioning automation, fine-grained access management, easy branching, access from anywhere, and centralized storage. Instead of waiting six months for the new vendor security investigation process, the developer team gathered the vendor’s audit certificates, data handling guarantees, and standard contract language about security and data mining. The devs proactively researched the third-party security scanning policies and asked for their incident response and notification policies.

Our security team would have struggled to locate this repository if the developers had simply chosen to use it. Instead, they circumvented our process in the best way—by providing every necessary answer to our security questions.

The reward was an instant yes from me, the security leader, without having to wait for my overworked team to schedule yet another vendor review.

My reward? No shadow IT plus a very happy IT team.

Security should go beyond allowing compromises like this: we should seek them out. Convince the CISO to work toward giving your IT teams both control and responsibility, find a compromise with the teams that will take security seriously—and save your energy for wrangling teams that don’t.

For admins and developers: Provide the ISO audit documents for that vendor you want to use. Be the first dev team to learn the org’s code scanning tool. Read the latest risk assessments from your cloud environment and don’t repeat vulnerable configurations. These small changes make your work faster, simpler, and less expensive than finding your own solutions.

Quelle: https://blog.docker.com/feed/

theCUBE Research economic validation of Docker’s development platform

Docker’s impact on agentic AI, security, developer productivity, costs, ROI.

An independent study by theCUBE Research.

To investigate Docker’s impact on developer productivity, software supply chain security, agentic AI development, cost savings, and ROI, theCUBE Research asked nearly 400 enterprise IT & AppDev leaders from medium to large global enterprises.  The industry context is that enterprise developers face mounting pressure to rapidly ship features, build agentic AI applications, and maintain security. All while navigating a fragmented array of development tools and open source code that require engineering cycles and introduce security risks. Docker transformed software development through containers and DevSecOps workflows, and is now doing the same for agentic AI development and software supply chain security.  theCUBE Research quantified Docker’s impact: teams build agentic AI apps faster, achieve near-zero CVEs, remediate vulnerabilities before exploits, ship modern cloud-native applications, save developer hours, and generate financial returns.

Keep reading to get key highlights and analysis. Download theCube Research report and ebook to take a deep dive.

Agentic AI development streamlined using familiar technologies

Developers can build, run, and share agents and compose agentic systems using familiar Docker container workflows. To do this, developers would build agents safely using Docker MCP Gateway Catalog and Toolkit; run agents securely with Docker Sandboxes; and run models with Docker Model Runner. These capabilities align with theCUBE Research findings that 87% of organizations reduced AI setup time by over 25% and 80% report accelerating AI time-to-market by at least 26%.  Using Docker’s modern and secure software delivery practices, development teams can implement AI feature experiments faster and in days test agentic AI capabilities that previously took months. Nearly 78% of developers experienced significant improvement in the standardization and streamlining of AI development workflows, enabling better testing and validation of AI models. Docker helps enterprises generate business advantages through deploying new customer experiences that leverage agentic AI applications. This is phenomenal, given the nascent stage of agentic AI development in enterprises.

Software supply chain security and innovation can move in lockstep

Security engineering and vulnerability remediation can slow development to a crawl. Furthermore, checkpoints or controls may be applied too late in the software development cycle, or after dangerous exploits, creating compounded friction between security teams seeking to prevent vulnerability exploits and developers seeking to rapidly ship features. Docker embeds security directly into development workflows through vulnerability analysis and continuously-patched certified container images. theCUBE Research analysis supports these Docker security capabilities: 79% of organizations find Docker extremely or very effective at maintaining security & compliance, while 95% of respondents reported that Docker improved their ability to identify and remediate vulnerabilities. By making it very simple for developers to use secure images as a default, Docker enables engineering teams to plan, build, and deploy securely without sacrificing feature velocity or creating deployment bottlenecks. Security and innovation can move in lockstep because Docker concurrently secures software supply chains and eliminates vulnerabilities.

Developer productivity becomes a competitive advantage

Consistent container environments eliminate friction, accelerate software delivery cycles, and enable teams to focus on building features rather than overcoming infrastructure challenges. When developers spend less time on environment setup and troubleshooting, they ship more features. Application features that previously took months now reach customers in weeks. The research demonstrates Docker’s ability to increase developer productivity. 72% of organizations reported significant productivity gains in development workflows, while 75% have transformed or adopted DevOps practices when using Docker. Furthermore, when it comes to AI and supply chain security, the findings mentioned above further support how Docker unlocks developer productivity.

Financial returns exceed expectations

CFOs demand quantifiable returns for technology investments, and Docker delivers them. 95% of organizations reported substantial annual savings, with 28% saving more than $250,000 and another 43% reporting $50,000-$250,000 in cost reductions from infrastructure efficiency, reduced rework, and faster time-to-market. The ROI story is equally compelling: 69% of organizations report ROI exceeding 101%, with 28% achieving ROI above 201%. When factoring in faster feature delivery, improved developer satisfaction, and reduced security incidents, the business case for Docker becomes even more tangible. The direct costs of a security breach can surpass $500 million, so mitigating even a fraction of this cost may provide enough financial justification for enterprises to deploy Docker to every developer.

Modernization and cloud native apps remain top of mind

For enterprises who maintain extensive legacy systems, Docker serves as a proven catalyst for cloud-native transformation at scale. Results show that nearly nine in ten (88%) of organizations report Docker has enabled modernization of at least 10% of their applications, with half achieving modernization across 31-60% of workloads and another 20% modernizing over 60%. Docker accelerates the shift from monolithic architectures to modern containerized cloud-native environments while also delivering substantial business value.  For example, 37% of organizations report 26% to >50% faster product time-to-market, and 72% report annual cost savings ranging from $50,000 to over $1 million.

Learn more about Docker’s impact on enterprise software development

Docker has evolved from a containerization suite into a development platform for testing, building, securing, and deploying modern software, including agentic AI applications. Docker enables enterprises to apply proven containerization and DevSecOps practices to agentic AI development and software supply chain security. 

Download (below) the full report and the ebook from theCUBE Research analysis to learn Docker’s impact on developer productivity, software supply chain security, agentic AI application development, CI/CD and DevSecOps, modernization, cost savings, and ROI.  Learn how enterprises leverage Docker to transform application development and win in markets where speed and innovation determine success.

theCUBE Research economic validation of Docker’s development platform

> Download the Report

> Download the eBook

Quelle: https://blog.docker.com/feed/

How to Connect MCP Servers to Claude Desktop with Docker MCP Toolkit

What if you could turn Claude from a conversational assistant into a development partner that actually does things—safely, securely, and without touching your local machine?

If you’ve been exploring Claude Desktop and wondering how to connect it with real developer tools, Docker MCP Toolkit is the missing piece you’ve been looking for.

Here’s the reality: Today’s AI assistants are brilliant at reasoning and explaining concepts. But when it comes to acting on that knowledge? They’re stuck. They can’t deploy containers, manage repos, or analyze data without trusted tools to bridge that gap.

That’s where Docker MCP Toolkit comes in.

Think of it this way:

Claude is the reasoning engine—the brain that understands what you want

Claude Desktop is the translator, turning your requests into actionable commands

Docker MCP Toolkit is the hand that safely executes those actions in isolated containers

Together, they transform Claude from something that just talks to something that builds alongside you—all through Docker’s trusted, security-hardened workflow.

In this guide, I’ll walk you through setting up Claude Desktop as an MCP client, connecting it to MCP servers running in Docker Desktop, and automating real developer tasks securely and reproducibly. 

What if you could turn 4 hours of work into 10 minutes? Imagine taking a screenshot of any app UI—a Stripe checkout page, a dashboard component, anything—and watching Claude recreate it as a fully-tested, production-ready React component, complete with Jest tests, GitHub repo, and deployment. That’s exactly what we’ll build together

What is Model Context Protocol (MCP)?

Before we dive into the setup, let’s clarify what MCP actually is.

Model Context Protocol (MCP) is the standardized way AI agents like Claude connect to tools, APIs, and services. It’s what lets Claude go beyond conversation and perform real-world actions—like deploying containers, analyzing datasets, or managing GitHub repositories.

In short: MCP is the bridge between Claude’s reasoning and your developer stack. And Docker? Docker provides the guardrails that make it safe.

Why use Docker MCP Toolkit with Claude Desktop?

I’ve been working with AI tools for a while now, and this Docker MCP setup is one of the most impressive things I’ve seen. Docker MCP Toolkit bridges Claude’s intelligence with Docker’s trusted developer workflow.

Docker MCP Toolkit doesn’t just connect Claude to your tools—it creates a secure, containerized environment where Claude can work without any risk to your local machine. Every action happens in an isolated container that gets torn down when you’re done. No mess, no security risks, complete reproducibility.

Once connected, Claude Desktop can tap into Docker’s curated MCP servers to automate developer tasks that used to eat up hours of your day. Here’s what that looks like in practice:

GitHub Integration: Want to create a repo, push some code, or review a pull request? Claude’s got you covered. No more switching between tabs constantly.

Kubernetes Management: If you’re into K8s (and let’s be honest, who isn’t these days?), Claude can deploy pods, manage services, and even handle Helm charts.

Data Analysis: Upload a CSV file and watch Claude analyze it, create visualizations, and give you insights you didn’t even know you needed.

Web Scraping: With Firecrawl integration, Claude can go fetch data from websites and bring it back to you, all formatted and ready to use.

Each of these MCP servers can be enabled directly through the MCP Toolkit in Docker Desktop. The setup is straightforward, the execution is secure, and the results are reproducible every single time.

Setting up Claude Desktop with Docker MCP Toolkit

Prerequisites

Before you begin, make sure you have:

A machine with 8GB RAM minimum, ideally 16GB

Install Docker Desktop

Step 1: Install and sign in to Claude Desktop

Head over to claude.ai/desktop and download Claude Desktop if you haven’t already. The installation is pretty straightforward – just follow the prompts. Once it’s installed, sign in with your Anthropic account.

Step 2: Enable Docker MCP Toolkit

Open Docker Desktop and enable the MCP Toolkit from the Settings menu.

Step 3. Connect Claude Desktop as MCP Client

Click “MCP Toolkit” in the left sidebar of Docker Desktop and click on “Connect” under Claude Desktop.

Step 4. Verify the MCP_DOCKER connection 

Restart Claude Desktop and verify that the MCP_DOCKER connection appears under the Claude Desktop menu.

Step 5. View the Claude Configuration

You can view the Claude configuration file claude_desktop_config.json accessible via Claude > Settings > Developers > Edit Config.

Click “Edit Config” and open it via Visual Studio Code or your preferred IDE.

How it works

When you connect Claude Desktop to Docker MCP Toolkit, this configuration gets added to Claude Desktop’s claude_desktop_config.json file:

Here’s what each part does:

mcpServers: The parent object that defines all MCP server connections available to Claude Desktop

“MCP_DOCKER”: The name identifier for this connection. This is what you’ll see referenced in Claude Desktop’s settings and logs

“command”: “docker”: Tells Claude Desktop to use the Docker CLI as the execution command

“args”: [“mcp”, “gateway”, “run”]: The arguments passed to Docker, which translate to running:

$ docker mcp gateway run

What this actually does

When Claude Desktop needs to access MCP tools, it executes docker mcp gateway run, which:

Starts the Docker MCP Gateway – This acts as the central router/bridge

Connects Claude Desktop (MCP client) to all enabled MCP servers in Docker Desktop

Routes tool requests from Claude → Gateway → Appropriate MCP Server → Back to Claude

Runs everything in isolated containers for security and reproducibility

In simple terms:

Think of it like this:

Claude Desktop = Your AI assistant (the client)

Docker MCP Gateway = The switchboard operator (the router)

MCP Servers = The actual tools (GitHub, Firecrawl, Node.js Sandbox, etc.)

This one configuration line is what makes all the magic happen—it’s the handshake that lets Claude safely access and execute all those containerized developer tools.

Where to find this file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%Claudeclaude_desktop_config.json

Note: When you click “Connect” in Docker Desktop’s MCP Client settings, this configuration gets added automatically—you don’t need to edit it manually.

Real-World Demo: From Screenshot to Deployed App (With Tests!)

Now that you’ve connected Claude Desktop to Docker MCP Toolkit, let’s see it in action with a practical example. What if you could take a screenshot of any app, and have Claude recreate it with working code AND full test coverage in minutes?

The Problem

Right now, the typical workflow looks like this:

Designers create mockups

Developers manually rebuild them in code

QA writes and runs tests

Repeat until it works

It’s slow, manual, and riddled with back-and-forth communication gaps.

The Solution

Claude Desktop + Docker MCP Toolkit automates all three steps. You provide a screenshot, and Claude handles the rest—analysis, code generation, testing, debugging, and deployment.

What You’ll Build

A functional React component from a screenshot, complete with:

Jest unit tests 

A GitHub repo (basic structure)

Configuration files (package.json, babelrc.js, gitignore)

Files ready for download

Note: This is a code foundation, not production deployment. Does NOT include: CI/CD, live deployment, Playwright tests, or a11y validation.

The Demo Setup

For this demonstration, we’ll use five MCP servers:

Firecrawl – Captures and analyzes screenshots

Node.js Sandbox – Runs tests, installs dependencies, validates code (in isolated containers)

GitHub – Handles version control and deployment

Sequential Thinking – Debugs failing tests and optimizes code

Context7 – Provides code documentation for LLMs and AI code editors

The Walkthrough (Step-by-Step)

Phase 1: Screenshot to Component

– You provide a screenshot, could be a Stripe checkout page, a dashboard card, whatever UI you need.

– Claude analyzes the layout, identifies components, measures spacing, captures colors, and generates a React/Next.js component that matches the design.

Phase 2: Generate Test Suite

– Jest unit tests for component logic (41 tests covering rendering, interactions, filters and state management).

Phase 3: Run & Debug with Node.js Sandbox

This is where Docker really shines

– The Node.js Sandbox MCP spins up an isolated container, installs all dependencies safely, and runs your test suite.

– Everything happens in a sandboxes environment; nothing touches your local machine.

– If tests fail? Sequential Thinking kicks in. Claude iterates, fixes the code, spins up a fresh container, and runs the tests again. It repeats this loop until everything passes.

Phase 4: Deploy to GitHub

Once tests are green:

– Claude creates a GitHub repo with proper structure

– Pushes the code

The Results: 

Before: A screenshot of a UI

After: A fully tested, deployed, production-ready component

Time saved: What used to take 4 hours now takes 10 minutes.

Why This Matters

Security First Code runs in isolated containers, not on your machine. No risk of malicious code affecting your system. Every execution has an audit trail.

Reproducibility Same Docker image = same results, every time. Works on your machine, your teammate’s machine, and in CI/CD. No more “works on my machine” excuses.

Speed + Safety Test potentially dangerous AI-generated code without fear. Iterate quickly in isolated environments. Roll back instantly if something breaks.

Visibility See exactly what’s running and where. Monitor resource usage in Docker Desktop. Access full logs and debugging capabilities.

Configure MCP Servers

Assuming that Docker MCP Toolkit is already configured with Claude Desktop, follow the below steps to configure and add MCP servers. 

Click “MCP Toolkit” and select “Catalog” to search for the following MCP servers and add them one by one.

Firecrawl (web scraping, screenshots)

GitHub Official (repo management, PRs, issues, commits)

Sequential Thinking (systematic problem-solving)

Context7 (up-to-date code documentation for LLMs and code editors)

Note: The Node.js Sandbox MCP server implements Docker-out-of-Docker (DooD) pattern by mounting /var/run/docker.sock, giving the sandbox container full Docker daemon API access. This allows it to spawn ephemeral sibling containers for code execution – when Claude requests JavaScript execution, the sandbox container makes Docker API calls to create temporary Node.js containers (node:lts-slim, Playwright, etc.) with resource limits (512MB RAM, 0.75 CPU cores), executes the code in isolation, and auto-removes the container.The Docker socket mount is a privilege escalation vector (effectively granting root-level host access) but enables dynamic container orchestration without the overhead of true Docker-in-Docker. Files are persisted via volume mount to ~/Desktop/sandbox-output, mapped to /root inside execution containers. This architecture trades security for flexibility – acceptable for local development but requires Docker Scout vulnerability scanning and careful consideration for production use. Hence, you’ll require a separate entry for Node.js Sandbox MCP server in  the Claude configuration file. You’ll see how to configure later in this article.

Configure Firecrawl MCP Server

The Firecrawl MCP server gives Claude Desktop the ability to add powerful web scraping and search capabilities. To setup a Firecrawl MCP server, you’ll need Firecrawl API key via https://www.firecrawl.dev/app/api-keys. Create a new account if you’re visiting the https://www.firecrawl.dev/ portal for the first time. Click “API Keys’ on the left sidebar to get the new API keys created.

In Docker Desktop:

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Firecrawl”

Find Firecrawl in the results

Select Configurations

Add Firecrawl API Keys (firecrawl.api_key) that you created earlier

Leave all the other entries blank

Click Save and Add Server

The Firecrawl MCP server should now appear under “My Servers” in Docker MCP Toolkit.

What you get:

6+ Firecrawl tools including:

firecrawl_check_crawl_status – check the status of a crawl job.

firecrawl_crawl – Starts a crawl job on a website and extracts content from all pages.

firecrawl_map – Map a website to discover all indexed URLs on the site

firecrawl_extract – Extract structured information from web pages using LLM capabilties.

firecrawl_scrape -Scrape content from a single URL with advanced options.

firecrawl_search – Search the web and optionally extract content from search results.

Configure GitHub Official MCP Server

The GitHub MCP enables Claude Desktop to create issues, PRs, and manage repositories on your behalf.

Option 1: OAuth Authentication (Recommended – Easiest)

In MCP Toolkit → Catalog, search “GitHub Official”

Click + Add

Go to the OAuth tab in Docker Desktop

Find the GitHub entry

Click “Authorize”

Your browser opens GitHub’s authorization page

Click “Authorize Docker” on GitHub

You’re redirected back to Docker Desktop

Return to Catalog tab, find GitHub Official

Click Start Server

Advantage: No manual token creation. Authorization happens through GitHub’s secure OAuth flow with automatic token refresh.

Option 2: Personal Access Token (For Granular Control)

If you prefer manual control or need specific scopes:

Step 1: Create GitHub Personal Access Token

Go to https://github.com  and sign in

Click your profile picture → Settings

Scroll to “Developer settings” in the left sidebar

Click “Personal access tokens” → “Tokens (classic)”

Click “Generate new token” → “Generate new token (classic)”

Name it: “Docker MCP Browser Testing”

Select scopes:

repo (Full control of repositories)

workflow (Update GitHub Actions workflows)

Click “Generate token”

Copy the token immediately (you won’t see it again!)

Step 2: Configure in Docker Desktop

In MCP Toolkit → Catalog, find GitHub Official

Click + Add (if not already added)

Go to Configuration tab

Select “Personal Access Token” as the authentication method

Paste your token

Click Start Server

Or via CLI:

docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN=github_pat_YOUR_TOKEN_HERE

Configure Sequential Thinking MCP Server

The Sequential Thinking MCP server gives Claude Desktop the ability for dynamic and reflective problem-solving through thought sequences. Adding the Sequential Thinking MCP server is straightforward –  it doesn’t require any API key. Just search for Sequential Thinking in the Catalog and get it to your MCP server list.

In Docker Desktop:

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Sequential Thinking”

Find Sequential Thinking in the results

Click “Add MCP Server” to add without any configuration

The Sequential Thinking MCP MCP server should now appear under “My Servers” in Docker MCP Toolkit.

What you get:

A single Sequential Thinking tool that includes:

sequentialthinking – A detailed tool for dynamic and reflective problem-solving through thoughts. This tool helps analyze problems through a flexible thinking process that can adapt and evolve. Each thought can build on, question, or revise previous insights as understanding deepens.

Configure Node.js Sandbox MCP Server

The Node.js Sandbox MCP enables Claude Desktop to spin up disposable Docker containers to execute arbitrary JavaScript. To get it added to the Claude Desktop, replace the contents of the Claude Desktop configuration file with the following JSON structure. This configuration tells Claude Desktop to start the Node.js Sandbox with access to specific directories:

{
"mcpServers": {
"MCP_DOCKER": {
"command": "docker",
"args": ["mcp", "gateway", "run"]
},
"node-code-sandbox": {
"command": "docker",
"args": [
"run",
"-i",
"–rm",
"-v",
"/var/run/docker.sock:/var/run/docker.sock",
"-v",
"/Users/YOUR_USERNAME/Desktop/sandbox-output:/root",
"mcp/node-code-sandbox"
],
"env": {
"FILES_DIR": "/root",
"SANDBOX_MEMORY_LIMIT": "512m",
"SANDBOX_CPU_LIMIT": "0.75"
}
}
}
}

Before you restart your Claude Desktop, make sure that the /Users/YOUR_USERNAME/Desktop/sandout-output directory exists on your local system and this directory is made available to containers via Docker Desktop > Settings > Resources > File Sharing.

What you get:

7 Node.js Sandbox tools including:

get_dependency_types – Given an array of npm package names (and optional versions), fetch whether each package ships its own TypeScript definitions or has a corresponding @types/… package, and return the raw .d.ts text. Useful whenwhen you’re about to run a Node.js script against an unfamiliar dependency and want to inspect what APIs and types it exposes.

run_js- Install npm dependencies and run JavaScript code inside a running sandbox container.

run_js_ephermeral – Run a JavaScript snippet in a temporary disposable container with optional npm dependencies, then automatically clean up.

sandbox_exe – Execute one or more shell commands inside a running sandbox container. Requires a sandbox initialized beforehand.

sandbox_initialize – Start a new isolated Docker container running Node.js. Used to set up a sandbox session for multiple commands and scripts.

sandbox_stop – Terminate and remove a running sandbox container. Should be called after finishing work in a sandbox initialized with sandbox_initialize.

search_npm_packages – Search for npm packages by a search term and get their name, description, and a README snippet.

Configure Context7 MCP Server

The Context7 MCP enables Claude Desktop to access the latest and up-to-date code documentation for LLMs and AI code editors. Adding Context7 MCP server is straightforward. It doesn’t require any API key. Just search for Context7 in the Catalog and get it added to the MCP server lists.

In Docker Desktop:

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Context7”

Find Context7 in the results

Click “Add MCP Server” to add without any configuration

The Context7 MCP server should now appear under “My Servers” in Docker MCP Toolkit

What you get:

2 Context7 tools including:

get-library-docs – Fetches up-to-date documentation for a library.

resolve-library-id – Resolves a package/product name to a Context7-compatible library ID and returns a list of matching libraries. 

Verify the available tools under Claude Desktop

Once you have added all the MCP servers, click “Disconnect” and “Connect” so as to see the various MCP tools under MCP_DOCKER.

That’s it. It’s time to start interacting with your MCP servers and tools.

Let’s Test it Out

Prompt 1:

I'm going to upload a screenshot of a UI component. Please execute this complete workflow using all available MCP tools:

PHASE 0: STRATEGIC PLANNING Sequential Thinking to:

1. Analyze what type of component this appears to be
2. Determine what research and documentation we'll need
3. Plan the component architecture and structure
4. Identify testing requirements and edge cases
5. Create a step-by-step implementation strategy
6. Estimate the number of tests needed for full coverage

PHASE 1: DESIGN RESEARCH

Use Firecrawl to analyze the screenshot and extract:
– Complete color palette (hex values)
– All spacing and padding measurements
– Typography specifications (font family, sizes, weights) – Layout structure (grid, flexbox patterns)
– Component boundaries and hierarchy
– Interactive elements (buttons, inputs, dropdowns)

PHASE 2: DOCUMENTATION RESEARCH

Use Context7 to fetch the latest documentation for:
– React (latest hooks patterns and best practices)
– Next.js (if applicable
– Jest (current testing patterns and matchers)
– React Testing Library (current query methods and best practices)
– Lucide React or any icon library needed
– Any other libraries identified in the planning phase

PHASE 3: CODE GENERATION

Based on the research from Phases 0-2, generate:
1. A React/Next.js component that:
– Uses exact colors and spacing from Firecrawl analysis
– Follows current best practices from Context7 docs
– Implements the architecture from Sequential Thinking plan
– Includes proper TypeScript types
– Has clean, readable, production-ready code '

2. Complete project structure:
– package.json with correct dependencies
– babel.config.js for Jest
– .gitignore
– README.md with setup instructions

PHASE 4: COMPREHENSIVE TESTING

Generate Jest + React Testing Library tests covering:
– Component rendering (all variants)
– User interactions (clicks, inputs, selections)
– State management and updates
– Edge cases (empty states, loading, errors)
– Accessibility (ARIA roles, keyboard navigation)
– Props validation and defaults

Follow current best practices from Context7 documentation. Aim for the test count estimated in Phase 0.

PHASE 5: VALIDATION & DEBUGGING

Use Node.js Sandbox to:
1. Install all dependencies in isolated container
2. Run the complete test suite
3. Capture all output and results

If ANY tests fail:
– Use Sequential Thinking to:
* Analyze the error messages
* Generate 3 hypotheses for root cause
* Rank hypotheses by likelihood
* Test hypothesis 1 with a fix
* Re-run in Node.js Sandbox
* If still failing, move to hypothesis #2
* Continue until all tests pass

If tests need clarification on errors:
– Use Context7 to look up specific error messages or API changes

Ensure that you upload the screenshot

Here’s how Claude Desktop responds.

I'll execute this comprehensive workflow step-by-step using all available MCP tools. Let me start by analyzing the uploaded screenshot and planning our approach.
PHASE 0: STRATEGIC PLANNING
Let me use Sequential Thinking to analyze this component and create a comphrenesive implementation strategy.
Perfect! I've completed the strategic planning. Now let's move to Phase 1.
PHASE 1: DESIGN RESEARCH

For Phase 1, I'll use Firecrawl to scrape the uploaded image and extract visual specifications. However, Firecrawl is designed for web scraping, not image analysis. I should use the image directly for visual analysis and describe what I see.
..

PHASE 2: DOCUMENTATION RESEARCH
Now let me fetch the latest documentation using Context7:

Now let me fetch React Testing Library documentation:…
I'll use /websites/testing-library_com-docs as it has the most code snippets (565) and comprehensive documentation.
Perfect! I have enough documentation. Now let's proceed to PHASE 3: CODE GENERATION

Perfect! Now let's move to PHASE 6: DEPLOYMENT. Let me create a GitHub repository
..

Now let's proceed with PHASE 4: COMPREHENSIVE TESTING. I'll create comprehensive test files for all components:

Now I'll create comprehensive tests for the resource library component based on the screenshot. Let me generate a complete test suite.

Excellent progress! CSS issue resolved – 35 of 41 tests now passing! Let me analyze the remaining 6 failures:

BUG IDENTIFIED: Type Filter Mismatch
Problem: Type filter keys don't match resource.type values:
Filter keys: 'Infographics', 'Videos', 'White Papers' (plural/different)
Resource types: 'Infographic', 'Video', 'White Paper' (singular/different)
Fixing the component:
SUCCESS! ALL 41 TESTS PASSING!
Test Suites: 1 passed, 1 total
Tests: 41 passed, 41 total
Time: 3.046 s
..
Due to character limits, let me push the updated files more efficiently. Since all tests pass locally, let me commit the fixes…

Within minutes, Claude will: 

Analyze the design 

Generate the component 

Write comprehensive tests 

Run them in an isolated Docker container 

Debug any failures 

Push the final code to GitHub with a live repo link

Example Result: https://github.com/ajeetraina/resources-browser-component

You should be able to verify if Node.js Sandbox container is running or not using Docker Dashboard. The logs show the MCP (Model Context Protocol) server successfully initializing inside a Docker container. The container starts a scavenger process that will automatically clean up after 1 hour of inactivity (checking every 60 seconds). It then performs a handshake announcing it supports MCP Protocol 2025-06-18 and server version 1.3.0.

The server registers 7 tools for JavaScript execution: 

sandbox_initialize (create persistent containers), 

sandbox_exec (run shell commands), 

run_js (run JS in persistent container), 

run_js_ephemeral (one-shot JS with auto-cleanup), 

sandbox_stop (cleanup), 

get_dependency_types (fetch TypeScript definitions), and 

search_npm_packages (search npm registry). 

All tools that execute JavaScript require ESModules syntax and use the ./files directory for file persistence. The server is fully operational and ready to execute code! 

Wrapping Up

You’ve just witnessed how Docker MCP Toolkit transforms Claude Desktop from a chat assistant into a genuine development partner.

What used to require hours of manual work—analyzing designs, writing code, creating tests, debugging failures, setting up repos—now happens in minutes with a single prompt.

This is the new paradigm for AI-assisted development. You’re not just using AI anymore. You’re collaborating with it in a way that’s secure, reproducible, and production-ready.

Ready to try it? Open Docker Desktop to get started with MCP Toolkit (requires v4.48 or newer to launch automatically).

Learn more

New to Docker? Download Docker Desktop today.

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers.

Get started with the MCP Toolkit: Run MCP servers easily and securely.

Read our MCP Horror Stories for real-life MCP security cases.

Quelle: https://blog.docker.com/feed/

Docker Hub Incident Report – October 20, 2025

Docker experienced significant disruptions due to a widespread outage in AWS’s US-East-1 region on October 20, 2025. Developers worldwide rely on Docker as part of their daily workflow, and we regret the disruption this caused. In this post, we want to provide transparency about what happened, what we have learned, and how we are strengthening our systems for the future.

What Happened

Beginning October 20, 2025 at 06:48 UTC, Docker Hub, Hardened Images, Scout, Build Cloud, Automated Builds, and Testcontainers Cloud experienced an increase in failure rate when AWS’s largest region, US-East-1 experienced an outage with DynamoDB, EC2, Network Load Balancer, and other AWS services. See the AWS summary of the service disruption for further details. This increasing failure rate led to a complete outage of Docker services across the aforementioned products beginning at 2025-10-20 08:01 UTC.

Starting at 2025-10-20 09:40 UTC, AWS reported progress and partial restoration that resulted in Docker services being partially restored to operation. Full restoration of Docker services was completed by 2025-10-21 09:42 UTC. The complete timeline and user impacts can be seen here.

Timeline & Impact of Events

2025-10-20 06:48 UTC

AWS DynamoDB and EC2 APIs begin failing, causing degraded performance across Docker Hub, Build Cloud, Testcontainers Cloud, and other related services.

2025-10-20 06:51 UTC

AWS STS begins failing with cascading failures across AWS services

Degradation of Docker services increases. 

Users experience widespread, increased intermittent failures across all requests

2025-10-20 08:01 UTC

All services unavailable

2025-10-20 09:21 UTC

AWS SQS recovers

AWS STS recovers

AWS EC2 still failing

Users continue experiencing high error rates across all Docker services – over 90%

2025-10-20 09:40 UTC

AWS DynamoDB recovery begins

Docker Hub recovery begins – error rate less than 20%

Docker Hardened recovery begins – error rate less than 20%

2025-10-20 12:28 UTC

AWS EC2 recovery begins with throttling in effect

Docker Scout recovery begins

Docker Offload recovery begins

Docker Build Cloud recovery begins

Docker Testcontainers Cloud recovery begins

Automated builds remain unavailable

2025-10-20 18:52 UTC

Docker Hub and Scout fully recover

Docker Build Cloud and Testcontainers Cloud seeing improvements – error rate ~50%

Automated Builds remain unavailable

2025-10-20 20:50 UTC

AWS EC2 recovers fully

Docker Build Cloud, Offload, & Testcontainers Cloud fully recover

Automated Builds remain unavailable

2025-10-21 09:42 UTC

Automated builds fully recover

All services operational

Ongoing Monitoring

All Docker systems are currently operational and we continue to monitor the status of our infrastructure. For real-time operational details, visit our status page where you can subscribe to notifications.

Resilience and Next Steps

We take Docker Hub’s reliability seriously and understand its critical role in development workflows worldwide. Among Docker’s services, Hub’s registry operations, especially image pulls, are the most heavily used and the most essential to keeping developer workflows moving.

Our first priority is ensuring Docker Hub remains available even in the event of a regional cloud failure. To that end, our immediate focus areas include:

Caching strategies: Expanding and optimizing cache layers to reduce the blast radius of upstream failures, ensuring customers can continue accessing frequently used images even during partial outages.

Multi-region resilience: Enabling regional redundancy for Docker Hub’s read operations, beginning with image pulls. This will allow pulls to continue seamlessly even if a single cloud region experiences disruption. We are also exploring approaches to extend these capabilities to write operations such as image pushes, which involve significantly more complexity across regions.

The Docker community depends on Hub’s reliability, and we take that trust seriously. We are committed to learning from this event so that Docker Hub remains a dependable foundation for developers everywhere.
Quelle: https://blog.docker.com/feed/

Why More People Are Taking Control of Their Digital Lives with Self-Hosted Alternatives

Remember when owning software meant you bought a CD, installed it, and it was yours until your computer died? Even if you got a new computer, you could install that same software on the new one. Only “serious” software packages had strict licensing restrictions.

These days, most of our tools live in the cloud, guarded by login screens and monthly invoices. That does sound more convenient—until the subscription fee increases, the features change, or the service shuts down overnight.

If you’ve ever thought, “There must be a better way”, you’re not alone. Self-hosting is on the rise, and it’s not just because people enjoy tinkering with servers in their basements (though some do). Whether you only want a way to store your family photos and documents or you want to do your own password management, there’s probably an open-source option you can host yourself.

In this article, I dig into why self-hosting is growing, the different ways to host your own services (from a Raspberry Pi to a rented server), and some beginner-friendly software to get started with. 

Something to keep in mind: self-hosting and open-source software often go hand in hand, but they’re not the same thing. Many popular self-hosted apps are open-source, but you can also self-host closed-source tools, just as there are open-source apps that only run in someone else’s cloud. In other words, self-hosting is about who controls the infrastructure, not necessarily how the software is licensed.

What Motivates People to Try Self-Hosting?

Here are some of the biggest reasons you might decide to run things yourself:

Privacy and Data Sovereignty

When you use a third-party service, your data often lives on their servers under their rules. That means trusting them to keep it secure, private, and not to sell it to advertisers or hand it over at the drop of a subpoena. With self-hosting, you decide where your data lives and who can see it. 

There’s also something inherently creepy about providers like Google Photos wanting to normalize the idea of deepfaking your family photos to create memories that never happened. These are *your* photos, not training data for Google’s AI models.

Escaping Subscription Fatigue

Five dollars here, ten dollars there… and suddenly you’re paying more for cloud services than for your actual internet connection. To make matters worse, subscription tiers are often set up so that the one feature you actually need sits behind the premium option. 

Self-hosted software is often free, or it might carry a one-time cost; you only need to provide the hardware and electricity. 

For a home user, that can lead to serious savings over time.

Avoiding Platform Lock-in

Have you ever tried to switch from one cloud service to another, only to find your data is trapped in a weird file format or missing entirely? That’s platform lock-in, and it’s as fun as it sounds. 

With self-hosted, open-source tools, you control the formats, backups, and migration process, making it easier to change things later.

When you’re in control of your data, you can move it to a new platform in an afternoon instead of spending days untangling exported files—or worse, discovering that some of your data is simply gone for good.

The Joy of Customization and Independence

Many self-hosted open-source projects let you tweak things to your liking, from the frontend color scheme to adding features no commercial platform offers. There’s something satisfying about knowing your setup runs exactly how you want it to.

Updates can still cause trouble even in self-hosted setups, but at least you decide when you want to upgrade. Unlike when OpenAI recently “upgraded” to ChatGPT 5, and users were clamoring to get ChatGPT 4 back!

It also means you decide how often you update, which features to enable, and how your data is organized. Your server won’t suddenly rearrange your photos or change its interface just because someone in a corporate boardroom decided it was time for a “fresh new look.”

A mainstream cloud service usually has to adopt a one-size-fits-all policy, but if you try self-hosting, you’ve got the time to try out a custom-fitted solution.

What Is Self-Hosting?

At its core, self-hosting simply means running software on infrastructure you control: your own hardware, a rented server, or some mix of the two. Instead of trusting a third-party provider to store your files, stream your media, or manage your notes, you’re the one who’s in charge.

There’s no one “right” way to self-host. The best option for you depends on how much control you want, how hands-on you’re willing to be, and what kind of hardware you have access to.

Running Your Own Hardware

This could be anything from a powerful (network-attached storage (NAS)) box, an old laptop you’ve repurposed, or a Raspberry Pi quietly humming in the corner. You get maximum control and keep your data entirely in your own hands, without relying on an outside company to keep it safe.

You might even be brave enough to go the route of a refurbished enterprise server for more horsepower. But be aware: those older machines can be noisy and surprisingly power-hungry. Your electricity bill might notice before you do. You’ll also most likely be running Linux as an operating system because most self-hosted projects are built and supported for Linux first. This opens up a world of possibilities—but it also comes with a bit of a learning curve if you’ve never worked with Linux before.

Of course, you’re also in control of your own backups, which can be a double-edged sword. You should be fine if you follow the 3-2-1 rule of backups, which advocates for keeping three copies of your data on at least two different mediums, with at least one copy being stored off-site.

Remember, RAID is good, but RAID is not a backup!

Using a VPS or Rented Server

A VPS, or virtual private server, is basically a small slice of someone else’s much larger computer in a data center, rented on a month-to-month basis. It can provide excellent uptime, stable internet speeds, and the ability to run almost anything you could run at home, without worrying about power cuts or router failures.

Depending on the provider and the pricing tier, there might also be some extras included: automated backups, easy scaling if you need more resources, and sometimes even built-in security tools.

The trade-off is that you’re still trusting the hosting provider, who has full control over the physical host machine your VPS runs on, which in turn gives them the ability to access or manage your virtual server if they choose. If you have particularly strong privacy concerns, then the VPS route might not be for you.

Oh, and you’ll probably still be running Linux.

The Hybrid Approach

Many people prefer to use a mix of these two approaches: maybe your photo library runs on a VPS for easy access anywhere in the world, while your media server lives at home to stream movies directly over your local network.

Hybrid setups can give you the best of both worlds in terms of convenience and control, without putting all your eggs in one basket.

Popular Self-Hosted Alternatives

Once you decide to give self-hosting a try, the next question is: what should you run first? The good news is there’s a huge range of open-source software that’s not only powerful, but also easy to set up—especially if they’re available to run as a [container](https://www.docker.com/resources/what-container/).

Containers are tightly packaged software delivery mechanisms that contain all the necessary pieces for running a particular application, like software libraries and configuration. And while you’d typically run different services (like a web server and database) as separate containers, tools like Docker Compose lets you start them all together with a single command, instead of manually configuring everything one piece at a time.

Below are some popular self-hosted options to get started with. They’re sourced from recommendations that are mentioned again and again on communities like r/selfhosted or r/homelab.

Photos

If you’ve been looking for a Google Photos replacement, Immich is worth consideration. It can automatically back up your photos and videos from your phone, offers popular Google-esque features like facial recognition, and has a slick web interface for browsing your library.

It’s open-source, and you can download the source code directly from their Github repository. They also ask that users support the project by buying a support license. The license does not change anything about the product; it’s simply a way for fans of the project to contribute to a great, open-source application.

Productivity

Everyone needs a word processor, and even though Google Docs is free to use with a Google account, it’s not really free: you’re still paying with your data and privacy.

LibreOffice is a fully-featured, open-source office productivity suite with applications for word processing, spreadsheets, and presentations, to mention a few. It is also compatible with most of the new Microsoft Office document formats like *.docx* and *.pptx*.

You can also try “Collabora Online Development Edition” (CODE for short), an online, collaborative editing platform built on top of LibreOffice. Its Docker image makes running it rather effortless.

File Storage

Seafile is a file storage application for hosting and sharing any file from your own servers. While they do have a Pro edition with more features, the completely free Community Edition can run on your own hardware.

Another option is Nextcloud, an open-source content collaboration platform with a rich app ecosystem for features like file storage, calendars, contacts, and more, all running on infrastructure you control.

Media

If you’ve ever dreamed of running your own Netflix-style media library, Jellyfin can make it happen. It streams your movies, TV shows, and music to any device you own: no ads, no data tracking, and no sudden content removals.

Supporting Tools

If you’re considering running a self-hosted setup, there are a few supporting tools you should almost always include.

One of these is Vaultwarden, a lightweight, self-hosted password manager that syncs across devices and works with the Bitwarden browser extensions and apps.

Another is Netbird. When you start hosting your own hardware and software, one of the more common challenges is securing your network access to your infrastructure. Netbird is a zero-configuration networking platform that makes setting up VPNs to your various devices a much less daunting task.

Taking Back Control of Your Digital Life

Self-hosting isn’t just for enterprise sysadmins anymore. Whether you want to protect your privacy, cut down on subscriptions, avoid lock-in, or just build something that works exactly the way you want it, there’s never been an easier time to take control of your digital life.

As you’ve seen in this article, there’s more than one way to do it: running a quiet little server in your home, renting a VPS with built-in backups, or mixing the two options in a hybrid setup. And with so many great open-source tools available—from Immich and Nextcloud to Jellyfin and Vaultwarden—you can build your own digital toolkit without sacrificing convenience or features.

Container platforms like Docker make it even easier to get started. Many self-hosted projects provide official Docker images, so you can deploy apps consistently on almost any machine. With a little time and patience, you could be running powerful, private, and subscription-free services in no time.

If you’ve ever felt like the cloud has taken a little too much control over your data (and your wallet), maybe it’s time to explore the idea of owning your own little piece of cloud.

Quelle: https://blog.docker.com/feed/

AI Guide to the Galaxy: MCP Toolkit and Gateway, Explained

This is an abridged version of the interview we had in AI Guide to the Galaxy, where host Oleg Šelajev spoke with Jim Clark, Principal Software Engineer at Docker, to unpack Docker’s MCP Toolkit and MCP Gateway.

TL;DR

What they are: The MCP Toolkit helps you discover, run, and manage MCP servers; the MCP Gateway unifies and securely exposes them to your agent clients.

Why Docker: Everything runs as containers with supply-chain checks, secret isolation, and OAuth support.

How to use: Pick servers from the MCP Catalog, start the MCP Gateway, and your client (e.g., Claude) instantly sees the tools.

First things first: if you want the official overview and how-tos, start with the Docker MCP Catalog and Toolkit.

A quick origin story (why MCP and Docker?)

Oleg: You’ve been deep in agents for a while. Where did this all start?

Jim: When tool calling arrived, we noticed something simple but powerful: tools look a lot like containers. So we wrapped tools in Docker images, gave agents controlled “hands,” and everything clicked. That was even before the Model Context Protocol (MCP) spec landed. When Anthropic published MCP, it put a name to what we were already building.

What the MCP Toolkit actually solves

Oleg: So, what problem does the Toolkit solve on day one?

Jim: Installation and orchestration. The Toolkit gives you a catalog of MCP servers (think: YouTube transcript, Brave search, Atlassian, etc.) packaged as containers and ready to run. No cloning, no environment drift. Just grab the image, start it, and go. As Docker builds these images and publishes them to Hub, you get consistency and governance on pull.

Oleg: And it presents a single, client-friendly surface?

Jim: Exactly. The Toolkit can act as an MCP server to clients, aggregating whatever servers you enable so clients can list tools in one place.

How the MCP Gateway fits in

Oleg: I see “Toolkit” inside Docker Desktop. Where does the MCP Gateway come in?

Jim: The Gateway is a core piece inside the Toolkit: a process (and open source project) that unifies which MCP servers are exposed to which clients. The CLI and UI manage both local containerized servers and trusted remote MCP servers. That way you can attach a client, run through OAuth where needed, and use those remote capabilities securely via one entry point.

Oleg: Can we see it from a client’s perspective?

Jim: Sure. Fire up the Gateway, connect Claude, run mcp list, and you’ll see the tools (e.g., Brave Web Search, Get Transcript) available to that session, backed by containers the Gateway spins up on demand.

Security: provenance, secrets, and OAuth without drama

Oleg: What hardening happens before a server runs?

Jim: On pull/run, we do provenance verification, ensuring Docker built the image, checking for an SBOM, and running supply-chain checks (via Docker Scout) so you’re not executing something tampered with.

Oleg: And credentials?

Jim: Secrets you add (say, for Atlassian) are mounted only into the target container at runtime, nothing else can see them. For remote servers, the Gateway can handle OAuth flows, acquiring or proxying tokens into the right container or request path. It’s two flavors of secret management: local injection and remote OAuth, both controlled from Docker Desktop and the CLI.

Profiles, filtering, and “just the tools I want”

Oleg: If I have 30 servers, can I scope what a given client sees?

Jim: Yes. Choose the servers per Gateway run, then filter tools, prompts, and resources so the client only gets the subset you want. Treat it like “profiles” you can version alongside your code; compose files and config make it repeatable for teams. You can even run multiple gateways for different configurations (e.g., “chess tools” vs. “cloud ops tools”).

From local dev to production (and back again)

Oleg: How do I move from tinkering to something durable?

Jim: Keep it Compose-first. The Gateway and servers are defined as services in your compose files, so your agent stack is reproducible. From there, push to cloud: partners like Google Cloud Run already support one-command deploys from Compose, with Azure integrations in progress. Start locally, then graduate to remote runs seamlessly.

Oleg: And choosing models?

Jim: Experiment locally, swap models as needed, and wire in the MCP tools that fit your agent’s job. The pattern is the same: pick models, pick tools, compose them, and ship.

Getting started with MCP Gateway (in minutes)

Oleg: Summarize the path for me.

Jim:

Pick servers from the catalog in Docker Desktop (or CLI).

Start the MCP Gateway and connect your client.

Add secrets or flow through OAuth as needed.

Filter tools into a profile.

Capture it in Compose and scale out.

Why the MCP Toolkit and Gateway improve team workflows

Fast onboarding: No glue code or conflicting envs, servers come containerized.

Security built-in: Supply-chain checks and scoped secret access reduce risk.

One workflow: Local debug, Compose config, cloud deploys. Same primitives, fewer rewrites.

Try it out

Spin up your first profile and point your favorite client at the Gateway. When you’re ready to expand your agent stack, explore tooling like Docker Desktop for local iteration and Docker Offload for on-demand cloud resources — then keep everything declarative with Compose.

Ready to build? Explore the Docker MCP Catalog and Toolkit to get started.

Learn More

Watch the rest of the AI Guide to the Galaxy series

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers

Open Docker Desktop and get started with the MCP Toolkit (Requires version 4.48 or newer to launch the MCP Toolkit automatically)

Check out our latest guide on how to setup Claude Code with Docker’s MCP Toolkit

Quelle: https://blog.docker.com/feed/

Your Org, Your Tools: Building a Custom MCP Catalog

I’m Mike Coleman, a staff solutions architect at Docker. In this role, I spend a lot of time talking to enterprise customers about AI adoption. One thing I hear over and over again is that these companies want to ensure appropriate guardrails are in place when it comes to deploying AI tooling. 

For instance, many organizations want tighter control over which tools developers and AI assistants can access via Docker’s Model Context Protocol (MCP) tooling. Some have strict security policies that prohibit pulling images directly from Docker Hub. Others simply want to offer a curated set of trusted MCP servers to their teams or customers.

In this post, we walk through how to build your own MCP catalog. You’ll see how to:

Fork Docker’s official MCP catalog

Host MCP server images in your own container registry

Publish a private catalog

Use MCP Gateway to expose those servers to clients

Whether you’re pulling existing MCP servers from Docker’s MCP Catalog or building your own, you’ll end up with a clean, controlled MCP environment that fits your organization.

Introducing Docker’s MCP Tooling

Docker’s MCP ecosystem has three core pieces:

MCP Catalog

A YAML-based index of MCP server definitions. These describe how to run each server and what metadata (description, image, repo) is associated with it. The MCP Catalog hosts over 220+ containerized MCP servers, ready to run with just a click. 

The official docker-mcp catalog is read-only. But you can fork it, export it, or build your own.

MCP Gateway

The MCP Gateway connects your clients to your MCP servers. It doesn’t “host” anything — the servers are just regular Docker containers. But it provides a single connection point to expose multiple servers from a catalog over HTTP SSE or STDIO.

Traditionally, with X servers and Y clients, you needed X * Y configuration entries. MCP Gateway reduces that to just Y entries (one per client). Servers are managed behind the scenes based on your selected catalog.

You can start the gateway using a specific catalog:

docker mcp gateway run –catalog my-private-catalog

MCP Gateway is open source: https://github.com/docker/mcp-gateway

Figure 1: The MCP Gateway provides a single connection point to expose multiple MCP servers

MCP Toolkit (GUI)

Built into Docker Desktop, the MCP Toolkit provides a graphical way to work with the MCP Catalog and MCP Gateway. This allows you to:

Access to Docker’s MCP Catalog via a rich GUI

Secure handling of secrets (like GitHub tokens)

Easily enable MCP servers

Connect your selected MCP servers with one click to a variety of clients like Claude code, Claude Desktop, Codex, Cursor, Continue.dev, and Gemini CLI

Workflow Overview

The workflow below will show you the steps necessary to create and use a custom MCP catalog. 

The basic steps are:

Export the official MCP Catalog to inspect its contents

Fork the Catalog so you can edit it

Create your own private catalog

Add specific server entries

Pull (or rebuild) images and push them to your registry

Update your catalog to use your images

Run the MCP Gateway using your catalog

Connect clients to it

Step-by-Step Guide: Creating and Using a Custom MCP Catalog

We start by setting a few environment variables to make this process repeatable and easy to modify later.For the purpose of this example, assume we are migrating an existing MCP server (DuckDuckGo) to a private registry (ghcr.io/mikegcoleman). You can also add your own custom MCP server images into the catalog, and we mention that below as well. 

export MCP_SERVER_NAME="duckduckgo"
export GHCR_REGISTRY="ghcr.io"
export GHCR_ORG="mikegcoleman"
export GHCR_IMAGE="${GHCR_REGISTRY}/${GHCR_ORG}/${MCP_SERVER_NAME}:latest"
export FORK_CATALOG="my-fork"
export PRIVATE_CATALOG="my-private-catalog"
export FORK_EXPORT="./my-fork.yaml"
export OFFICIAL_DUMP="./docker-mcp.yaml"
export MCP_HOME="${HOME}/.docker/mcp"
export MCP_CATALOG_FILE="${MCP_HOME}/catalogs/${PRIVATE_CATALOG}.yaml"

Step 1: Export the official MCP Catalog 

Exporting the official Docker MCPCatalog gives you a readable local YAML file listing all servers. This makes it easy to inspect metadata like images, descriptions, and repository sources outside the CLI.

docker mcp catalog show docker-mcp –format yaml > "${OFFICIAL_DUMP}"

Step 2: Fork the official MCP Catalog

Forking the official catalog creates a copy you can modify. Since the built-in Docker catalog is read-only, this fork acts as your editable version.

docker mcp catalog fork docker-mcp "${FORK_CATALOG}"
docker mcp catalog ls

Step 3: Create a new catalog

Now create a brand-new catalog that will hold only the servers you explicitly want to support. This ensures your organization runs a clean, controlled catalog that you fully own.

docker mcp catalog create "${PRIVATE_CATALOG}"

Step 4: Add specific server entries

Export your forked catalog to a file so you can copy over just the entries you want. Here we’ll take only the duckduckgo server and add it to your private catalog.

docker mcp catalog export "${FORK_CATALOG}" "${FORK_EXPORT}"
docker mcp catalog add "${PRIVATE_CATALOG}" "${MCP_SERVER_NAME}" "${FORK_EXPORT}"

Step 5: Pull (or rebuild) images and push them to your registry

At this point you have two options:

If you are able to pull from Docker Hub, find the image key for the server you’re interested in by looking at the YAML file you exported earlier. Then pull that image down to your local machine. After you’ve pulled it down, retag it for whatever repository it is you want to use. 

Example for duckduckgo

vi "${OFFICIAL_DUMP}" # look for the duckduck go entry and find the image: key which will look like this:
# image: mcp/duckduckgo@sha256:68eb20db6109f5c312a695fc5ec3386ad15d93ffb765a0b4eb1baf4328dec14f

# pull the image to your machine
docker pull
mcp/duckduckgo@sha256:68eb20db6109f5c312a695fc5ec3386ad15d93ffb765a0b4eb1baf4328dec14f

# tag the image with the appropriate registry
docker image tag mcp/duckduckgo@sha256:68eb20db6109f5c312a695fc5ec3386ad15d93ffb765a0b4eb1baf4328dec14f ${GHCR_IMAGE}

# push the image
docker push ${GHCR_IMAGE}

At this point you can move on to editing the MCP Catalog file in the next section.

 If you cannot download from Docker Hub you can always rebuild the MCP server from its GitHub repo. To do this, open the exported YAML and look for your target server’s GitHub source repository. You can use tools like vi, cat, or grep to find it — it’s usually listed under a source key. 

Example for duckduckgo:
source: https://github.com/nickclyde/duckduckgo-mcp-server/tree/main

export SOURCE_REPO="https://github.com/nickclyde/duckduckgo-mcp-server.git"

Next, you’ll rebuild the MCP server image from the original GitHub repository and push it to your own registry. This gives you full control over the image and eliminates dependency on Docker Hub access.

echo "${GH_PAT}" | docker login "${GHCR_REGISTRY}" -u "${GHCR_ORG}" –password-stdin

docker buildx build
–platform linux/amd64,linux/arm64
"${SOURCE_REPO}"
-t "${GHCR_IMAGE}"
–push

Step 6: Update your catalog 

After publishing the image to GHCR, update your private catalog so it points to that new image instead of the Docker Hub version. This step links your catalog entry directly to the image you just built.

vi "${MCP_CATALOG_FILE}"

# Update the image line for the duckduckgo server to point to the image you created in the previous step (e.g. ghcr.io/mikegcoleman/duckduckgo-mcp)

Remove the forked version of the catalog as you no longer need it

docker mcp catalog rm "${FORK_CATALOG}"

Step 7: Run the MCP Gateway 

Enabling the server activates it within your MCP environment. Once enabled, the gateway can load it and make it available to connected clients. You will get warnings about “overlapping servers” that is because the same servers are listed in two places (your catalog and the original catalog)

docker mcp server enable "${MCP_SERVER_NAME}"
docker mcp server list

Step 8: Connect to popular clients 

Now integrate the MCP Gateway with your chosen client. The raw command to run the gateway is: 

docker mcp gateway run –catalog "${PRIVATE_CATALOG}"

But that just runs an instance on your local machine, when what you probably want is to integrate with some client application. 

To do this you need to format the raw command so that it works for the client you wish to use. For example, with VS Code you’d want to update the mcp.json as follows:

"servers": {
"docker-mcp-gateway-private": {
"type": "stdio",
"command": "docker",
"args": [
"mcp",
"gateway",
"run",
"–catalog",
"my-private-catalog"
]
}
}

Finally, verify that the gateway is using your new GHCR image and that the server is properly enabled. This quick check confirms everything is configured as expected before connecting clients.

docker mcp server inspect "${MCP_SERVER_NAME}" | grep -E 'name|image'

Summary of Key Commands

You might find the following CLI commands handy:

docker mcp catalog show docker-mcp –format yaml > ./docker-mcp.yaml
docker mcp catalog fork docker-mcp my-fork
docker mcp catalog export my-fork ./my-fork.yaml
docker mcp catalog create my-private-catalog
docker mcp catalog add my-private-catalog duckduckgo ./my-fork.yaml
docker buildx build –platform linux/amd64,linux/arm64 https://github.com/nickclyde/duckduckgo-mcp-server.git
-t ghcr.io/mikegcoleman/duckduckgo:latest –push
docker mcp server enable duckduckgo
docker mcp gateway run –catalog my-private-catalog

Conclusion

By using Docker’s MCP Toolkit, Catalog, and Gateway, you can fully control the tools available to your developers, customers, or AI agents. No more one-off setups, scattered images, or cross-client connection headaches.

Your next steps:

Add more servers to your catalog

Set up CI to rebuild and publish new server images

Share your catalog internally or with customers

Docs:

https://docs.docker.com/ai/mcp-catalog-and-toolkit/

https://github.com/docker/mcp-gateway/

Happy curating. 

We’re working on some exciting enhancements to make creating custom catalogs even easier. Stay tuned for updates!

Learn more

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers

Open Docker Desktop and get started with the MCP Toolkit (Requires version 4.48 or newer to launch the MCP Toolkit automatically)

Read about How Open Source Genius Cut Entropy Debt with Docker MCP Toolkit and Claude Desktop

Quelle: https://blog.docker.com/feed/

Docker + E2B: Building the Future of Trusted AI

Trusted Software Starts Here

The era of agents is here. Some teams are experimenting, others are just getting started, and a few are already running agents in production. But one challenge stands out: trust. Trust that your agents will act securely. Over 20 million developers already rely on Docker to build and ship software safely and fast. Now, we’re helping you build what’s next.

Over the past few months, Docker has continued to make significant investments in driving developer productivity and building a trusted foundation for AI, with a focus on simplicity and portability, from how you build and run agents to how you secure them. We’ve shipped new capabilities: a toolkit for MCP, support for running open-weight models locally, and a catalog of 200+ MCP servers. 

Today, we’re taking that commitment to the next level through a new partnership with E2B, a company that provides secure cloud sandboxes for AI agents. Together, we’re giving developers fast, secure access to hundreds of real-world tools, without sacrificing safety or speed.

The Next Frontier of AI: Trust

Agents can code, connect, and act, but without control, that power creates risk in two areas.

First, when agents run code. AI-generated code often executes in live environments without safeguards, accessing files or APIs it shouldn’t. Teams are learning that “auto-run” doesn’t mean “safe to run.”

Second, when agents connect to real-world tools. Integrating with external tools can create security trade-offs. And the challenge keeps growing. With the rise of the Model Context Protocol (MCP), every connected tool becomes its own potential risk surface with hundreds of servers and tokens to manage. Today, developers use a variety of DIY isolation techniques, custom-built sandboxes, and resource limits, but all of them tend to slow productivity.

Developers have seen this before. Before containers, developers struggled with messy dependencies and inconsistent environments when building applications. 

Today’s AI builders face a similar challenge. Models need access to APIs and databases, but enabling that access slows them down and introduces complexity and risk. Just as containers standardized how applications run, Docker is now standardizing how agents connect and act.

This is where E2B and Docker come together. E2B secures agent-generated code execution with cloud sandboxes. Docker secures tool access through the MCP Gateway and Catalog, offering direct connectivity to 200+ real-world tools. Each MCP server is curated by Docker and automatically audited for exploits and malicious behavior.

Docker + E2B: Secure Access to Hundreds of MCP Tools

Starting today, every E2B sandbox includes direct access to Docker’s MCP Catalog, a collection of 200+ tools such as GitHub, Perplexity, Browserbase, and ElevenLabs, all enabled by the Docker MCP Gateway. Developers can now run and connect agents more confidently, without slowing down.

Try it here.

What’s Next 

This is only the beginning. Today’s partnership is the first step in a larger journey. Together, our goal is to build the future of a secure AI stack, where every agent runs securely, every connection is verifiable, and developers can move fast without compromise.

From Docker’s perspective, our goal is to ensure developers can trust the MCP servers they pull, validate them through Docker, and govern their AI stack with the same visibility and confidence they already have for containers today.

The next wave of AI development will be built on trust. Trust in the tools, the data, and the environments behind them. At Docker, we’re building that foundation.

Quelle: https://blog.docker.com/feed/