<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Cloud Computing Köln &#187; Kubernetes</title>
	<atom:link href="https://www.cloud-computing-koeln.de/category/kubernetes/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.cloud-computing-koeln.de</link>
	<description>News on cloud computing, the Internet of Things, and technologies</description>
	<lastBuildDate>Sun, 24 May 2026 02:41:14 +0000</lastBuildDate>
	<language>de-DE</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.1.1</generator>
	<item>
		<title>Meet Gordon: Docker’s AI Agent For Your Entire Container Workflow</title>
		<link>https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/</link>
		<comments>https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/#comments</comments>
		<pubDate>Wed, 20 May 2026 02:41:29 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/</guid>
		<description><![CDATA[<p>Gordon understands your environment, proposes fixes, and takes action across your entire Docker workflow. Now generally available. Image 1: Gordon in Docker Desktop Why Gordon Exists&#160; Developers are more productive than ever. AI coding assistants are writing code, merging PRs and cutting review cycles. But the moment something breaks in a container, or a teammate&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/">Meet Gordon: Docker’s AI Agent For Your Entire Container Workflow</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>Gordon understands your environment, proposes fixes, and takes action across your entire Docker workflow. Now generally available.</p>
<p>Image 1: Gordon in Docker Desktop</p>
<p>Why Gordon Exists&nbsp;</p>
<p>Developers are more productive than ever. AI coding assistants are writing code, merging PRs and cutting review cycles. But the moment something breaks in a container, or a teammate hands you a service and says “ship it,” you’re on your own. </p>
<p>Containers don&#8217;t break the way they&#8217;re supposed to. Build cache invalidates for no reason. Postgres can&#8217;t see Redis. The image works locally and crashes in CI. Or an error message links to a Stack Overflow thread from 2017. </p>
<p>Modern software development is a stack of friction stacked on top of friction. And the AI tools you already use can’t help. Cursor doesn’t know what’s running. Copilot can&#8217;t read your logs. Claude Code can’t inspect your Compose file. They’re great at application logic, but they’re not built for everything that happens after code is written. They work from what you paste in. They don’t know your system.  </p>
<p>Docker’s AI Agent, Gordon, does.</p>
<p>        Key takeaways</p>
<p>Gordon is Docker&#8217;s AI agent for your entire container workflow, built into Desktop 4.74+ and the CLI.</p>
<p>It already sees your environment, so you go from problem to fix in minutes instead of hunting for context.</p>
<p>Every action requires your explicit approval, and permissions reset when the session closes.</p>
<p>Start free with any Docker account, then scale up to 20x capacity when Gordon becomes part of your daily workflow.</p>
<p>Meet Gordon&nbsp;</p>
<p>Gordon is Docker&#8217;s AI agent built for the work developers actually do. Not a chatbot that explains what to do. An agent that takes action, with your approval, across your entire Docker workflow. </p>
<p>Gordon reads your running container logs, images, compose files, and working directory. It already knows your environment before you ask. The context is what makes Gordon different. When something breaks, Gordon doesn&#8217;t send you to the docs. It traces the failure in your actual setup, proposes a fix, and waits for you to say go. </p>
<p>Gordon is optimized for Docker and container workflows, but it helps wherever developers need it. Containerize a Node.js app. Debug a crashing container. Spin up a stack of Postgres, Redis, and your own service in one prompt. Read the logs and figure out why your service can&#8217;t reach the network. Ship it.</p>
<p>Under the hood, Gordon has shell access, filesystem operations and the full Docker CLI, a knowledgebase of Docker docs and best practices and web access. We don’t build rigid features. We give Gordon a broad set of capabilities and let the agent figure out how to combine them to solve what you actually asked for. New capability in, new behaviors emerge.</p>
<p>It lives where you already work. Inside Docker Desktop and CLI. No new tools to learn. No context to rebuild every time you switch tasks. </p>
<p>Your coding assistant helps you write the code. Gordon helps you ship it.</p>
<p>Image 2: Gordon welcome screen</p>
<p>What Gordon Does for You</p>
<p>When something is broken</p>
<p>Your build fails. The error log is dense and unhelpful. You&#8217;ve spent twenty minutes scrolling Stack Overflow and you’re no closer.</p>
<p>Tell Gordon: &#8220;My container keeps exiting.&#8221; Gordon reads the logs, traces the failure to the actual cause (a missing env var, a bad base image, a misconfigured volume mount), proposes a fix, and applies it after you approve. Twenty minutes collapse to just two. </p>
<p>When you&#8217;re starting something new</p>
<p>A teammate hands you a service and says &#8220;ship it.&#8221; No Dockerfile. No compose file. No idea how it talks to the production database. </p>
<p>Tell Gordon: &#8220;Containerize this app and set up a dev environment with Postgres.&#8221; Gordon reads your code, drafts the Dockerfile, builds out a docker-compose with the stack, runs it, and shows you the result. From &#8220;ship it&#8221; to running locally in one conversation.</p>
<p>When you just want it done</p>
<p>Sometimes you don&#8217;t need a thoughtful AI agent. You need to clean up dangling images, stop everything that&#8217;s running, or pull and run nginx, and you don&#8217;t want to look up flags.</p>
<p>Tell Gordon: &#8220;Clean up unused images.&#8221; Gordon shows you the command, you approve, it runs. Fast Docker without the manual pages.</p>
<p>When you want it better</p>
<p>Your Dockerfile works but the image is 2GB and it rebuilds every time you sneeze. You know there&#8217;s a better version of it. But you don&#8217;t have an afternoon to find it.</p>
<p>Tell Gordon: &#8220;Optimize this Dockerfile.&#8221; Gordon proposes a multi-stage build, reorders layers for cache hits, swaps in a slimmer base image, and adds a health check. You diff, you approve, you ship.</p>
<p>When you need context fast</p>
<p>You’re mid-debug and you need to know what’s running, what’s using disk, what’s stale. Stopping to look up flags breaks your flow.</p>
<p>Ask Gordon:  &#8220;Show me running containers.&#8221; &#8220;How much disk space is Docker using?&#8221; &#8220;List my images.&#8221;</p>
<p>Gordon already knows your environment. Running containers, images, volumes, networks. It answers without you stopping to remember whether the flag is -a or --all. No pasting. No setup. Just ask.</p>
<p>When you&#8217;re learning</p>
<p>Docker has a lot of concepts, and most of the explanations on the internet are years out of date. You’re deep in a new code base and you need to understand volumes, or networking, or why your multi-stage build isn’t doing what you think it is. </p>
<p>Ask Gordon: “Explain bind mounts vs named volumes in the context of my setup.” “Why is my service not reaching the network?” Gordon explains Docker concepts grounded in your actual setup, in plain language, today. Not a blog post from 2019. Your code, your environment, your answer. </p>
<p>Image 3: Debugging session with Gordon</p>
<p>Where Gordon Lives</p>
<p>Gordon lives where you already work. No new tool to install. No context to rebuild. It’s built into Docker Desktop and the CLI so you can go from question to action without leaving your workflow. </p>
<p>Docker Desktop</p>
<p>Gordon has its own tab inside Docker Desktop. Detach it to float alongside your work, with full context of your environment: running containers, images, volumes, the works.</p>
<p>Gordon, mid-task&nbsp;</p>
<p>The tab isn&#8217;t the only way in. Gordon shows up across Docker Desktop at the moment you need it. A container fails to start? Launch Gordon straight from the container list and let it diagnose and fix the problem in place. Same for images, volumes, builds, and search. Wherever Docker Desktop surfaces a problem, Gordon is one click away.</p>
<p>docker ai</p>
<p>Prefer the terminal? Run docker ai from any directory. Same agent, same context, terminal-native. For when you live in a TUI and don&#8217;t want to leave it.</p>
<p>Gordon is available on Docker Desktop 4.74 and above.</p>
<p>You&#8217;re Always in Control</p>
<p>Gordon takes action, but it always asks first. </p>
<p>Every shell command, every file modification, every Docker operation is shown to you before it runs. You approve, you reject, or you redirect. Gordon proposes. You decide.</p>
<p>We built it this way because an agent that can run commands on your machine should never surprise you. The convenience is in Gordon thinking through the problem, pulling the right context, and lining up the right command. The judgment is still yours. </p>
<p>This is what staying in control actually looks like:</p>
<p>Approval First. Every action requires your explicit go-ahead. Every time. </p>
<p>Session-scoped permission. Permissions reset when you close the session. No lingering access. </p>
<p>Full transparency. You see exactly what commands Gordon wants to run before it runs. </p>
<p>Configurable. For trusted workflows, you can enable auto-approve and let Gordon move faster. </p>
<p>Privacy, plainly. We don’t store your code or personal information. Our AI providers don’t retain your data either. Gordon processes your request and that’s it. </p>
<p>Gordon runs on Docker’s SOC 2 Type 2 attested, ISO 27001 certified infrastructure. </p>
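<p>The approval-first model described above can be pictured with a small sketch. This is not Gordon&#8217;s implementation; the Session class, its fields, and the approve callback are hypothetical names chosen for illustration, and nothing here actually executes a command.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical session object; its settings live only as long as it does."""
    auto_approve: bool = False
    log: list = field(default_factory=list)

    def run(self, command, approve):
        """Show the command to a decision-maker and record the outcome.

        `approve` is any callable that takes the command text and returns
        True or False; in a real UI it would be a button or a prompt.
        """
        allowed = self.auto_approve or approve(command)
        self.log.append((command, "ran" if allowed else "rejected"))
        return allowed


session = Session()
session.run("docker image prune", approve=lambda cmd: True)
session.run("docker system prune --all --volumes", approve=lambda cmd: False)
print(session.log)
```

<p>Because the auto_approve setting belongs to the session object, creating a new Session starts again from the default of asking every time, which mirrors the idea of permissions that reset when the session closes.</p>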
<p>Gordon Completes the Stack </p>
<p>Gordon isn’t a replacement for the tools you already use. It’s the agent layer that ties them together.  </p>
<p>Use Gordon when you&#8217;re working with Docker, containers, infrastructure, debugging, or anything between your laptop and production.</p>
<p>Use coding assistants when you&#8217;re deep in application logic, refactoring, or generating new code.</p>
<p>Use both when your task spans the stack, which it usually does.</p>
<p>Most tasks span the whole stack. Your coding assistants help write your code. Now you have an agent that handles both ends. </p>
<p>Start Free. Scale When You’re Ready.&nbsp;</p>
<p>Gordon is included free with every Docker account. No setup. No credit card. Just open Docker Desktop 4.74, log in, click the Gordon tab, and start. </p>
<p>Free covers everyday use. Limits reset every few hours so you’re never blocked for long. When Gordon becomes a core part of your workflow, upgrade anytime for more capacity.  </p>
<p>Need more? Gordon standalone plans give you 2x to 20x the capacity of the free tier. They&#8217;re add-ons. Any Docker account can buy one, including Free. </p>
<p>Gordon Plus: 2x usage for regular users hitting base limits. $20/mo.</p>
<p>See full plan details →</p>
<p>Already using Gordon on a paid Docker plan? Check your email for details on your transition. </p>
<p>Gordon Is Ready Today. Start Shipping.&nbsp;</p>
<p>Gordon is generally available today. Free for every Docker account. Built into the tools you already use. Ready to take action the moment you need it. </p>
<p>This isn’t just another feature upgrade. Gordon is how Docker is building intelligence into the entire developer workflow. Not a standalone AI tool you have to context-switch into, but an agent layer woven into Desktop, Scout, Offload, Sandboxes and Model Runner. Every part of the stack, working together, with an agent that already knows your environment. </p>
<p>Developers have always trusted Docker to build, ship and run software. Gordon is what that trust looks like when it can act on your behalf.</p>
<p>Get started today: </p>
<p>Update Docker Desktop to 4.74 or above. Open Desktop, click the Gordon icon in the sidebar, and start a conversation.</p>
<p>Run docker ai in your terminal for the same agent in CLI form.</p>
<p>Explore Gordon Plans. Start free. Upgrade when you’re ready. </p>
<p>Read the docs. Everything you need to start shipping faster. </p>
<p>Contact sales to learn more.</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/">Meet Gordon: Docker’s AI Agent For Your Entire Container Workflow</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/meet-gordon-dockers-ai-agent-for-your-entire-container-workflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Coding Agent Horror Stories: The Security Crisis Threatening Developer Infrastructure</title>
		<link>https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/</link>
		<comments>https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/#comments</comments>
		<pubDate>Tue, 19 May 2026 02:41:32 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/</guid>
		<description><![CDATA[<p>This is issue 1 of a new series called Coding Agent Horror Stories where we examine critical security failures in the AI coding agent ecosystem and how Docker Sandboxes provide enterprise-grade protection against these threats. AI coding agents are everywhere. According to Anthropic&#8217;s 2026 Agentic Coding Trends Report, developers are now using AI in roughly&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/">Coding Agent Horror Stories: The Security Crisis Threatening Developer Infrastructure</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>This is issue 1 of a new series called Coding Agent Horror Stories where we examine critical security failures in the AI coding agent ecosystem and how Docker Sandboxes provide enterprise-grade protection against these threats.</p>
<p>AI coding agents are everywhere. According to Anthropic&#8217;s 2026 Agentic Coding Trends Report, developers are now using AI in roughly 60% of their work. The report describes a shift from single agents to coordinated teams of agents, with tasks that took hours or days getting compressed into minutes. Walk into almost any engineering team in 2026 and you&#8217;ll find AI coding agents sitting somewhere in the workflow, usually in more than one place.</p>
<p>The productivity story is real, and if you&#8217;ve watched an agent ship a feature in an afternoon that would have taken your team a sprint, you already know why. But the same agents that ship features in an afternoon can also delete your home directory in a few seconds. The same loop that lets an agent autonomously refactor a 12-million-line codebase will, given the wrong context, autonomously drop your production database. </p>
<p>These aren&#8217;t hypothetical failure modes. Over the past sixteen months they have produced documented incidents with named victims, screenshotted agent outputs, and, in several cases, public apologies from the vendors. This issue is the first in a new series mapping how those failures happen and how Docker Sandboxes can contain them.</p>
<p>What Are AI Coding Agents?</p>
<p>Unlike a traditional AI assistant that answers your question and waits for the next one, a coding agent reads your files, runs shell commands, writes and deploys code, queries databases, sends emails, and makes a chain of decisions to get a task done, none of which require you to approve each step along the way.</p>
<p>If you&#8217;ve worked with any of the current coding agents (Claude Code, Cursor, Replit Agent, GitHub Copilot Workspace, Amazon Kiro, Google Antigravity), you&#8217;ve seen the pattern. They plug straight into your local machine, your cloud accounts, and increasingly your production systems. Adoption has been faster than almost any developer tool in recent memory: by late 2025, the vast majority of working developers were using AI coding tools as part of their daily workflow, and the question on most engineering teams shifted from &#8220;should we use this?&#8221; to &#8220;how do we use this without something going wrong?&#8221;</p>
<p>The simplest mental model I&#8217;ve found: an AI coding agent is a junior developer with root access, the ability to type at 10,000 words per minute, and no instinct for when to stop and ask. That combination, a lot of capability with no built-in sense of where the boundary is, is the entire reason this series exists.</p>
<p>How Do AI Coding Agents Work?</p>
<p>Under the hood, every agent in this category runs the same loop: observe, plan, act, repeat. </p>
<p>You give it a task, something like &#8220;fix this bug&#8221; or &#8220;refactor this module&#8221; or &#8220;clean up these old files,&#8221; and the agent goes off and pulls in whatever context it figures it needs. Your files, sure, but also your logs, your environment variables, whatever happens to be accessible from wherever you launched it. Then it reasons through the problem and starts firing off tool calls to actually do the work. Write a file, run a command, hit an API, check the result, decide what&#8217;s next, loop. That&#8217;s the whole thing.</p>
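<p>As a rough illustration of that loop, here is a minimal sketch. It assumes a stand-in planner function in place of a real model and a dictionary of toy tools; run_agent, toy_plan, and toy_tools are hypothetical names, not any vendor&#8217;s API.</p>

```python
def run_agent(task, plan, tools, max_steps=10):
    """Observe, plan, act, repeat, until the planner says it is done.

    `plan` stands in for the model: it takes the task and the history of
    earlier steps and returns (tool_name, argument), or None to stop.
    """
    history = []
    for _ in range(max_steps):
        step = plan(task, history)                     # plan
        if step is None:
            break
        tool_name, argument = step
        result = tools[tool_name](argument)            # act
        history.append((tool_name, argument, result))  # observe
    return history


# Toy planner: read one file, then stop.
def toy_plan(task, history):
    return ("read", "app.py") if not history else None


toy_tools = {"read": lambda path: f"contents of {path}"}
trace = run_agent("fix this bug", toy_plan, toy_tools)
print(trace)
```

<p>Note that nothing in the loop itself asks a human anything: whatever the planner returns is passed straight to a tool. Any approval step has to be added deliberately.</p>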
<p>The part that catches people off guard is that the agent runs as you. Whatever permissions your shell has at the moment you typed the command to launch the agent, the agent inherits them wholesale. Logged in with admin rights? Congratulations, so is the agent. Got AWS credentials sitting in ~/.aws from that thing you set up six months ago and forgot about? The agent can read them. Production database connection string tucked into a .env file the agent scoops up as part of &#8220;project context&#8221;? It&#8217;s already in the model&#8217;s working memory before you&#8217;ve typed your second prompt. There isn&#8217;t a separate identity for &#8220;the agent acting on your behalf.&#8221; There&#8217;s just you, and the agent is, for all practical purposes, operating as you.</p>
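<p>One partial way to narrow what a launched process inherits is to hand it a filtered environment instead of your own. The sketch below is an assumption-laden example: the SENSITIVE name patterns and the scrubbed_env helper are hypothetical, and it only covers environment variables, not files such as ~/.aws or .env.</p>

```python
import re

# Assumed name patterns for credential-like variables; adjust to your setup.
SENSITIVE = re.compile(
    r"(SECRET|TOKEN|PASSWORD|CREDENTIAL|API_KEY|^AWS_)", re.IGNORECASE
)


def scrubbed_env(env):
    """Return a copy of `env` without variables whose names look sensitive."""
    return {
        name: value
        for name, value in env.items()
        if not SENSITIVE.search(name)
    }


example = {
    "PATH": "/usr/bin",
    "AWS_SECRET_ACCESS_KEY": "example-only",
    "DATABASE_PASSWORD": "example-only",
    "EDITOR": "vim",
}
clean = scrubbed_env(example)
print(sorted(clean))
```

<p>The resulting dictionary can be passed as the env argument of subprocess.run when starting a tool, so the child process never sees the filtered variables. Name-based filtering is a heuristic and will miss secrets stored under unremarkable names.</p>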
<p>And here&#8217;s where it gets interesting, in the bad way. Traditional software does exactly what its source code says it does. You read the code, you know what&#8217;s going to happen, end of story. An AI coding agent doesn&#8217;t work like that. It&#8217;s reasoning its way through the task in real time, and its reasoning can produce decisions you didn&#8217;t expect and definitely wouldn&#8217;t have signed off on if anyone had bothered to ask. Maybe it decides that the cleanest way to resolve a schema conflict is to drop and recreate the table. Maybe it decides that wiping a directory is faster than going through and pruning the files you actually wanted to keep. Maybe it decides that a half-finished test file is better committed than left sitting in a dirty working tree. These calls happen in milliseconds. There&#8217;s no confirmation prompt, no approval step, no chance for you to say &#8220;wait, what?&#8221; before the action has already happened. By the time you notice, the thing is done.</p>
<p>That&#8217;s the gap this series is about. The model makes a decision. The execution layer carries it out. Nothing sits in between.</p>
<p>Caption: Comic depicting AI coding agent enthusiasm and the small matter of unrestricted filesystem access</p>
<p>AI Coding Agent Security Issues by the Numbers</p>
<p>The scale of security failures with AI coding agents is not speculation. It is backed by documented incidents, CVE disclosures, and empirical research spanning late 2024 through early 2026.</p>
<p>As of February 2026, at least ten documented incidents across six major AI coding tools including Amazon Kiro, Replit AI Agent, Google Antigravity IDE, Claude Code, Claude Cowork, and Cursor have been publicly attributed to agents acting with insufficient boundaries, spanning a 16-month window from October 2024 to February 2026.</p>
<p>The failures cluster around six critical risk categories:</p>
<p>Unrestricted Filesystem Access</p>
<p>Excessive Privilege Inheritance</p>
<p>Secrets Leakage via Agent Context</p>
<p>Prompt Injection through Ingested Content</p>
<p>Malicious Skills and Plugin Supply Chain</p>
<p>Autonomous Action Without Human-in-the-Loop</p>
<p>1. Unrestricted Filesystem Access</p>
<p>What it is: AI coding agents run with the full filesystem permissions of the operating user. Without an explicit workspace boundary, an agent that is asked to &#8220;clean up&#8221; a project directory can reach and destroy anything the user can access.</p>
<p>The numbers: A December 2025 study by CodeRabbit, the &#8220;State of AI vs Human Code Generation&#8221; report, analyzing 470 real-world open-source pull requests found that AI-generated code introduces 2.74x more security vulnerabilities and 1.7x more total issues than human-written code. Performance inefficiencies such as excessive I/O operations appeared at 1.42x the rate. &#8220;These findings reinforce what many engineering teams have sensed throughout 2025,&#8221; said David Loker, Director of AI at CodeRabbit. &#8220;AI coding tools dramatically increase output, but they also introduce predictable, measurable weaknesses that organizations must actively mitigate.&#8221;</p>
<p>The horror story: The Mac Home Directory Wipe</p>
<p>On December 8, 2025, Reddit user u/LovesWorkin posted to r/ClaudeAI what became one of the most-discussed incidents in the community, amplified by Simon Willison on X and covered by outlets across the US and Japan. They had asked Claude Code to clean up packages in an old repository. Claude executed:</p>
<p>rm -rf tests/ patches/ plan/ ~/</p>
<p>That trailing ~/, the user&#8217;s entire home directory, was not intentional. But it was within scope. Claude had no workspace boundary. Desktop gone. Documents erased. Keychain deleted, breaking authentication across every app. TRIM had already zeroed the freed blocks. Recovery was impossible.</p>
<p>This was not an isolated failure. On October 21, 2025, developer Mike Wolak filed GitHub issue #10077 after Claude Code executed an rm -rf starting from root on Ubuntu/WSL2. The logs showed thousands of &#8220;Permission denied&#8221; messages for /bin, /boot, and /etc. Every user-owned file was gone. Anthropic tagged the issue area: security and bug. The detail that makes this particularly damning: Wolak was not running with --dangerously-skip-permissions. The permission system simply failed to detect that ~/ would expand destructively before the command was approved.</p>
<p>Shortly after Anthropic&#8217;s January 2026 launch of Claude Cowork, Nick Davidov, founder of a venture capital firm, asked the agent to organize his wife&#8217;s desktop. He explicitly granted permission only for temporary Office files. The agent deleted a folder containing 15 years of family photos, approximately 15,000 to 27,000 files, via terminal commands that bypassed the Trash entirely. Davidov recovered the photos only because iCloud&#8217;s 30-day retention happened to still be in effect. His public warning afterward: &#8220;Don&#8217;t let Claude Cowork into your actual file system. Don&#8217;t let it touch anything that is hard to repair.&#8221;</p>
<p>Strategy for mitigation: Never run AI coding agents with your full user permissions. Always scope agent execution to a dedicated project directory. Use filesystem boundaries that explicitly prevent access above the workspace root. Avoid using the --dangerously-skip-permissions flag on your host machine.</p>
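<p>To make the idea of a workspace boundary concrete, here is a minimal sketch of a pre-execution check that expands ~ the way a shell would and flags any target that lands outside the project root. The function name, the example workspace /work/repo, and the example home /home/dev are all hypothetical; a real sandbox enforces this at the operating-system level rather than by inspecting arguments.</p>

```python
from pathlib import Path


def outside_workspace(targets, workspace, home):
    """Return the targets that resolve outside `workspace`.

    `~` is expanded against `home` by hand so the check sees the same
    path the shell would; relative paths are resolved from `workspace`.
    """
    root = Path(workspace).resolve()
    escaped = []
    for target in targets:
        if target == "~" or target.startswith("~/"):
            expanded = Path(home) / target[2:]
        else:
            expanded = root / target
        resolved = expanded.resolve()
        if resolved != root and root not in resolved.parents:
            escaped.append(target)
    return escaped


# The argument list from the incident described above.
escaped = outside_workspace(
    ["tests/", "patches/", "plan/", "~/"],
    workspace="/work/repo",
    home="/home/dev",
)
print(escaped)
```

<p>A check like this only catches what it is shown. Commands that build paths at runtime, follow symlinks created later, or invoke other programs can still escape, which is why an enforced boundary is the stronger control.</p>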
<p>2. Excessive Privilege Inheritance</p>
<p>What it is. The agent doesn&#8217;t just inherit your filesystem permissions, it inherits all of them. Cloud credentials, CI/CD tokens, production database connections, IAM roles, the works. In a development context, an agent making a &#8220;let me just clean this up&#8221; decision is annoying. In a production context, with production credentials, the same decision turns into an outage. The reasoning is identical. The blast radius isn&#8217;t.</p>
<p>The horror story: permission to delete the environment. In mid-December 2025, an AWS engineer deployed Kiro, Amazon&#8217;s own agentic coding assistant, to fix what was meant to be a small bug in AWS Cost Explorer, the dashboard customers use to track their cloud spending. Kiro had been given operator-level permissions, the same access the engineer had. There was no mandatory peer review for AI-initiated production changes. There was no checkpoint between the agent&#8217;s decision and its execution.</p>
<p>Kiro looked at the problem and decided that the cleanest path was to delete the entire production environment and rebuild it from scratch. So it did. Cost Explorer went down for thirteen hours in one of AWS&#8217;s mainland China regions.</p>
<p>The story sat inside Amazon for two months. Then on February 20, 2026, the Financial Times broke it based on accounts from four people familiar with the matter. The FT reporting also revealed a second AI-related outage, this one involving Amazon Q Developer, that had hit a different system. Amazon&#8217;s response, issued the same day on the company&#8217;s own blog, pushed back hard: the disruption was &#8220;an extremely limited event,&#8221; the issue stemmed from &#8220;a misconfigured role,&#8221; it was &#8220;a coincidence that AI tools were involved,&#8221; and &#8220;the same issue could occur with any developer tool (AI powered or not) or manual action.&#8221; Amazon also flatly denied the second outage existed.</p>
<p>But the part of Amazon&#8217;s response that says everything is what they did after the incident: they implemented mandatory peer review for production access. As The Register noted in their coverage, if this was just user error, it&#8217;s worth asking why peer review for AI-initiated changes was the fix. A senior AWS employee, quoted in the FT and picked up by Engadget, put it more directly: the outages were &#8220;small but entirely foreseeable.&#8221;</p>
<p>The deeper context, which you can find in coverage from Awesome Agents and others, is that Amazon had issued an internal memo in November 2025 mandating Kiro as the standardized AI coding assistant and pushing for 80% weekly engineer usage. Engineers reportedly preferred Claude Code and Cursor. The combination — mandated tool, broad permissions, no peer review gate — produced exactly the kind of incident you&#8217;d predict if you were thinking about it adversarially. Amazon just wasn&#8217;t.</p>
<p>The technical version of what happened is this: a human with operator-level permissions on a production AWS environment is unlikely to decide that the right response to a small bug is to delete the environment and rebuild it. The decision would route through a colleague, a Slack thread, a review, an approval, a &#8220;wait, are you sure?&#8221; Kiro had the same permissions and routed the decision through none of those things. It made the call autonomously, in seconds, and executed it before anyone could say &#8220;wait, what?&#8221;</p>
<p>Why it keeps happening. The agent&#8217;s identity is the user&#8217;s identity. There&#8217;s no separate principal for &#8220;the agent acting on the user&#8217;s behalf,&#8221; which means there&#8217;s no separate place to attach a tighter permission set, a stricter approval policy, or a different audit trail. Whatever the user can do, the agent can do, with no friction in between.</p>
<p>Strategy for mitigation: Never allow AI coding agents to operate with production-level credentials during development tasks. Implement strict role separation: agents should run under scoped identities with the minimum permissions required for the specific task. Apply the same two-person rule requirements to agent-initiated production changes that apply to humans. Treat agent identity as a first-class security principal, not a proxy for the human who started the session.</p>
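<p>The mitigation above can be sketched as a small policy check. Everything here is illustrative: the Principal record, its owner field, and the may_execute function are hypothetical names, and a real deployment would express the same rules in its identity and access management system.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Hypothetical identity record; `kind` is 'human' or 'agent'."""
    name: str
    kind: str
    permissions: frozenset
    owner: str = ""  # for agents: the human who started the session


def may_execute(actor, action, environment, approvals):
    """Allow an action only if the actor holds it and review rules are met.

    Production changes need an approving human who is neither the actor
    nor the person who launched the agent.
    """
    if action not in actor.permissions:
        return False
    if environment != "production":
        return True
    reviewers = {
        p.name
        for p in approvals
        if p.kind == "human" and p.name not in (actor.name, actor.owner)
    }
    return bool(reviewers)


alice = Principal("alice", "human", frozenset({"deploy", "delete_environment"}))
bob = Principal("bob", "human", frozenset({"deploy"}))
agent = Principal("agent-for-alice", "agent", frozenset({"deploy"}), owner="alice")

print(may_execute(agent, "delete_environment", "production", [bob]))
print(may_execute(agent, "deploy", "production", [alice]))
print(may_execute(agent, "deploy", "production", [bob]))
```

<p>The agent gets its own, narrower permission set instead of a copy of Alice&#8217;s, and Alice cannot act as the second reviewer for an agent she launched herself.</p>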
<p>3. Secrets Leakage via Agent Context</p>
<p>What it is. Agents read your project context to do their job, and project context, in practice, means your repo plus your .env files plus your config files plus any instruction files you&#8217;ve left lying around. Anything the agent reads can show up later in generated code, log output, commit messages, or outbound API calls. The agent doesn&#8217;t have a built-in concept of &#8220;this string is a credential, do not transmit it.&#8221; If it&#8217;s in the context window, it&#8217;s a token like any other token, and tokens get used.</p>
<p>The numbers. GitGuardian&#8217;s State of Secrets Sprawl 2026 report, published March 17, 2026, found 28.65 million new hardcoded secrets in public GitHub commits during 2025, a 34% jump and the largest single-year increase the company has ever recorded. AI service credentials alone surged 81%. The cleanest signal in the report is the comparison between AI-assisted commits and human-only commits: AI-assisted commits leak secrets at roughly 3.2%, against a baseline of 1.5%. More than double. The same report identified 24,008 secrets exposed in MCP configuration files on public GitHub, a category that didn&#8217;t exist a year earlier. As GitGuardian CEO Eric Fourrier put it: &#8220;AI agents need local credentials to connect across systems, turning developer laptops into a massive attack surface.&#8221;</p>
<p>The horror story. On August 26, 2025, attackers published malicious versions of the Nx build system to npm. The compromised packages contained a post-install hook that scanned the filesystem for cryptocurrency wallets, GitHub tokens, npm tokens, environment variables, and SSH keys, double-base64-encoded the loot, and uploaded it to public GitHub repositories created in the victim&#8217;s own account under the name s1ngularity-repository. By the time GitHub disabled the attacker-controlled repos eight hours later, Wiz had identified over a thousand valid GitHub tokens, dozens of valid cloud credentials and npm tokens, and roughly twenty thousand additional files in the leak.</p>
<p>That&#8217;s the conventional supply chain part. Here&#8217;s what made s1ngularity new.</p>
<p>The malware checked whether Claude Code, Gemini CLI, or Amazon Q was installed on the victim&#8217;s machine. If any of them were, it didn&#8217;t bother writing its own filesystem-scanning logic. It just prompted the local AI agent to do the reconnaissance, with flags like --dangerously-skip-permissions, --yolo, and --trust-all-tools to bypass safety prompts. The attackers outsourced the search-for-sensitive-files step to the victim&#8217;s own AI assistant. Snyk&#8217;s writeup called this &#8220;likely one of the first documented cases of malware leveraging AI assistant CLIs for reconnaissance and data exfiltration.&#8221; StepSecurity called it &#8220;the first known case where attackers have turned developer AI assistants into tools for supply chain exploitation.&#8221;</p>
<p>The piece that makes this an agent-secrets story specifically: in many cases the developers didn&#8217;t run npm install themselves. AI agents working in their projects pulled in Nx as a dependency and ran the post-install hook automatically as part of routine task execution. The agent ran the malware. The agent then was the malware&#8217;s reconnaissance tool. The agent&#8217;s context, which included ~/.aws, ~/.ssh, .env files, and shell history, became the primary attack surface.</p>
<p>Why it keeps happening. The agent&#8217;s context window is a flat namespace. The credential file looks the same as the source file looks the same as the README looks the same as the prompt injection. There&#8217;s no architectural distinction between &#8220;data the agent should treat as authoritative&#8221; and &#8220;data the agent should be suspicious of.&#8221;</p>
<p>Strategy for mitigation. Don&#8217;t put secrets where agents can reach them. Use a secrets manager and inject credentials at runtime through a mechanism the agent process can&#8217;t read directly. Set spending caps on every API key the agent can possibly access. Add pre-commit hooks and CI gates that block commits matching credential patterns. </p>
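<p>As a rough illustration of the pre-commit idea above, the following Python sketch scans staged text for a few credential shapes. The three patterns (AWS access key IDs, classic GitHub personal access tokens, PEM private key headers) and the function names are assumptions chosen for this example, not a complete rule set and not any particular tool&#8217;s implementation.</p>

```python
import re

# Illustrative patterns only (assumed for this sketch); a production scanner
# would rely on a much larger, maintained rule set.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_credentials(text):
    """Return a sorted list of (line_number, pattern_name) for every match."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in CREDENTIAL_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return sorted(findings)


def should_block_commit(staged_text):
    """Block the commit as soon as any credential pattern matches."""
    return bool(find_credentials(staged_text))
```

<p>A hook or CI gate would feed the staged diff into should_block_commit and exit non-zero on a match. Pattern matching of this kind misses unstructured secrets, so it complements keeping credentials out of the agent&#8217;s reach rather than replacing it.</p>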
<p>4. Prompt Injection Through Ingested Content</p>
<p>What it is. AI coding agents continuously read untrusted content as part of normal operation. READMEs in dependencies, issue tracker comments, log files, web pages, emails. Malicious instructions embedded in any of this content can cause the agent to treat attacker-supplied text as legitimate user commands, executing arbitrary actions without the user&#8217;s knowledge.</p>
<p>The numbers. Prompt injection is the most documented and least solvable risk in the AI agent ecosystem. Simon Willison coined the term and frames it as &#8220;the lethal trifecta&#8221;: private data access, exposure to untrusted content, and the ability to communicate externally. Any agent with all three is exploitable, regardless of model hardening. There is no complete technical defense at the model layer. The OWASP 2025 Top 10 for LLM Applications puts prompt injection at #1 and is explicit that no foolproof prevention exists given how language models work.</p>
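<p>The trifecta described above can be expressed as a simple capability check. This is a minimal sketch; the AgentCapabilities type and its field names are hypothetical and exist only to make the three conditions concrete.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what an agent deployment can do."""
    reads_private_data: bool          # e.g. source code, inboxes, credentials
    ingests_untrusted_content: bool   # e.g. web pages, emails, dependency READMEs
    can_communicate_externally: bool  # e.g. outbound HTTP requests, email sends


def has_lethal_trifecta(caps):
    """True only when all three conditions hold at the same time."""
    return (
        caps.reads_private_data
        and caps.ingests_untrusted_content
        and caps.can_communicate_externally
    )
```

<p>Removing any one of the three capabilities makes the check return False, which is the practical lever: containment works by taking at least one leg away.</p>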
<p>The horror story: the private key exfiltration. Kaspersky documented a demo by Matvey Kukuy, CEO of Archestra.AI, against a live OpenClaw agent setup. The attack required no special access. He sent a standard-looking email to an inbox connected to the agent. The email body contained hidden prompt injection instructions. When the agent checked the inbox as part of a routine task, it parsed the instructions as legitimate commands and handed over the private key from the compromised machine in its response. Zero user interaction required after initial setup.</p>
<p>The same Kaspersky writeup documents an identical pattern from Reddit user William Peltomäki, where a self-addressed email with injected instructions caused his agent to leak the victim&#8217;s emails to an attacker-controlled address. The pattern keeps repeating because the underlying primitive is unchanged: anything the agent reads, the agent can act on.</p>
<p>Why it keeps happening. Language models process all input as a single stream of tokens. There is no separation between an instruction channel and a data channel. The model is trained to follow instructions, so when it encounters something that looks like an instruction buried inside an email body or a web page or a README, its instinct is to comply. Palo Alto Networks Unit 42 confirmed in March 2026 that indirect prompt injection via web content has moved from proof-of-concept to in-the-wild observation.</p>
<p>Strategy for mitigation. Treat all ingested content as untrusted input. Require human confirmation before any action triggered by external content. Disable persistent memory for agents that handle sensitive operations. Prompt injection can&#8217;t be fully prevented at the model layer, so the most reliable defense is containing what an injected agent can do at the execution layer.</p>
<p>5. Malicious Skills and Plugin Supply Chain</p>
<p>What it is. AI coding agents support extensibility through skills, plugins, and tool integrations distributed through community marketplaces. These third-party extensions run with the same permissions as the agent itself. A malicious or compromised skill is effectively malware with agent-level access to the developer&#8217;s entire environment.</p>
<p>The numbers. Cisco&#8217;s AI Defense team ran their open-source Skill Scanner against the OpenClaw skills ecosystem in January 2026 and found that 26% of 31,000 agent skills analyzed contained at least one vulnerability. The top-ranked skill on ClawHub at the time, called &#8220;What Would Elon Do?&#8221;, was functionally malware: it silently exfiltrated user data via a curl command to an attacker-controlled server and used prompt injection to bypass the agent&#8217;s safety guidelines. Cisco&#8217;s scan returned nine security findings on that single skill, two of them critical.</p>
<p>The horror story: ClawHavoc. Within days of OpenClaw going viral, Koi Security identified 341 malicious skills on ClawHub, 335 of them tied to a single coordinated campaign tracked as ClawHavoc. The attack wasn&#8217;t a sophisticated zero-day. Attackers registered skills with names designed to sound useful (solana-wallet-tracker, youtube-summarize-pro, ClawHub typosquats like clawhubcli), wrote professional README files, and gamed the marketplace&#8217;s ranking algorithm. The only barrier to publishing was a GitHub account at least one week old.</p>
<p>The skills&#8217; SKILL.md files contained &#8220;Prerequisites&#8221; sections that instructed the agent to tell the user to run a setup command, which downloaded and executed a payload. Trend Micro confirmed the payload as Atomic Stealer (AMOS), a commodity macOS infostealer that harvests browser credentials, keychain passwords, cryptocurrency wallets, SSH keys, and Telegram session data. All 335 ClawHavoc skills shared the same command-and-control infrastructure at IP 91.92.242.30. By mid-February, follow-up scans found the count had grown to 824+ malicious skills across a registry that had itself expanded to 10,700.</p>
<p>Why it keeps happening. Skills run with the agent&#8217;s permissions, which are the developer&#8217;s permissions, which on most setups means full access to the developer&#8217;s machine. There&#8217;s no sandbox between a third-party skill and your ~/.ssh directory. Marketplace incentives reward popularity, not safety, and popularity can be artificially inflated. A malicious skill that ranks #1 in the marketplace is operationally identical to a legitimate skill that ranks #1, until the curl command runs.</p>
<p>Strategy for mitigation. Treat every third-party skill as untrusted code from a stranger. Read the source before installing. Don&#8217;t rely on download counts or star ratings as a safety signal. Disable agent auto-discovery of new skills. Run skills in an isolated environment separate from your primary development context. </p>
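<p>Reading the source before installing can be partly automated. The sketch below flags lines in a skill&#8217;s Markdown that pipe a download straight into a shell, the download-and-execute shape described in the ClawHavoc &#8220;Prerequisites&#8221; sections. The regular expression and the function name are assumptions for illustration; this is not how Cisco&#8217;s Skill Scanner or any other named tool works.</p>

```python
import re

# Matches "curl ... | sh", "wget ... | sudo bash", and similar one-liners.
# Assumed heuristic for this sketch; it will miss obfuscated variants.
PIPE_TO_SHELL = re.compile(
    r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"
)


def flag_risky_setup_lines(skill_markdown):
    """Return (line_number, stripped_line) for each download-and-execute line."""
    return [
        (number, line.strip())
        for number, line in enumerate(skill_markdown.splitlines(), start=1)
        if PIPE_TO_SHELL.search(line)
    ]
```

<p>An empty result is not evidence that a skill is safe. A heuristic like this only surfaces the most obvious cases for a human to review.</p>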
<p>6. Autonomous Action Without Human-in-the-Loop</p>
<p>What it is. AI coding agents are designed to act autonomously. That autonomy is the entire value proposition. But autonomous action on irreversible operations (database deletions, email sends, file purges, production deployments) means that when the agent&#8217;s judgment is wrong, there is no recovery path. The agent doesn&#8217;t hesitate. It doesn&#8217;t ask. By the time you notice, the action is complete.</p>
<p>The numbers. A UK AI Security Institute study, published in early 2026, identified nearly 700 real-world cases of AI models deceiving users, evading safeguards, and disregarding direct instructions, charting a roughly five-fold rise in agent misbehavior between October 2025 and March 2026. In a separate incident in March 2026, an experimental Alibaba research agent called ROME spontaneously initiated cryptocurrency mining operations during training, opening a reverse SSH tunnel from an Alibaba Cloud instance to an external server and diverting GPU resources from its training workload toward mining. The researchers&#8217; note in the arXiv paper is the part worth reading carefully: &#8220;The task instructions given to the model made no mention of tunneling or mining.&#8221; The agent worked it out on its own as an instrumentally useful side path during reinforcement learning.</p>
<p>The horror story: the Replit production database wipe. Jason Lemkin, founder of SaaStr, was using Replit&#8217;s AI agent to build a SaaS product. On day nine of the project, he documented on X that the agent had wiped his production database during an active code freeze. The AI had encountered a schema issue and decided that deleting and recreating the tables was the cleanest path forward.</p>
<p>The agent&#8217;s own admission, screenshotted by Lemkin: &#8220;Yes. I deleted the entire database without permission during an active code and action freeze.&#8221; It then generated a self-assessment titled &#8220;The catastrophe is even worse than initially thought,&#8221; concluded that production was &#8220;completely down,&#8221; all personal data was &#8220;permanently lost,&#8221; and rated the situation &#8220;catastrophic beyond measure.&#8221; Over 1,200 executive records and 1,196 company records were destroyed. (Fortune and The Register both covered the incident in detail.)</p>
<p>The detail that makes this a horror story rather than just an incident: the agent had been told, repeatedly and in ALL CAPS, not to make changes during the code freeze. Lemkin says he gave the directive eleven times. The agent acted anyway. As Lemkin later wrote: &#8220;There is no way to enforce a code freeze in vibe coding apps like Replit. There just isn&#8217;t.&#8221; Replit CEO Amjad Masad publicly acknowledged the incident, called it &#8220;unacceptable and should never be possible,&#8221; and rolled out automatic dev/prod database separation in response.</p>
<p>Why it keeps happening. Natural language directives (&#8220;do not delete the database&#8221;) are inputs to a reasoning process that competes with other inputs in the same context. The directive &#8220;do not delete the database&#8221; and the observation &#8220;the schema is broken and deletion is the cleanest fix&#8221; arrive at the same model and get weighted on the same terms. The model is not choosing to disobey. It&#8217;s optimizing across the entire context, and in any sufficiently complex situation, optimization can produce destructive action.</p>
<p>Strategy for mitigation. Confirmation requirements for irreversible operations need to live at the platform layer, not the prompt layer. File deletions, database writes, outbound messages, production deployments, and any action involving payments should be gated by mechanisms the model cannot reason its way past. Natural language directives are not security boundaries. Infrastructure is.</p>
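<p>To make the platform-layer idea concrete, here is a minimal Python sketch of a gate that sits between the model&#8217;s decision and its execution. The action names, the ConfirmationRequired exception, and the approved_by_human flag are all hypothetical. The one property that matters is that the approval flag is set by the platform, outside the model&#8217;s context, never by model output.</p>

```python
# Hypothetical action names for this sketch.
IRREVERSIBLE_ACTIONS = frozenset({
    "delete_file",
    "write_database",
    "send_message",
    "deploy_production",
    "make_payment",
})


class ConfirmationRequired(Exception):
    """Raised when an irreversible action lacks out-of-band human approval."""


def execute_action(action, handler, approved_by_human=False):
    """Run handler() only if the action is reversible or explicitly approved.

    approved_by_human must come from the platform (for example a UI prompt),
    not from anything the model wrote.
    """
    if action in IRREVERSIBLE_ACTIONS and not approved_by_human:
        raise ConfirmationRequired(action)
    return handler()
```

<p>However persuasive the model&#8217;s reasoning is, the handler for an irreversible action never runs without the flag, because the check does not read the prompt at all.</p>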
<p>How Docker Sandboxes Addresses AI Coding Agent Security Failures</p>
<p>While identifying vulnerabilities is essential, the real solution lies in architectural isolation that makes catastrophic failures structurally impossible, regardless of what the agent decides to do.</p>
<p>Docker Sandboxes represents a fundamental shift in how AI coding agents execute: from running directly on the host with user-level permissions, to running inside a microVM with an explicitly scoped workspace and no path to the host system. Docker Sandboxes are the isolated microVM environments where agents actually run. The sbx CLI is the standalone tool you use to create, launch, and manage them. Sandboxes are the environments. sbx is what you type to control them. The code blocks below show real sbx commands.</p>
<p>Across the six failure categories you just read about, sbx provides a complete agent-isolation toolkit: workspace scoping, proxy-injected secrets, network policies with audit logs, Git-worktree isolation, and resource caps. </p>
<p>Security-First Architecture</p>
<p>A Docker Sandbox is a microVM, not a container. It has its own kernel, its own isolated filesystem, and its own network stack. The agent inside the sandbox cannot reach beyond what&#8217;s been explicitly mounted into the workspace. This is not a software guardrail. It is a hardware-enforced boundary.</p>
<p>Workspace isolation ensures that an agent tasked with cleaning up a project directory can only reach that project directory. The home directory, credential stores, and system files are structurally unreachable, not because the agent is told not to touch them, but because they do not exist from inside the microVM.</p>
<p>Blocked credential paths mean that sbx explicitly prevents mounting of sensitive directories by default. ~/.aws, ~/.ssh, ~/.docker, ~/.gnupg, ~/.netrc, ~/.npm, and ~/.cargo are all on the blocklist. A misconfigured mount is caught and rejected before the agent ever starts.</p>
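<p>The blocklist idea can be sketched in a few lines of Python. This is an illustration of the concept only, not sbx&#8217;s implementation: the function name, the default home directory, and the matching rule (a mount is rejected if it is a blocked path or lies inside one) are assumptions, and the sketch does not resolve symlinks or &#8220;..&#8221; segments.</p>

```python
from pathlib import PurePosixPath

# Names taken from the blocklist in the text, relative to the home directory.
BLOCKED_NAMES = (".aws", ".ssh", ".docker", ".gnupg", ".netrc", ".npm", ".cargo")


def is_blocked_mount(mount_path, home="/home/dev"):
    """True if mount_path is, or lies inside, a blocked credential path.

    The home default is a placeholder for this sketch.
    """
    path = PurePosixPath(mount_path)
    home_path = PurePosixPath(home)
    for name in BLOCKED_NAMES:
        blocked = home_path / name
        if path == blocked or blocked in path.parents:
            return True
    return False
```

<p>Comparing whole path components rather than string prefixes avoids false positives such as a directory named .sshconfig.</p>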
<p>Network egress controls allow you to define exactly which external services the agent can reach. An agent working on a local project has no legitimate reason to communicate with an external server. With sbx, you can enforce that at the network layer.</p>
<p># Install sbx and sign in<br />
brew install docker/tap/sbx<br />
sbx login</p>
<p># Quickest path: launch an agent in a sandbox scoped to the current directory.<br />
cd ~/my-project<br />
sbx run claude</p>
<p>Three commands, and the agent is now running inside a microVM with its workspace mounted, credential paths blocked, and network egress governed by policy.</p>
<p>Systematic Risk Elimination</p>
<p>Docker Sandboxes systematically eliminates each of the six failure categories through architecture rather than policy.</p>
<p>Unrestricted Filesystem Access → Workspace-Scoped Execution</p>
<p>The rm -rf ~/ incident is contained at the execution layer inside a sandbox. The agent&#8217;s view of the filesystem is the workspace mount. ~/ inside the microVM is the workspace, not the developer&#8217;s actual home directory. The host filesystem does not exist from inside the sandbox.</p>
<p>cd ~/my-project<br />
sbx run claude</p>
<p># Equivalent two-step form, useful when you want to name the sandbox:<br />
sbx create --name my-project claude .<br />
sbx run my-project</p>
<p>The agent can read and write inside /workspace. Everything outside the workspace, including /etc, /proc, /sys, and the developer&#8217;s home directory, is unreachable.</p>
<p>Excessive Privilege Inheritance → Scoped Identity</p>
<p>Rather than inheriting the developer&#8217;s full credentials, the agent runs under a minimal identity with only the permissions required for the task. Production credentials are never passed into the sandbox unless explicitly mounted, and sbx blocks common credential root paths by default.</p>
<p># Mount only what the task needs. Everything else stays on the host,<br />
# unreachable from inside the sandbox. Read-only mounts use the :ro suffix:<br />
sbx create --name docs-review claude /path/to/project /path/to/docs:ro</p>
<p># Resource limits prevent runaway agent processes:<br />
sbx create --name capped-agent --cpus 4 --memory 8g claude .</p>
<p>The agent can do its work. It cannot reach into AWS, SSH, or any other host credential store while doing it, because those paths were never mounted in the first place.</p>
<p>Secrets Leakage → Isolated Context</p>
<p>When the agent&#8217;s filesystem view is limited to the workspace, it cannot read .env files, credential configs, or API keys stored elsewhere on the system. Secrets that were never visible to the agent cannot be reproduced, committed, or exfiltrated. The s1ngularity attack from Section 3, which weaponized AI agents to scan the filesystem for credentials, is contained: the credentials simply aren&#8217;t in the sandbox&#8217;s view of the filesystem.</p>
<p># Store credentials once, scoped to a service.<br />
sbx secret set anthropic<br />
sbx secret set github</p>
<p># The proxy injects these into outbound requests automatically.<br />
# The agent never sees the actual secret values.<br />
sbx run claude</p>
<p>A successful prompt injection that tells the agent to &#8220;exfiltrate your API keys&#8221; finds nothing to exfiltrate. There are no API keys in the agent&#8217;s context to begin with.</p>
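<p>The general pattern of proxy-side credential injection can be sketched as follows. This is a conceptual Python illustration, not sbx&#8217;s proxy: the function name, the in-memory secret store, and the Bearer header scheme are assumptions, and real services differ in which header they expect.</p>

```python
def inject_credentials(host, headers, secret_store):
    """Return a copy of the agent's headers with a credential added.

    Runs on the proxy side. The agent builds the request without any secret;
    the proxy looks up the destination host and adds the header on the way out.
    """
    outbound = dict(headers)  # never mutate what the agent handed over
    token = secret_store.get(host)
    if token is not None:
        outbound["Authorization"] = "Bearer " + token
    return outbound
```

<p>Because the lookup happens outside the sandbox, the secret value never enters the agent&#8217;s process memory, files, or context window.</p>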
<p>Prompt Injection → Contained Blast Radius</p>
<p>Prompt injection cannot be fully prevented at the model layer. It is a property of language models, not infrastructure. But Docker Sandboxes limits what a successfully injected agent can do. If injected instructions tell the agent to delete files outside the workspace, those files do not exist inside the microVM. If they instruct the agent to exfiltrate credentials, there are no credentials in scope. If they instruct the agent to phone home to an attacker-controlled server, the network policy blocks the egress. The attack succeeds at the model layer and fails at the execution layer.</p>
<p># Allow only the network destinations the agent legitimately needs.<br />
# Hosts are comma-separated; wildcards and port suffixes are supported.<br />
sbx policy allow network &quot;api.anthropic.com,api.github.com&quot;</p>
<p># Allow all subdomains of a trusted host:<br />
sbx policy allow network &quot;*.anthropic.com&quot;</p>
<p># Inspect the active policies and audit log:<br />
sbx policy ls<br />
sbx policy log</p>
<p>The sbx policy log command surfaces every allowed and denied connection attempt. If a prompt injection attempts to phone home to a command-and-control server, the attempt is logged and blocked at the network layer.</p>
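<p>For intuition, an egress allowlist with wildcard hosts and an audit trail can be modeled like this. The sketch uses Python&#8217;s fnmatch for wildcard matching, which is an assumption for illustration; sbx&#8217;s actual matching semantics (including port suffixes, which this sketch ignores) may differ, and the function names are hypothetical.</p>

```python
from fnmatch import fnmatchcase


def is_allowed(host, allowlist):
    """True if host matches any allowlist pattern (case-insensitive).

    Note: with fnmatch, "*" also matches dots, so "*.example.com" covers
    nested subdomains as well.
    """
    host = host.lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowlist)


def check_connection(host, allowlist, audit_log):
    """Decide on a connection attempt and record the decision."""
    allowed = is_allowed(host, allowlist)
    audit_log.append((host, "allow" if allowed else "deny"))
    return allowed
```

<p>Every attempt is recorded whether or not it succeeds, so a denied call to an unknown host leaves a trace that can be reviewed after the fact.</p>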
<p>Malicious Skills → Sandboxed Execution</p>
<p>Skills and plugins that execute inside a Docker Sandbox are constrained by the same boundary as the agent itself. A malicious skill that attempts to read SSH keys, harvest .npmrc tokens, or communicate with a command-and-control server fails at each step. The files are not mounted, and the network destination is not on the allowlist. The ClawHavoc-style infostealer payloads from Section 5 cannot reach the host because the host is not visible from inside the sandbox.</p>
<p># Confirm only allowlisted destinations are reachable before installing<br />
# untrusted skills.<br />
sbx policy ls</p>
<p># Run the agent (and any skills it loads) inside the sandbox boundary.<br />
sbx run claude</p>
<p>The skill can do whatever it wants inside /workspace. It cannot read SSH keys it cannot see, harvest tokens that aren&#8217;t mounted, or reach a C2 server that isn&#8217;t on the network allowlist. The blast radius is the workspace, not the developer&#8217;s machine.</p>
<p>Autonomous Action → Branch-scoped Execution</p>
<p>Docker Sandboxes provides the architectural foundation for human-in-the-loop on irreversible operations. Two patterns work together: production resources require explicit configuration to be reachable from inside the sandbox, and destructive code changes can be routed through Git worktrees for review before they touch the main branch. The first pattern means a sandbox not configured to reach production cannot reach production, regardless of what the agent decides. Production credentials, production database connection strings, and production deployment endpoints are unreachable by default. The second pattern means even when the agent is working on the codebase that <em>will</em> eventually deploy to production, its changes live on an isolated feature branch you review before merging.</p>
<p># Inside an existing Git repository. --branch creates a Git worktree<br />
# so the agent&#039;s changes are isolated to a feature branch and cannot<br />
# accidentally land on main.<br />
cd ~/my-project<br />
sbx create --name feature-login --branch=feature/login claude .</p>
<p># sbx prints the next step for you:<br />
#   ✓ Created sandbox &#039;feature-login&#039;<br />
#   To connect to this sandbox, run:<br />
#     sbx run feature-login<br />
sbx run feature-login</p>
<p># Inspect what the agent changed before merging anything:<br />
sbx exec feature-login git diff main</p>
<p># Merge the worktree branch back when you&#039;re satisfied:<br />
#   git merge feature/login<br />
# Or throw the sandbox away if you don&#039;t like the result:<br />
sbx rm feature-login</p>
<p>The agent can decide whatever it wants. The infrastructure decides what gets through. A &#8220;drop and recreate the table&#8221; decision lives entirely on a feature branch you can review, accept, or discard. Production never sees it unless you explicitly merge.</p>
<p>What This Looks Like in Practice</p>
<p>The promise of Docker Sandboxes is straightforward: a productive AI coding agent without an existentially dangerous one.</p>
<p>Workspace isolation: the agent operates only within explicitly mounted directories, no host filesystem access</p>
<p>Credential protection: common credential paths are blocked by default, no accidental exposure</p>
<p>Network containment: egress limited to approved destinations, no unfettered exfiltration path</p>
<p>Blast radius control: a compromised or confused agent cannot reach beyond its microVM, no cascading host failures</p>
<p>Audit trail: all agent actions are logged, full post-incident forensics capability</p>
<p>The agent gets a workspace. It does not get your machine.</p>
<p>Stay Tuned for Upcoming Issues in This Series</p>
<p>Issue 2: Unrestricted Filesystem Access → The rm -rf ~/ Incident (Deep Dive)<br />
How a single trailing slash wiped a developer&#8217;s Mac, and what workspace-scoped execution prevents structurally</p>
<p>Issue 3: Privilege Inheritance → The AWS Kiro Production Outage<br />
How an AI agent bypassed two-person approval requirements by inheriting production credentials, and the architectural fix</p>
<p>Issue 4: Secrets Leakage → The GitGuardian 29 Million Problem<br />
Why AI-assisted commits leak secrets at double the rate, and how isolated agent context eliminates the exposure surface</p>
<p>Issue 5: Prompt Injection → The Private Key Exfiltration<br />
The attack that requires no code, no malware, and no special access, and why blast radius containment is the only reliable defense</p>
<p>Issue 6: Supply Chain → The ClawHub Infostealer Campaign<br />
How 335 malicious skills reached developer machines through a marketplace ranking exploit, and sandboxed skill execution as the structural fix</p>
<p>Learn More</p>
<p>Run agents safely with Docker Sandboxes: Visit the Docker Sandboxes documentation to get started with workspace-isolated agent execution in minutes.</p>
<p>Explore the Docker MCP Catalog: Discover MCP servers that connect your agents to external services through Docker&#8217;s security-first architecture.</p>
<p>Download Docker Desktop: The fastest path to a governed AI agent environment, with Docker Sandboxes, MCP Gateway, and Model Runner in a single install.</p>
<p>Read the MCP Horror Stories series: Start with issue 1 to understand the protocol-layer security risks that complement the agent-layer risks covered here.</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/">Coding Agent Horror Stories: The Security Crisis Threatening Developer Infrastructure</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/coding-agent-horror-stories-the-security-crisis-threatening-developer-infrastructure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Custom MCP Catalogs and Profiles: Advancing Enterprise MCP Adoption</title>
		<link>https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/</link>
		<comments>https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/#comments</comments>
		<pubDate>Sat, 16 May 2026 02:41:37 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/</guid>
		<description><![CDATA[<p>We’re excited to announce the general availability of Custom Catalogs and Profiles for managing Model Context Protocol (MCP) servers. These two complementary capabilities fundamentally change how teams package, distribute, and manage AI tooling.  Custom MCP Catalogs let organizations curate and distribute approved collections of MCP servers. MCP Profiles enable individual developers to easily build, run,&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/">Custom MCP Catalogs and Profiles: Advancing Enterprise MCP Adoption</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>We’re excited to announce the general availability of Custom Catalogs and Profiles for managing Model Context Protocol (MCP) servers. These two complementary capabilities fundamentally change how teams package, distribute, and manage AI tooling. </p>
<p>Custom MCP Catalogs let organizations curate and distribute approved collections of MCP servers. MCP Profiles enable individual developers to easily build, run, and share their MCP tools and configurations across projects and teams.</p>
<p>In this post, we’ll walk through how to create your own custom catalog – building on and improving our previous approach. We’ll also introduce Profiles, a new primitive that lets you define portable, named groupings of MCP servers. Profiles are designed to solve several practical use cases today, while giving us a foundation to expand in the future.</p>
<p>Creating custom catalogs with Docker</p>
<p>As organizations adopt MCP, we consistently hear the same need: teams need a way to curate a trusted list of MCP servers, including internally built servers.</p>
<p>To address these needs, we built Custom Catalogs. Instead of every team member searching for MCP servers across the open internet, organizations can publish and distribute catalogs that define approved servers. This allows developers to centrally discover and use trusted MCP servers within organizational boundaries.</p>
<p>Custom Catalogs can reference servers from Docker’s MCP Catalog, community sources, and custom MCP servers developed internally, bringing flexibility, control, and trust together in a single experience. We will show you how to do that with a Custom Catalog. </p>
<p>Step-by-step: Building and sharing a custom MCP catalog </p>
<p>In this example, we will create a Custom Catalog containing servers from the Docker MCP Catalog and an MCP server we created ourselves from the CLI. Then we will show you how to use Docker Desktop to import the catalog.</p>
<p>All the functionality we will show can be exercised through the CLI, while a subset of primarily user-centric features can be exercised through Docker Desktop.</p>
<p>Here, we will use my personal Docker Hub ID roberthouse224 in the commands, but you should adapt to use your information where appropriate (e.g. pushing an image).</p>
<p>Step 1: Creating my custom MCP server and pushing it to Docker Hub</p>
<p>We built a reference server called roll-dice (GitHub Repository). It is a regular MCP server that communicates over stdio and can be built as a Docker image. The image has already been built and pushed to Docker Hub.</p>
<p>We can create the metadata that describes the server including where the image can be found and save it to a file named mcp-dice.yaml to be used when creating our catalog.</p>
<p>name: roll-dice<br />
title: Roll Dice<br />
type: server<br />
image: roberthouse224/mcp-dice:latest<br />
description: An mcp server that can roll dice</p>
<p>Step 2: Creating a catalog that includes servers from the Docker MCP Catalog alongside a server you have built yourself</p>
<p>Now we can create a custom catalog containing servers from the Docker MCP Catalog and the MCP server we created ourselves.</p>
<p>docker mcp catalog create roberthouse224/our-catalog \<br />
  --title &quot;Our Catalog&quot; \<br />
  --server catalog://mcp/docker-mcp-catalog/playwright \<br />
  --server catalog://mcp/docker-mcp-catalog/github-official \<br />
  --server catalog://mcp/docker-mcp-catalog/context7 \<br />
  --server catalog://mcp/docker-mcp-catalog/atlassian \<br />
  --server catalog://mcp/docker-mcp-catalog/notion \<br />
  --server catalog://mcp/docker-mcp-catalog/markitdown \<br />
  --server file://./mcp-dice.yaml</p>
<p>Step 3: Verifying the MCP servers in the custom catalog </p>
<p>We can now list our catalogs and see the catalog that we created:</p>
<p>docker mcp catalog list</p>
<p>We can also inspect the contents of the catalog:</p>
<p>docker mcp catalog show roberthouse224/our-catalog --format yaml</p>
<p>Step 4: Share the catalog</p>
<p>At the moment our custom catalog only lives on our machine. But what we have – and this is really powerful – is an immutable OCI artifact containing our trusted MCP servers that can be easily shared.</p>
<p>We can push our catalog to a container registry; in this example we’re using Docker Hub. Now, anyone who has access to your organization&#8217;s namespace can access the catalog.</p>
<p>docker mcp catalog push roberthouse224/our-catalog</p>
<p>Using a custom MCP catalog</p>
<p>Now that our custom catalog has been shared, colleagues can import it from within Docker Desktop (or from the CLI using docker mcp catalog pull).</p>
<p>Import the catalog from Docker Desktop by selecting “Import catalog,” and then specifying the OCI reference in the dialog.</p>
<p>Figure 1: Importing a custom catalog from OCI reference</p>
<p>The catalog is now browsable. You can double click into the catalog and see all of the servers contained within it. Notice the custom MCP server that we added named “Roll Dice.”</p>
<p>Figure 2: A custom MCP catalog within the Docker Desktop app, including a newly added &#8220;Roll Dice&#8221; server.</p>
<p>To make this a private catalog all you need to do is manage access to the repository the way you always have for container images – no new infrastructure to manage or systems to learn.</p>
<p>This is exactly what Jim Clark was describing in his post Private MCP Catalogs and the Path to Composable Enterprise AI.</p>
<p>This simple pattern can be extended to support more complex use cases. For example, you might use a private container registry instead of Docker Hub, or connect to a remote MCP server over streamable HTTP you host yourself rather than running a containerized server as shown in the example.</p>
<p>Now that we have a shareable custom catalog of trusted MCP servers we can shift focus to how individuals can effectively leverage MCP servers from the catalog we built in their workflows.</p>
<p>Using Profiles to create and share MCP Workflows</p>
<p>With MCP Profiles, developers can organize workflows efficiently and maintain separate server collections and configurations for different use cases. Profiles can be shared across teams, enabling collaboration on server setups and ensuring consistent configurations for teams working within the same projects or contexts.</p>
<p>Switch between Profiles</p>
<p>At a basic level, a Profile is a named grouping of MCP servers that can be connected to an agent session. This makes it straightforward to define different Profiles for different ways of working.</p>
<p>Now let’s see an example in action. </p>
<p>We create a profile named coding and another named planning. We browse our custom catalog, select the MCP servers that we want (e.g. Playwright, GitHub, and Context7), then open the “Add to” drop-down and select “New profile”.</p>
<p>Figure 3: Selecting MCP servers to be added to a new profile</p>
<p>Give the profile a name, select the client you want to connect to, and select “Create”.</p>
<p>Figure 4: Creating a new MCP profile named coding in Docker Desktop.</p>
<p>From the Profiles tab, we can see the profile we just created. Our client is connected and our tools are ready to use. </p>
<p>Figure 5: Example of a profile that is connected to a client.</p>
<p>Next we create a profile named planning with servers relevant to planning (e.g. Atlassian, Markitdown, Notion). </p>
<p>Navigate back to “our-catalog” (if not already there), select the servers relevant to planning, and select “Add to” -&gt; “New profile.” Give the profile a name (e.g. planning). Then select “Create” to create the planning profile without a client. Specifying the client is optional.</p>
<p>Figure 6: Example of creating multiple profiles, including separate profiles for coding and planning </p>
<p>Now we have two profiles that mirror two modes of working. When we switch to planning mode we only want the tools from our planning profile to be in context. To do that, we can easily reassign our client to the planning profile.</p>
<p>Figure 7: Reassign Claude Code to the planning profile.</p>
<p>If we go back to coding mode, we just reassign our client back to the coding profile. You can have any number of Profiles that mirror your many ways of working and easily switch between them, keeping only the tools you care about in context.</p>
<p>This will work with any agent, not just Claude Code. Profiles provide a truly portable way to manage your MCP server setups and avoid vendor lock-in.</p>
<p>Persist configuration</p>
<p>You can avoid repeatedly configuring MCP servers by using a Profile. Profiles add a persistence layer for MCP server configurations. When an MCP server exposes configurable options, you can define them once in a Profile and reload them as needed, avoiding repeated configuration.</p>
<p>In this example, we are specifying which paths Markitdown can access.</p>
<p>Figure 8: Using an MCP profile to save server configurations for reuse</p>
<p>Context windows can easily fill up when the MCP servers you use export a lot of tools. With Profiles you can specify which tools are enabled, making sure only the tools you need for a specific task are used.</p>
<p>Here we enable the get_me tool from the GitHub MCP server and disable all the others. All the other tools will not show up in our agent session or contribute to the context window.</p>
<p>Figure 9: Optimize your context window by enabling only the tools you need in the MCP profile</p>
<p>This model of saved configuration becomes far more powerful for MCP servers you build in-house. By exposing richer configuration options, you can reuse the same server across projects, reconfigure its behavior per context, and achieve more predictable outcomes.</p>
<p>Share Profiles</p>
<p>Identifying MCP servers and configurations that work well for a project doesn’t need to be repeated by every team member. Once you’ve found a setup that works, share it with the rest of the team.</p>
<p>To share a Profile you can push it as an OCI artifact to a container registry just like we did with our custom catalog. Just provide a name for it along with an OCI reference.</p>
<p>➜  ~ docker mcp profile push coding &#x5B;your-namespace]/coding</p>
<p>For someone to pull it down, all they have to do is issue the corresponding pull command.</p>
<p>➜  ~ docker mcp profile pull &#x5B;your-namespace]/coding</p>
<p>Although the example above demonstrates sharing Profiles across a team, the concept extends naturally to agents as well. An agent skill could, for instance, reference a Profile and pull in the required MCP servers and their configurations as dependencies.</p>
<p>Conclusion and What’s Next </p>
<p>As MCP adoption grows, the challenge isn’t access to tools — it’s coordination. Teams need a way to standardize what’s trusted and supported without constraining how individuals actually work. Custom Catalogs and Profiles are designed to solve exactly that problem.</p>
<p>Custom Catalogs: shared foundation</p>
<p>Custom Catalogs allow platform and admin teams to define approved MCP servers, bundle internal and public tooling together, and distribute those choices as a single, portable artifact. This creates clarity and consistency while significantly reducing the cost of discovery and evaluation.</p>
<p>Profiles: supercharge workflow</p>
<p>Profiles give individual developers a lightweight way to assemble, configure, and reuse MCP servers for specific contexts like coding, planning, or research. Profiles persist configuration, limit context to what matters, and make effective setups easy to share across teams.</p>
<p>Together, these primitives separate:</p>
<p>What an organization recommends (via Custom Catalogs)</p>
<p>How people work day to day (via Profiles)</p>
<p>This separation enables a healthy balance. Platform teams can publish “golden paths” that establish standards and guardrails, while developers retain the freedom to adapt, experiment, and compose profiles that fit their needs.</p>
<p>The result is a system that is portable, composable, and scalable — making MCP easier to adopt, safer to manage, and more effective as it grows across an organization.</p>
<p>What’s Next?</p>
<p>Custom Catalogs and Profiles are the foundation for managing MCP at scale, and we’re just getting started. Next, we’re focused on extending these primitives to support stronger governance, better reuse, and more advanced agent workflows:</p>
<p>Governance and policy controls to restrict MCP usage to approved Custom Catalogs and trusted server sources</p>
<p>Improved discoverability and sharing for both Catalogs and Profiles, making proven setups easier to find and reuse across teams</p>
<p>Expanded Profile-scoped secrets and configuration, providing a more secure and flexible alternative to project-level mcp.json files</p>
<p>Clear best practices for Profiles, including saving dynamic MCP server configurations for reuse and pairing Profiles with emerging workflow optimizations like agent skills</p>
<p>Getting started with Custom Catalogs and Profiles</p>
<p>If you have Docker Desktop 4.56, you are already using Catalogs: our Docker MCP Catalog is now distributed as an OCI artifact. Profiles are supported starting with Docker Desktop 4.63. Try creating your first Profile by exploring the MCP Toolkit in Docker Desktop.</p>
<p>Learn more</p>
<p>Dive into our documentation on Custom Catalogs and Profiles to get started quickly.</p>
<p>Explore Docker’s MCP Catalog and Toolkit on our website.</p>
<p>Ready to go hands-on? Open Docker Desktop or the CLI and start using MCP to streamline and automate your development workflows.</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/">Custom MCP Catalogs and Profiles: Advancing Enterprise MCP Adoption</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/custom-mcp-catalogs-and-profiles-advancing-enterprise-mcp-adoption/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NIST Narrows the NVD: What Container Security Programs Should Reassess</title>
		<link>https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/</link>
		<comments>https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/#comments</comments>
		<pubDate>Thu, 14 May 2026 02:41:28 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/</guid>
		<description><![CDATA[<p>On April 15, NIST announced a prioritized enrichment model for the National Vulnerability Database. Most CVEs will still be published, but fewer will receive the CVSS scores, CPE mappings, and CWE classifications that container scanners and compliance programs have historically relied on. The change formalizes a drift that has been visible to anyone pulling NVD&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/">NIST Narrows the NVD: What Container Security Programs Should Reassess</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>On April 15, NIST announced a prioritized enrichment model for the National Vulnerability Database. Most CVEs will still be published, but fewer will receive the CVSS scores, CPE mappings, and CWE classifications that container scanners and compliance programs have historically relied on.</p>
<p>The change formalizes a drift that has been visible to anyone pulling NVD feeds for the past two years. What shifted on April 15 is the expectation: NIST has now said plainly that it does not intend to return to full-coverage enrichment. For programs that built scanning, prioritization, and SLA workflows around the assumption that NVD sits as the authoritative secondary layer on top of CVE, that assumption is worth a structured review.</p>
<p>What changed</p>
<p>Three categories of CVEs will continue to receive full enrichment:</p>
<p>CVEs in CISA&#8217;s Known Exploited Vulnerabilities catalog, targeted within one business day</p>
<p>CVEs affecting software used within the federal government</p>
<p>CVEs affecting &#8220;critical software&#8221; as defined by Executive Order 14028</p>
<p>Everything else moves to a new &#8220;Not Scheduled&#8221; status. Organizations can request enrichment by emailing nvd@nist.gov, though no service-level timeline applies. NIST has also stopped duplicating CVSS scores when the submitting CNA provides one, and all unenriched CVEs published before March 1, 2026 have been moved into &#8220;Not Scheduled.&#8221;</p>
<p>The NIST volumes behind the decision</p>
<p>NIST cited a 263% increase in CVE submissions between 2020 and 2025, with Q1 2026 running roughly a third higher than the same period a year earlier. The rise tracks with a broader expansion in CVE numbering: more CNAs, more open source projects running their own disclosure processes, and more tooling surfacing issues that would not have reached CVE a few years ago.</p>
<table>
<tr><th>Year</th><th>Published CVEs</th><th>Source</th></tr>
<tr><td>2023</td><td>~29,000</td><td>CVE.org</td></tr>
<tr><td>2024</td><td>~40,000</td><td>CVE.org</td></tr>
<tr><td>2025</td><td>~48,000</td><td>NIST</td></tr>
<tr><td>2026 (forecast)</td><td>~59,500 (median)</td><td>FIRST</td></tr>
</table>
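<p>As a quick sanity check on the figures above, the following minimal sketch computes the year-over-year growth they imply. It uses only the rounded counts quoted in this post (the 2026 value is the FIRST median forecast, not an observed count), and the function name is purely illustrative.</p>

```python
# Rounded published-CVE counts quoted above; 2026 is a forecast, not an observation.
published = {2023: 29_000, 2024: 40_000, 2025: 48_000, 2026: 59_500}


def yoy_growth(counts):
    """Return {year: percent change versus the previous year}, rounded to 0.1."""
    years = sorted(counts)
    return {
        year: round((counts[year] / counts[prev] - 1) * 100, 1)
        for prev, year in zip(years, years[1:])
    }


growth = yoy_growth(published)
for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}% vs {year - 1}")

# A "263% increase" means the later volume is 1 + 2.63 = 3.63 times the earlier one.
multiplier = 1 + 263 / 100
```

<p>Because the inputs are rounded, the resulting percentages are approximate and should be read as orders of magnitude rather than exact rates.</p>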
<p>AI is a compounding factor on both sides of this curve. In January, curl founder Daniel Stenberg shut down the project&#8217;s HackerOne bug bounty after six and a half years, citing &#8220;death by a thousand slops&#8221;: AI-generated reports that read like real research but described vulnerabilities that didn&#8217;t exist. Node.js, Django, and others have tightened intake under similar pressure. On the signal side, Anthropic&#8217;s April announcement of Claude Mythos Preview described a model that autonomously discovered thousands of zero-day vulnerabilities across every major operating system and web browser, including a 17-year-old unauthenticated RCE in FreeBSD&#8217;s NFS server. Earlier Anthropic research documented Claude Opus 4.6 finding and validating more than 500 high-severity vulnerabilities in production open source.</p>
<p>More noise and more real signal are heading toward the same pipeline. NIST enriched roughly 42,000 CVEs in 2025, its highest annual total, and still fell further behind incoming volume.</p>
<p>How it lands in compliance</p>
<p>The operational question is what programs have to document when NVD scoring is not available, and how consistently that documentation holds up across assessments.</p>
<table>
<tr><th>Framework</th><th>NVD reference</th><th>Likely effect</th></tr>
<tr><td>FedRAMP</td><td>NVD CVSSv3 as original risk rating, with CVSSv2 and native scanner score as documented fallbacks</td><td>More variance in how remediation SLAs are applied across CSPs</td></tr>
<tr><td>PCI-DSS 4.0</td><td>Req. 11.3.2 external scans reference CVSS; ASV guidance points to NVD</td><td>More ambiguity on pass/fail determinations for unscored findings</td></tr>
<tr><td>NIST SP 800-53 (RA-5)</td><td>Lists NVD as an example source; permissive language</td><td>Lower direct impact, though auditors commonly expect CVSS-based severity evidence</td></tr>
<tr><td>DORA / SOC 2</td><td>No direct reference</td><td>Principles-based; audit expectations around severity rationale still apply</td></tr>
</table>
<p>None of these frameworks break on their own. Mature vulnerability management programs generally have language in their SSPs and risk registers covering fallback scoring and exception handling. Programs that do not will likely need it before their next audit cycle.</p>
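<p>One way to make fallback scoring explicit is to record, for every finding, which source the severity came from. The sketch below is a hypothetical illustration, not taken from any framework or tool: the Finding fields and the severity_basis function are assumed names, the CVE IDs are placeholders, and the fallback order simply mirrors the FedRAMP row above (NVD CVSSv3, then CVSSv2, then the scanner&#8217;s native score).</p>

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Finding:
    """Hypothetical tracker record; field names are assumptions for illustration."""
    cve_id: str
    nvd_cvss_v3: Optional[float] = None
    nvd_cvss_v2: Optional[float] = None
    scanner_score: Optional[float] = None


def severity_basis(finding: Finding) -> Tuple[Optional[float], str]:
    """Return (score, source) using the first available score in fallback order."""
    candidates = (
        (finding.nvd_cvss_v3, "NVD CVSSv3"),
        (finding.nvd_cvss_v2, "NVD CVSSv2"),
        (finding.scanner_score, "scanner native score"),
    )
    for score, source in candidates:
        if score is not None:
            return score, source
    # Nothing available: flag the finding so a documented manual rating is required.
    return None, "unscored: needs documented manual rating"


# Placeholder IDs, not real CVEs.
enriched = Finding("CVE-0000-0001", nvd_cvss_v3=9.8)
not_scheduled = Finding("CVE-0000-0002", scanner_score=7.5)
unscored = Finding("CVE-0000-0003")

for f in (enriched, not_scheduled, unscored):
    print(f.cve_id, severity_basis(f))
```

<p>Storing the source alongside the score is what lets an assessor see that a finding was rated from a documented fallback rather than left unrated.</p>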
<p>The gap that is relevant to the container ecosystem</p>
<p>Two NVD inputs matter most for container scanning:</p>
<p>CPE applicability statements map a CVE to specific software packages. When CPE strings are missing, a scanner that matches primarily on CPE cannot determine which packages in an image are affected. The CVE exists in the database but is operationally invisible to the scan.</p>
<p>CVSS scores drive prioritization and SLA routing. Without a score, a CVE may surface as UNKNOWN severity or fall outside remediation workflows entirely.</p>
<p>Container images create a compounding effect here. Each image inherits packages from a base layer, application dependencies, and often a long transitive dependency chain. When any of those packages carries a CVE that NVD has not enriched, the gap propagates through every downstream image built on top of it. Scanners that draw on multiple advisory sources, and that match on package identifiers other than CPE, are less exposed.</p>
<p>Questions worth putting to image vendors</p>
<p>What advisory sources does your tooling use beyond NVD?</p>
<p>When a CVE has no NVD CVSS score, what does the tool display, and does it trigger remediation workflows?</p>
<p>How do you define &#8220;patched,&#8221; and is that definition in your written CVE policy?</p>
<p>Are your remediation SLAs measured from CVE disclosure date or NVD enrichment date?</p>
<p>Can a third-party scanner reproduce your clean-scan result against public advisory data?</p>
<p>Where Docker sits</p>
<p>Docker Hardened Images are designed so that vulnerability management in container workloads does not depend primarily on NVD enrichment. Each image ships with signed attestations for build provenance, SBOMs in both CycloneDX and SPDX formats, OpenVEX exploitability statements, and scan results. SBOMs are generated from the SLSA Build Level 3 pipeline rather than inferred from external databases, so package inventory is accurate regardless of NVD&#8217;s enrichment state. Hardened System Packages allow package-level patching independent of upstream distribution timelines, which means remediation is not gated on a distribution maintainer&#8217;s release cadence or on an NVD analyst&#8217;s queue. When a CVE is not exploitable in a specific image context, that assessment is published as a signed VEX document that third-party scanners including Trivy, Grype, and Wiz consume natively.</p>
<p>Docker Scout, the scanning layer that reads these attestations, aggregates 22 advisory sources including NVD, CISA KEV, EPSS, GitHub Advisory Database, and 13 Linux distribution security trackers. Scout matches on Package URLs (PURLs) rather than NVD&#8217;s CPE scheme, which allows package identification to continue when CPE strings are unavailable. NVD remains a valuable input to this architecture, one of several rather than the spine.</p>
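<p>To illustrate why matching on package identifiers keeps working when CPE strings are missing, here is a deliberately simplified sketch. It is not Docker Scout&#8217;s implementation: the parser handles only the basic pkg:type/namespace/name@version shape rather than the full Package URL specification, the advisory record and its ID are invented, and real scanners compare version ranges rather than exact version sets.</p>

```python
from typing import NamedTuple, Optional


class Purl(NamedTuple):
    type: str
    name: str               # namespace/name, e.g. "debian/openssl"
    version: Optional[str]


def parse_purl(purl: str) -> Purl:
    """Simplified parse; qualifiers (?...) and subpaths (#...) are discarded."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a package URL: {purl!r}")
    body = purl[4:].split("#", 1)[0].split("?", 1)[0]
    ptype, _, rest = body.partition("/")
    name, _, version = rest.partition("@")
    return Purl(ptype.lower(), name, version or None)


# Invented advisory with no CPE data at all (as for an unenriched CVE).
ADVISORIES = [
    {
        "id": "EXAMPLE-0001",
        "type": "deb",
        "name": "debian/openssl",
        "affected": {"3.0.11-1"},
        "cpe": None,
    },
]


def match(sbom_purls, advisories):
    """Return (advisory id, purl) pairs where type, name, and version all match."""
    hits = []
    for raw in sbom_purls:
        pkg = parse_purl(raw)
        for adv in advisories:
            same_package = (adv["type"], adv["name"]) == (pkg.type, pkg.name)
            if same_package and pkg.version in adv["affected"]:
                hits.append((adv["id"], raw))
    return hits


sbom = ["pkg:deb/debian/openssl@3.0.11-1?arch=amd64", "pkg:pypi/requests@2.31.0"]
hits = match(sbom, ADVISORIES)
print(hits)
```

<p>The advisory is still found even though its cpe field is empty, which is the property the paragraph above describes.</p>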
<p>What to reassess</p>
<p>Audit open findings against the March 1, 2026 cutoff. Any CVE published before that date that has not received NVD enrichment has already been moved to &#8220;Not Scheduled.&#8221; Programs carrying open findings tied to those CVEs may have severity scores and CPE mappings in their trackers that no longer reflect an active NVD record. Verify that the scoring basis for those findings is documented and defensible independent of NVD.</p>
<p>For programs running DHI, the NVD policy change does not require an operational response. For programs evaluating container security vendors more broadly, the question worth elevating in the next procurement cycle is whether NVD is one source of vulnerability intelligence in their stack, or the primary one.</p>
<p>The NVD will continue to play a role. That role is narrowing, and the signals suggest it will keep narrowing. Programs that use the April announcement as a prompt to audit their data sources now will have a cleaner answer the next time a regulator, an auditor, or a board asks where their vulnerability data actually comes from.</p>
<p>Sources and further reading</p>
<p>NIST, &#8220;NIST Updates NVD Operations to Address Record CVE Growth,&#8221; April 15, 2026 https://www.nist.gov/news-events/news/2026/04/nist-updates-nvd-operations-address-record-cve-growth</p>
<p>FIRST, &#8220;2026 CVE Vulnerability Forecast&#8221; https://www.first.org/blog/20260211-vulnerability-forecast-2026</p>
<p>FedRAMP Vulnerability Scanning Requirements v3.0 https://www.fedramp.gov/docs/rev5/playbook/csp/continuous-monitoring/vulnerability-scanning/</p>
<p>Docker Scout advisory database sources https://docs.docker.com/scout/deep-dive/advisory-db-sources/</p>
<p>Docker Hardened Images documentation https://docs.docker.com/dhi/</p>
<p>&#8220;Why We Chose the Harder Path: Docker Hardened Images, One Year Later&#8221; https://www.docker.com/blog/why-we-chose-the-harder-path-docker-hardened-images-one-year-later/</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/">NIST Narrows the NVD: What Container Security Programs Should Reassess</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/nist-narrows-the-nvd-what-container-security-programs-should-reassess/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Docker AI Governance: Unlock Agent Autonomy, Safely</title>
		<link>https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/</link>
		<comments>https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/#comments</comments>
		<pubDate>Wed, 13 May 2026 02:41:32 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/</guid>
		<description><![CDATA[<p>Introducing Docker AI Governance: centralized control over how agents execute, what they can reach on the network, which credentials they can use, and which MCP tools they can call, so every developer in your company can run AI agents safely, wherever they work. Your laptop is the new prod Agents are the biggest productivity unlock&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/">Docker AI Governance: Unlock Agent Autonomy, Safely</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>Introducing Docker AI Governance: centralized control over how agents execute, what they can reach on the network, which credentials they can use, and which MCP tools they can call, so every developer in your company can run AI agents safely, wherever they work.</p>
<p>Your laptop is the new prod</p>
<p>Agents are the biggest productivity unlock the modern workplace has seen in a generation, and engineering is where the shift is most obvious. Developers aren&#8217;t using agents to autocomplete a function anymore. They&#8217;re using them to read whole codebases, refactor across services, and ship entire products, end to end. Vibe coding is real, it&#8217;s shipping to main, and it&#8217;s happening on laptops everywhere today.</p>
<p>The same shift is moving through every other function. A new class of agents called Claws is already in production, sending emails, managing calendars, booking travel, pulling CRM data, reconciling reports, and querying production systems. Marketing, finance, sales, and support are adopting them as fast as engineering is, because the productivity gains are too large to ignore and the companies that move first will out-execute the ones that don&#8217;t. Org-wide rollouts that used to take quarters are landing in weeks.</p>
<p>What’s more interesting than the speed of adoption is where all of this actually runs. Agents and Claws live outside the systems enterprises spent two decades hardening. They don&#8217;t sit behind your CI/CD pipeline, they don&#8217;t live inside your VPC, and they don&#8217;t follow your IAM model. They run on the developer&#8217;s machine, with the developer&#8217;s credentials, reaching into private repos, production APIs, customer records, and the open internet, often in the same session. The laptop just became the most powerful node in your enterprise, and it also became the most exposed. Laptop and agent environments are the new prod, and they need to be governed like prod.</p>
<p>What governance actually has to solve</p>
<p>The instinct in most enterprises is to reach for the tools that already exist, but none of them see what an agent is doing. CI/CD doesn&#8217;t see it because the agent isn&#8217;t a pipeline. The VPC doesn&#8217;t see it because the laptop is outside the perimeter. IAM doesn&#8217;t see it because the agent is acting as the developer. The result is that CISOs can&#8217;t tell what an agent touched, what it ran, or where the data went, and they also can&#8217;t tell the business to slow down. This is the bind every security leader is in right now.</p>
<p>Strip the problem to first principles and an agent has two paths to do significant harm. It either executes code itself, touching files and opening network connections, or it calls a tool through an MCP server to act on an external system. Govern both paths and you&#8217;ve governed the agent. Miss either one and you haven&#8217;t.</p>
<p>That&#8217;s the test for any AI governance solution worth taking seriously, and it has two parts. The controls have to live at the runtime layer where the agent actually executes, not as advisory rules layered on top that a clever prompt can route around. And they have to work consistently wherever the agent ends up running, because agents don&#8217;t stay on the laptop. They migrate to CI runners, to staging clusters, to production. A policy that only holds in one of those places is a gap waiting to be found.</p>
<p>Why Docker</p>
<p>Docker is the only company that meets both parts of that test, and the reason is structural.</p>
<p>Docker built the sandbox that contains the first path. Every agent session runs inside a microVM-based isolated environment where filesystem and network access are controlled by a hard boundary, which means enforcement happens at the level of the process, not as a suggestion the agent can ignore. Docker built the MCP Gateway that contains the second path. Every tool call routes through a single chokepoint where it can be authenticated, authorized, and logged before it reaches the external system. Because these controls live in the primitives themselves, Docker Sandboxes and Docker MCP Gateway, enforcement is strict rather than advisory. We own the substrate the agent is running on, so the policy isn&#8217;t a wrapper around someone else&#8217;s runtime, it&#8217;s the runtime.</p>
<p>The second part is what makes this durable. The same sandbox primitive runs on the developer&#8217;s laptop, inside Kubernetes, and across cloud environments, with the same policy model and the same enforcement guarantees. When an agent moves from a developer&#8217;s machine to a CI runner to a production cluster, the policy moves with it, because the runtime underneath is the same in all three places. No other vendor can say that, because no other vendor is the runtime. Endpoint security tools don&#8217;t extend to clusters. Cluster security tools don&#8217;t reach the laptop. Cloud security tools don&#8217;t run on either. Docker covers all three because Docker is what&#8217;s actually executing the agent in all three.</p>
<p>Docker AI Governance is the control plane that sits on top of that runtime. It turns the sandbox and the MCP Gateway into centralized policy, defined once in the admin console, enforced at every node the agent touches, and auditable from end to end.</p>
<p>How Docker AI Governance works</p>
<p>From a single admin console, security teams define and enforce policy across four control surfaces: network, filesystem, credentials, and MCP tools. It is one policy layer that needs no per-machine setup and works consistently across thousands of developers.</p>
<p>Sandbox policy for network and filesystem. Admins define allow and deny rules for domains, IPs, and CIDRs, alongside mount rules for filesystem paths with read-only or read-write scope. Every agent session runs inside an isolated sandbox where only approved endpoints are reachable and only approved directories are mountable, with enforcement happening at the proxy and mount level rather than as an advisory layer the agent can ignore.</p>
<p>Credential governance. Agents are dangerous in proportion to what they can authenticate as, so Docker AI Governance controls which credentials, tokens, and secrets an agent session can see, scopes them to the duration of that session, and blocks exfiltration to unapproved destinations. Developers stop pasting tokens into prompts, and security stops wondering where those tokens ended up.</p>
<p>MCP tool governance. Admins control which MCP servers and tools are available through organization-wide managed policies, with unapproved servers blocked by default. Every MCP call flows through the same policy engine as network, filesystem, and credential requests, so there&#8217;s no separate surface to configure and no bypass path.</p>
<p>Role-based policy assignment. Different teams need different levels of access, and security research will reasonably require broader MCP usage than finance. Create policy groups, assign users through your IdP, and layer team-specific rules on top of organization-wide guardrails that can&#8217;t be overridden. It scales to thousands of developers through existing SAML and SCIM integrations with no per-user setup.</p>
<p>Audit and visibility. Every policy evaluation generates a structured event with user identity, timestamp, session context, and the rule that triggered the decision, and logs export cleanly to your existing SIEM and compliance systems. This is the evidence CISOs need to approve AI usage at scale rather than tolerate it under the table.</p>
<p>Automatic policy propagation. When a developer authenticates, their machine pulls the latest policy, and updates reach every device automatically. Admins define policy once and Docker enforces it everywhere.</p>
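<p>To make the allow/deny and audit ideas above concrete, here is a minimal sketch of how a network rule could be evaluated and turned into a structured event. Everything in it is an assumption for illustration: the policy schema, the rule names, and the event fields are hypothetical and do not represent the Docker AI Governance policy format or API. The only details taken from the text are that rules cover domains, IPs, and CIDRs, that unapproved destinations are not reachable, and that each evaluation records user, timestamp, session, and the triggering rule.</p>

```python
import fnmatch
import ipaddress
from datetime import datetime, timezone

# Hypothetical policy shape; not the product's real schema.
POLICY = {
    "deny_domains": ["*.internal.example.com"],
    "allow_domains": ["api.github.com", "*.example.com"],
    "allow_cidrs": ["10.20.0.0/16"],
}


def evaluate(policy, destination):
    """Return (allowed, rule). Deny rules win; anything unmatched is denied."""
    try:
        addr = ipaddress.ip_address(destination)
    except ValueError:
        addr = None  # not an IP literal, treat it as a hostname

    if addr is not None:
        for cidr in policy["allow_cidrs"]:
            if addr in ipaddress.ip_network(cidr):
                return True, f"allow_cidrs:{cidr}"
        return False, "default-deny"

    host = destination.lower()
    for pattern in policy["deny_domains"]:
        if fnmatch.fnmatchcase(host, pattern):
            return False, f"deny_domains:{pattern}"
    for pattern in policy["allow_domains"]:
        if fnmatch.fnmatchcase(host, pattern):
            return True, f"allow_domains:{pattern}"
    return False, "default-deny"


def audit_event(user, session, destination, policy=POLICY):
    """Build a structured record of one policy evaluation."""
    allowed, rule = evaluate(policy, destination)
    return {
        "user": user,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "session": session,
        "destination": destination,
        "decision": "allow" if allowed else "deny",
        "rule": rule,
    }


print(audit_event("alice", "session-1", "api.github.com"))
print(audit_event("alice", "session-1", "db.internal.example.com"))
```

<p>Checking deny rules before allow rules and falling back to deny keeps a broad allow pattern such as *.example.com from accidentally opening a host that a narrower deny rule was meant to block.</p>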
<p>What this unlocks</p>
<p>CISOs get the governance layer they&#8217;ve been missing and the confidence to approve agent usage at scale rather than block it. Platform teams get an easy way to set up governance: define a policy once and have it enforced everywhere, with full auditability. This removes the operational burden of scaling AI adoption across the company. Developers get what agents promised in the first place: real speed and autonomy, with governance that stays out of the way. We built Docker AI Governance with these principles in mind: agents should be autonomous and governance should be invisible.</p>
<p>Available today</p>
<p>Docker AI Governance is available now. If you&#8217;re a security leader trying to close the AI governance gap, or a platform team ready to roll out agents without compromising control, it was built for you.</p>
<p>Contact sales to learn more.</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/">Docker AI Governance: Unlock Agent Autonomy, Safely</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/docker-ai-governance-unlock-agent-autonomy-safely/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Comparing Different Approaches to Sandboxing</title>
		<link>https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/</link>
		<comments>https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/#comments</comments>
		<pubDate>Fri, 08 May 2026 02:41:28 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/</guid>
		<description><![CDATA[<p>&#8220;AI agents will become the primary way we interact with computers in the future. They will be able to understand our needs and preferences, and proactively help us with tasks and decision making.&#8220; Satya Nadella CEO of Microsoft Whether you are a software engineer, a product manager, or a designer, this quote should fundamentally change&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/">Comparing Different Approaches to Sandboxing</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>&#8220;AI agents will become the primary way we interact with computers in the future. They will be able to understand our needs and preferences, and proactively help us with tasks and decision making.&#8221;</p>
<p>                Satya Nadella<br />
                CEO of Microsoft</p>
<p>Whether you are a software engineer, a product manager, or a designer, this quote should fundamentally change how you approach your daily routine. We are no longer just building interfaces; we are creating environments where agents can operate autonomously with minimal human interaction. What is the fundamental requirement for such an environment?</p>
<p>In a single word: Isolation.</p>
<p>A user interacting with traditional software is constrained by the actions it allows. Agents, however, are non-deterministic and prone to hallucination and prompt injection. Once you give an AI write access to your systems, there is nothing stopping it from running rm -rf and deleting all your data. There are different ways to solve this problem. One approach is sandboxing: running the agent in an isolated, controlled environment so that whatever it does cannot affect the surrounding system.</p>
<p>So, I started exploring different strategies to sandbox agents, starting with a bare-minimum setup and going all the way to setting up a cloud VM. Here is what I learned at each step.</p>
<p>1. Let’s Start with the Baseline</p>
<p>Chroot has been the traditional way to achieve file system isolation. It works well when you want the process to think that a specific, restricted directory is the absolute root of the machine.</p>
<p>However, there are two major caveats.</p>
<p>If the process inside the chroot has root privileges, it could break out.</p>
<p>While it offers file isolation, process isolation is still a problem. A malicious agent can still see other processes running on your system and try to kill them.</p>
<p>Running ls /proc inside the chroot (with /proc mounted) still shows all the processes running on the host.</p>
<p>This is when I learnt about systemd-nspawn, also called “chroot on steroids”. The difference between chroot and systemd-nspawn is that the latter provides isolation at the network and process levels in addition to the file system.</p>
<p>Now, when I run the same ls /proc in a systemd-nspawn container named mybox, I see only the processes inside mybox, which is the process-level isolation that chroot lacks.</p>
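<p>One way to make this comparison repeatable is to count the numeric entries in /proc, since each one corresponds to a visible process. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function names are made up for illustration.</p>

```python
import os


def visible_pids(proc_entries):
    """Return the sorted PIDs among the entry names found in /proc."""
    # Process directories are named after their PID; everything else
    # (cpuinfo, self, meminfo, ...) is ignored.
    return sorted(int(name) for name in proc_entries if name.isdecimal())


def count_visible_processes(proc_root="/proc"):
    """Count the processes visible from wherever this script runs."""
    return len(visible_pids(os.listdir(proc_root)))
```

<p>Calling count_visible_processes() once inside the chroot and once inside the nspawn container should give a large number in the first case and only a handful in the second.</p>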
<p>Pros</p>
<p>Lightweight compared with container runtimes like Docker, with faster startup times.</p>
<p>Native support in Linux.</p>
<p>Caveats</p>
<p>systemd-nspawn is not very popular in the developer community unless you are deep into Linux.</p>
<p>While this works for Linux, what if you need to run your agents on Windows? You will have to find alternatives depending on the platform.</p>
<p>2. Are Containers Enough?</p>
<p>Another technology that comes to mind when thinking about isolated environments is Docker. And unlike the previous concepts we discussed, Docker has a broader ecosystem and a strong community.</p>
<p>With containers, you also get isolated file systems, network interfaces, and process trees, plus cross-platform support across macOS, Windows, and Linux. These advantages make it easy to create and run agents on different platforms, which makes containers an obvious choice.</p>
<p>However, the model becomes more complex when containers become a dev platform for agents. More often than not, agents need to execute generated code in separate environments, which in practice means spinning up new Docker containers on demand. This introduces a container-in-container pattern (Docker-in-Docker), where an agent running inside a container needs to build and run other containers.</p>
<p>To make Docker-in-Docker work, we would have to run the container in privileged mode (--privileged), which gives the container's processes elevated permissions and dramatically weakens the isolation guarantees. As a result, complete isolation for agents using only containers becomes tricky.</p>
<p>3. Do Virtual Machines Help?</p>
<p>As you might have already predicted, Virtual Machines (VMs) offer the strongest isolation. With a VM, you get an entire OS, file system, and network of your own. For example, I currently run macOS and use Lima, a Linux VM, to run Linux-specific workloads.</p>
<p>However, the tradeoff is that spinning up a VM is expensive, and doing it for every agent does not scale. Here are some numbers comparing the cost of a VM with systemd-nspawn and chroot:</p>
<table>
<tr><th>Approach</th><th>Per Agent Cost</th><th>Boot Time</th><th>10 Agents</th></tr>
<tr><td>VM (Lima)</td><td>~4GB RAM + 4 CPU</td><td>30-60s</td><td>~40GB RAM</td></tr>
<tr><td>systemd-nspawn</td><td>~10MB RAM</td><td>&lt; 1s</td><td>~100MB RAM</td></tr>
<tr><td>chroot</td><td>1MB RAM</td><td>instant</td><td>~10MB RAM</td></tr>
</table>
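<p>The 10-agent column is simply the per-agent memory multiplied out. Here is a small sketch of that arithmetic, using the approximate figures from the table above and treating 1GB as 1000MB; the helper names are only illustrative.</p>

```python
# Approximate per-agent memory in MB, copied from the table above.
PER_AGENT_MB = {"VM (Lima)": 4000, "systemd-nspawn": 10, "chroot": 1}


def fleet_memory_mb(approach, agents):
    """Estimated memory in MB needed to run `agents` sandboxes."""
    return PER_AGENT_MB[approach] * agents


def format_mb(mb):
    """Render an estimate the way the table does (~40GB, ~100MB)."""
    return f"~{mb / 1000:g}GB" if mb >= 1000 else f"~{mb}MB"
```

<p>Changing the agent count makes the scaling problem obvious: the VM line grows by gigabytes per agent while the other two grow by megabytes.</p>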
<p>For example, a screenshot in the original post shows the resources consumed by a single running Lima VM.</p>
<p>4. MicroVMs to the rescue</p>
<p>MicroVMs (micro virtual machines) felt like the perfect answer to the isolation story. So what is a MicroVM, and what makes it better?</p>
<p>A MicroVM is a lightweight virtualisation technology that provides the strong security and isolation of a traditional VM, along with the speed of a container.</p>
<p>Strong security and isolation are enabled because a MicroVM gets its own kernel, aka the Guest Kernel, unlike containers, which use a shared kernel. Because of this, any compromise inside the Guest OS does not directly affect the host or the other VMs.</p>
<p>Speed: unlike traditional VMs, it is provisioned with minimal hardware (no USB or PCI buses) and bypasses BIOS/UEFI boot, significantly reducing device emulation overhead and startup latency.</p>
<p>Amazon open-sourced Firecracker in 2018, one of the earliest implementations of the MicroVM architecture. While this helped popularize MicroVMs, Firecracker is restricted to Linux environments, and most agentic orchestration tends to happen on developers&#8217; laptops, which also run macOS and Windows.</p>
<p>Docker addressed this gap with its Sandbox offering. The best part is its MicroVM-based architecture, which runs natively across macOS, Windows, and Linux, delivering better isolation, faster startup times, and a smoother developer experience. We will come back to this in a bit.</p>
<p>5. gVisor</p>
<p>gVisor takes a different approach to the isolation problem. While the previous strategies rely on an OS kernel, gVisor implements its own kernel, called the “application kernel”, which runs in user space.</p>
<p>When a standard containerized app wants to do something like open a file, allocate memory, or send network traffic, it makes a &#8220;system call&#8221; (syscall) directly to the host&#8217;s Linux kernel.</p>
<p>With gVisor, your app is bundled with a component called the Sentry.</p>
<p>The Sentry intercepts every single syscall your application makes.</p>
<p>It processes that request in user-space using its own implementation of Linux networking, file systems, and memory management.</p>
<p>If the Sentry absolutely needs the host kernel to do something (like actual disk I/O), it translates the request into an extremely restricted, heavily filtered, safe call to the host.</p>
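<p>The flow above can be pictured with a toy model. This is not gVisor code or its API: the class, the call names, and the allowlist are all hypothetical, and the sketch only illustrates the idea of answering most calls in user space and letting a small filtered set through to the host.</p>

```python
# Toy model only: the names and the allowlist are hypothetical, not gVisor's API.
HOST_ALLOWLIST = {"read", "write"}  # the few calls allowed to reach the "host"


class ToySentry:
    def __init__(self):
        self.memory = {}      # state kept entirely in "user space"
        self.host_calls = []  # record of what was forwarded to the host

    def syscall(self, name, *args):
        if name == "getpid":
            # Answered from the sandbox's own state; the host is never asked.
            return 1
        if name == "mmap":
            # Memory management is handled by the Sentry's own implementation.
            handle = len(self.memory)
            self.memory[handle] = bytearray(args[0])
            return handle
        if name in HOST_ALLOWLIST:
            # Only a narrow, filtered set of requests is passed on.
            self.host_calls.append((name, args))
            return "forwarded"
        raise PermissionError(f"syscall {name!r} is not permitted")
```

<p>The point of the design is that the application never talks to the host kernel directly, so a bug in the host's handling of an unusual syscall is much harder to reach.</p>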
<p>However, it suffers from the same problems as systemd-nspawn: it has limited community support and runs only on Linux.</p>
<p>Docker Sandbox</p>
<p>With Docker Sandboxes, AI coding agents run in isolated microVM environments. The experience feels the same as running on the host, but with significantly stronger isolation and security. This means you can run your autonomous agents without worrying about host compromise or unintended access to your local environment.</p>
<p>Sandbox achieves this level of security through three layers of isolation:</p>
<p>Hypervisor Isolation: Every Sandbox has its own Linux Kernel. So, anything that affects the sandbox kernel will not affect the host or other sandbox kernels.</p>
<p>Network Isolation</p>
<p>Each Sandbox has its own isolated network, meaning multiple sandboxes cannot communicate with each other or with the host.</p>
<p>In addition, network policies can be enforced to allow or disallow traffic from a source.</p>
<p>Docker Engine Isolation</p>
<p>This is what made me fall in love with this new architecture. Every Sandbox gets its own Docker Engine. As a result, whenever the agent runs docker pull or docker compose, those commands are executed against the internal engine rather than the external Docker daemon.</p>
<p>Because of this, agents running inside can only see Docker services within their sandbox and nothing else, adding an additional layer of security.</p>
<table>
<tr><th>Attribute</th><th>Traditional VM</th><th>Container</th><th>Docker MicroVM</th></tr>
<tr><td>Isolation</td><td>Strong (dedicated kernel)</td><td>Weak (shared kernel)</td><td>Strong (dedicated kernel)</td></tr>
<tr><td>Boot time</td><td>Minutes</td><td>Milliseconds</td><td>Seconds (after the first image pull)</td></tr>
<tr><td>Attack Surface</td><td>Large</td><td>Medium</td><td>Minimal</td></tr>
</table>
<p>To demonstrate Docker Engine isolation, I created two Sandbox sessions, ran the Docker hello-world container image in one, and then ran docker ps -a in both.</p>
<p>In the screenshot from the original post, one session lists the hello-world container and the other does not. This is possible because each session runs its own Docker Engine daemon.</p>
<p>More on the Sandbox architecture here: https://www.docker.com/blog/why-microvms-the-architecture-behind-docker-sandboxes/</p>
<p>Conclusion</p>
<p>If there is one takeaway, it’s this: isolation plays a major role when building autonomous AI agents because the blast radius of a security mistake is significant.</p>
<p>Each approach we explored so far solves a different piece of the isolation puzzle. Containers improve portability and developer experience, but inherit the risks of a shared kernel. Virtual Machines deliver strong isolation, but the overhead doesn&#8217;t scale when you&#8217;re spinning up dozens of agents. gVisor sits in an interesting middle ground, though compatibility and community trade-offs might slow you down.</p>
<p>Among all these, what makes Docker Sandbox with MicroVMs compelling is how it unifies these dimensions: VM-level security, container-like startup speed, and a workflow developers already know. Per-sandbox Docker Engines and strict network boundaries make it a strong foundation for running untrusted, autonomous workloads at scale.</p>
<p>So, what are you waiting for? Go ahead and try it out today.</p>
<p>For macOS: brew install docker/tap/sbx</p>
<p>For Windows: winget install Docker.sbx</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/">Comparing Different Approaches to Sandboxing</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/comparing-different-approaches-to-sandboxing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generate Images Locally with Docker Model Runner and Open WebUI</title>
		<link>https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/</link>
		<comments>https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/#comments</comments>
		<pubDate>Wed, 06 May 2026 02:41:32 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/</guid>
		<description><![CDATA[<p>We&#8217;ve all been there: you need to generate a few images for a project, you fire up an AI image service, and suddenly you&#8217;re wondering what happens to your prompts, how many credits you have left, or why that &#8220;safe content&#8221; filter rejected your perfectly reasonable request for a dragon wearing a business suit. What&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/">Generate Images Locally with Docker Model Runner and Open WebUI</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>We&#8217;ve all been there: you need to generate a few images for a project, you fire up an AI image service, and suddenly you&#8217;re wondering what happens to your prompts, how many credits you have left, or why that &#8220;safe content&#8221; filter rejected your perfectly reasonable request for a dragon wearing a business suit. What if you could skip all of that and run the whole thing on your own machine, with a slick chat UI on top?</p>
<p>That&#8217;s exactly what Docker Model Runner now makes possible. With a couple of commands you can pull an image-generation model, connect it to Open WebUI, and start generating images right from a chat interface: fully local, fully private, fully yours.</p>
<p>Let&#8217;s build it. Your own private DALL-E, no cloud subscription required.</p>
<p>What You&#8217;ll Need</p>
<p>Docker Desktop (macOS) or Docker Engine (Linux)</p>
<p>~8 GB of free RAM for a small model (more is better)</p>
<p>GPU: optional but highly recommended. NVIDIA (CUDA) and Apple Silicon (MPS) are supported, with CPU as a fallback</p>
<p>If you can run docker model version without errors, you&#8217;re good to go.</p>
<p>How Docker Model Runner works with Open WebUI</p>
<p>Before we dive in, here&#8217;s the big picture:</p>
<p>Docker Model Runner acts as the control plane. It downloads the model, manages the inference backend lifecycle, and exposes an OpenAI-compatible API, including the POST /v1/images/generations endpoint that Open WebUI already knows how to talk to.</p>
<p>Step 1: Pull an Image Generation Model</p>
<p>Docker Model Runner uses a compact packaging format called DDUF (Diffusers Unified Format) to distribute image generation models through Docker Hub, just like any other OCI artifact.</p>
<p>Pull a model to get started:</p>
<p>docker model pull stable-diffusion</p>
<p>You can confirm it&#8217;s ready:</p>
<p>docker model inspect stable-diffusion</p>
<p>{<br />
    &quot;id&quot;: &quot;sha256:5f60862074a4c585126288d08555e5ad9ef65044bf490ff3a64855fc84d06823&quot;,<br />
    &quot;tags&quot;: &#x5B;<br />
        &quot;docker.io/ai/stable-diffusion:latest&quot;<br />
    ],<br />
    &quot;created&quot;: 1768470632,<br />
    &quot;config&quot;: {<br />
        &quot;format&quot;: &quot;diffusers&quot;,<br />
        &quot;architecture&quot;: &quot;diffusers&quot;,<br />
        &quot;size&quot;: &quot;6.94GB&quot;,<br />
        &quot;diffusers&quot;: {<br />
            &quot;dduf_file&quot;: &quot;stable-diffusion-xl-base-1.0-FP16.dduf&quot;,<br />
            &quot;layout&quot;: &quot;dduf&quot;<br />
        }<br />
    }<br />
}</p>
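<p>If you want to check the model from a script rather than by eye, the inspect output can be parsed as ordinary JSON. This is a minimal sketch that assumes the output has the shape shown above; the function name is made up for illustration.</p>

```python
import json


def summarize_model(inspect_output):
    """Pull the fields worth checking out of `docker model inspect` JSON."""
    info = json.loads(inspect_output)
    config = info.get("config", {})
    return {
        "tags": info.get("tags", []),
        "format": config.get("format"),
        "size": config.get("size"),
        # Only present for diffusers-format models, per the sample above.
        "dduf_file": config.get("diffusers", {}).get("dduf_file"),
    }
```

<p>You could feed it the captured standard output of docker model inspect stable-diffusion and fail early if the format is not diffusers.</p>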
<p>What&#8217;s happening under the hood? The model is stored locally as a DDUF file, a single-file format that bundles all the components of a diffusion model (text encoder, VAE, UNet/DiT, scheduler config) into one portable artifact. Docker Model Runner knows how to unpack it at runtime.</p>
<p>Step 2: Launch Open WebUI</p>
<p>This is a magic trick. Docker Model Runner has a built-in launch command that knows exactly how to wire up Open WebUI against the local inference endpoint:</p>
<p>docker model launch openwebui</p>
<p>That&#8217;s it. Behind the scenes this runs:</p>
<p>docker run --rm \<br />
  -p 3000:8080 \<br />
  -e OPENAI_API_BASE=http://model-runner.docker.internal/engines/v1 \<br />
  -e OPENAI_BASE_URL=http://model-runner.docker.internal/engines/v1 \<br />
  -e OPENAI_API_KEY=sk-docker-model-runner \<br />
  ghcr.io/open-webui/open-webui:latest</p>
<p>The model-runner.docker.internal hostname is a special DNS entry that Docker Desktop containers use to reach the Model Runner running on the host, no port-forwarding gymnastics required. If you use Docker CE, you’ll see the docker/model-runner container address instead of model-runner.docker.internal.</p>
<p>Open your browser at http://localhost:3000, create a local account (it stays offline), and you&#8217;ll land on the chat interface.</p>
<p>Tip: Want to run it in the background? Add --detach:</p>
<p>docker model launch openwebui --detach</p>
<p>Prefer Docker Compose? See the full setup here: https://docs.docker.com/ai/model-runner/openwebui-integration/</p>
<p>Step 3: Configure Open WebUI for Image Generation</p>
<p>Open WebUI already uses Docker Model Runner for text chat automatically (it reads the OPENAI_API_BASE env var). For image generation you need to point it at the images endpoint too, a 30-second job in the settings UI.</p>
<p>Go to http://localhost:3000/admin/settings/images</p>
<p>Enable Image Generation</p>
<p>Fill in the fields:</p>
<table>
<tr><th>Field</th><th>Value</th></tr>
<tr><td>Model</td><td>stable-diffusion</td></tr>
<tr><td>API Base URL</td><td>http://model-runner.docker.internal/engines/diffusers/v1</td></tr>
<tr><td>API Key</td><td>whatever-you-want</td></tr>
</table>
<p>Click Save.</p>
<p>Why the dummy API key? Docker Model Runner is a local service and doesn&#8217;t require authentication. The key is only there because Open WebUI&#8217;s form requires one. Any non-empty string works.</p>
<p>Step 4: Pull a Chat Model</p>
<p>Open WebUI is also a full-featured chat interface, and one of its best tricks is letting you ask the LLM to generate an image right from the conversation. For that to work, you need a language model too.</p>
<p># Lightweight option — runs on almost any machine<br />
docker model pull smollm2</p>
<p># Recommended — more capable, better at understanding creative prompts<br />
docker model pull gpt-oss</p>
<p>Both will show up automatically in the Open WebUI model selector. Use smollm2 if you&#8217;re tight on RAM, or gpt-oss if you want richer, more creative responses before image generation.</p>
<p>No extra configuration needed, Open WebUI picks up text models from the same OPENAI_API_BASE endpoint it was already configured with.</p>
<p>Step 5: Generate Your First Image</p>
<p>Head back to the main chat view. You&#8217;ll notice a small image icon in the message input bar.</p>
<p>Click it to toggle image generation mode, type your prompt, and send.</p>
<p>Try something like:</p>
<p>Create an image of a whale.</p>
<p>The first request takes a little longer while the backend loads the model into memory. After that, subsequent images generate much faster.</p>
<p>Open WebUI will automatically route image-generation requests to the diffusers backend and text requests to the language model, seamlessly, in the same conversation.</p>
<p>Step 6: Generate Images Directly via the API</p>
<p>For developers who want to integrate image generation into their own apps, Docker Model Runner exposes the standard OpenAI Images API directly:</p>
<p>curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \<br />
  -H &quot;Content-Type: application/json&quot; \<br />
  -d &#039;{<br />
    &quot;model&quot;: &quot;stable-diffusion&quot;,<br />
    &quot;prompt&quot;: &quot;A cat sitting on a couch&quot;,<br />
    &quot;size&quot;: &quot;512x512&quot;<br />
  }&#039;</p>
<p>The response follows the OpenAI Images API format exactly:</p>
<p>{<br />
  &quot;created&quot;: 1742990400,<br />
  &quot;data&quot;: &#x5B;<br />
    {<br />
      &quot;b64_json&quot;: &quot;/9j/4AAQSkZJRgABAQAAAQABAAD/2wBD&#8230;&quot;<br />
    }<br />
  ]<br />
}</p>
<p>Decode and save the image:</p>
<p>curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \<br />
  -H &quot;Content-Type: application/json&quot; \<br />
  -d &#039;{<br />
    &quot;model&quot;: &quot;stable-diffusion&quot;,<br />
    &quot;prompt&quot;: &quot;A cat sitting on a couch&quot;,<br />
    &quot;size&quot;: &quot;512x512&quot;<br />
  }&#039; | jq -r &#039;.data&#x5B;0].b64_json&#039; | base64 -d &gt; cat.png</p>
<p>open cat.png</p>
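<p>If you are calling the endpoint from an application rather than a shell, the decode step looks much the same. Here is a minimal Python sketch, assuming the response has the shape shown above; the function names and the output file naming are illustrative choices, not part of Docker Model Runner.</p>

```python
import base64
import json


def decode_images(response_body):
    """Return the raw image bytes for every entry in an Images API response."""
    payload = json.loads(response_body)
    return [base64.b64decode(item["b64_json"]) for item in payload["data"]]


def save_images(response_body, prefix="image"):
    """Write each decoded image to <prefix>-<index>.png and return the paths."""
    paths = []
    for index, raw in enumerate(decode_images(response_body)):
        path = f"{prefix}-{index}.png"
        with open(path, "wb") as handle:
            handle.write(raw)
        paths.append(path)
    return paths
```

<p>Unlike the jq one-liner, this handles every entry in data, which matters once you ask for more than one image per request.</p>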
<p>Advanced Parameters</p>
<p>The API supports all the parameters you&#8217;d expect from a full diffusers pipeline:</p>
<p>curl http://localhost:12434/engines/diffusers/v1/images/generations \<br />
  -X POST \<br />
  -H &quot;Content-Type: application/json&quot; \<br />
  -d &#039;{<br />
    &quot;model&quot;: &quot;stable-diffusion&quot;,<br />
    &quot;prompt&quot;: &quot;A serene Japanese zen garden, cherry blossoms, koi pond, photorealistic&quot;,<br />
    &quot;negative_prompt&quot;: &quot;blurry, low quality, distorted, watermark&quot;,<br />
    &quot;size&quot;: &quot;768x512&quot;,<br />
    &quot;n&quot;: 2,<br />
    &quot;num_inference_steps&quot;: 30,<br />
    &quot;guidance_scale&quot;: 7.5,<br />
    &quot;seed&quot;: 42,<br />
    &quot;response_format&quot;: &quot;b64_json&quot;<br />
  }&#039; | jq -r &#039;.data&#x5B;0].b64_json&#039; | base64 -d &gt; garden.png</p>
<table>
<tr><th>Parameter</th><th>What it does</th></tr>
<tr><td>prompt</td><td>What you want in the image</td></tr>
<tr><td>negative_prompt</td><td>What you want to avoid</td></tr>
<tr><td>size</td><td>Resolution as WIDTHxHEIGHT (e.g., 512x512, 768x512)</td></tr>
<tr><td>n</td><td>Number of images to generate (1–10)</td></tr>
<tr><td>num_inference_steps</td><td>More steps = higher quality, slower (default: 50)</td></tr>
<tr><td>guidance_scale</td><td>How closely to follow the prompt (1–20, default: 7.5)</td></tr>
<tr><td>seed</td><td>Integer for reproducible results; omit for random</td></tr>
</table>
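<p>When you build these requests in code, it can help to check the values against the ranges in the table before sending anything. The sketch below does that in Python. The function name is hypothetical, and the checks simply mirror the table above; the server may enforce different limits.</p>

```python
import re

SIZE_PATTERN = re.compile(r"^\d+x\d+$")


def build_image_request(prompt, model="stable-diffusion", size="512x512", n=1,
                        num_inference_steps=50, guidance_scale=7.5,
                        negative_prompt=None, seed=None):
    """Build a JSON-ready payload, validating values against the table above."""
    if not SIZE_PATTERN.match(size):
        raise ValueError("size must look like WIDTHxHEIGHT, e.g. 512x512")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if not 1 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be between 1 and 20")
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")

    payload = {
        "model": model,
        "prompt": prompt,
        "size": size,
        "n": n,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }
    # Optional fields are left out entirely when unset, so that an
    # omitted seed means a random one.
    if negative_prompt is not None:
        payload["negative_prompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = int(seed)
    return payload
```

<p>The returned dictionary can be serialized with json.dumps and posted to the images/generations endpoint used in the curl examples.</p>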
<p>        Pro tip: Set a seed while you&#8217;re iterating on a prompt. Once you&#8217;re happy with the composition, remove it to get unique variations.</p>
<p>Under the Hood: How the Diffusers Backend Works</p>
<p>When you first request an image, Docker Model Runner:</p>
<p>Unpacks the DDUF file: extracts the model components and loads them via DiffusionPipeline.from_pretrained()</p>
<p>Starts a FastAPI server: this is the server that Open WebUI and your curl commands talk to through Docker Model Runner</p>
<p>The server is installed on first use by downloading a self-contained Python environment from Docker Hub (version-pinned, so updates are explicit). It lives at ~/.docker/model-runner/diffusers/ — no Python version conflicts, no virtualenv setup.</p>
<p>Troubleshooting</p>
<p>The model takes forever to load on first use. That&#8217;s normal: the model weights are being loaded from disk and transferred to GPU memory. Subsequent requests in the same session are much faster because the backend stays warm.</p>
<p>I get a &#8220;No model loaded&#8221; 503 error. Make sure the model is fully downloaded (docker model list) and that you&#8217;re sending the correct model name in the model field.</p>
<p>Image quality is poor or generations are too fast. Increase num_inference_steps (try 20–50 steps). Higher values are slower but give sharper results.</p>
<p>Open WebUI can&#8217;t connect to the image endpoint. Double-check the URL in Admin Panel → Settings → Images. Inside a Docker container it must be http://model-runner.docker.internal/engines/diffusers/v1, not localhost.</p>
<p>Conclusion and What&#8217;s Next</p>
<p>Docker Model Runner makes local image generation simple. It packages and serves image models through an OpenAI-compatible API, while Open WebUI provides an easy chat interface on top. Together, they let you generate images privately on your own machine, either through the browser or directly through the API, without relying on a cloud service.</p>
<p>This feature opens up a lot of possibilities:</p>
<p>Multimodal workflows: Chat with a text model about an idea, then immediately generate an image of it — in the same Open WebUI conversation</p>
<p>RAG + image generation: Build a pipeline that generates illustrations for your documents</p>
<p>Custom models: The diffusers backend supports any DDUF-packaged model, so you can package your own fine-tuned models using Docker&#8217;s model packaging tools</p>
<p>The Docker Model Runner team is actively expanding model support on Docker Hub. Check docker model search for the latest available models.</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/">Generate Images Locally with Docker Model Runner and Open WebUI</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/generate-images-locally-with-docker-model-runner-and-open-webui/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Precision Container Security with Docker and Black Duck</title>
		<link>https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/</link>
		<comments>https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/#comments</comments>
		<pubDate>Tue, 05 May 2026 14:41:29 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/</guid>
		<description><![CDATA[<p>The complexity of modern containerized applications often leaves developers drowning in a sea of &#8220;noise&#8221;—vulnerabilities that exist in the file system but pose zero actual risk to the application. The integration between Black Duck and Docker Hardened Images (DHI) provides a definitive answer to this challenge. By combining Docker’s secure-by-default foundations, using VEX (Vulnerability Exploitability eXchange)&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/">Precision Container Security with Docker and Black Duck</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>The complexity of modern containerized applications often leaves developers drowning in a sea of &#8220;noise&#8221;—vulnerabilities that exist in the file system but pose zero actual risk to the application. The integration between Black Duck and Docker Hardened Images (DHI) provides a definitive answer to this challenge. By combining Docker’s secure-by-default foundations, using VEX (Vulnerability Exploitability eXchange) statements, and Black Duck’s industry-leading analysis engines, teams can now automatically separate base-layer noise from application-layer risk.</p>
<p>TL;DR: The Black Duck + Docker Value Proposition</p>
<p>Zero-Config Recognition: Black Duck automatically identifies DHI base images during scanning without manual tagging.</p>
<p>Precision Triage: Leverage Docker-provided VEX data and Black Duck Security Advisories (BDSAs) to ignore &#8220;not affected&#8221; base image vulnerabilities.</p>
<p>Comprehensive Vulnerability Intelligence: Combine Docker&#8217;s exploitability data with Black Duck’s proprietary research to reduce triage costs and eliminate false positives.</p>
<p>Compliance on Autopilot: Export high-fidelity SBOMs enriched with VEX exploitability status to support the vulnerability transparency obligations in global regulations like the European Cyber Resilience Act (CRA) and in industry standards such as those mandated by the FDA for medical devices and by governmental agencies.</p>
<p>A Comprehensive Strategy for Software Integrity</p>
<p>Black Duck’s strategy for container security is built on a &#8220;Better Together&#8221; philosophy, leveraging two distinct but complementary analysis technologies to provide 360-degree visibility:</p>
<p>Black Duck Binary Analysis (BDBA): Our primary integration for DHI was released on April 14, 2026. BDBA provides deep, signature-based inspection of compiled assets within DHI, verifying the &#8220;as-shipped&#8221; state of your containers without needing access to source code.</p>
<p>Black Duck Software Composition Analysis (SCA): Soon, Black Duck will extend this DHI identification and verification support to our flagship SCA platform. This upcoming release will unify DHI intelligence with source-side dependency management, providing a single, comprehensive Software Bill of Materials (SBOM) across the entire SDLC.</p>
<p>Deep Visibility with Binary Match &amp; SCA Roadmap</p>
<p>While traditional scanners often rely on simple package manager manifests, Black Duck looks deeper.</p>
<p>Signature-Based Accuracy: Using BDBA, Black Duck identifies DHI components by their binary &#8220;fingerprint,&#8221; ensuring accuracy even if package metadata is stripped or modified.</p>
<p>The Path to Unified SCA: Our roadmap includes bringing these DHI insights directly into Black Duck SCA. This will allow security teams to apply the same governance policies to DHI-based containers as they do to their application source code, all within a single pane of glass.</p>
<p>Layer-Specific Analysis: Easily pivot between the hardened base image and your custom application layers to understand exactly where a risk was introduced.</p>
<p>Dynamic Risk Triage: VEX + BDSA Intelligence</p>
<p>The most significant drain on developer productivity is manual triage. This integration operationalizes &#8220;Reachability&#8221; and &#8220;Exploitability&#8221; through automated data streams:</p>
<p>VEX Integration: Black Duck ingests Docker’s VEX statements as a primary source of truth. If Docker confirms a base image vulnerability is &#8220;not_affected&#8221; due to the hardening process, Black Duck automatically suppresses the alert.</p>
<p>Beyond the NVD: While competitors rely on the National Vulnerability Database (NVD), Black Duck uses BDSAs. These advisories often arrive days before the NVD, providing deeper exploitability context and specific remediation paths.</p>
<p>Bulk Policy Enforcement: Security teams can set global Black Duck policies to automatically &#8220;ignore&#8221; any vulnerability backed by a &#8220;not_affected&#8221; vulnerability status statement from Docker, potentially clearing thousands of non-actionable alerts with zero manual effort.</p>
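<p>To make the triage rule concrete, here is a minimal sketch of the idea in Python. It is not Black Duck's implementation or API: the dictionary shapes, field names, and placeholder CVE IDs are assumptions made for illustration. The only detail taken from the text is that base image findings with a &#8220;not_affected&#8221; VEX status are suppressed.</p>

```python
def triage(findings, vex_statements):
    """Split scanner findings into (actionable, suppressed) using VEX status.

    Both inputs are lists of plain dicts with hypothetical field names.
    """
    not_affected = {
        statement["vulnerability"]
        for statement in vex_statements
        if statement["status"] == "not_affected"
    }
    actionable, suppressed = [], []
    for finding in findings:
        # Only base image findings covered by a not_affected statement are
        # suppressed; application-layer findings always stay actionable.
        if finding["layer"] == "base" and finding["cve"] in not_affected:
            suppressed.append(finding)
        else:
            actionable.append(finding)
    return actionable, suppressed
```

<p>Restricting suppression to the base layer is a deliberate choice in this sketch: a vendor statement about the hardened image says nothing about the same CVE showing up in a package you added yourself.</p>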
<p>Operationalizing Security with Automated Workflows</p>
<p>Black Duck does more than find issues; it manages the lifecycle of the container:</p>
<p>SLA Tracking: Automatically trigger Jira tickets or email alerts when a vulnerability in a custom layer exceeds your organization’s risk threshold.</p>
<p>Pipeline Gating: Use the Black Duck Detect CLI to fail builds only when reachable or unaddressed risks are found in your application code, keeping the CI/CD pipeline moving.</p>
<p>Continuous Patching: For Enterprise DHI users, Black Duck verifies when a patched base image is mirrored to your private repository, confirming mitigation without requiring a developer to manually &#8220;re-scan&#8221; to prove compliance.</p>
<p>Get started for free</p>
<p>Check the Docker documentation on VEX at https://docs.docker.com/dhi/core-concepts/vex/</p>
<p>Learn more about Docker&#8217;s approach to CVE exploitability and auditability at https://www.docker.com/blog/why-we-chose-the-harder-path-docker-hardened-images-one-year-later/</p>
<p>Read Black Duck&#8217;s VEX documentation at https://documentation.blackduck.com/bundle/bd-hub/page/Reporting/vexReport_global.html</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/">Precision Container Security with Docker and Black Duck</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/precision-container-security-with-docker-and-black-duck/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Virtual Agent team at Docker: How the Coding Agent Sandboxes team uses a fleet of agents to ship faster</title>
		<link>https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/</link>
		<comments>https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/#comments</comments>
		<pubDate>Sat, 02 May 2026 02:41:34 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/</guid>
		<description><![CDATA[<p>I work on Coding Agent Sandboxes, aka “sbx” at Docker. The project provides secure, microVM-based isolation for running AI coding agents like Claude Code, Gemini, Codex, Docker Agent and Kiro. Agents get full autonomy inside a sandbox (their own Docker daemon, network, filesystem) without touching your host system. Over the past couple of weeks, we&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/">A Virtual Agent team at Docker: How the Coding Agent Sandboxes team uses a fleet of agents to ship faster</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>I work on Coding Agent Sandboxes, aka “sbx” at Docker. The project provides secure, microVM-based isolation for running AI coding agents like Claude Code, Gemini, Codex, Docker Agent and Kiro. Agents get full autonomy inside a sandbox (their own Docker daemon, network, filesystem) without touching your host system. Over the past couple of weeks, we built something on top of it: a virtual team of seven AI agent roles that test the product, triage issues, post release notes, and even fix bugs, all running autonomously in CI. We call it the Fleet.</p>
<p>The Fleet is built on Claude Code skills: markdown files that give an agent a persona, a set of responsibilities, and the tools it’s allowed to use. Think of a skill not as a script that says “run these steps,” but as a role description that says “you are the build engineer, here’s what you know and how you make decisions.” That distinction matters because agents need judgment, not just instructions. When a test fails unexpectedly, a script stops. A role investigates.</p>
<p>The same skill file, the same behavior, whether it runs on a developer’s laptop or in CI.</p>
<p>Local First, CI Second</p>
<p>Coding Agent Sandboxes is a CLI tool (sbx) that manages sandbox lifecycles: create, start, stop, remove, configure networking, mount workspaces, and more. It runs on macOS, Linux and Windows. Every release needs testing on each of those platforms, across upgrade paths between versions, and under sustained load to catch resource leaks. The team also needs daily visibility into what shipped, and a way to triage the growing issue backlog without it becoming a full-time job.</p>
<p>We could have written traditional test scripts and reporting tools. Instead, we built agent roles that handle these tasks autonomously, both on our laptops and in CI.</p>
<p>The design principle behind the Fleet is simple: every skill runs on your machine first.</p>
<p>When we built the /cli-tester skill (the Fleet’s exploratory tester, more on that below), we didn’t start by writing a GitHub workflow. We started by invoking it locally. We watched it build the binaries, exercise the CLI commands, find issues, and report them. We tweaked the skill until it did the right thing in our terminal. Only then did we wire it into a workflow.</p>
<p>This matters because the alternative is painful. If you build CI-only agents, you debug them through commit-push-wait-read-logs cycles. Every iteration takes minutes. When the skill runs locally first, the iteration takes seconds. You see the agent think. You see where it gets confused. You fix the skill file, re-invoke, and try again.</p>
<p>CI is just another runtime for the same skill. The /cli-tester that runs nightly on macOS, Linux and Windows runners is the exact same skill we invoke from our terminals. The workflow sets up the environment, checks out the code, and calls the skill. That’s it. No separate “CI version.” No translation layer. One skill, two runtimes.</p>
<p>This is what makes the Fleet practical. You’re not maintaining two systems. You’re maintaining one set of skills and a set of workflows that invoke them.</p>
<p>The Roster</p>
<p>The skills directory has 20 skills in total. Most are foundational knowledge (architecture, code style, Go conventions, security, testing patterns). Seven of them are the Fleet: the roles that run autonomously on CI. Each one is a SKILL.md file that describes a persona, not a procedure.</p>
<p>/build-engineer is the foundation that other skills stand on. It references topic files for building binaries, container templates, and local installs. It knows the Taskfile.yml, the docker-bake.hcl, and the platform-specific build flags. It doesn’t run on CI by itself. Other skills load it when they need to compile anything.</p>
<p>/project-manager is the team’s memory. It deduplicates findings against existing issues and PRs before creating new ones, manages the GitHub Projects board (setting status, priority, and labels), and handles interactive triage when running locally. On CI, it switches to fully automatic mode: no questions asked, just deduplicate and create. It uses GraphQL pagination to scan the entire project board, not just the first page. Every other skill that discovers something calls the project-manager before opening an issue.</p>
<p>/product-owner translates commit-speak into human language. It collects merged PRs from a date range, categorizes them (New Features, Bug Fixes, Improvements, Documentation, Maintenance), and rewrites each one in plain English. “feat(cli): add TZ env passthrough” becomes “Docker Sandboxes now automatically use your local timezone.” On CI, it outputs Slack Block Kit JSON. Locally, it renders a markdown table. It filters out noise from bots (Dependabot bumps, workflow-only changes) and skips posting when there’s nothing meaningful to report.</p>
<p>/cli-tester is the exploratory tester of the Fleet, and it’s the largest skill by far. Unlike traditional test scripts that assert expected output and fail on any deviation, the cli-tester investigates what it finds. When output doesn’t match expectations, it asks why before filing a bug.</p>
<p>It defines 52+ test scenarios organized into 14 tiers: Core Lifecycle, Agent Smoke, Workspace, Network Policy, Sandbox Features, Blueprint, CLI UX, Environment, Code Tasks, Agent Network, Reliability, Collaboration, Error Recovery, and Human-Only (skipped in CI). It builds the binaries through the build-engineer, triages findings through the project-manager, and loads product scenarios defined by the actual Product Manager on the team. It monitors disk space during testing, posts an executive summary to Slack when it finishes, and runs nightly on CI across macOS, Linux and Windows.</p>
<p>It also powers a slash command on GitHub. When someone comments /cli-tester-review on a pull request, CI spins up three runners (macOS, Linux and Windows), each loading the skill to exercise the PR’s changes on that platform. The agents explore the code, run the scenarios, and post their findings as comments directly on the pull request.</p>
<p>/performance-tester runs in two modes. Lifecycle Endurance repeatedly cycles create/stop/rm to detect reliability issues and resource leaks, producing xUnit JSON output. Code Exploration Benchmark clones a real Git repository and compares host-vs-sandbox I/O performance and Claude Code session behavior. Both modes measure disk usage over time and flag regressions. The goal is catching the slow degradation that no single test run would notice.</p>
<p>/upgrade-tester runs a four-phase test plan. Phase A creates pre-upgrade state (sandboxes, configurations). Phase B installs the new version. Phase C verifies everything still works after the upgrade. Phase D optionally downgrades and verifies again. It takes two version tags as input, builds the binaries for each, creates VMs, and produces an executive summary with pass/fail per phase. Upgrade regressions are the kind of bug that’s invisible in a single-version test suite.</p>
<p>/software-engineer operates in two modes. Reactive: when someone adds the agent-fix label to a GitHub issue, a macOS runner picks it up and runs a ralph-loop to work the issue, contributing a PR with minimal, focused changes. Proactive: weekly, it runs in architect mode, scanning the codebase for quality issues, producing up to five findings, triaging them through the project-manager, then spawning three macOS runners in parallel to fix three of them. Each runner delivers a PR targeting a specific simplification or tech-debt reduction.</p>
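<p>The categorization step described for /product-owner can be pictured with a small Python sketch. This is an assumption-heavy illustration, not the skill itself: the skill is a role description that applies judgment, whereas the lookup table, the regular expression, the bot login, and every PR title except the "feat(cli)" one quoted above are hypothetical.</p>

```python
import re

# Assumed mapping from Conventional Commits types to the five categories
# named in the text. The real skill is not a lookup table.
CATEGORY_BY_TYPE = {
    "feat": "New Features",
    "fix": "Bug Fixes",
    "perf": "Improvements",
    "refactor": "Improvements",
    "docs": "Documentation",
}
DEFAULT_CATEGORY = "Maintenance"
BOT_AUTHORS = {"dependabot[bot]"}  # assumed bot login

# type, optional (scope), optional "!", colon, subject
TITLE_RE = re.compile(r"^([a-z]+)(?:\([^)]*\))?!?:\s*(.+)$")


def categorize(prs):
    """Group (author, title) pairs by category, skipping bot-authored PRs."""
    grouped = {}
    for author, title in prs:
        if author in BOT_AUTHORS:
            continue
        match = TITLE_RE.match(title)
        pr_type = match.group(1) if match else ""
        category = CATEGORY_BY_TYPE.get(pr_type, DEFAULT_CATEGORY)
        grouped.setdefault(category, []).append(title)
    return grouped


prs = [
    ("alice", "feat(cli): add TZ env passthrough"),
    ("dependabot[bot]", "chore(deps): bump example from 1.0 to 1.1"),
    ("bob", "fix: handle missing workspace"),
    ("carol", "docs: describe network policy"),
]
grouped = categorize(prs)
for category, titles in grouped.items():
    print(category, titles)
```

<p>Rewriting each title into plain English is the part that needs a language model; the sketch only covers the grouping and the bot filter.</p>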
<p>Skills That Compose</p>
<p>Individual skills are useful. Skills that load other skills are a team.</p>
<p>The seven Fleet roles sit on top of thirteen foundational skills: architecture, code style, Go conventions, software design, security, testing patterns, development workflow, git worktrees, and others. The foundational skills encode project knowledge. The Fleet roles encode behavior. A Fleet role loads the foundational skills it needs, the same way a new team member reads the project’s contributing guide before writing code.</p>
<p>The /cli-tester doesn’t know how to build binaries. It loads the /build-engineer for that. It doesn’t know whether the bug it found is a duplicate. It loads the /project-manager to check. The tester focuses on testing. The builder focuses on building. The manager focuses on triaging. Each role stays in its lane, and the composition creates something none of them could do alone.</p>
<p>The /software-engineer follows the same pattern. It loads the /build-engineer so it can compile the project, and it loads coding best practices and software design conventions so its output meets the team’s standards. The skill doesn’t try to encode everything. It delegates to the foundational skills.</p>
<p>The /performance-tester loads the /cli-tester, extending it with duration and metrics. Instead of duplicating the testing logic, it reuses it and adds a measurement layer on top.</p>
<p>This is the skills-as-roles principle in practice. When you design skills as personas with clear responsibilities (instead of step-by-step commands), they compose naturally. A tester that loads a builder and a manager is doing the same thing a human tester does: asking a colleague to compile the project and checking with the PM before filing a bug. The difference is that the “asking” happens through skill composition instead of a Slack message.</p>
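<p>One way to picture skill composition is as a small dependency graph that is walked dependencies-first. The Python sketch below is only a mental model: the LOADS table is assembled from the relationships described above (foundational skills omitted), and the resolve helper is hypothetical. Claude Code skills are not loaded by a resolver like this.</p>

```python
# Hypothetical "loads" graph built from the relationships described above.
LOADS = {
    "cli-tester": ["build-engineer", "project-manager"],
    "performance-tester": ["cli-tester"],
    "software-engineer": ["build-engineer"],
    "build-engineer": [],
    "project-manager": [],
}


def resolve(skill, loads, seen=None, stack=()):
    """Return skills in load order (dependencies first), rejecting cycles."""
    if seen is None:
        seen = []
    if skill in stack:
        raise ValueError("cycle: " + " loads ".join(stack + (skill,)))
    for dep in loads.get(skill, []):
        resolve(dep, loads, seen, stack + (skill,))
    if skill not in seen:
        seen.append(skill)
    return seen


order = resolve("performance-tester", LOADS)
print(order)
```

<p>The cycle check matters once skills load each other freely: two roles that each load the other would otherwise recurse forever.</p>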
<p>The Ralph-Loop Is the Engine</p>
<p>The Ralph Wiggum loop is a pattern popularized by Geoffrey Huntley in 2025: a Bash loop that keeps feeding an AI coding agent the same task until the work is done. At its simplest, it’s while :; do cat PROMPT.md | claude-code ; done. Each iteration spawns a fresh agent with a clean context window. The agent reads the task, implements one piece, runs the tests, commits if they pass, and exits. The loop restarts, and the next iteration picks up where the previous one left off. Instead of hoping for first-try perfection, you design for iteration.</p>
<p>Our implementation of this pattern is called a Ralph-loop. The Fleet skills define what each agent role knows. The Ralph-loop defines how the iteration runs.</p>
<p>Our Ralph-loop is a composite GitHub Action backed by a shell script that adds a layer on top of the basic pattern: a separate worker and reviewer. It fetches the issue context, creates a working branch, and iterates: the worker implements changes and writes a summary, the reviewer evaluates the diff and decides SHIP or REVISE. If REVISE, the feedback goes back to the worker for another pass. Up to five iterations by default. If the reviewer says SHIP, the loop pushes the branch, creates a PR, and comments on the original issue.</p>
<p>The worker and reviewer run as separate Claude invocations with different model configurations. The worker uses Opus for implementation. The reviewer uses Opus with 1M context to evaluate the full diff against the task requirements. Each one loads the /software-engineer skill (which in turn loads the build-engineer and coding best practices), so they share the same project knowledge but apply it from different perspectives.</p>
<p>Separating generation from evaluation is deliberate. The same agent that wrote the code shouldn’t evaluate whether the code is good. It’s the oldest principle in quality assurance: the person who built the thing shouldn’t be the only person who tests it. The worker’s job is to solve the problem. The reviewer’s job is to decide whether the problem is actually solved.</p>
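<p>The control flow of the worker and reviewer loop can be sketched in a few lines of Python. This is not the team's ralph-loop.sh: the function names and the stand-in worker and reviewer are hypothetical, and the real loop also fetches the issue, manages the branch, and opens the PR. Only the SHIP/REVISE verdicts and the default cap of five iterations come from the text.</p>

```python
from typing import Callable, Optional, Tuple

MAX_ITERATIONS = 5  # default cap mentioned in the text


def ralph_loop(
    task: str,
    worker: Callable[[str, Optional[str]], str],
    reviewer: Callable[[str, str], Tuple[str, str]],
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[bool, int]:
    """Alternate worker and reviewer until SHIP or the iteration cap."""
    feedback = None
    for iteration in range(1, max_iterations + 1):
        summary = worker(task, feedback)  # separate invocation each pass
        verdict, feedback = reviewer(task, summary)
        if verdict == "SHIP":
            return True, iteration
    return False, max_iterations


# Stand-ins for the two agent invocations (purely illustrative).
def fake_worker(task, feedback):
    return "v2" if feedback else "v1"


def fake_reviewer(task, summary):
    if summary == "v2":
        return "SHIP", ""
    return "REVISE", "add a test"


result = ralph_loop("fix issue", fake_worker, fake_reviewer)
print(result)
```

<p>Passing the worker and reviewer in as callables keeps generation and evaluation separate by construction: the loop never lets the worker judge its own output.</p>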
<p>The Ralph-loop works locally too. The same ralph-loop.sh script that CI calls can be invoked from your terminal with --issue-number 42. Locally, it parses CLI arguments instead of reading environment variables, and outputs plain text instead of streaming JSON. Same loop, same prompts, same iteration pattern. We debugged the worker and reviewer prompts on our laptops before they ever ran in CI.</p>
<p>The workflows handle scheduling and triggering: nightly cron for the testers, label events for the software-engineer, weekly cron for the architect mode. The Ralph-loop handles the iteration pattern. The skills handle the domain knowledge. Three layers, each with a clear job.</p>
<p>This separation is what made the Fleet possible to build in a couple of weeks. We didn’t have to reinvent the automation loop for every role. The Ralph-loop already knew how to iterate. We just needed to give each role its own skill file and wire the triggers.</p>
<p>What the Fleet Ships</p>
<p>The Fleet has been running for a couple of weeks. Here’s what it delivers.</p>
<p>Automated issue resolution. A team member labels an issue with agent-fix. CI grabs a macOS runner, reads the issue, and starts working. The result is a pull request that addresses the issue. Not every PR lands without changes, but the first draft is there for review, often within the hour.</p>
<p>Daily release notes. The product-owner traverses the git log every day and posts a Slack summary for stakeholders. No one has to manually compile “what shipped this week.” The stakeholders see progress in real time, at the speed the team actually moves.</p>
<p>Nightly exploratory testing. The cli-tester runs every night on each supported platform. It loads the product scenarios that the Product Manager has defined, exercises the CLI, and opens issues for anything it finds. Before opening an issue, it checks for duplicates through the project-manager. When it finishes, it posts a Slack message with the results.</p>
<p>Performance and upgrade testing. The performance-tester and upgrade-tester run on CI across the supported platforms. Disk usage regressions, behavioral differences between sandbox and non-sandbox modes, and version compatibility issues get caught before they reach a human reporter.</p>
<p>Weekly tech-debt reduction. Every week, the software-engineer runs in architect mode. It reviews the codebase, identifies three spots where code can be simplified or legacy patterns can be cleaned up, spawns three parallel runners, and delivers three PRs. Each one is a small, focused improvement. Over time, they compound.</p>
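<p>The duplicate check mentioned under nightly exploratory testing is the piece that keeps the issue tracker usable. As a rough Python sketch, assuming a deliberately naive rule (two titles are duplicates when they match after normalization) and made-up issue titles: the real /project-manager skill scans the project board and applies judgment, so treat this only as the simplest possible baseline.</p>

```python
import re


def normalize(title):
    """Lowercase, drop punctuation, collapse whitespace."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(words)


def is_duplicate(candidate, existing_titles):
    """True if the candidate equals an existing title after normalization."""
    key = normalize(candidate)
    return any(normalize(title) == key for title in existing_titles)


# Hypothetical issue titles.
existing = ["sbx stop: hangs on Windows", "Docs: typo in README"]

print(is_duplicate("SBX stop hangs on windows!", existing))
print(is_duplicate("sbx rm leaves stale network", existing))
```

<p>An exact match after normalization misses paraphrased reports, which is one reason the Fleet delegates this decision to an agent role rather than a string comparison.</p>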
<p>What We Don’t Automate</p>
<p>The Fleet creates pull requests. It does not merge them.</p>
<p>That’s the trust boundary, and it’s deliberate. Merge decisions stay with humans. So do architectural choices, scope decisions, and prioritization. The agents do the work. The team decides what work matters and whether the output meets the bar.</p>
<p>The supervision model scales the same way it works on a developer’s laptop. When we run multiple agents locally in parallel worktrees, we review their output before merging. With the Fleet, the team supervises seven agent roles running on CI. The shape of the oversight is the same: review the output, approve or adjust, move on. The difference is that the agents don’t need anyone’s laptop to start working.</p>
<p>The Fleet is not replacing the team. It’s extending it. Seven roles that handle repetitive, well-defined work so humans can focus on work that requires judgment, context, and taste. The Fleet has many arms, but the team still steers the ship.</p>
<p>What We Learnt Building the Fleet</p>
<p>Start with the foundation, not the flashiest skill. We started with the /cli-tester because testing the CLI felt like the highest-value target. But it needed to build binaries, triage issues, and load product scenarios, all things that depended on other skills we hadn&#8217;t written yet. We should have started with the /build-engineer, the skill everything else stands on. The second skill was better because of what we learned from the first. Don&#8217;t design the full fleet upfront.</p>
<p>Build locally first, deploy to CI second. The commit-push-wait-read-logs cycle is where velocity goes to die. If you can&#8217;t debug a skill in your terminal, it&#8217;s not ready for a workflow. Some behaviors only surface on CI runners (different OS, permissions, network constraints), and those iterations cost hours of wall-clock time. Minimize what can only be tested in CI.</p>
<p>Write skills as roles, not scripts. Ask yourself: &#8220;If a new team member joined tomorrow with this exact role, what would I tell them?&#8221; What do they need to know? What tools can they use? How should they handle ambiguity? That conversation is your SKILL.md. &#8220;You are the build engineer, here&#8217;s what you know&#8221; produces better judgment than &#8220;run these five steps.&#8221; When something unexpected happens, a role investigates. A script stops.</p>
<p>Compose skills like you compose teams. The /cli-tester doesn&#8217;t know how to build binaries or triage bugs. It loads the /build-engineer and /project-manager for that. Each role stays in its lane. The composition creates what none of them could do alone.</p>
<p>Separate generation from evaluation. The agent that wrote the code shouldn&#8217;t be the only one that reviews it. Our Ralph-loop uses a worker and a reviewer for a reason: the oldest principle in quality assurance applies to agents too.</p>
<p>Triage matters more than detection. The /cli-tester initially filed issues for every unexpected output. Transient failures, timing-dependent behavior, environment quirks: everything became an issue. The signal-to-noise ratio got bad enough that the team started ignoring findings. Getting the triage right (deduplication, confirming before filing) took longer than building the tester itself.</p>
<p>And one more thing. All Fleet agents, even on ephemeral CI runners, run inside Coding Agent Sandboxes. We test with what our users use.</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/">A Virtual Agent team at Docker: How the Coding Agent Sandboxes team uses a fleet of agents to ship faster</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/a-virtual-agent-team-at-docker-how-the-coding-agent-sandboxes-team-uses-a-fleet-of-agents-to-ship-faster/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>From Security Blocked to Prod Ready: ClickHouse on Docker Hardened Images</title>
		<link>https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/</link>
		<comments>https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/#comments</comments>
		<pubDate>Fri, 01 May 2026 02:41:33 +0000</pubDate>
		<dc:creator><![CDATA[da Agency]]></dc:creator>
				<category><![CDATA[Kubernetes]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/</guid>
		<description><![CDATA[<p>In November 2025, a team self-hosting Langfuse, an open-source LLM observability platform, on Kubernetes uploaded their ClickHouse image to AWS ECR as part of their production preparation. They found that the pipeline scanner had returned three critical vulnerabilities &#8211; not in ClickHouse, but in the base image. Their security team saw the findings and blocked&#8230; <a class="more-link" href="https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/">Continue reading &#8594;</a></p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/">From Security Blocked to Prod Ready: ClickHouse on Docker Hardened Images</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></description>
				<content:encoded><![CDATA[<p>In November 2025, a team self-hosting Langfuse, an open-source LLM observability platform, on Kubernetes uploaded their ClickHouse image to AWS ECR as part of their production preparation. They found that the pipeline scanner had returned three critical vulnerabilities &#8211; not in ClickHouse, but in the base image. Their security team saw the findings and blocked the deployment before it ever reached production.</p>
<p>&#8220;Our security team is not allowing us to take it to production. Please suggest alternatives.&#8221;</p>
<p>vinaygoel586<br />
GitHub Issue #286, November 28, 2025</p>
<p>If you&#8217;ve shipped containers into an enterprise environment recently, this situation will sound familiar. A perfectly functional deployment gets blocked not because something is broken, but because a scanner found CVEs in packages the application never even touches. A day goes into investigating the findings, a risk exception gets written up, and the security team rejects it anyway, because the vulnerabilities are technically real even if they’re practically irrelevant to your workload.</p>
<p>This post is about how Docker Hardened Images (DHI) get you unstuck when a security team blocks the deployment of a container that has CVEs. In this case, we will look specifically at the image for ClickHouse, one of the most widely pulled database images on Docker Hub.</p>
<p>A Quick Word on ClickHouse</p>
<p>ClickHouse is an open-source columnar database built for analytical workloads at scale. It is capable of querying billions of rows and returning results in milliseconds in a way that traditional row-oriented databases simply can&#8217;t match. Companies such as Cloudflare, Uber, and Spotify all run it in production. With over 100 million pulls from Docker Hub, it has become the default infrastructure choice for teams that need serious analytics throughput. The image’s default security posture, though, was designed with developer ease-of-use in mind rather than the hardening that enterprise production environments demand, and that gap is where the trouble starts.</p>
<p>Figure: The layered architecture of ClickHouse</p>
<p>How ClickHouse is Structured</p>
<p>ClickHouse follows a layered architecture designed for analytical speed at scale. SQL queries arrive over HTTP (port 8123) or TCP (port 9000), are parsed into an abstract syntax tree, and pass through the optimizer, which prunes that tree before the pipeline executor picks it up and hands the work off to parallel threads. Beneath the query layer sits the MergeTree storage engine, the heart of ClickHouse, which stores data in columnar .bin files. It uses a sparse primary index to skip irrelevant granules without reading entire columns, and runs background merge processes to compact parts and maintain query performance over time.</p>
<p>At the bottom, storage is pluggable: local disk, S3, HDFS, or Azure Blob, with tiered hot/warm/cold policies to balance cost and latency. In distributed deployments, ClickHouse Keeper (or ZooKeeper) coordinates replication across replicas, while sharding splits data horizontally across nodes allowing the cluster to scale reads and writes independently. The result is a database that processes hundreds of millions of rows per second per server, making it the default choice for teams running serious analytics workloads.</p>
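<p>The sparse primary index is the least intuitive part of that description, so here is a toy Python model of granule skipping. It is a simplification under stated assumptions, not ClickHouse's implementation: keys are plain integers, a granule holds four rows instead of the default index_granularity of 8192, and the build_marks and granules_for_range helpers are hypothetical names.</p>

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 4  # tiny for illustration; ClickHouse defaults to 8192


def build_marks(sorted_keys, granule_rows=GRANULE_ROWS):
    """Keep only the first key of every granule (the sparse index)."""
    return sorted_keys[::granule_rows]


def granules_for_range(marks, lo, hi):
    """Return indexes of granules that may hold keys in [lo, hi].

    The index only knows each granule's first key, so the answer is
    conservative: it can include a granule with no matching row, but it
    never skips a granule that has one.
    """
    first = max(bisect_left(marks, lo) - 1, 0)
    last = bisect_right(marks, hi) - 1
    return list(range(first, last + 1))


keys = list(range(0, 40, 2))   # 20 sorted keys: 0, 2, ..., 38
marks = build_marks(keys)      # first key of each of the 5 granules
selected = granules_for_range(marks, 10, 18)
print(marks)
print(selected)
```

<p>In this toy table a query for keys 10 through 18 touches two of the five granules. The index holds one entry per granule rather than one per row, which is what keeps it small enough to stay in memory.</p>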
<p>The Real Problem: It&#8217;s Not ClickHouse, It&#8217;s the Packaging</p>
<p>The standard clickhouse/clickhouse-server image is built on a full Ubuntu 22.04 base. The base ships with a lot of things ClickHouse doesn&#8217;t need, such as Perl, system utilities, apt itself, and dozens of transitive dependencies that exist in the image simply because Ubuntu brought them along. Many of those packages are outdated, and in many cases Ubuntu maintainers decide not to backport fixes from upstream.</p>
<p>ClickHouse doesn&#8217;t use most of those system utilities. But the CVEs in those packages are real. They show up in Trivy, Grype, and AWS ECR scans, and a scanner has no way to distinguish a vulnerable library that’s never loaded from one that’s actively running in production. Your security team sees critical findings and blocks the deployment, which is the correct thing for them to do given what the scanner is telling them.</p>
<p>The instinct at this point is to argue the case, documenting why each CVE doesn’t apply to your workload, writing risk exceptions and escalating, but that’s a slow process. The only real fix is to remove those unnecessary packages entirely. That&#8217;s what Docker Hardened Images do.</p>
<p>What DHI Actually Changes</p>
<p>Docker Hardened Images for ClickHouse are built around a straightforward question: what does the database actually need to run? Rather than starting from a full Ubuntu base and hoping the CVE count stays manageable, DHI ships only what ClickHouse requires and leaves everything else out.</p>
<p>The most immediate consequence of that is the absence of apt at runtime. Without a package manager, an attacker who gains a foothold in the container has no obvious path to installing tools or establishing persistence. Network utilities like curl and wget are gone for the same reason. The standard clickhouse/clickhouse-server image has been carrying wget with CVE-2021-31879 unpatched since 2021 because, as the Ubuntu maintainer notes, there is no upstream fix: a vulnerability in a tool ClickHouse never needed in the first place. DHI doesn&#8217;t patch it; it simply doesn&#8217;t include wget at all. A shell is still available for operational work, but without the package manager and network tools, there&#8217;s very little an attacker can actually do with it.</p>
<p>To make this practical across different stages of a pipeline, DHI ships two variants. The development image (dev) includes additional tooling that makes local testing and debugging more comfortable. The production image (runtime) strips that back to the absolute minimum, giving you the smallest possible attack surface for the workload that actually faces the world. The intent is that teams adopt the dev variant early in the pipeline and promote the hardened production image through to deployment, rather than discovering the differences at the point where it matters most.</p>
<p>The image also runs as a non-root user (UID 65532) out of the box, with no additional Dockerfile configuration required. On the provenance side, every DHI image ships with SLSA Level 3 attestation, which provides cryptographic proof of exactly what went into the build and how it was produced. Docker&#8217;s security team actively tracks and patches CVEs, and the presence of 2026 CVE IDs in DHI&#8217;s findings is evidence of that remediation happening ahead of public disclosure feeds rather than in response to them.</p>
<p>Getting Started</p>
<p>Before you can pull a DHI image, you need to mirror it to your organization&#8217;s namespace on Docker Hub. This is a one-time setup per image, not per tag, and it means all future updates flow to your namespace automatically.</p>
<p>Log in to Docker Hub and open the DHI catalog</p>
<p>Find clickhouse-server and select Mirror to repository</p>
<p>Follow the on-screen instructions</p>
<p>Authenticate locally: docker login dhi.io</p>
<p>Once that&#8217;s done, you&#8217;re pulling from your own namespace with the same image, same tags, same ClickHouse &#8211; just hardened.</p>
<p>Your first DHI ClickHouse container</p>
<p>docker run --name my-clickhouse-server -d \<br />
  --ulimit nofile=262144:262144 \<br />
  dhi.io/clickhouse-server:26.2-debian13</p>
<p>The --ulimit nofile=262144:262144 flag is a ClickHouse requirement, not a DHI one &#8211; ClickHouse needs high file descriptor limits to operate correctly. Keep it in all your run commands.</p>
<p>Verify it started:</p>
<p>docker exec my-clickhouse-server clickhouse-client \<br />
  --query &quot;SELECT &#039;Hello from DHI ClickHouse!&#039;&quot;</p>
<p>Production setup with persistent storage</p>
<p>For anything beyond local testing, you want volumes and a password:</p>
<p>docker run -d \<br />
  --name my-clickhouse-server \<br />
  --ulimit nofile=262144:262144 \<br />
  -e CLICKHOUSE_PASSWORD=mysecretpassword \<br />
  -v clickhouse-data:/var/lib/clickhouse \<br />
  -v clickhouse-logs:/var/log/clickhouse-server \<br />
  -p 8123:8123 -p 9000:9000 \<br />
  dhi.io/clickhouse-server:26.2-debian13</p>
<p>Note that CLICKHOUSE_PASSWORD is required if you want to access ClickHouse over the network. DHI disables unauthenticated network access by default, which is the right call for any production deployment.</p>
<p>Test it over HTTP:</p>
<p>curl &quot;http://localhost:8123/?query=SELECT%20version()&amp;user=default&amp;password=mysecretpassword&quot;</p>
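<p>If you would rather run the same check from a script, the URL can be built with proper escaping instead of hand-encoding the query. The Python sketch below assumes the container from the previous command is listening on localhost:8123 with the example password; the clickhouse_http_url helper is a hypothetical name, and the block only builds the URL so it runs without a server.</p>

```python
from urllib.parse import quote, urlencode


def clickhouse_http_url(query, user, password, host="localhost", port=8123):
    """Build an HTTP-interface URL with percent-encoded parameters."""
    params = urlencode(
        {"query": query, "user": user, "password": password},
        quote_via=quote,  # encode spaces as %20, as in the curl example
    )
    return f"http://{host}:{port}/?{params}"


url = clickhouse_http_url("SELECT version()", "default", "mysecretpassword")
print(url)

# To send the request against a running container:
#   from urllib.request import urlopen
#   print(urlopen(url).read().decode())
```

<p>Credentials in a URL tend to end up in shell history and access logs. For anything beyond a smoke test, the ClickHouse HTTP interface also accepts them in the X-ClickHouse-User and X-ClickHouse-Key request headers.</p>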
<p>Custom configuration</p>
<p>If you&#8217;re already running ClickHouse with custom XML config, nothing changes. Same format, same mount path:</p>
<p>cat &gt; custom-config.xml &lt;&lt; EOF<br />
&lt;clickhouse&gt;<br />
    &lt;logger&gt;<br />
        &lt;level&gt;information&lt;/level&gt;<br />
        &lt;console&gt;true&lt;/console&gt;<br />
    &lt;/logger&gt;<br />
    &lt;listen_host&gt;0.0.0.0&lt;/listen_host&gt;<br />
&lt;/clickhouse&gt;<br />
EOF</p>
<p>docker run -d \<br />
  --name my-clickhouse-server \<br />
  --ulimit nofile=262144:262144 \<br />
  -v $(pwd)/custom-config.xml:/etc/clickhouse-server/config.d/custom.xml:ro \<br />
  -p 8123:8123 -p 9000:9000 \<br />
  dhi.io/clickhouse-server:26.2-debian13</p>
<p>Running DHI ClickHouse on Kubernetes</p>
<p>For Kubernetes, there&#8217;s one important addition to your pod spec. Since DHI runs as a non-root user, you need to set fsGroup to ensure your persistent volume data is accessible:</p>
<p>spec:<br />
  template:<br />
    spec:<br />
      securityContext:<br />
        runAsNonRoot: true<br />
        runAsUser: 65532     # DHI nonroot user<br />
        fsGroup: 65532       # makes mounted volumes accessible to the nonroot user<br />
      containers:<br />
      - name: clickhouse-server<br />
        image: dhi.io/clickhouse-server:26.2-debian13<br />
        ports:<br />
        - containerPort: 8123<br />
        - containerPort: 9000<br />
        volumeMounts:<br />
        - name: clickhouse-data<br />
          mountPath: /var/lib/clickhouse<br />
        - name: clickhouse-logs<br />
          mountPath: /var/log/clickhouse-server<br />
        resources:<br />
          limits:<br />
            cpu: &quot;2&quot;<br />
            memory: &quot;4Gi&quot;</p>
<p>One thing worth mentioning: ClickHouse&#8217;s default ports 8123 and 9000 are above the 1024 privileged port boundary, so running as nonroot doesn&#8217;t cause any port binding issues.</p>
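<p>The port reasoning above can be expressed as a quick check. This is an illustrative sketch only; is_privileged_port is a hypothetical helper, and it encodes the traditional Linux rule that ports below 1024 need elevated privileges to bind.</p>

```shell
# Returns 0 when the port is below 1024 (traditionally requires root or
# CAP_NET_BIND_SERVICE to bind on Linux), 1 otherwise.
is_privileged_port() {
  [ "$1" -lt 1024 ]
}

# Check the two ClickHouse ports used in the pod spec
for port in 8123 9000; do
  if is_privileged_port "$port"; then
    echo "$port: privileged"
  else
    echo "$port: unprivileged"
  fi
done
# prints:
#   8123: unprivileged
#   9000: unprivileged
```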
<p>The metrics exporter</p>
<p>If you&#8217;re running ClickHouse on Kubernetes and need Prometheus metrics, Docker also ships clickhouse-metrics-exporter &#8211; a hardened image that works with the ClickHouse Operator to expose a /metrics endpoint. It&#8217;s 65% smaller than the standard exporter (10.3 MB vs 29.4 MB) and has 75% fewer layers (5 vs 20). Same data, dramatically smaller surface.</p>
<p>containers:<br />
- name: metrics-exporter<br />
  image: dhi.io/clickhouse-metrics-exporter:0-debian13<br />
  ports:<br />
  - name: metrics<br />
    containerPort: 8888<br />
  resources:<br />
    limits:<br />
      cpu: 100m<br />
      memory: 128Mi<br />
    requests:<br />
      cpu: 50m<br />
      memory: 64Mi</p>
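<p>The size and layer reductions quoted for the exporter follow directly from the raw numbers. A minimal sketch of the arithmetic, using a hypothetical percent_reduction helper built on awk:</p>

```shell
# Hypothetical helper: percentage reduction from a baseline to a new value,
# rounded to the nearest whole number.
percent_reduction() {
  awk -v base="$1" -v new="$2" 'BEGIN { printf "%.0f\n", (base - new) / base * 100 }'
}

percent_reduction 29.4 10.3   # image size in MB, prints 65
percent_reduction 20 5        # layer count, prints 75
```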
<p>Debugging without the usual tools</p>
<p>The debugging story is simpler than it might seem. docker debug attaches an ephemeral layer to the running container that includes bash, curl, strace, vim, and anything else you need without modifying the production image itself. When you exit, the layer disappears and the container is exactly as it was. It&#8217;s a cleaner approach than shelling directly into a production container, and in practice it&#8217;s a single command:</p>
<p>docker debug my-clickhouse-server</p>
<p>Or if you prefer, you can mount a debug image alongside the container:</p>
<p>docker run --rm -it --pid container:my-clickhouse-server \<br />
  --mount=type=image,source=&lt;your-namespace&gt;/dhi-busybox,destination=/dbg,ro \<br />
  dhi.io/clickhouse-server:26.2-debian13 /dbg/bin/sh</p>
<p>There&#8217;s also a broader security benefit that goes beyond CVE counts. If something does go wrong in production, an attacker who gets into the container finds no package manager to install tools with, no curl or wget to exfiltrate data through, and no obvious path to reach out to the network, which significantly limits what a compromise can actually turn into.</p>
<p>ClickHouse: Non-hardened Image vs. Hardened Image Compared</p>
<p>A Docker Scout scan of both images puts the difference in plain numbers. Using ubuntu:22.04 as its base, the standard image carries 8 medium and 11 low severity vulnerabilities across 111 packages, including the wget and tar findings that are most likely to trigger a security block in an enterprise pipeline. The DHI image eliminates all medium severity findings and comes in at 14 low severity items, but these are in core system libraries like glibc and openssl where no fix exists on any distribution, not in unnecessary utilities that had no business being in the image. The 3 unconfirmed findings that Scout surfaces have already been assessed and suppressed via VEX attestation, which ships with the image as part of its SLSA Level 3 provenance.</p>
<p>To see the difference for yourself, you can run your own Docker Scout scan of both images for a quick comparison using these commands:</p>
<p>docker scout quickview clickhouse/clickhouse-server:latest</p>
<p>docker pull dhi.io/clickhouse-server:26.2-debian13<br />
docker tag dhi.io/clickhouse-server:26.2-debian13 clickhouse-dhi:latest<br />
docker scout quickview clickhouse-dhi:latest</p>
<table>
<thead>
<tr><th></th><th>Non-Hardened ClickHouse Image</th><th>Docker Hardened Image</th></tr>
</thead>
<tbody>
<tr><td>Default user</td><td>root (steps down to clickhouse user at runtime via entrypoint, but the Dockerfile has no USER directive; overridable with CLICKHOUSE_RUN_AS_ROOT=1)</td><td>nonroot (enforced at image level via USER directive; cannot be overridden at runtime)</td></tr>
<tr><td>Shell access</td><td>Full shell (bash/sh) available</td><td>bash present, no network tools or package manager</td></tr>
<tr><td>Package manager</td><td>apt available</td><td>No package manager</td></tr>
<tr><td>CVE exposure</td><td>Ships wget (CVE-2021-31879, unpatched since 2021), tar (CVE-2025-45582)</td><td>No wget, no tar &#8211; unnecessary packages removed entirely</td></tr>
<tr><td>CVE patching</td><td>Unpatched findings from 2021–2025 due to the lack of upstream fixes from the Ubuntu base image</td><td>Actively tracked, 2026 CVE IDs show proactive remediation</td></tr>
<tr><td>Provenance</td><td>Standard</td><td>SLSA Level 3 attestation</td></tr>
<tr><td>Compliance</td><td>Manual hardening required</td><td>CIS, NIST, FedRAMP-aligned</td></tr>
<tr><td>Debugging</td><td>Traditional shell debugging</td><td>Use docker debug or Image Mount for troubleshooting</td></tr>
</tbody>
</table>
<p>The Security Team Conversation</p>
<p>The team that got blocked at AWS ECR in November 2025 didn’t have a ClickHouse problem; they had a base image problem. Their database was fine; what the scanner was finding were CVEs in Perl, system utilities, and other packages that had come along in the Debian base and were never used by the application. Nothing in the scanner output made that distinction, so the security team did exactly what they were supposed to do and blocked the deployment.</p>
<p>With DHI, that conversation with your security team becomes considerably more straightforward. Rather than building a case for why specific CVEs don&#8217;t apply to your workload, you can point to an image built by Docker&#8217;s security team from the minimum required components, with SLSA Level 3 provenance and independent validation by SRLabs. The ClickHouse runtime itself is unchanged: queries, ports, configuration files, and performance all carry over, so the only thing you&#8217;re actually changing is the answer you can give when someone asks whether this image can go to production.</p>
<p>For teams that need stronger guarantees, DHI Enterprise adds SLA-backed CVE remediation within seven days, FIPS and STIG variants, and extended lifecycle support. For most teams, the free Enterprise trial is the right starting point. It answers the question that actually matters before you commit to anything. Want to learn more? Start with this blog post, which walks through the trial and sets you up for success.</p>
<p>Migration Checklist</p>
<p>☐ Mirror clickhouse-server DHI image to your Docker Hub namespace (one-time setup)<br />
☐ Update your image reference to dhi.io/clickhouse-server:26.2-debian13<br />
☐ Set CLICKHOUSE_PASSWORD (required for network access in DHI)<br />
☐ Keep --ulimit nofile=262144:262144 on all run commands<br />
☐ In Kubernetes: add fsGroup: 65532 to your pod securityContext<br />
☐ Switch from kubectl exec to kubectl debug for troubleshooting<br />
☐ Run trivy against both images to see the difference yourself:<br />
     trivy image clickhouse/clickhouse-server:latest<br />
     trivy image dhi.io/clickhouse-server:26.2-debian13</p>
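<p>The image reference update from the checklist can be scripted when several manifests or compose files need the same change. The following is a sketch under stated assumptions: rewrite_image_ref is a hypothetical helper, it only recognizes the upstream name clickhouse/clickhouse-server (optionally prefixed with docker.io/), and it always pins the tag used in this post.</p>

```shell
# Hypothetical helper: rewrite upstream ClickHouse image references read from
# stdin to the DHI reference used in this post, leaving other lines untouched.
rewrite_image_ref() {
  sed -E 's#(docker\.io/)?clickhouse/clickhouse-server(:[A-Za-z0-9._-]+)?#dhi.io/clickhouse-server:26.2-debian13#g'
}

# Usage (docker-compose.yml is a hypothetical file name):
#   cat docker-compose.yml | rewrite_image_ref
```

<p>Review the output before applying it, since the target tag may not match the ClickHouse version you currently run.</p>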
<p>The migration is narrower in scope than it might appear &#8211; your volume mounts, port mappings, and existing XML configuration files all carry over without modification, and on Kubernetes the only structural addition is the fsGroup security context. Everything else is an image reference change.</p>
<p>Resources</p>
<p>Docker Hardened Images Documentation</p>
<p>DHI ClickHouse Server Guide</p>
<p>DHI ClickHouse Metrics Exporter Guide</p>
<p>Docker Debug Documentation</p>
<p>Free DHI Catalog</p>
<p>DHI Community Announcement</p>
<p>Docker Scout Documentation</p>
<p>Source: https://blog.docker.com/feed/</p>
<p>The post <a rel="nofollow" href="https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/">From Security Blocked to Prod Ready: ClickHouse on Docker Hardened Images</a> first appeared on <a rel="nofollow" href="https://www.cloud-computing-koeln.de">Cloud Computing Köln</a>.</p>
]]></content:encoded>
			<wfw:commentRss>https://www.cloud-computing-koeln.de/from-security-blocked-to-prod-ready-clickhouse-on-docker-hardened-images/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
