Generate Images Locally with Docker Model Runner and Open WebUI

We’ve all been there: you need to generate a few images for a project, you fire up an AI image service, and suddenly you’re wondering what happens to your prompts, how many credits you have left, or why that “safe content” filter rejected your perfectly reasonable request for a dragon wearing a business suit. What if you could skip all of that and run the whole thing on your own machine, with a slick chat UI on top?

That’s exactly what Docker Model Runner now makes possible. With a couple of commands you can pull an image-generation model, connect it to Open WebUI, and start generating images right from a chat interface: fully local, fully private, fully yours.

Let’s build it. Your own private DALL-E, no cloud subscription required.

What You’ll Need

Docker Desktop (macOS) or Docker Engine (Linux)

~8 GB of free RAM for a small model (more is better)

GPU: optional but highly recommended (NVIDIA CUDA or Apple Silicon MPS); CPU works as a fallback

If you can run docker model version without errors, you’re good to go.
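A quick sanity check from a terminal (both commands are safe to run at any time; an empty model list is expected at this point):

# Verify the Model Runner CLI is available
docker model version

# List models already stored locally
docker model list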

How Docker Model Runner works with Open WebUI

Before we dive in, here’s the big picture:

Docker Model Runner acts as the control plane. It downloads the model, manages the inference backend lifecycle, and exposes a 100% OpenAI-compatible API — including the POST /v1/images/generations endpoint that Open WebUI already knows how to talk to.
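If you want to poke at that API yourself before wiring anything up, a minimal check is to list the models the runner currently serves. This sketch assumes host-side TCP access on the default port 12434 (the same port used in Step 6 below); from inside a container you would use the model-runner.docker.internal hostname instead:

# List the models exposed through the OpenAI-compatible API
curl -s http://localhost:12434/engines/v1/models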

Step 1: Pull an Image Generation Model

Docker Model Runner uses a compact packaging format called DDUF (Diffusers Unified Format) to distribute image generation models through Docker Hub, just like any other OCI artifact.

Pull a model to get started:

docker model pull stable-diffusion

You can confirm it’s ready:

docker model inspect stable-diffusion

{
  "id": "sha256:5f60862074a4c585126288d08555e5ad9ef65044bf490ff3a64855fc84d06823",
  "tags": [
    "docker.io/ai/stable-diffusion:latest"
  ],
  "created": 1768470632,
  "config": {
    "format": "diffusers",
    "architecture": "diffusers",
    "size": "6.94GB",
    "diffusers": {
      "dduf_file": "stable-diffusion-xl-base-1.0-FP16.dduf",
      "layout": "dduf"
    }
  }
}

What’s happening under the hood? The model is stored locally as a DDUF file, a single-file format that bundles all the components of a diffusion model (text encoder, VAE, UNet/DiT, scheduler config) into one portable artifact. Docker Model Runner knows how to unpack it at runtime.
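DDUF is built on the ZIP format, so if you are curious you can peek inside one with standard tools. A minimal sketch; the path is purely illustrative, since the exact location of the local model store depends on your setup:

# DDUF files are ZIP-based archives; list the bundled components
# NOTE: adjust the path to wherever the DDUF file lives on your machine
unzip -l /path/to/stable-diffusion-xl-base-1.0-FP16.dduf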

Step 2: Launch Open WebUI

Here’s the magic trick: Docker Model Runner has a built-in launch command that knows exactly how to wire up Open WebUI against the local inference endpoint:

docker model launch openwebui

That’s it. Behind the scenes this runs:

docker run --rm \
  -p 3000:8080 \
  -e OPENAI_API_BASE=http://model-runner.docker.internal/engines/v1 \
  -e OPENAI_BASE_URL=http://model-runner.docker.internal/engines/v1 \
  -e OPENAI_API_KEY=sk-docker-model-runner \
  ghcr.io/open-webui/open-webui:latest

The model-runner.docker.internal hostname is a special DNS entry that Docker Desktop containers use to reach the Model Runner running on the host, no port-forwarding gymnastics required. If you use Docker CE, you’ll see the docker/model-runner container address instead of model-runner.docker.internal.

Open your browser at http://localhost:3000, create a local account (it stays offline), and you’ll land on the chat interface.

Tip: Want to run it in the background? Add --detach:

docker model launch openwebui --detach

Prefer Docker Compose? See the full setup here: https://docs.docker.com/ai/model-runner/openwebui-integration/

Step 3: Configure Open WebUI for Image Generation

Open WebUI already uses Docker Model Runner for text chat automatically (it reads the OPENAI_API_BASE env var). For image generation you need to point it at the images endpoint too, a 30-second job in the settings UI.

Go to http://localhost:3000/admin/settings/images

Enable Image Generation

Fill in the fields:

Field          Value
Model          stable-diffusion
API Base URL   http://model-runner.docker.internal/engines/diffusers/v1
API Key        whatever-you-want

Click Save.

Why the dummy API key? Docker Model Runner doesn’t require authentication because it’s a local service. The key is only there because Open WebUI’s form requires one. Any non-empty string works.

Step 4: Pull a Chat Model

Open WebUI is also a full-featured chat interface, and one of its best tricks is letting you ask the LLM to generate an image right from the conversation. For that to work, you need a language model too.

# Lightweight option — runs on almost any machine
docker model pull smollm2

# Recommended — more capable, better at understanding creative prompts
docker model pull gpt-oss

Both will show up automatically in the Open WebUI model selector. Use smollm2 if you’re tight on RAM, or gpt-oss if you want richer, more creative responses before image generation.

No extra configuration is needed: Open WebUI picks up text models from the same OPENAI_API_BASE endpoint it was already configured with.
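If you’d rather double-check from the command line first, the same OpenAI-compatible endpoint also serves chat completions. A minimal sketch, assuming host TCP access on port 12434 and the model name pulled above:

curl -s -X POST http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "smollm2",
    "messages": [
      {"role": "user", "content": "Write one short, vivid image prompt about a whale."}
    ]
  }'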

Step 5: Generate Your First Image

Head back to the main chat view. You’ll notice a small image icon in the message input bar.

Click it to toggle image generation mode, type your prompt, and send.

Try something like:

Create an image of a whale.

The first request takes a little longer while the backend loads the model into memory. After that, subsequent images generate much faster.

Open WebUI will automatically route image-generation requests to the diffusers backend and text requests to the language model, seamlessly, in the same conversation.

Step 6: Generate Images Directly via the API

For developers who want to integrate image generation into their own apps, Docker Model Runner exposes the standard OpenAI Images API directly:

curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A cat sitting on a couch",
    "size": "512x512"
  }'

The response follows the OpenAI Images API format exactly:

{
  "created": 1742990400,
  "data": [
    {
      "b64_json": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBD…"
    }
  ]
}

Decode and save the image:

curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A cat sitting on a couch",
    "size": "512x512"
  }' | jq -r '.data[0].b64_json' | base64 -d > cat.png

open cat.png

Advanced Parameters

The API supports all the parameters you’d expect from a full diffusers pipeline:

curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A serene Japanese zen garden, cherry blossoms, koi pond, photorealistic",
    "negative_prompt": "blurry, low quality, distorted, watermark",
    "size": "768x512",
    "n": 2,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "seed": 42,
    "response_format": "b64_json"
  }' | jq -r '.data[0].b64_json' | base64 -d > garden.png
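The request above asks for two images (n: 2) but the pipeline only decodes the first one. A small follow-up sketch that saves every returned image instead, assuming the same endpoint plus jq and base64:

curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stable-diffusion",
    "prompt": "A serene Japanese zen garden, cherry blossoms, koi pond, photorealistic",
    "size": "768x512",
    "n": 2
  }' | jq -r '.data[].b64_json' | {
    i=0
    # Each base64 string arrives on its own line; decode them into garden-0.png, garden-1.png, ...
    while read -r img; do
      printf '%s' "$img" | base64 -d > "garden-$i.png"
      i=$((i+1))
    done
  }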

Parameter             What it does
prompt                What you want in the image
negative_prompt       What you want to avoid
size                  Resolution as WIDTHxHEIGHT (e.g., 512x512, 768x512)
n                     Number of images to generate (1–10)
num_inference_steps   More steps = higher quality, slower (default: 50)
guidance_scale        How closely to follow the prompt (1–20, default: 7.5)
seed                  Integer for reproducible results; omit for random

Pro tip: Set a seed while you’re iterating on a prompt. Once you’re happy with the composition, remove it to get unique variations.
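For example, a minimal sketch of that workflow, reusing the local endpoint, jq, and base64 from the examples above: fix the seed so only guidance_scale changes between runs, making the comparison meaningful.

# Illustrative values; the fixed seed keeps the composition stable across runs
for g in 5 7.5 10; do
  curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
    -H "Content-Type: application/json" \
    -d "{
      \"model\": \"stable-diffusion\",
      \"prompt\": \"A dragon wearing a business suit, studio lighting\",
      \"seed\": 42,
      \"guidance_scale\": $g,
      \"size\": \"512x512\"
    }" | jq -r '.data[0].b64_json' | base64 -d > "dragon-cfg-$g.png"
done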

Under the Hood: How the Diffusers Backend Works

When you first request an image, Docker Model Runner:

Unpacks the DDUF file: extracts the model components and loads them via DiffusionPipeline.from_pretrained()

Starts a FastAPI server: this is the server that Open WebUI and your curl commands talk to through Docker Model Runner

The server is installed on first use by downloading a self-contained Python environment from Docker Hub (version-pinned, so updates are explicit). It lives at ~/.docker/model-runner/diffusers/ — no Python version conflicts, no virtualenv setup.
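If you’re curious, you can look at the installed environment yourself once the first image request has completed (Docker Desktop path as described above; Docker CE layouts may differ):

# The self-contained diffusers backend environment, downloaded on first use
ls ~/.docker/model-runner/diffusers/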

Troubleshooting

The model takes forever to load on first use. That’s normal: the model weights are being loaded from disk and transferred to GPU memory. Subsequent requests in the same session are much faster because the backend stays warm.

I get a “No model loaded” 503 error. Make sure the model is fully downloaded (docker model list) and that you’re sending the correct model name in the model field.

Image quality is poor / generations are too fast. Increase num_inference_steps (try 20–50 steps). Higher values = slower but sharper results.

Open WebUI can’t connect to the image endpoint. Double-check the URL in Admin Panel → Settings → Images. Inside a Docker container it must be http://model-runner.docker.internal/engines/diffusers/v1, not localhost.

Conclusion and What’s Next

Docker Model Runner makes local image generation simple. It packages and serves image models through an OpenAI-compatible API, while Open WebUI provides an easy chat interface on top. Together, they let you generate images privately on your own machine, either through the browser or directly through the API, without relying on a cloud service.

This feature opens up a lot of possibilities:

Multimodal workflows: Chat with a text model about an idea, then immediately generate an image of it — in the same Open WebUI conversation

RAG + image generation: Build a pipeline that generates illustrations for your documents (a minimal sketch follows this list)

Custom models: The diffusers backend supports any DDUF-packaged model, so you can package your own fine-tuned models using Docker’s model packaging tools
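As a starting point for the multimodal and RAG ideas above, here’s a minimal sketch that chains the two endpoints from this post: ask the chat model to turn a snippet of your document into an image prompt, then hand that prompt to the diffusers backend. It assumes the gpt-oss and stable-diffusion models pulled earlier, host TCP access on port 12434, and jq on the PATH.

# A snippet of text to illustrate (in a real pipeline this would come from your RAG step)
SNIPPET="Docker Model Runner serves diffusion models through an OpenAI-compatible API."

# 1. Ask the chat model to write a single image prompt for the snippet
BODY=$(jq -n --arg s "$SNIPPET" \
  '{model: "gpt-oss", messages: [{role: "user", content: ("Write one short, vivid image prompt illustrating: " + $s)}]}')
PROMPT=$(curl -s -X POST http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" -d "$BODY" | jq -r '.choices[0].message.content')

# 2. Feed that prompt to the image endpoint and save the result
jq -n --arg p "$PROMPT" '{model: "stable-diffusion", prompt: $p, size: "512x512"}' \
  | curl -s -X POST http://localhost:12434/engines/diffusers/v1/images/generations \
      -H "Content-Type: application/json" -d @- \
  | jq -r '.data[0].b64_json' | base64 -d > illustration.png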

The Docker Model Runner team is actively expanding model support on Docker Hub. Check docker model search for the latest available models.

Source: https://blog.docker.com/feed/

AWS IoT Core for Device Location adds Confidence Level Configuration and Measurement Type support

AWS IoT Core for Device Location now supports two enhancements that give developers greater control over location resolution and richer metadata for resolved device locations. Customers using the Cell ID, Wi-Fi, or Cell+Wi-Fi solvers can now specify a desired confidence level between 50% and 99% when resolving device locations. The confidence level represents the statistical probability that the actual device location falls within the reported accuracy radius. A higher confidence level (for example, 95%) increases certainty that the device falls within the reported radius but produces a larger accuracy radius. A lower confidence level (for example, 50%) yields a smaller radius with less certainty. Customers can now configure this value to balance accuracy and confidence based on their specific requirements. This feature is currently supported for HTTP-based location resolution.
This update also introduces a measurement type field in resolved location metadata, giving developers greater visibility into how each device location was determined, whether through GNSS, Wi-Fi, or BLE location resolvers. This makes it easier to assess location data quality, debug positioning issues, and make more informed decisions based on how each location was determined.
These updates are available in all AWS IoT Core for Device Location supported regions. For detailed guidance and implementation instructions, visit the AWS IoT Core Device Location and IoT Wireless Developer Guide.
Source: aws.amazon.com

Amazon WorkSpaces now lets AI agents operate desktop applications (Preview)

Amazon WorkSpaces, AWS’s fully managed cloud desktop service, now enables AI agents to securely access and operate desktop applications through managed WorkSpaces environments. Many enterprises run critical business processes on desktop applications—mainframes, ERP systems, and proprietary tools—that lack modern APIs, creating a “last-mile challenge” for AI agents. WorkSpaces now allows organizations to automate everyday workflows at scale while maintaining full enterprise-grade governance and compliance.
AI agents built on any framework and running anywhere—cloud-hosted, on-premises, or hybrid—can now connect to business applications with minimal code using industry-standard Model Context Protocol (MCP) integration. Builders gain fast time-to-value without standing up new infrastructure, while IT administrators maintain centralized permissions, logging, and auditing controls identical to human WorkSpaces environments. Enterprise observability features including screenshots and metrics provide full visibility into agent activities. Organizations can automate workflows spanning claims processing, trade settlement, candidate screening, and back-office operations across financial services, healthcare, and other regulated industries—all without requiring application modernization.
WorkSpaces delivers secure environments where agents can point, click, and navigate on desktop applications just like humans. With pay-as-you-go pricing and elastic scale built on AWS’s global infrastructure, enterprises reduce IT overhead while expanding what’s possible when people and AI work together. To learn more, visit the WorkSpaces documentation.  
Source: aws.amazon.com

Amazon ElastiCache adds thirteen new Amazon CloudWatch metrics for network capacity planning and engine diagnostics

Amazon ElastiCache customers can now detect network throttling, memory fragmentation, and connection exhaustion, using thirteen new Amazon CloudWatch metrics for node-based clusters. You can monitor these host-level and engine-level diagnostics directly from CloudWatch without running INFO commands on individual nodes or calculating baselines from raw byte counters.

Network capacity: NetworkBaselineUsageInPercentage, NetworkBaselineUsageOutPercentage, NetworkBaselineMaxUsageInPercentage, and NetworkBaselineMaxUsageOutPercentage report network utilization relative to instance baseline, enabling portable alarms that remain valid across instance type changes. Values above 100 percent signal that a host is consuming burst credits, a leading indicator that a sustained workload will eventually lead to credit exhaustion and throttling. The Max variants capture per-second bursts that averaged metrics can hide.
Memory health: UsedMemoryDataset shows memory consumed by actual stored data excluding engine overhead. AllocatorFragmentationBytes and AllocatorFragmentationRatio isolate fragmentation that the activedefrag parameter can address. MajorPageFaults captures OS-level page faults that indicate memory pressure beyond what the engine can surface.
Connectivity health: BlockedConnections and RejectedConnections surface connections waiting on blocking commands and connections turned away when the maxclients limit is reached. When RejectedConnections is non-zero, raise maxclients or diagnose client-side connection pool leaks.
Pub/sub workloads: PubSubChannels and PubSubShardChannels expose active classic and sharded channels on each node. When classic channel counts are growing with utilization, consider switching to sharded pub/sub to scale horizontally.
Command throughput: ProcessedCommands provides total command throughput across all command types.

These metrics are available for node-based clusters in all commercial AWS Regions and the AWS China and AWS GovCloud (US) Regions where ElastiCache is supported, at no additional cost. To get started, view the new metrics in the ElastiCache console monitoring tab or in the AWS/ElastiCache namespace in the CloudWatch console. To learn more, see Host-Level Metrics and Metrics for Valkey and Redis OSS.
Source: aws.amazon.com

AWS SAM now supports WebSocket APIs for Amazon API Gateway

AWS Serverless Application Model (AWS SAM) now supports WebSocket APIs for Amazon API Gateway, enabling you to define complete WebSocket APIs with minimal configuration in your SAM template.
AWS SAM is a collection of open-source tools that make it easy for you to build and manage serverless applications. WebSocket APIs are critical for real-time applications such as chat, live dashboards, AI/LLM streaming, and IoT. However, SAM previously did not support WebSocket APIs, requiring you to manually configure all of the underlying resources in AWS CloudFormation. This made it difficult to debug common issues such as missing IAM permissions for Lambda functions. Now, SAM handles all of this automatically, generating the required resources and permissions from your template. The new resource provides feature parity with API Gateway WebSocket APIs, including IAM and Lambda authorization, custom domains, RouteSettings, Models, and StageVariables. Globals support lets you share common configuration across multiple WebSocket APIs.
To get started, add the AWS::Serverless::WebSocketApi resource type to your SAM template. Define your routes by specifying Lambda function handlers for $connect, $disconnect, and $default routes, along with any custom routes your application requires. SAM automatically wires up the integrations and permissions for each route. You can also configure authorization, stage settings, and custom domains directly within the resource definition.
To learn more, visit the SAM developer guide.
Source: aws.amazon.com

AWS SAM CLI adds BuildKit support for AWS Lambda functions packaged as container images

AWS Serverless Application Model Command Line Interface (SAM CLI) now supports BuildKit for building container images from Dockerfiles, enabling faster, more efficient container image builds for Lambda functions packaged as container images.
SAM CLI is a command-line tool for building, testing, debugging, and packaging serverless applications locally before deploying to AWS Cloud. Developers packaging Lambda functions as container images often need advanced build features provided by BuildKit to optimize their images for production. However, SAM CLI previously did not support BuildKit features. Now, with BuildKit support in SAM CLI, you can utilize multi-stage builds to create smaller final images without development dependencies, improved caching to reduce rebuild times, and better parallelization of build steps. BuildKit also enables cross-architecture builds, allowing you to build container images targeting both x86_64 and arm64 (AWS Graviton2) instruction set architectures from the same development machine. You can also use Docker secrets during builds, keeping sensitive data such as credentials and API keys out of your final image layers.
To get started, download or update SAM CLI to version 1.159.0 or later and use the --use-buildkit flag with sam build. This feature works regardless of whether you are using Docker or Finch with SAM CLI, unlocking the full set of BuildKit capabilities.
To learn more, visit the SAM CLI developer guide.
Source: aws.amazon.com

April 2026 PTG Recap: The Road from Gazpacho to Hibiscus

Following the tradition of our previous Project Teams Gathering (PTG) summaries, the April 2026 PTG brought together over 35 teams and hundreds of contributors, operators, and open-source enthusiasts for a week of virtual collaboration. On the OpenStack side, the community was coming off the launch of the 2026.1 “Gazpacho” release of OpenStack and spent the…
Source: openstack.org