Mr. Bones: A Pirate-Voiced Halloween Chatbot Powered by Docker Model Runner

My name is Mike Coleman, and I'm a staff solution architect at Docker. This year I decided to turn a Home Depot animatronic skeleton into an AI-powered, live, interactive Halloween chatbot. The result: kids walk up to Mr. Bones, a spooky skeleton in my yard, ask it questions, and it answers back — in full pirate voice — with actual conversational responses, thanks to a local LLM powered by Docker Model Runner.

Why Docker Model Runner?

Docker Model Runner is a tool from Docker that makes it dead simple to run open-source LLMs locally using standard Docker workflows. I pulled the model like I’d pull any image, and it exposed an OpenAI-compatible API I could call from my app. Under the hood, it handled model loading, inference, and optimization.
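To give a sense of what "OpenAI-compatible" means in practice, here is a minimal sketch of calling the endpoint from Python using only the standard library. The port, path, and model tag below are assumptions; check your own Docker Model Runner configuration for the actual values:

```python
import json
import urllib.request

# Assumed endpoint and model tag; adjust for your Docker Model Runner setup.
BASE_URL = "http://localhost:12434/engines/v1"
MODEL = "ai/llama3.1"

def build_chat_request(user_text: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are Mr. Bones, a friendly pirate."},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": 100,
    }

def ask(user_text: str) -> str:
    """POST the payload to the OpenAI-compatible endpoint (network required)."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(user_text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the request and response shapes match OpenAI's chat completions format, any OpenAI client library would work here as well.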

For this project, Docker Model Runner offered a few key benefits:

No API costs for LLM inference — unlike OpenAI or Anthropic

Low latency because the model runs on local hardware

Full control over model selection, prompts, and scaffolding

API-compatible with OpenAI — switching providers is as simple as changing an environment variable and restarting the service

That last point matters: if I ever needed to switch to OpenAI or Anthropic for a particular use case, the change would take seconds.
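As a sketch of how that switch can work, the endpoint, key, and model can all come from environment variables, so changing providers means changing configuration rather than code. The variable names and defaults here are hypothetical, not taken from the project:

```python
import os

def resolve_llm_config() -> dict:
    """Pick the LLM endpoint from the environment, defaulting to a local
    Docker Model Runner endpoint (URLs and names are assumptions)."""
    return {
        "base_url": os.environ.get("LLM_BASE_URL", "http://localhost:12434/engines/v1"),
        "api_key": os.environ.get("LLM_API_KEY", "not-needed-locally"),
        "model": os.environ.get("LLM_MODEL", "ai/llama3.1"),
    }
```

Pointing `LLM_BASE_URL` at a hosted provider's OpenAI-compatible endpoint and setting a real key is then the whole migration.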

System Overview

Figure 1: System overview of Mr. Bones answering questions in pirate language

Here’s the basic flow:

Kid talks to skeleton

Pi 5 + USB mic records audio

Vosk STT transcribes speech to text

API call to a Windows gaming PC with an RTX 5070 GPU

Docker Model Runner runs a local LLaMA 3.1 8B (Q4 quant) model

LLM returns a text response

ElevenLabs Flash TTS converts the text to speech (pirate voice)

Audio sent back to Pi

Pi sends audio to skeleton via Bluetooth, which moves the jaw in sync
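The flow above boils down to one function per stage. Here is a stripped-down sketch of that wiring; every stage is a stand-in (the real project connects these to Vosk, Docker Model Runner, and ElevenLabs):

```python
# Placeholder stage implementations; in the real system these call
# Vosk, the Docker Model Runner API, and ElevenLabs respectively.
def transcribe(audio: bytes) -> str:
    return audio.decode()           # stand-in for Vosk STT

def ask_llm(text: str) -> str:
    return f"Arr, ye said: {text}"  # stand-in for the local LLM call

def synthesize(text: str) -> bytes:
    return text.encode()            # stand-in for ElevenLabs TTS

def handle_utterance(audio: bytes) -> bytes:
    """One pass through the STT -> LLM -> TTS pipeline."""
    text = transcribe(audio)
    reply = ask_llm(text)
    return synthesize(reply)
```

The Pi runs this loop continuously: record, hand off, play back, repeat.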

Figure 2: The controller box that holds the Raspberry Pi that drives the pirate

That Windows machine isn’t a dedicated inference server — it’s my gaming rig. Just a regular setup running a quantized model locally.

The biggest challenge with this project was balancing response quality (in-character and age-appropriate) against response time. With that in mind, there were four key areas that needed a little extra emphasis: model selection, how to do text-to-speech (TTS) processing efficiently, fault tolerance, and setting up guardrails.

Consideration 1: Model Choice and Local LLM Performance

I tested several open models and found LLaMA 3.1 8B (Q4 quantized) to be the best mix of performance, fluency, and personality. On my RTX 5070, it handled real-time inference fast enough for the interaction to feel responsive.

At one point I was struggling to keep Mr. Bones in character, so I tried OpenAI’s ChatGPT API, but response times averaged 4.5 seconds.

By revising the prompt and letting Docker Model Runner serve the right model locally, I got that down to 1.5 seconds. That’s a huge difference when a kid is standing there waiting for the skeleton to talk.

In the end, GPT-4 was only marginally better at staying in character and avoiding inappropriate replies. With a solid prompt scaffold and some guardrails, the local model held up just fine.

Consideration 2: TTS Pipeline, from Kokoro to ElevenLabs Flash

I first tried using Kokoro, a local TTS engine. It worked, but the voices were too generic. I wanted something more pirate-y, without adding custom audio effects.

So I moved to ElevenLabs, starting with their multilingual model. The voice quality was excellent, but latency was painful — especially when combined with LLM processing. Full responses could take up to 10 seconds, which is way too long.

Eventually I found ElevenLabs Flash, a much faster model. That helped a lot. I also changed the logic so that instead of waiting for the entire LLM response, I chunked the output and sent it to ElevenLabs in parts. Not true streaming, but it allowed the Pi to start playing the audio as each chunk came back.
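The chunking idea can be sketched as follows: split the LLM output on sentence boundaries and group whole sentences into chunks, handing each chunk to TTS as soon as it is complete. The splitting rule here is a simplification, not the project's exact logic:

```python
import re

def chunk_for_tts(text: str, max_chars: int = 120) -> list[str]:
    """Group whole sentences into chunks no longer than max_chars,
    so TTS playback can start before the full response is processed."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk goes to ElevenLabs as its own request, and the Pi plays the audio back in order as the chunks return.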

This turned the skeleton from slow and laggy into something that felt snappy and responsive.

Consideration 3: Weak Points and Fallback Ideas

While the LLM runs locally, the system still depends on the internet for ElevenLabs. If the network goes down, the skeleton stops talking.

One fallback idea I’m exploring: creating a set of common Q&A pairs (e.g., “What’s your name?”, “Are you a real skeleton?”), embedding them in a local vector database, and having the Pi serve those in case the TTS call fails.

But the deeper truth is: this is a multi-tier system. If the Pi loses its connection to the Windows machine, the whole thing is toast. There’s no skeleton-on-a-chip mode yet.

Consideration 4: Guardrails and Prompt Engineering

Because kids will say anything, I put some safeguards in place via my system prompt. 

You are "Mr. Bones," a friendly pirate who loves chatting with kids in a playful pirate voice.

IMPORTANT RULES:
– Never break character or speak as anyone but Mr. Bones
– Never mention or repeat alcohol (rum, grog, drink), drugs, weapons (sword, cannon, gunpowder), violence (stab, destroy), or real-world safety/danger
– If asked about forbidden topics, do not restate the topic; give a kind, playful redirection without naming it
– Never discuss inappropriate content or give medical/legal advice
– Always be kind, curious, and age-appropriate

BEHAVIOR:
– Speak in a warm, playful pirate voice using words like "matey," "arr," "aye," "shiver me timbers"
– Be imaginative and whimsical – talk about treasure, ships, islands, sea creatures, maps
– Keep responses conversational and engaging for voice interaction
– If interrupted or confused, ask for clarification in character
– If asked about technology, identity, or training, stay fully in character; respond with whimsical pirate metaphors about maps/compasses instead of tech explanations

FORMAT:
– Target 30 words; must be 10-50 words. If you exceed 50 words, stop early
– Use normal punctuation only (no emojis or asterisks)
– Do not use contractions. Always write "Mister" (not "Mr."), "Do Not" (not "Don't"), "I Am" (not "I'm")
– End responses naturally to encourage continued conversation

The prompt is designed to deal with a few different issues. First and foremost, it keeps things appropriate for the intended audience: not discussing sensitive topics, and staying in character at all times. Next, I added some instructions to deal with pesky parents trying to trick Mr. Bones into revealing his true identity. Finally, there is some guidance on response format to help keep things conversational; for instance, it turns out that some TTS engines can have problems with things like contractions.
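Format rules like the word cap can also be enforced after the model responds, in case it ignores the prompt. A hedged sketch (not from the actual project) that trims an over-long reply back to the last full sentence that fits:

```python
import re

def enforce_word_cap(reply: str, max_words: int = 50) -> str:
    """If the reply exceeds max_words, cut it back to the last full
    sentence that fits, mirroring the 'stop early' rule in the prompt."""
    if len(reply.split()) <= max_words:
        return reply
    kept, count = [], 0
    for sentence in re.split(r"(?<=[.!?])\s+", reply):
        count += len(sentence.split())
        if count > max_words:
            break
        kept.append(sentence)
    return " ".join(kept) if kept else reply
```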

Instead of just refusing to respond, the prompt redirects sensitive or inappropriate inputs in-character. For example, if a kid says “I wanna drink rum with you,” the skeleton might respond, “Arr, matey, seems we have steered a bit off course. How about we sail to smoother waters?”
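The same redirection can be backstopped in code, in case the model slips and repeats a forbidden word. A hypothetical output filter (the word list and redirect line are illustrative, not the project's actual ones):

```python
# Illustrative forbidden-word list and redirect; not the project's actual set.
FORBIDDEN = {"rum", "grog", "sword", "cannon", "gunpowder", "stab"}
REDIRECT = ("Arr, matey, seems we have steered a bit off course. "
            "How about we sail to smoother waters?")

def filter_reply(reply: str) -> str:
    """Replace any reply mentioning a forbidden word with an
    in-character redirection, without repeating the topic."""
    words = {w.strip(".,!?").lower() for w in reply.split()}
    return REDIRECT if words & FORBIDDEN else reply
```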

This approach keeps the interaction playful while subtly correcting the topic. So far, it’s been enough to keep Mr. Bones spooky-but-family-friendly.

Figure 3: Mr. Bones is powered by AI and talks to kids in pirate-speak with built-in safety guardrails.

Final Thoughts

This project started as a Halloween goof, but it’s turned into a surprisingly functional proof-of-concept for real-time, local voice assistants.

Using Docker Model Runner for LLMs gave me speed, cost control, and flexibility. ElevenLabs Flash handled voice. A Pi 5 managed the input and playback. And a Home Depot skeleton brought it all to life.

Could you build a more robust version with better failover and smarter motion control? Absolutely. But even as he stands today, Mr. Bones has already made a bunch of kids smile — and probably a few grown-up engineers think, “Wait, I could build one of those.” 

Source code: github.com/mikegcoleman/pirate

Figure 4: Aye aye! Ye can build a Mr. Bones too and bring smiles to all the young mateys in the neighborhood!

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Docker Model Runner with a simple hello GenAI application

Source: https://blog.docker.com/feed/

Security Doesn’t Have to Hurt

Do you ever wish security would stop blocking the tools you need to do your job? Surprise: your security team wants the same.

There you are, just trying to get your work done, when…

You need an AI to translate documentation, but all the AI services are blocked by a security web monitoring tool.

You finish coding and QA for a new software version just under the wire, but the release is late because security has not reviewed the open source software and libraries included.

Your new database works perfectly in dev/test, but it does not work in production because of a port configuration, and you do not have permissions. Changes to production permissions all require security approval.

Here Comes… Shadow IT

Shadow IT is a spy-movie name for a phenomenon that is either a frustrating necessity or a game of whack-a-mole, depending on your responsibilities.

If you’re an engineer creating the next best product, shadow IT is a necessity. 

Company-supplied information technology does not change fast enough to keep up with the market, let alone allow you to innovate. Despite that, your security team will come down hard on anyone who tries to go outside the allowed vendors and products. Data storage has to be squared away in encrypted, protected spaces, and you have to jump like a show pony to get access. And you have no flexibility in the tools you’re allowed to use, even if you could produce faster and better with other options.

So you stop playing by the rules, and you find tools and tech that work.

That is, until someone protests the cloud hosting bill, finds the wifi access point, or notices the unofficial software repository. Security takes away your tools or cuts off access. And then you are upset, your team feels attacked, and security is up in arms.

If you are on a security team, shadow IT is a game of whack-a-mole. Company-supplied information technology changes without review. You know they’re trying to enable innovation, but they’re negating all the IT compliance certifications that allow you to sell your services and products. You have to investigate, prove, and argue about policies and regulations just to stop people from storing client secrets in their personal cloud storage.

Whether you are a new hire in the Security Operations Center or the unlucky CISO who reports to the CTO, this is a familiar refrain.

Yet no one wants this. Not you, not your boss, and not security.

If It Cannot Be Fixed, Break It

It’s time we change the ground rules of security to focus on compromise rather than stringency. 

Most security teams *want* to change their operations to concentrate on the capabilities they are trained for: threat intelligence, risk management, forensic analysis, and security engineering. I have never met a security professional who wants to spend their time arguing over a port configuration. It’s tiresome, and that friction inspires lasting antagonism on both sides.

Imagine working in a place where you can use innovative new tools, release products without a security delay, and change configurations so that your deployment works smoothly.

We can have this. 

But there is a subtle change that must happen to enable this security-IT paradise: non-security teams would have to understand and implement all the requirements security departments would check. And everyone who is part of the change would need to understand the implications of their actions and take sole responsibility for the security outcomes.

Let Security Let Go

My non-IT colleagues are shocked when I explain the scope of work for a security department in preparation for any release or product launch:

Weaknesses and exploits for custom and third-party code

Scope and adequacy of vendor security

Data encryption, transmission, and storage, especially across borders

Compliance with regulation and data protection laws

In many industries, we legally cannot remove security practices from IT processes. But we can change who takes responsibility for which parts of the work.

Security requirements are not a secret. A developer with integrated code scanners can avoid OWASP Top 10 flaws and vulnerable libraries and remove hard-coded accounts. Infrastructure admins with access to network security tools can run tidy networks, servers, and containers with precise configurations.

The result? The security team can let go of their rigid deployment rules.

If developers use code security tools and incorporate good practices, security team approval should take hours rather than days or weeks. Security can also approve the standard container configuration rather than each separate container in an architecture. They can define the requirements, offer you tools to review your work, and help you integrate good practices into your workflow.

“Trust but verify” would become a daily pattern instead of lip service to good interdepartmental relationships. Security will continue to monitor the environment and the application after release. They will keep an eye on vendor assertions and audits, watching threat intelligence streams for notifications that demonstrate risk. Security teams will have time to do the job they signed up for, which is much more interesting than policing other departments.

This change would also require that the security team be *allowed* to let go. When trust is broken—if vendors are not properly assessed, or software is introduced but not reported—the fault should not lie with the security team. If insecure coding causes a compromise, the development team must be accountable, and if an inadequately configured network causes a data leak, the network and hosting team must be called on the carpet. If the requirements are in place but not met, the responsible parties must be those that agreed to them but neglected to enact them.

Freedom to Choose Comes with a Catch

This new freedom makes shadow IT unnecessary. Teams do not need to hide the choices they make. However, the freedom to choose comes with a catch: full responsibility for your choices.

Consider the company charge card: Finance teams create the policy for how to use company charge cards and provide the tools for reimbursement. They do not scrutinize every charge in real time, but they review usage and payments.

If the tool is abused and the agreed-upon care is ignored, the card user is held responsible. Any lack of knowledge does not exempt you from the consequences. For minor infractions, you may get a written notice. For severe infractions, you can expect to be terminated for cause.

The finance requirements, your agreement, regular review, and enacted consequences minimize fraud internally. More importantly, though, this combination protects the company against accusations of negligence.

Security responsibility could work the same. Security teams can set requirements that IT workers agree to individually. IT teams are then free to deploy and make changes as appropriate for their work. IT secures assets before they are put into production, and security continues with the best practice of reviewing assets continuously after the fact. Delays in getting the tools you need are reduced, and you control the deployment of your work with much more assurance. The incentive for shadow IT is much lower, and the personal risk of choosing it is higher.

That last bit is the catch, though—when you take control, you take responsibility for the result. Instead of committing to a patch, you back out insecure code and redeploy when it is corrected. When your department contracts with a squirrelly vendor, your manager’s budget takes the hit for breaking the contract. When the network is compromised, the CIO, not the CISO, gets fired.

Right now, the security team carries this responsibility and shoulders these risks. But the result is an enterprise held hostage by risk aversion, with no understanding or control over the outcomes.

So far, I’ve mostly addressed IT, but I also want to bring this argument back home: Security professionals, let’s stop taking control of everyone else’s work. When we make hard requirements that do not meet tech realities, our IT teams get better at hiding their tracks. You will make more progress if you invest in mutual success and reward people who step up to exceed your expectations.

When Security and IT Make Peace, Shadow IT Becomes Unnecessary

I once worked with a development team that wanted to store proprietary code in a hosted code repository. The repository was great for their needs: versioning automation, fine-grained access management, easy branching, access from anywhere, and centralized storage. Instead of waiting six months for the new vendor security investigation process, the developer team gathered the vendor’s audit certificates, data handling guarantees, and standard contract language about security and data mining. The devs proactively researched the third-party security scanning policies and asked for their incident response and notification policies.

Our security team would have struggled to locate this repository if the developers had simply chosen to use it. Instead, they circumvented our process in the best way—by providing every necessary answer to our security questions.

The reward was an instant yes from me, the security leader, without having to wait for my overworked team to schedule yet another vendor review.

My reward? No shadow IT plus a very happy IT team.

Security should go beyond allowing compromises like this: we should seek them out. Convince the CISO to work toward giving your IT teams both control and responsibility, find a compromise with the teams that will take security seriously—and save your energy for wrangling teams that don’t.

For admins and developers: Provide the ISO audit documents for that vendor you want to use. Be the first dev team to learn the org’s code scanning tool. Read the latest risk assessments from your cloud environment and don’t repeat vulnerable configurations. These small changes make your work faster, simpler, and less expensive than finding your own solutions.

Source: https://blog.docker.com/feed/

AWS Marketplace now offers pricing model flexibility and simplified deployment for AI agents and tools

AWS Marketplace now offers flexible pricing models, simplified authentication, and streamlined deployment for AI agents and tools. The new capabilities include contract-based and usage-based pricing for Amazon Bedrock AgentCore Runtime containers, and simplified OAuth credential management through Quick Launch for API-based AI agents and tools. Customers can also use supported remote MCP servers procured through AWS Marketplace as MCP targets on AgentCore Gateway, making it easier for them to connect to AI agents and tools from AWS Partners at scale. The improvements reduce deployment complexity while offering pricing models that better align with diverse customer needs.

For Partners, the new capabilities for AI agents and tools streamline management and provide additional pricing options through AWS Marketplace. Partners can now manage all their AI agents and tools listings from one page in the AWS Marketplace Management Portal, reducing the complexity of managing multiple listings across different interfaces. With usage-based and contract-based pricing options for AgentCore Runtime compatible products, Partners have more flexibility to implement pricing strategies that align with their business models and customers’ needs.

Customers can learn more in the buyer guide and start exploring AI agent solutions in AWS Marketplace on the solutions page. For Partners interested in implementing the capabilities, visit the seller guide and complete the workshop.
Source: aws.amazon.com

The Model Context Protocol (MCP) Proxy for AWS is now generally available

Today, AWS announces the general availability of the Model Context Protocol (MCP) Proxy for AWS, a client-side proxy that enables MCP clients to connect to remote, AWS-hosted MCP servers using AWS SigV4 authentication. The Proxy supports popular agentic AI development tools like Amazon Q Developer CLI, Kiro, and Cursor, as well as popular agent frameworks like Strands Agents. Customers can connect to remote MCP servers with AWS credentials using the Proxy, which automatically handles MCP protocol communications via SigV4. The Proxy also helps customers connect to MCP servers built on Amazon Bedrock AgentCore Gateway or Runtime using SigV4 authentication.

This release allows developers and agents to extend development workflows to include AWS service interactions from AWS MCP server tools. For example, you can use AWS MCP servers to work with resources like Amazon S3 buckets or Amazon RDS tables through existing MCP servers with SigV4. The MCP Proxy for AWS includes safety controls such as read-only mode to prevent unintended changes, configurable retry logic for reliability, and logging for troubleshooting. Customers can install the Proxy from source, through Python package managers, or by using a container, making it simple to configure with their preferred MCP-supported development tool.

The MCP Proxy for AWS is open-source and available now. Visit the AWS GitHub repository to view the installation and configuration options and start connecting with remote AWS MCP servers today.
Source: aws.amazon.com

Amazon Connect now supports scheduling of individual agents

Amazon Connect now supports scheduling of individual agents, giving you more flexibility in scheduling your workforce. For example, when onboarding 100 new agents to a business unit with schedules already published for the next two months, you can create schedules for only those new agents and automatically merge them with existing schedules. This eliminates the need for workarounds such as manually copying schedules from existing agents to new agents or regenerating schedules for an entire business unit, thus improving manager productivity and operational efficiency. This feature is available in all AWS Regions where Amazon Connect agent scheduling is available. To learn more about Amazon Connect agent scheduling, click here.
Source: aws.amazon.com

Amazon DynamoDB Accelerator now supports AWS PrivateLink

Amazon DynamoDB Accelerator (DAX) now supports AWS PrivateLink, enabling you to securely access DAX management APIs such as CreateCluster, DescribeClusters, and DeleteCluster over private IP addresses within your virtual private cloud (VPC). DAX clusters already run inside your VPC, and all data plane operations like GetItem and Query are handled privately within the VPC. With this launch, you can now perform cluster management operations privately, without connecting to the public regional endpoint. With AWS PrivateLink, you can simplify private network connectivity between virtual private clouds (VPCs), DAX, and your on-premises data centers using interface VPC endpoints and private IP addresses. It helps you meet compliance regulations and eliminates the need to use public IP addresses, configure firewall rules, or configure an Internet gateway to access DAX from your on-premises data centers. AWS PrivateLink for DAX is available in all Regions where DAX is available today. For information about DAX Regional availability, see the “Service endpoints” section in Amazon DynamoDB endpoints and quotas. There is an additional cost to use the feature. Please see AWS PrivateLink pricing for more details. To get started with DAX and PrivateLink, see AWS PrivateLink for DAX.
Source: aws.amazon.com

Amazon Aurora DSQL now supports FIPS 140-3 compliant endpoints

Amazon Aurora DSQL now supports Federal Information Processing Standards (FIPS) 140-3 compliant endpoints, helping companies contracting with the US federal government meet the FIPS security requirement to encrypt sensitive data in supported Regions. With this launch, you can use Aurora DSQL for workloads that require a FIPS 140-3 validated cryptographic module when sending requests over public or VPC endpoints. Aurora DSQL is the fastest serverless, distributed SQL database with single- and multi-Region clusters providing active-active high availability and strong consistency. Aurora DSQL enables you to build applications with virtually unlimited scalability, the highest availability, and zero infrastructure management. Aurora DSQL FIPS compliant endpoints are now available in the following Regions: US East (N. Virginia), US East (Ohio), and US West (Oregon). To learn more about FIPS 140-3 at AWS, visit FIPS 140-3 Compliance.
Source: aws.amazon.com