Comparing Different Approaches to Sandboxing

“AI agents will become the primary way we interact with computers in the future. They will be able to understand our needs and preferences, and proactively help us with tasks and decision making.”

Satya Nadella
CEO of Microsoft

Whether you are a software engineer, a product manager, or a designer, this quote should fundamentally change how you approach your daily routine. We are no longer just building interfaces; we are creating environments where agents can operate autonomously with minimal human interaction. What is the fundamental requirement for such an environment?

In a single word: Isolation.

A user interacting with traditional software is constrained by the actions it allows. But agents are non-deterministic, and therefore prone to hallucinations and prompt injection. Once you give an AI write access to your systems, there is nothing stopping it from running rm -rf and deleting all your data. There are different ways to solve this problem, one of which is sandboxing: an isolated, controlled environment used for experimentation and testing without affecting the surrounding system.

So, I started exploring different strategies to sandbox the agents. Starting with a bare minimum setup and going all the way to setting up a cloud VM. Here is what I learned at each step.

1. Let’s Start with the Baseline

Chroot has been the traditional way to achieve file system isolation. It works well when you want the process to think that a specific, restricted directory is the absolute root of the machine.
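As a minimal sketch (the paths and the busybox trick are my own illustration, not from the article), a chroot jail can be assembled like this:

```shell
# Build a tiny root filesystem; entering it requires root, so that step is
# shown as a comment. busybox is handy because it is statically linked,
# so no shared libraries need to be copied in alongside it.
set -eu
ROOT=./mini-root
mkdir -p "$ROOT/bin" "$ROOT/proc"
if command -v busybox >/dev/null 2>&1; then
  cp "$(command -v busybox)" "$ROOT/bin/sh"
fi
# Enter the jail (root required):
#   sudo chroot "$ROOT" /bin/sh
# Inside, / is $ROOT. But if /proc gets mounted, the host's processes
# reappear -- the process-isolation gap described below:
#   mount -t proc proc /proc && ls /proc
echo "prepared $ROOT"
```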

However, there are two major caveats.

If the process inside the chroot has root privileges, it could break out.

While it offers file isolation, process isolation is still a problem. A malicious agent can still see other processes running on your system and try to kill them.

As you can see above, running an ls /proc still shows all the processes running on the host.

This is when I learnt about systemd-nspawn, also called “chroot on steroids”. The difference between chroot and systemd-nspawn is that the latter provides isolation at the network and process levels in addition to the file system.

Now, when I run the same ls /proc inside the systemd-nspawn mybox container, I see only the processes running in mybox, giving us process-level isolation.
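The mybox setup above can be sketched as a script (assuming a Debian-based host with debootstrap and systemd-container installed; the commands need root, so they are written to a file for review rather than executed here):

```shell
# Hypothetical walkthrough; "mybox" matches the container name used above.
cat > nspawn-demo.sh <<'EOF'
#!/bin/sh
# 1. Build a minimal Debian root filesystem:
sudo debootstrap stable /var/lib/machines/mybox
# 2. Boot it as a container; --private-network also hides the host's
#    network interfaces, on top of file-system and process isolation:
sudo systemd-nspawn -D /var/lib/machines/mybox --private-network
# 3. Inside the container, /proc lists only the container's own PIDs:
#      ls /proc
EOF
echo "wrote nspawn-demo.sh"
```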

Pros

Lightweight compared to container runtimes like Docker, with faster startup times.

Native support in Linux.

Caveats

systemd-nspawn is not very popular in the developer community unless you are deep into Linux.

While this works for Linux, what if you need to run your agents on Windows? You will have to find alternatives depending on the platform.

2. Are Containers Enough?

Another technology that comes to mind when thinking about isolated environments is Docker. And unlike the previous concepts we discussed, Docker has a broader ecosystem and a strong community.

With containers, you also get isolated file systems, network interfaces, and process trees. They also come with cross-platform support across Mac, Windows, and Linux. With all these advantages, creating and running agents across different platforms becomes very easy, which makes containers an obvious choice.

However, the model becomes more complex when containers become a dev platform for agents. More often than not, agents need to execute generated code in separate environments, which in practice means spinning up new Docker containers on demand. This introduces a container-in-container pattern (Docker-in-Docker), where an agent running inside a container needs to build and run other containers. 

To make Docker-in-Docker work, we would have to run the container in privileged mode (--privileged), which gives the container's processes elevated permissions and dramatically weakens the isolation. As a result, achieving complete isolation for agents using containers alone becomes tricky.
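For illustration, the pattern looks like this (docker:dind is the official Docker-in-Docker image; the container name is a placeholder, and the commands are written to a script rather than executed, since they require a Docker daemon and privileged mode):

```shell
cat > dind-demo.sh <<'EOF'
#!/bin/sh
# The outer container runs its own Docker daemon; --privileged is what
# weakens the isolation: the inner daemon gets access to host devices
# and the full set of kernel capabilities.
docker run -d --privileged --name agent-dind docker:dind
# The agent's docker commands now run against the inner daemon:
docker exec agent-dind docker run --rm hello-world
docker rm -f agent-dind
EOF
echo "wrote dind-demo.sh"
```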

3. Do Virtual Machines Help?

As you might have already predicted, Virtual Machines (VMs) offer the strongest isolation. With a VM, you get an entire OS, file system, and network of your own. For example, I currently run macOS and use Lima, a Linux VM, for Linux-specific workloads.
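That Lima workflow looks roughly like this (the Homebrew install path and Lima's default Ubuntu template are assumptions about my setup, not requirements):

```shell
# Sketch of the Lima workflow; the VM commands are shown as comments since
# booting a guest takes a while and needs virtualization support:
#   brew install lima
#   limactl start default    # boots the guest Linux VM
#   lima uname -a            # runs a command inside the guest kernel
#   limactl stop default
# Quick check whether Lima is present on this machine:
LIMA_STATUS=$( command -v limactl >/dev/null 2>&1 && echo installed || echo "not installed" )
echo "lima: $LIMA_STATUS"
```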

However, the tradeoff is that spinning up a VM is expensive, and if this needs to be done for every agent, it does not scale. Here are some rough numbers comparing a full VM against systemd-nspawn and chroot.

| Approach | Per-Agent Cost | Boot Time | 10 Agents |
| --- | --- | --- | --- |
| VM (Lima) | ~4GB RAM + 4 CPU | 30-60s | ~40GB RAM |
| systemd-nspawn | ~10MB RAM | < 1s | ~100MB RAM |
| chroot | ~1MB RAM | instant | ~10MB RAM |

For example, in the screenshot below you can see the resource cost of running a Lima VM.

4. MicroVMs to the rescue

MicroVMs (Micro Virtual Machines) felt like the perfect answer to the isolation story. So what is a MicroVM, and what makes it better?

A MicroVM is a lightweight virtualisation technology that provides the strong security and isolation of a traditional VM along with the speed of a container.

Strong security and isolation are enabled because a MicroVM gets its own kernel, aka the Guest Kernel, unlike containers, which use a shared kernel. Because of this, any compromise inside the Guest OS does not directly affect the host or the other VMs.

Speed: unlike traditional VMs, it is provisioned with minimal hardware (no USB or PCI buses) and bypasses BIOS/UEFI boot, significantly reducing device emulation overhead and startup latency.

Amazon open-sourced Firecracker in 2018, one of the earliest implementations of the MicroVM architecture. While this helped catalyze MicroVM adoption, Firecracker is restricted to Linux environments, and most agentic orchestration tends to happen on developers’ laptops, which run macOS and Windows as well.

Docker addressed this gap with its Sandbox offering. The best part is its MicroVM-based architecture, which runs natively across macOS, Windows, and Linux, delivering better isolation, faster startup times, and a smoother developer experience. We will learn about this in a bit.

5. gVisor

gVisor takes a unique approach to solving the isolation problem. While the previous strategies rely on the host OS kernel, gVisor implements its own kernel, called the “application kernel”, which runs in user space.

When a standard containerized app wants to do something like open a file, allocate memory, or send network traffic, it makes a “system call” (syscall) directly to the host’s Linux kernel.

With gVisor, your app is bundled with a component called the Sentry.

The Sentry intercepts every single syscall your application makes.

It processes that request in user-space using its own implementation of Linux networking, file systems, and memory management.

If the Sentry absolutely needs the host kernel to do something (like actual disk I/O), it translates the request into an extremely restricted, heavily filtered, safe call to the host.
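A quick way to see the Sentry answering syscalls, per gVisor's own setup guide, is to run a container with the runsc runtime (held in a variable rather than executed here, since it assumes Docker with gVisor installed and registered as a runtime):

```shell
# Hypothetical demo command; requires a gVisor install to actually run.
RUNSC_DEMO="docker run --rm --runtime=runsc alpine dmesg"
# Under runsc, dmesg prints gVisor's own startup messages instead of the
# host's boot log: the "kernel" that booted is the user-space Sentry.
echo "$RUNSC_DEMO"
```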

However, it suffers from the same problem as systemd-nspawn: limited community support, and it only runs on Linux.

6. Docker Sandbox

With Docker Sandboxes, AI coding agents run in isolated microVM environments. The performance is as seamless as it can be, identical to running on the host, but with significantly stronger isolation and security. This means you can run your autonomous agents without worrying about host compromise or unintended access to your local environment. 

Sandbox achieves this level of security through three layers of isolation:

Hypervisor Isolation: Every Sandbox has its own Linux Kernel. So, anything that affects the sandbox kernel will not affect the host or other sandbox kernels.

Network Isolation

Each Sandbox has its own isolated network, meaning sandboxes cannot communicate with each other or with the host.

In addition, network policies can be enforced to allow or disallow traffic from a source.

Docker Engine Isolation

This is what made me fall in love with this new architecture. Every Sandbox gets its own Docker Engine. As a result, whenever the agent runs docker pull or docker compose, those commands are executed against the internal engine rather than the external Docker daemon.

Because of this, agents running inside can only see Docker services within their sandbox and nothing else, adding an additional layer of security.

| Attribute | Traditional VM | Container | Docker MicroVM |
| --- | --- | --- | --- |
| Isolation | Strong (dedicated kernel) | Weak (shared kernel) | Strong (dedicated kernel) |
| Boot time | Minutes | Milliseconds | Seconds (after the first image pull) |
| Attack Surface | Large | Medium | Minimal |

To demonstrate Docker Engine isolation, I created two Sandbox sessions, ran the Docker hello-world container image in one, and then ran docker ps -a in both.

As you can see from the screenshot below, one session has the hello-world container and the other does not. This is possible because they are running two separate Docker Engine daemons.

More on the Sandbox architecture here: https://www.docker.com/blog/why-microvms-the-architecture-behind-docker-sandboxes/

Conclusion

If there is one takeaway, it’s this: isolation plays a major role when building autonomous AI agents because the blast radius of a security mistake is significant.

Each approach we explored so far solves a different piece of the isolation puzzle. Containers improve portability and developer experience, but inherit the risks of a shared kernel. Virtual Machines deliver strong isolation, but the overhead doesn’t scale when you’re spinning up dozens of agents. gVisor sits in an interesting middle ground, though compatibility and community trade-offs might slow you down.

Among all these, what makes Docker Sandbox with MicroVMs compelling is how it unifies these dimensions: VM-level security, container-like startup speed, and a workflow developers already know. Per-sandbox Docker Engines and strict network boundaries make it a strong foundation for running untrusted, autonomous workloads at scale.

So, what are you waiting for? Go ahead and try it out today.

For macOS: brew install docker/tap/sbx

For Windows: winget install Docker.sbx

Source: https://blog.docker.com/feed/

AWS India customers can now use UPI Scan and Pay for sign-up and payments

India customers can now use UPI (Unified Payments Interface) Scan and Pay to sign up for AWS or make payments to their invoices. UPI is a popular and convenient payment method in India, which facilitates instant bank-to-bank transfers between two parties through mobile phones with internet. The new Scan and Pay experience simplifies payments by allowing customers to scan a QR code displayed on the AWS Console using their UPI mobile app (such as Google Pay, PhonePe, Paytm, or Amazon Pay), eliminating the need to manually enter a UPI ID. This enhancement makes the UPI payment experience more secure, convenient, and error-free for customers signing up for AWS or making one-time payments. Scan and Pay reduces friction and aligns with how customers commonly use UPI for everyday transactions. Customers can also set up UPI AutoPay using Scan and Pay for automatic monthly payments up to INR 15,000. To use this feature, customers log in to the AWS Console and select UPI as their payment method during signup or when making a payment. A QR code is displayed on screen, which customers scan using their UPI mobile app to verify and authorize the transaction. To learn more, see Managing Payment Methods in India.
Source: aws.amazon.com

Amazon SageMaker HyperPod now supports AMI-based node lifecycle configuration for Slurm clusters

Amazon SageMaker HyperPod now supports AMI-based configuration that provisions Slurm cluster nodes with the software and configurations needed for a production-ready environment to run AI/ML training workloads. This removes the need to download, configure, or upload lifecycle configuration scripts to Amazon S3. With fewer operational steps to prepare a cluster and no lifecycle configuration scripts executing during node provisioning, cluster creation time is significantly reduced, so you can start running jobs sooner.
AMI-based configuration includes required software such as Docker, Enroot, and Pyxis, and configurations such as Slurm accounting, SSH key generation, Slurm log rotation and user home directory setup. To enable AMI-based configuration, omit the LifeCycleConfig block from the instance group configuration when creating clusters using the CreateCluster API, or when using the SageMaker AI console, select “None” under Lifecycle scripts in Custom setup. For additional customization on top of the AMI-based configuration baseline, an extension script can be provided, allowing you to focus only on what capabilities and software to add, such as user configuration, observability, or LDAP integration.
Extension scripts can be configured when creating clusters through both the API and the SageMaker AI console. Using the CreateCluster API, specify the new OnInitComplete parameter and SourceS3Uri in the LifeCycleConfig block. Via the console, provide the S3 URI to the extension script in the “Extension script file in S3” field in Custom setup. For advanced use cases that require full control over provisioning, custom lifecycle configuration scripts remain fully supported through both the API and the SageMaker AI console.
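A hypothetical request body for the extension-script variant might look like the following (field names follow the announcement; the account ID, instance type, bucket, and file names are placeholders and have not been validated against a real account):

```shell
# Sketch: write the CreateCluster input to a file for review.
cat > create-cluster.json <<'EOF'
{
  "ClusterName": "demo-hyperpod",
  "InstanceGroups": [
    {
      "InstanceGroupName": "worker-group",
      "InstanceType": "ml.g5.8xlarge",
      "InstanceCount": 2,
      "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodRole",
      "LifeCycleConfig": {
        "SourceS3Uri": "s3://my-bucket/lifecycle/",
        "OnInitComplete": "extension.sh"
      }
    }
  ]
}
EOF
# Omit the "LifeCycleConfig" block entirely to use the pure AMI-based baseline.
# Then submit with: aws sagemaker create-cluster --cli-input-json file://create-cluster.json
echo "wrote create-cluster.json"
```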
This feature is available in all AWS Regions where SageMaker HyperPod is available. To get started with creating HyperPod Slurm clusters with AMI-based node lifecycle configuration, see Getting started with SageMaker HyperPod using the AWS CLI or Getting started with SageMaker HyperPod using the SageMaker AI console in the SageMaker AI developer guide.
Source: aws.amazon.com

AWS Elemental MediaTailor launches Monetization Functions

AWS Elemental MediaTailor now supports monetization functions, a new capability that lets customers customize how MediaTailor builds ad decision server (ADS) requests and manages session data during ad-personalized playback. With monetization functions, customers can call external APIs and run inline data transformations at defined points in the playback session — eliminating the need to build and operate middleware between the player and the ADS.
Common use cases include resolving hashed email addresses into privacy-compliant identity envelopes through providers such as LiveRamp, appending contextual metadata from a content management system to every ad request through providers like GraceNote, activating header bidding workflows through providers like The Trade Desk, and running A/B tests across multiple ad decision servers. Monetization functions are fail-open by design: if a function encounters an error, exceeds its timeout, or hits a resource limit, MediaTailor discards the output and proceeds with default ad-insertion behavior, so viewers’ playback is never interrupted.
Monetization functions are generally available in all AWS Regions where AWS Elemental MediaTailor operates. You are billed per lifecycle hook invocation at a flat rate that does not depend on the number, type, or complexity of functions. For full details, see the MediaTailor pricing page, the Monetization Functions section of the MediaTailor User Guide, and the MediaTailor product page.
Source: aws.amazon.com

AWS Capabilities by Region now supports availability notifications

Today, AWS announces availability notifications for AWS Capabilities by Region in AWS Builder Center, a new subscription-based system that automatically alerts builders when AWS services and/or features become available in their target Regions. Availability notifications make it easy for builders to track availability of 1,500+ services and features across 37 AWS Regions, accelerating infrastructure planning and deployment decisions.
With availability notifications, builders can subscribe at the service level through AWS Builder Center UI, and the subscription automatically covers all underlying features across selected Regions, so there’s no need to track each feature individually. Notifications are delivered through two channels: instantaneous in-app alerts within AWS Builder Center, and a consolidated weekly email digest. Subscriptions and notification preferences can be managed through Settings > Notifications in AWS Builder Center. Common use cases include tracking a specific capability launch, monitoring service parity across AWS Regions, and preparing for upcoming migrations or Regional expansions. For example, a solutions architect expanding a generative AI application into new Regions can subscribe to Amazon Bedrock and receive automatic updates as Knowledge Bases, Guardrails, and other features become available.
Source: aws.amazon.com

Amazon SageMaker Unified Studio adds identity and user management features

Amazon SageMaker Unified Studio announces new administration features that give administrators more control over identity configuration and user management for both IAM and Identity Center domain types. In SageMaker IAM domains, administrators can now onboard users through single sign-on by configuring AWS IAM Identity Center. After configuration, administrators can add IAM roles, IAM users, IAM Identity Center users, and IAM Identity Center groups as project members. Teams can collaborate on project data and resources regardless of how individual members authenticate. Administrators can set up IAM Identity Center integration in the SageMaker Unified Studio admin portal.
A new domain user management page for SageMaker IAM domains gives administrators a consolidated view of all users active in the domain, where they can manage access and update permissions from a single screen. In SageMaker Identity Center domains, users can now access the SageMaker Unified Studio portal by federating through an IAM role. SageMaker Unified Studio creates a unique user session for each federated user, so users sharing the same role don’t overwrite each other’s work. Administrators can audit individual actions even when multiple users share a single IAM role.
With these features, customers can use IAM identity or IAM Identity Center corporate identity across both domain types, giving teams flexibility to collaborate in SageMaker Unified Studio regardless of their authentication method. These features are available in the following AWS Regions: Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), South America (São Paulo), US East (N. Virginia), US East (Ohio), and US West (Oregon). To learn more, visit the SageMaker Unified Studio documentation.
Source: aws.amazon.com