Microsoft Sovereign Cloud adds governance, productivity, and support for large AI models securely running even when completely disconnected

As digital sovereignty becomes a strategic requirement, organizations are rethinking how they deploy critical infrastructure and AI capabilities under tighter regulatory expectations and higher risk conditions. Microsoft’s approach to sovereignty is grounded in enabling enterprises, public sectors, and regulated industries to participate in the digital economy securely, independently, and on their own terms. The Microsoft Sovereign Cloud brings together productivity, security, and cloud workloads to span both public and private environments so organizations can choose the right level of control, capability, and compliance. Customers can choose the right control posture for each workload, through a continuum of sovereign options protecting against fragmenting their architecture or increasing operational risk. Trust is built on confidence: confidence that data stays protected, controls are enforceable, and operations can continue under real-world conditions.

To support these confidential environments, Microsoft offers full stack capabilities that support customers across connected, intermittently connected, and fully disconnected modes. Today’s expansion of capabilities includes three major updates:

Azure Local disconnected operations (now available) – Organizations can now run mission-critical infrastructure with Azure governance and policy control, with no cloud connectivity, optimizing continuity for sovereign, classified or isolated environments.

Microsoft 365 Local disconnected (now available) – Core productivity workloads, Exchange Server, SharePoint Server, and Skype for Business Server, can run fully inside the customer’s sovereign operational boundary on Azure Local, keeping teams productive even when disconnected from the cloud.

Foundry Local adds large model and modern infrastructure capabilities – Organizations can now bring large AI models into fully disconnected, sovereign environments with Foundry Local. Using modern infrastructure from partners like NVIDIA, customers with sovereign needs will now be able to run multimodal models locally on their own hardware, inside strict sovereign boundaries enabling powerful, local AI inferencing in fully disconnected environments.

This delivers a truly localized full stack experience built on Azure Local infrastructure and Microsoft 365 Local workloads, designed to stay resilient across any connectivity condition, with large models being part of Foundry Local extending the stack to run advanced multimodal models locally, securely, even when fully disconnected. Customers can now help maintain uninterrupted operations, keep mission critical workloads protected, and apply consistent governance and policy enforcement, while keeping data, identities, and operations within their sovereign boundaries.

Azure Local runs critical infrastructure locally, even when disconnected

For workloads with specialized requirements, Azure Local provides the on-premises foundation with consistent Azure governance and policy controls. With Azure Local disconnected operations, management, policy, and workload execution stay within the customer-operated environments, so services continue running securely even when environments must be isolated or connectivity is not available. Using familiar Azure experiences and consistent policies, organizations can deploy and govern workloads locally without depending on continuous connection to public cloud services. Azure Local is designed to scale with mission-critical needs from smaller deployments to larger footprints that support data-intensive and AI-driven workloads. Customers can start fast, expand over time, and maintain a unified operational model, all within their sovereign boundary.

Operating in disconnected environments surfaces constraints that go beyond traditional cloud assumptions: external dependencies may be unacceptable, connectivity may be intentionally restricted, and operational continuity is a business imperative.

“The availability of Azure Local disconnected operations represents a breakthrough for organizations that need control over their data without sacrificing the power of the Microsoft Cloud. For Luxembourg, where digital sovereignty is not just a principle but a strategic necessity, this model offers the resilience, autonomy, and trust our market expects. By combining Microsoft’s technological leadership with Proximus NXT’s sovereign cloud expertise, we are enabling our customers to innovate confidently – even in fully disconnected mode,”said Gerard Hoffmann, CEO Proximus Luxembourg.

Microsoft 365 Local keeps productivity and collaboration available in fully disconnected environments

As sovereign environments move into disconnected environments, keeping people productive becomes just as critical as keeping infrastructure online. Building on more than a decade of delivering and supporting these services, Microsoft 365 Local disconnected brings that continuity to the productivity layer, delivering Microsoft’s core server workloads—Exchange Server, SharePoint Server,and Skype for Business Server supported through at least 2035—directly into the customer’s sovereign private cloud.

With Microsoft 365 Local, teams can communicate, share information, and collaborate securely within the same controlled boundary as their infrastructure and AI workloads. Everything runs locally, under customer-owned policies, with full control of data resiliency, access, and compliance. By operating with Azure-consistent management and governance, customers get the productivity experience they rely on, designed to stay resilient and secure even when offline.

Bringing large models and modern infrastructure to Foundry Local

With the availability of larger models and modern infrastructure as part of the Foundry Local portfolio, Microsoft is enabling customers with highly secure environments the ability to run multimodal, large models directly inside their sovereign private cloud environments. This brings the richness of Microsoft’s enterprise AI capabilities to on-premises systems, complete with local inferencing and APIs that operate completely within customer-controlled data boundaries.

Expanding beyond small models, the integration of Foundry Local with Azure Local is specifically designed to support large-scale models utilizing the latest GPUs from partners such as NVIDIA. Microsoft will provide comprehensive support for deployments, updates, and operational health. Even as inferencing demands increase overtime, customers retain complete control over their data and hardware.

Choice and control without added complexity

Customers facing strict sovereignty and regulatory requirements are clear that a fully disconnected sovereign private cloud is a key business need. Microsoft Sovereign Private Cloud is designed to meet these needs head-on, enabling secure, compliant operations even in environments with no external connectivity. At the same time, we recognize that disconnected environments are not one-size-fits-all; some customers operate across connected, hybrid, and disconnected modes based on mission, risk, and regulation. Our approach helps customers to meet strict sovereign requirements in fully disconnected scenarios without compromising simplicity, while retaining flexibility where connectivity is possible. Together, Azure Local disconnected operations, Microsoft 365 Local, and Foundry Local help organizations choose where workloads run and how environments are managed, while standardizing governance and operational practices across connected and disconnected deployments.

Next steps

Get started

Azure Local disconnected operations and Microsoft 365 Local disconnected are now available worldwide, and large models on Foundry Local are available to qualified customers.

Explore the Microsoft Sovereign Cloud

Learn more about Azure Local disconnected operations

The post Microsoft Sovereign Cloud adds governance, productivity, and support for large AI models securely running even when completely disconnected appeared first on Microsoft Azure Blog.
Quelle: Azure

Open WebUI + Docker Model Runner: Self-Hosted Models, Zero Configuration

We’re excited to share a seamless new integration between Docker Model Runner (DMR) and Open WebUI, bringing together two open source projects to make working with self-hosted models easier than ever.

With this update, Open WebUI automatically detects and connects to Docker Model Runner running at localhost:12434. If Docker Model Runner is enabled, Open WebUI uses it out of the box, no additional configuration required.

The result: a fully Docker-managed, self-hosted model experience running in minutes.

Note for Docker Desktop users:If you are running Docker Model Runner via Docker Desktop, make sure TCP access is enabled. Open WebUI connects to Docker Model Runner over HTTP, which requires the TCP port to be exposed:

docker desktop enable model-runner –tcp

Better Together: Docker Model Runner and Open WebUI

Docker Model Runner and Open WebUI come from the same open source mindset. They’re built for developers who want control over where their models run and how their systems are put together, whether that’s on a laptop for quick experimentation or on a dedicated GPU host with more horsepower behind it.

Docker Model Runner focuses on the runtime layer: a Docker-native way to run and manage self-hosted models using the tooling developers already rely on. Open WebUI focuses on the experience: a clean, extensible interface that makes those models accessible and useful.

Now, the two connect automatically.

No manual endpoint configuration. No extra flags.

That’s the kind of integration open source does best, separate projects evolving independently, but designed well enough to fit together naturally.

Zero-Config Setup

If Docker Model Runner is enabled, getting started with Open WebUI is as simple as:

docker run -p 3000:8080 openwebui

That’s it.

Open WebUI will automatically connect to Docker Model Runner and begin using your self-hosted models, no environment variables, no manual endpoint configuration, no extra flags.

Visit: http://localhost:3000 and create your account:

And you’re ready to interact with your models through a modern web interface:

Open by design

One of the nice things about this integration is that it didn’t require special coordination or proprietary hooks. Docker Model Runner and Open WebUI are both open source projects with clear boundaries and well-defined interfaces. They were built independently, and they still fit together cleanly.

Docker Model Runner focuses on running and managing models in a way that feels natural to anyone already using Docker.

Open WebUI focuses on making those models usable. It provides the interface layer, conversation management, and extensibility you’d expect from a modern web UI.

Because both projects are open, there’s no hidden contract between them. You can see how the connection works. You can modify it if you need to. You can deploy the pieces separately or together. The integration isn’t a black box, it’s just software speaking a clear interface.

Works with Your Setup

One of the practical benefits of this approach is flexibility.

Docker Model Runner doesn’t dictate where your models run. They might live on your laptop during development, on a more powerful remote machine, or inside a controlled internal environment. As long as Docker Model Runner is reachable, Open WebUI can connect to it.

That separation between runtime and interface is intentional. The UI doesn’t need to know how the model is provisioned. The runtime doesn’t need to know how the UI is presented. Each layer does its job.

With this integration, that boundary becomes almost invisible. Start the container, open your browser, and everything lines up.

You decide where the models run. Open WebUI simply meets them there.

Summary

Open WebUI and Docker Model Runner make self-hosted AI simple, flexible and fully under your control. Docker powers the runtime. Open WebUI delivers a modern interface on top. 

With automatic detection and zero configuration, you can go from enabling Docker Model Runner to interact with your models in minutes. 

Both projects are open source and built with clear boundaries, so you can run models wherever you choose and deploy the pieces together or separately. We can’t wait to see what you build next! 

How You Can Get Involved

The strength of Docker Model Runner lies in its community and there’s always room to grow. We need your help to make this project the best it can be. To get involved, you can:

Star the repository: Show your support and help us gain visibility by starring the Docker Model Runner repo.

Contribute your ideas: Have an idea for a new feature or a bug fix? Create an issue to discuss it. Or fork the repository, make your changes, and submit a pull request. We’re excited to see what ideas you have!

Spread the word: Tell your friends, colleagues, and anyone else who might be interested in running AI models with Docker.

We’re incredibly excited about this new chapter for Docker Model Runner, and we can’t wait to see what we can build together. Let’s get to work!

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Docker Model Runner with a simple hello GenAI application

Quelle: https://blog.docker.com/feed/

Announcing new metal sizes for Amazon EC2 M8gn and M8gb instances

Today, AWS announces the general availability of metal-24xl and metal-48xl sizes for Amazon Elastic Compute Cloud (Amazon EC2) M8gn and M8gb instances. These instances are powered by AWS Graviton4 processors to deliver up to 30% better compute performance than AWS Graviton3 processors. M8gn instances feature the latest 6th generation AWS Nitro Cards, and offer up to 600 Gbps network bandwidth, the highest network bandwidth among network optimized EC2 instances. M8gb offers up to 300 Gbps of EBS bandwidth to provide higher EBS performance compared to same-sized equivalent Graviton4-based instances.
M8gn and M8gb instances offer instance sizes up to 48xlarge and metal-48xl, with up to 768 GiB of memory. M8gn instances offer up to 600 Gbps of networking bandwidth, up to 60 Gbps of bandwidth to Amazon Elastic Block Store (EBS), and are ideal for network-intensive workloads such as high-performance file systems, distributed web scale in-memory caches, caching fleets, real-time big data analytics, Telco applications such as 5G User Plane Function (UPF). M8gb instances offer up to 300 Gbps of EBS bandwidth, up to 400 Gbps of networking bandwidth, and are ideal for workloads requiring high block storage performance such as high-performance databases and NoSQL databases.
M8gn and M8gb instances support Elastic Fabric Adapter (EFA) networking on 16xlarge, 24xlarge, 48xlarge, metal-24xl, and metal-48xl sizes. EFA networking enables lower latency and improved cluster performance for workloads deployed on tightly coupled clusters.
The new metal-24xl and metal-48xl sizes are available in the AWS US East (N. Virginia) region. 
To begin your Graviton journey, visit the Level up your compute with AWS Graviton page. To get started, see AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs.
Quelle: aws.amazon.com