Project Flash update: Advancing Azure Virtual Machine availability monitoring

Previously, we shared an update on Project Flash as part of our Advancing Reliability blog series, reaffirming our commitment to helping Azure customers detect and diagnose virtual machine (VM) availability issues with speed and precision. This year, we’re excited to unveil the latest innovations that take VM availability monitoring to the next level—enabling customers to operate their workloads on Azure with even greater confidence. I’ve asked Yingqi (Halley) Ding, Technical Program Manager from the Azure Core Compute team, to walk us through the newest investments powering the next phase of Project Flash.
— Mark Russinovich, CTO, Deputy CISO, and Technical Fellow, Microsoft Azure.

Project Flash is a cross-division initiative at Microsoft. Its vision is to deliver precise telemetry, real-time alerts, and scalable monitoring—all within a unified, user-friendly experience designed to meet the diverse observability needs of virtual machine (VM) availability.

Flash addresses both platform-level and user-level challenges. It enables rapid detection of issues originating from the Azure platform, helping teams respond quickly to infrastructure-related disruptions. At the same time, it equips you with actionable insights to diagnose and resolve problems within your own environment. This dual capability supports high availability and helps ensure your business Service-Level Agreements are consistently met. It’s our mission to ensure you can:

Gain clear visibility into disruptions, such as VM reboots and restarts, application freezes due to network driver updates, and 30-second host OS updates—with detailed insights into what happened, why it occurred, and whether it was planned or unexpected.

Analyze trends and set alerts to speed up debugging and track availability over time.

Monitor at scale and build custom dashboards to stay on top of the health of all resources.

Receive automated root cause analyses (RCAs) that explain which VMs were affected, what caused the issue, how long it lasted, and what was done to fix it.

Receive real-time notifications for critical events, such as degraded nodes requiring VM redeployment, platform-initiated service healing, or in-place reboots triggered by hardware issues—empowering your teams to respond swiftly and minimize user impact.

Adapt recovery policies dynamically to meet changing workload needs and business priorities.

Over the course of its journey, Flash has garnered widespread adoption from some of the world’s leading companies across e-commerce, gaming, finance, hedge funds, and many other sectors. Their extensive use of Flash underscores its effectiveness and value in meeting the diverse needs of high-profile organizations.

At BlackRock, VM reliability is critical to our operations. If a VM is running on degraded hardware, we want to be alerted quickly so we have the maximum opportunity to mitigate the issue before it impacts users. With Project Flash, we receive a resource health event integrated into our alerting processes the moment an underlying node in Azure infrastructure is marked unallocatable, typically due to health degradation. Our infrastructure team then schedules a migration of the affected resource to healthy hardware at an optimal time. This ability to predictively avoid abrupt VM failures has reduced our VM interruption rate and improved the overall reliability of our investment platform.
— Eli Hamburger, Head of Infrastructure Hosting, BlackRock.

Learn more about Project Flash

Suite of solutions available today

The Flash initiative has evolved into a robust, scalable monitoring framework designed to meet the diverse needs of modern infrastructure—whether you’re managing a handful of VMs or operating at massive scale. Built with reliability at its core, Flash empowers you to monitor what matters most, using the tools and telemetry that align with your architecture and operational model.

Flash publishes VM availability states and resource health annotations for detailed failure attribution and downtime analysis. The guide below outlines your options so you can choose the right Flash monitoring solution for your scenario.

Azure Resource Graph (general availability): For investigations at scale, centralized resource repositories, and historical lookups, you can periodically consume resource availability telemetry across all workloads at once using Azure Resource Graph (ARG).

Event Grid system topic (public preview): To trigger time-sensitive and critical mitigations, such as redeploying or restarting VMs to prevent end-user impact, you can receive alerts within seconds of critical changes in resource availability via Event Handlers in Event Grid.

Azure Monitor – Metrics (public preview): To track trends, aggregate platform metrics (e.g., CPU, disk), and configure precise threshold-based alerts, you can consume an out-of-the-box VM availability metric via Azure Monitor.

Resource Health (general availability): To perform instantaneous and convenient per-resource health checks in the Portal UI, you can quickly view the RHC blade. You can also access a 30-day historical view of health checks for that resource to support fast and effective troubleshooting.
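For the Azure Resource Graph option, availability states surface in the HealthResources table and can be summarized client-side. The sketch below shows a query you might submit (via the Resource Graph API or `az graph query`) and a small helper that tallies the projected result rows; the sample rows and resource IDs are illustrative, not real telemetry.

```python
# KQL you would submit to Azure Resource Graph; field names follow the
# public HealthResources table (verify against your tenant's schema).
AVAILABILITY_QUERY = """
healthresources
| where type == 'microsoft.resourcehealth/availabilitystatuses'
| extend state = tostring(properties.availabilityState)
| project resourceId = tolower(id), state
"""

def summarize_states(rows):
    """Count VMs per availability state from Resource Graph result rows."""
    summary = {}
    for row in rows:
        state = row.get("state", "Unknown")
        summary[state] = summary.get(state, 0) + 1
    return summary

# Illustrative rows, shaped like the projected query output above.
sample_rows = [
    {"resourceId": "/subscriptions/.../vm-a", "state": "Available"},
    {"resourceId": "/subscriptions/.../vm-b", "state": "Unavailable"},
    {"resourceId": "/subscriptions/.../vm-c", "state": "Available"},
]

print(summarize_states(sample_rows))  # {'Available': 2, 'Unavailable': 1}
```

Running this query on a schedule is one way to build the historical, fleet-wide view the table describes.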

Figure 1: Flash endpoints

What’s new?

Public preview: User vs platform dimension introduced for VM availability metric

Many customers have emphasized the need for user-friendly monitoring solutions that provide real-time, scalable access to compute resource availability data. This information is essential for triggering timely mitigation actions in response to availability changes.

Designed to satisfy this critical need, the VM availability metric is well-suited for tracking trends, aggregating platform metrics (such as CPU and disk usage), and configuring precise threshold-based alerts. You can utilize this out-of-the-box VM availability metric in Azure Monitor.

Figure 2: VM availability metric

Now you can use the Context dimension to identify whether VM availability was influenced by Azure or user-orchestrated activity. This dimension indicates, during any disruption or when the metric drops to zero, whether the cause was platform-triggered or user-driven. It can assume values of Platform, Customer, or Unknown.
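To picture how the Context dimension feeds downtime attribution, the sketch below tallies unavailable minutes per context value from per-minute metric samples (1 = available, 0 = unavailable). The sample series is illustrative, not real metric data.

```python
# Sketch: attributing VM downtime using the Context dimension on the
# VM availability metric. Each point is (value, context); values and
# contexts here are invented for illustration.

def downtime_by_context(points):
    """Return minutes of downtime attributed to each context value."""
    minutes = {"Platform": 0, "Customer": 0, "Unknown": 0}
    for value, context in points:
        if value == 0:
            minutes[context] = minutes.get(context, 0) + 1
    return minutes

samples = [
    (1, "Platform"), (0, "Platform"), (0, "Platform"),  # platform-initiated dip
    (0, "Customer"),                                    # user-initiated restart
    (1, "Platform"),
]

print(downtime_by_context(samples))  # {'Platform': 2, 'Customer': 1, 'Unknown': 0}
```

Splitting a chart or alert rule by this dimension in Azure Monitor gives you the same breakdown without client-side code.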

Figure 3: Context dimension

The new dimension is also supported in Azure Monitor alert rules as part of the filtering process.

Figure 4: Azure Monitor alert rule

Public preview: Send health resources events to Azure Monitor alerts in Event Grid

Azure Event Grid is a highly scalable, fully managed publish-subscribe message distribution service that offers flexible message consumption patterns. It supports publish-subscribe messaging for Internet of Things (IoT) solutions and, through HTTP, lets you build event-driven solutions in which a publisher service (such as Project Flash) announces its system state changes (events) to subscriber applications.

Figure 5: Event Grid system topics

With the integration of Azure Monitor alerts as a new event handler, you can now receive low-latency notifications—such as VM availability changes and detailed annotations—via SMS, email, push notifications, and more. This combines Event Grid’s near real-time delivery with Azure Monitor’s direct alerting capabilities.
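A subscriber that reacts to these events typically inspects the event type and the reported availability state before triggering mitigation. The sketch below parses an illustrative payload; the field names follow the Event Grid event schema and the health resources availability-status event, but verify the exact shape against your own subscription before relying on it.

```python
import json

# Sketch of an event handler for Flash availability events delivered via an
# Event Grid system topic. The payload is illustrative; check the schema of
# your subscription's events before using these field paths.

def should_redeploy(event):
    """Decide whether a VM needs mitigation based on its availability event."""
    if "AvailabilityStatusChanged" not in event["eventType"]:
        return False
    state = event["data"]["resourceInfo"]["properties"]["availabilityState"]
    return state in ("Unavailable", "Degraded")

raw = json.dumps({
    "eventType": "Microsoft.ResourceNotifications.HealthResources.AvailabilityStatusChanged",
    "subject": "/subscriptions/.../virtualMachines/vm-a",
    "data": {"resourceInfo": {"properties": {"availabilityState": "Degraded"}}},
})

print(should_redeploy(json.loads(raw)))  # True
```

With the Azure Monitor alerts handler described above, the same event can instead fan out directly to SMS, email, or push notifications with no handler code at all.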

Figure 6: Event Grid subscription

To get started, simply follow the step-by-step instructions and begin receiving real-time alerts with Flash’s new offering.

What’s next?

Looking ahead, we plan to broaden our focus to include scenarios such as inoperable top-of-rack switches, failures in accelerated networking, and new classes of hardware failure prediction. In addition, we aim to continue enhancing data quality and consistency across all Flash endpoints—enabling more accurate downtime attribution and deeper visibility into VM availability.

For comprehensive monitoring of VM availability—including scenarios such as routine maintenance, live migration, service healing, and degradation—we recommend leveraging both Flash Health events and Scheduled Events (SE).

Flash Health events offer real-time insights into ongoing and historical availability disruptions, including VM degradation. This facilitates effective downtime management, supports automated mitigation strategies, and enhances root cause analysis.

Scheduled Events, in contrast, provide up to 15 minutes of advance notice prior to planned maintenance, enabling proactive decision-making and preparation. During this window, you may choose to acknowledge the event or defer actions based on your operational readiness.
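Inside a VM, Scheduled Events are read from the instance metadata endpoint and acknowledged to start an event early. The endpoint, header, and acknowledgement body below follow the documented Scheduled Events service; the sample document itself is illustrative.

```python
# Sketch: consuming Scheduled Events from inside a VM. In a real VM you
# would fetch the document with
#   requests.get(SE_URL, headers={"Metadata": "true"}).json()
SE_URL = "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"

def events_needing_ack(doc, resource_name):
    """Return IDs of not-yet-started events that affect this VM."""
    return [
        e["EventId"]
        for e in doc.get("Events", [])
        if resource_name in e.get("Resources", []) and e.get("EventStatus") == "Scheduled"
    ]

# Illustrative Scheduled Events document.
sample_doc = {
    "DocumentIncarnation": 2,
    "Events": [
        {"EventId": "602d9444", "EventType": "Reboot",
         "EventStatus": "Scheduled", "Resources": ["vm-a"]},
    ],
}

print(events_needing_ack(sample_doc, "vm-a"))  # ['602d9444']
# To acknowledge (start the event early), POST to SE_URL:
#   {"StartRequests": [{"EventId": "602d9444"}]}
```

Pairing this polling loop with Flash Health events covers both the planned and unplanned sides of availability.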

For upcoming updates on the Flash initiative, we encourage you to follow the Advancing Reliability blog series!
The post Project Flash update: Advancing Azure Virtual Machine availability monitoring appeared first on Microsoft Azure Blog.
Source: Azure

Microsoft Azure AI Foundry Models and Microsoft Security Copilot achieve ISO/IEC 42001:2023 certification

Microsoft has achieved ISO/IEC 42001:2023 certification—a globally recognized standard for Artificial Intelligence Management Systems (AIMS)—for both Azure AI Foundry Models and Microsoft Security Copilot. This certification underscores Microsoft’s commitment to building and operating AI systems responsibly, securely, and transparently. As responsible AI rapidly becomes a business and regulatory imperative, this certification reflects how Microsoft enables customers to innovate with confidence.

Create with Azure AI Foundry Models

Raising the bar for responsible AI with ISO/IEC 42001

ISO/IEC 42001, developed by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), establishes a globally recognized framework for the management of AI systems. It addresses a broad range of requirements, from risk management and bias mitigation to transparency, human oversight, and organizational accountability. This international standard provides a certifiable framework for establishing, implementing, maintaining, and improving an AI management system, supporting organizations in addressing risks and opportunities throughout the AI lifecycle.

By achieving this certification, Microsoft demonstrates that Azure AI Foundry Models, including Azure OpenAI models, and Microsoft Security Copilot prioritize responsible innovation and are validated by an independent third party. It gives customers added assurance that these services are developed and operated with robust governance, risk management, and compliance practices, in alignment with Microsoft’s Responsible AI Standard.

Supporting customers across industries

Whether customers are deploying AI in regulated industries, embedding generative AI into products, or exploring new AI use cases, this certification helps them:

Accelerate their own compliance journey by leveraging certified AI services and inheriting governance controls aligned with emerging regulations.

Build trust with their own users, partners, and regulators through transparent, auditable governance evidenced with the AIMS certification for these services.

Gain transparency into how Microsoft manages AI risks and governs responsible AI development, giving users greater confidence in the services they build on.

Engineering trust and responsible AI into the Azure platform

Microsoft’s Responsible AI (RAI) program is the backbone of our approach to trustworthy AI and includes four core pillars—Govern, Map, Measure, and Manage—which guide how we design, customize, and manage AI applications and agents. These principles are embedded into both Azure AI Foundry Models and Microsoft Security Copilot, resulting in services designed to be innovative, safe, and accountable.

We are committed to delivering on our Responsible AI promise and continue to build on our existing work which includes:

Our AI Customer Commitments to assist our customers on their responsible AI journey.

Our inaugural Responsible AI Transparency Report that enables us to record and share our maturing practices, reflect on what we have learned, chart our goals, hold ourselves accountable, and earn the public’s trust.

Our Transparency Notes for Azure AI Foundry Models and Microsoft Security Copilot, which help customers understand how our AI technology works, its capabilities and limitations, and the choices system owners can make that influence system performance and behavior.

Our Responsible AI resources site, which provides tools, practices, templates, and information we believe will help many of our customers establish their responsible AI practices.

Supporting your responsible AI journey with trust

We recognize that responsible AI requires more than technology; it requires operational processes, risk management, and clear accountability. Microsoft supports customers in these efforts by providing both the platform and the expertise to operationalize trust and compliance. Microsoft remains steadfast in our commitment to the following:

Continually improving our AI management system.

Understanding the needs and expectations of our customers.

Building onto the Microsoft RAI program and AI risk management.

Identifying and acting on opportunities that allow us to build and maintain trust in our AI products and services.

Collaborating with the growing community of responsible AI practitioners, regulators, and researchers on advancing our responsible AI approach.  

ISO/IEC 42001:2023 joins Microsoft’s extensive portfolio of compliance certifications, reflecting our dedication to operational rigor and transparency, helping customers build responsibly on a cloud platform designed for trust. From a healthcare organization striving for fairness to a financial institution overseeing AI risk, or a government agency advancing ethical AI practices, Microsoft’s certifications enable the adoption of AI at scale while aligning compliance with evolving global standards for security, privacy, and responsible AI governance.

Microsoft’s foundation in security and data privacy, together with our investments in operational resilience and responsible AI, shows our dedication to earning and preserving trust at every layer. Azure is engineered for trust, powering innovation on a secure, resilient, and transparent foundation that gives customers the confidence to scale AI responsibly, navigate evolving compliance needs, and stay in control of their data and operations.

Learn more with Microsoft

As AI regulations and expectations continue to evolve, Microsoft remains focused on delivering a trusted platform for AI innovation, built with resiliency, security, and transparency at its core. ISO/IEC 42001:2023 certification is a critical step on that path, and Microsoft will continue investing in exceeding global standards and driving responsible innovations to help customers stay ahead—securely, ethically, and at scale.

Explore how we put trust at the core of cloud innovation with our approach to security, privacy, and compliance at the Microsoft Trust Center. View this certification and report, as well as other compliance documents on the Microsoft Service Trust Portal.

Azure AI Foundry Models
Find the right model from exploration to deployment all in one place.

Discover more >

The ISO/IEC 42001:2023 certification for Azure AI Foundry Models and Microsoft Security Copilot was issued by Mastermind, a certification body accredited by the International Accreditation Service (IAS).


Databricks runs best on Azure

Azure Databricks has clear advantages over other cloud service providers

This blog is a supplement to the Azure Databricks: Differentiated Synergy blog post and continues to define the differentiation for Azure Databricks in the cloud data analytics and AI landscape.

Azure Databricks: Powering analytics for the data-driven enterprise

In today’s data-driven world, organizations are seeking analytics platforms that simplify management, offer seamless scalability, and deliver consistent performance. While Databricks is available across major cloud service providers (CSPs), not all implementations are equal. Azure Databricks is a first-party Microsoft offering co-engineered by Microsoft and Databricks, and it stands out for its superior integration, performance, and governance capabilities. It not only delivers strong performance for workloads like decision support systems (DSSs), but it also seamlessly integrates with the Microsoft ecosystem, including solutions such as Azure AI Foundry, Microsoft Power BI, Microsoft Purview, Microsoft Power Platform, Microsoft Copilot Studio, Microsoft Entra ID, Microsoft Fabric, and much more. Choosing Azure Databricks can streamline your entire data lifecycle—from data engineering and Extract Transform Load (ETL) workloads to machine learning (ML), AI, and business intelligence (BI)—within a single, scalable environment.

Maximize the value of your data assets for all analytics and AI use cases

Performance that matters

Principled Technologies (PT), a third-party technology assessment firm, recently analyzed the performance of Azure Databricks and Databricks on Amazon Web Services (AWS). PT stated that Azure Databricks, the Microsoft first-party Databricks service, outperformed Databricks on AWS—it was up to 21.1% faster for single query streams and saved over 9 minutes on four concurrent query streams.

Faster execution for a single query stream demonstrates the better experience a lone user would have. For example, data engineers, scientists, analysts, and other key users could save time when running multiple detailed reports, tasking the system to handle heavy analytical queries without resource competition.

Faster concurrent query performance demonstrates the better experience multiple users would have while running analyses at the same time. For example, your analysts from different departments can save time when running reports or dashboards simultaneously, sharing cluster resources.

With or without autoscale?1, 2

If cost is a top priority, we recommend autoscaling your Azure Databricks cluster. When certain parts of your data pipeline are more computationally intensive, autoscale lets Azure Databricks add compute resources and then remove them when the load subsides, which can reduce your costs compared to static compute sizing. It is also essential to consider the total cost of ownership (TCO) for data and AI platforms, alongside their integration and optimization capabilities and data gravity. An autoscaling cluster is often the most cost-effective option, though it may not be the fastest; if consistent performance is a top priority, consider disabling autoscale.
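The trade-off comes down to one field in the cluster definition. The sketch below shows the autoscale portion of a cluster spec as you might submit it via the Databricks Clusters API; the cluster name, runtime string, and node type are illustrative placeholders, not recommendations.

```python
# Sketch: autoscaling vs. fixed-size Azure Databricks cluster definitions.
# With "autoscale" present, Databricks resizes between the bounds; with
# "num_workers", the size is pinned for consistent performance.

autoscaling_cluster = {
    "cluster_name": "etl-autoscaling",      # hypothetical name
    "spark_version": "15.4.x-scala2.12",    # pick a supported runtime
    "node_type_id": "Standard_D4ds_v5",     # pick per workload
    "autoscale": {"min_workers": 2, "max_workers": 8},
}

# For consistent performance instead, drop "autoscale" and pin the size:
fixed_cluster = {**{k: v for k, v in autoscaling_cluster.items() if k != "autoscale"},
                 "num_workers": 8}
```

Either dictionary can be posted to the Clusters API or embedded in a job definition.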

Key differences: Azure Databricks versus Databricks on other clouds deployed as third party

While all three CSPs offer Databricks, several factors distinguish Azure Databricks:

Underlying infrastructure: Azure Databricks is deeply optimized for Azure Data Lake Storage (ADLS), while AWS uses S3 and Google Cloud uses its own storage solution.

Control plane: Management layers differ, affecting billing, access control, and resource management.

Ecosystem integrations: Azure Databricks natively integrates with Microsoft services like Power BI, Microsoft Fabric, Microsoft Purview, Azure AI Foundry, Power Platform, Copilot Studio, Entra ID, and more.

Pricing: Each CSP has different pricing models, so it’s important to calculate projected costs based on your needs.

Azure-native features: Anchoring data and AI

Azure Databricks delivers a range of Azure-native features that streamline analytics, governance, and security:

Centralized billing and support: Manage everything through the Azure portal, with unified support from Microsoft and Databricks.

Identity and access management: Use Microsoft Entra ID for seamless authentication and Azure role-based access control (RBAC) for fine-grained access control.

Azure DevOps integration: Native support for Git (Azure Repos) and continuous integration and continuous delivery/deployment (CI/CD) (Azure Pipelines) simplifies deployment and collaboration.

Credential passthrough: Enforces user-specific permissions when accessing ADLS.

Azure Key Vault: Securely manage secrets directly within Databricks notebooks.

ML integration: Deep integration with Azure Machine Learning for experiment tracking, model registry, and one-click deployment from Databricks to Azure ML endpoints.

Azure confidential computing: Protect data in use with hardware-based Trusted Execution Environments, preventing unauthorized access—even by cloud operators.

Azure Monitor: After signing on with Microsoft Entra ID, users can access Azure Databricks, Azure Data Lake Storage, and Azure Monitor from a single pane of glass for an efficient, cohesive, and secure analytics ecosystem in Azure.

Cross-cloud governance: One platform, multiple clouds

Azure Databricks now supports cross-cloud data governance, allowing direct access and management of AWS S3 data via Unity Catalog—without the need for data migration or duplication. This unified approach means you can standardize policies, access controls, and auditing across both Azure and AWS, simplifying operations and enhancing security in hybrid and multicloud environments.

Seamless integration with the Microsoft ecosystem

Azure Databricks is the only Databricks offering that is deeply integrated with the Microsoft ecosystem, and some of the latest integrations are as follows:

Mirrored Azure Databricks Catalog in Microsoft Fabric: This feature provides access to Databricks Unity Catalog metadata and tables directly from Microsoft Fabric, enabling unified, governed analytics and eliminating the need for data movement or duplication, especially for serving to Power BI via Direct Lake mode.

Power Platform Connector: Instantly connect Power Apps, Power Automate, and Copilot Studio to Azure Databricks, enabling real-time, governed access to enterprise data and empowering users to build intelligent, data-driven applications without custom configuration or data duplication.

Azure AI Foundry data connection: A native connector that allows organizations to leverage real-time Azure Databricks data for building responsible, governed AI solutions.

What it means to you

Azure Databricks offers exceptional performance, cost efficiency, and deep integration with Microsoft’s trusted cloud ecosystem and solutions. With features like centralized management, advanced security, cross-cloud governance, and performance advantages, organizations can scale their analytics and AI workloads, unlock faster insights, and drive operational efficiency with Azure Databricks.

Get started with Azure Databricks today and experience why it’s the best home for your data and AI workloads.

Check out the full Principled Technologies report for more information on Azure Databricks performance.

Explore how Azure Databricks functions and find additional information about the service via Databricks.com.

Learn more about why Databricks runs best on Azure:

Azure Databricks: Differentiated synergy.

5 Reasons Why Azure Databricks is the Best Data + AI Platform on Azure.

Explore and get started with the Azure Databricks Skilling Plan.

Azure Databricks Essentials—Virtual Workshop.

E-Book: Experimentation and AI with Azure Databricks.

E-Book: Modernize Your Data Estate by Migrating to Azure Databricks.

What’s New with Azure Databricks: Unified Governance, Open Formats, and AI-Native Workloads | Databricks Blog.

Azure Databricks
Enable data, analytics, and AI use cases on an open data lake

Discover more >

1Azure, “Best practices for cost optimization,” June 6, 2025, https://learn.microsoft.com/en-us/azure/databricks/lakehouse-architecture/cost-optimization/best-practices.

2Azure, “Best practices for performance efficiency,” June 6, 2025, https://learn.microsoft.com/en-us/azure/databricks/lakehouse-architecture/performance-efficiency/best-practices.


Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning

State of the art architecture redefines speed for reasoning models

Microsoft is excited to unveil a new addition to the Phi model family: Phi-4-mini-flash-reasoning. Purpose-built for scenarios where compute, memory, and latency are tightly constrained, this new model is engineered to bring advanced reasoning capabilities to edge devices, mobile applications, and other resource-constrained environments. It follows Phi-4-mini but is built on a new hybrid architecture that achieves up to 10 times higher throughput and a 2 to 3 times average reduction in latency, enabling significantly faster inference without sacrificing reasoning performance. Ready to power real-world solutions that demand efficiency and flexibility, Phi-4-mini-flash-reasoning is available on Azure AI Foundry, NVIDIA API Catalog, and Hugging Face today.

Azure AI Foundry
Create without boundaries—Azure AI Foundry has everything you need to design, customize, and manage AI applications and agents

Explore solutions

Efficiency without compromise 

Phi-4-mini-flash-reasoning balances math reasoning ability with efficiency, making it potentially suitable for educational applications, real-time logic-based applications, and more. 

Similar to its predecessor, Phi-4-mini-flash-reasoning is a 3.8 billion parameter open model optimized for advanced math reasoning. It supports a 64K token context length and is fine-tuned on high-quality synthetic data to deliver reliable performance on logic-intensive tasks.

What’s new?

At the core of Phi-4-mini-flash-reasoning is the newly introduced decoder-hybrid-decoder architecture, SambaY, whose central innovation is the Gated Memory Unit (GMU), a simple yet effective mechanism for sharing representations between layers.  The architecture includes a self-decoder that combines Mamba (a State Space Model) and Sliding Window Attention (SWA), along with a single layer of full attention. The architecture also involves a cross-decoder that interleaves expensive cross-attention layers with the new, efficient GMUs. This new architecture with GMU modules drastically improves decoding efficiency, boosts long-context retrieval performance and enables the architecture to deliver exceptional performance across a wide range of tasks. 

Key benefits of the SambaY architecture include: 

Enhanced decoding efficiency.

Preserves linear prefilling time complexity.

Increased scalability and enhanced long context performance.

Up to 10 times higher throughput.

Our decoder-hybrid-decoder architecture taking Samba [RLL+25] as the self-decoder. Gated Memory Units (GMUs) are interleaved with the cross-attention layers in the cross-decoder to reduce the decoding computation complexity. As in YOCO [SDZ+24], the full attention layer only computes the KV cache during the prefilling with the self-decoder, leading to linear computation complexity for the prefill stage.
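The gating idea behind the GMU can be pictured with a few lines of linear algebra: the current layer's hidden state gates a memory representation shared from an earlier self-decoder layer via an element-wise product, replacing a full cross-attention computation. The sketch below is a minimal illustration under that reading of the description above, with random projections; it is not the released SambaY implementation.

```python
import numpy as np

# Minimal, illustrative Gated Memory Unit (GMU) sketch: the current hidden
# state gates a shared memory from an earlier layer element-wise, which is
# far cheaper than recomputing cross-attention at every cross-decoder layer.

rng = np.random.default_rng(0)
d = 8                                   # hidden size (illustrative)
W_in = rng.standard_normal((d, d))      # input projection (random here)
W_out = rng.standard_normal((d, d))     # output projection (random here)

def gmu(hidden, memory):
    """Gate `memory` (from the self-decoder) with the current `hidden` state."""
    gate = hidden @ W_in                # project current representation
    return (gate * memory) @ W_out      # element-wise gating, then mix

hidden = rng.standard_normal((4, d))    # 4 tokens
memory = rng.standard_normal((4, d))    # shared representation, same shape
out = gmu(hidden, memory)
print(out.shape)  # (4, 8)
```

Because the gating is element-wise, its cost scales linearly with sequence length, which is consistent with the decoding-efficiency claims above.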

Phi-4-mini-flash-reasoning benchmarks 

Like all models in the Phi family, Phi-4-mini-flash-reasoning is deployable on a single GPU, making it accessible for a broad range of use cases. However, what sets it apart is its architectural advantage. This new model achieves significantly lower latency and higher throughput compared to Phi-4-mini-reasoning, particularly in long-context generation and latency-sensitive reasoning tasks. 

This makes Phi-4-mini-flash-reasoning a compelling option for developers and enterprises looking to deploy intelligent systems that require fast, scalable, and efficient reasoning—whether on premises or on-device. 

The top plot shows inference latency as a function of generation length, while the bottom plot illustrates how inference latency varies with throughput. Both experiments were conducted using the vLLM inference framework on a single A100-80GB GPU with tensor parallelism (TP) set to 1.

A more rigorous evaluation was used, in which Pass@1 accuracy is averaged over 64 samples for AIME24/25 and 8 samples for Math500 and GPQA Diamond. In this graph, Phi-4-mini-flash-reasoning outperforms Phi-4-mini-reasoning and is better than models twice its size.
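Averaged Pass@1, as used here, simply scores each of the n sampled completions as correct or incorrect and takes the mean, which reduces the variance of judging a model on a single sample. A minimal sketch, with an invented results list:

```python
# Sketch: averaged pass@1 over n independent samples for one problem.
# The results list is illustrative (1 = correct, 0 = incorrect).

def avg_pass_at_1(results):
    """Mean correctness over n independent samples."""
    return sum(results) / len(results)

# e.g. 64 samples for one AIME problem, 48 of them correct:
samples = [1] * 48 + [0] * 16
print(avg_pass_at_1(samples))  # 0.75
```

The benchmark score is then the mean of these per-problem averages across the dataset.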

What are the potential use cases? 

Thanks to its reduced latency, improved throughput, and focus on math reasoning, the model is ideal for: 

Adaptive learning platforms, where real-time feedback loops are essential.

On-device reasoning assistants, such as mobile study aids or edge-based logic agents.

Interactive tutoring systems that dynamically adjust content difficulty based on a learner’s performance.

Its strength in math and structured reasoning makes it especially valuable for education technology, lightweight simulations, and automated assessment tools that require reliable logic inference with fast response times. 

Developers are encouraged to connect with peers and Microsoft engineers through the Microsoft Developer Discord community to ask questions, share feedback, and explore real-world use cases together. 

Microsoft’s commitment to trustworthy AI 

Organizations across industries are leveraging Azure AI and Microsoft 365 Copilot capabilities to drive growth, increase productivity, and create value-added experiences. 

We’re committed to helping organizations use and build AI that is trustworthy, meaning it is secure, private, and safe. We bring best practices and learnings from decades of researching and building AI products at scale to provide industry-leading commitments and capabilities that span our three pillars of security, privacy, and safety. Trustworthy AI is only possible when you combine our commitments, such as our Secure Future Initiative and our responsible AI principles, with our product capabilities to unlock AI transformation with confidence.  

Phi models are developed in accordance with Microsoft AI principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness.  

The Phi model family, including Phi-4-mini-flash-reasoning, employs a robust safety post-training strategy that integrates Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF). These techniques are applied using a combination of open-source and proprietary datasets, with a strong emphasis on ensuring helpfulness, minimizing harmful outputs, and addressing a broad range of safety categories. Developers are encouraged to apply responsible AI best practices tailored to their specific use cases and cultural contexts. 

Read the model card to learn more about any risk and mitigation strategies.  

Learn more about the new model 

Try out the new model on Azure AI Foundry.

Find code samples and more in the Phi Cookbook.

Read the Phi-4-mini-flash-reasoning technical paper on Arxiv.

If you have questions, sign up for the Microsoft Developer “Ask Me Anything”. 

Create with Azure AI Foundry

Get started with Azure AI Foundry, and jump directly into Visual Studio Code.

Download the Azure AI Foundry SDK.

Take the Azure AI Foundry learn courses.

Review the Azure AI Foundry documentation.

Keep the conversation going in GitHub and Discord.


Introducing Azure Accelerate: Fueling transformation with experts and investments across your cloud and AI journey

As technology continues to reshape every industry, organizations are transforming with cloud, data, and AI—ultimately resulting in smarter operations, better outcomes, and more impactful customer experiences.

From global enterprises to fast-scaling startups, customers are putting Azure to work in powerful ways. To date, more than 26,000 customer projects have leveraged our offerings—Azure Migrate and Modernize and Azure Innovate—to drive faster time to value.

Introducing Azure Accelerate

As customer cloud and AI adoption needs change, we’re evolving our offerings to match your business priorities. Today, we’re excited to introduce Azure Accelerate, a new simplified offering designed to fuel transformation with experts and investments across the cloud and AI journey. Azure Accelerate brings together the full capabilities of Azure Migrate and Modernize, Azure Innovate, and Cloud Accelerate Factory in one place to assist customers from initial planning to full implementation. With Azure Accelerate, customers can:

Access trusted experts: The deep expertise of Azure’s specialized partner ecosystem ensures your projects launch smoothly and scale efficiently. You can also choose to augment your project by taking advantage of a new benefit within Azure Accelerate—our Cloud Accelerate Factory. Cloud Accelerate Factory provides Microsoft experts at no additional cost to deliver hands-on deployment assistance and get projects up and running on Azure faster.

Unlock Microsoft investments: Tap into funding designed to maximize value while minimizing risk. Azure Accelerate helps reduce the costs of engagements with Microsoft investment via partner funding and Azure credits. Microsoft also invests in your long-term success by supporting the skilling of your internal teams. Empower them with free resources available on Microsoft Learn, or develop training programs tailored to your needs with a qualified partner. Azure Accelerate supports projects of all sizes, from the migration of a few servers or virtual machines to the largest initiatives, with no cap on investments.

Succeed with comprehensive coverage: Navigate every stage of your cloud and AI journey with confidence through the robust, end-to-end support of Azure Accelerate. Start your journey with an in-depth assessment using AI-enhanced tools like Azure Migrate to gain critical insights. Design and validate new ideas in sandbox environments and then test solutions through funded pilots or proof-of-value projects before scaling. When you’re ready, start your Azure implementation by having experts build an Azure landing zone. Then, move workloads into Azure at scale following best practices for migrating or building new solutions.

The Cloud Accelerate Factory is a new benefit within Azure Accelerate and is designed to help you jumpstart your Azure projects with zero-cost deployment assistance from Microsoft experts. Through a joint delivery model with Azure partners, these experts can provide hands-on deployment of over 30 Azure services using proven strategies developed across thousands of customer engagements. This benefit empowers customers to maximize their investments by offloading predictable technical tasks to our Factory team, enabling internal teams or partners to focus on the more custom and highly detailed elements of a project.

For organizations that seek guidance and technical best practices, Azure Accelerate is backed by Azure Essentials, which brings together curated, solution-aligned guidance from proven methodologies and tools such as the Microsoft Cloud Adoption Framework, Azure Well-Architected Framework, reference architectures, and more in a single location.

Realizing business value with Azure offerings

Here are just a few examples of how organizations are turning ambition into action:

Modernizing for agility: Global financial leader UBS is using Azure to modernize its infrastructure, enhancing agility and resilience while laying the foundation for future innovation. This modernization has enabled UBS to respond more quickly to market and regulatory changes, while reducing operational complexity.

Unifying data for impact: Humanitarian nonprofit Médecins Sans Frontières UK centralized its data platform using Azure SQL, Dynamics 365, and Power BI. This has resulted in streamlined reporting, faster emergency response, and improved donor engagement—all powered by timely, self-service insights.

Scaling AI for global reach: Bayer Crop Science, in partnership with EY and Microsoft, built a generative AI assistant using Azure OpenAI and Azure AI Search. This natural language tool delivers real-time agronomy insights to farmers worldwide, helping unlock food productivity and accessibility at scale.

Enhancing insights with AI: OneDigital partnered with Microsoft and Data Science Dojo through Azure Innovate to co-develop custom AI agents using Azure OpenAI and Ejento AI. This solution streamlined research, saving 1,000 person-hours annually, and enabled consultants to deliver faster, more personalized client insights, improving retention through high-impact interactions.

Get started with Azure Accelerate

Azure Accelerate is designed to fuel your cloud and AI transformation. It’s how you move faster, innovate smarter, and lead in a cloud-first, AI-powered world.

We’re excited to partner with you on this journey and can’t wait to see what you’ll build next with Azure. To get started, visit Azure Accelerate to learn more or connect with your Microsoft account team or a specialized Azure partner to plan your next steps.
The post Introducing Azure Accelerate: Fueling transformation with experts and investments across your cloud and AI journey appeared first on Microsoft Azure Blog.

Introducing Deep Research in Azure AI Foundry Agent Service

Unlock enterprise-scale web research automation

Today we’re excited to announce the public preview of Deep Research in Azure AI Foundry—an API and software development kit (SDK)-based offering of OpenAI’s advanced agentic research capability, fully integrated with Azure’s enterprise-grade agentic platform.

With Deep Research, developers can build agents that deeply plan, analyze, and synthesize information from across the web—automate complex research tasks, generate transparent, auditable outputs, and seamlessly compose multi-step workflows with other tools and agents in Azure AI Foundry.


AI agents and knowledge work: Meeting the next frontier of research automation

Generative AI and large language models have made research and analysis faster than ever, powering solutions like ChatGPT Deep Research and Researcher in Microsoft 365 Copilot for individuals and teams. These tools are transforming everyday productivity and document workflows for millions of users.

As organizations look to take the next step—integrating deep research directly into their business apps, automating multi-step processes, and governing knowledge at enterprise scale—the need for programmable, composable, and auditable research automation becomes clear.

This is where Azure AI Foundry and Deep Research come in: offering the flexibility to embed, extend, and orchestrate world-class research as a service across your entire enterprise ecosystem—and connect it with your data and your systems.

Deep Research capabilities in Azure AI Foundry Agent Service

Deep Research in Foundry Agent Service is built for developers who want to move beyond the chat window. By offering Deep Research as a composable agent tool via API and SDK, Azure AI Foundry enables customers to:

Automate web-scale research using a best-in-class research model grounded with Bing Search, with every insight traceable and source-backed.

Programmatically build agents that can be invoked by apps, workflows, or other agents—turning deep research into a reusable, production-ready service.

Orchestrate complex workflows: Compose Deep Research agents with Logic Apps, Azure Functions, and other Foundry Agent Service connectors to automate reporting, notifications, and more.

Ensure enterprise governance: With Azure AI Foundry’s security, compliance, and observability, customers get full control and transparency over how research is run and used.

Unlike packaged chat assistants, Deep Research in Foundry Agent Service can evolve with your needs—ready for automation, extensibility, and integration with future internal data sources as we expand support.

How it works: Architecture and agent flow

Deep Research in Foundry Agent Service is architected for flexibility, transparency, and composability—so you can automate research that’s as robust as your business demands.

At its core, the Deep Research model, o3-deep-research, orchestrates a multi-step research pipeline that’s tightly integrated with Grounding with Bing Search and leverages the latest OpenAI models:

Clarifying intent and scoping the task: When a user or downstream app submits a research query, the agent uses GPT-series models including GPT-4o and GPT-4.1 to clarify the question, gather additional context if needed, and precisely scope the research task. This ensures the agent’s output is both relevant and actionable, and that every search is optimized for your business scenario.

Web grounding with Bing Search: Once the task is scoped, the agent securely invokes the Grounding with Bing Search tool to gather a curated set of high-quality, recent web data. This ensures the research model is working from a foundation of authoritative, up-to-date sources—no hallucinations from stale or irrelevant content.

Deep Research task execution: The o3-deep-research model then executes the research task. This involves thinking, analyzing, and synthesizing information across all discovered sources. Unlike simple summarization, it reasons step-by-step, pivots as it encounters new insights, and composes a comprehensive answer that’s sensitive to nuance, ambiguity, and emerging patterns in the data.

Transparency, safety, and compliance: The final output is a structured report that documents not only the answer, but also the model’s reasoning path, source citations, and any clarifications requested during the session. This makes every answer fully auditable—a must-have for regulated industries and high-stakes use cases.

Programmatic integration and composition: By exposing Deep Research as an API, Azure AI Foundry empowers you to invoke research from anywhere—custom business apps, internal portals, workflow automation tools, or as part of a larger agent ecosystem. For example, you can trigger a research agent as part of a multi-agent chain: one agent performs deep web analysis, another generates a slide deck with Azure Functions, while a third emails the result to decision makers with Azure Logic Apps. This composability is the real game-changer: research is no longer a manual, one-off task, but a building block for digital transformation and continuous intelligence.

This flexible architecture means Deep Research can be seamlessly embedded into a wide range of enterprise workflows and applications. Already, organizations across industries are evaluating how these programmable research agents can streamline high-value scenarios—from market analysis and competitive intelligence, to large-scale analytics and regulatory reporting.
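The first four stages of the agent flow can be sketched as a plain-Python skeleton. This is an illustrative, local-only mock, not the Azure AI Foundry SDK: every function name and the tiny in-memory "web" are invented stand-ins for the real GPT-4o/GPT-4.1 clarification step, Grounding with Bing Search, and the o3-deep-research model.

```python
# Hypothetical, local-only sketch of the Deep Research agent stages.
# Nothing here calls Azure; each function stands in for a hosted component.

def clarify_and_scope(query: str) -> dict:
    """Stage 1 stand-in: turn a raw query into a precisely scoped task."""
    return {"task": query.strip().rstrip("?").lower(), "clarifications": []}

def ground_with_search(task: dict) -> list:
    """Stage 2 stand-in: Grounding with Bing Search returning source URLs."""
    corpus = {  # invented in-memory "web"
        "market size for geothermal energy": [
            "https://example.com/geothermal-report",
            "https://example.com/energy-outlook",
        ],
    }
    return corpus.get(task["task"], [])

def research(task: dict, sources: list) -> str:
    """Stage 3 stand-in: o3-deep-research synthesis across sources."""
    return f"Findings on '{task['task']}' synthesized from {len(sources)} sources."

def build_report(task: dict, sources: list, findings: str) -> dict:
    """Stage 4: structured, auditable output with citations."""
    return {"answer": findings, "citations": sources,
            "reasoning_path": ["clarify", "ground", "research"]}

task = clarify_and_scope("Market size for geothermal energy?")
sources = ground_with_search(task)
report = build_report(task, sources, research(task, sources))
print(report["answer"])
```

In the real service, these stages run inside Foundry Agent Service and are invoked through its API or SDK rather than as local functions.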

Pricing for Deep Research (model: o3-deep-research) is as follows: 

Input: $10.00 per 1M tokens.

Cached Input: $2.50 per 1M tokens.

Output: $40.00 per 1M tokens.

Search context tokens are charged at input token prices for the model being used. You’ll separately incur charges for Grounding with Bing Search and the base GPT model used for clarifying questions.
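At these rates, per-call cost arithmetic is straightforward. The token counts below are invented for illustration, and the estimate deliberately excludes the separate Grounding with Bing Search and base GPT model charges noted above:

```python
# o3-deep-research model prices quoted above, in USD per 1M tokens.
PRICES = {"input": 10.00, "cached_input": 2.50, "output": 40.00}

def deep_research_model_cost(input_toks: int, cached_toks: int, output_toks: int) -> float:
    """Model-only cost estimate; grounding and clarification charges are separate."""
    usd = (input_toks * PRICES["input"]
           + cached_toks * PRICES["cached_input"]
           + output_toks * PRICES["output"]) / 1_000_000
    return round(usd, 4)

# Example: 100k fresh input, 50k cached input, 20k output tokens.
print(deep_research_model_cost(100_000, 50_000, 20_000))  # → 1.925
```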

Get started with Deep Research

Deep Research is available now in limited public preview for Azure AI Foundry Agent Service customers. To get started:

Sign up for the limited public preview to gain early access.

Visit our documentation to learn more about the feature.

Visit our learning modules to build your first agent with Azure AI Foundry Agent Service.

Start building your agents today in Azure AI Foundry.

We can’t wait to see the innovative solutions you’ll build. Stay tuned for customer stories, new features, and future enhancements that will continue to unlock the next generation of enterprise AI agents.


The post Introducing Deep Research in Azure AI Foundry Agent Service appeared first on Microsoft Azure Blog.

Running high-performance PostgreSQL on Azure Kubernetes Service

In the ever-evolving world of cloud-native technologies, PostgreSQL continues to solidify its position as a top-tier database choice among workloads running on Kubernetes. According to the Kubernetes in the Wild 2025 report, PostgreSQL now powers 36% of all database workloads running on Kubernetes—up 6 points since 2022—signaling its rising popularity and growing trust among the Kubernetes community.¹ However, running data-intensive PostgreSQL workloads on Kubernetes brings its own set of challenges, from managing Kubernetes primitives like StatefulSets and Deployments to tuning storage, replication, and database settings for optimal performance, but this is fast evolving into a simplified experience.

We now provide two options for deploying stateful PostgreSQL workloads based on performance needs. To support databases with stringent latency and scalable transaction requirements, you can leverage Azure Container Storage to orchestrate Kubernetes volume deployment on local NVMe to scale up IOPS while maintaining extremely low, sub-millisecond latency. For scenarios where optimized price-performance is a priority, Premium SSD v2 is the go-to choice. Additionally, working with CloudNativePG, we integrated a robust open-source operator for PostgreSQL to support a high availability database deployment model on Azure Kubernetes Service (AKS). Our advanced storage options combined with CloudNativePG make AKS a robust platform for high-performance PostgreSQL workloads.

Deploy PostgreSQL on AKS

Breakthrough PostgreSQL performance with local NVMe

For performance-critical PostgreSQL workloads, such as those handling massive concurrent transactions or demanding, low-latency data access, local NVMe directly attached to Azure Virtual Machine (VM) SKUs is your best bet. Using local NVMe drives with Kubernetes used to be complicated—it often required setting up RAID across the drives and manually managing static volume orchestrators. Azure Container Storage effectively addresses this challenge.

Azure Container Storage is a fully managed, container-native storage solution, designed specifically for Kubernetes. Developers can simply request a Kubernetes volume, and Azure will dynamically provision storage backed by the available local NVMe drives on AKS nodes. This gives PostgreSQL users direct attach block storage IOPS and latency within a managed, orchestrated cloud environment. Whether you’re powering payment systems, gaming backends, or real-time personalization engines, you get the best of both speed and simplicity. Azure Container Storage also supports Azure Disk and Elastic SAN (Preview), so you can choose backing storage with different durability, scale, or cost as your needs evolve—all under a consistent, Kubernetes-native control plane.

Our benchmark results have shown PostgreSQL achieving close to 15,000 transactions per second (TPS) with single-digit millisecond end-to-end query latency with the Standard_L16s_v3 VM. When scaling up to larger VM SKUs like Standard_L64s_v3, we observed TPS reaching up to 26,000 while maintaining low latency. For more details of our benchmark runs, refer to the comparison of storage options section below.

Optimize price-performance with Premium SSD v2

Azure Premium SSD v2 offers an optimal balance of price-performance and a flexible deployment model, making it especially well-suited for production environments that need to scale over time. With Premium SSD v2, you can configure IOPS, throughput, and size independently—enabling PostgreSQL deployments to scale dynamically with demand while minimizing upfront costs and avoiding resource overprovisioning.

Whether you’re running multi-tenant SaaS platforms, production systems that scale with business needs, or applications with spiky traffic, this flexibility leads to real savings without sacrificing performance. With up to 80,000 IOPS and 1,200 MB/s per volume, Premium SSD v2 supports highly demanding PostgreSQL workloads on an infrastructure that adapts to your app.
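The independent dials make capacity planning easy to express in code. The helper below is a hypothetical sketch using only the figures quoted in this post (the 3,000 IOPS/125 MB/s baseline from the benchmark setup and the 80,000 IOPS/1,200 MB/s per-volume ceilings); real Premium SSD v2 limits also scale with disk size, which this sketch ignores:

```python
# Hypothetical validator for a requested Premium SSD v2 performance config,
# using only the per-volume figures quoted in this post.
BASE_IOPS, BASE_MBPS = 3_000, 125      # baseline used in the benchmark below
MAX_IOPS, MAX_MBPS = 80_000, 1_200     # per-volume ceilings quoted above

def fits_premium_ssd_v2(iops: int, mbps: int) -> bool:
    """True if the request falls within the quoted per-volume range."""
    return BASE_IOPS <= iops <= MAX_IOPS and BASE_MBPS <= mbps <= MAX_MBPS

print(fits_premium_ssd_v2(3_000, 125))    # benchmark baseline → True
print(fits_premium_ssd_v2(100_000, 125))  # exceeds the 80,000 IOPS ceiling → False
```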

Comparison of storage options

To help you assess the two storage options outlined above, we conducted benchmark runs using the CloudNativePG operator setups on AKS with similar core and memory consumption, with both backing storage options as the only variable: one leveraging local NVMe with Azure Container Storage, and the other using Premium SSD v2 with Disk CSI driver.

For the first configuration, we used the Standard_D16ds_v5 SKU and provisioned two Premium SSD v2 32 GiB disks each having 3000 IOPS and 125 MB/s throughput for log and data files. In the second setup, we ran on Standard_L16s_v3 nodes with local NVMe storage included. The test environment was configured to closely simulate a real-world production database scenario. TPS measures how many individual transactions (such as INSERT, UPDATE, DELETE, or SELECT) a system can handle per second. Latency refers to the time delay between issuing a request to the database and receiving a response, which is especially critical for applications requiring real-time or near-real-time responsiveness, such as financial systems, online gaming, or high-performance analytics.

Local NVMe on Standard_L16s_v3 delivered 14,812 TPS with an average latency of 4.321 milliseconds. PremiumV2_LRS on Standard_D16ds_v5 recorded 8,600 TPS at 7.417 milliseconds latency. See pricing comparison below:

*Monthly costs are based on the base 3000 IOPS and 125 MB/s throughput. You can adjust the performance (capacity, throughput, and IOPS) of Premium SSD v2 disks at any time, allowing workloads to be cost efficient while meeting workload size and performance requirements.

**With 3 VMs of L16s_v3, you get 11.52 TB of storage allocated by default that is used to serve the volumes created for PostgreSQL workload. For other VM sizes in the L-Series family, the price per month and allocated storage will vary.
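As a quick sanity check on the TPS and latency figures above: by Little's Law, the average number of in-flight clients equals throughput multiplied by average latency. Both runs work out to roughly 64 concurrent clients, which suggests the two setups were driven at the same benchmark concurrency (this back-calculation is ours, not stated in the benchmark):

```python
# Little's Law: in-flight requests ≈ throughput (TPS) × average latency (s).
def concurrency(tps: float, latency_ms: float) -> float:
    return tps * (latency_ms / 1000.0)

print(round(concurrency(14_812, 4.321)))  # local NVMe run → 64
print(round(concurrency(8_600, 7.417)))   # Premium SSD v2 run → 64
```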

For PostgreSQL workloads, the choice between using local NVMe and Premium SSD v2 depends on balancing performance, cost, and data durability. Local NVMe via Azure Container Storage offers extremely low latency and high throughput, making it suitable for performance-sensitive PostgreSQL deployments. The costs are higher with local NVMe, and there is less flexibility to scale independently of workload characteristics. Conversely, Premium SSD v2 provides better price-performance efficiency and flexible scalability, making it a viable option for PostgreSQL instances that must handle increased scale or unpredictable surges in demand. In terms of data durability, Premium SSD v2 offers local redundancy by default, while for local NVMe, it is recommended to use a replica-based architecture managed by the CloudNativePG operator and an object storage-based backup approach to prevent data loss.

Built for high availability with CloudNativePG on Azure Kubernetes Service

For teams deploying PostgreSQL in production, high availability and backups are non-negotiable. With the open-source CloudNativePG operator, a highly available PostgreSQL cluster on AKS can easily be deployed with:

Built-in replication and automated failover.

Application consistent backup with native integration with Azure Blob Storage.

Seamless integration with Azure Container Storage.

Flexible storage options: choose Premium SSD v2 or local NVMe based on workload needs.

Whether you’re supporting internal business apps or customer-facing platforms, this gives you peace of mind without the hassle of hand-building custom high availability logic and separate backup workflows. Get started with deploying highly available PostgreSQL on AKS with CloudNativePG operator using our step-by-step reference guide.
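To give a flavor of what such a deployment declares, the core of a CloudNativePG Cluster resource can be sketched as below. This is a minimal illustration built as JSON (which kubectl accepts alongside YAML); the cluster and storage class names are placeholders, and a production cluster would also declare backup and monitoring sections:

```python
import json

# Minimal sketch of a CloudNativePG Cluster resource. Names are placeholders;
# real clusters add backup (e.g., object storage) and monitoring configuration.
cluster = {
    "apiVersion": "postgresql.cnpg.io/v1",
    "kind": "Cluster",
    "metadata": {"name": "pg-ha"},
    "spec": {
        "instances": 3,  # one primary plus two replicas with automated failover
        "storage": {
            "size": "32Gi",
            "storageClass": "premium2-sc",  # placeholder storage class name
        },
    },
}
print(json.dumps(cluster, indent=2))
```

Saving this output to a file and running `kubectl apply -f` against it (with the operator installed and a real storage class) is the shape of the deployment the reference guide walks through.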

Ready for the future

PostgreSQL is just one of many stateful workloads that organizations are now confidently running on Azure Kubernetes Service. From databases to message queues, AI inferencing, and enterprise applications, AKS is evolving to meet the needs of persistent, data-heavy applications in production.

Whether you’re deploying Redis, MongoDB, Kafka, or even ML-serving pipelines with GPU-backed nodes, AKS provides the foundation to manage these workloads with performance, consistency, and operational ease, along with clear end-to-end guidance.

With innovations like Azure Container Storage for local NVMe and Premium SSD v2 for scalable persistent storage, we’re making it easier than ever to build stateful applications that are reliable, performant, and cost-efficient for mission-critical workloads.

Modernize your data layer on Kubernetes today. Whether you’re running PostgreSQL or any stateful tier, Azure delivers the performance and manageability to make it happen. Explore proven patterns and deployment options in the AKS Stateful Workloads Overview.

¹Kubernetes in the Wild 2025 report
The post Running high-performance PostgreSQL on Azure Kubernetes Service appeared first on Microsoft Azure Blog.

Building secure, scalable AI in the cloud with Microsoft Azure

Generative AI is a transformative force, redefining how modern enterprises operate. It has quickly become central to how businesses drive productivity, innovate, and deliver impact. The pressure is on: organizations must move fast to not only adopt AI, but to unlock real value at scale or risk falling behind.  

Achieving enterprise-wide deployment of AI securely and efficiently is no easy feat. Generative AI is like rocket fuel. It can propel businesses to new heights, but only with the right infrastructure and controls in place. To accelerate safely and strategically, enterprises are turning to Microsoft Azure as mission control. Tapping into Azure’s powerful cloud infrastructure and advanced security solutions allows teams to effectively build, deploy, amplify, and see real results from generative AI. 

To understand how businesses are preparing for AI, we commissioned Forrester Consulting to survey Azure customers. The resulting 2024 Forrester Total Economic Impact™ study uncovers the steps businesses take to become AI-ready, the challenges of adopting generative AI in the cloud, and how Azure’s scalable infrastructure and built-in security helps businesses deploy AI with confidence.

Read the full study to learn how organizations are leveraging Azure for AI-readiness and to run generative AI securely in the cloud.

Challenges with scaling generative AI on-premises 

Scaling generative AI is like designing transportation systems for a rapidly growing city. Just as urban expansion demands modern transportation infrastructure to function efficiently, AI leaders understand that implementing AI in a meaningful way requires a cloud foundation that is powerful, flexible, and built to handle future demand. AI leaders recognize that the power and agility of the cloud is needed to achieve their desired outcomes.  

In fact, 72% of surveyed respondents whose organizations migrated to Azure for AI-readiness reported that the migration was necessary or that it reduced the barriers to enabling AI.

65% of business leaders agreed that deploying generative AI in the cloud would meet their organizational objectives by avoiding the restrictions and limitations of on-premises deployments.

Businesses that run most or all of their generative AI workloads on-premises face significant roadblocks. On-premises systems, often lacking the agility offered by the cloud, resemble outdated roadways—prone to congestion, difficult to maintain, expensive to expand, and ill-equipped for today’s demands. Businesses attempting to scale AI in these environments encounter complicated obstacles—including infrastructure limitations, a shortage of specialized talent, and integration challenges that slow innovation—that are frustrating to overcome. Challenges like limited network bandwidth and fragmented data environments further complicate adoption.

Deploying generative AI safely is crucial to protecting sensitive data, maintaining compliance, and mitigating risk. Surveyed decision-makers identified four key areas of concern:

Data privacy risks, especially with the proliferation of AI-generated content.

Lack of expertise regarding generative AI security best practices.

Compliance complexities with evolving regulations around AI use and data protection.

Shadow IT risks, as users turn to unauthorized tools and apps, exposing organizations to vulnerabilities.

To overcome these challenges, it’s important to partner with a cloud platform that provides built-in security and regulatory compliance. Cloud migration provides the scalable infrastructure, integrated applications, and AI-ready data foundation necessary for generative AI success. Survey respondents who have already transitioned many or all AI workloads to Azure report enhanced global reach, scalability, and flexibility, all major advantages in today’s rapidly evolving AI landscape. 

Why enterprises choose Azure for AI-readiness

Infrastructure limitations are a barrier to scaling generative AI. On-premises environments often hinder performance, increase costs, and slow innovation. According to our survey, 75% of organizations migrating to Azure for AI-readiness reported that the migration was necessary or it significantly reduced barriers to generative AI adoption. 

While the benefits of deploying generative AI in the cloud are clear, teams still face hurdles in adopting AI responsibly. Vulnerabilities, limited expertise in AI security, and data privacy risks are the most prominent concerns. Azure addresses these concerns with comprehensive frameworks that safeguard generative AI workloads end-to-end, from development to runtime. 

Surveyed leaders cited Azure’s colocation strategy as a top reason for partnering with Azure for deploying generative AI, eliminating data silos and optimizing performance. Microsoft Defender for Cloud and Microsoft Sentinel enhance protection and make Azure a trusted platform for safe, enterprise-grade generative AI deployment. 

4 key differentiators for deploying generative AI with Azure

1. Enterprise-grade security and compliant solutions

Security concerns are a primary challenge when deploying generative AI in the cloud. Azure protects AI workloads from code to cloud. Azure’s multi-layered approach helps modern organizations meet compliance standards and minimizes risks across the entire AI lifecycle. Key solutions including Defender for Cloud, Microsoft Sentinel, Microsoft Azure Key Vault, and infrastructure as a service (IaaS) provide end-to-end protection for generative AI workloads, ensuring data privacy, development lifecycle protection, and threat management. Backed by Microsoft’s enterprise-grade security, compliance, and responsible AI commitments, Azure empowers teams to build AI solutions that are not only powerful but also ethical, transparent, and compliant. 

2. Scalable cloud infrastructure

Azure’s cloud infrastructure allows businesses to avoid the constraints of legacy environments, enabling them to launch AI projects efficiently and securely. Azure brings a suite of advanced AI and machine learning tools to the table that are mission critical for generative AI success, enabling organizations to break free from siloed data, outdated security frameworks, and infrastructure bottlenecks. By deploying generative AI in the cloud, businesses can accelerate innovation, streamline operations, and build AI-powered solutions with confidence. 

3. Unified data and AI management

Effective AI starts with a solid data foundation. Azure’s data integration and management solutions—Microsoft Fabric, Azure Synapse Analytics, and Azure Databricks—enable organizations to centralize data, improve governance, and optimize AI model performance. By moving beyond the limitations of legacy on-premises environments, businesses gain seamless data access, better compliance, and the scalability needed to drive AI innovation for enterprise. With Azure, organizations can harness high-quality, well-governed data to power more accurate and reliable AI outcomes. 

4. Faster innovation

By adopting Azure, resources can be redirected from infrastructure maintenance to AI-powered innovation. Azure’s flexible, secure cloud environment enables businesses to experiment, adapt, and evangelize AI solutions with less risk than traditional on-premises deployments. Surveyed organizations using Azure reported more than twice the confidence in their ability to build and refine AI and machine learning applications compared to those relying on on-premises infrastructure. Key benefits include greater flexibility, reduced risk when modifying AI solutions, and the ability to reinvest infrastructure resources into AI upskilling and innovation. 

The business impact of secure generative AI on Azure 

Migrating to Azure for AI deployment enhances performance and operational efficiency. Benefits include: 

Optimized resource allocation: Migrating to the cloud frees IT teams from infrastructure management, allowing them to focus on strategic initiatives—such as developing generative AI use cases—that drive meaningful business impact.

Accelerated time to value: Azure AI services empower data scientists, AI and machine learning engineers, and developers, helping them to deliver high-quality models faster.

Enhanced security and compliance: Azure’s integrated security tools protect workloads, reduce breach risks, and meet evolving compliance standards.

Higher AI application performance: Deploying generative AI with Azure improves application performance—driving innovation and growth. 

Innovation without compromise 

As IT professionals and digital transformation leaders navigate the complexities of AI adoption, Azure stands out as a trusted partner for enterprise AI-readiness. With advanced infrastructure, safe and responsible AI practices, and built-in security, Azure offers a secure and scalable foundation for building and running generative AI in the cloud. With Azure, organizations can unlock the full potential of generative AI to drive innovation, accelerate growth, and deliver lasting business value.


The post Building secure, scalable AI in the cloud with Microsoft Azure appeared first on Microsoft Azure Blog.

Microsoft Planetary Computer Pro: Unlocking AI-powered geospatial insights for enterprises across industries

A proliferation of satellite constellations and connectivity to hyperscale clouds has made geospatial data available for a wide variety of sectors and use cases: from coordinating supply chains, to managing climate risk, and planning urban infrastructure, just to name a few. Yet despite its growing importance, geospatial data remains notoriously complex and siloed across a variety of sources, including satellites, drones, and other sensors—often accessible only to experts.  

To help solve this challenge, Microsoft has invested in simplifying the complex geospatial landscape—and we are excited to introduce the Public Preview of Microsoft Planetary Computer Pro, a comprehensive platform that makes it dramatically easier for organizations to harness geospatial data for real-world impact. Microsoft Planetary Computer Pro is a next-generation platform designed to bring geospatial insights into the mainstream analytics workflow. It empowers organizations to ingest, catalog, store, process, and disseminate large volumes of private geospatial data in Microsoft Azure, using familiar tools and AI-driven insights. The result? Easier access, optimized datasets, unified security, identity, and governance, and faster time to insight.   
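Cataloged geospatial assets of this kind are typically described with STAC (SpatioTemporal Asset Catalog) metadata, which Planetary Computer builds on. As an illustration of the kind of query such a catalog serves, here is a hypothetical STAC item-search payload; the collection name, bounding box, and any endpoint it would be posted to are invented for the example:

```python
import json

# Hypothetical STAC item-search body: find items from a private imagery
# collection over Seattle for the first half of 2025. Field names follow
# the STAC API search convention; the specific values are made up.
search = {
    "collections": ["company-drone-imagery"],   # invented private collection
    "bbox": [-122.5, 47.5, -122.2, 47.7],       # west, south, east, north
    "datetime": "2025-01-01T00:00:00Z/2025-06-30T23:59:59Z",
    "limit": 10,
}
# POSTing this JSON to a catalog's /search endpoint returns matching STAC items.
print(json.dumps(search))
```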

Geospatial insights at your fingertips with Microsoft Planetary Computer Pro

Industries are already realizing the benefits. For example, energy companies are using earth observation data to help monitor infrastructure health and anticipate maintenance needs. In agriculture, organizations are optimizing crop yields by analyzing soil conditions, weather trends, and land use patterns. Retailers are refining site selection strategies by combining demographic data with mobility and footfall analytics. 

These are not isolated cases; they reflect a broader shift. As enterprises face rising pressure to become more efficient, resilient, and sustainable, the ability to operationalize geospatial data is becoming a defining competitive advantage. 

Partner momentum: A thriving ecosystem 

Microsoft’s commitment to working with partners is foundational to our mission.  

Microsoft has been collaborating closely with Esri to integrate ArcGIS Pro and Enterprise into the platform. Esri users will be able to directly access managed content for use in imagery analysis workflows at any scale. This partnership enables geographic information system (GIS) professionals to continue using their preferred tools while benefiting from the scalability and AI capabilities of the Microsoft cloud. 

Microsoft partner Xoople is a start-up launching an end-to-end Earth Intelligence system powered by a new Xoople satellite constellation and Microsoft’s Planetary Computer Pro. With the help of Planetary Computer’s efficient data ingestion, indexing, management, and processing, Xoople plans to transform the datasets and deliver the latest industry insights to end customers via the Azure Marketplace and specialized ISVs. 

Microsoft’s partnerships are also helping provide value to organizations working around the world to enable a more sustainable future.  

Space Intelligence provides customers with audit-grade data on forest coverage and carbon storage for nature-based projects. Space Intelligence uses geospatial data analysis and machine learning through Microsoft Planetary Computer Pro to support zero deforestation and mass restoration. Space Intelligence required easy access in their AI/ML pipelines to a large-scale catalog of input data, both public and private, to process petabytes of data annually. Microsoft Planetary Computer Pro enabled them to scale their AI data storage layer with high-speed access, integrate through APIs, visualize data efficiently with an on-demand tiling stack, and maintain alignment between their open and closed data sources. 

Impact Observatory uses Planetary Computer Pro, Azure Batch, and proprietary models to optimize the production of their land-use land cover map product. By moving their inference pipeline onto Azure and using Azure Batch, Impact Observatory was able to run their model in parallel on 1,000 VMs, utilizing a total of 1 million core hours. In less than a week, they produced their global land-use land cover map.

EY Consulting has emerged as a pivotal force in revolutionizing geospatial capabilities across diverse industries. Their strategic collaboration with Microsoft has empowered customers by integrating cutting-edge geospatial capabilities into Azure. Through their expertise in geospatial data analytics, EY Consulting has made significant strides in embedding these insights into business operations, effectively redefining the geospatial landscape.

Looking forward: Mainstreaming geospatial insights with AI-ready infrastructure

Microsoft Planetary Computer Pro helps break down the barriers of complexity by integrating directly with tools like Microsoft Fabric, Azure AI Foundry, and Power BI—along with third-party platforms. This interoperability means data analysts, developers, and business users can access and act on geospatial data from mainstream analytics workflows. More than just access, Planetary Computer Pro sets the stage for applied AI—standardizing diverse datasets in a secure, cloud-native environment to enable advanced modeling, forecasting, and decision support. This is the foundation for a future where geospatial insights can help power everyday decisions across nearly every industry.
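Under the hood, Planetary Computer catalogs describe imagery using STAC (SpatioTemporal Asset Catalog) metadata, which is what makes diverse datasets searchable through one standard interface. The sketch below builds a minimal STAC Item as a plain dictionary to show the shape of that metadata; the collection, scene ID, and asset URL are hypothetical examples, and real ingestion into Planetary Computer Pro goes through its own APIs.

```python
import json
from datetime import datetime, timezone

# A minimal STAC Item sketched as a plain dict. STAC Items are GeoJSON
# Features carrying a timestamp and links to the underlying assets.
# The id, collection, and asset href below are HYPOTHETICAL placeholders.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "scene-2025-06-01",            # hypothetical scene id
    "collection": "my-private-imagery",  # hypothetical private collection
    "geometry": {"type": "Point", "coordinates": [-122.33, 47.61]},
    "bbox": [-122.34, 47.60, -122.32, 47.62],
    "properties": {
        "datetime": datetime(2025, 6, 1, tzinfo=timezone.utc).isoformat(),
        "platform": "drone",
    },
    "assets": {
        "visual": {
            "href": "https://example.blob.core.windows.net/scene.tif",
            "type": "image/tiff; application=geotiff",
        },
    },
}

# Because STAC Items are GeoJSON, they round-trip cleanly as JSON.
assert json.loads(json.dumps(item))["stac_version"] == "1.0.0"
```

Standardizing on this metadata model is what lets tools like Fabric, Power BI, and GIS clients query the same catalog without bespoke adapters.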

Satellite image of Western Washington captured by Landsat 8.

Conclusion: Geospatial insights at your fingertips 

By helping make geospatial insights more accessible, actionable, and AI-ready, Microsoft Planetary Computer Pro empowers organizations to make better decisions for their business and the planet. 

The public preview of Microsoft Planetary Computer Pro is available now in select Azure regions. 

Microsoft Planetary Computer Pro
Unify geospatial data with enterprise AI and analytics to enhance business decisions.

Discover more >

To get started: 

Visit Microsoft Planetary Computer Pro. 

Review our documentation on Microsoft Planetary Computer Pro.

Contact us at MPCPro@microsoft.com. 

As the world grapples with complex challenges, Microsoft Planetary Computer Pro helps ensure that geospatial insights are no longer a luxury for specialists, but accessible to all.
The post Microsoft Planetary Computer Pro: Unlocking AI-powered geospatial insights for enterprises across industries appeared first on Microsoft Azure Blog.

Maximize your ROI for Azure OpenAI

When you’re building with AI, every decision counts—especially when it comes to cost. Whether you’re just getting started or scaling enterprise-grade applications, the last thing you want is unpredictable pricing or rigid infrastructure slowing you down. Azure OpenAI is designed with that in mind: flexible enough for early experiments, powerful enough for global deployments, and priced to match how you actually use it.

From startups to the Fortune 500, more than 60,000 customers are choosing Azure AI Foundry, not just for access to foundational and reasoning models—but because it meets them where they are, with deployment options and pricing models that align to real business needs. This is about more than just AI—it’s about making innovation sustainable, scalable, and accessible.

Azure OpenAI deployment types and pricing options

This blog breaks down the available pricing and deployment options, and tools that support scalable, cost-conscious AI deployments.

Flexible pricing models that match your needs

Azure OpenAI supports three distinct pricing models designed to meet different workload profiles and business requirements:

Standard—For bursty or variable workloads where you want to pay only for what you use.

Provisioned—For high-throughput, performance-sensitive applications that require consistent throughput.

Batch—For large-scale jobs that can be processed asynchronously at a discounted rate.

Each approach is designed to scale with you—whether you’re validating a use case or deploying across business units.

Standard

The Standard deployment model is ideal for teams that want flexibility. You’re charged per API call based on tokens consumed, which helps optimize budgets during periods of lower usage.

Best for: Development, prototyping, or production workloads with variable demand.

You can choose between:

Global deployments: To ensure optimal latency across geographies.

OpenAI Data Zones: For more flexibility and control over data privacy and residency.

With all deployment selections, data is stored at rest within the chosen Azure region of your resource.

Batch

The Batch model is designed for high-efficiency, large-scale inference. Jobs are submitted and processed asynchronously, with responses returned within a 24-hour target window—at up to 50% less than Global Standard pricing. Batch also supports large-scale workloads, so you can process bulk requests and massive query volumes with minimal friction and reduced processing time.

Best for: Large-volume tasks with flexible latency needs.

Typical use cases include:

Large-scale data processing and content generation.

Data transformation pipelines.

Model evaluation across extensive datasets.
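Batch jobs take a JSONL input file in which each line is one independent request. The sketch below builds and sanity-checks such a payload; the request shape follows the OpenAI-style batch format, and the deployment name is a hypothetical placeholder, so verify field names against the Azure OpenAI Batch documentation for your API version.

```python
import json

# Build a JSONL payload for a batch job. Each line is an independent
# request; "custom_id" lets you match responses back to inputs. The
# deployment name is HYPOTHETICAL -- substitute your own.
docs = {
    "doc-1": "Summarize: quarterly report...",
    "doc-2": "Summarize: incident postmortem...",
}

lines = []
for custom_id, prompt in docs.items():
    lines.append(json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/chat/completions",
        "body": {
            "model": "my-batch-deployment",  # hypothetical deployment name
            "messages": [{"role": "user", "content": prompt}],
        },
    }))

jsonl = "\n".join(lines)

# Round-trip check: every line parses and carries a unique custom_id.
parsed = [json.loads(line) for line in jsonl.splitlines()]
assert len({p["custom_id"] for p in parsed}) == len(parsed)
print(f"{len(parsed)} batch requests ready")
```

Once uploaded, the service works through the file asynchronously and returns a matching JSONL output keyed by `custom_id`, which is what makes the model practical for document pipelines like the ones below.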

Customer in action: Ontada

Ontada, a McKesson company, used the Batch API to transform over 150 million oncology documents into structured insights. Applying LLMs across 39 cancer types, they unlocked 70% of previously inaccessible data and cut document processing time by 75%. Learn more in the Ontada case study.

Provisioned

The Provisioned model provides dedicated throughput via Provisioned Throughput Units (PTUs). This enables stable latency and high throughput—ideal for production use cases requiring real-time performance or processing at scale. Commitments can be hourly, monthly, or yearly with corresponding discounts.

Best for: Enterprise workloads with predictable demand and the need for consistent performance.

Common use cases:

High-volume retrieval and document processing scenarios.

Call center operations with predictable traffic hours.

Retail assistant with consistently high throughput.
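PTU capacity for workloads like these is typically sized from expected peak throughput in tokens per minute. The back-of-the-envelope sketch below shows the shape of that calculation; the tokens-per-minute-per-PTU figure is a hypothetical placeholder, since actual per-PTU throughput varies by model and version—use the Azure AI Foundry capacity calculator for real planning.

```python
import math

# Back-of-the-envelope PTU sizing. TOKENS_PER_MIN_PER_PTU is a
# HYPOTHETICAL placeholder, not an official figure -- real per-PTU
# throughput depends on the model, so use the capacity calculator.
TOKENS_PER_MIN_PER_PTU = 2_500  # assumed for illustration only

def ptus_needed(requests_per_min: int, avg_tokens_per_request: int) -> int:
    """Round peak token demand up to a whole number of PTUs."""
    peak_tokens_per_min = requests_per_min * avg_tokens_per_request
    return math.ceil(peak_tokens_per_min / TOKENS_PER_MIN_PER_PTU)

# Example: 300 requests/min averaging 1,200 tokens each.
print(ptus_needed(300, 1_200))
```

Sizing against the peak rather than the average is deliberate: PTUs buy guaranteed throughput, and any overflow beyond the commitment can be handled by spillover to pay-as-you-go capacity.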

Customers in action: Visier and UBS

Visier built “Vee,” a generative AI assistant that serves up to 150,000 users per hour. By using PTUs, Visier achieved threefold faster response times compared to pay-as-you-go models and reduced compute costs at scale. Read the case study.

UBS created “UBS Red,” a secure AI platform supporting 30,000 employees across regions. PTUs allowed the bank to deliver reliable performance with region-specific deployments across Switzerland, Hong Kong, and Singapore. Read the case study.

Deployment types for standard and provisioned

To meet growing requirements for control, compliance, and cost optimization, Azure OpenAI supports multiple deployment types:

Global: Most cost-effective, routes requests through the global Azure infrastructure, with data residency at rest.

Regional: Keeps data processing in a specific Azure region (28 available today), with data residency both at rest and processing in the selected region.

Data Zones: Offers a middle ground—processing remains within geographic zones (E.U. or U.S.) for added compliance without full regional cost overhead.

Global and Data Zone deployments are available across Standard, Provisioned, and Batch models.

Dynamic features help you cut costs while optimizing performance

Several dynamic new features designed to help you get the best results for lower costs are now available.

Model router for Azure AI Foundry: A deployable AI chat model that automatically selects the best underlying chat model to respond to a given prompt. Perfect for diverse use cases, model router delivers high performance while saving on compute costs where possible, all packaged as a single model deployment.

Batch large-scale workload support: Processes bulk requests at lower cost. Efficiently handle large-scale workloads to reduce processing time, with a 24-hour target turnaround at up to 50% less cost than Global Standard.

Provisioned throughput dynamic spillover: Provides seamless overflow handling for your high-performing applications on provisioned deployments, so you can manage traffic bursts without service disruption.

Prompt caching: Built-in optimization for repeatable prompt patterns. It accelerates response times, scales throughput, and helps cut token costs significantly.

Azure OpenAI monitoring dashboard: Continuously track performance, usage, and reliability across your deployments.
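Prompt caching in particular rewards how you structure requests: caches match on a repeated prompt prefix, so putting the long, stable instructions first and the per-request content last maximizes hits. The sketch below illustrates that structure only; the company name is hypothetical, and actual cache behavior and minimum prefix length depend on the model.

```python
# Prompt caching matches repeated prompt PREFIXES, so place long, stable
# instructions first and variable user content last. Structural sketch
# only -- cache eligibility and minimum prefix length depend on the model.

SYSTEM_PROMPT = (
    "You are a support assistant for Contoso (hypothetical company). "
    "Follow the policy below when answering.\n" + "POLICY TEXT " * 50
)

def build_messages(user_question: str) -> list[dict]:
    """Stable, cacheable prefix first; variable user content last."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},  # identical every call
        {"role": "user", "content": user_question},    # varies per call
    ]

a = build_messages("How do I reset my password?")
b = build_messages("What is the refund window?")

# The shared prefix (system message) is byte-identical across requests,
# which is what makes it a candidate for a cache hit.
assert a[0]["content"] == b[0]["content"]
```

Keeping the prefix byte-identical across calls (no timestamps or request IDs in the system message) is the practical habit that turns this feature into real token savings.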

To learn more about these features and how to leverage the latest innovations in Azure AI Foundry models, watch this session from Build 2025 on optimizing Gen AI applications at scale.

Integrated Cost Management tools

Beyond pricing and deployment flexibility, Azure OpenAI integrates with Microsoft Cost Management tools to give teams visibility and control over their AI spend.

Capabilities include:

Real-time cost analysis.

Budget creation and alerts.

Support for multi-cloud environments.

Cost allocation and chargeback by team, project, or department.

These tools help finance and engineering teams stay aligned—making it easier to understand usage trends, track optimizations, and avoid surprises.

Built-in integration with the Azure ecosystem

Azure OpenAI is part of a larger ecosystem that includes:

Azure AI Foundry—Everything you need to design, customize, and manage AI applications and agents.

Azure Machine Learning—For model training, deployment, and MLOps.

Azure Data Factory—For orchestrating data pipelines.

Azure AI services—For document processing, search, and more.

This integration simplifies the end-to-end lifecycle of building, customizing, and managing AI solutions. You don’t have to stitch together separate platforms—and that means faster time-to-value and fewer operational headaches.

A trusted foundation for enterprise AI

Microsoft is committed to enabling AI that is secure, private, and safe. That commitment shows up not just in policy, but in product:

Secure future initiative: A comprehensive security-by-design approach.

Responsible AI principles: Applied across tools, documentation, and deployment workflows.

Enterprise-grade compliance: Covering data residency, access controls, and auditing.

Get started with Azure AI Foundry

Build custom generative AI models with Azure OpenAI in Foundry Models.

Documentation for Deployment types.

Learn more about Azure OpenAI pricing.

Design, customize, and manage AI applications with Azure AI Foundry.

Azure OpenAI
Deploy the latest reasoning series and foundational models.

Learn more >

The post Maximize your ROI for Azure OpenAI appeared first on Microsoft Azure Blog.