Running high-performance PostgreSQL on Azure Kubernetes Service

In the ever-evolving world of cloud-native technologies, PostgreSQL continues to solidify its position as a top-tier database choice among workloads running on Kubernetes. According to the Kubernetes in the Wild 2025 report, PostgreSQL now powers 36% of all database workloads running on Kubernetes (up 6 points since 2022), signaling its rising popularity and growing trust among the Kubernetes community [1]. However, running data-intensive PostgreSQL workloads on Kubernetes has its own set of challenges. These include managing Kubernetes primitives like StatefulSets and Deployments, as well as achieving optimal performance by configuring storage, replication, and database settings. That experience is quickly becoming simpler.

We now provide two options for deploying stateful PostgreSQL workloads based on performance needs. To support databases with stringent latency and scalable transaction requirements, you can leverage Azure Container Storage to orchestrate Kubernetes volume deployment on local NVMe to scale up IOPS while maintaining extremely low sub-ms latency. For scenarios where optimized price-performance is a priority, Premium SSD v2 is the go-to choice. Additionally, working with CloudNativePG, we integrated a robust open-source operator for PostgreSQL to support a high availability database deployment model on Azure Kubernetes Service (AKS). Our advanced storage options combined with CloudNativePG make AKS a robust platform for high-performance PostgreSQL workloads.

Deploy PostgreSQL on AKS

Breakthrough PostgreSQL performance with local NVMe

For performance-critical PostgreSQL workloads, such as those handling massive concurrent transactions or demanding, low-latency data access, local NVMe directly attached to Azure Virtual Machine (VM) SKUs is your best bet. Using local NVMe drives with Kubernetes used to be complicated—it often required setting up RAID across the drives and manually managing static volume orchestrators. Azure Container Storage effectively addresses this challenge.

Azure Container Storage is a fully managed, container-native storage solution, designed specifically for Kubernetes. Developers can simply request a Kubernetes volume, and Azure will dynamically provision storage backed by the available local NVMe drives on AKS nodes. This gives PostgreSQL users direct attach block storage IOPS and latency within a managed, orchestrated cloud environment. Whether you’re powering payment systems, gaming backends, or real-time personalization engines, you get the best of both speed and simplicity. Azure Container Storage also supports Azure Disk and Elastic SAN (Preview), so you can choose backing storage with different durability, scale, or cost as your needs evolve—all under a consistent, Kubernetes-native control plane.
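As a rough illustration of "simply request a Kubernetes volume", the sketch below builds a standard PersistentVolumeClaim manifest and prints it as JSON, which kubectl accepts alongside YAML. The storage class name `acstor-local-nvme`, the claim name, and the size are placeholders, not values from this article. Depending on the Azure Container Storage version, local NVMe volumes may need additional settings, so treat this as a starting point and check the product documentation.

```python
import json

# Hypothetical name: substitute the storage class that your Azure Container
# Storage installation exposes (list them with `kubectl get storageclass`).
STORAGE_CLASS = "acstor-local-nvme"


def build_pvc(name: str, size: str, storage_class: str = STORAGE_CLASS) -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # A single PostgreSQL instance mounts its volume from one node.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Example claim name and size are illustrative only.
    print(json.dumps(build_pvc("pgdata", "100Gi"), indent=2))
```

If the script is saved as `pvc.py` (an arbitrary file name), its output can be piped straight into `kubectl apply -f -`.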

Our benchmark results have shown PostgreSQL achieving close to 15,000 transactions per second (TPS) with single-digit millisecond end-to-end query latency with the Standard_L16s_v3 VM. When scaling up to larger VM SKUs like Standard_L64s_v3, we observed TPS reaching up to 26,000 while maintaining low latency. For more details of our benchmark runs, refer to the comparison of storage options section below.

Optimize price-performance with Premium SSD v2

Azure Premium SSD v2 offers an optimal balance of price-performance and a flexible deployment model, making it especially well-suited for production environments that need to scale over time. With Premium SSD v2, you can configure IOPS, throughput, and size independently—enabling PostgreSQL deployments to scale dynamically with demand while minimizing upfront costs and avoiding resource overprovisioning.

Whether you’re running multi-tenant SaaS platforms, production systems that scale with business needs, or applications with spiky traffic, this flexibility leads to real savings without sacrificing performance. With up to 80,000 IOPS and 1,200 MB/s per volume, Premium SSD v2 supports highly demanding PostgreSQL workloads on an infrastructure that adapts to your app.
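Because IOPS, throughput, and size are set independently, it can help to sanity-check a requested configuration before provisioning. The sketch below checks a request against the per-volume ceilings quoted above (80,000 IOPS and 1,200 MB/s) and defaults to the 3,000 IOPS and 125 MB/s baseline used in the benchmark later in this post. The `DiskRequest` type and `validate` function are illustrative names, and Azure enforces further limits (for example, ones tied to disk size) that are not modeled here.

```python
from dataclasses import dataclass

# Per-volume ceilings and baseline performance quoted in this article.
MAX_IOPS = 80_000
MAX_THROUGHPUT_MBPS = 1_200
BASE_IOPS = 3_000
BASE_THROUGHPUT_MBPS = 125


@dataclass(frozen=True)
class DiskRequest:
    """Hypothetical description of one Premium SSD v2 volume."""
    size_gib: int
    iops: int = BASE_IOPS
    throughput_mbps: int = BASE_THROUGHPUT_MBPS


def validate(req: DiskRequest) -> list[str]:
    """Return human-readable problems; an empty list means the request passes."""
    problems = []
    if req.size_gib <= 0:
        problems.append("size must be positive")
    if req.iops > MAX_IOPS:
        problems.append(f"iops {req.iops} exceeds {MAX_IOPS}")
    if req.throughput_mbps > MAX_THROUGHPUT_MBPS:
        problems.append(
            f"throughput {req.throughput_mbps} MB/s exceeds {MAX_THROUGHPUT_MBPS} MB/s"
        )
    return problems


if __name__ == "__main__":
    print(validate(DiskRequest(size_gib=32)))
    print(validate(DiskRequest(size_gib=32, iops=100_000)))
```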

Comparison of storage options

To help you assess the two storage options outlined above, we ran benchmarks on two CloudNativePG operator setups on AKS with similar core and memory consumption, so that the backing storage was the only variable: one used local NVMe with Azure Container Storage, and the other used Premium SSD v2 with the Disk CSI driver.

For the first configuration, we used the Standard_D16ds_v5 SKU and provisioned two Premium SSD v2 32 GiB disks, each with 3000 IOPS and 125 MB/s throughput, for log and data files. In the second setup, we ran on Standard_L16s_v3 nodes with local NVMe storage included. The test environment was configured to closely simulate a real-world production database scenario. TPS measures how many individual transactions (such as INSERT, UPDATE, DELETE, or SELECT) a system can handle per second. Latency refers to the time delay between issuing a request to the database and receiving a response, which is especially critical for applications requiring real-time or near-real-time responsiveness, such as financial systems, online gaming, or high-performance analytics.

Local NVMe on Standard_L16s_v3 delivered 14,812 TPS with an average latency of 4.321 milliseconds. PremiumV2_LRS on Standard_D16ds_v5 recorded 8,600 TPS at 7.417 milliseconds latency. See pricing comparison below:

*Monthly costs are based on the base 3000 IOPS and 125 MB/s throughput. You can adjust the performance (capacity, throughput, and IOPS) of Premium SSD v2 disks at any time, allowing workloads to be cost efficient while meeting workload size and performance requirements.

**With 3 VMs of L16s_v3, you get 11.52 TB of storage allocated by default that is used to serve the volumes created for PostgreSQL workload. For other VM sizes in the L-Series family, the price per month and allocated storage will vary.

For PostgreSQL workloads, the choice between using local NVMe and Premium SSD v2 depends on balancing performance, cost, and data durability. Local NVMe via Azure Container Storage offers extremely low latency and high throughput, making it suitable for performance-sensitive PostgreSQL deployments. The costs are higher with local NVMe, and there is less flexibility to scale independently of workload characteristics. Conversely, Premium SSD v2 provides better price-performance efficiency and flexible scalability, making it a viable option for PostgreSQL instances that must handle increased scale or applications with unpredictable surges in demand or usage. In terms of data durability, Premium SSD v2 offers local redundancy by default, while for local NVMe, it is recommended to use a replica-based architecture managed by the CloudNativePG operator and an object storage-based backup approach to prevent data loss.
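To put the benchmark figures quoted above in relative terms, the short sketch below computes the throughput ratio and the latency reduction between the two setups. It also applies Little's law (concurrency = throughput × latency) to estimate the client concurrency implied by each run. That last step assumes a closed-loop load generator with a fixed number of clients, which this post does not state, so read it as a consistency check rather than a documented test parameter.

```python
# Figures quoted in the benchmark comparison above.
NVME_TPS, NVME_LATENCY_MS = 14_812, 4.321
SSD_TPS, SSD_LATENCY_MS = 8_600, 7.417


def tps_ratio(a: float, b: float) -> float:
    """How many times more transactions per second `a` delivers than `b`."""
    return a / b


def latency_reduction(fast_ms: float, slow_ms: float) -> float:
    """Fractional latency reduction of the faster setup relative to the slower."""
    return (slow_ms - fast_ms) / slow_ms


def implied_concurrency(tps: float, latency_ms: float) -> float:
    """Little's law: average in-flight transactions = throughput x latency."""
    return tps * latency_ms / 1000.0


if __name__ == "__main__":
    print(f"TPS ratio (NVMe / SSD v2): {tps_ratio(NVME_TPS, SSD_TPS):.2f}x")
    print(f"Latency reduction: {latency_reduction(NVME_LATENCY_MS, SSD_LATENCY_MS):.1%}")
    print(f"Implied concurrency, NVMe: {implied_concurrency(NVME_TPS, NVME_LATENCY_MS):.1f}")
    print(f"Implied concurrency, SSD v2: {implied_concurrency(SSD_TPS, SSD_LATENCY_MS):.1f}")
```

With these inputs, local NVMe delivers roughly 1.72 times the TPS at about 42% lower average latency, and both runs imply roughly 64 transactions in flight, which is consistent with the two tests being driven by the same load profile.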

Built for high availability with CloudNativePG on Azure Kubernetes Service

For teams deploying PostgreSQL in production, high availability and backups are non-negotiable. With the open-source CloudNativePG operator, a highly available PostgreSQL cluster on AKS can easily be deployed with:

Built-in replication and automated failover.

Application consistent backup with native integration with Azure Blob Storage.

Seamless integration with Azure Container Storage.

Flexible storage options: choose Premium SSD v2 or local NVMe based on workload needs.

Whether you’re supporting internal business apps or customer-facing platforms, this gives you peace of mind without the hassle of hand-building custom high availability logic and separate backup workflows. Get started with deploying highly available PostgreSQL on AKS with CloudNativePG operator using our step-by-step reference guide.
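For orientation, a CloudNativePG deployment is described by a `Cluster` resource. The sketch below builds a minimal one and prints it as JSON for `kubectl apply -f -`. The cluster name, the storage class `premium-ssd-v2`, and the volume size are placeholders chosen for illustration. Backup to Azure Blob Storage and other production settings are deliberately left out, so follow the reference guide for a complete configuration.

```python
import json


def build_cluster(name: str, instances: int, storage_class: str, size: str) -> dict:
    """Return a minimal CloudNativePG Cluster manifest as a dictionary."""
    return {
        "apiVersion": "postgresql.cnpg.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            # One primary plus (instances - 1) replicas; the operator
            # handles replication and failover between them.
            "instances": instances,
            "storage": {"storageClass": storage_class, "size": size},
        },
    }


if __name__ == "__main__":
    # All values below are hypothetical examples.
    manifest = build_cluster("pg-demo", 3, "premium-ssd-v2", "32Gi")
    print(json.dumps(manifest, indent=2))
```

Switching between Premium SSD v2 and local NVMe in this sketch comes down to changing the `storage_class` argument to whichever class is defined in your cluster.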

Ready for the future

PostgreSQL is just one of many stateful workloads that organizations are now confidently running on Azure Kubernetes Service. From databases to message queues, AI inferencing, and enterprise applications, AKS is evolving to meet the needs of persistent, data-heavy applications in production.

Whether you’re deploying Redis, MongoDB, Kafka, or even ML-serving pipelines with GPU-backed nodes, AKS provides the foundation to manage these workloads with performance, consistency, and operational ease, along with clear end-to-end guidance.

With innovations like Azure Container Storage for local NVMe and Premium SSD v2 for scalable persistent storage, we’re making it easier than ever to build stateful applications that are reliable, performant, and cost efficient for mission-critical workloads.

Modernize your data layer on Kubernetes today. Whether you’re running PostgreSQL or any stateful tier, Azure delivers the performance and manageability to make it happen. Explore proven patterns and deployment options in the AKS Stateful Workloads Overview.

[1] Kubernetes in the Wild 2025 report
The post Running high-performance PostgreSQL on Azure Kubernetes Service appeared first on Microsoft Azure Blog.
Source: Azure

Building secure, scalable AI in the cloud with Microsoft Azure

Generative AI is a transformative force, redefining how modern enterprises operate. It has quickly become central to how businesses drive productivity, innovate, and deliver impact. The pressure is on: organizations must move fast to not only adopt AI, but to unlock real value at scale or risk falling behind.  

Achieving enterprise-wide deployment of AI securely and efficiently is no easy feat. Generative AI is like rocket fuel. It can propel businesses to new heights, but only with the right infrastructure and controls in place. To accelerate safely and strategically, enterprises are turning to Microsoft Azure as mission control. Tapping into Azure’s powerful cloud infrastructure and advanced security solutions allows teams to effectively build, deploy, amplify, and see real results from generative AI. 

To understand how businesses are preparing for AI, we commissioned Forrester Consulting to survey Azure customers. The resulting 2024 Forrester Total Economic Impact™ study uncovers the steps businesses take to become AI-ready, the challenges of adopting generative AI in the cloud, and how Azure’s scalable infrastructure and built-in security help businesses deploy AI with confidence.

Read the full study to learn how organizations are leveraging Azure for AI-readiness and to run generative AI securely in the cloud

Challenges with scaling generative AI on-premises 

Scaling generative AI is like designing transportation systems for a rapidly growing city. Just as urban expansion demands modern transportation infrastructure to function efficiently, AI leaders understand that implementing AI in a meaningful way requires a cloud foundation that is powerful, flexible, and built to handle future demand. AI leaders recognize that the power and agility of the cloud is needed to achieve their desired outcomes.  

In fact, 72% of surveyed respondents whose organizations migrated to Azure for AI-readiness reported that the migration was necessary or reduced the barriers to enabling AI.

65% of business leaders agreed that deploying generative AI in the cloud would meet their organizational objectives to avoid restrictions and limitations of on-prem deployments. 

Businesses that run most or all of their generative AI workloads on-premises face significant roadblocks. On-premises systems, often lacking the agility offered by the cloud, resemble outdated roadways—prone to congestion, difficult to maintain, expensive to expand, and ill-equipped for today’s demands. Businesses attempting to scale AI in these environments encounter complicated obstacles—including infrastructure limitations, a shortage of specialized talent, and integration challenges that slow innovation—that are frustrating to overcome. Challenges like limited network bandwidth and fragmented data environments further complicate adoption.

Deploying generative AI safely is crucial to protecting sensitive data, maintaining compliance, and mitigating risk. Surveyed decision-makers identified four key areas of concerns: 

Data privacy risks, especially with the proliferation of AI-generated content.

Lack of expertise regarding generative AI security best practices.

Compliance complexities with evolving regulations around AI use and data protection.

Shadow IT risks, as users turn to unauthorized tools and apps, exposing organizations to vulnerabilities.

To overcome these challenges, it’s important to partner with a cloud platform that provides built-in security and regulatory compliance. Cloud migration provides the scalable infrastructure, integrated applications, and AI-ready data foundation necessary for generative AI success. Survey respondents who have already transitioned many or all AI workloads to Azure report enhanced global reach, scalability, and flexibility, all major advantages in today’s rapidly evolving AI landscape. 

Why enterprises choose Azure for AI-readiness

Infrastructure limitations are a barrier to scaling generative AI. On-premises environments often hinder performance, increase costs, and slow innovation. According to our survey, 75% of organizations migrating to Azure for AI-readiness reported that the migration was necessary or it significantly reduced barriers to generative AI adoption. 

While the benefits of deploying generative AI in the cloud are clear, teams still face hurdles in adopting AI responsibly. Vulnerabilities, limited expertise in AI security, and data privacy risks are the most prominent concerns. Azure addresses these concerns with comprehensive frameworks that safeguard generative AI workloads end-to-end, from development to runtime. 

Surveyed leaders cited Azure’s colocation strategy as a top reason for partnering with Azure for deploying generative AI, eliminating data silos and optimizing performance. Microsoft Defender for Cloud and Microsoft Sentinel enhance protection and make Azure a trusted platform for safe, enterprise-grade generative AI deployment. 

4 key differentiators for deploying generative AI with Azure

1. Enterprise-grade security and compliant solutions

Security concerns are a primary challenge when deploying generative AI in the cloud. Azure protects AI workloads from code to cloud. Azure’s multi-layered approach helps modern organizations meet compliance standards and minimizes risks across the entire AI lifecycle. Key solutions including Defender for Cloud, Microsoft Sentinel, Microsoft Azure Key Vault, and infrastructure as a service (IaaS) provide end-to-end protection for generative AI workloads, ensuring data privacy, development lifecycle protection, and threat management. Backed by Microsoft’s enterprise-grade security, compliance, and responsible AI commitments, Azure empowers teams to build AI solutions that are not only powerful but also ethical, transparent, and compliant. 

2. Scalable cloud infrastructure

Azure’s cloud infrastructure allows businesses to avoid the constraints of legacy environments, enabling them to launch AI projects efficiently and securely. Azure brings a suite of advanced AI and machine learning tools to the table that are mission critical for generative AI success, enabling organizations to break free from siloed data, outdated security frameworks, and infrastructure bottlenecks. By deploying generative AI in the cloud, businesses can accelerate innovation, streamline operations, and build AI-powered solutions with confidence. 

3. Unified data and AI management

Effective AI starts with a solid data foundation. Azure’s data integration and management solutions—Microsoft Fabric, Azure Synapse Analytics, and Azure Databricks—enable organizations to centralize data, improve governance, and optimize AI model performance. By moving beyond the limitations of legacy on-premises environments, businesses gain seamless data access, better compliance, and the scalability needed to drive AI innovation for enterprise. With Azure, organizations can harness high-quality, well-governed data to power more accurate and reliable AI outcomes. 

4. Faster innovation

By adopting Azure, resources can be redirected from infrastructure maintenance to AI-powered innovation. Azure’s flexible, secure cloud environment enables businesses to experiment, adapt, and evangelize AI solutions with less risk than traditional on-premises deployments. Surveyed organizations using Azure reported more than twice the confidence in their ability to build and refine AI and machine learning applications compared to those relying on on-premises infrastructure. Key benefits include greater flexibility, reduced risk when modifying AI solutions, and the ability to reinvest infrastructure resources into AI upskilling and innovation. 

The business impact of secure generative AI on Azure 

Migrating to Azure for AI deployment enhances performance and operational efficiency. Benefits include: 

Optimized resource allocation: Migrating to the cloud frees IT teams from infrastructure management, allowing them to focus on strategic initiatives—such as developing generative AI use cases—that drive meaningful business impact.

Accelerated time to value: Azure AI services empower data scientists, AI and machine learning engineers, and developers, helping them to deliver high-quality models faster.

Enhanced security and compliance: Azure’s integrated security tools protect workloads, reduce breach risks, and meet evolving compliance standards.

Higher AI application performance: Deploying generative AI with Azure improves application performance—driving innovation and growth. 

Innovation without compromise 

As IT professionals and digital transformation leaders navigate the complexities of AI adoption, Azure stands out as a trusted partner for enterprise AI-readiness. With advanced infrastructure, safe and responsible AI practices, and built-in security, Azure offers a secure and scalable foundation for building and running generative AI in the cloud. With Azure, organizations can unlock the full potential of generative AI to drive innovation, accelerate growth, and deliver lasting business value.


The post Building secure, scalable AI in the cloud with Microsoft Azure appeared first on Microsoft Azure Blog.

Microsoft Planetary Computer Pro: Unlocking AI-powered geospatial insights for enterprises across industries

A proliferation of satellite constellations and connectivity to hyperscale clouds has made geospatial data available for a wide variety of sectors and use cases: from coordinating supply chains, to managing climate risk, and planning urban infrastructure, just to name a few. Yet despite its growing importance, geospatial data remains notoriously complex and siloed across a variety of sources, including satellites, drones, and other sensors—often accessible only to experts.  

To help solve this challenge, Microsoft has invested in simplifying the complex geospatial landscape—and we are excited to introduce the Public Preview of Microsoft Planetary Computer Pro, a comprehensive platform that makes it dramatically easier for organizations to harness geospatial data for real-world impact. Microsoft Planetary Computer Pro is a next-generation platform designed to bring geospatial insights into the mainstream analytics workflow. It empowers organizations to ingest, catalog, store, process, and disseminate large volumes of private geospatial data in Microsoft Azure, using familiar tools and AI-driven insights. The result? Easier access, optimized datasets, unified security, identity, and governance, and faster time to insight.   

Geospatial insights at your fingertips with Microsoft Planetary Computer Pro

Industries are already realizing the benefits. For example, energy companies are using earth observation data to help monitor infrastructure health and anticipate maintenance needs. In agriculture, organizations are optimizing crop yields by analyzing soil conditions, weather trends, and land use patterns. Retailers are refining site selection strategies by combining demographic data with mobility and footfall analytics. 

These are not isolated cases; they reflect a broader shift. As enterprises face rising pressure to become more efficient, resilient, and sustainable, the ability to operationalize geospatial data is becoming a defining competitive advantage. 

Partner momentum: A thriving ecosystem 

Microsoft’s commitment to working with partners is foundational to our mission.  

Microsoft has been collaborating closely with Esri to integrate ArcGIS Pro and Enterprise into the platform. Esri users will be able to directly access managed content for use in imagery analysis workflows at any scale. This partnership enables geographic information system (GIS) professionals to continue using their preferred tools while benefiting from the scalability and AI capabilities of the Microsoft cloud. 

Microsoft partner Xoople is a start-up launching an end-to-end Earth Intelligence system powered by a new Xoople satellite constellation and Microsoft’s Planetary Computer Pro. With the help of Planetary Computer’s efficient data ingestion, indexing, management, and processing, Xoople plans to transform the datasets and deliver the latest industry insights to end customers via the Azure Marketplace and specialized ISVs. 

Microsoft’s partnerships are also helping provide value to organizations working around the world to enable a more sustainable future.  

Space Intelligence provides customers with audit-grade data on forest coverage and carbon storage for nature-based projects. Space Intelligence uses geospatial data analysis and machine learning through Microsoft Planetary Computer Pro to support zero deforestation and mass restoration. Space Intelligence required easy access in their AI/ML pipelines to a large-scale catalog of input data, both public and private, to process petabytes of data annually. Microsoft Planetary Computer Pro enabled them to scale their AI data storage layer with high-speed access, integrate through APIs, visualize data efficiently with an on-demand tiling stack, and maintain alignment between their open and closed data sources. 

Impact Observatory uses Planetary Computer Pro, Azure Batch, and proprietary models to optimize the production of their land-use land cover map product. By moving their inference pipeline on to Azure and using Azure Batch, Impact Observatory was able to run their model in parallel on 1000 VMs, utilizing a total of 1 million core hours. In less than a week, they produced their global land-use land cover map.  

EY Consulting has emerged as a pivotal force in revolutionizing geospatial capabilities across diverse industries. Their strategic collaboration with Microsoft has empowered customers by integrating cutting-edge geospatial capabilities into Azure. Through their expertise in geospatial data analytics, EY Consulting has made significant strides in embedding these insights into business operations, effectively redefining the geospatial landscape.

Looking forward: Mainstreaming geospatial insights with AI-ready infrastructure

Microsoft Planetary Computer Pro helps break down the barriers of complexity by integrating directly with tools like Microsoft Fabric, Azure AI Foundry, and Power BI, along with third-party platforms. This interoperability means data analysts, developers, and business users can access and act on geospatial data from within mainstream analytics workflows. More than just access, Planetary Computer Pro sets the stage for applied AI by standardizing diverse datasets in a secure, cloud-native environment to enable advanced modeling, forecasting, and decision support. This is the foundation for a future where geospatial insights can help power everyday decisions across nearly every industry.

Satellite image of Western Washington captured by Landsat 8.

Conclusion: Geospatial insights at your fingertips 

By helping make geospatial insights more accessible, actionable, and AI-ready, Microsoft Planetary Computer Pro empowers organizations to make better decisions for their business and the planet. 

The public preview of Microsoft Planetary Computer Pro is available now in select Azure regions. 


To get started: 

Visit Microsoft Planetary Computer Pro. 

Review our documentation on Microsoft Planetary Computer Pro.

Contact us at MPCPro@microsoft.com. 

As the world grapples with complex challenges, Microsoft Planetary Computer Pro helps ensure that geospatial insights are no longer a luxury for specialists, but accessible to all.
The post Microsoft Planetary Computer Pro: Unlocking AI-powered geospatial insights for enterprises across industries appeared first on Microsoft Azure Blog.

Maximize your ROI for Azure OpenAI

When you’re building with AI, every decision counts—especially when it comes to cost. Whether you’re just getting started or scaling enterprise-grade applications, the last thing you want is unpredictable pricing or rigid infrastructure slowing you down. Azure OpenAI is designed with that in mind: flexible enough for early experiments, powerful enough for global deployments, and priced to match how you actually use it.

From startups to the Fortune 500, more than 60,000 customers are choosing Azure AI Foundry, not just for access to foundational and reasoning models—but because it meets them where they are, with deployment options and pricing models that align to real business needs. This is about more than just AI—it’s about making innovation sustainable, scalable, and accessible.

Azure OpenAI deployment types and pricing options

This blog breaks down the available pricing and deployment options, and tools that support scalable, cost-conscious AI deployments.

Flexible pricing models that match your needs

Azure OpenAI supports three distinct pricing models designed to meet different workload profiles and business requirements:

Standard—For bursty or variable workloads where you want to pay only for what you use.

Provisioned—For high-throughput, performance-sensitive applications that require consistent throughput.

Batch—For large-scale jobs that can be processed asynchronously at a discounted rate.

Each approach is designed to scale with you—whether you’re validating a use case or deploying across business units.
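The three descriptions above can be read as a simple decision rule. The sketch below encodes that reading in a small helper. The function name and its two boolean inputs are illustrative simplifications, not an official selection tool, and real decisions would also weigh cost, data residency, and capacity planning.

```python
from enum import Enum


class PricingModel(Enum):
    STANDARD = "Standard"
    PROVISIONED = "Provisioned"
    BATCH = "Batch"


def suggest_pricing_model(
    *, can_run_async: bool, needs_consistent_throughput: bool
) -> PricingModel:
    """Map a coarse workload profile to one of the three pricing models."""
    if can_run_async:
        # Large jobs that tolerate delayed results fit the discounted Batch model.
        return PricingModel.BATCH
    if needs_consistent_throughput:
        # Performance-sensitive, high-throughput applications fit Provisioned.
        return PricingModel.PROVISIONED
    # Bursty or variable workloads pay only for what they use.
    return PricingModel.STANDARD


if __name__ == "__main__":
    choice = suggest_pricing_model(
        can_run_async=False, needs_consistent_throughput=True
    )
    print(choice.value)
```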

Standard

The Standard deployment model is ideal for teams that want flexibility. You’re charged per API call based on tokens consumed, which helps optimize budgets during periods of lower usage.

Best for: Development, prototyping, or production workloads with variable demand.

You can choose between:

Global deployments: To ensure optimal latency across geographies.

OpenAI Data Zones: For more flexibility and control over data privacy and residency.

With all deployment selections, data is stored at rest within the Azure region you chose for your resource.

Batch

The Batch model is designed for high-efficiency, large-scale inference. Jobs are submitted and processed asynchronously, with a 24-hour target turnaround, at up to 50% less than Global Standard pricing. Its large-scale workload support lets you submit bulk requests with minimal friction and at lower cost.

Best for: Large-volume tasks with flexible latency needs.

Typical use cases include:

Large-scale data processing and content generation.

Data transformation pipelines.

Model evaluation across extensive datasets.
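To see what the Batch discount can mean for a job of this kind, the sketch below compares an estimated Global Standard cost with the best-case Batch cost. The token count and the price per million tokens are made-up inputs for illustration, and the 50% figure is the upper bound quoted above, so actual savings may be smaller.

```python
# Best-case discount from the "up to 50% less" figure quoted above.
BATCH_DISCOUNT = 0.50


def standard_cost(tokens: int, price_per_million: float) -> float:
    """Estimated pay-as-you-go cost for a given number of tokens."""
    return tokens / 1_000_000 * price_per_million


def batch_cost(
    tokens: int, price_per_million: float, discount: float = BATCH_DISCOUNT
) -> float:
    """Estimated Batch cost after applying the discount to the standard price."""
    return standard_cost(tokens, price_per_million) * (1 - discount)


if __name__ == "__main__":
    tokens = 2_000_000_000   # hypothetical job size
    price = 2.00             # hypothetical price per million tokens
    print(f"Standard: {standard_cost(tokens, price):,.2f}")
    print(f"Batch (best case): {batch_cost(tokens, price):,.2f}")
```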

Customer in action: Ontada

Ontada, a McKesson company, used the Batch API to transform over 150 million oncology documents into structured insights. Applying LLMs across 39 cancer types, they unlocked 70% of previously inaccessible data and cut document processing time by 75%. Learn more in the Ontada case study.

Provisioned

The Provisioned model provides dedicated throughput via Provisioned Throughput Units (PTUs). This enables stable latency and high throughput—ideal for production use cases requiring real-time performance or processing at scale. Commitments can be hourly, monthly, or yearly with corresponding discounts.

Best for: Enterprise workloads with predictable demand and the need for consistent performance.

Common use cases:

High-volume retrieval and document processing scenarios.

Call center operations with predictable traffic hours.

Retail assistant with consistently high throughput.

Customers in action: Visier and UBS

Visier built “Vee,” a generative AI assistant that serves up to 150,000 users per hour. By using PTUs, Visier improved response times by three times compared to pay-as-you-go models and reduced compute costs at scale. Read the case study.

UBS created ‘UBS Red’, a secure AI platform supporting 30,000 employees across regions. PTUs allowed the bank to deliver reliable performance with region-specific deployments across Switzerland, Hong Kong, and Singapore. Read the case study.

Deployment types for standard and provisioned

To meet growing requirements for control, compliance, and cost optimization, Azure OpenAI supports multiple deployment types:

Global: Most cost-effective, routes requests through the global Azure infrastructure, with data residency at rest.

Regional: Keeps data processing in a specific Azure region (28 available today), with data residency both at rest and processing in the selected region.

Data Zones: Offers a middle ground—processing remains within geographic zones (E.U. or U.S.) for added compliance without full regional cost overhead.

Global and Data Zone deployments are available across Standard, Provisioned, and Batch models.

Dynamic features help you cut costs while optimizing performance

Several dynamic new features designed to help you get the best results for lower costs are now available.

Model router for Azure AI Foundry: A deployable AI chat model that automatically selects the best underlying chat model to respond to a given prompt. Perfect for diverse use cases, model router delivers high performance while saving on compute costs where possible, all packaged as a single model deployment.

Batch large-scale workload support: Processes bulk requests with lower costs. Efficiently handle large-scale workloads to reduce processing time, with 24-hour target turnaround, at up to 50% less cost than Global Standard.

Provisioned throughput dynamic spillover: Provides seamless overflowing for your high-performing applications on provisioned deployments. Manage traffic bursts without service disruption.

Prompt caching: Built-in optimization for repeatable prompt patterns. It accelerates response times, scales throughput, and helps cut token costs significantly.

Azure OpenAI monitoring dashboard: Continuously track performance, usage, and reliability across your deployments.

To learn more about these features and how to leverage the latest innovations in Azure AI Foundry models, watch this session from Build 2025 on optimizing Gen AI applications at scale.

Integrated Cost Management tools

Beyond pricing and deployment flexibility, Azure OpenAI integrates with Microsoft Cost Management tools to give teams visibility and control over their AI spend.

Capabilities include:

Real-time cost analysis.

Budget creation and alerts.

Support for multi-cloud environments.

Cost allocation and chargeback by team, project, or department.

These tools help finance and engineering teams stay aligned—making it easier to understand usage trends, track optimizations, and avoid surprises.

Built-in integration with the Azure ecosystem

Azure OpenAI is part of a larger ecosystem that includes:

Azure AI Foundry—Everything you need to design, customize, and manage AI applications and agents.

Azure Machine Learning—For model training, deployment, and MLOps.

Azure Data Factory—For orchestrating data pipelines.

Azure AI services—For document processing, search, and more.

This integration simplifies the end-to-end lifecycle of building, customizing, and managing AI solutions. You don’t have to stitch together separate platforms—and that means faster time-to-value and fewer operational headaches.

A trusted foundation for enterprise AI

Microsoft is committed to enabling AI that is secure, private, and safe. That commitment shows up not just in policy, but in product:

Secure Future Initiative: A comprehensive security-by-design approach.

Responsible AI principles: Applied across tools, documentation, and deployment workflows.

Enterprise-grade compliance: Covering data residency, access controls, and auditing.

Get started with Azure AI Foundry

Build custom generative AI models with Azure OpenAI in Foundry Models.

Documentation for Deployment types.

Learn more about Azure OpenAI pricing.

Design, customize, and manage AI applications with Azure AI Foundry.


The post Maximize your ROI for Azure OpenAI appeared first on Microsoft Azure Blog.
Source: Azure

IDC Business Value Study: A 306% ROI within 3 years using Ubuntu Linux on Azure

Businesses today are under pressure to innovate faster, reduce costs, and stay secure—all while preparing for an AI-driven future. As part of this shift, many organizations are turning to Microsoft Azure to modernize their infrastructure. In doing so, they find that migrating to Azure helps meet these evolving demands by improving agility, strengthening security, and laying the foundation for AI readiness.

Microsoft Azure supports your migration and modernization journey with services built for Linux and Open Source. Central to this transformation is Ubuntu, Canonical’s enterprise-grade Linux distribution, which integrates seamlessly with Azure’s IaaS and PaaS. Together, they deliver high performance, reliability, and enterprise support—plus a broad set of tools to make migration smooth and efficient.

Optimize your Ubuntu experience in Azure

To bring a data-driven perspective to these benefits, Microsoft commissioned International Data Corporation (IDC) to conduct a business value study* based on interviews with organizations that moved their Ubuntu workloads from on-premises to Azure. Study participants shared that Azure provides a more efficient and effective platform for their Ubuntu workloads, maximizing their value in core business functions and supporting new technology adoption. Using the data derived from these interviews, IDC analysts created a typical customer profile to represent common experiences and business outcomes. The consolidated data from study participants shows that running Canonical Ubuntu workloads on Azure delivers the following benefits:

306% three-year return on investment with an 11-month payback on investment.

35% lower three-year cost of operations.

63% faster to deploy new compute resources and 52% faster to scale to new business opportunities.

85% less unplanned downtime affecting users.

$30.63M higher revenue per organization per year.

Quantified benefits of Ubuntu on Microsoft Azure

IDC interviewed stakeholders involved with Ubuntu workloads on Azure, uncovering significant benefits cited by participants, including:

Run mission-critical workloads with robust performance and flexibility

Organizations running workloads such as data analytics, engineering simulations, and machine learning experience increased agility and operational efficiency with Ubuntu on Azure. By leveraging Ubuntu on Azure, businesses can scale seamlessly and respond swiftly to changing market conditions, ensuring optimal application performance while accelerating innovation and maintaining a competitive edge.

“With Ubuntu on Azure, we’ve unlocked AI adoption. We can scale innovations and experiment with technologies like GenAI, ML, and big data analytics without infrastructure constraints.”

The study participants also highlighted the ease of migrating Ubuntu workloads to Azure and the ability to add or remove capacity as needed. Gains in agility and development were notable, with users able to adjust and scale their Ubuntu environments more rapidly and flexibly in Azure, reducing deployment-related friction on development and business activities.

“Scalability is one of the reasons we moved to Ubuntu on Azure. We now have rapid scaling and flexible deployment, which enhance our responsiveness to business needs by almost 40%.”

Strengthen security and empower your IT teams

Security was another standout benefit for organizations adopting Ubuntu on Azure. They experienced enhanced operational resilience and reduced exposure to security and performance risks. Azure’s built-in security tools, including Microsoft Defender for Cloud, offer continuous security assessment, threat detection, and actionable recommendations. This enables IT teams to proactively identify vulnerabilities, respond swiftly to potential threats, and maintain robust protection, ultimately supporting business continuity and fostering trust with customers and stakeholders.

“Ubuntu on Azure provides built-in security features such as Microsoft Defender for Cloud, which is a continuous security assessment and actionable recommendations. This proactive approach helps us identify vulnerabilities before they can be exploited, which is what we all are looking out for.”

In addition, IT teams have been able to shift their focus from maintenance-heavy tasks to more strategic, innovation-driven efforts, including AI initiatives. The transition to Azure simplified operations, streamlined development cycles, and enabled teams to make faster progress on business-critical projects by leveraging built-in AI tools and infrastructure that support rapid experimentation and deployment.

“With Ubuntu on Azure, we leverage AI and refocus our IT team. Managing on-premises infrastructure was difficult, but Azure AI services enhanced our applications and drove innovation. We’ve shifted IT resources from maintenance to strategic projects, improving productivity by 25%.”

Reduce operational costs while scaling efficiently

Organizations also realized significant cost efficiencies with Ubuntu on Azure. By taking advantage of Azure’s pay-as-you-go pricing and removing hardware maintenance burdens, businesses achieved notable infrastructure and licensing savings.

IDC found that customers reduced the cost of running Ubuntu workloads by an average of 35% over three years, saving $6,500 per Azure VM. Many also saw a 29% reduction in annual infrastructure costs, equating to approximately $581,100 per year.
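As a rough back-of-the-envelope check, the cost baselines implied by these figures can be worked out as shown below. This assumes that each percentage applies to the same cost base as the dollar figure quoted beside it, which the summary does not state explicitly; the function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the IDC figures quoted above.
# Assumption: each percentage and dollar saving refer to the same cost baseline.

def implied_baseline(savings: float, reduction: float) -> float:
    """Return the original cost implied by an absolute saving and a fractional reduction."""
    return savings / reduction

# 35% lower three-year cost, quoted as $6,500 saved per Azure VM.
per_vm_baseline = implied_baseline(6_500, 0.35)
# 29% lower annual infrastructure cost, quoted as about $581,100 per year.
annual_infra_baseline = implied_baseline(581_100, 0.29)

print(f"Implied three-year cost per VM before migration: ${per_vm_baseline:,.0f}")
print(f"Implied annual infrastructure cost before migration: ${annual_infra_baseline:,.0f}")
```

Under that assumption, the figures imply a pre-migration cost of roughly $18,600 per VM over three years and roughly $2 million per year in infrastructure spend for the composite organization.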

“Ubuntu on Azure has reduced our direct IT costs by 40%, and it also optimizes our resource allocation, so we have better operational efficiency and staff time savings.”

“Ubuntu on Azure offers significant cost savings and scalability compared to on-premises solutions. It also provides excellent integration and interoperability and helps address data challenges, enhancing completeness, accuracy, and availability to support business decisions.”

Learn more from the IDC study

Download the full study: The Business Value of Ubuntu on Microsoft Azure.

Register to attend the webinar and listen to our guests from IDC, Microsoft, and Canonical discuss the benefits of running Ubuntu Linux on Azure.

To learn more about Ubuntu on Azure, visit our website. 


*IDC White Paper, sponsored by Microsoft, The Business Value of Ubuntu on Microsoft Azure, doc # US52857024, January 2025.

The post IDC Business Value Study: A 306% ROI within 3 years using Ubuntu Linux on Azure appeared first on Microsoft Azure Blog.
Source: Azure

Celebrating innovation, scale, and real-world impact with Serverless Compute on Azure

Microsoft is named a Leader in The Forrester Wave™: Serverless Development Platforms, Q2 2025

We are thrilled to announce that Microsoft has been recognized as a leader in The Forrester Wave™: Serverless Development Platforms, Q2 2025. We believe this recognition is a testament to our relentless focus on empowering developers, driving innovation, and delivering real value at scale for organizations across industries with Azure Functions and Azure Container Apps. Download the full report here (Forrester subscription required).

Focus on code, not infrastructure with serverless

Build smarter, scale faster with serverless compute in the era of AI applications and agents

Microsoft’s vision for serverless has always been clear: enable every developer to build, deploy, and manage modern applications with unmatched productivity, security, and agility—no matter the architecture, language, or workload. With Azure’s end-to-end serverless platform, we have moved beyond function-as-a-service to a comprehensive environment where containers, event-driven architectures, AI, and cloud-native patterns come together seamlessly.

Build and deploy serverless apps at scale

Our serverless offerings are designed to do more than abstract infrastructure—they are the foundation for building next-generation intelligent apps. With deep integrations into AI services, robust event handling, and developer-centric tooling, Azure Functions and Azure Container Apps make it easy for teams to transform ideas into impactful solutions.

What sets Microsoft’s serverless compute platform apart?

Unified event-driven and container-based models: Azure Functions and Azure Container Apps let you run any code, anywhere, scaling instantly from zero to hyper-scale—supporting both serverless functions and fully managed serverless containers without worrying about underlying infrastructure.

AI integration at every layer: With native support for Azure OpenAI, serverless GPUs and AI toolchains, you can embed generative AI, retrieval-augmented generation (RAG) patterns, and agentic workflows directly into serverless workflows, accelerating innovation in every app.

Best-in-class developer experience: From Visual Studio and VS Code to GitHub Actions, GitHub Copilot for Azure and familiar open-source frameworks, Microsoft’s stack puts developer productivity first—backed by extensive documentation, templates, and integrated DevOps capabilities.

Enterprise-grade security and compliance: Azure offers comprehensive identity and access management, role-based controls, and regulatory compliance, ensuring your applications and data are always protected.

Flexible pricing and hosting: Choose between consumption-based serverless, dedicated compute, or adaptive models. Features like Flex Consumption Plan and serverless GPU let you optimize for cost, performance, and specific workload needs.

Seamless and instant scaling: Instantly scale from zero to global with negligible cold start delays—ensuring always-on performance and real-time responsiveness for AI-powered and event-driven workloads, without manual intervention or infrastructure management.

Industry impact: With over a decade of operating a reliable cloud platform, we support mission-critical workloads across financial services, manufacturing, media, retail, and beyond.

Fully managed serverless container platform

Real-world impact: Customer success stories

Our customers continue to inspire us, showing what’s possible with Azure Functions and Azure Container Apps:

Hera Space Mission: Hera Space Companion, in collaboration with Terra Mater Studios, European Space Agency and Impact AI, is using Azure Container Apps and Azure AI Foundry to power the Hera AI Companion—an interactive, multilingual experience that lets users converse with a spacecraft in deep space—while also enabling rapid satellite image analysis and streamlined AI model deployment to accelerate innovation in space-based environmental insights.

Coca-Cola: By adopting Azure Container Apps and Azure Functions to orchestrate real-time interactions in its global “Create Real Magic” holiday campaign, Coca-Cola created a serverless, AI-powered Santa to engage over a million consumers across 43 countries in 26 languages with personalized experiences.

NFL: The National Football League integrates Azure Container Apps into its scouting platform, NFL Combine, to deliver real-time, sideline-ready AI insights, transforming hours of manual analysis into seconds of actionable data for coaches and scouts without managing infrastructure. The league also uses it to power advanced fan engagement platforms, delivering real-time updates, personalized content, and data analytics during live events, all at massive scale.

Indiana Pacers: The Pacers build a real-time, in-arena captioning system that delivers instant, accurate captions to fans, enhancing accessibility and redefining the live sports experience through serverless compute and AI.

Coldplay: The iconic band, Coldplay, partners with Pixel Artworks to deliver immersive, AI-driven visual experiences at live shows, blending creativity and technology in real time using Azure Functions.

Heineken: Heineken is leveraging Azure Functions to build secure, scalable AI agents that automate workflows and power real-time RAG experiences—enabling intelligent, cost-optimized innovation across its global operations.

These stories are just a glimpse into the transformative potential of serverless at Microsoft. Visit the Microsoft Customer Stories for deeper dives into how organizations are succeeding with Azure Functions and Azure Container Apps, and check out the latest Build updates for even more innovation highlights.

Innovation continues: Build what’s next with Microsoft serverless

This recognition as a leader isn’t just a milestone—it’s a launchpad for what’s next. We’re continuously investing in AI-powered development, seamless hybrid cloud, and flexible deployment models. Our recent updates at Microsoft Build highlight advanced AI apps and agents, new serverless GPU capabilities, and an ever-growing ecosystem of tools, templates, and partner solutions to help you modernize, build, and scale.

Whether you’re building intelligent agents, orchestrating real-time data, or delivering engaging digital experiences, Microsoft’s serverless platform provides the power, flexibility, and trust you need.

Join us on this journey. Explore the latest on Azure Functions and Azure Container Apps, and let’s build the future—together.

Forrester does not endorse any company, product, brand, or service included in its research publications and does not advise any person to select the products or services of any company or brand based on the ratings included in such publications. Information is based on the best available resources. Opinions reflect judgment at the time and are subject to change. For more information, read about Forrester’s objectivity here.
The post Celebrating innovation, scale, and real-world impact with Serverless Compute on Azure appeared first on Microsoft Azure Blog.
Source: Azure

FYAI: How to leverage AI to reimagine cross-functional collaboration with Yina Arenas

Microsoft Build 2025 showcased how Microsoft is reimagining the software development lifecycle with powerful new capabilities that redefine what’s possible with AI.

From streamlining enterprise workflows to accelerating scientific discovery, AI agents are transforming how developers build and how businesses operate.

15 million developers are using GitHub Copilot, with features like agent mode and code review handling repetitive tasks so they can focus on the fun, creative parts of software development.

Hundreds of thousands of customers are using Microsoft 365 Copilot to assist with research, brainstorming, and solution development, allowing for increased efficiency.

More than 230,000 organizations, including 90% of the Fortune 500, have used Microsoft Copilot Studio to build AI agents and automations to improve productivity and scale business quickly.

More than 11,000 AI models are now available through Azure AI Foundry, including Microsoft-hosted and partner-hosted models. This extensive library of AI models provides unparalleled resources for organizations to innovate and scale their AI-powered solutions.

In this edition of FYAI, a series where we dive deep on AI trends with Microsoft leaders, we hear from Yina Arenas, Vice President of Product, Azure AI Foundry, who is leading the work at Microsoft to empower every developer to shape the future with generative AI using breakthrough models and enterprise AI agents.

In this Q&A, Yina shares her insights on the shifting AI landscape, including why businesses are getting stuck in the “proof of concept” phase and how Azure AI Foundry can meet organizations where they are and take their AI projects to the next level.

What shifts in the AI landscape are you seeing that are fundamentally changing how people and organizations build and scale AI?

We’re seeing a profound shift from AI as a research experiment to AI as a core business capability. What’s exciting, and challenging, is that organizations are no longer just asking, “Can we build this?” but “How do we build this responsibly, at scale, and with real impact?” That shift requires new tools, new mindsets, and new ways of working across teams. At Microsoft, we’re focused on making AI more accessible and inclusive, so that everyone, from developers to domain experts, can contribute to building solutions that matter. It’s not just about the tech; it’s about empowering people to solve real problems with AI.

Why is it still so hard for businesses to move from experimentation to production with AI, and what needs to change to unlock that next wave of value?

Many organizations get stuck in the “proof of concept” phase because the leap to production is complex. It’s not just about selecting the right model; it’s about integrating it into systems, ensuring it’s secure and responsible, and aligning it with business goals. What’s missing is a cohesive, end-to-end approach that brings together the right tools, governance, and collaboration in a developer-friendly environment. That’s where Azure AI Foundry comes in: it’s designed to help teams not only move faster but do so thoughtfully by providing a cohesive end-to-end platform and offering traceability across prompts, models, and runtime behavior. We’re making it easier and less complex for developers to build apps while also giving business decision makers the ability to see how these apps perform, measure their ROI, and meet compliance requirements. To unlock the next wave of value, we need to make AI development more collaborative, transparent, and outcome-driven.

How does Azure AI Foundry help bridge that gap, and how is it different from other approaches out there?

Azure AI Foundry is built to meet organizations where they are, whether they’re just starting or scaling AI across the enterprise. It brings together the best of Microsoft’s AI capabilities from foundational models to orchestration and monitoring in a unified platform. What sets Azure AI Foundry apart is not only that it’s built on decades of world-class research but that it’s built with humans at the center, so whether you’re a data scientist, product manager, engineer, or business leader, our AI solutions work for you. It also bakes in responsible AI from the start by integrating tools, from testing to monitoring to governance, that support the entire life cycle.

Who is Azure AI Foundry built for, and how does it support cross-functional teams, from data scientists to decision-makers, to build together?

Azure AI Foundry is designed for anyone looking to take their AI projects to the next level, whether you’re part of a big enterprise, a startup, or a software development company. It offers access to the leading frontier models, integrates orchestration frameworks, supports open protocols for multi-agent collaboration, and provides native observability tooling, all within a secure, governed environment. Whether it’s optimizing call centers, analyzing data, improving product searches, or automating workflows, Azure AI Foundry pulls everything (models, tools, and agents) into one user-friendly platform. With tools like GitHub, Visual Studio, and Copilot Studio, Azure AI Foundry makes it easy for developers, data scientists, IT pros, and decision-makers to shorten the journey from idea to production.


Where are you seeing Azure AI Foundry already making an impact, and what kinds of transformation are customers unlocking?

As the central hub for building, orchestrating, and managing AI solutions, Azure AI Foundry remains the centerpiece of our AI platform strategy. It is now used by developers at more than 70,000 enterprises and software development companies, including Atomicwork, Epic, Fujitsu, Gainsight, H&R Block, and LG Electronics, to design, customize, and manage their AI apps and agents. And just six months in, more than 10,000 organizations have used Azure AI Foundry Agent Service to build, deploy, and scale their agents. Developers are designing agents that act, reason, take initiative, and deliver measurable business outcomes.

Heineken, for example, used Azure AI Foundry to build a multi-agent platform called “Hoppy” that helps employees access data and tools across the company in their native language. Their implementation has already saved thousands of hours, reducing tasks that once took 20 minutes to just 20 seconds.

Fujitsu evaluated Azure AI Foundry Agent Service to automate sales proposal creation. This boosted productivity by 67%, letting their teams focus on customer engagement. The AI agent integrates with existing Microsoft tools familiar to around 38,000 employees, retrieves dispersed knowledge, and lays the foundation for broader AI-powered innovation.

Draftwise, a digital native offering an AI-powered contract drafting and review platform, is using cutting-edge models in Azure AI Foundry (Cohere multimodal and Azure OpenAI reasoning models) to help streamline the contract drafting process by integrating with a lawyer’s document storage system.

What excites you most about what’s next, for Azure AI Foundry and for how people can reimagine the way they work and create with AI?

What excites me most about what’s next for Azure AI Foundry is how it’s unlocking a new era of creativity and empowerment, not just for developers, but for everyone. We’re moving beyond the idea of AI as a tool you use to AI as a copilot you build with. Azure AI Foundry is helping people imagine and create agents that understand their goals, adapt to their workflows, and evolve with their needs.

That shift—from writing code to orchestrating intelligence—is profound. It means that a product manager, a marketer, or a frontline worker can shape how AI works for them, without needing to be a machine learning expert. It’s about putting the power of AI into the hands of the many, not the few.

And what’s most inspiring is that we’re just getting started. The agents people are building today are solving real problems—automating complex processes, accelerating insights, and freeing up time for more meaningful work. But the agents of tomorrow? They’ll be collaborators in creativity, partners in problem-solving, and catalysts for innovation we haven’t even dreamed of yet.

That’s the future I see—and it’s being built right now, by people who are reimagining what’s possible with AI.

Through leaders like Yina Arenas, Microsoft’s vision for the future of AI is both inspiring and deeply human-centered. With platforms like Azure AI Foundry, we’re entering a new era where AI becomes not just a tool, but a true collaborator, empowering everyone, regardless of technical expertise, to innovate and solve real-world problems. With Azure AI Foundry, the potential of AI is being unlocked by developers everywhere, sparking a wave of transformation and boundless possibilities.

Interested in learning more? Here are a few resources:

Build your first production-grade AI agent in under an hour: Azure AI Foundry

Learn how Azure AI Foundry is supporting open Agent2Agent (A2A) protocol

Read Azure AI Foundry Agent Service documentation

Empower your team to grow their AI skills

FYAI: How agents will transform business and daily work
The post FYAI: How to leverage AI to reimagine cross-functional collaboration with Yina Arenas appeared first on Microsoft Azure Blog.
Source: Azure

GitHub scales on demand with Azure Functions

GitHub is the home of the world’s software developers, with more than 100 million developers and 420 million total repositories across the platform. To keep everything running smoothly and securely, GitHub collects a tremendous amount of data through an in-house pipeline made up of several components. But even though it was built for fault tolerance and scalability, the ongoing growth of GitHub led the company to reevaluate the pipeline to ensure it meets both current and future demands. 

“We had a scalability problem, currently, we collect about 700 terabytes a day of data, which is heavily used for detecting malicious behavior against our infrastructure and for troubleshooting. This internal system was limiting our growth.”

—Stephan Miehe, GitHub Senior Director of Platform Security

GitHub worked with its parent company, Microsoft, to find a solution. To process the event stream at scale, the GitHub team built a function app that runs in Azure Functions Flex Consumption, a plan recently released for public preview. Flex Consumption delivers fast and large scale-out features on a serverless model and supports long function execution times, private networking, instance size selection, and concurrency control.


In a recent test, GitHub sustained 1.6 million events per second using one Flex Consumption app triggered from a network-restricted event hub.

“What really matters to us is that the app scales up and down based on demand. Azure Functions Flex Consumption is very appealing to us because of how it dynamically scales based on the number of messages that are queued up in Azure Event Hubs.”

—Stephan Miehe, GitHub Senior Director of Platform Security

In a recent test, GitHub’s new function app processed 1.6 million messages per second in the Azure Functions Flex Consumption plan.

A look back

GitHub’s problem lay in an internal messaging app orchestrating the flow between the telemetry producers and consumers. The app was originally deployed using Java-based binaries and Azure Event Hubs. But as it began handling up to 460 gigabytes (GB) of events per day, the app was reaching its design limits, and its availability began to degrade.

For best performance, each consumer of the old platform required its own environment and time-consuming manual tuning. In addition, the Java codebase was prone to breakage and hard to troubleshoot, and those environments were getting expensive to maintain as the compute overhead grew.

“We couldn’t accept the risk and scalability challenges of the current solution,” Miehe says. He and his team began to weigh the alternatives. “We were already using Azure Event Hubs, so it made sense to explore other Azure services. Given the simple nature of our need—HTTP POST request—we wanted something serverless that carries minimal overhead.”

Familiar with serverless code development, the team focused on similar Azure-native solutions and arrived at Azure Functions.

“Both platforms are well known for being good for simple data crunching at large scale, but we don’t want to migrate to another product in six months because we’ve reached a ceiling.”

—Stephan Miehe, GitHub Senior Director of Platform Security

A function app can automatically scale the queue based on the amount of logging traffic. The question was how much it could scale. At the time GitHub began working with the Azure Functions team, the Flex Consumption plan had just entered private preview. Based on a new underlying architecture, Flex Consumption supports up to 1,000 partitions and provides a faster target-based scaling experience. The product team built a proof of concept that scaled to more than double the legacy platform’s largest topic at the time, showing that Flex Consumption could handle the pipeline.

“Azure Functions Flex Consumption gives us a serverless solution with 100% of the capacity we need now, plus all the headroom we need as we grow.”

—Stephan Miehe, GitHub Senior Director of Platform Security

Making a good solution great

GitHub joined the private preview and worked closely with the Azure Functions product team to see what else Flex Consumption could do. The new function app is written in Python to consume events from Event Hubs. It consolidates large batches of messages into one large message and sends it on to the consumers for processing.
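A minimal sketch of the consolidation step follows. It assumes that events arrive as raw byte payloads and that the consolidated message uses newline-delimited framing; neither detail comes from GitHub's implementation, and `consolidate_batch` is a hypothetical helper name.

```python
from typing import Iterable


def consolidate_batch(events: Iterable[bytes]) -> bytes:
    """Merge many small event payloads into one newline-delimited message.

    Hypothetical helper: the wire format GitHub uses is not described in the
    article, so newline-delimited framing is an assumption.
    """
    # Strip trailing newlines so each event occupies exactly one line.
    return b"\n".join(event.rstrip(b"\n") for event in events)


batch = [b'{"id": 1}', b'{"id": 2}\n', b'{"id": 3}']
payload = consolidate_batch(batch)
print(payload.decode())
```

Sending one consolidated payload per execution means the downstream consumers see one request per batch rather than one per event.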

Finding the right number for each batch took some experimentation, as every function execution has at least a small percentage of overhead. At peak usage times, the platform will process more than 1 million events per second. Knowing this, the GitHub team needed to find the sweet spot in function execution. Too high a number and there’s not enough memory to process the batch. Too small a number and it takes too many executions to process the batch and slows performance.

The right number proved to be 5,000 messages per batch. “Our execution times are already incredibly low—in the 100–200 millisecond range,” Miehe reports.
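These figures allow a rough capacity estimate. The sketch below assumes a steady rate of 1 million events per second, full batches of 5,000 messages, and the midpoint of the quoted 100 to 200 millisecond execution time. The concurrency estimate applies Little's law and is illustrative, not a measurement from GitHub.

```python
EVENTS_PER_SECOND = 1_000_000   # peak rate quoted in the article
BATCH_SIZE = 5_000              # messages per function execution
EXECUTION_SECONDS = 0.150       # assumed midpoint of the 100-200 ms range

# Each execution handles one full batch, so this many executions start per second.
executions_per_second = EVENTS_PER_SECOND / BATCH_SIZE

# Little's law: average work in flight = arrival rate x time in system.
concurrent_executions = executions_per_second * EXECUTION_SECONDS

print(f"Executions per second: {executions_per_second:.0f}")
print(f"Average executions in flight: {concurrent_executions:.0f}")
```

Under these assumptions the platform starts about 200 executions per second, with about 30 in flight at any moment. A smaller batch size raises both numbers proportionally, which is where the per-execution overhead starts to dominate.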

This solution has built-in flexibility. The team can vary the number of messages per batch for different use cases and can trust that the target-based scaling capabilities will scale out to the ideal number of instances. In this scaling model, Azure Functions determines the number of unprocessed messages on the event hub and then immediately scales to an appropriate instance count based on the batch size and partition count. At the upper bound, the function app scales up to one instance per event hub partition, which can work out to be 1,000 instances for very large event hub deployments.
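The scaling rule described above can be approximated as follows. This is a simplified sketch, not the Azure Functions scaling algorithm itself: the function name and the exact rounding are assumptions, and only the inputs (unprocessed messages, batch size, partition count) and the one-instance-per-partition cap come from the description above.

```python
import math


def target_instance_count(unprocessed: int, batch_size: int, partitions: int) -> int:
    """Estimate how many instances a target-based scaler would request."""
    if batch_size <= 0 or partitions <= 0:
        raise ValueError("batch_size and partitions must be positive")
    if unprocessed <= 0:
        return 0  # nothing queued, so the app can scale to zero
    # One instance per outstanding batch...
    wanted = math.ceil(unprocessed / batch_size)
    # ...capped at one instance per event hub partition.
    return min(wanted, partitions)


# A small backlog needs only a few instances.
small = target_instance_count(unprocessed=12_000, batch_size=5_000, partitions=32)
# A very large backlog hits the partition cap.
large = target_instance_count(unprocessed=50_000_000, batch_size=5_000, partitions=1_000)
print(small, large)
```

The cap in the last step is why the partition count bounds throughput: once every partition has an instance, a larger backlog cannot add more consumers.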

“If other customers want to do something similar and trigger a function app from Event Hubs, they need to be very deliberate in the number of partitions to use based on the size of their workload, if you don’t have enough, you’ll constrain consumption.”

—Stephan Miehe, GitHub Senior Director of Platform Security

Azure Functions supports several event sources in addition to Event Hubs, including Apache Kafka, Azure Cosmos DB, Azure Service Bus queues and topics, and Azure Queue Storage.

Reaching behind the virtual network

The function as a service model frees developers from the overhead of managing many infrastructure-related tasks. But even serverless code can be constrained by the limitations of the networks where it runs. Flex Consumption addresses the issue with improved virtual network (VNet) support. Function apps can be secured behind a VNet and can reach other services secured behind a VNet—without degrading performance.

As an early adopter of Flex Consumption, GitHub benefited from improvements being made behind the scenes to the Azure Functions platform. Flex Consumption runs on Legion, a newly architected, internal platform as a service (PaaS) backbone that improves network capabilities and performance for high-demand scenarios. For example, Legion is capable of injecting compute into an existing VNet in milliseconds: when a function app scales up, each new compute instance that is allocated starts up and is ready for execution, including outbound VNet connectivity, within 624 milliseconds (ms) at the 50th percentile and 1,022 ms at the 90th percentile. That’s how GitHub’s message processing app can reach Event Hubs secured behind a virtual network without incurring significant delays. In the past 18 months, the Azure Functions platform has reduced cold start latency by approximately 53% across all regions and for all supported languages and platforms.

Working through challenges

This project pushed the boundaries for both the GitHub and Azure Functions engineering teams. Together, they worked through several challenges to achieve this level of throughput:

In the first test run, GitHub had so many messages pending for processing that it caused an integer overflow in the Azure Functions scaling logic, which was immediately fixed.

In the second run, throughput was severely limited due to a lack of connection pooling. The team rewrote the function code to correctly reuse connections from one execution to the next.

At about 800,000 events per second, the system appeared to be throttled at the network level, but the cause was unclear. After weeks of investigation, the Azure Functions team found a bug in the receive buffer configuration in the Azure SDK Advanced Message Queuing Protocol (AMQP) transport implementation. This was promptly fixed by the Azure SDK team and allowed GitHub to push beyond 1 million events per second.
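
The connection-reuse fix from the second run follows a common pattern: create the expensive client once per process and share it across executions. The sketch below is illustrative only and is not GitHub's code. `DownstreamConnection`, `get_connection`, `handle_batch`, and the endpoint string are all hypothetical stand-ins for a real network client.

```python
import threading
from typing import Iterable, Optional


class DownstreamConnection:
    """Hypothetical stand-in for a client that is expensive to create."""

    created = 0  # counts how many connections have been opened

    def __init__(self, endpoint: str) -> None:
        DownstreamConnection.created += 1
        self.endpoint = endpoint

    def send(self, payload: bytes) -> int:
        # A real client would write to the network; here we return the size.
        return len(payload)


_connection: Optional[DownstreamConnection] = None
_lock = threading.Lock()


def get_connection(endpoint: str = "example.invalid:5671") -> DownstreamConnection:
    """Return the process-wide connection, creating it on first use only."""
    global _connection
    if _connection is None:
        with _lock:
            if _connection is None:  # re-check once the lock is held
                _connection = DownstreamConnection(endpoint)
    return _connection


def handle_batch(events: Iterable[bytes]) -> int:
    """One function execution: reuse the shared connection for every event."""
    conn = get_connection()
    return sum(conn.send(event) for event in events)


if __name__ == "__main__":
    handle_batch([b"a", b"bc"])
    handle_batch([b"def"])
    print("connections opened:", DownstreamConnection.created)
```

Because the connection lives at module scope, a warm instance pays the setup cost once instead of on every execution.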

Best practices in meeting a throughput milestone

With more power comes more responsibility, and Miehe acknowledges that Flex Consumption gave his team “a lot of knobs to turn,” as he put it. “There’s a balance between flexibility and the effort you have to put in to set it up right.”

To that end, he recommends testing early and often, a familiar part of the GitHub pull request culture. The following best practices helped GitHub meet its milestones:

Batch it if you can: Receiving messages in batches boosts performance. Processing thousands of event hub messages in a single function execution significantly improves the system throughput.

Experiment with batch size: Miehe’s team tested batches as large as 100,000 events and as small as 100 before landing on 5,000 as the max batch size for fastest execution.

Automate your pipelines: GitHub uses Terraform to build the function app and the Event Hubs instances. Provisioning both components together reduces the amount of manual intervention needed to manage the ingestion pipeline. Plus, Miehe’s team could iterate incredibly quickly in response to feedback from the product team.
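
To make the batching advice concrete, here is a minimal sketch of splitting a stream of events into bounded batches. It illustrates the idea rather than how the Event Hubs trigger delivers batches. The helper name `batches` is hypothetical, and the 5,000 default mirrors the maximum batch size mentioned above.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(events: Iterable[T], max_batch_size: int = 5000) -> Iterator[List[T]]:
    """Yield successive lists of at most max_batch_size events."""
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, max_batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    sizes = [len(batch) for batch in batches(range(12_000))]
    # Per-execution overhead is paid once per batch, not once per event.
    print(len(sizes), "executions for 12,000 events:", sizes)
```

Making `max_batch_size` a parameter also makes it easy to run the kind of batch-size experiment described above.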

The GitHub team continues to run the new platform in parallel with the legacy solution while it monitors performance and determines a cutover date. 

“We’ve been running them side by side deliberately to find where the ceiling is,” Miehe explains.

The team was delighted. As Miehe says, “We’re pleased with the results and will soon be sunsetting all the operational overhead of the old solution.”

Explore solutions with Azure Functions

Azure Functions Flex Consumption

Azure Functions

The post GitHub scales on demand with Azure Functions appeared first on Azure Blog.
Source: Azure

Elevate your AI deployments more efficiently with new deployment and cost management solutions for Azure OpenAI Service including self-service Provisioned

We’re excited to announce significant updates for Azure OpenAI Service, designed to help our more than 60,000 customers manage AI deployments more efficiently and cost-effectively than current pricing options allow. With the introduction of self-service Provisioned deployments, we aim to help make your quota and deployment processes more agile, faster to market, and more economical. The technical value proposition remains unchanged: Provisioned deployments continue to be the best option for latency-sensitive and high-throughput applications. Today’s announcement includes self-service provisioning, visibility into service capacity and availability, and the introduction of Provisioned (PTU) hourly pricing and reservations to help with cost management and savings.

Azure OpenAI Service deployment and cost management solutions walkthrough

What’s new? 

Self-Service Provisioning and Model Independent Quota Requests 

We are introducing self-service provisioning alongside standard tokens, allowing you to request Provisioned Throughput Units (PTUs) more flexibly and efficiently. This new feature empowers you to manage your Azure OpenAI Service quota and deployments independently without relying on support from your account team. By decoupling quota requests from specific models, you can now allocate resources based on your immediate needs and adjust as your requirements evolve. This change simplifies the process and accelerates your ability to deploy and scale your applications.

Visibility to service capacity and availability

Gain better visibility into service capacity and availability, helping you make informed decisions about your deployments. With this new feature, you can access real-time information about service capacity in different regions, ensuring that you can plan and manage your deployments more effectively. This transparency allows you to avoid potential capacity issues and optimize the distribution of your workloads across available resources, leading to improved performance and reliability for your applications. 

Provisioned hourly pricing and reservations 

We are excited to introduce two new self-service purchasing options for PTUs: 

Hourly no-commitment purchasing 

You can now create a Provisioned deployment for as little as an hour, with a flat hourly rate of $2 per unit per hour. This model-independent pricing makes it easy to deploy and tear down deployments as needed, offering maximum flexibility. This is ideal for testing scenarios or transitional periods without any long-term commitment. 

Monthly and yearly Azure reservations for Provisioned deployments

For production environments with steady request volumes, Azure OpenAI Service Provisioned Reservations offer significant cost savings. By committing to a monthly or yearly reservation, you can save up to 82% or 85%, respectively, over hourly rates. Reservations are now decoupled from specific models and deployments, providing unmatched flexibility. This approach allows enterprises to optimize costs while maintaining the ability to switch models and adjust deployments as needed. Read our technical blog on Reservations here.
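
As a rough illustration of what these figures imply, the sketch below compares hourly and reserved pricing. The $2 rate and the 82% and 85% figures come from the text above and are treated as best-case ("up to") savings. The 730-hour month and the 100-PTU deployment are assumptions chosen only for the example, and the function names are hypothetical.

```python
from decimal import Decimal

HOURLY_RATE = Decimal("2")            # USD per PTU per hour, from the text
MAX_MONTHLY_SAVING = Decimal("0.82")  # "up to" saving with a monthly reservation
MAX_YEARLY_SAVING = Decimal("0.85")   # "up to" saving with a yearly reservation
HOURS_PER_MONTH = Decimal("730")      # assumption: average hours in a month


def effective_hourly_rate(max_saving: Decimal) -> Decimal:
    """Best-case hourly rate per PTU after applying a reservation saving."""
    return HOURLY_RATE * (1 - max_saving)


def monthly_cost(ptus: int, rate_per_hour: Decimal) -> Decimal:
    """Cost in USD of running `ptus` units for one assumed 730-hour month."""
    return Decimal(ptus) * rate_per_hour * HOURS_PER_MONTH


if __name__ == "__main__":
    ptus = 100  # hypothetical deployment size
    print("hourly, no commitment:", monthly_cost(ptus, HOURLY_RATE))
    print("monthly reservation  :", monthly_cost(ptus, effective_hourly_rate(MAX_MONTHLY_SAVING)))
    print("yearly reservation   :", monthly_cost(ptus, effective_hourly_rate(MAX_YEARLY_SAVING)))
```

Under these assumptions, a month of 100 PTUs costs about $146,000 at the hourly rate, compared with a best case of roughly $26,280 on a monthly reservation and $21,900 on a yearly one. Actual savings depend on the reservation terms that apply to you.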

Benefits for decision makers 

These updates are designed to provide flexibility, cost efficiency, and ease of use, making it simpler for decision-makers to manage AI deployments. 

Flexibility: With self-service provisioning and hourly pricing, you can scale your deployments up or down based on immediate needs without long-term commitments. 

Cost efficiency: Azure Reservations offer substantial savings for long-term use, enabling better budget planning and cost management. 

Ease of use: Enhanced visibility and simplified provisioning processes reduce administrative burdens, allowing your team to focus on strategic initiatives rather than operational details. 

Customer success stories 

Before we made self-service available, select customers had already begun to benefit from these options.

Visier Solutions: By leveraging Provisioned Throughput Units (PTUs) with Azure OpenAI Service, Visier Solutions has significantly enhanced their AI-powered people analytics tool, Vee. With PTUs, Visier guarantees rapid, consistent response times, crucial for handling the high volume of queries from their extensive customer base. This powerful synergy between Visier’s innovative solutions and Azure’s robust infrastructure not only boosts customer satisfaction by delivering swift and accurate insights but also underscores Visier’s commitment to using cutting-edge technology to drive transformational change in workforce analytics. Read the case study on Microsoft. 

An analytics and insights company: Switched from Standard Deployments to GPT-4 Turbo PTUs and experienced a significant reduction in response times, from 10–20 seconds to just 2–3 seconds. 

A Chatbot Services company: Reported improved stability and lower latency with Azure PTUs, enhancing the performance of their services. 

A visual entertainment company: Noted a drastic latency improvement, from 12–13 seconds down to 2–3 seconds, enhancing user engagement. 

Empowering all customers to build with Azure OpenAI Service

These new updates do not alter the technical excellence of Provisioned deployments, which continue to deliver low and predictable latency. Instead, they introduce a more flexible and cost-effective procurement model, making Azure OpenAI Service more accessible than ever. With self-service Provisioned, model-independent units, and both hourly and reserved pricing options, the barriers to entry have been drastically lowered. 

To learn more about enhancing the reliability, security, and performance of your cloud and AI investments, explore the additional resources below.

Additional Resources 

Azure Pricing Provisioned Reservations

Azure OpenAI Service Pricing 

More details about Provisioned

Documentation for On-Boarding 

PTU Calculator in Azure AI Studio 

Unveiling Azure OpenAI Service Provisioned reservations blog

The post Elevate your AI deployments more efficiently with new deployment and cost management solutions for Azure OpenAI Service including self-service Provisioned appeared first on Azure Blog.
Source: Azure

Announcing mandatory multi-factor authentication for Azure sign-in

Learn how multifactor authentication (MFA) can protect your data and identity and get ready for Azure’s upcoming MFA requirement. 

As cyberattacks become increasingly frequent, sophisticated, and damaging, safeguarding your digital assets has never been more critical. As part of Microsoft’s $20 billion investment in security over the next five years and our commitment to enhancing security in our services in 2024, we are introducing mandatory multifactor authentication (MFA) for all Azure sign-ins.

The need for enhanced security

One of the pillars of Microsoft’s Secure Future Initiative (SFI) is dedicated to protecting identities and secrets—we want to reduce the risk of unauthorized access by implementing and enforcing best-in-class standards across all identity and secrets infrastructure, and user and application authentication and authorization. As part of this important priority, we are taking the following actions:

Protect identity infrastructure signing and platform keys with rapid and automatic rotation with hardware storage and protection (for example, hardware security module (HSM) and confidential compute).

Strengthen identity standards and drive their adoption through use of standard SDKs across 100% of applications.

Ensure 100% of user accounts are protected with securely managed, phishing-resistant multifactor authentication.

Ensure 100% of applications are protected with system-managed credentials (for example, Managed Identity and Managed Certificates).

Ensure 100% of identity tokens are protected with stateful and durable validation.

Adopt more fine-grained partitioning of identity signing keys and platform keys.

Ensure identity and public key infrastructure (PKI) systems are ready for a post-quantum cryptography world.

Ensuring Azure accounts are protected with securely managed, phishing-resistant multifactor authentication is a key action we are taking. Recent research by Microsoft shows that multifactor authentication (MFA) can block more than 99.2% of account compromise attacks, making it one of the most effective security measures available. Today’s announcement brings us all one step closer toward a more secure future.

In May 2024, we talked about implementing automatic enforcement of multifactor authentication by default across more than one million Microsoft Entra ID tenants within Microsoft, including tenants for development, testing, demos, and production. We are extending this best practice of enforcing MFA to our customers by making it required to access Azure. In doing so, we will not only reduce the risk of account compromise and data breach for our customers, but also help organizations comply with several security standards and regulations, such as Payment Card Industry Data Security Standard (PCI DSS), Health Insurance Portability and Accountability Act (HIPAA), General Data Protection Regulation (GDPR), and National Institute of Standards and Technology (NIST).

Preparing for mandatory Azure MFA

Required MFA for all Azure users will be rolled out in phases starting in the second half of calendar year 2024 to give our customers time to plan their implementation:

Phase 1: Starting in October, MFA will be required to sign in to the Azure portal, Microsoft Entra admin center, and Intune admin center. The enforcement will gradually roll out to all tenants worldwide. This phase will not impact other Azure clients such as Azure Command Line Interface (CLI), Azure PowerShell, Azure mobile app, and Infrastructure as Code (IaC) tools.

Phase 2: Beginning in early 2025, gradual enforcement for MFA at sign-in for Azure CLI, Azure PowerShell, Azure mobile app, and Infrastructure as Code (IaC) tools will commence.

Beginning today, Microsoft will send a 60-day advance notice to all Entra global admins by email and through Azure Service Health Notifications to inform them of the enforcement start date and the actions required. Additional notifications will be sent through the Azure portal, Entra admin center, and the M365 message center.

For customers with complex environments or technical barriers who need additional time to prepare for mandatory Azure MFA, Microsoft will review requests for extended timeframes.

How to use Microsoft Entra for flexible MFA

Organizations have multiple ways to enable their users to utilize MFA through Microsoft Entra:

Microsoft Authenticator allows users to approve sign-ins from a mobile app using push notifications, biometrics, or one-time passcodes. Augment or replace passwords with two-step verification and boost the security of your accounts from your mobile device.

FIDO2 security keys let users sign in without a username or password by using an external USB, near-field communication (NFC), or other external security key that supports Fast Identity Online (FIDO) standards.

Certificate-based authentication enforces phishing-resistant MFA using personal identity verification (PIV) and common access card (CAC). Authenticate using X.509 certificates on smart cards or devices directly against Microsoft Entra ID for browser and application sign-in.

Passkeys allow for phishing-resistant authentication using Microsoft Authenticator.

Finally, you can also use SMS or voice approval, as described in this documentation, although this is the least secure form of MFA.

External multifactor authentication solutions and federated identity providers will continue to be supported and will meet the MFA requirement if they are configured to send an MFA claim.

Moving forward

At Microsoft, your security is our top priority. By enforcing MFA for Azure sign-ins, we aim to provide you with the best protection against cyber threats. We appreciate your cooperation and commitment to enhancing the security of your Azure resources.

Our goal is to deliver a low-friction experience for legitimate customers while ensuring robust security measures are in place. We encourage all customers to begin planning for compliance as soon as possible to avoid any business interruptions. 

Start today! For additional details on implementation, impacted accounts, and next steps for you, please refer to this documentation.
The post Announcing mandatory multi-factor authentication for Azure sign-in appeared first on Azure Blog.
Source: Azure