New Project Flash Update: Advancing Azure Virtual Machine availability monitoring

“Earlier this year, we introduced Project Flash in the Advancing Reliability blog series to reaffirm our commitment to empowering Azure customers to monitor virtual machine (VM) availability in a robust and comprehensive manner. Today, we’re excited to share the progress we’ve made since then in developing holistic monitoring offerings to meet customers’ distinct needs. I’ve asked Senior Technical Program Manager, Pujitha Desiraju, from the Azure Core Production Quality Engineering team to share the latest investments as part of Project Flash, to deliver the best monitoring experience for customers.”—Mark Russinovich, CTO, Azure.

Flash, as the project is known internally, is a collection of efforts across Azure engineering that aims to evolve Azure’s virtual machine (VM) availability monitoring ecosystem into a centralized, holistic, and intelligible solution customers can rely on to meet their specific observability needs. As part of this multi-year endeavor, we’re excited to announce the following:

General availability of VM availability information in Azure Resource Graph for efficient, at-scale monitoring, convenient for detailed downtime investigations and impact assessment.
Public preview of a VM availability metric in Azure Monitor for quick debugging, trend analysis of VM availability over time, and threshold-based alerts on scenarios that impact workload performance.
Preview of VM availability status change events via Azure Event Grid for instantaneous notifications on critical changes in VM availability, enabling quick remediation actions to prevent end-user impact.

We remain committed to maintaining data consistency and the same rigorous quality standards across all the monitoring solutions that are part of Flash, including existing solutions like Resource Health and the Activity Log, so we deliver a consistent and cohesive experience to customers.

VM availability information in Azure Resource Graph for at-scale analysis

In addition to the VM availability states already flowing into Azure Resource Graph (ARG), we recently published VM health annotations to ARG for detailed failure attribution and downtime analysis, and enabled a 14-day change-tracking mechanism to trace historical changes in VM availability for quick debugging. With these new additions, we’re excited to announce the general availability of VM availability information in the HealthResources dataset in ARG! With this offering, users can:

Efficiently query the latest snapshot of VM availability across all Azure subscriptions at once and at low latencies for periodic and fleetwide monitoring.
Accurately assess the impact to fleetwide business SLAs and quickly trigger decisive mitigation actions in response to disruptions, based on the type of failure signature.
Set up custom dashboards to supervise the comprehensive health of applications by joining VM availability information with additional resource metadata present in ARG.
Track relevant changes in VM availability across a rolling 14-day window, by using the change-tracking mechanism for conducting detailed investigations.

Getting started

Users can query ARG via PowerShell, the REST API, the Azure CLI, or the Azure Portal. The following steps detail how the data can be accessed from the Azure Portal.
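Outside the portal, the same data can be queried programmatically. As a sketch (modeled on the published starter queries, using the properties shown later in this article), the following Resource Graph query counts VMs by their latest availability state; it can be pasted into Resource Graph Explorer or passed to the Azure CLI via az graph query -q "...":

```kusto
// Count VMs by latest availability state (illustrative starter query)
healthresources
| where type =~ 'microsoft.resourcehealth/availabilitystatuses'
| summarize count() by availabilityState = tostring(properties.availabilityState)
```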

Once on the Azure Portal, navigate to Resource Graph Explorer, which looks like the image below:

Figure 1: Azure Resource Graph Explorer landing page on Azure Portal.

Select the Table tab and click the HealthResources table to retrieve the latest snapshot of VM availability information (availability states and health annotations).

Figure 2: Azure Resource Graph Explorer Window depicting the latest VM availability states and VM health annotations in the HealthResources table.

There will be two types of events populated in the HealthResources table:

Figure 3: Snapshot of the type of events present in the HealthResources table, as shown in Resource Graph Explorer on the Azure Portal.

microsoft.resourcehealth/availabilitystatuses

This event denotes the latest availability status of a VM, based on the health checks performed by the underlying Azure platform. Below are the availability states we currently emit for VMs:

Available: The VM is up and running as expected.
Unavailable: We’ve detected disruptions to the normal functioning of the VM and therefore applications will not run as expected.
Unknown: The platform is unable to accurately detect the health of the VM. Users can usually check back in a few minutes for an updated state.

To poll the latest VM availability state, refer to the properties field, which contains the following details:

Sample

{
  "targetResourceType": "Microsoft.Compute/virtualMachines",
  "previousAvailabilityState": "Available",
  "targetResourceId": "/subscriptions/<subscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Compute/virtualMachines/<VMName>",
  "occurredTime": "2022-10-11T11:13:59.9570000Z",
  "availabilityState": "Unavailable"
}
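A downstream consumer might, for example, flag any record whose state has left Available. The following minimal Python sketch shows the idea; field names are taken from the sample above, while the check itself is illustrative and not part of the platform:

```python
import json

def needs_attention(properties: dict) -> bool:
    """Return True when the reported availability state is not 'Available'."""
    return properties.get("availabilityState") != "Available"

# The properties payload from the sample above (placeholder IDs shortened).
sample = json.loads("""
{
  "targetResourceType": "Microsoft.Compute/virtualMachines",
  "previousAvailabilityState": "Available",
  "targetResourceId": "/subscriptions/sub/resourceGroups/rg/providers/Microsoft.Compute/virtualMachines/vm1",
  "occurredTime": "2022-10-11T11:13:59.9570000Z",
  "availabilityState": "Unavailable"
}
""")

if needs_attention(sample):
    print(f"{sample['targetResourceId']}: "
          f"{sample['previousAvailabilityState']} -> {sample['availabilityState']}")
```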

Property descriptions

Field | Description | Corresponding RHC field
targetResourceType | Type of resource for which health data is flowing | resourceType
targetResourceId | Resource Id | resourceId
occurredTime | Timestamp when the latest availability state is emitted by the platform | eventTimestamp
previousAvailabilityState | Previous availability state of the VM | previousHealthStatus
availabilityState | Current availability state of the VM | currentHealthStatus

Refer to this doc for a list of starter queries to further explore this data.

microsoft.resourcehealth/resourceannotations (NEWLY ADDED)

This event contextualizes any changes to VM availability, by detailing necessary failure attributes to help users investigate and mitigate the disruption as needed. See the full list of VM health annotations emitted by the platform.
These annotations can be broadly classified into three buckets:

Downtime Annotations: These annotations are emitted when the platform detects VM availability transitioning to Unavailable. (For example, during unexpected host crashes, rebootful repair operations).
Informational Annotations: These annotations are emitted during control plane activities with no impact to VM availability. (Such as VM allocation/Stop/Delete/Start). Usually, no additional customer action is required in response.
Degraded Annotations: These annotations are emitted when VM availability is detected to be at risk. (For example, when failure prediction models predict a degraded hardware component that can cause the VM to reboot at any given time). We strongly urge users to redeploy by the deadline specified in the annotation message, to avoid any unanticipated loss of data or downtime.
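These buckets lend themselves to simple routing logic in downstream tooling. The Python sketch below is illustrative only: the annotation-to-bucket mapping is an assumption for demonstration (VirtualMachineHostRebootedForRepair appears in the sample later in this article; the other names are hypothetical), and the actions mirror the guidance above:

```python
DOWNTIME = "Downtime"
INFORMATIONAL = "Informational"
DEGRADED = "Degraded"

# Hypothetical mapping of annotation names to the three buckets described
# above; the platform itself determines how each annotation is classified.
BUCKETS = {
    "VirtualMachineHostRebootedForRepair": DOWNTIME,
    "VirtualMachineStarted": INFORMATIONAL,
    "VirtualMachinePossiblyDegradedDueToHardwareFailure": DEGRADED,
}

def recommended_action(annotation_name: str) -> str:
    """Suggest a response based on the annotation's bucket."""
    bucket = BUCKETS.get(annotation_name)
    if bucket == DEGRADED:
        return "Redeploy the VM before the deadline in the annotation message."
    if bucket == DOWNTIME:
        return "Investigate impact; trigger mitigation if SLAs are at risk."
    return "No customer action usually required."
```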

To poll the associated VM health annotations for a resource, if any, refer to the properties field which contains the following details:

Sample

{
  "targetResourceType": "Microsoft.Compute/virtualMachines",
  "targetResourceId": "/subscriptions/<subscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Compute/virtualMachines/<VMName>",
  "annotationName": "VirtualMachineHostRebootedForRepair",
  "occurredTime": "2022-09-25T20:21:37.5280000Z",
  "category": "Unplanned",
  "summary": "We're sorry, your virtual machine isn't available because of an unexpected failure on the host server. Azure has begun the auto-recovery process and is currently rebooting the host server. No additional action is required from you at this time. The virtual machine will be back online after the reboot completes.",
  "context": "Platform Initiated",
  "reason": "Unexpected host failure"
}

Property descriptions

Field | Description | Corresponding RHC field
targetResourceType | Type of resource for which health data is flowing | resourceType
targetResourceId | Resource Id | resourceId
occurredTime | Timestamp when the annotation is emitted by the platform | eventTimestamp
annotationName | Name of the annotation emitted | eventName
reason | Brief overview of the availability impact observed by the customer | title
category | Denotes whether the platform activity triggering the annotation was planned maintenance or unplanned repair; not applicable to customer/VM-initiated events. Possible values: Planned, Unplanned, Not Applicable, Null | category
context | Denotes whether the activity triggering the annotation was due to an authorized user or process (user-initiated), the Azure platform (platform-initiated), or activity in the guest OS that resulted in availability impact (VM-initiated). Possible values: Platform-initiated, User-initiated, VM-initiated, Not Applicable, Null | context
summary | Statement detailing the cause for annotation emission, along with remediation steps that users can take | summary

Refer to this doc for a list of starter queries to further explore this data.

Looking ahead to 2023, we have multiple enhancements planned for the annotation metadata that is surfaced in the HealthResources dataset. These enrichments will give users access to richer failure attributes to decisively prepare a response to a disruption. In parallel, we aim to extend the duration of historical lookback to a minimum of 30 days so users can comprehensively track past changes in VM availability.

VM availability metric in Azure Monitor Preview

We’re excited to share that the out-of-box VM availability metric is now available as a public preview for all users! This metric displays the trend of VM availability over time, so users can:
Set up threshold-based metric alerts on dipping VM availability to quickly trigger appropriate mitigation actions.
Correlate the VM availability metric with existing platform metrics like memory, network, or disk for deeper insights into concerning changes that impact the overall performance of workloads.
Easily interact with and chart metric data during any relevant time window on Metrics Explorer, for quick and easy debugging.
Route metrics to downstream tooling like Grafana dashboards, for constructing custom visualizations and dashboards.

Getting started

Users can either consume the metric programmatically via the Azure Monitor REST API or directly from the Azure Portal. The following steps highlight metric consumption from the Azure Portal.

Once on the Azure Portal, navigate to the VM overview blade. The new metric will display as VM Availability (Preview), along with other platform metrics under the Monitoring tab.

Figure 4: View the newly added VM Availability Metric on the VM overview page on Azure Portal.

Select (single click) the VM availability metric chart on the overview page, to navigate to Metrics Explorer for further analysis.

Figure 5: View the newly added VM availability Metric on Metrics Explorer on Azure Portal.

Metric description:

Display Name | VM Availability (preview)
Metric Values | 1 during expected behavior (VM in the Available state); 0 when the VM is impacted by rebootful disruptions (VM in the Unavailable state); NULL, shown as a dotted or dashed line on charts, when the Azure service emitting the metric is down or unaware of the exact status of the VM (VM in the Unknown state).
Aggregation | The default aggregation is Average, for prioritized investigations based on the extent of downtime incurred. Other available aggregations: Min, to immediately pinpoint all the times the VM was unavailable; Max, to immediately pinpoint all the times the VM was available. Refer here for more details on chart range, granularity, and data aggregation.
Data Retention | Data for the VM availability metric is stored for 93 days to assist in trend analysis and historical lookback.
Pricing | Refer to the pricing breakdown, specifically the “Metrics” and “Alert Rules” sections.
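To make the aggregation semantics concrete, here is a small Python sketch of how Average, Min, and Max behave over a window of samples, assuming NULL (None) points are excluded from aggregation as they are in Azure Monitor charts:

```python
def aggregate(samples):
    """Aggregate availability samples (1, 0, or None) the way a metrics
    chart would: NULL samples are ignored rather than counted as downtime."""
    known = [s for s in samples if s is not None]
    if not known:
        return {"avg": None, "min": None, "max": None}
    return {
        "avg": sum(known) / len(known),
        "min": min(known),  # 0 if the VM was unavailable at any point
        "max": max(known),  # 1 if the VM was available at any point
    }

# One hour of per-minute samples with a 6-minute outage and 2 unknown points.
window = [1] * 40 + [0] * 6 + [None] * 2 + [1] * 12
print(aggregate(window))  # avg = 52/58, roughly 90% availability
```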

Looking ahead to 2023, we plan to include impact details (user vs platform initiated, planned vs unplanned) as dimensions to the metric, so users are well equipped to interpret dips, and set up much more targeted metric alerts. With the emission of dimensions in 2023, we also anticipate transitioning the offering to a general availability status.

Introducing instantaneous notifications on changes in VM availability via Event Grid

We’re thrilled to introduce our latest monitoring offering—the private preview of VM availability status change events in an Event Grid System Topic, which uses the low-latency technology of Azure Event Grid! Users can now subscribe to the system topic and route these events to their downstream tooling using any of the available event handlers (such as Azure Functions, Logic Apps, Event Hubs, and Storage queues). This solution uses an event-driven architecture to communicate scoped changes in VM availability to end users in less than five seconds from the disruption occurrence. This empowers users to take instantaneous mitigation actions to prevent end user impact.

As part of the private preview, we’ll emit events scoped to changes in VM availability states, with the sample schema below:

Sample

{
  "id": "4c70abbc-4aeb-4cac-b0eb-ccf06c7cd102",
  "topic": "/subscriptions/<subscriptionId>",
  "subject": "/subscriptions/<subscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Compute/virtualMachines/<VMName>/providers/Microsoft.ResourceHealth/AvailabilityStatuses/current",
  "data": {
    "resourceInfo": {
      "id": "/subscriptions/<subscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Compute/virtualMachines/<VMName>/providers/Microsoft.ResourceHealth/AvailabilityStatuses/current",
      "properties": {
        "targetResourceId": "/subscriptions/<subscriptionId>/resourceGroups/<ResourceGroupName>/providers/Microsoft.Compute/virtualMachines/<VMName>",
        "targetResourceType": "Microsoft.Compute/virtualMachines",
        "occurredTime": "2022-09-25T20:21:37.5280000Z",
        "previousAvailabilityState": "Available",
        "availabilityState": "Unavailable"
      }
    },
    "apiVersion": "2020-09-01"
  },
  "eventType": "Microsoft.ResourceNotifications.HealthResources.AvailabilityStatusesChanged",
  "dataVersion": "1",
  "metadataVersion": "1",
  "eventTime": "2022-09-25T20:21:37.5280000Z"
}

The properties field is fully consistent with the microsoft.resourcehealth/availabilitystatuses event in ARG. The Event Grid solution offers near-real-time alerting on the data present in ARG.
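As an illustration of the event-driven flow, the following Python sketch filters incoming events for transitions into the Unavailable state, much as an Azure Function bound to the system topic might; the remediation hook is hypothetical:

```python
EVENT_TYPE = (
    "Microsoft.ResourceNotifications.HealthResources.AvailabilityStatusesChanged"
)

def vm_became_unavailable(event: dict) -> bool:
    """True when the event reports a VM transitioning to Unavailable."""
    if event.get("eventType") != EVENT_TYPE:
        return False
    props = event["data"]["resourceInfo"]["properties"]
    return props.get("availabilityState") == "Unavailable"

def handle(event: dict) -> None:
    """Entry point an event handler (e.g., an Azure Function) might use."""
    if vm_became_unavailable(event):
        vm_id = event["data"]["resourceInfo"]["properties"]["targetResourceId"]
        trigger_remediation(vm_id)

def trigger_remediation(vm_id: str) -> None:
    # Hypothetical hook: page on-call, restart the workload, shift traffic, etc.
    print(f"Remediating {vm_id}")
```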

We’re currently releasing the preview to a small subset of users to rigorously test the solution and collect iterative feedback. This approach enables us to preview, and then announce the general availability of, a high-quality and well-rounded offering in 2023. As we look toward the general availability of this solution, users can expect to receive events when annotations and automated RCAs are emitted by the platform.

What’s next?

We’ll be heavily focused on strengthening our monitoring platform to continuously improve the experience for customers, based on ongoing feedback collected from the community (such as aggregated VMSS health inaccurately showing degraded, a VM being unavailable for 15 minutes, and missing VM downtimes in the Activity Log). By streamlining our internal message pipeline, we aim to not only improve data quality, but also maintain data consistency across our offerings and expand the scope of failure scenarios surfaced.

Introducing Degraded VM Availability state

In light of our upcoming efforts to centralize our monitoring architecture, we’ll be well positioned to introduce a Degraded availability state for virtual machines in 2023. This state will be extremely useful for setting up targeted alerts on predicted hardware failure scenarios where there is imminent risk to VM availability. It will also allow users to efficiently track cases of degraded hardware or software failures that require redeployment, which today do not cause a corresponding change in VM availability. We will also aim to emit reminder annotations for the duration a VM is marked Degraded, to prevent users from overlooking the request to redeploy.

Expand scope of failure attribution to include application freeze events

In 2023, we plan to expand our scope of failure attribution and emission to also include application freeze events, such as those caused by network agent updates, host OS updates lasting thirty seconds, and freeze-causing repair operations. This will ensure users have enhanced visibility into freeze impact, and it will be applied across our monitoring offerings, including Resource Health and the Activity Log.

Learn More

Please stay tuned for more announcements on the Flash initiative by tracking updates to the Advancing Reliability series!
Source: Azure

Do more with less using new Azure HX and HBv4 virtual machines for HPC

This post was co-authored by Jyothi Venkatesh, Senior Product Manager, Azure HPC and Fanny Ou, Technical Program Manager, Azure HPC.

The next generation of purpose-built Azure HPC virtual machines

Today, we are excited to announce two new virtual machines (VMs) that deliver more performance, value-adding innovation, and cost-effectiveness to every Azure HPC customer. The all-new HX-series and HBv4-series VMs are coming soon to the East US region, and thereafter to the South Central US, West US3, and West Europe regions. These new VMs are optimized for a variety of HPC workloads such as computational fluid dynamics (CFD), finite element analysis, frontend and backend electronic design automation (EDA), rendering, molecular dynamics, computational geoscience, weather simulation, AI inference, and financial risk analysis.

Innovative technologies to help HPC customers where it matters most

HX and HBv4 VMs are packed with new and innovative technologies that maximize performance and minimize total HPC spend, including:

4th Gen AMD EPYC™ processors (Preview, Q4 2022).
Upcoming AMD EPYC processors, codenamed "Genoa-X," (with general availability in 1H 2023).
800 GB/s of DDR5 memory bandwidth (STREAM TRIAD).
400 Gb/s NVIDIA Quantum-2 CX7 InfiniBand, the first on the public cloud.
80 Gb/s Azure Accelerated Networking.
PCIe Gen4 NVMe SSDs delivering 12 GB/s (read) and 7 GB/s (write) of storage bandwidth.

Below are preliminary benchmarks from the preview of HBv4 and HX series VMs using 4th Gen AMD EPYC processors across several common HPC applications and domains. For comparison, performance information is also included from Azure’s most recent H-series (HBv3-series with Milan-X processors), as well as a 4-year-old HPC-optimized server commonly found in many on-premises datacenters (represented here by Azure HC-series with Skylake processors).

Figure 1: Performance comparison of HBv4/HX-series in Preview to HBv3-series and four-year-old server technology in an HPC-optimized configuration across diverse workloads and scientific domains.

Learn more about the performance of HBv4 and HX-series VMs with 4th Gen EPYC CPUs.

HBv4-series brings performance leaps across a diverse set of HPC workloads

Azure HBv3 VMs with 3rd Gen AMD EPYC™ processors with AMD 3D V-cache™ Technology already deliver impressive levels of HPC performance, scaling MPI workloads up to 27x higher than other clouds, surpassing many of the leading supercomputers in the world, and offering the disruptive value proposition of faster time to solution with lower total cost. Unsurprisingly, the response from customers and partners has been phenomenal. With the introduction of HBv4 series VMs, Azure is raising the bar yet again—this time across an even greater diversity of memory performance-bound, compute-bound, and massively parallel workloads.

VM Size | Physical CPU Cores | RAM (GB) | Memory Bandwidth (STREAM TRIAD, GB/s) | L3 Cache/VM (MB) | FP64 Compute (TFLOPS) | InfiniBand RDMA Network (Gbps)
Standard_HB176rs_v4 | 176 | 688 | 800 | 768 | 6 | 400
Standard_HB176-144rs_v4 | 144 | 688 | 800 | 768 | 6 | 400
Standard_HB176-96rs_v4 | 96 | 688 | 800 | 768 | 6 | 400
Standard_HB176-48rs_v4 | 48 | 688 | 800 | 768 | 6 | 400
Standard_HB176-24rs_v4 | 24 | 688 | 800 | 768 | 6 | 400

Notes: 1) “r” denotes support for remote direct memory access (RDMA) and “s” denotes support for Premium SSD disks. 2) At General Availability, Azure HBv4 VMs will be upgraded to Genoa-X processors featuring 3D V-cache. Updated technical specifications for HBv4 will be posted at that time.

HX-series powers next generation silicon design

In Azure, we strive to deliver the best platform for silicon design, both now and far into the future. Azure HBv3 VMs, featuring 3rd Gen AMD EPYC processors with AMD 3D V-cache Technology, are a significant step toward this objective, offering the highest performance and total cost effectiveness in the public cloud for small and medium memory EDA workloads. With the introduction of HX-series VMs, Azure is enhancing its differentiation with a VM purpose-built for even larger models becoming commonplace among chip designers targeting 3, 4, and 5 nanometer processes.

HX VMs will feature 3x more RAM than any prior H-series VM, up to nearly 60 GB of RAM per core, and constrained-core VM sizes to help silicon design customers maximize the ROI of their per-core commercial licensing investments.

VM Size | Physical CPU Cores | RAM (GB) | Memory/Core (GB) | L3 Cache/VM (MB) | Local SSD NVMe (TB) | InfiniBand RDMA Network (Gbps)
Standard_HX176rs | 176 | 1,408 | 8 | 768 | 3.6 | 400
Standard_HX176-144rs | 144 | 1,408 | 10 | 768 | 3.6 | 400
Standard_HX176-96rs | 96 | 1,408 | 15 | 768 | 3.6 | 400
Standard_HX176-48rs | 48 | 1,408 | 29 | 768 | 3.6 | 400
Standard_HX176-24rs | 24 | 1,408 | 59 | 768 | 3.6 | 400

Notes: 1) “r” denotes support for remote direct memory access (RDMA) and “s” denotes support for Premium SSD disks. 2) At General Availability, Azure HX VMs will be upgraded to Genoa-X processors featuring 3D V-cache. Updated technical specifications for HX will be posted at that time.
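The Memory/Core column above follows directly from dividing the fixed 1,408 GB of RAM by the active core count, rounded to the nearest GB; a quick sanity check:

```python
RAM_GB = 1408  # total RAM per HX VM, fixed across constrained-core sizes

# Active core counts of the HX sizes listed above.
for cores in (176, 144, 96, 48, 24):
    print(f"{cores} cores -> {round(RAM_GB / cores)} GB/core")
```

The constrained-core sizes keep the full RAM, cache, and InfiniBand bandwidth while reducing billable licensed cores, which is why memory per core climbs as cores drop.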

400 Gigabit InfiniBand for supercomputing customers

HBv4 and HX VMs are Azure’s first to leverage 400 Gigabit NVIDIA Quantum-2 InfiniBand. This newest generation of InfiniBand brings greater support for the offload of MPI collectives, enhanced congestion control, and enhanced adaptive routing capabilities. Using the new HBv4 or HX-series VMs and only a standard Azure Virtual Machine Scale Set (VMSS), customers can scale CPU-based MPI workloads beyond 50,000 cores per job.

Continuous improvement for Azure HPC customers

Microsoft and AMD share a vision for a new era of high-performance computing in the cloud: one defined by constant improvements to the critical research and business workloads that matter most to our customers. Azure continues to collaborate with AMD to make this vision a reality by raising the bar on the performance, scalability, and value we deliver with every release of Azure H-series VMs.

Figure 2: Azure HPC Performance 2019 through 2022.

Learn more about the performance of HBv4 and HX-series VMs with 4th Gen EPYC CPUs.

Customer and partner momentum

"We’re pleased to see Altair® AcuSolve®’s impressive linear scale-up on the HBv3 instances, showing up to 2.5 times speedup. Performance increases 12.83 times with an 8-node (512-core) configuration on 3rd Gen AMD EPYC™ processors, an excellent scale-up value for AcuSolve compared to the previous generation delivering superior price performance. We welcome the addition of the new Azure HBv4 and HX-series virtual machines and look forward to pairing them with Altair software to the benefit of our joint customers.”

—Dr. David Curry, Senior Vice President, CFD and EDEM

"Customers in the HPC industry continue to demand higher performance and optimizations to run their most mission-critical and data-intensive applications. 4th Gen AMD EPYC processors provide breakthrough performance for HPC in the cloud, delivering impressive time to results for customers adopting Azure HX-series and HBv4-series VMs."

—Lynn Comp, Corporate Vice President, Cloud Business, AMD

"Ansys electronics, semiconductor, fluids, and structures customers demand more throughput out of their simulation tools to overcome challenges posed by product complexity and project timelines. Microsoft's HBv3 virtual machines, featuring AMD’s 3rd Gen EPYC processors with 3D V-Cache, have been giving companies a great price/performance crossover point to support these multiphysics simulations on-demand and with very little IT overhead. We look forward to leveraging Azure’s next generation of HPC VMs featuring 4th Gen AMD EPYC processors, the HX and HBv4 series, to enable even greater simulation complexity and speed to help engineers reduce risk and meet time-to-market deadlines."

—John Lee, Vice President and General Manager, Electronics and Semiconductor, Ansys

"We’ve helped thousands of customers combine the performance and scalability of the cloud, providing ease-of-use and instance access to our powerful computational software, which speeds the delivery of innovative designs. The two new high-performance computing virtual machines powered by the AMD Genoa processor on Microsoft Azure can provide our mutual customers with optimal performance as they tackle the ever-increasing demands of compute and memory capacity for gigascale, advanced-node designs."

—Mahesh Turaga, Vice President, Cloud Business Development, Cadence

"Hexagon simulation software powers some of the most advanced engineering in the world. We’re proud to partner with Microsoft, and excited to pair our software with Azure’s new HBv4 virtual machines. During early testing in collaboration with the Azure HPC team, we have seen generational performance speedups of 400 percent when comparing structural simulations running on HBv3 and HX-series VMs. We look forward to seeing what our joint customers will do with this remarkable combination of software and hardware to advance their research and productivity, now and tomorrow. In the first quarter of 2023, we will be benchmarking heavy industrial CFD computations, leveraging multiple HBv4 virtual machines connected through InfiniBand."

—Bruce Engelmann, CTO, Hexagon

"Microsoft Azure has once again raised the bar for HPC infrastructure platforms in the cloud, this time with the launch of Azure HBv4 and HX virtual machines based on AMD’s 4th Gen EPYC Genoa CPUs. We are expecting strong customer demand for HBv4 and are excited to offer it to our customers that would like to run CFD, EDA, or other types of HPC workloads in the cloud."

—Mulyanto Poort, Vice President of HPC Engineering at Rescale

"Early testing by AMD with Siemens EDA workloads showed 15 percent to 22 percent improvements in runtimes with Microsoft Azure’s new AMD-based virtual machines compared to the previous generation. Semiconductor chip designers face a range of technical challenges that make hitting release dates extremely difficult. The combined innovation of AMD, Microsoft Azure, and Siemens provides a simplified path to schedule predictability through the increased performance possible with the latest offerings."

—Craig Johnson, Vice President, Siemens, EDA Cloud Solutions

"Customer adoption of the cloud for chip development is accelerating, driven by complexity and time-to-market advantages. The close collaboration between Synopsys and Microsoft brings together EDA and optimized compute to enable customers to scale under the Synopsys FlexEDA pay-per-use model. Verification represents a significant EDA workload in today’s complex SoCs and with the release of AMD’s next-generation EPYC processor available on Microsoft Azure, customers can take advantage of the optimized cache utilization and NUMA-aware memory layout techniques to achieve up to 2x verification throughput over previous generations."

—Sandeep Mehndiratta, Vice President of Cloud at Synopsys

Learn more

Sign up to request access to the new VMs.
NEW Azure HPC + AI Tech Community Blog.
Performance and Scalability of HBv4 and HX VMs.
Learn more about Azure HPC + AI.

#AzureHPCAI
Source: Azure

Improve your energy and carbon efficiency with Azure sustainability guidance

This week at the 27th United Nations Climate Change Conference of the Parties (COP27) in Sharm El-Sheikh, Egypt, we’re collectively focused on how to measure progress, build markets, and empower people across the globe to deliver a just, sustainable future for everyone on the planet.

It’s a pivotal moment for the world to come together to drive meaningful action to address and combat global climate change. It’s also an important event for Microsoft, where we will highlight our work to advance the sustainability of our business, share sustainability solutions for operational and environmental impact, and support the societal infrastructure for a sustainable world.

The customer signal is clear—sustainability is now a business imperative. In a study of over 1,230 business leaders across 113 countries, 81 percent of CEOs have increased their sustainability investments [1]. Sustainability is a top-10 business priority for the first time ever [2], and carbon emissions are forecasted to become a top-three criterion for cloud purchases by 2025 [3]. The number of large cities with net zero targets has doubled since December 2020, from 115 to 235 [4], and the global market for green data centers is projected to grow to more than $181.9B by 2026 [5].

Customers and partners are asking for help to understand how to meet and plan for rapidly evolving sustainability requirements, incentives, and regulations. At the same time, they’re dealing with rising energy costs and an uncertain economic environment. We’re hearing specific questions about building sustainable IT in the cloud: How to reduce current energy usage and costs, as well as carbon emissions? How can moving to the cloud help us achieve greater efficiency? What tools are available to make this easier?

To support you in navigating this learning curve, we’re announcing technical guidance and skilling offerings that can help you plan your path forward, improve your sustainability posture, and create new business value while reducing your operational footprint. And this is just the beginning: stay tuned for more announcements in the months ahead.

Accelerate your sustainability progress with Azure

Our recent On the Issues blog, Closing the Sustainability Skills Gap: Helping businesses move from pledges to progress underscores the importance of equipping companies and employees with a broad range of new skills to enable sustainable transformation. We’re investing across the company to support this skill development in myriad ways, including a broad range of technical guidance and skilling initiatives to help you achieve your sustainability goals with Azure. This week we’re announcing a set of architectural guidance resources to help you get started:

Azure Well-Architected Framework sustainability guidance: this documentation set describes workload optimizations for Green IT within Azure, building on the industry leadership of the Green Software Foundation and aligned to their green software principles. Because sustainability considerations apply to all five pillars: security, reliability, operational excellence, performance efficiency, and cost optimization, we approach the topic as a lens across workloads rather than a standalone pillar.
Azure Well-Architected Framework sustainability self-assessment: as you plan your cloud workloads, use this self-assessment to review the potential impact of your design decisions and how to optimize them for carbon and energy efficiency. You’ll also receive specific recommendations you can act on, whether you’re implementing or deploying an application or reviewing an existing application.
Sustainable software engineering practices in Azure Kubernetes Service (AKS): the guidance found in this article is focused on AKS services you're building or operating and includes design and configuration checklists, recommended design, and configuration options. Before applying sustainable software engineering principles to your application, we recommend reviewing the priorities, needs, and trade-offs of your application.

Supporting your sustainability journey in the cloud

With Azure, customers and partners can compound their benefits at each stage of the cloud journey, from migrating to the cloud to save on energy, carbon, and infrastructure costs, to optimizing in the cloud to achieve operational excellence, to reinvesting savings into new initiatives that will provide enduring business value.

Across industries, organizations are optimizing their cloud investments by aligning to patterns and practices in the Cloud Adoption Framework and the Well-Architected Framework. They’re also achieving market leadership through reinvesting to drive innovation. Sweden’s largest real estate company and a global leader in sustainability, Vasakronan, adopted an IoT and Digital Twins solution using Azure and expects to realize year-on-year savings of six million kronor (USD 700,000) in energy consumption costs alone.

As part of Microsoft’s ongoing commitment to promote sustainable development and low-carbon business practices globally, our Azure guidance complements solutions such as Microsoft Cloud for Sustainability and Emissions Impact Dashboard for Microsoft cloud services. We’ll continue to work with customers, partners, and industry leaders, such as the Green Software Foundation, to build, maintain, and promote best practices for green IT and innovation that further resilient, thriving, and just economies. From an industry-leading training company:

“This is the missing ingredient in our business; it gives purpose and meaning. If you could put an overlay on your environment or applications and say here are 20 recommendations to make it optimally sustainable, reduce carbon emissions, give the data so you can make incremental improvements over the years, and manage it—that’s huge!”— Todd Fine, Chief Strategy Officer, Atmosera.

As we continue to build out our guidance to help our customers achieve their sustainability goals using Azure, our goal is to meet you where you are and help you do more with less, whether you’re building cloud-native applications, operating in hybrid environments, or evaluating solutions for organization-wide emissions reporting.

Driving sustainability skilling across your organization

Research shows that cloud skilling programs can improve business outcomes and individual career advancement, as well as accelerate success in the cloud. For this reason, we’ve published a set of resources to provide a starting place to help your people and teams understand how they can contribute to their organization’s sustainability goals while developing highly relevant skills and expertise.

Azure sustainability guidance Cloud Skills Challenge: this fun, no-cost, interactive program helps individuals and teams build skills on Microsoft cloud technologies via a gamified experience utilizing Microsoft Learn content. Teams can access a custom leaderboard, and individuals can compete with industry peers.
Azure sustainability guidance Microsoft Learn Collection: developed as a starting point to help you find relevant learning content on Azure sustainability initiatives, share this with friends and colleagues today and check back for updates in the weeks and months ahead. You can also make it yours—we invite you to copy this collection, personalize it, and share it with your network.
Principles of Sustainable Software Engineering course: this Microsoft Learn module provides a primer on the eight principles of Sustainable Software Engineering, covering a wide range of topics such as electricity and carbon efficiency, carbon intensity, and how to think through the trade-offs required for optimization. It is accessible to any learner familiar with basic computing concepts.

Get started today

These resources will help you more easily plan your strategy, improve your current sustainability posture, and foster green innovation. Use them to chart a faster path toward internal and external sustainability goals and accelerate your organization’s Environmental, Social, and Governance (ESG) journey. As an added benefit, sustainable workloads are also more efficient and modern, which can reduce the total cost of an application or initiative.

As you move to the cloud, you gain the advantage of our decades of action and progress on carbon, waste, water, and ecosystems within our datacenter regions. Read more about how we’re building on what we’ve learned from our sustainability journey.

We look forward to hearing how we can continue to support your sustainability journey in the cloud.

 

Source: Azure

Azure comes to Dallas for Supercomputing

Supercomputing (SC) is arguably one of the biggest annual events in high-performance computing (HPC). It’s a great chance for the community to connect and learn from one another, and for Microsoft the goal remains the same. After a couple of years of virtual-only events, we’re thrilled to be joining you in person in Dallas this year—and we’ve got some exciting things in store.

Step into our Microsoft Booth

This year we’ll be located at booth #2433—make sure to stop by and say "hi." We’ll have several goodies for you to take, a caricature artist to create a one-of-a-kind keepsake of your experience at the event, and two hardware booths to let you see in person some of the technology that powers Microsoft Azure HPC + AI virtual machines (VMs). We’ll even be showcasing our newest product launch in the hardware bar. Feeling tired? Come take a break in our lounge or café area for some much-needed relief, networking, or coffee. And of course, an event wouldn’t be the same without our Microsoft specialists and partners sharing in our booth. View our blog to see the full lineup of sessions in the booth.

Explore our joint booth with NVIDIA

We’ll also have a joint booth with NVIDIA, located at #2409. We’ll have Microsoft experts and partners giving presentations in this booth to share insights on the confluence of AI with HPC using NVIDIA accelerated computing on Azure. Don’t be shy about stopping by to talk to our subject matter experts, network with peers or simply get off your feet.

Want to take a break from the hustle of the conference and stretch your creative muscles? Come over to our chalkboard and create an image of what “Make AI Your Reality” means to you. Here’s what to do—draw your image, take a picture and post it to your favorite channel with #MakeAIYourReality, and then show it to a Microsoft or NVIDIA representative in the booth to get your raffle ticket for a chance to win an NVIDIA Jetson Nano developer kit—a small but powerful computer to start learning about building AI-enabled applications with ready-to-try projects and community support.

Attend the Women in HPC (WHPC) networking reception

We are tremendously honored to sponsor the annual Women in High-Performance Computing networking reception. Not only will this drive awareness of diversity and inclusion topics, but it will also build understanding of how we can all work toward addressing the under-representation of women in supercomputing.

Visit the Student Cluster Competition

We are so excited to be supporting the SC Student Cluster Competition once again this year! Every year, both undergraduate and high school students design and build small clusters, learn scientific applications, apply optimization techniques for their chosen architectures, and compete in a 48-hour challenge at the SC event to complete real-world workloads, demonstrating their HPC knowledge for conference attendees and judges.

Catch up on what’s new at this year's Supercomputing

New channels to find us

Take a look at our new blog on Microsoft Tech Community.
Check out our joint HPCWire microsite with NVIDIA.
Tour our NVIDIA on the Azure website.

Hear from our customers

With a tight timeline, UD Trucks collaborated with Microsoft and executing partner HCLTech to design and deploy a Microsoft Azure–based system for its previously on-premises simulation and design processes, completing the project in only one month. The new system’s capabilities provide better computational results and have opened new avenues of data innovation, in addition to reducing costs by about 30 percent. The project’s success has led the company to develop plans to move all of its systems to Azure. Read the story.

Rimac Technology is now running Ubercloud on Microsoft Azure to support engineering simulations during the development and testing of electrical vehicles and components. The move has allowed them to manage greater model complexity and take advantage of increased processing speed and scale. Read the full story.

University of Bath moved almost all its existing supercomputing resources to the cloud with Microsoft Azure HPC + AI. To power research workloads, it deployed Azure HPC + AI resources on 21 virtual machine instances, ensuring that the right virtual machines can be spun up for any project. Read the full story.

Kensington has decreased its engine runtime from 20 hours to just 25 minutes—for 1.9 billion total calculations with every run. With the stronger insights and predictions that it gains from its optimized models and results, Kensington creates more tailored products and better serves its customers. Read the full story.

Learn more

Take a look at our new blog on Microsoft Tech Community.
Check out our joint HPCWire microsite with NVIDIA.
Azure Scales 530B Parameter GPT-3 Model with NVIDIA NeMo Megatron.
Choose the right size for your workload with NVads A10 v5 virtual machines, now generally available.
Azure empowers easy-to-use, high-performance, and hyperscale model training using DeepSpeed.
Building the highway to AI success with cloud infrastructure.

 

#AzureHPCAI #SC22 #MakeAIYourReality

Announcing more Azure VMware Solution enhancements

I’m writing to you today from VMware Explore in Barcelona, where my team and I are presenting to and meeting with customers and partners in person! When we launched Azure VMware Solution two years ago amid a pandemic, IT agility became a top priority as organizations scrambled to enable remote work and ensure business resilience via cloud solutions. In today’s economic climate most organizations want to do more with less. They recognize that by running workloads in the cloud, they can respond more rapidly and reduce IT infrastructure costs.

“I can definitely say that Azure—and in particular Azure VMware Solution—is the right solution for us. It allows us to seamlessly move from on-premises to the cloud, thereby freeing up resources and capital investments that can be used where they are needed more.”—Giorgio Veronesi, Sr. Vice President of ICT Infrastructure, Snam.

Given that TCO is a top priority for most companies in the current economic climate, migrating your VMware workloads to Azure is a great way to reduce the cost of maintaining an on-premises VMware environment. Because every customer starts their cloud journey in a different place, we help customers migrate to the cloud on their terms and maintain support for the business platforms and investments they have today. Azure VMware Solution is an easy way to extend and migrate existing VMware private clouds and run them natively on Azure. Azure VMware Solution offers symmetry with on-premises environments, which helps accelerate datacenter migrations so customers realize the benefits of the cloud sooner.

“With help from Microsoft and Mobiz, we were able to deliver a fully qualified landing zone in Azure in one-third the time and at one-third the budget compared to previous cloud efforts.”—Sam Chenaur, Vice President and Global Head of Infrastructure, Sanofi.

In keeping with the goal of doing more with less, and combined with Microsoft’s unique Azure Hybrid Benefit and Extended Security Updates for Windows Server and SQL Server, Azure VMware Solution is one of the fastest and most cost-effective ways to seamlessly migrate and run VMware in the cloud. If you want to learn more about TCO in your organization, read this Forrester paper.

Check out what’s new in Azure VMware Solution

I am excited to share some of the recent updates we’ve made to Azure VMware Solution.

Stretched Clusters for Azure VMware Solution, now in preview, provides 99.99 percent uptime for mission-critical applications that require the highest availability. In the event of an availability zone failure, your virtual machines (VMs) and applications automatically fail over to an unaffected availability zone with no application impact. Learn more.
Azure NetApp Files Datastores is now generally available to run your storage intensive workloads on Azure VMware Solution. This integration between Azure VMware Solution and Azure NetApp Files enables you to create datastores via the Azure VMware Solution resource provider with Azure NetApp Files NFS volumes and attach the datastores to your private cloud clusters of choice. Learn more.
Customer-managed keys for Azure VMware Solution is now in preview, both supporting higher security for customers’ mission-critical workloads and providing you with control over your encrypted vSAN data on Azure VMware Solution. With this feature, you can use Azure Key Vault to generate customer-managed keys as well as centralize and streamline the key management process. Learn more.
New node sizing for Azure VMware Solution. With the general availability of the AV36P and AV52 node sizes, organizations can now optimize memory-intensive and storage-intensive workloads on Azure VMware Solution. Learn more.
Microsoft Azure native services let you monitor, manage, and protect your virtual machines (VMs) in a hybrid environment (Azure, Azure VMware Solution, and on-premises). Here are some of the existing Azure services: Azure Arc, Azure Monitor, Microsoft Defender for Cloud, Azure Update Management, and Log Analytics Workspace. Learn more.
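To make an availability figure like the 99.99 percent uptime mentioned above concrete, the implied downtime budget can be computed directly. This is a quick illustrative calculation only, not an official SLA interpretation; the function name and defaults are my own:

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Return the maximum minutes of downtime over a period (default one year)
    implied by an availability percentage."""
    return (1 - availability_pct / 100) * days * 24 * 60

# 99.99 percent uptime allows roughly 52.6 minutes of downtime per year
yearly = downtime_budget_minutes(99.99)

# and roughly 4.3 minutes over a 30-day month
monthly = downtime_budget_minutes(99.99, days=30)
```

This kind of back-of-the-envelope check helps decide whether a given tier’s availability target fits an application’s tolerance for interruption.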

If you would like to stay up to date with the latest releases from Azure VMware Solution, please follow Azure updates.

Learn more

This week we are offering a special opportunity to take the Azure VMware Solution Cloud Skills Challenge. Compete in this free, self-paced, Microsoft learning path and advance your technical skills at the same time! Register for the Challenge.

And if you are here at VMware Explore Barcelona, stop by the Microsoft booth, and say hello. We are excited to see you in person!

Check out all our Azure Breakout Sessions during the event.
Visit Booth #401 for our hourly in-booth theater sessions.

You can also attend our next Azure Webinar on December 15th: How to Modernize Your VMware Environment with Microsoft Azure.

As always, you can visit the Azure VMware Solution website or documentation for more information.

Accelerate your cloud-native journey with Azure Monitor

This blog was co-authored by Xema Pathak, Senior Product Manager; Sahil Arora, Principal PM Lead; Matthew McCleary, Senior Program Manager and Brian Wren, Principal Content Developer.

Organizations are going through an era of digital transformation and are embracing various cloud-native technologies to fuel innovation. Developers are critical to this transformation; they need to quickly bring innovation to the market to address customer needs. At Microsoft Azure, we aspire to be the platform to empower you to accelerate your cloud-native journey!

Applications developed on Azure deliver reliability, scalability, and the ability to handle massive workloads anywhere around the world. These cloud-native apps take advantage of containers, serverless technology, and microservices-based architecture with Azure Kubernetes Service, Azure Container Apps, and Azure Functions.

Such a growing application environment brings new challenges. Business acceleration is far more than just adopting cloud-native technologies, it’s also about agility and scale. We recognize how onerous it can be for developers to configure and monitor the infrastructure and distributed microservices. Once microservices are deployed, you need the ability to effectively detect and troubleshoot issues to ensure that you provide performance and reliability to your customers. Combating these challenges requires a reliable and scalable monitoring solution that seamlessly combines metrics, logs, and dashboards into a single experience.

Our team has been busy bringing you a reliable, scalable, and secure monitoring service with Azure Monitor. We are excited to share our new offerings—Azure Monitor managed service for Prometheus (preview) and Azure Managed Grafana. They complement existing Azure Monitor tools to help you monitor each layer of your full cloud-native stack on Kubernetes and quickly troubleshoot issues across microservices and infrastructure. Additionally, you can now set up collection for Prometheus metrics and container logs and view them in Grafana dashboards with a single click!

The new Azure Monitor managed service for Prometheus (preview) gives you a fully managed cloud-native metrics solution to ingest, alert on, and query your Prometheus metrics, which provide visibility into the health and performance of your infrastructure and applications. Azure Monitor container insights increases your visibility with a fully managed cloud-native logs solution for advanced troubleshooting by analyzing container stdout, stderr, and infrastructure logs. You can quickly identify and mitigate latency and reliability issues using distributed traces with Azure Monitor Application Insights, a fully managed application performance monitoring (APM) solution, now with preview OpenTelemetry-based instrumentation. Take your DevOps and site reliability engineering productivity to the next level with the new Azure Managed Grafana with plugins for Azure Monitor, which gives you a fully managed service for full-stack troubleshooting dashboards.

Let’s take a look at how simple it is to configure monitoring for your Kubernetes clusters with Azure Monitor and how to quickly identify issues across each layer of your full cloud-native stack.

Monitor with highly available, scalable, and secure managed service for Prometheus

Prometheus has become a de facto standard for Kubernetes monitoring to collect a rich set of metric types and visualize them with Grafana, which offers a huge set of community dashboards. However, running self-managed Prometheus is challenging to scale for enterprise workloads, requiring significant time and bandwidth to set up and maintain Kubernetes monitoring deployments. We are introducing Azure Monitor Managed service for Prometheus (preview) to overcome the challenges of self-managed Prometheus and help you accelerate innovation by carrying out frequent high-scale deployments for your services.

Azure Monitor managed service for Prometheus (preview) is a fully managed service that provides a highly available, scalable, and enterprise-grade secure service to easily monitor applications and services running in a containerized environment. Use it as a drop-in replacement for self-managed Prometheus or as a remote storage option. Our remote write interface allows you to continue using your self-managed Prometheus while adding the benefits of our managed service, such as high availability, scaling, monitoring across clusters, and long-term data retention.

Easily troubleshoot issues with logs

With Azure Monitor’s unified cloud-native offering for Kubernetes monitoring, you can easily set up log collection alongside managed service for Prometheus. Azure Monitor container insights collects troubleshooting logs, provides recommended alerts to proactively identify issues, and has visualizations to monitor health and performance of your Kubernetes cluster. It complements Prometheus and Grafana for end-to-end Kubernetes monitoring across microservices and infrastructure. With Azure Monitor, you can easily:
Identify any resource bottlenecks and perform advanced debugging for any implicit failures.
Proactively look out for failures or outages by configuring alerts and notifications on Prometheus metrics and container logs.
Continuously observe the overall health of your infrastructure and correlate metrics and logs in a single view.
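To give a feel for the kind of threshold-based evaluation such alert rules perform, here is a deliberately simplified sketch. Real Azure Monitor alert rules are configured through the portal or templates; the function name, threshold, and sample values below are invented for illustration:

```python
def evaluate_cpu_alert(samples, threshold=0.9, min_consecutive=3):
    """Fire when CPU utilization (expressed as 0..1) stays above the threshold
    for a number of consecutive samples, mimicking a sustained-usage alert
    rather than reacting to a single transient spike."""
    streak = 0
    for value in samples:
        streak = streak + 1 if value > threshold else 0
        if streak >= min_consecutive:
            return True
    return False

# A short spike does not fire the alert
evaluate_cpu_alert([0.95, 0.5, 0.97, 0.6])    # False

# Sustained pressure does
evaluate_cpu_alert([0.92, 0.95, 0.99, 0.97])  # True
```

Requiring consecutive breaches is a common design choice for reducing alert noise from momentary load spikes.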

 

Observe your full-stack with Azure Managed Grafana and OpenTelemetry-based instrumentation

Once you have the monitoring data, it is imperative to bring it all together for continuous observability. To gain more insights for the data you collect, link Azure Managed Grafana to managed service for Prometheus. This offers a curated set of popular open source Grafana dashboards built on top of Prometheus metrics out-of-the-box. You can also combine application metrics and infrastructure metrics from various data sources into a single dashboard for full-stack observability.

At the application layer, Azure Monitor’s preview OpenTelemetry-based instrumentation allows you to use open source technologies to collect additional telemetry from within your application components. We are announcing new capabilities for our preview OpenTelemetry-based offering, including metrics, sampling, and resilient data transport. This makes it easier to start using OpenTelemetry APIs with Azure Monitor Application Insights. There’s no need to configure additional agents or other system processes on your cluster. Check out our recent progress in bringing OpenTelemetry and Grafana to Azure and learn more about the value they can bring to your project.

As cloud-native technologies become central to digital transformation, we are focused on empowering developers to unlock productivity and innovation. We continue to innovate and add more capabilities that allow you to provide performant and reliable experiences to your customers. Get started with monitoring Kubernetes clusters with Azure Monitor.

Learn more
Learn more about Azure Monitor managed service for Prometheus and read our technical documentation.
Learn how to get started with Azure Monitor’s unified cloud-native offering for Kubernetes monitoring.
Read the Grafana integrations with Azure Monitor blog.
Learn more about OpenTelemetry with Azure Monitor.

Zero downtime migration for Azure Front Door—now in preview

In March of this year, we announced the general availability of two new Azure Front Door tiers. Azure Front Door Standard and Premium is our native, modern cloud content delivery network (CDN), catering to both dynamic and static content delivery acceleration with built-in turnkey security and a simple and predictable pricing model. It has already been widely adopted by many of our customers. We also promised to provide a zero downtime migration tool to migrate from Azure Front Door (classic) and Azure CDN from Microsoft (classic) to the new Azure Front Door tiers.

Today, we are taking the next step in that journey, and we are excited to announce the preview of the Azure Front Door tier migration capability as well as some new additional features. The migration capability for Azure CDN from Microsoft (classic) will be coming soon.

New features/capabilities on the new Front Door since general availability

Along with the migration feature, we added more capabilities and integrations to the new Front Door tiers to provide you with a better cloud CDN solution and a more integrated Azure cloud experience.

Preview—Upgrade from Standard to Premium tier without downtime: To learn more about upgrading to Premium tier, see Azure Front Door Tier Upgrade. This capability is also supported during the migration from Azure Front Door (classic) to the new Front Door tier.
Preview—Managed identities integration: Azure Front Door now supports Managed Identities generated by Azure Active Directory to allow Front Door to easily and securely access other Azure AD–protected resources such as Azure Key Vault. This feature is in addition to the AAD Application access to Key Vault that is currently supported. To learn more about how to enable managed identities on Azure Front Door Standard and Premium, please read Set up managed identity with Front Door.
Integration with App Service: Front Door can now be deployed directly from the App Service resource with a few clicks. The previous deployment workflow only supported Azure Front Door (classic) and Azure CDN.
Pre-validated domain integration with Static Web Apps: Static Web App (SWA) customers who have already validated custom domains at the SWA level can now skip domain validation on their Azure Front Door. For more details, see Configure a custom domain on Azure Front Door using the Azure portal.
Terraform support for Azure Front Door Standard and Premium, enabling the automation of Azure Front Door Standard and Premium provisioning using Terraform. For more information, see Create a Front Door Standard/Premium profile using Terraform.
Azure Advisor integration provides suggestions for best practices and configurations, including expired certificates, certificates about to expire, autorotation failures for managed certificates, domains pending validation after 24 hours, and use of the latest secret version.

Migration overview

Azure Front Door enables you to perform a zero-downtime migration from Azure Front Door (classic) to Azure Front Door Standard or Premium in just three simple steps. The migration will take a few minutes to complete depending on the complexity of your Azure Front Door (classic) instance, such as the number of domains, backend pools, routes, and other configurations.

If your Azure Front Door (classic) instance has custom domains with your own certificates, there will be two extra steps to enable managed identities and grant managed identity to a key vault for the new Azure Front Door profile.

The classic instance will be migrated to the Standard or Premium tier by default based on the Azure Front Door (classic) WAF configurations. Upgrading from the Standard tier to Premium during the migration is also supported. If your Azure Front Door (classic) qualifies to migrate to Azure Front Door Standard, but the number of resources exceeds the standard quota limit, the Azure Front Door (classic) instances will be migrated to a Premium profile instead.
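The tier-selection rules described above can be sketched as a simple decision function. This is illustrative only: the actual migration tooling applies these checks internally, and the parameter names here are invented for the example:

```python
def choose_target_tier(waf_uses_premium_features: bool,
                       resource_count: int,
                       standard_quota: int,
                       upgrade_requested: bool = False) -> str:
    """Pick the default target tier for a classic Front Door migration,
    mirroring the rules described in the text: WAF configuration drives the
    default, an upgrade to Premium can be requested, and exceeding the
    Standard quota forces Premium."""
    if waf_uses_premium_features or upgrade_requested:
        return "Premium"
    if resource_count > standard_quota:
        # Qualifies for Standard, but the resource count exceeds the quota
        return "Premium"
    return "Standard"
```

For instance, a classic profile with a basic WAF but more routes and domains than the Standard quota allows would land on Premium.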

If you have Web Application Firewall (WAF) policies associated with the Front Door profile, the migration process will create copies of your WAF policies and configurations for the new Front Door profile tier. You can also use an existing WAF policy that matches the tier you're migrating to.

Azure Front Door tier migration is supported using the Azure portal. Azure PowerShell, Azure CLI, SDK, and Rest API support will come soon.

You’ll be charged the Azure Front Door Standard and Premium base fee from the moment the migration completes. Data transfer out from edge locations to clients, outbound data transfer from edge to origin, and requests will be charged based on traffic flow after migration. For more details about Azure Front Door Standard and Premium pricing, see our pricing for Azure Front Door.

Notable changes after migration

DevOps: Azure Front Door Standard and Premium uses a different resource provider namespace, Microsoft.Cdn, while Azure Front Door (classic) uses Microsoft.Network. After migrating from classic to the Standard or Premium tier, you’ll need to update your DevOps scripts and infrastructure code to use the new namespace and the updated ARM templates, Bicep, PowerShell module, Terraform, CLI commands, and API.
Endpoint: The new Front Door endpoint gets generated with a hash value to prevent domain takeover, in the format of endpointname-hashvalue.z01.azurefd.net. The Azure Front Door (classic) endpoint name will continue to work after migration. However, we recommend replacing it with the newly created endpoint in Azure Front Door Standard and Premium. For more information, refer to Endpoint in Azure Front Door.
Diagnostic logs and metrics won’t be migrated. We recommend you enable diagnostic logs and monitoring metrics in your Azure Front Door Standard or Premium profile after migration. Azure Front Door Standard and Premium tier also offers built-in reports and health probe logs.
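As a concrete illustration of the namespace change, a script that updates resource IDs in existing automation might swap the provider path as follows. This is a hedged sketch: the subscription, resource group, and profile names are placeholders, and the authoritative resource type names should be taken from the Azure Front Door Standard/Premium documentation:

```python
def migrate_frontdoor_namespace(resource_id: str) -> str:
    """Rewrite a classic Front Door resource ID (Microsoft.Network/frontDoors)
    to the Standard/Premium provider namespace (Microsoft.Cdn/profiles).
    Illustrative only: a real migration also requires template and API changes."""
    old = "providers/Microsoft.Network/frontDoors"
    new = "providers/Microsoft.Cdn/profiles"
    if old not in resource_id:
        raise ValueError("not a classic Front Door resource ID")
    return resource_id.replace(old, new)

# Placeholder IDs for demonstration
classic_id = ("/subscriptions/00000000-0000-0000-0000-000000000000"
              "/resourceGroups/my-rg/providers/Microsoft.Network/frontDoors/my-afd")
migrated_id = migrate_frontdoor_namespace(classic_id)
```

A pass like this over stored resource IDs is only one small part of updating DevOps assets; templates, modules, and API calls need corresponding changes.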

Get started

Get started with your Azure Front Door migration today!

To learn more about the service and various features, refer to the Azure Front Door documentation.

Learn more about Azure Front Door's tier migration capabilities

About Azure Front Door (classic) to Standard/Premium tier migration.
Mapping between Azure Front Door (classic) and Standard/Premium tier.
Migrate Azure Front Door (classic) to Standard/Premium tier in the Azure portal.

We’re looking forward to your feedback to drive a better experience for the general availability of the migration feature.

Sharing the latest improvements to efficiency in Microsoft’s datacenters

In April, I published a blog that explained how we define and measure energy and water use at our datacenters, and how we are committed to continuous improvements.

Now, in the lead up to COP27, the global climate conference to be held in Egypt, I am pleased to provide a number of updates on how we’re progressing in making our datacenters more efficient across areas such as waste, renewables, and ecosystems. You can also visit Azure Sustainability—Sustainable Technologies | Microsoft Azure to explore this further.

Localized fact sheets in 28 regions

To share important information about the impact of our datacenters regionally with our customers, we have published localized fact sheets in 28 regions across the globe. These fact sheets provide a wide range of information and details about many different aspects of our datacenters and their operations.

A review of PUE (Power Usage Effectiveness) and WUE (Water Usage Effectiveness)

PUE is an industry metric that measures how efficiently a datacenter consumes and uses the energy that powers it, including the operation of systems for powering and cooling the servers, data networks, and lights. The closer the PUE number is to 1, the more efficient the use of energy. While the local environment and infrastructure can affect how PUE is calculated, there are also slight variations across providers.

The simplest way to think about PUE is as a ratio: the total energy consumed by the facility divided by the energy consumed by the IT equipment alone.

WUE is another key metric relating to the efficient and sustainable operations of our datacenters and is a crucial aspect as we work towards our commitment to be water positive by 2030. WUE is calculated by dividing the number of liters of water used for humidification and cooling by the total annual amount of power (measured in kWh) needed to operate our datacenter IT equipment.
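Both metrics reduce to simple ratios, which can be expressed directly from the definitions above (a minimal sketch; the variable names are my own):

```python
def pue(total_facility_energy_kwh: float, it_equipment_energy_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by IT equipment
    energy. A value closer to 1 means less overhead spent on cooling, power
    distribution, and lighting."""
    return total_facility_energy_kwh / it_equipment_energy_kwh

def wue(cooling_water_liters: float, it_equipment_energy_kwh: float) -> float:
    """Water Usage Effectiveness: liters of water used for humidification and
    cooling per kWh of IT equipment energy (L/kWh)."""
    return cooling_water_liters / it_equipment_energy_kwh

# A facility drawing 1,120 kWh in total for every 1,000 kWh of IT load
# has a PUE of 1.12
example_pue = pue(1120, 1000)
```

A datacenter cooled entirely with reclaimed water, as described later for the San Jose facilities, would report a freshwater WUE of 0.00 L/kWh under this definition.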

In addition to PUE and WUE, below are key highlights across carbon, water, and waste initiatives at our datacenters.

Datacenter efficiency in North and South America

As I illustrated in April, our newest generation of datacenters has a design PUE of 1.12; this includes our Chile datacenter, which is under construction. We are constantly focused on improving our energy efficiency. For example, in California, our San Jose datacenters will be cooled with an indirect evaporative cooling system using reclaimed water year-round and zero fresh water. Because the new datacenter facilities will be cooled with reclaimed water, they will have a WUE of 0.00 L/kWh in terms of freshwater usage.

In addition, as we continue our journey to achieve zero waste by 2030, we are proud of the progress we are making with our Microsoft Circular Centers. These centers sit adjacent to a Microsoft datacenter and process decommissioned cloud servers and hardware. Our teams sort the components and equipment and intelligently channel them to optimize reuse and repurposing.

In October, we launched a Circular Center in Chicago, Illinois that has the potential capacity to process up to 12,000 servers per month for reuse, diverting up to 144,000 servers annually. We plan to open a Circular Center in Washington state early next year and have plans for Circular Centers in Texas, Iowa, and Arizona to further optimize our supply chain and reduce waste.

Furthermore, our team has successfully completed an important water reuse project at one of our datacenters. This treatment facility, the first of its kind in Washington state and over 10 years in the making, will process water for reuse by local industries, including datacenters, decreasing the need for potable water for datacenter cooling.

Innovative solutions in Europe, the Middle East, and Africa

This winter, Europeans face the possibility of an energy crisis, and we have made a number of investments in optimizing energy efficiency to ensure that we are operating our facilities as effectively as possible. Datacenters are the backbone of modern society, so it is important that we continue to provide critical services to the industries that need us most while continually reducing energy consumption.

Across our datacenters in EMEA, we have made steady progress across carbon, waste, water, and ecosystems. We are committed to shifting to a 100 percent renewable energy supply by 2025, meaning that we will have power purchase agreements for green energy contracted for 100 percent of the carbon-emitting electricity consumed by all our datacenters, buildings, and campuses, adding gigawatts of renewable energy capacity to the grid. To date, we have added more than 5 gigawatts of renewable energy globally, including more than 15 individual deals in Europe spanning Ireland, Denmark, Sweden, and Spain.

In Finland, we recently announced an important heat reuse project that will capture excess heat from our datacenters and transfer it to the local district heating system, where it can be used for both domestic and commercial purposes.

To reduce waste from our datacenters in EMEA, we opened a Circular Center in Amsterdam in 2020, which has since delivered an 83 percent reuse rate for end-of-life datacenter assets and components. This is progress towards our target of 90 percent reuse and recycling of all servers and components for all cloud hardware by 2025. In addition, in January 2022, we opened a Circular Center in Dublin, Ireland, and have plans to open another Circular Center in Sweden to serve the region.

As we continue to seek out efficiencies in our operations, we recently turned to nature for inspiration, asking how much of the natural ecosystem we could replenish on the site of a datacenter. The goal is to integrate the datacenter into nature, renewing and revitalizing the surrounding area to provide regenerative value for the local community and environment. In the Netherlands, we have begun constructing a lowland forested area and forested wetland around the datacenter to support the growth of native plants, mirroring a healthy, resilient ecosystem, supporting biodiversity, improving stormwater control, and preventing erosion.

Updates in Asia Pacific

Finally, I’d like to highlight some of the sustainability investments we have made across Asia Pacific. In June 2022, we launched our Singapore Circular Center, which is capable of processing up to 3,000 servers per month for reuse, or 36,000 servers annually. We have plans to open additional Circular Centers in Australia and South Korea in fiscal year 2025 and beyond. Across our datacenters in APAC, we have formed partnerships with local energy providers for renewable energy sourced from wind, solar, and hydro power, and we plan to further these partnerships and investments in renewable energy. In our forthcoming datacenter region in New Zealand, we have signed an agreement that will enable Microsoft to power all of its datacenters in the region with 100 percent renewable energy from the day the region opens.

Innovating to design the hyperscale datacenter of the future

What these examples from across our global datacenter portfolio show is our ongoing commitment to make our global Microsoft datacenters more sustainable and efficient, enabling our customers to do more with less.

Our objective moving forward is to continue providing transparency across the entire datacenter lifecycle about how we infuse principles of reliability, sustainability, and innovation at each step of the datacenter design, construction, and operations process.

Design: How do we ensure we design for reliability, efficiency, and sustainability, to help reduce our customers' Scope 3 emissions?
Construction: How do we reduce embodied carbon and create a reliable supply chain?
Operation: How do we infuse innovative green technologies to decarbonize and operate to the efficient design standards?
Decommissioning: How do we recycle and reuse materials in our datacenters?
Community: How do we partner with the community and operate as good neighbors?

We have started by sharing datacenter region-specific data around carbon, water, waste, ecosystems, and community development and we will continue to provide updates as Microsoft makes further investments globally.

Learn more

You can learn more about our global datacenter footprint across the 60+ datacenter regions by visiting datacenters.microsoft.com.

View our Microsoft datacenter factsheets.
Learn more about Azure sustainability.
Discover new sustainability guidance in the Azure Well-Architected Framework.
Take a virtual tour of Microsoft’s datacenters.

Source: Azure

Build a globally resilient architecture with Azure Load Balancer

Azure Load Balancer’s global tier is a cloud-native global network load balancing solution. With cross-region Load Balancer, customers can distribute traffic across multiple Azure regions with ultra-low latency and high performance. To better understand the use case of Azure’s cross-region Load Balancer, let’s dive deeper into a customer scenario. In this blog, we’ll learn about a customer, their use case, and how Azure Load Balancer came to the rescue.

Who can benefit from Azure Load Balancer?

This example customer is a software vendor in the automotive industry. Their current product offerings are cloud-based software, focused on helping vehicle dealerships manage all aspects of their business including sales leads, vehicles, and customer accounts. While it is a global company, most of its business is done in Europe, the United Kingdom (UK), and Asia Pacific regions. To support its global business, the customer utilizes a wide range of Azure services including virtual machines (VMs), a variety of platform as a service (PaaS) solutions, Load Balancer, and MySQL to help meet an ever-growing demand.

What are the current global load balancing solutions?

The customer is using domain name service (DNS)-based traffic distribution to direct traffic to multiple Azure regions. At each Azure region, they deploy regional Azure Load Balancers to distribute traffic across a set of virtual machines. However, if a region went down, they experienced downtime due to DNS caching. Although minimal, this was not a risk they could continue to take on as their business expanded globally.

What are the problems with the current solutions?

Since the customer’s solution is global, as traffic increased, they noticed high latency when requesting information from their endpoints across regions. For example, users located in Africa often had their requests routed to an Azure region on another continent, which caused the high latency. Answering requests with low latency is a critical business requirement for business continuity. As a result, they needed a solution that could withstand regional failures while simultaneously providing ultra-low latency and high performance.

How did Azure’s cross-region Load Balancer help?

Given that low latency is a requirement for the customer, a global layer 4 load balancer was a perfect solution to the problem. The customer deployed Azure’s cross-region Load Balancer, which gave them a single, globally anycast IP address to load balance across their regional deployments. With Azure’s cross-region Load Balancer, traffic is distributed to the closest region, ensuring low latency when using the service. For example, if a user connects from the Asia Pacific region, traffic is automatically routed to the closest participating region, in this case Southeast Asia. The customer was able to add all their regional load balancers to the backend of the cross-region Load Balancer and thus improved latency without any additional downtime. Before rolling the update out across all regions, the customer verified that their metrics for data path availability and health probe status were 100 percent on both the cross-region Load Balancer and each regional load balancer.
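At the resource level, a cross-region Load Balancer is a Standard load balancer whose SKU tier is set to Global, and whose backend pool entries reference the frontend IP configurations of regional load balancers rather than individual VMs. The sketch below builds such a resource body in Python purely for illustration; the subscription ID, resource group, and all names and regions are hypothetical, and in practice the body would be submitted through an ARM template, the Azure SDK, or the Azure CLI:

```python
SUB = "00000000-0000-0000-0000-000000000000"  # placeholder subscription ID
RG = "contoso-rg"                             # hypothetical resource group

def regional_frontend_id(region_lb: str) -> str:
    """Resource ID of a regional load balancer's frontend IP configuration."""
    return (f"/subscriptions/{SUB}/resourceGroups/{RG}"
            f"/providers/Microsoft.Network/loadBalancers/{region_lb}"
            f"/frontendIPConfigurations/frontend")

global_lb = {
    "location": "westeurope",                       # home region of the global resource
    "sku": {"name": "Standard", "tier": "Global"},  # Global tier = cross-region LB
    "properties": {
        "frontendIPConfigurations": [{
            "name": "global-frontend",
            "properties": {"publicIPAddress": {
                "id": f"/subscriptions/{SUB}/resourceGroups/{RG}"
                      f"/providers/Microsoft.Network/publicIPAddresses/global-ip"}},
        }],
        "backendAddressPools": [{
            "name": "regional-lbs",
            "properties": {
                # Each backend address points at a regional LB's frontend,
                # not at a VM, which is what makes the distribution global.
                "loadBalancerBackendAddresses": [
                    {"name": lb,
                     "properties": {"loadBalancerFrontendIPConfiguration": {
                         "id": regional_frontend_id(lb)}}}
                    for lb in ("lb-westeurope", "lb-uksouth", "lb-southeastasia")
                ],
            },
        }],
    },
}
```

Note that the regional load balancers referenced in the backend pool must themselves be Standard SKU with public frontends for this pattern to apply.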
 

After deploying cross-region Load Balancer, traffic is now distributed with ultra-low latency across regions. Because the cross-region Load Balancer is a network load balancer, only the TCP/UDP headers are inspected rather than the entire packet, and traffic is always sent to the participating Azure region closest to the client. As a result, the customer now sees traffic served with lower latency than before.

Learn More

Visit the Cross-region load balancer overview to learn more about Azure’s cross-region Load Balancer and how it can fit into your architecture.
Source: Azure

Secure your digital payment system in the cloud with Azure Payment HSM—now generally available

We are very excited to announce the general availability of Azure Payment HSM, a BareMetal Infrastructure as a service (IaaS) offering that gives customers native access to payment HSMs in the Azure cloud. With Azure Payment HSM, customers can seamlessly migrate PCI workloads to Azure and meet the stringent security, audit compliance, low-latency, and high-performance requirements of the Payment Card Industry (PCI).

Azure Payment HSM service empowers service providers and financial institutions to accelerate their payment system’s digital transformation strategy and adopt the public cloud.

“Payment HSM support in the public cloud is one of the most significant hurdles to overcome in moving payment systems to the public cloud.  While there are many different solutions, none can meet the stringent requirements required for a payment system. Microsoft, working with Thales, stepped up to provide a payment HSM solution that could meet the modernization ambitions of ACI Worldwide’s technology platform. It has been a pleasure working with both teams to bring this solution to reality."
—Timothy White, Chief Architect, Retail Payments and Cloud

Service overview

The Azure Payment HSM solution is delivered using Thales payShield 10K payment HSMs, which offer single-tenant devices and full remote management capabilities. The service is designed to give the customer total control, with strict role and data separation between Microsoft and the customer. HSMs are provisioned and connected directly to the customer’s virtual network and are under the customer’s sole administrative control. Once allocated, Microsoft’s administrative access is limited to “Operator” mode, and full responsibility for configuration and maintenance of the HSM and software falls upon the customer. When the HSM is no longer required and the device is returned to Microsoft, customer data is erased to ensure privacy and security. The solution comes with a Thales payShield premium package license and an enhanced support plan, with a direct relationship between the customer and Thales.

 

Figure 1: After the HSM is provisioned, the device is connected directly to the customer’s virtual network, with full remote HSM management through Thales payShield Manager and the Trusted Management Device (TMD).

The customer can quickly add more HSM capacity on demand and subscribe to the highest performance level (up to 2,500 CPS) for mission-critical, low-latency payment applications. The customer can upgrade or downgrade the HSM performance level based on business needs without interrupting production usage. HSMs can be easily provisioned as a pair of devices and configured for high availability.

Azure remains committed to helping customers achieve compliance with the Payment Card Industry’s leading compliance certifications. Azure Payment HSM is certified across stringent security and compliance requirements established by the PCI Security Standards Council (PCI SSC) including PCI DSS, PCI 3DS, and PCI PIN. Thales payShield 10K HSMs are certified to FIPS 140-2 Level 3 and PCI HSM v3. Azure Payment HSM customers can significantly reduce their compliance time, efforts, and cost by leveraging the shared responsibility matrix from Azure’s PCI Attestation of Compliance (AOC).

Typical use cases

Financial institutions and service providers in the payment ecosystem including issuers, service providers, acquirers, processors, and payment networks will benefit from Azure Payment HSM. Azure Payment HSM enables a wide range of use cases, such as payment processing, which allows card and mobile payment authorization and 3D-Secure authentication; payment credential issuing for cards, wearables, and connected devices; securing keys and authentication data and sensitive data protection for point-to-point encryption, security tokenization, and EMV payment tokenization.

Get started

Azure Payment HSM is available at launch in the following regions: East US, West US, South Central US, Central US, North Europe, and West Europe.

As Azure Payment HSM is a specialized service, customers should ask their Microsoft account manager and CSA to send the request via email.

Learn more about Azure Payment HSM

Azure Payment HSM.
Azure Payment HSM documentation.
Thales payShield 10K.
Thales payShield Manager.
Thales payShield Trusted Management Device.

To download PCI certification reports and shared responsibility matrices:

Azure PCI PIN AOC.
Azure PCI DSS AOC.
Azure PCI 3DS AOC.

Source: Azure