Generative AI and the path to personalized medicine with Microsoft Azure

Transforming care for patients and providers alike with Azure OpenAI Service

In the rapidly evolving landscape of healthcare, the integration of artificial intelligence (AI) isn’t a futuristic vision: It’s a present reality. Azure OpenAI Service is transforming how care is delivered and experienced by patients and providers alike. As healthcare providers and tech companies collaborate to harness the power of generative AI, they pave the way for more efficient, accessible, and personalized healthcare solutions.

There’s never been a better time to examine how Azure OpenAI Service could assist the healthcare sector. After all, the potential benefits are staggering, ranging from a 50% reduction in treatment costs to a 40% improvement in health outcomes.

A recent report from MarketsandMarkets estimates that the AI in healthcare market will surge from USD 20.9 billion in 2024 to USD 148.4 billion by 2029. As seen in the use cases that follow, the scalability and efficiency of AI applications in healthcare are positively impacting patient and healthcare workforce experiences and making personalized medicine a reality for staff and patients around the globe.


Below, let’s look at Microsoft partners that have made it their mission to use Azure OpenAI Service to drive better outcomes for patients and those who serve them.

Kry is providing healthcare for all with Azure

Challenge: Kry sought to leverage technology to improve accessibility, personalize patient care, and alleviate the administrative burdens on clinicians, all while making the most of limited healthcare resources.

Azure OpenAI Service Solution: Kry used Azure OpenAI Service to develop AI-driven tools that make healthcare services easier for patients to access, addressing its immediate challenges of accessibility, personalization, and administrative efficiency. By integrating generative AI into its platform, Kry can offer services in over 30 languages, making healthcare accessible to a global population. Customization ensures that patients receive relevant and effective treatments, improving outcomes and patient satisfaction. And by automating routine tasks and streamlining patient data management, clinicians can devote more time to patient care rather than paperwork.

“AI allows Kry clinicians to focus on delivering better care, while ensuring patients can access the advice, care, and treatment they need in the most efficient way.”
—Fredrik Nylander, Chief Technology Officer at Kry

This not only enhances the efficiency of healthcare delivery but also addresses the issue of physician burnout. This approach has enabled Kry to become Europe’s largest digital-first healthcare provider, managing over 200 million patient interactions. Through the efficient use of generative AI, more patients can be served with the same or even fewer resources, significantly expanding the impact of healthcare services. 

TatvaCare is utilizing Azure OpenAI to promote patient-centric care

Challenge: Faced with growing healthcare demands and the complexity of managing chronic conditions, TatvaCare sought to enable more efficient, patient-centric healthcare practices and improved health outcomes.

Azure OpenAI Service Solution: TatvaCare tackled this multifaceted challenge head-on by adopting Azure OpenAI to power AskAI, its intelligent AI assistant. AskAI was designed to understand and process patient inquiries and provide feedback in real time, offering personalized care plans that align with individual patient needs and preferences. The efficiency gained through this automation means that TatvaCare can focus more on direct patient care rather than administrative tasks; for example, about 180,000 prescriptions are currently generated on the platform per month. AskAI also plays a crucial role in reducing communication gaps between doctors and patients. The assistant’s ability to generate highly accurate, personalized responses and recommendations streamlines the process of managing health, making it less cumbersome and more user-friendly for patients at all stages of their healthcare journey. Azure’s AI capabilities not only provide a platform for more efficient interaction between patients and healthcare providers, but also uphold high standards of data security and compliance: Azure’s robust security frameworks help ensure that all patient interactions and data handled by AskAI are protected against unauthorized access, maintaining patient confidentiality and trust.

Providence is freeing up time for caregivers with the help of Azure AI

Challenge: Increasingly burdened by administrative tasks and an overwhelming volume of patient communications, how could Providence, a leading healthcare organization, leverage technology to streamline processes, improve the efficiency of message handling, and ultimately free up caregivers to focus more on direct patient care? 

Azure OpenAI Service Solution: Providence found an effective solution to these challenges through the development and deployment of ProvARIA, a cutting-edge AI system powered by Azure OpenAI Service. Specifically designed to manage the deluge of incoming messages received by healthcare providers, ProvARIA categorizes messages based on content and urgency, ensuring that critical patient communications are prioritized and addressed promptly. Because administrative workload can often detract from patient care, ProvARIA integrates deeply into clinical workflows, providing context-specific recommendations and quick action shortcuts. These features streamline various administrative tasks, from scheduling to patient follow-up, reducing the time and effort required to complete them. With ProvARIA handling the sorting and prioritization of messages, caregivers can allocate more time to face-to-face interactions with patients. The successful pilot of ProvARIA in several Providence clinics showcased notable improvements in message processing times and caregiver efficiency.
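The blog doesn’t publish ProvARIA’s implementation, but the underlying pattern (classifying a message by topic and urgency with a chat model) is straightforward to sketch. Below is a minimal, hypothetical Python example using the Azure OpenAI chat completions API; the endpoint, key, deployment name, and category labels are illustrative placeholders, not ProvARIA internals.

```python
# Hypothetical sketch of message triage with an Azure OpenAI chat model.
# Endpoint, key, deployment name, and category labels are placeholders.
import json

from openai import AzureOpenAI  # pip install openai

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

SYSTEM_PROMPT = (
    "You triage patient portal messages. Respond with JSON only: "
    '{"category": "refill|scheduling|clinical|billing", '
    '"urgency": "routine|urgent|emergency"}'
)

def triage(message: str) -> dict:
    response = client.chat.completions.create(
        model="<chat-deployment-name>",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": message},
        ],
        temperature=0,  # deterministic labels make downstream routing stable
    )
    return json.loads(response.choices[0].message.content)

print(triage("I've had chest pain since this morning and feel dizzy."))
```

A production system would validate the model’s output against the allowed labels and route anything resembling an emergency to a human immediately.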

The impact of Microsoft AI in healthcare

Microsoft’s subsidiary Nuance uses conversational, ambient, and generative AI for its Dragon Ambient eXperience (DAX) Copilot solution. DAX Copilot automatically documents patient encounters accurately and efficiently at the point of care, alleviating administrative burdens, improving clinician well-being, and enhancing the patient experience.

The advent of Azure OpenAI Service is providing solutions that can assist in addressing the diverse and complex needs of the global healthcare sector. From reducing administrative burdens on clinicians to enhancing the overall quality of patient care, the impact of AI in healthcare is profound and far-reaching.

Our commitment to responsible AI

At Microsoft, we are guided by our AI principles and Responsible AI Standard along with decades of research on AI, grounding, and privacy-preserving machine learning. A multidisciplinary team of researchers, engineers, and policy experts reviews our AI systems for potential harms and mitigations: refining training data, filtering to limit harmful content, blocking sensitive topics in queries and results, and applying Microsoft technologies like Azure AI Content Safety, InterpretML, and Fairlearn. We make it clear how the system makes decisions by noting limitations, linking to sources, and prompting users to review, fact-check, and adjust content based on subject-matter expertise.

Get started with Azure OpenAI Service

Apply for access to Azure OpenAI Service by completing this form. 

Learn about Azure OpenAI Service and the latest enhancements. 

Get started with GPT-4 in Azure OpenAI Service in Microsoft Learn. 

Read our partner announcement blog, empowering partners to develop AI-powered apps and experiences with ChatGPT in Azure OpenAI Service. 

Learn how to use the new Chat Completions API (in preview) and model versions for ChatGPT and GPT-4 models in Azure OpenAI Service.

Learn more about Azure AI Content Safety.


AI-powered dialogues: Global telecommunications with Azure OpenAI Service

In an era where digital innovation is king, the integration of Microsoft Azure OpenAI Service is cutting through the static of the telecommunications sector. Industry leaders like Windstream, AudioCodes, AT&T, and Vodafone are leveraging AI to better engage with their customers and streamline their operations. These companies are pioneering the use of AI to not only enhance the quality of customer interactions but also to optimize their internal processes—demonstrating a unified vision for a future where digital and human interactions blend seamlessly.  


Leveraging Azure OpenAI Service to enhance communication

Below we look at four companies that have strategically adopted Azure OpenAI Service to create more dynamic, efficient, and personalized communication methods for customers and employees alike.

1. Windstream’s AI-powered transformation streamlines operational efficiencies: Windstream sought to revolutionize its operations, enhancing workflow efficiency and customer service.  

Windstream streamlined workflows and improved service quality by analyzing customer calls and interactions with AI-powered analytics, providing insights into customer sentiments and needs. This approach extends to customer communications, where technical data is transformed into understandable outage notifications, bolstering transparency and customer trust. Internally, Windstream has capitalized on AI for knowledge management, creating a custom-built generative pre-trained transformer (GPT) platform within Microsoft Azure Kubernetes Service (AKS) to index a vast repository of documents and make it accessible, which enhances decision-making and operational efficiency across the company. The adoption of AI has facilitated rapid, self-sufficient onboarding processes, and plans are underway to extend AI benefits to field technicians by providing real-time troubleshooting assistance through an AI-enhanced index of technical documents. Windstream’s strategic focus on AI underscores the company’s commitment to innovation, operational excellence, and superior customer service in the telecommunications environment.
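Windstream’s analytics stack isn’t detailed in this post, but one building block of the call-analysis pattern described above, scoring interaction text for sentiment, can be sketched with the Azure AI Language SDK. This is an illustrative example only; the endpoint, key, and sample transcripts are placeholders.

```python
# Illustrative only (not Windstream's system): scoring interaction text for
# sentiment with Azure AI Language. Endpoint and key are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient  # pip install azure-ai-textanalytics

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

transcripts = [
    "My internet has been down twice this week and nobody called me back.",
    "The technician fixed everything quickly. Great service!",
]

for result in client.analyze_sentiment(transcripts):
    if not result.is_error:
        # Overall label plus per-class confidence scores for trend reporting
        print(result.sentiment, result.confidence_scores)
```

Aggregated across thousands of calls, scores like these yield the kind of customer-sentiment insights described above.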


2. AT&T automates for efficiency and connectivity with Azure OpenAI Service: AT&T sought to boost productivity, enhance the work environment, and reduce operational costs. 

AT&T is leveraging Azure OpenAI Service to automate business processes and enhance both employee and customer experiences, aligning with its core purpose of fostering connections across various aspects of life including work, health, education, and entertainment. This strategic integration of Azure and AI technologies into their operations allows the company to streamline IT tasks and swiftly respond to basic human resources inquiries. In its quest to become the premier broadband provider in the United States and make the internet universally accessible, AT&T is committed to driving operational efficiency and better service through technology. The company is employing Azure OpenAI Service for various applications, including assisting IT professionals in managing resources, facilitating the migration of legacy code to modern frameworks to spur developer productivity, and enabling employees to effortlessly complete routine human resources tasks. These initiatives allow AT&T staff to concentrate on more complex and value-added activities, enhancing the quality of customer service. Jeremy Legg, AT&T’s Chief Technology Officer, highlights the significance of automating common tasks with Azure OpenAI Service, noting the potential for substantial time and cost savings in this innovative operational shift. 

3. Vodafone revolutionizes customer service with TOBi and Microsoft Azure AI: Vodafone sought to lower development costs, quickly enter new markets, and improve customer satisfaction with more accurate and personable interactions. 

Vodafone, a global telecommunications giant, has embarked on a digital transformation journey, central to which is the development of TOBi, a digital assistant created using Azure services. TOBi, designed to provide swift and engaging customer support, has been customized and expanded to operate in 15 languages across multiple markets. This move not only accelerates Vodafone’s ability to enter new markets but also significantly lowers development costs and improves customer satisfaction by providing more accurate and personable interactions. The assistant’s success is underpinned by Azure Cognitive Services, which enables it to understand and process natural language, making interactions smooth and intuitive. Furthermore, Vodafone’s initiative to leverage the new conversational language understanding feature from Microsoft demonstrates its forward-thinking approach to providing multilingual support, notably in South Africa, where TOBi will soon support Zulu among other languages. This expansion is not just about broadening the linguistic reach but also about fine-tuning TOBi’s conversational abilities to recognize slang and discern between similar requests, thereby personalizing the customer experience.  
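TOBi’s implementation isn’t shown in the blog, but the conversational language understanding (CLU) feature it mentions is exposed through the Azure AI Language SDK. Here is a minimal, hedged sketch of extracting an intent from a customer utterance; the endpoint, key, project and deployment names, and the example intent are placeholders for a trained CLU project.

```python
# Hedged sketch: extracting an intent with conversational language
# understanding (CLU). The endpoint, key, project and deployment names, and
# the intent shown in the comment are placeholders for a trained CLU project.
from azure.core.credentials import AzureKeyCredential
from azure.ai.language.conversations import ConversationAnalysisClient  # pip install azure-ai-language-conversations

client = ConversationAnalysisClient(
    "https://<your-language-resource>.cognitiveservices.azure.com",
    AzureKeyCredential("<your-api-key>"),
)

result = client.analyze_conversation(
    task={
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "participantId": "customer",
                "text": "I need to top up my data bundle",
            }
        },
        "parameters": {
            "projectName": "<clu-project-name>",
            "deploymentName": "<clu-deployment-name>",
        },
    }
)

prediction = result["result"]["prediction"]
print(prediction["topIntent"])  # e.g., "TopUpData" in a hypothetical project
```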

4. AudioCodes leverages Microsoft Azure for enhanced communication: AudioCodes sought streamlined workflows, improved service level agreements (SLAs), and increased visibility. 

AudioCodes, a leader in voice communications solutions for over 30 years, migrated its solutions to Azure for faster deployment, reduced costs, and improved SLAs. The result? An enhanced ability to serve its extensive customer base, which includes more than half of the Fortune 100 enterprises. The company’s shift towards managed services and the development of applications aimed at enriching customer experiences is epitomized by AudioCodes Live, designed to facilitate the transition to Microsoft Teams Phone. AudioCodes has embraced cloud technologies, leveraging Azure services to streamline telephony workflows and create advanced applications for superior call handling, such as its Microsoft Teams-native contact center solution, Voca. By utilizing Azure AI, Voca offers enterprises robust customer interaction capabilities, including intelligent call routing and customer relationship management (CRM) integration. AudioCodes’ presence on Azure Marketplace has substantially increased its visibility, generating over 11 million usage hours a month from onboarded customers. The company plans to utilize Azure OpenAI Service in the future to bring generative AI capabilities into its solutions.

The AI-enhanced future of global telecommunications

The dawn of a new era in telecommunications is upon us, with industry pioneers like Windstream, AudioCodes, AT&T, and Vodafone leading the charge into a future where AI and Azure services redefine the essence of connectivity. Their collective journey not only highlights a shared commitment to enhancing customer experience and operational efficiency, but also paints a vivid picture of a world where communication transcends traditional boundaries, enabled by the fusion of cloud infrastructure and advanced AI technologies. This visionary approach is laying the groundwork for a paradigm where global communication is more seamless, intuitive, and impactful, demonstrating the unparalleled potential of AI to weave a more interconnected and efficient fabric of global interaction. 

Our commitment to responsible AI


With responsible AI tools in Azure, Microsoft is empowering organizations to build the next generation of AI apps safely and responsibly. Microsoft has announced the general availability of Azure AI Content Safety, a state-of-the-art AI system that helps organizations keep AI-generated content safe and create better online experiences for everyone. Customers, from startups to enterprises, are applying the capabilities of Azure AI Content Safety to social media, education, and employee engagement scenarios to help construct AI systems that operationalize fairness, privacy, security, and other responsible AI principles.

Get started with Azure OpenAI Service

Apply for access to Azure OpenAI Service by completing this form.  

Learn about Azure OpenAI Service and the latest enhancements.  

Get started with GPT-4 in Azure OpenAI Service in Microsoft Learn.  

Read our partner announcement blog, empowering partners to develop AI-powered apps and experiences with ChatGPT in Azure OpenAI Service.  

Learn how to use the new Chat Completions API (in preview) and model versions for ChatGPT and GPT-4 models in Azure OpenAI Service. 

Learn more about Azure AI Content Safety. 


Introducing Phi-3: Redefining what’s possible with SLMs

We are excited to introduce Phi-3, a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) available, outperforming models of the same size and next size up across a variety of language, reasoning, coding, and math benchmarks. This release expands the selection of high-quality models for customers, offering more practical choices as they compose and build generative AI applications.

Starting today, Phi-3-mini, a 3.8B language model, is available on Microsoft Azure AI Studio, Hugging Face, and Ollama.

Phi-3-mini is available in two context-length variants—4K and 128K tokens. It is the first model in its class to support a context window of up to 128K tokens, with little impact on quality.

It is instruction-tuned, meaning that it’s trained to follow different types of instructions reflecting how people normally communicate. This ensures the model is ready to use out of the box.

It is available on Azure AI to take advantage of the deploy-eval-finetune toolchain, and is available on Ollama for developers to run locally on their laptops (see the sketch after this list).

It has been optimized for ONNX Runtime with support for Windows DirectML along with cross-platform support across graphics processing unit (GPU), CPU, and even mobile hardware.

It is also available as an NVIDIA NIM microservice with a standard API interface that can be deployed anywhere, and it has been optimized for NVIDIA GPUs.
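To make the local option concrete, here is a minimal sketch of prompting Phi-3-mini through Ollama’s local REST API. It assumes Ollama is installed, the daemon is running on its default port, and the model has been pulled beforehand (for example, with `ollama pull phi3`); the model tag and port follow Ollama’s conventions.

```python
# Minimal sketch: prompting a locally running Phi-3-mini through Ollama.
# Assumes the Ollama daemon is on its default port and the "phi3" model
# tag has been pulled beforehand.
import json
import urllib.request

payload = json.dumps({
    "model": "phi3",
    "prompt": "Explain what an instruction-tuned language model is in two sentences.",
    "stream": False,  # return a single JSON object instead of a token stream
}).encode("utf-8")

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])
```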

In the coming weeks, additional models will be added to the Phi-3 family to offer customers even more flexibility across the quality-cost curve. Phi-3-small (7B) and Phi-3-medium (14B) will be available in the Azure AI model catalog and other model gardens shortly.

Microsoft continues to offer the best models across the quality-cost curve and today’s Phi-3 release expands the selection of models with state-of-the-art small models.


Groundbreaking performance at a small size 

Phi-3 models significantly outperform language models of the same and larger sizes on key benchmarks (higher is better; see our technical paper for the full benchmark numbers). Phi-3-mini does better than models twice its size, and Phi-3-small and Phi-3-medium outperform much larger models, including GPT-3.5T.

All reported numbers are produced with the same pipeline to ensure that the numbers are comparable. As a result, these numbers may differ from other published numbers due to slight differences in the evaluation methodology. More details on benchmarks are provided in our technical paper. 

Note: Phi-3 models do not perform as well on factual knowledge benchmarks (such as TriviaQA) as the smaller model size results in less capacity to retain facts. 

Safety-first model design 


Phi-3 models were developed in accordance with the Microsoft Responsible AI Standard, which is a company-wide set of requirements based on the following six principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness. Phi-3 models underwent rigorous safety measurement and evaluation, red-teaming, sensitive use review, and adherence to security guidance to help ensure that these models are responsibly developed, tested, and deployed in alignment with Microsoft’s standards and best practices.  

Building on our prior work with Phi models (“Textbooks Are All You Need”), Phi-3 models are also trained using high-quality data. They were further improved with extensive safety post-training, including reinforcement learning from human feedback (RLHF), automated testing and evaluations across dozens of harm categories, and manual red-teaming. Our approach to safety training and evaluations is detailed in our technical paper, and we outline recommended uses and limitations in the model cards. See the model card collection.

Unlocking new capabilities 

Microsoft’s experience shipping copilots and enabling customers to transform their businesses with generative AI using Azure AI has highlighted the growing need for different-size models across the quality-cost curve for different tasks. Small language models, like Phi-3, are especially great for: 

Resource constrained environments including on-device and offline inference scenarios.

Latency bound scenarios where fast response times are critical.

Cost constrained use cases, particularly those with simpler tasks.

For more on small language models, see our Microsoft Source Blog.

Thanks to their smaller size, Phi-3 models can be used in compute-limited inference environments. Phi-3-mini, in particular, can be used on-device, especially when further optimized with ONNX Runtime for cross-platform availability. The smaller size of Phi-3 models also makes fine-tuning or customization easier and more affordable. In addition, their lower computational needs make them a lower cost option with much better latency. The longer context window enables taking in and reasoning over large text content—documents, web pages, code, and more. Phi-3-mini demonstrates strong reasoning and logic capabilities, making it a good candidate for analytical tasks. 

Customers are already building solutions with Phi-3. One example where Phi-3 is already demonstrating value is in agriculture, where internet access might not be readily available. Powerful small models like Phi-3, along with Microsoft copilot templates, are available to farmers at the point of need and provide the additional benefit of running at reduced cost, making AI technologies even more accessible.

ITC, a leading business conglomerate based in India, is leveraging Phi-3 as part of their continued collaboration with Microsoft on the copilot for Krishi Mitra, a farmer-facing app that reaches over a million farmers.

“Our goal with the Krishi Mitra copilot is to improve efficiency while maintaining the accuracy of a large language model. We are excited to partner with Microsoft on using fine-tuned versions of Phi-3 to meet both our goals—efficiency and accuracy!”
—Saif Naik, Head of Technology, ITCMAARS

Originating in Microsoft Research, Phi models have been broadly used, with Phi-2 downloaded over 2 million times. The Phi series of models has achieved remarkable performance through strategic data curation and innovative scaling: Phi-1 was a model used for Python coding, Phi-1.5 enhanced reasoning and understanding, and Phi-2, a 2.7 billion-parameter model, outperforms models up to 25 times its size in language comprehension.1 Each iteration has leveraged high-quality training data and knowledge transfer techniques to challenge conventional scaling laws.

Get started today 

To experience Phi-3 for yourself, start by playing with the model on Azure AI Playground. You can also find the model on the Hugging Chat playground. Start building with and customizing Phi-3 for your scenarios using Azure AI Studio. Join us to learn more about Phi-3 during a special live stream of the AI Show.

1 Microsoft Research Blog, Phi-2: The surprising power of small language models, December 12, 2023.

Cloud Cultures, Part 7: Creating balance in a digital world through precision and mindfulness in Japan

Innovate. Connect. Cultivate.

The Cloud Cultures series is an exploration of the intersection between cloud innovation and culture across the globe.

‘Mottainai,’ an idea deeply rooted in Japanese culture, is a call to respect resources and avoid waste. It goes beyond mere frugality; it’s an inherent recognition of the value of each item. I saw this ideology reflected everywhere during my trip to Japan for this episode of Cloud Cultures—in the transportation system, in my interactions with shop owners, even in the movements of a master sushi chef (or itamae) that I watched over the bar at an Omakase restaurant.

The same approach is applied to Japan’s technological innovations. While innovation is often associated with monumental breakthroughs or flashy advancements, I found that in Japan, innovation thrives in the simplest forms. It’s a philosophy woven into the Japanese way of life—a reverence for simplicity, mindfulness, and the intrinsic value of everything around us. Using the principles of precision and mindfulness, we can bridge the gaps between technology, design, and craftsmanship.

Japan invents with intention

I started my latest Cloud Cultures adventure in the Shinagawa district of Tokyo, wandering through narrow streets and sampling local specialties before meeting with Takeshi Numoto, Executive Vice President and Chief Marketing Officer at Microsoft. He’s a Japanese native and expert in ordering delicious meals—needless to say, I was thrilled to cohost this episode with Takeshi. We walked to an Omakase restaurant where we discussed the formalities of Japanese business culture. After sharing some baked oysters prepared with a culinary torch, I went to learn about one of Japan’s most renowned innovations: the railway system.

West Japan Railway Company prioritizes safety and innovation

For the 20 billion passengers riding trains in Japan each year, confidence in the precision and quality of the railway system is key. With over thirty-five years of operation, an impeccable safety record, and five million daily users, West Japan Railway Company (JR-West) embodies responsibility, prioritizing safety while envisioning the future of train technology. On Takeshi’s advice, I met with a member of the local Microsoft team, Toshie Ninomiya, Managing Executive Officer of Microsoft Japan. Together, we had the opportunity to sit down with Okuda Hideo, Director and Executive Officer of Digital Solution Headquarters of JR-West, and his team.

During the pandemic, JR-West experienced an 89% decline in passengers. This created a sense of urgency—and ultimately, demanded a mindset shift. They pivoted their business strategy from filling as many seats as possible to curating a unique, personal experience for their riders. With their customers in mind, JR-West implemented cloud and AI solutions to become a more data-driven company. Now, they store customer transport and purchase data on the cloud, which can be analyzed to unlock insights that enable better customer experiences.

After our visit with JR-West, it became clear to me why Japan’s railway system is recognized worldwide. Through meticulous attention to detail and data-driven insights, JR-West ensures that their services exceed customer expectations while maintaining impeccable standards of efficiency and safety.

Sony Group uses cloud technology to embrace AI

While Toshie and I learned how JR-West uses AI to benefit their customers, Takeshi visited Sony headquarters to meet up with an old friend, Tsuyoshi Kodera, Executive Vice President, CDO and CIO of Sony Corporation.

In 1946, with only 190,000 yen and a team of 20 employees, Masaru Ibuka founded Sony with a vision of “establishing an ideal factory that stresses a spirit of freedom and open mindedness that will, through technology, contribute to Japanese culture.” Reflecting Sony’s unwavering commitment to pushing boundaries and achieving the unprecedented, the company has consistently introduced groundbreaking products, often claiming the titles of ‘Japan’s first’ and ‘world’s first.’

Throughout our journey, we were reminded of the importance of collaboration and balance. Beyond their partnership with Microsoft, Sony has expanded their AI strategies in exceptional ways. For example, they’ve implemented new autofocus capabilities in their cameras, integrated AI throughout their factory manufacturing lines, and even incorporated it into their famous Gran Turismo series on PlayStation. Sony is looking toward the future of AI and cloud capabilities to create deeper experiences, build lifelong customers, and help democratize innovative technologies.

NISSIN FOODS Group modernizes with AI to improve efficiency and productivity

For our final stop, Toshie and I visited the CUPNOODLES MUSEUM in Yokohama to meet with NISSIN FOODS Group’s Chief Information Officer, Toshihiro Narita, and discuss how the company is fusing food with cloud computing and shaping future food trends.

Since launching the world’s first instant noodles, NISSIN FOODS Group’s vision has remained consistent: contribute to society and the earth by gratifying people everywhere with pleasures and delights food can provide. And despite not being a traditional tech company, NISSIN FOODS Group is embracing digital transformation wholeheartedly.

As part of its program to support sustainable growth, NISSIN FOODS Group aims to improve labor productivity through efficiency. To this end, they’ve implemented various initiatives to improve in-office productivity. This includes migrating to the cloud using Microsoft 365 to adopt a remote work-enabled environment and improve cybersecurity hygiene to continuously maintain the health of the company’s IT estate. Cloud and AI technologies are helping NISSIN FOODS Group’s employees be more productive, giving them more time back to focus on creative work, and in turn, create new food cultures.

After enjoying our custom CUPNOODLES from the My CUPNOODLES Factory, Toshie and I had a chance to reflect on the unique approach NISSIN FOODS Group is taking. They are showcasing how every company can encourage innovation by modernizing with cloud technology, while staying true to their roots. From CUPNOODLES to cutting-edge food tech, innovation knows no bounds in Japan, offering a flavorful glimpse into the future of both technology and gastronomy.

Building a future that embraces tradition

From digital invention to food accessibility and revolutionizing transportation, Japan continues to evolve with a focus on precision and mindfulness. I see a culture that blends tradition and innovation, forging a future that honors heritage while embracing progress. Inspired by the incredible sights and smells of my journey—not to mention the insightful leaders I’ve met along the way—I carry with me a renewed perspective on how we can build a digital world rooted in intentionality and craftsmanship.


Learn more

Watch more Cloud Cultures episodes

Find the Azure geography that meets your needs

Learn more about Microsoft customers in Japan

Discover how Microsoft is investing in AI in Japan


Azure IoT’s industrial transformation strategy on display at Hannover Messe 2024

Running and transforming a successful enterprise is like being the coach of a championship-winning sports team. To win the trophy, you need a strategy, game plans, and the ability to bring all the players together. In the early days of training, coaches relied on basic drills, manual strategies, and simple equipment. But as technology advanced, so did the art of coaching. Today, coaches use data-driven training programs, performance tracking technology, and sophisticated game strategies to achieve unimaginable performance and secure victories. 

We see a similar change happening in industrial production management and performance and we are excited to showcase how we are innovating with our products and services to help you succeed in the modern era. Microsoft recently launched two accelerators for industrial transformation:

Azure’s adaptive cloud approach—a new strategy

Azure IoT Operations (preview)—a new product

Our adaptive cloud approach connects teams, systems, and sites through consistent management tools, development patterns, and insight generation. Putting the adaptive cloud approach into practice, IoT Operations leverages open standards and works with Microsoft Fabric to create a common data foundation for IT and operational technology (OT) collaboration.


We will be demonstrating these accelerators in the Microsoft booth at Hannover Messe 2024, presenting the new approach on the Microsoft stage, and will be ready to share exciting partnership announcements that enable interoperability in the industry.  

Here’s a preview of what you can look forward to at the event from Azure IoT. 

Experience the future of automation with IoT Operations 

Using our adaptive cloud approach, we’ve built a robotic assembly line demonstration that puts together car battery parts for attendees of the event. This production line is partner-enabled and features a standard OT environment, including solutions from Rockwell Automation and PTC. IoT Operations was used to build a monitoring solution for the robots because it embraces industry standards, like Open Platform Communications Unified Architecture (OPC UA), and integrates with existing infrastructure to connect data from an array of OT devices and systems and route it to the right places and people. IoT Operations processes data at the edge for local use by multiple applications and sends insights to the cloud for use by multiple applications there too, reducing data fragmentation.
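The demonstration’s internals aren’t published in this post, but the data flow it describes (normalized OT telemetry published at the edge for both local and cloud-bound consumers) can be sketched. The broker address, topic layout, and payload shape below are illustrative assumptions, not IoT Operations configuration.

```python
# Illustrative sketch only: publishing normalized shop-floor telemetry to an
# edge MQTT broker, the kind of IT/OT data flow described above. The broker
# address, topic layout, and payload shape are assumptions for illustration.
import json
import time

import paho.mqtt.client as mqtt  # pip install "paho-mqtt<2" (1.x API used here)

client = mqtt.Client(client_id="robot-cell-01")
client.connect("localhost", 1883)  # edge broker; TLS and auth omitted for brevity
client.loop_start()

telemetry = {
    "assetId": "battery-assembly-robot-01",
    "source": "opcua",          # values originally read from an OPC UA server
    "timestamp": time.time(),
    "spindleTemperatureC": 41.7,
    "cycleTimeSeconds": 12.4,
}

# A per-asset topic keeps edge subscribers (dashboards, anomaly detection)
# and cloud-bound pipelines decoupled from one another.
client.publish("factory/line1/robot01/telemetry", json.dumps(telemetry), qos=1)

client.loop_stop()
client.disconnect()
```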

For those attending Hannover Messe 2024, head to the center of the Microsoft booth and look for the station “Achieve industrial transformation across the value chain.”  

Watch this video to see how IoT Operations and the adaptive cloud approach build a common data foundation for an industrial equipment manufacturer.

Consult with Azure experts on IT and OT collaboration tools 

Find out how Microsoft Azure’s open and standardized strategy, an adaptive cloud approach, can help you reach the next stage of industrial transformation. Our experts will help your team collect data from assets and systems on the shop floor, compute at the edge, integrate that data into multiple solutions, and create production analytics on a global scale. Whether you’re just starting to connect and digitize your operations, or you’re ready to analyze and reason with your data, make predictions, and apply AI, we’re here to assist.  

For those attending Hannover Messe 2024, these experts are located at the demonstration called “Scale solutions and interoperate with IoT, edge, and cloud innovation.” 

Check out Jumpstart to get your collaboration environment up and running. In May 2024, Jumpstart will have a comprehensive scenario designed for manufacturing.

Attend a presentation on modernizing the shop floor  

We will share the results of a survey on the latest trends, technologies, and priorities for manufacturing companies wanting to efficiently manage their data to prepare for AI and accelerate industrial transformation. 73% of manufacturers agreed that a scalable technology stack is an important paradigm for the future of factories.1 To make that a reality, manufacturers are making changes to modernize, such as adopting containerization, shifting to central management of devices, and emphasizing IT and OT collaboration tools. These modernization trends can maximize the ROI of existing infrastructure and solutions, enhance security, and apply AI at the edge. 

This presentation, “How manufacturers prepare shopfloors for a future with AI,” will take place in the Microsoft theater at our booth in Hall 17 on Monday, April 22, 2024, at 2:00 PM CEST at Hannover Messe 2024.

For those who cannot attend, you can sign up to receive a notification when the full report is out.

Learn about actions and initiatives driving interoperability  

Microsoft is strengthening and supporting the industrial ecosystem to enable at-scale transformation and interoperate solutions. Our adaptive cloud approach both incorporates existing investments in partner technology and builds a foundation for consistent deployment patterns and repeatability for scale.  

Our ecosystem of partners

Microsoft is building an ecosystem of connectivity partners to modernize industrial systems and devices. These partners provide data translation and normalization services across heterogeneous environments for a seamless and secure data flow on the shop floor, and from the shop floor to the cloud. We leverage open standards and provide consistent control and management capabilities for OT and IT assets. To date, we have established integrations with Advantech, Softing, and PTC. 

Siemens and Microsoft have announced the convergence of the Digital Twin Definition Language (DTDL) with the W3C Web of Things standard. This convergence will help consolidate digital twin definitions for assets in the industry and enable new technology innovation like automatic asset onboarding with the help of generative AI technologies.

Microsoft embraces open standards and interoperability. Our adaptive cloud approach is based on those principles. We are thrilled to join project Margo, a new ecosystem-led initiative, that will help industrial customers achieve their digital transformation goals with greater speed and efficiency. Margo will define how edge applications, edge devices, and edge orchestration software interoperate with each other with increased flexibility. Read more about this important initiative. 

Discover solutions with Microsoft

Visit our booth and speak with our experts to reach new heights of industrial transformation and prepare the shop floor for AI. Together, we will maximize your existing investments and drive scale in the industry. We look forward to working with you.

Azure IoT Operations

Azure IoT

Hannover Messe 2024

1 IoT Analytics, “Accelerate industrial transformation: How manufacturers prepare shopfloor for AI”, May 2023.

Microsoft Entra resilience update: Workload identity authentication

Microsoft Entra is not only the identity system for users; it’s also the identity and access management (IAM) system for Azure-based services, all internal infrastructure services at Microsoft, and our customers’ workload identities. This is why our 99.99% service-level promise extends to workload identity authentication, and why we continue to improve our service’s resilience through a multilayered approach that includes the backup authentication system. 

In 2021, we introduced the backup authentication system as an industry-first innovation that automatically and transparently handles authentications for supported workloads when the primary Microsoft Entra ID service is degraded or unavailable. Through 2022 and 2023, we continued to expand the coverage of the backup service across clouds and application types.

Today, we’ll build on our resilience blog post series by going further in sharing how workload identities gain resilience from regionally isolated authentication endpoints as well as from the backup authentication system. We’ll explore two complementary methods that best fit our regional-global infrastructure. One example of workload identity authentication is when an Azure virtual machine (VM) authenticates its identity to Azure Storage. Another example is when one of our customers’ workloads authenticates to application programming interfaces (APIs).

Regionally isolated authentication endpoints 

Regionally isolated authentication endpoints provide region-isolated authentication services to an Azure region. All frequently used identities will authenticate successfully without dependencies on other Azure regions. Essentially, they are the primary endpoints for Azure infrastructure services as well as for managed identities in Azure (Managed identities for Azure resources – Microsoft Entra ID | Microsoft Learn). Managed identities help prevent out-of-region failures by consolidating service dependencies, and improve resilience by handling certificate expiry, rotation, and trust.

This layer of protection and isolation does not need any configuration changes from Azure customers. Key Azure infrastructure services have already adopted it, and it’s integrated with the managed identities service to protect the customer workloads that depend on it. 

How regionally isolated authentication endpoints work 

Each Azure region is assigned a unique endpoint for workload identity authentication. The region is served by a regionally collocated, special instance of Microsoft Entra ID. The regional instance relies on caching metadata (for example, directory data that is needed to issue tokens locally) to respond efficiently and resiliently to the workload identity’s authentication requests. This lightweight design reduces dependencies on other services and improves resilience by allowing the entire authentication to be completed within a single region. Data in the local cache is proactively refreshed. 

The regional service depends on Microsoft Entra ID’s global service to update and refill caches when it lacks the data it needs (a cache miss) or when it detects a change in the security posture for a supported service. If the regional service experiences an outage, requests are served seamlessly by Microsoft Entra ID’s global service, making the regional service interruption invisible to the customers.  
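As a conceptual illustration only (not Microsoft’s implementation), the regional-global interplay described above resembles a cache-aside service: answer from the regional cache when possible, and refill from the global service on a miss. All names below are illustrative.

```python
# Conceptual sketch of a regional token service with a global fallback.
# All names and the signing stand-in are illustrative, not Entra internals.
import time

class RegionalTokenService:
    def __init__(self, global_service, ttl_seconds=3600):
        self.global_service = global_service
        self.ttl = ttl_seconds
        self.cache = {}  # identity -> (metadata, fetched_at)

    def issue_token(self, identity: str) -> dict:
        entry = self.cache.get(identity)
        if entry and time.time() - entry[1] < self.ttl:
            metadata = entry[0]  # cache hit: served entirely within the region
        else:
            # Cache miss: refill from the global service, then serve locally.
            metadata = self.global_service.fetch_metadata(identity)
            self.cache[identity] = (metadata, time.time())
        return {"claims": metadata, "signed": True}  # stand-in for real signing
```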

Performant, resilient, and widely available 

The service has proven itself since 2020 and now serves six billion requests per day across the globe. The regional endpoints, working with global services, exceed a 99.99% SLA. The resilience of Azure infrastructure is further protected by workload-side caches kept by Azure client SDKs. Together, the regional and global services have managed to make most service degradations undetectable by dependent infrastructure services. Post-incident recovery is handled automatically. Regional isolation is supported in the public cloud and all sovereign clouds.

Infrastructure authentication requests are processed by the same Azure datacenter that hosts the workloads along with their co-located dependencies. This means that endpoints that are isolated to a region also benefit from performance advantages. 

Backup authentication system to cover workload identities for infrastructure authentication 

For workload identity authentication that does not depend on managed identities, we’ll rely on the backup authentication system to add fault-tolerant resilience. In our blog post from November 2021, we explained the approach for user authentication, which has been generally available for some time. The system operates in the Microsoft cloud but on separate and decorrelated systems and network paths from the primary Microsoft Entra ID system. This means that it can continue to operate in case of service, network, or capacity issues across many Microsoft Entra ID and dependent Azure services. We are now applying that successful approach to workload identities.

Backup coverage of workload identities is currently rolling out systematically across Microsoft, starting with Microsoft 365’s largest internal infrastructure services in the first half of 2024. Microsoft Entra ID customer workload identities’ coverage will follow in the second half of 2025. 

Protecting your own workloads 

The benefits of both regionally isolated endpoints and the backup authentication system are natively built into our platform. To further optimize the benefits of current and future investments in resilience and security, we encourage developers to use the Microsoft Authentication Library (MSAL) and leverage managed identities whenever possible. 
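As a concrete illustration of that guidance, the sketch below shows the managed-identity path in Python using the azure-identity library (which builds on MSAL): the code acquires tokens through the ambient identity rather than storing secrets. The storage account URL is a placeholder.

```python
# Minimal sketch: letting the platform supply the workload's credential.
# DefaultAzureCredential picks up a managed identity when running on Azure
# (and falls back to developer credentials locally), so the code never
# handles secrets or certificate rotation. The account URL is a placeholder.
from azure.identity import DefaultAzureCredential  # pip install azure-identity
from azure.storage.blob import BlobServiceClient   # pip install azure-storage-blob

credential = DefaultAzureCredential()

blob_service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=credential,
)

for container in blob_service.list_containers():
    print(container.name)
```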

What’s next? 

We want to assure our customers that our 99.99% uptime guarantee remains in place, along with our ongoing efforts to expand our backup coverage system and increase our automatic backup coverage to include all infrastructure authentication—even for third-party developers—in the next year. We’ll make sure to keep you updated on our progress, including planned improvements to our system capacity, performance, and coverage across all clouds.  

Thank you, 

Nadim Abdo  

CVP, Microsoft Identity Engineering  

Learn more about Microsoft Entra: 

Related blog post: Advances in Azure AD resilience 

See recent Microsoft Entra blogs 

Dive into Microsoft Entra technical documentation 

Learn more at Azure Active Directory (Azure AD) rename to Microsoft Entra ID 

Join the conversation on the Microsoft Entra discussion space 

Learn more about Microsoft Security  


Azure high-performance computing leads to developing amazing products at Microsoft Surface

This blog was written in collaboration with the Microsoft Surface and Azure teams. It describes how we used Azure high-performance computing (HPC) to save time and costs and to revolutionize the product design phase of manufacturing our Microsoft Surface products.

The Microsoft Surface organization exists to create iconic end-to-end experiences across hardware, software, and services that people love to use every day. We believe that products are a reflection of the people who build them, and that the right tools and infrastructure can complement the talent and passion of designers and engineers to deliver innovative products. Product level simulation models are routinely used in day-to-day decision making on design, reliability, and product features. The organization is also on a multi-year journey to deliver differentiated products in a highly efficient manner. Microsoft Azure HPC plays a vital role in enabling this vision. Below is an account of how we were able to do more with less by leveraging the power of simulation and Azure HPC. 

Surface devices development on Microsoft Azure 

I’m a Principal Engineer at Microsoft and a structural analyst. I’ve been a heavy user of Azure HPC and an early adopter of Azure A8 and A9 virtual machines. In 2015, with the help of our Surface IT team, we deployed Abaqus (a finite element analysis, or FEA, software) on Azure HPC and worked through many implementation issues. By 2016, product-level structural simulations for Surface Pro 4 and the original Surface Laptop had fully migrated to Azure HPC from on-premises servers. Large models with millions of degrees of freedom became routine and easily solved on Azure HPC. This early use of simulations enabled problem solving for design engineers tasked with robustness and reliability metrics. Usage grew along with product line growth. Along with my colleagues Pritul Shah, Senior Director of a cross-product engineering team, and Jarkko Sihvonen, Senior Engineer of the IT Infrastructure and Services team, we collaborated to scale up the structural simulation footprint in our organization. The vision to build a global simulation team meant access to computing servers in Western North America and Southeast Asia, which was easily deployed by the Surface IT and Azure HPC teams.

Product development: Surface laptop  

The availability of Azure HPC for structural simulations using Abaqus helped make this a primary development tool for product design. Design concepts created in digital computer-aided design (CAD) systems are translated into detailed FEA models. These are true digital prototypes and constitute all major subsystems in the device. The analyst can use FEA models to impose different test and reliability conditions in a virtual environment and determine feasibility. In a few days, hundreds of simulations are executed to evaluate various design ideas and solutions to make the device robust. Subsequently, the selected design becomes a prototype and is then subjected to rigorous testing for real-world use conditions. There are multiple feedback loops built into our engineering process to compare actual tests and FEA results for model validation.

In the first graphic depicted above, a digital prototype (FEA model) of a laptop device is set up to drop on its corner to the floor. This models the real-world physical testing that is conducted in our Reliability Engineering labs. The impact velocity for a given height is the initial condition for the dynamic simulation. The dynamic drop simulation is executed on hundreds of cores of an Azure HPC cluster using the Abaqus solver. We used Abaqus/Explicit, a solver known for its robust and accurate solutions for high-speed, nonlinear, dynamic events such as consumer electronics drop testing and automotive crashworthiness. These solvers are optimized especially for Azure HPC clusters and enable scaling to thousands of cores for fast throughput. The simulation jobs complete in a matter of a few hours on these optimized Azure HPC servers instead of the days they used to take previously. The results are reviewed by the analysts and stress levels are checked against material limits. Design teams and analysts then review the reports and make design updates. This cycle continues in very quick loops as the Azure HPC servers enable fast turnaround for reviews.
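As a small worked example of the initial condition mentioned above: for a free-fall drop with air resistance neglected, the impact velocity follows from the drop height as v = sqrt(2gh). The 1 m height here is illustrative, not a Surface test specification.

```python
# Impact velocity for a drop test, used as the FEA model's initial condition.
import math

g = 9.81        # gravitational acceleration, m/s^2
height_m = 1.0  # illustrative drop height, not a test specification

impact_velocity = math.sqrt(2 * g * height_m)
print(f"Initial velocity for the drop model: {impact_velocity:.2f} m/s")
# ~4.43 m/s; the simulation applies this velocity at the instant of impact
# rather than modeling the fall itself.
```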

The second graphic depicts an example of the hinge in the device that was optimized for strength. The team was able to visualize the impact-induced motion and stress levels of the hinge’s internal parts from the simulation. This enabled us to isolate the main issue and make the right design improvements. This insight helped redesign the hinge assembly to lower stress levels. Significant time was saved in the design process as only one iteration was needed for success. Tooling, physical prototyping, and testing costs were also saved.

Presently, the entire Microsoft Surface product line utilizes this approach of validating design with digital prototypes (FEA models) run on Azure HPC clusters. Thousands of simulation jobs are executed routinely in a matter of weeks to enable cutting-edge designs that have very high reliability and customer satisfaction. 

What’s next 

The team is now focused on deploying more scalable simulation and Azure HPC resources for multidisciplinary teams and for multi-physics modeling. There is a huge opportunity to enable machine learning and AI in product creation. Azure HPC and the partnerships within Microsoft organizations will be leveraged to drive large-scale innovations at a rapid speed. We are also continuing this digital transformation journey with model-based systems engineering (MBSE) with the V4 Institute. World-class organizations looking to do more with less and on a quest for scaling digital simulations will greatly benefit from collaborating with Azure.

Learn More 

Learn more about Azure HPC.

Get the latest Azure HPC content.

Find out how Microsoft Cloud for Manufacturing can help you embrace new design and manufacturing paradigms. 



AI study guide: The no-cost tools from Microsoft to jump start your generative AI journey

The world of AI is constantly changing. Every day it seems there are new ways we can work with generative AI and large language models. It can be hard to know where to start your own learning journey when it comes to AI. Microsoft has put together several resources to help you get started. Whether you are ready to build your own copilot or you’re at the very beginning of your learning journey, read on to find the best free resources from Microsoft on generative AI training.

Let’s go!


Azure AI fundamentals

If you’re just starting out in the world of AI, I highly recommend Microsoft’s Azure AI Fundamentals course. It includes hands on exercises, covers Azure AI Services, and dives into the world of generative AI. You can either take the full course in one sitting or break it up and complete a few modules a day.

Learning path: Azure AI fundamentals

Course highlight: Fundamentals of generative AI module

Azure AI engineer

For those who are more advanced in AI knowledge, or are perhaps software engineers, this learning path is for you. It will guide you through building AI-infused applications that leverage Azure AI Services, Azure AI Search, and OpenAI.

Course highlight: Get started with Azure OpenAI Service module

Let’s get building with Azure AI Studio

Imagine a collaborative workshop where you can build AI apps, test pre-trained models, and deploy your creations to the cloud, all without getting lost in mountains of code. In our newest learning path, you will learn how to build generative AI applications like custom copilots that use language models to provide value to your users.

Learning path: Create custom copilots with Azure AI Studio (preview)

Course highlight: Build a RAG-based copilot solution with your own data using Azure AI Studio (preview) module

Dive deep into generative AI with Azure OpenAI Service

If you have some familiarity with Azure and experience programming with C# or Python, you can dive right into Microsoft’s comprehensive generative AI training.

Learning path: Develop generative AI solutions with Azure OpenAI Service

Course highlight: Implement Retrieval Augmented Generation (RAG) with Azure OpenAI Service module
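If you want a feel for what the RAG module covers before diving in, here is a minimal, hypothetical sketch of the pattern: retrieve relevant text, then ground the model’s answer in it. The toy keyword retriever and the endpoint and deployment names are placeholders; a real solution would use a vector index such as Azure AI Search.

```python
# Minimal RAG sketch: retrieve relevant text, then ground the answer in it.
# The toy retriever and all endpoint/deployment names are placeholders.
from openai import AzureOpenAI  # pip install openai

documents = [
    "Contoso standard shipping takes 3-5 business days.",
    "Contoso returns are accepted within 30 days with a receipt.",
    "Contoso support is available weekdays from 9am to 5pm.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the question.
    terms = set(question.lower().split())
    return sorted(documents, key=lambda d: -len(terms & set(d.lower().split())))[:k]

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<your-api-key>",
    api_version="2024-02-01",
)

question = "How long do I have to return an item?"
context = "\n".join(retrieve(question))

answer = client.chat.completions.create(
    model="<chat-deployment-name>",
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```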

Cloud Skills Challenges

Microsoft Azure’s Cloud Skills Challenges are free and interactive events that provide access to our tailored skilling resources for specific solution areas. Each 30-day accelerated learning experience helps users get trained in Microsoft AI. The program offers learning modules, virtual training days, and even a virtual leaderboard to compete head-to-head with your peers in the industry. Learn more about Cloud Skills Challenges here, then check out these challenges to put your AI skills to the test.


Challenges 1-3 will help you prepare for Microsoft AI Applied Skills, scenario-based credentials. Challenges 4 and 5 will help you prepare for Microsoft Azure AI Certifications, with the potential of a 50% exam discount on your certification of choice.1

Challenge #1: Generative AI with Azure OpenAI

In about 18 hours, you’ll learn how to train models to generate original content based on natural language input. You should already have familiarity with Azure and experience programming with C# or Python. Begin now!

Challenge #2: Azure AI Language

Build a natural language processing solution with Azure AI Language. In about 20 hours, you’ll learn how to use language models to interpret the semantic meaning of written or spoken language. You should already have familiarity with the Azure portal and experience programming with C# or Python. Begin now!

Challenge #3: Azure AI Document Intelligence

Show off your smarts with Azure AI Document Intelligence Solutions. In about 21 hours, you’ll learn how to use natural language processing (NLP) solutions to interpret the meaning of written or spoken language. You should already have familiarity with the Azure portal and C# or Python programming. Begin now!

Challenge #4: Azure AI Fundamentals

Build a robust understanding of machine learning and AI principles, covering computer vision, natural language processing, and conversational AI. Tailored for both technical and non-technical backgrounds, this learning adventure guides you through creating no-code predictive models, delving into conversational AI, and more—all in just about 10 hours.

Complete the challenge within 30 days and you’ll be eligible for 50% off the cost of a Microsoft Certification exam. Earning your Azure AI Fundamentals certification can supply the foundation you need to build your career and demonstrate your knowledge of common AI and machine learning workloads—and what Azure services can solve for them. Begin now!

Challenge #5: Azure AI Engineer

Go beyond theory to build the future. This challenge equips you with practical skills for managing and leveraging Microsoft Azure’s Cognitive Services. Learn everything from secure resource provisioning to real-time performance monitoring. You’ll be crafting cutting-edge AI solutions in no time, all while preparing for Exam AI-102 and your Azure AI Engineer Associate certification. Dive into interactive tutorials, hands-on labs, and real-world scenarios. Complete the challenge within 30 days and you’ll be eligible for 50% off the cost of a Microsoft Certification exam.2 Begin now!

Finally, our Microsoft AI Virtual Training Days are a great way to immerse yourself in free one- or two-day training sessions. We have three great options for Azure AI training:

Azure AI Fundamentals

Generative AI Fundamentals

Building Generative Apps with Azure OpenAI Service

Start your AI learning today

For any and all AI-related learning opportunities, check out the Microsoft Learn AI Hub including tailored AI training guidance. You can also follow our Azure AI and Machine Learning Tech Community Blogs for monthly study guides.

1, 2 Microsoft Cloud Skills Challenge | 30 Days to Learn It – Official Rules: https://developer.microsoft.com/en-us/offers/30-days-to-learn-it/official-rules#terms-and-conditions

The post AI study guide: The no-cost tools from Microsoft to jump start your generative AI journey appeared first on Azure Blog.
Source: Azure

Advancing memory leak detection with AIOps—introducing RESIN

“Operating a cloud infrastructure at global scale is a large and complex task, particularly when it comes to service standard and quality. In a previous blog, we shared how AIOps was leveraged to improve service quality, engineering efficiency, and customer experience. In this blog, I’ve asked Jian Zhang, Principal Program Manager from the AIOps Platform and Experiences team to share how AI and machine learning is used to automate memory leak detection, diagnosis, and mitigation for service quality.”—Mark Russinovich, Chief Technology Officer, Azure.

In the ever-evolving landscape of cloud computing, memory leaks represent a persistent challenge, affecting performance, stability, and ultimately, the user experience. Memory leak detection is therefore important to cloud service quality. Memory leaks happen when memory is allocated but unintentionally not released in a timely manner. A leak can degrade the component's performance and even crash the operating system (OS). Worse, it often affects other processes running on the same machine, causing them to slow down or be killed.
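To make the failure mode concrete, here is a minimal, hypothetical illustration in Python (our own example, not RESIN or Azure code): a process-lifetime cache that only ever grows, so every result stays reachable and its memory is never reclaimed.

```python
# Hypothetical illustration of an unintentional leak pattern (not RESIN code):
# a process-lifetime cache that grows on every request and never evicts.
_results_cache = {}  # lives as long as the process does

def expensive_computation(payload: bytes) -> bytes:
    return payload * 1024  # stand-in for real work

def handle_request(request_id: int, payload: bytes) -> bytes:
    result = expensive_computation(payload)
    _results_cache[request_id] = result  # inserted, never removed
    return result

if __name__ == "__main__":
    for i in range(10_000):
        handle_request(i, b"x")  # resident memory grows with every call
```

The process never fails immediately; it simply consumes more memory over time, which is exactly the slow, cumulative behavior that makes leaks hard to spot.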

Given the impact of memory leak issues, there have been many studies of and solutions for memory leak detection. Traditional detection solutions fall into two categories: static and dynamic. Static leak detection techniques analyze software source code and deduce potential leaks, whereas dynamic methods detect leaks by instrumenting a program and tracking object references at runtime.

However, these conventional techniques are inadequate for leak detection in a cloud environment. Static approaches have limited accuracy and scalability, especially for leaks that result from cross-component contract violations, which require rich domain knowledge to capture statically. Dynamic approaches are generally better suited to a cloud environment, but they are intrusive, require extensive instrumentation, and introduce high runtime overhead that is costly for cloud services.

RESIN

Designed to address memory leaks in production cloud infrastructure

Explore the research

Introducing RESIN

Today, we are introducing RESIN, an end-to-end memory leak detection service designed to holistically address memory leaks in large cloud infrastructure. RESIN has been used in Microsoft Azure production and demonstrated effective leak detection with high accuracy and low overhead.

RESIN: A holistic service for memory leaks

Read the report

RESIN system workflow

A large cloud infrastructure can consist of hundreds of software components owned by different teams. Prior to RESIN, memory leak detection was an individual team's effort in Microsoft Azure. As shown in Figure 1, RESIN takes a centralized approach that conducts leak detection in multiple stages for low overhead, high accuracy, and scalability. This approach requires neither access to components' source code nor extensive instrumentation or recompilation.

Figure 1: RESIN workflow

RESIN conducts low-overhead monitoring using agents that collect memory telemetry at the host level. A remote service aggregates and analyzes the data from different hosts using a bucketization-pivot scheme. When a leak is detected in a bucket, RESIN triggers an analysis of the process instances in that bucket. For highly suspicious leaks, RESIN performs live heap snapshotting and compares the result against regular heap snapshots in a reference database. After generating multiple heap snapshots, RESIN runs a diagnosis algorithm to localize the root cause of the leak and generates a diagnosis report, attached to the alert ticket, to assist developers with further analysis. Ultimately, RESIN automatically mitigates the leaking process.

Detection algorithms

There are unique challenges in memory leak detection in cloud infrastructure:

Noisy memory usage caused by changing workloads and environmental interference results in many false alarms under a static, threshold-based detection approach.

Memory leaks in production systems are usually fail-slow faults that can last days, weeks, or even months, and such gradual change is difficult to capture over long periods in a timely manner.

At the scale of the Azure global cloud, it is not practical to collect fine-grained data over long periods of time.

To address these challenges, RESIN uses a two-level scheme to detect memory leak symptoms: A global bucket-based pivot analysis to identify suspicious components and a local individual process leak detection to identify leaking processes.

With the bucket-based pivot analysis at the component level, we categorize raw memory usage into a number of buckets and transform the usage data into a summary of the number of hosts in each bucket. In addition, a severity score for each bucket is calculated based on the deviations and the host count in the bucket. Anomaly detection is then performed on the time-series data of each bucket of each component. The bucketization approach not only represents the workload trend robustly, with tolerance to noise, but also reduces the computational load of anomaly detection.

However, detection at the component level alone is not sufficient for developers to investigate a leak efficiently because many processes normally run on a component. When a leaking bucket is identified at the component level, RESIN runs a second-level detection scheme at process granularity to narrow the scope of investigation. It outputs the suspected leaking process, its start and end time, and a severity score.
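To make the bucketization idea concrete, here is a minimal sketch in Python. This is our own illustration under simplifying assumptions: the bucket edges, the z-score-style anomaly test, and the severity formula (deviation weighted by host count) are placeholders, not RESIN's actual algorithm.

```python
from collections import Counter
from statistics import mean, stdev

BUCKET_EDGES = [0, 25, 50, 75, 90, 100]  # % memory used; illustrative edges

def bucketize(host_usages):
    """Count how many hosts fall into each memory-usage bucket."""
    counts = Counter()
    for usage in host_usages:
        idx = min(sum(usage >= edge for edge in BUCKET_EDGES) - 1,
                  len(BUCKET_EDGES) - 2)
        counts[idx] += 1
    return counts

def severity(history, current_count):
    """Score a bucket: deviation from its historical host count,
    weighted by how many hosts are affected right now."""
    if len(history) < 2:
        return 0.0
    deviation = (current_count - mean(history)) / (stdev(history) or 1.0)
    return max(deviation, 0.0) * current_count

# The 90-100% bucket suddenly holds far more hosts than its history suggests.
history = [3, 4, 2, 3, 4, 3]  # daily host counts previously seen in that bucket
today = bucketize([95, 97, 92, 91, 96, 98, 94, 60, 55, 40])
print(f"severity of top bucket: {severity(history, today[4]):.1f}")
```

Weighting the deviation by the host count lets a bucket that drifts modestly across thousands of hosts outrank one that spikes on a handful of machines, which keeps the second-level, per-process analysis focused where it matters.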

Diagnosis of detected leaks

Once a memory leak is detected, RESIN takes a snapshot of the live heap, which contains all memory allocations referenced by the running application, and analyzes the snapshots to pinpoint the root cause of the detected leak. This makes memory leak alerts actionable.

RESIN leverages the Windows heap manager's snapshot capability to perform live profiling. However, heap collection is expensive and can be intrusive to the host's performance. To minimize this overhead, several factors shape how snapshots are taken:

The heap manager stores only limited information in each snapshot, such as the stack trace and size of each active allocation.

RESIN prioritizes candidate hosts for snapshotting based on leak severity, noise level, and customer impact. By default, the top three hosts in the suspected list are selected to ensure successful collection.

RESIN utilizes a long-term, trigger-based strategy to ensure that the snapshots capture the complete leak. To decide when to stop trace collection, RESIN analyzes memory growth patterns (such as steady, spike, or stair) and derives the trace-completion triggers from the observed pattern.

RESIN uses a periodic fingerprinting process to build reference snapshots, which are compared with the snapshot of a suspected leaking process to support diagnosis (a simplified sketch of this comparison appears below).

RESIN analyzes the collected snapshots to output the stack traces of the root cause.
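To illustrate the comparison step, the sketch below reduces each snapshot to per-stack-trace totals of outstanding allocation bytes and ranks the traces that grew most against the reference. The snapshot format and growth threshold are assumptions for illustration, not RESIN's diagnosis algorithm.

```python
from collections import defaultdict

def aggregate(snapshot):
    """Sum outstanding allocation bytes per allocating stack trace."""
    totals = defaultdict(int)
    for stack_trace, size_bytes in snapshot:
        totals[stack_trace] += size_bytes
    return totals

def rank_leak_suspects(reference, suspect, min_growth=1 << 20):
    """Rank stack traces whose outstanding bytes grew most versus the reference."""
    ref, sus = aggregate(reference), aggregate(suspect)
    growth = {trace: sus[trace] - ref.get(trace, 0) for trace in sus}
    return sorted(((t, g) for t, g in growth.items() if g >= min_growth),
                  key=lambda item: item[1], reverse=True)

# Each snapshot entry: (allocating stack trace, allocation size in bytes).
reference = [("moduleA!alloc_buf", 4096)] * 10 + [("moduleB!cache_put", 4096)] * 10
suspect = [("moduleA!alloc_buf", 4096)] * 12 + [("moduleB!cache_put", 4096)] * 4000

print(rank_leak_suspects(reference, suspect))
# [('moduleB!cache_put', 16343040)] -- the growing trace is the prime suspect
```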

Mitigation of detected leaks

When a memory leak is detected, RESIN attempts to mitigate the issue automatically to avoid further customer impact. Depending on the nature of the leak, different mitigation actions are available; RESIN uses a rule-based decision tree to choose the action that minimizes impact.

If the memory leak is localized to a single process or Windows service, RESIN attempts the lightest mitigation: simply restarting the process or the service. An OS reboot can resolve software memory leaks but takes much longer and can cause virtual machine downtime, so it is normally reserved as the last resort. For a non-empty host, RESIN utilizes solutions such as Project Tardigrade, which skips hardware initialization and performs only a kernel soft reboot, after live virtual machine migration, to minimize user impact. A full OS reboot is performed only when the soft reboot is ineffective. The sketch below illustrates this escalation.
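This is what such a rule-based escalation could look like in code; it is illustrative only, and the real decision tree weighs many more signals, such as leak severity and customer impact.

```python
from enum import Enum, auto

class Mitigation(Enum):
    RESTART_PROCESS = auto()      # lightest: leak isolated to one process
    RESTART_SERVICE = auto()      # leak isolated to a Windows service
    SOFT_KERNEL_REBOOT = auto()   # Tardigrade-style: skip hardware initialization
    FULL_OS_REBOOT = auto()       # last resort

def choose_mitigation(leak):
    """Pick the lightest action expected to clear the leak."""
    if leak["scope"] == "process":
        return Mitigation.RESTART_PROCESS
    if leak["scope"] == "service":
        return Mitigation.RESTART_SERVICE
    # Host-wide leak: soft-reboot the kernel (after live VM migration on a
    # non-empty host); escalate only if a soft reboot was already ineffective.
    if not leak.get("soft_reboot_failed", False):
        return Mitigation.SOFT_KERNEL_REBOOT
    return Mitigation.FULL_OS_REBOOT

print(choose_mitigation({"scope": "process"}))  # Mitigation.RESTART_PROCESS
```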

RESIN stops applying mitigation actions to a target once the detection engine no longer considers the target leaking.

Results and impact of memory leak detection

RESIN has been running in production in Azure since late 2018; to date, it has been used to monitor millions of host nodes and hundreds of host processes daily. Overall, we achieved 85% precision and 91% recall with RESIN memory leak detection,1 despite the rapidly growing scale of the monitored cloud infrastructure.

The end-to-end benefits brought by RESIN are clearly demonstrated by two key metrics:

Virtual machine unexpected reboots: the average number of reboots per one hundred thousand hosts per day due to low memory.

Virtual machine allocation error: the ratio of erroneous virtual machine allocation requests due to low memory.

Between September 2020 and December 2023, virtual machine reboots due to low memory were reduced by nearly 100 times, and allocation error rates were reduced by over 30 times. Furthermore, since 2020, no severe outages have been caused by Azure host memory leaks.1

Learn more about RESIN

Through RESIN's end-to-end detection capabilities, designed to holistically address memory leaks in large cloud infrastructure, you can improve the reliability and performance of your cloud infrastructure and prevent issues caused by memory leaks. To learn more, read the publication.

1 RESIN: A Holistic Service for Dealing with Memory Leaks in Production Cloud Infrastructure, Chang Lou, Johns Hopkins University; Cong Chen, Microsoft Azure; Peng Huang, Johns Hopkins University; Yingnong Dang, Microsoft Azure; Si Qin, Microsoft Research; Xinsheng Yang, Meta; Xukun Li, Microsoft Azure; Qingwei Lin, Microsoft Research; Murali Chintalapati, Microsoft Azure, OSDI’22.
The post Advancing memory leak detection with AIOps—introducing RESIN appeared first on Azure Blog.
Source: Azure

Microsoft Cost Management updates—March 2024 

Whether you’re a new student, a thriving startup, or the largest enterprise, you have financial constraints, and you need to know what you’re spending, where it’s being spent, and how to plan. Nobody wants a surprise when it comes to the bill, and this is where Cost Management comes in. 

We’re always looking for ways to learn more about your challenges and how Cost Management can help you better understand where you’re accruing costs in the cloud, identify and prevent bad spending patterns, and optimize costs to empower you to do more with less. Here are a few of the latest improvements and updates based on your feedback: 

Microsoft Azure Kubernetes Service (AKS) costs

Auto renewal of Azure Reservations 

Connector for AWS—Retirement date: March 31, 2025 

Pricing updates on Azure.com 

Cost Management Labs 

New ways to save money in the Microsoft Cloud 

New videos and learning opportunities 

Documentation updates 

Let’s dig into the details. 

Cost Management solutions

Learn how to optimize your cloud investments with confidence

Microsoft Azure Kubernetes Service (AKS) costs 

Cost views

I am pleased to share that the AKS cost views are now generally available in Cost analysis. This was officially announced at KubeCon in Paris last month; we announced the preview of these views at Ignite in November 2023.

AKS users have always had visibility into the infrastructure costs of running their clusters. With these new views, they also gain visibility into the costs of namespaces running in their clusters, along with an aggregated view of cluster costs across their subscription. With these additional insights, users can allocate and optimize their AKS costs more efficiently, maximizing the benefits of running their workloads on shared infrastructure. To enable these views, users must install the cost analysis add-on on their clusters.

Figure 1: Kubernetes clusters view 

Figure 2: Kubernetes namespaces view

Please refer to the two articles below for more information: 

Azure Kubernetes Service cost analysis – Azure Kubernetes Service | Microsoft Learn 

View Kubernetes costs (Preview) – Cost Management | Microsoft Learn 

Fleet workload placement

An additional announcement from KubeCon that I want to highlight is the extension of fleet workload placement to schedule workloads onto clusters based on new heuristics such as cost and resource availability. For more information, please refer to “Open-Source Fleet Workload Placement Scheduling and Override.”

Auto renewal of Azure Reservations 

Azure Reservations can reduce your resource costs by up to 72% compared with pay-as-you-go prices. To simplify reservation management and keep receiving reservation discounts, you can now set up auto-renewal of your reservations at the time of purchase. Please note that the setting is off by default, so make sure to turn it on before your reservation expires. To learn more, refer to “Automatically renew Azure reservations – Cost Management | Microsoft Learn.”

Connector for Amazon Web Services (AWS)—Retirement date: March 31, 2025  

Please note that we will be retiring the connector for AWS in Cost Management on March 31, 2025. You will not have access to AWS data through the API or portal beyond the retirement date; however, you will continue to have access to the data stored in your S3 bucket in the AWS console. To prepare for the retirement, we have removed the ability to add a new connector in Cost Management. We encourage you to look at alternative solutions for accessing your AWS costs. For more information, please refer to “Support for Connector for AWS in Cost Management is ending on 31 March 2025.”

Pricing updates on Azure.com 

We’ve been working hard to make some changes to our Azure pricing experiences, and we’re excited to share them with you. These changes will help make it easier for you to estimate the costs of your solutions:

Azure savings plan has now been extended to Microsoft Azure Spring Apps, offering more flexibility and cost optimization on both the pricing page and the calculator.

We’ve added a calculator entry for Azure Kubernetes Services Edge Essentials. 

We’ve added pricing for many new offers on Microsoft Azure, including Microsoft Azure Application Gateway (with the general availability (GA) of Application Gateway for Containers), new Microsoft Azure Virtual Machines series (Dasv6, Easv6, and Fasv6, all in preview), Microsoft Azure Red Hat OpenShift (added virtual machine (VM) families and an improved search experience in the pricing calculator), Microsoft Azure SQL Database (HA replica pricing for elastic pools and Hyperscale), Microsoft Azure Databricks (the “Model Training” workload for premium-tier workspaces), Microsoft Azure Managed Grafana (Standard and Essential plan types added to the pricing calculator), Microsoft Azure Backup (pricing for the Enhanced policy type), and Microsoft Azure Private 5G Core (new RAN Overage and Devices Overage offers on both the page and the calculator).

We’re constantly working to improve our pricing tools and make them more accessible and user-friendly. We hope you find these changes helpful in estimating the costs of your Azure solutions. If you have any feedback or suggestions for future improvements, please let us know!

Cost Management Labs 

With Cost Management Labs, you get a sneak peek at what’s coming in Cost Management and can engage directly with us to share feedback and help us better understand how you use the service, so we can deliver more tuned and optimized experiences. Here are a few features you can see in Cost Management Labs:  

Currency selection in Cost analysis smart views. View your non-USD charges in USD, or switch between the currencies in which you have charges to view the total cost for that currency only. To change currency, select “Customize” at the top of the view and select the currency you would like to apply. Currency selection is not applicable to those with only USD charges. Currency selection is enabled by default in Labs.

Streamlined Cost Management menu. Organize Cost Management tools into related sections for reporting, monitoring, optimization, and configuration settings.

Recent and pinned views in the cost analysis preview. Show all classic and smart views in cost analysis and streamline navigation by prioritizing recently used and pinned views.   

Forecast in Cost analysis smart views. Show your forecast cost for the period at the top of Cost analysis preview.   

Charts in Cost analysis smart views. View your daily or monthly cost over time in Cost analysis smart views.   

Open configuration items in the menu. Experimental option to show the selected configuration screen as a nested menu item in the Cost Management menu. Please share feedback.  

New ways to save money in the Microsoft Cloud 

Here are a couple of important updates for you to review that can help reduce costs:

“Generally Available: Azure Kubernetes Service (AKS) support for 5K Node limit by default for standard tier clusters” 

“Public Preview: Well-Architected Framework assessment on Azure Advisor” 

New videos and learning opportunities 

Check out “Leverage anomaly management processes with Microsoft Cost Management”, a great video for managing anomalies and reservations. You can also follow the Cost Management YouTube channel to stay in the loop with new videos as they’re released and let us know what you’d like to see next. Want a more guided experience? Start with ”Control Azure spending and manage bills with Microsoft Cost Management.”

Refer to the blog post “Combine FinOps best practices and Microsoft tools to streamline and optimize your workloads” for more on using Microsoft tools for FinOps best practices.

Documentation updates  

Here are a few documentation updates you might be interested in: 

New: “Azure Hybrid Benefit documentation”

Update: “Transfer Azure Enterprise enrollment accounts and subscriptions”

Update: “Azure EA pricing – Cost Management”

Update: “Review your Azure Enterprise Agreement bill”

Update: “Understand usage details fields”

Update: “Organize your costs by customizing your billing account”

Want to keep an eye on all documentation updates? Check out the Cost Management and Billing documentation change history in the azure-docs repository on GitHub. If you see something missing, select “Edit” at the top of the document and submit a quick pull request. You can also submit a GitHub issue. We welcome and appreciate all contributions! 

What’s next? 

These are just a few of the big updates from last month. Don’t forget to check out the previous Cost Management updates. We’re always listening and making constant improvements based on your feedback, so please keep the feedback coming. 

Best wishes, 

Cost Management team 
The post Microsoft Cost Management updates—March 2024  appeared first on Azure Blog.
Source: Azure