Azure high-performance computing at SC’19

HBv2 Virtual Machines for HPC, Azure’s most powerful yet, now in preview

Azure HBv2-series Virtual Machines (VMs) for high-performance computing (HPC) are now in preview in the South Central US region.

HBv2-series Virtual Machines are Azure’s most advanced HPC offering yet, featuring performance and Message Passing Interface scalability rivaling the most advanced supercomputers on the planet, and price and performance on par with on-premises HPC deployments.

HBv2 Virtual Machines are designed for a variety of real-world HPC applications, from fluid dynamics to finite element analysis, molecular dynamics, seismic processing & imaging, weather modeling, rendering, computational chemistry, and more.

Each HBv2 Virtual Machine features 120 AMD EPYC™ 7742 processor cores at 2.45 GHz (3.3 GHz boost), 480 GB of RAM, 480 MB of L3 cache, and no simultaneous multithreading. An HBv2 Virtual Machine also provides up to 340 GB per second of memory bandwidth, up to four teraflops of double-precision compute, and up to eight teraflops of single-precision compute.

Finally, an HBv2 Virtual Machine features 900 GB of low-latency, high-bandwidth block storage via NVMeDirect, and supports up to eight Azure Managed Disks.

200 Gigabit high data rate (HDR) InfiniBand comes to Azure

HBv2-series Virtual Machines feature one of the cloud’s first deployments of 200 Gigabit per second HDR InfiniBand networking from Mellanox, which provides up to 8 times higher bandwidth and 16 times lower latencies than found elsewhere on the public cloud.

With HBv2 Virtual Machines, Azure is also introducing two new network features to support the highest sustained performance for tightly coupled workloads. The first is adaptive routing, which helps optimize Message Passing Interface performance on congested networks. The second is support for dynamic connected transport (DCT), which provides reliable transport and enhances scalable, asynchronous, high-performance communication.

As with HB and HC Virtual Machines, HBv2 Virtual Machines support hardware-based offload for Message Passing Interface collectives.

Azure & Cray deliver cloud-based seismic imaging at 28,000 cores, with 42 GB per second read and 62 GB per second write performance

Customers come to Azure for our ability to support their largest and most critical workloads. Energy companies have been among the first and most eager to embrace our advanced HPC capabilities, including for their core subsurface discovery workloads. Advances in subsurface computing support more accurate identification of energy resources, as well as safer extraction of these resources from challenging areas such as beneath thick deposits of salt in the Gulf of Mexico.

As part of our work with one of our strategic energy exploration customers, today we are sharing that Azure recently supported what is believed to be one of the largest cloud-based seismic processing workloads yet.

Powered by up to 468 Azure HB Virtual Machines totaling 28,080 first-generation AMD EPYC CPU cores and more than 123 terabytes per second of aggregate memory bandwidth, the customer was able to run imaging jobs utilizing a variety of pre-stack and post-stack migration, full-waveform inversion, and real-time migration techniques.

Seismic imaging, however, is as much about data movement as it is about compute. To support this record-scale customer workload, Cray provided the supercomputing firm’s vaunted ClusterStor storage system. Announced earlier this year, Cray® ClusterStor™ in Azure is a dedicated Lustre filesystem solution that accelerates data processing for the largest and most complex HPC and AI jobs run on Azure, and can optionally be connected to Azure H-series Virtual Machines. Not only does Cray ClusterStor in Azure leverage the same technology that powers many of the fastest HPC filesystems on the planet, it is also among the most affordable on the cloud. Over a typical three-year reserved instance period, Cray ClusterStor in Azure can cost as little as 1/10th of Lustre offerings found on other public clouds.

The combination of Azure HB-series Virtual Machines and Cray ClusterStor provided a highly scalable solution, delivering an 11.5x improvement in time to solution as the pool of compute virtual machines was increased from 16 to 400.

The Cray ClusterStor in Azure storage solution, whose measured performance peaked at 42 GB per second (reads) and 62 GB per second (writes), also delivered significant differentiation for the customer by driving a 66 percent improvement in application performance as compared to an alternative, high-performance network file system (NFS) approach.

Available now

Azure HBv2-series Virtual Machines are currently available in South Central US, with additional regions rolling out soon.

Learn more about Azure HPC Virtual Machines.
Find out more about high-performance computing (HPC) in Azure.
Find out more about Cray ClusterStor in Azure.

Source: Azure

Bringing confidential computing to Kubernetes

Historically, data has been protected at rest through encryption in data stores, and in transit using network technologies; however, as soon as that data is processed in the CPU of a computer, it is decrypted and in plain text. New confidential computing technologies are game changing because they use secure hardware enclaves to protect data even while code is running on the CPU. Today, we are announcing that we are bringing confidential computing to Kubernetes workloads.

Confidential computing with Azure

Azure is the first major cloud platform to support confidential computing building on Intel® Software Guard Extensions (Intel SGX). Last year, we announced the preview of the DC-series of virtual machines that run on Intel® Xeon® processors and are confidential computing ready.

This confidential computing capability also provides an additional layer of protection even from potentially malicious insiders at a cloud provider, reduces the chances of data leaks and may help address some regulatory compliance needs.

Confidential computing enables several use cases that were not previously possible. Customers in regulated industries can now collaborate using sensitive partner or customer data to detect fraud scenarios without giving the other party visibility into that data. In another example, customers can perform mission-critical payment processing in secure enclaves.

How it works for Kubernetes

With confidential computing for Kubernetes, customers can now get this additional layer of data protection for their Kubernetes workloads, with the code running on the CPU inside secure hardware enclaves. Use the Open Enclave SDK for confidential computing in code. Create a Kubernetes cluster on hardware that supports Intel SGX, such as the DC-series virtual machines running Ubuntu 16.04 or Ubuntu 18.04, and install the confidential computing device plugin into those virtual machines. The device plugin (running as a DaemonSet) surfaces the Enclave Page Cache (EPC) memory as a schedulable resource for Kubernetes. Kubernetes users can then schedule pods and containers that use the Open Enclave SDK onto hardware that supports Trusted Execution Environments (TEEs).

The following pod specification demonstrates how you would schedule a pod to have access to a TEE by defining a limit on the EPC memory that the device plugin (available in preview) advertises to the Kubernetes scheduler.
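A minimal sketch of such a specification is shown below. The resource name, container image, and limit value are illustrative assumptions rather than values copied from the preview documentation, so check the device plugin’s documentation for the exact resource identifier it advertises.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oe-enclave-sample
spec:
  containers:
  - name: oe-enclave-sample
    # Hypothetical container image built with the Open Enclave SDK
    image: myregistry.azurecr.io/oe-enclave-sample:latest
    resources:
      limits:
        # Assumed EPC resource name exposed by the confidential computing
        # device plugin; the pod is only scheduled onto SGX-capable nodes
        # that can satisfy this limit.
        kubernetes.azure.com/sgx_epc_mem_in_MiB: 10
```

Applying a manifest like this with kubectl schedules the pod onto a node whose device plugin has advertised enough EPC memory to satisfy the limit.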

Now the pods in these clusters can run containers using secure enclaves and take advantage of confidential computing. There is no additional fee for running Kubernetes containers on top of the base DC-series cost.

The Open Enclave SDK was recently open sourced by Microsoft and contributed to the Confidential Computing Consortium, under the Linux Foundation, to standardize a single uniform API that works with a variety of hardware components and software runtimes across the industry landscape.

Try out confidential computing for Kubernetes with Azure today. Let us know what you think in our survey or on GitHub.
Source: Azure

Finastra “did not expect the low RPO” of Azure Site Recovery DR

Today’s question-and-answer style post comes after I had the chance to sit down with Bryan Heymann, Head of Cloud Architecture at Finastra, to discuss his experience with Azure Site Recovery. Finastra builds and deploys technology on its open software architecture; our conversation focused on the organization’s journey to replace several disparate disaster recovery (DR) technologies with Azure Site Recovery. To learn more about achieving resilience in Azure, refer to this whitepaper.

You have been on Azure for a few years now – before we get too deep in DR, can you start with some context on the cloud transformation that you are going through at Finastra?

We think of our cloud journey across three horizons. Currently, we're at “Horizon 0” – consolidating and migrating our core data centers to the cloud with a focus on embracing the latest technologies and reducing total cost of ownership (TCO). The workloads are a combination of production sites and internal employee sites.

Initially, we went through a 6-month review with a third party to identify our datacenter strategy, and decided to select a public cloud. Ultimately, we realized that Microsoft would be a solid partner to help us on our journey. We moved some of our solutions to the cloud and our footprint has organically grown from there.

All this is the enabler to go to future “horizons” to ensure we continuously keep pace with the latest technology. Most importantly we’re looking to move up the value chain – so instead of us worrying about standing a server up, patching a server, auditing a server, identity on a server… we’re now ingesting and deploying the right policies for the service (not the server) and taking advantage of the availability, security, and disaster recovery options.

Exciting to hear about the journey you've taken so far. I believe DR is a requirement across all of those horizons, right?

Disaster recovery is front and center for us. We work closely with our clients to regularly test. At this point we have executed more than 50 test failovers. Disaster recovery and backups are non-negotiable standards in our shared environment.

What were you most skeptical about when it came to DR in Azure? What was it that helped you become convinced that Azure Site Recovery was the right choice?

We used just about every tool in our data centers and always had mixed results. We thought that Azure Site Recovery might be the same, but I was glad that we were wrong. We have a strong success rate and even wrote special dashboards to track our recovery point objective (RPO) for a holistic view on our posture! We were skeptical that Site Recovery would be point and click capable, and whether it would be able to keep up with the amount of change we have, when failing over from the East coast to the West coast. Our first DR test in Azure, over two years ago now, was actually wildly successful. We did not expect the low RPO that we saw and were delighted. I think this speaks volumes to Azure’s network backbone and how you handle replication, to be that performant.

We hear that from a lot of customers. Great to get further validation from you! Could you ‘double click’ on your onboarding experience with Azure Site Recovery, up to that first DR drill?

There wasn't any heavy onboarding, which is a good thing as it really wasn’t needed. It was so intuitive and easy for our team to use. The documentation was very accurate. The point and click capabilities of Site Recovery and the documentation enabled us to onboard and go. It has all been in line with what we needed, without surprises.

What kind of workloads are you protecting with Azure Site Recovery?

All of our virtual machines (VMs) across North America are using Site Recovery, everything from our lending space, to our payment space, to our core space. These systems support thousands of our customers, and each of those have their own end customers which would number in the millions – Site Recovery is our default disaster recovery mechanism across the whole fleet.

Wow, that’s a lot of customers and some sensitive financial spaces so no wonder disaster recovery is such a high priority for your teams. We regularly hear prospective customers asking whether Azure Site Recovery supports Linux – I'd love to understand if you have Linux-based applications using Site Recovery, and what your experience has been with those?

Actually, it was our very first application for which we deployed Azure Site Recovery – and it’s all Linux. Linux support for Site Recovery has been fantastic. We fail over every six months, without any issues. Because of the ease of use, the number of times we have tested has increased significantly. We pressed on our normal RPOs to get them down to very, very aggressive levels. Some of our Linux-based applications are complex, but Site Recovery has worked without any issues.

You touched upon DR drills – I'd love to understand what your drill experience has been like?

The experience has been seamless and simple. The application itself may have some configurations that need to be considered during DR drills, such as post-scripts, but those are hammered out quickly. We try to do drills every six months, but at least once every year.

Which features of Azure Site Recovery do you like the most?

I love that I can fail across regions. I also love the granular recovery point reporting. It allows us to see where we may or may not be seeing problems. I'm not sure we ever got that from any other tools, it’s very powerful and it’s graphical user interface based – and any Joe could do it, it's not hard to select a VM and replicate it to another region. I especially like the fact that we are only charged for storage in the far side region so, financially, there's not an impact of having warm standbys and still we are able to hit our RPO.

If you were to go through this journey all over again with Azure Site Recovery, is there anything that you would have done differently?

I would have liked to get our knowledge base and plans in place for a month longer before implementing it. It's just so easy that we were able to blow through most of it, but we did miss a couple of simple things early on which were easily fixed later on our journey. We found out quickly we didn't want standard hard drives, we wanted premium for example.

Looking forward, how do you plan to leverage Azure Site Recovery?

We recently used Azure Site Recovery to move a customer in our payment space from on premises to Azure – we will now get those machines on Site Recovery across Azure regions, we're not going to rebuild the entire platform. It's obviously the de facto to get us out, and it is the standard for regional disaster recovery for VMs there. There is no other product used.

People ask me what keeps me up at night, there are really two things. “Are we secure?” and “Can we recover?” – I call it the butterfly effect. When you come in each morning, are you confident that if you cratered a datacenter, you could come up in a different one? I can confidently answer that with yes. We could fail out to another region, with all our data. That's a pretty nice spot to be in, especially when you're sitting in a hyperscale cloud. I know that I have storage replication. I know that I own the network links. To allow somebody to run this stuff on our behalf was a mindset change, but it has really been a positive experience for us.
Source: Azure

Learn how to accelerate time to insight at the Analytics in Azure virtual event

The next wave of analytics is here with Azure Synapse Analytics! Simply unmatched and truly limitless, we're excited about this launch and want to share the highlights with you. Please join us for the Analytics in Azure virtual event on Tuesday, December 10, 2019 from 10:00 AM – 11:00 AM Pacific Time. Be among the first to see how Azure Synapse can accelerate your organization’s time to insight. Sign up for the live stream today for reminders, agenda updates, and instructions to tune in live.

Azure Synapse is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources at scale. Build end-to-end analytics solutions with a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs, with blazing speed.

Join us for this virtual event and find out how to:

Query petabyte-scale data on demand from the data lake, or provision elastic compute resources for demanding workloads like data warehousing.
Build a modern data warehouse enhanced with streaming analytics, machine learning, BI, and AI capabilities.
Reduce project development time for machine learning, BI, and AI.
Easily optimize petabyte-scale workloads and automatically prioritize critical jobs.
Help safeguard data with Azure Active Directory integration, dynamic data masking, column-level and row-level security, and automated threat detection.

You’ll hear directly from Gayle Sheppard, Corporate Vice President of Microsoft Azure Data, and John Macintyre, Director of Product for Microsoft Azure Analytics, who will dive deep on Azure Synapse. Other Microsoft engineering and analytics experts will join the event with insights, demos, and answers to your questions in a live Q&A. Don’t miss your opportunity to ask how Azure Synapse can enable the growth and evolution of your business. 

There’s never been a better time to embrace technologies that allow you to unlock insights from all your data to stay competitive and fuel innovation with purpose. Register today for the Analytics in Azure virtual event on December 10th. We hope you can join us!
Source: Azure

Microsoft cloud in Norway opens with availability of Microsoft Azure

Today, we’re announcing the availability of Microsoft Azure from our new cloud datacenter regions in Norway, marking a major milestone as the first global cloud provider to deliver enterprise-grade services in country. These new regions demonstrate our ongoing investment to help enable digital transformation and advance intelligent cloud and intelligent edge computing technologies across both commercial and public sectors.

DNB, Equinor, Lånekassen, and Posten are just a few of the customers and partners leveraging our cloud services to accelerate innovation and increase computing resources. This new offering of Microsoft Azure delivers scalable, highly available, and resilient cloud services to Norwegian companies and organizations while meeting data residency, security, and compliance needs.

Our President, Brad Smith, recently visited Norway to celebrate this important launch and to discuss how vital trust is for those we serve, not only to help bring forth innovation but to ensure our customers are protected.

“Our customers have entrusted us to protect, operate, and develop our platform in a way that keeps their data private and secure. This is an immense responsibility that we can’t just claim, but a responsibility that we must earn every single day.” – Brad Smith, President, Microsoft

Accelerating digital transformation in Norway

As we further our expansion commitment, we consider the demand for locally delivered cloud services and the opportunity for digital transformation in the market. Azure enables our customers and partners to increase their utilization of public cloud services and accelerate investments into private and hybrid cloud solutions. Norwegian organizations can now embrace these benefits to further innovation and build digital businesses at scale. Below are just a few of the customers and partners embracing Microsoft Azure in Norway.

The Norwegian banking industry is recognized for its rapid technology adoption, digitalizing the services that build the best products for customers. As Norway’s largest financial services group, DNB Group is a major operator in several industries, for which they also have a Nordic or international strategy. With Microsoft Azure, DNB will be able to migrate to the cloud in accordance with Norwegian data handling regulations to modernize, gain operational efficiency, and secure the best experience for its customers. 

“The possibility of data residency was a decisive factor in choosing Microsoft’s datacenter regions. Now we are looking forward to using the cloud to modernize and achieve efficiency and agility in order to ensure the best experience for our customers.” – Alf Otterstad, Executive Vice President, Group IT, DNB

Equinor, a broad energy company developing oil, gas, wind, and solar energy in more than 30 countries worldwide, has chosen Microsoft Azure to enable its digital transformation journey through a seven-year consumption and development agreement. With this strategic partnership, anchored in cloud-enabled innovation, and by moving its whole system portfolio to Azure, Equinor is aiming to achieve a more cost-efficient, safer, and more reliable operation. Equinor will utilize a variety of cloud services like machine learning and advanced analytics to improve performance, decrease costs, and increase safety. Through the partnership with Microsoft and leveraging capabilities within Azure, Equinor seeks to be a leader in the transformation of the energy industry worldwide and a growing force in renewables.

“Equinor’s ambition is to become a global digital leader within our industry. We have a long history of innovation and technology development. The strategic partnership will, through cloud services, involve development of the next-generation IT workplace, extended business application platforms, and mixed-reality solutions.” – Åshild Hanne Larsen, CIO and SVP, Corporate IT Equinor

Lånekassen, the Norwegian State Educational Loan Fund, has over 1.1 million customers, composed of former and current students. By moving to Azure, it seeks to develop new and transformative citizen services, based on cognitive and analytical technologies. Lånekassen’s purpose is to make education possible, and to provide the Norwegian workforce with relevant competences. It aims to strengthen student funding as well as maintain and increase the already high level of automatized customer services and application processes.

“It has been a priority for Lånekassen to focus on how we can utilize new technology to deliver an even better service for our students and manage our student financing schemes even more efficiently. As we move our core solutions into the cloud, it will give us increased opportunities to innovate. We have already had great success with using machine learning, and we are now looking forward to optimizing our operations further.” – Nina Schanke Funnemark, CEO, Lånekassen

Posten Norge AS has chosen to use the Microsoft Azure platform to meet ever-changing market demands by modernizing some of its existing applications estate and creating new services for its customers and partners. Posten’s next-generation logistics system will provide its workforce with new digital toolsets to deliver even better customer experiences.

“Posten’s vision is to make everyday life simpler and the world smaller. With this vision, we aim to simplify and increase the value of trade and communication for people and businesses in the Nordic region. With the opening of Norwegian datacenter regions, we hope to accelerate and fuel our vision further.” – Arne Erik Berntzen, CIO, Posten AS

Bringing the complete cloud to Norway

The new cloud regions in Norway connect with Microsoft’s 54 regions via our global network, one of the largest and most innovative on the planet, spanning more than 130,000 miles of terrestrial fiber and subsea cable systems to deliver services to customers. Microsoft brings the global cloud closer to home for Norwegian organizations and citizens through our transatlantic system Marea, the highest-capacity subsea cable to cross the Atlantic.

The new cloud regions in Norway are targeted to expand in 2020 with Office 365, one of the world’s leading cloud-based productivity solutions, and Dynamics 365 and Power Platform, the next generation of intelligent business applications and tools.

Learn more about the new cloud services in Norway and the availability of Azure regions and services across the globe.
Source: Azure

Introducing Azure Cost Management for partners

As a partner, you play a critical role in successfully planning and managing long-term cloud implementations for your customers. While the cloud grants the flexibility to scale infrastructure to changing needs, it becomes challenging to control spend when cloud costs can fluctuate dramatically with demand. This is where Azure Cost Management comes in, helping you track and control cloud costs, prevent overspending, and increase the predictability of your cloud spend.

Today we are announcing the general availability of Azure Cost Management for all Cloud Solution Provider (CSP) partners who have onboarded their customers to the new Microsoft Customer Agreement. With this update, partners and their customers can take advantage of Azure Cost Management tools to manage cloud spend, similar to the cost management capabilities available to pay-as-you-go (PAYG) and enterprise customers today.

This is the first of a series of periodic updates to cost management support for partners, enabling partners to understand, analyze, dissect, and manage costs across all their customers and invoices.

With this update, CSPs can use Azure Cost Management to:

Understand invoiced costs and associate the costs to the customer, subscriptions, resource groups, and services.
Get an intuitive view of Azure costs in cost analysis with capabilities to analyze costs by customer, subscription, resource group, resource, meter, service, and many other dimensions.
View resource costs that have Partner Earned Credit (PEC) applied in Cost Analysis.
Set up notifications and automation using programmatic budgets and alerts when costs exceed budgets.
Enable the Azure Resource Manager policy that provides customer access to Cost Management data. Customers can then view consumption cost data for their subscriptions using pay-as-you-go rates.

For more information, see Get started with Azure Cost Management as a partner.

Analyze costs by customer, subscription, tags, resource group or resource using cost analysis

Using cost analysis, partners can group and filter costs by customer, subscription, tags, resource group, resource, and reseller Microsoft Partner Network identifier (MPN ID), giving them increased visibility into costs for better cost control. Partners can also view and manage costs in the billing currency and in US dollars for billing scopes.

Reconcile cost to an invoice

Partners can reconcile costs by invoice across their customers and their subscriptions to understand the pre-tax costs that contributed to the invoice.

You can analyze Azure spend for the customers you support, along with their subscriptions and resources. With this enhanced visibility into your customers’ costs, you can use spending patterns to enforce cost control mechanisms, like budgets and alerts, to manage costs with continued and increased accountability.

Enable cost management at retail rates for your customers

In this update, a partner can also enable cost management features, initially at pay-as-you-go rates, for customers and resellers who have access to subscriptions in the customer’s tenant. As a partner, if you decide to enable cost management for users with access to a subscription, they will have the same capabilities to analyze the services they consume and set budgets to control costs, computed at pay-as-you-go prices for Azure consumed services. This is just the first of the updates; we have features planned for the first half of 2020 to enable cost management for customers at prices that the partner can set by applying a markup on the pay-as-you-go prices.

Partners can set a policy to enable cost management for users with access to an Azure subscription to view costs at retail rates for a specific customer.

If the policy is enabled for subscriptions in the customer’s tenant, users with role-based access control (RBAC) access to the subscription can now manage Azure consumption costs at retail prices.

Set up programmatic budgets and alerts to automate and notify when costs exceed a threshold

As a partner, you can set up budgets and alerts to send notifications to specified email recipients when the cost threshold is exceeded. In the partner tenant, you can set up budgets for costs as invoiced to the partner. You can also set up monthly, quarterly, or annual budgets across all your customers, or for a specific customer, and filter by subscription, resource, reseller MPN ID, or resource group.

Any user with RBAC access to a subscription or resource group can also set up budgets and alerts for Azure consumption costs at retail rates in the customer tenant if the policy for cost visibility has been enabled for the customer.

When a budget is created for a subscription or resource group in the customer tenant, you can also configure it to call an action group. The action group can perform a variety of different actions when your budget threshold is met. For more information about action groups, see Create and manage action groups in the Azure portal. For more information about using budget-based automation with action groups, see Manage costs with Azure budgets.

All the experiences that we provide in Azure Cost Management natively are also available as REST APIs for enabling automated cost management experiences.
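As a rough illustration of that programmatic surface, the sketch below creates a monthly budget with an 80 percent alert by calling what we understand to be the Consumption budgets REST endpoint from Python. The scope, budget name, api-version, and email address are placeholder assumptions, so verify them against the Cost Management REST reference before use.

```python
import requests

# All identifiers below are hypothetical placeholders.
scope = "providers/Microsoft.Billing/billingAccounts/<billingAccountId>"
budget_name = "monthly-customer-budget"
url = (
    f"https://management.azure.com/{scope}"
    f"/providers/Microsoft.Consumption/budgets/{budget_name}"
    "?api-version=2019-10-01"
)

budget = {
    "properties": {
        "category": "Cost",
        "amount": 1000,  # budget amount in the billing currency
        "timeGrain": "Monthly",
        "timePeriod": {
            "startDate": "2019-12-01T00:00:00Z",
            "endDate": "2020-11-30T00:00:00Z",
        },
        "notifications": {
            "actual-greater-than-80-percent": {
                "enabled": True,
                "operator": "GreaterThan",
                "threshold": 80,  # alert when costs exceed 80% of the budget
                "contactEmails": ["finops@contoso.com"],
            }
        },
    }
}

token = "<Azure AD bearer token>"  # e.g. obtained via the azure-identity library
response = requests.put(url, json=budget, headers={"Authorization": f"Bearer {token}"})
response.raise_for_status()
print(response.json())
```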

Coming soon

We will be enabling cost recommendations and optimization suggestions for better savings and efficiency in managing Azure costs.
We will launch Azure Cost Management at retail rates for customers who are not on the Microsoft Customer Agreement and are supported by CSP partners.
Showback features that enable partners to charge a markup on consumption costs are also being planned for 2020.

Try Azure Cost Management for partners today! It is natively available in the Azure portal for all partners who have onboarded customers to the new Microsoft Customer Agreement.
Source: Azure

Unlocking the promise of IoT: A Q&A with Vernon Turner

Vernon Turner is the Founder and Chief Strategist at Causeway Connections, an information and communications technology research firm. For nearly a decade, he’s been serving on global, national, and state steering committees, advising governments, businesses, and communities on IoT-based solution implementation. He recently talked with us about the importance of distinguishing between IoT hype and reality, and identifies three steps businesses need to take to make a successful digital transformation.

What is the promise of IoT?

The promise of more and more data from more and more connected sensors boils down to unprecedented insights and efficiencies. Businesses get more visibility into their operations, a better understanding of their customers, and the ability to personalize offerings and experiences like never before, as well as the ability to cut operational costs via automation and business-process efficiencies.

But just dabbling with IoT won’t unlock real business value. To do that, companies need to change everything: how they make products, how they go to market, their strategy, and their organizational structure. They need to really transform. And to do that, they need to do three things: lead with the customer experience, migrate to offering subscription-based IoT-enabled services, and have a voice in an emergent ecosystem of partners related to their business.

Why is the customer experience so important to fulfilling the promise of IoT?

There can be a lot of hype around IoT-enabled offerings. 

I recently toured several so-called smart buildings with a friend in the construction industry. He showed me that just filling a building with IoT-enabled gadgets doesn’t make it smart. A truly smart building goes beyond connected features and addresses the specific, real-world needs of tenants, leaseholders, and building managers.

If it doesn’t radically change the customer experience, it doesn’t fulfill the promise of IoT.

What’s the disconnect? Why aren’t “smart” solution vendors delivering what customers want?

Frankly, it’s easier to sell a product than an experience.

Customer experience should be at the center of the pitch for IoT, because IoT enables customers to have much more information about the product, in real-time, across the product lifecycle. But putting customer experience first requires making hard changes. It means adopting new strategies, business models, and organization charts, as well as new approaches to product development, sales and marketing, and talent management. And it means asking suppliers to create new business models to support sharing data across the product lifecycle.

Why is the second step to digital transformation, migrating to offering subscription-based, IoT-enabled services, so important?

To survive in our digitally transforming economy, it’s essential for businesses and their suppliers to move from selling static products to a subscription-based services business model.

As sensors and other connected devices become increasingly omnipresent, customers see more real-time data showing them exactly what they’re consuming, and how the providers of the services they’re consuming are performing. By moving to a subscription (or “X as a service”) model, businesses can provide more tailored offerings, grow their customer base, and position themselves for success in the digital age.

When companies embrace transformation, it can have a ripple effect across their operations. Business units can respond to market needs to create a new service by combining microservices using the rapid software development techniques of DevOps. These services drive a shift from infrequent, low-business-value interactions with customers to continuous engagement between customers and companies’ sales and business units. This improves customer relationships, staving off competition, and introducing new sales opportunities.

What challenges should companies be prepared for as they migrate to offering subscription services?

For a subscription-based services model to work, most companies need to make significant changes to their culture and organizational structure.

Financial planning needs to stop reviewing past financial statements and start focusing on future recurring revenue. Instead of concentrating on margin-based products, sales should start selling outcomes that add value for customers. Marketing must be driven by data about the customer experience and what the customer needs, rather than what serves the branding campaign.

From now on, rapid change, responsiveness to the customer, and the ability to customize and scale services are going to be the norm in business.

You mentioned the importance of participating in an emergent ecosystem of partners. What does that mean? Why does it matter?

As digital business processes mature and subscription models become the standard, customers will demand ways to integrate their relationships with IT and business vendors in an ecosystem connected by a single platform.

Early results show that vendors who actively participate in their solution platform’s ecosystem enjoy a higher net promoter score (NPS). In the short term, they gain stickiness with customers. And in the long run, they become more relevant across their ecosystem, gain a competitive advantage over peers inside and outside their ecosystem, and deliver more value to customers.

How does ecosystem participation increase the value delivered to customers?

Because everyone’s using the same platform, customers get transparency into the performance of suppliers. Service-level management becomes the first point of contact between businesses and suppliers. Key performance indicators trigger automatic responses to customer experiences. Response times to resolve issues are mediated by the platform.

These tasks and functions are carried out within the ecosystem and orchestrated by third-party service management companies. But that’s not to say businesses in the ecosystem don’t still have an individual, separate relationship with their customers. Rather, the ecosystem acts as a gateway for IT and business suppliers to integrate their offerings into customer services. Business and product outcomes from the ecosystem feed research and development, product design, and manufacturing, leading to continual improvement in services delivery and customer experience.

To conclude, let’s go back to something we talked about earlier. For builders, a truly smart building is one that does more than just keep the right temperature. It also monitors and secures wireless networks, optimizes lighting based on tenants’ specific needs, manages energy use, and so on to deliver comfortable, customized work, living, or shopping environments. To deliver that kind of customer experience takes an ecosystem of partners, all working in concert. For companies to unlock the value of IoT, they need to participate actively in that ecosystem.

Learn how Azure helps businesses unlock the value of IoT.
Source: Azure

PyTorch on Azure with streamlined ML lifecycle

It's exciting to see the PyTorch community continue to grow and regularly release updated versions of PyTorch! Recent releases improve performance, ONNX export, TorchScript, the C++ frontend, the JIT, and distributed training. Several new experimental features, such as quantization, have also been introduced.

At the PyTorch Developer Conference earlier this fall, we presented how our open source contributions to PyTorch make it better for everyone in the community. We also talked about how Microsoft uses PyTorch to develop machine learning models for services like Bing. Whether you are an individual, a small team, or a large enterprise, managing the machine learning lifecycle can be challenging. We'd like to show you how Azure Machine Learning can make you and your organization more productive with PyTorch.

Streamlining the research to production lifecycle with Azure Machine Learning

One of the benefits of using PyTorch 1.3 in Azure Machine Learning is Machine Learning Operations (MLOps). MLOps streamlines the end-to-end machine learning (ML) lifecycle so you can frequently update models, test new models, and continuously roll out new ML models alongside your other applications and services. MLOps provides:

Reproducible training with powerful ML pipelines that stitch together all the steps involved in training your PyTorch model, from data preparation, to feature extraction, to hyperparameter tuning, to model evaluation.
Asset tracking with dataset and model registries so you know who is publishing PyTorch models, why changes are being made, and when your PyTorch models were deployed or used in production.
Packaging, profiling, validation, and deployment of PyTorch models anywhere from the cloud to the edge.
Monitoring and management of your PyTorch models at scale in an enterprise-ready fashion with eventing and notification of business impacting issues like data drift.

Training PyTorch Models

With MLOps, data scientists write and update their code as usual and regularly push it to a GitHub repository. This triggers an Azure DevOps build pipeline that performs code quality checks, data sanity tests, unit tests, builds an Azure Machine Learning pipeline, and publishes it to your Azure Machine Learning workspace.

The Azure Machine Learning pipeline does the following tasks:

Train model task executes the PyTorch training script on Azure Machine Learning compute. It outputs a model file which is stored in the run history.
Evaluate model task evaluates the performance of the newly trained PyTorch model with the model in production. If the new model performs better than the production model, the following steps are executed. If not, they will be skipped.
Register model task takes the improved PyTorch model and registers it with the Azure Machine Learning model registry. This allows us to version control it.

You can find example code for training a PyTorch model, doing hyperparameter sweeps, and registering the model in this PyTorch MLOps example.
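For context, a minimal sketch of what that training and registration flow can look like with the Azure Machine Learning Python SDK is shown below. The workspace configuration, compute cluster name, script paths, and model names are assumptions for illustration and are not taken from the linked MLOps example.

```python
from azureml.core import Experiment, Workspace
from azureml.train.dnn import PyTorch

# Hypothetical workspace and compute names, for illustration only.
ws = Workspace.from_config()                        # reads config.json for your workspace
compute_target = ws.compute_targets["gpu-cluster"]  # an existing Azure ML compute cluster

# Configure a PyTorch training run on Azure Machine Learning compute.
estimator = PyTorch(
    source_directory="./training",
    entry_script="train.py",
    compute_target=compute_target,
    framework_version="1.3",
    use_gpu=True,
)

run = Experiment(ws, "pytorch-mlops-sample").submit(estimator)
run.wait_for_completion(show_output=True)

# Register the trained model (assumed to be written by train.py to ./outputs/model.pt).
model = run.register_model(model_name="pytorch-model", model_path="outputs/model.pt")
print(model.name, model.version)
```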

Deploying PyTorch models

The Machine Learning extension for DevOps helps you integrate Azure Machine Learning tasks in your Azure DevOps project to simplify and automate model deployments. Once a new model is registered in your Azure Machine Learning workspace, you can trigger a release pipeline to automate your deployment process. Models can then be automatically packaged and deployed as a web service across test and production environments such as Azure Container Instances and Azure Kubernetes Service (AKS). You can even enable gated releases so that, once the model is successfully deployed to the staging or quality assurance (QA) environment, a notification is sent to approvers to review and approve the release to production. You can see sample code for this in the PyTorch ML Ops example.
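A minimal deployment sketch with the same SDK might look like the following. The scoring script, conda environment file, and service name are illustrative assumptions, and a production pipeline would typically run these steps as release tasks rather than interactively.

```python
from azureml.core import Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = Model(ws, name="pytorch-model")  # latest registered version from the training pipeline

# Hypothetical scoring script and conda environment, for illustration only.
inference_config = InferenceConfig(
    source_directory="./deployment",
    entry_script="score.py",
    conda_file="env.yml",
)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=2)

service = Model.deploy(ws, "pytorch-scoring-svc", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
```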

Next steps

We’re excited to support the latest version of PyTorch in Azure. With Azure Machine Learning and its MLOps capabilities, you can use PyTorch in your enterprise with a reproducible model lifecycle. Check out the MLOps example repository for an end to end example of how to enable a CI/CD workflow for PyTorch models.
Source: Azure

Bing delivers its largest improvement in search experience using Azure GPUs

Over the last couple of years, deep learning has become widely adopted across the Bing search stack and powers a vast number of our intelligent features. We use natural language models to improve our core search algorithm’s understanding of a user’s search intent and the related webpages so that Bing can deliver the most relevant search results to our users. We rely on deep learning computer vision techniques to enhance the discoverability of billions of images even if they don’t have accompanying text descriptions or summary metadata. We leverage machine-based reading comprehension models to retrieve captions within larger text bodies that directly answer the specific questions users have. All these enhancements lead toward more relevant, contextual results for web search queries.

Recently, there was a breakthrough in natural language understanding with a type of model called transformers (as popularized by Bidirectional Encoder Representations from Transformers, BERT). Unlike previous deep neural network (DNN) architectures that processed words individually in order, transformers understand the context and relationship between each word and all the words around it in a sentence. Starting from April of this year, we used large transformer models to deliver the largest quality improvements to our Bing customers in the past year. For example, in the query "what can aggravate a concussion", the word "aggravate" indicates the user wants to learn about actions to be taken after a concussion and not about causes or symptoms. Our search powered by these models can now understand the user intent and deliver a more useful result. More importantly, these models are now applied to every Bing search query globally making Bing results more relevant and intelligent.

Deep learning at web-search scale can be prohibitively expensive

Bing customers expect an extremely fast search experience, and every millisecond of latency matters. Transformer-based models are pre-trained with up to billions of parameters, a sizable increase in parameter count and computation requirements compared to previous network architectures. Serving latency for a distilled three-layer BERT model on twenty CPU cores was initially benchmarked at 77ms per inference. However, since these models would need to run over millions of different queries and snippets per second to power web search, even 77ms per inference remained prohibitively expensive at web search scale, requiring tens of thousands of servers to ship just one search improvement.

Leveraging Azure Virtual Machine GPUs to achieve 800x inference throughput

One of the major differences between transformers and previous DNN architectures is that they rely on massive parallel compute instead of sequential processing. Given that graphics processing unit (GPU) architecture was designed for high-throughput parallel computing, Azure’s N-series Virtual Machines (VMs) with built-in GPU accelerators were a natural fit to accelerate these transformer models. We decided to start with the NV6 Virtual Machine, primarily because of its lower cost and regional availability. Just by running the three-layer BERT model on that VM with a GPU, we observed a serving latency of 20ms (about a 3x improvement). To further improve serving efficiency, we partnered with NVIDIA to take full advantage of the GPU architecture and re-implemented the entire model using TensorRT C++ APIs and the CUDA and cuBLAS libraries, including rewriting the embedding, transformer, and output layers. NVIDIA also contributed efficient CUDA transformer plugins, including softmax, GELU, normalization, and reduction.

We benchmarked the TensorRT-optimized GPU model on the same Azure NV6 Virtual Machine and were able to serve a batch of five inferences in 9ms, an 8x latency speedup and 43x throughput improvement compared to the model without GPU acceleration. We then leveraged Tensor Cores with mixed precision on an NC6s_v3 Virtual Machine to optimize performance even further, benchmarking a batch size of 64 inferences at 6ms (~800x throughput improvement compared to CPU).
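As a back-of-envelope check on those ratios, using only the latencies and batch sizes quoted above (and assuming one CPU inference at a time), the implied throughputs work out roughly as follows:

```python
# Throughput implied by the quoted latencies.
cpu_latency_s = 0.077                      # 77 ms per inference on twenty CPU cores
cpu_throughput = 1 / cpu_latency_s         # ~13 inferences/sec

gpu_batch5_s = 0.009                       # batch of 5 inferences in 9 ms (TensorRT on NV6)
gpu_batch5_throughput = 5 / gpu_batch5_s   # ~556 inferences/sec

gpu_batch64_s = 0.006                        # batch of 64 inferences in 6 ms (mixed precision on NC6s_v3)
gpu_batch64_throughput = 64 / gpu_batch64_s  # ~10,667 inferences/sec

print(round(gpu_batch5_throughput / cpu_throughput))   # ~43x over CPU
print(round(gpu_batch64_throughput / cpu_throughput))  # ~821x, i.e. roughly 800x over CPU
```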

Transforming the Bing search experience worldwide using Azure’s global scale

With these GPU optimizations, we were able to use 2,000+ Azure GPU Virtual Machines across four regions to serve over 1 million BERT inferences per second worldwide. Azure N-series GPU VMs are critical in enabling transformative AI workloads and product quality improvements for Bing, with high availability, agility, and significant cost savings, especially as deep learning models continue to grow in complexity. Our takeaway was very clear: even large organizations like Bing can accelerate their AI workloads by using N-series virtual machines on Azure with built-in GPU acceleration. Delivering this kind of global-scale AI inferencing without GPUs would have required an exponentially higher number of CPU-based VMs, which ultimately would have become cost-prohibitive. Azure also gives customers the agility to deploy across multiple types of GPUs immediately, which would have taken months if we had installed GPUs on-premises. The N-series Virtual Machines were essential to our ability to optimize and ship the advanced deep learning models that improve Bing search, available globally today.

N-series general availability

Azure provides a full portfolio of Virtual Machine capabilities across the NC, ND, and NV series product lines. These Virtual Machines are designed for application scenarios for which GPU acceleration is common, such as compute-intensive, graphics-intensive, and visualization workloads.

NC-series Virtual Machines are optimized for compute-intensive and network-intensive applications.
ND-series Virtual Machines are optimized for training and inferencing scenarios for deep learning.
NV-series Virtual Machines are optimized for visualization, streaming, gaming, encoding, and VDI scenarios.

See our Supercomputing19 blog for recent product additions to the ND and NV-series Virtual Machines.

Learn more

Join us at Supercomputing19 to learn more about our Bing optimization journey, leveraging Azure GPUs.
Source: Azure