Introducing automated failover for private workloads using Cloud DNS routing policies with health checks

High availability is an important consideration for many customers, and we're happy to introduce health checking for private workloads in Cloud DNS to help you build business continuity/disaster recovery (BC/DR) architectures. Typical BC/DR architectures are built using multi-regional deployments on Google Cloud. In a previous blog post, we showed how highly available global applications can be published using Cloud DNS routing policies. The globally distributed, policy-based DNS configuration provided reliability, but in case of a failure, it required manual intervention to update the geo-location policy configuration. In this blog, we will use Cloud DNS health check support for Internal Load Balancers to automatically fail over to healthy instances.

We will use the same setup we used in the previous blog: an internal knowledge-sharing web application with a classic two-tier architecture, front-end servers tasked with serving web requests from our engineers and back-end servers containing the data for our application. Our San Francisco, Paris, and Tokyo engineers will use this application, so we decided to deploy our servers in three Google Cloud regions for better latency, performance, and lower cost.

High level design

The wiki application is accessible in each region via an Internal Load Balancer (ILB). Engineers use the domain name wiki.example.com to connect to the front-end web app over Interconnect or VPN. The geo-location policy will use the Google Cloud region where the Interconnect or VPN lands as the source for the traffic and look for the closest available endpoint.

DNS resolution based on the location of the user

With the above setup, if our application in one of the regions goes down, we have to manually update the geo-location policy and remove the affected region from the configuration. Until someone detects the failure and updates the policy, end users close to that region will not be able to reach the application. Not a great user experience. How can we design this better?

Google Cloud is introducing Cloud DNS health check support for Internal Load Balancers. For an internal TCP/UDP load balancer, we can use the existing health checks for a back-end service, and Cloud DNS will receive direct health signals from the individual back-end instances. This enables automatic failover when the endpoints fail their health checks. For example, if the US front-end service is unhealthy, Cloud DNS may return the closest available region's load balancer IP (in our example, Tokyo's) to the San Francisco clients, depending on latency.

DNS resolution based on the location of the user and the health of ILB back ends

Enabling health checks for the wiki.example.com record provides automatic failover in case of a failure and ensures that Cloud DNS always returns only healthy endpoints in response to client queries. This removes manual intervention and significantly improves the failover time. The Cloud DNS routing policy configuration would look like this.

Creating the Cloud DNS managed zone:

gcloud dns managed-zones create wiki-private-zone \
  --description="DNS Zone for the front-end servers of the wiki application" \
  --dns-name=wiki.example.com \
  --networks=prod-vpc \
  --visibility=private
Creating the Cloud DNS record set:

For health checking to work, we need to reference the ILB using the ILB forwarding rule name. If we use the ILB IP instead, Cloud DNS will not check the health of the endpoint. See the official documentation page for more information on how to configure Cloud DNS routing policies with health checks.

gcloud dns record-sets create front.wiki.example.com. \
  --ttl=30 \
  --type=A \
  --zone=wiki-private-zone \
  --routing-policy-type=GEO \
  --routing-policy-data="us-west2=us-ilb-forwarding-rule;europe-west1=eu-ilb-forwarding-rule;asia-northeast1=asia-ilb-forwarding-rule" \
  --enable-health-checking

Note: Cloud DNS uses the health checks configured on the load balancers themselves, so users do not need to configure any additional health checks for Cloud DNS. See the official documentation page for information on how to create health checks for GCP load balancers.

With this configuration, if we were to lose the application in one region due to an incident, the health checks on the ILB would fail, and Cloud DNS would automatically resolve new user queries to the next closest healthy endpoint.

We can expand this configuration to ensure that front-end servers send traffic only to healthy back-end servers in the region closest to them. We would configure the front-end servers to connect to the global hostname backend.wiki.example.com, and the Cloud DNS geo-location policy with health checks will use the front-end servers' GCP region information to resolve this hostname to the closest available healthy back-end tier Internal Load Balancer. A sketch of that record follows below.

Front-end to back-end communication (instance to instance)
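To illustrate, the back-end tier's record could be created the same way (a minimal sketch, not from the original post; the back-end forwarding rule names are hypothetical placeholders):

# Hypothetical sketch: the *-backend-ilb-forwarding-rule names are placeholders
gcloud dns record-sets create backend.wiki.example.com. \
  --ttl=30 \
  --type=A \
  --zone=wiki-private-zone \
  --routing-policy-type=GEO \
  --routing-policy-data="us-west2=us-backend-ilb-forwarding-rule;europe-west1=eu-backend-ilb-forwarding-rule;asia-northeast1=asia-backend-ilb-forwarding-rule" \
  --enable-health-checking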
Putting it all together, we have now set up our multi-regional, multi-tiered application with DNS policies that automatically fail over to the healthy endpoint closest to the end user.

Quelle: Google Cloud Platform

BigQuery’s performance and scale means that everyone gets to play

Editor's note: Today, we're hearing from telematics solutions company Geotab about how Google BigQuery enables them to democratize data across their entire organization and reduce the complexity of their data pipelines.

Geotab's telematics devices and an extensive range of integrated sensors and apps record a wealth of raw vehicle data, such as GPS, engine speeds, ambient air temperatures, driving patterns, and weather conditions. With the help of our telematics solutions, our customers gain insights that help them optimize fleet operations, improve safety, and reduce fuel consumption.

Google BigQuery sits at the heart of our platform as the data warehouse for our entire organization, ingesting data from our vehicle telematics devices and all customer-related data. Essentially, each of the nearly 3 billion raw data records we collect every day across the organization goes into BigQuery, whatever its purpose. In this post, we'll share why we leverage BigQuery to accelerate our analytical insights, and how it has helped us solve some of our most demanding data challenges.

Democratizing big data with ease

As a company, Geotab manages geospatial data, but the general scalability of our data platform is even more critical for us than specific geospatial features. One of our biggest goals is to democratize the use of data within the company. If someone has an idea to use data to better inform some aspect of the business, they should have the green light to do that whenever they want.

Nearly every employee within our organization has access to BigQuery to run queries related to the projects they have permission to see. Analysts, VPs, data scientists, and even users who don't typically work with data have access to the environment to help solve customer issues and improve our product offerings.

While we have petabytes of information, not everything is big: our tables range in size from a few megabytes up to several hundred terabytes. Of course, there are many tricks and techniques for writing performant queries in the BigQuery environment, but most users don't have to worry about optimization, parallelization, or scalability. The beauty of the BigQuery environment is that it handles all of that for us behind the scenes. If someone needs insight from data and isn't a BigQuery expert, we want them to be as comfortable querying terabytes as they are querying smaller tables, and this is where BigQuery excels. A user can write a simple query just as easily on a billion rows as on 100 rows without once thinking about whether BigQuery can handle the load. It's fast, reliable, and frees up our time to rapidly iterate on product ideation and data exploration.

Geotab has thousands of dashboards and scheduled queries constantly running to provide insights for various business units across the organization. While we do hit occasional performance and optimization bumps, most of the time BigQuery races through everything without a hiccup. The fact that BigQuery is also optimized for performance on small tables means we can spread our operations and monitoring across the organization without too much thought: 20% of the queries we run touch less than 6 MB of data, while 50% touch less than 800 MB. That's why it's important that BigQuery excels not only at scale but at throughput for more bite-sized applications.
The confidence we have in BigQuery to handle these loads across so many disparate business units is part of why we continue to push for more teams to take a data-driven approach to their business objectives.

Reducing the complexity of the geospatial data pipeline

The ability of BigQuery to manage vast amounts of geospatial data has also changed our approach to data science. At the scale we are operating, with tens of petabytes of data, it's not feasible for us to operate with anything other than BigQuery. In the past, using open-source geospatial tools, we would hit limits at volumes of around 2.5 million data points. BigQuery allows us to model over 4 billion data points, which is game-changing.

Even basic functions, such as ingesting and managing geospatial polygons, used to require a complex workflow strung together in Python with Dataflow. Now, those geographic data types are handled natively by BigQuery and can be streamed directly into a table. Even better, all of the analytics, model building, and algorithm development can happen in that same environment, without ever leaving BigQuery. No other solution would provide geospatial model building and analytics at this scale in a single environment.

Here's an example. We have datasets of vehicle movements through intersections. Even just a few years ago, we struggled to run an intersection dataset at scale and had to limit its use to one city at a time. Today, we are processing all the intersection data for the entire world every day without ever leaving BigQuery. Rather than worrying about architecting a complex data pipeline across multiple tools, we can focus on what we want to do with the data and the business outcomes we are trying to achieve.

BigQuery is more than a data warehouse

We frequently deal with four or five billion data points in our analytics applications, and BigQuery functions like a data lake. It's not just our SQL database: it also easily supports all of our unstructured data, such as BLOBs from our CRM systems, GIS data files, and images. It has been fascinating to watch SQL consume more and more unstructured data and apply a relational structure that makes it consumable and familiar to analysts with traditional database management skills. A great example is BigQuery's support for JSON functions, which allows us to take hierarchical, non-uniform metadata structures from sources like OpenStreetMap and store them natively in BigQuery with easy access to descriptive keys and values. As a result, we can hire a wider range of analysts for roles across the business, not just PhD-level data scientists, knowing they can work effectively with the data in BigQuery.

Even within our data science team, most of the things that we needed Python to accomplish a few years ago can now be done in SQL. That allows us to spend more time deriving insights rather than managing extended parts of the data pipeline. We also leverage SQL capabilities, such as stored procedures, to run within the data warehouse and churn through billions of data points with a five-second latency. The ease of using SQL to access this data has made it possible to democratize data across our company and give everyone the opportunity to use data to improve outcomes for our customers and develop interesting new applications; a sketch of the kind of query this enables follows below.
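As an illustration (a minimal sketch of ours, not from the original post; the dataset, table, and column names are hypothetical), a single BigQuery query can pull keys out of OpenStreetMap-style JSON metadata and build native geography values at the same time:

bq query --use_legacy_sql=false '
SELECT
  JSON_VALUE(tags, "$.highway") AS road_type,  -- extract a key from semi-structured metadata
  ST_GEOGPOINT(lon, lat) AS location           -- build a native GEOGRAPHY value from raw coordinates
FROM geotab_demo.osm_metadata                  -- hypothetical table of OSM-style records
WHERE JSON_VALUE(tags, "$.highway") IS NOT NULL
LIMIT 10'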
Reimagining innovation with Google Cloud

Over the years, we haven't stayed with BigQuery because we have to; we've stayed because we want to. Google Cloud is helping us drive the insights that will fuel our future and the future of all organizations looking to raise the bar with data-driven insights and intelligence. BigQuery's capabilities have continued to evolve along with our needs, with the addition of increasingly complex analytics, data science methodologies, geospatial support, and BQML. BigQuery offers Geotab an environment with a unique ability to manage, transform, and analyze geospatial data at enormous scale. It also makes it possible to aggregate all the other structured and unstructured data our business needs into a single source of truth, against which all of our analytics can be performed.
Quelle: Google Cloud Platform

How UX researchers make Google Cloud better with user feedback

Customer experiences are critical to user experience (UX) researchers at every level of developing Google Cloud products. Whether it's migrating a user to Google Cloud, helping them understand it once they are there, or delving into using Cloud services, one thing is clear: learning from the people who use Google Cloud is fundamental.

Understanding our users

UX researchers touch various points of the customer journey, like migration, cloud operations, and data analytics. In each of these areas, understanding the customer's business needs and goals gives the UX researcher greater insight into how to provide the best possible experience. This is primarily done by engaging with user feedback. From widely used products like Google Kubernetes Engine and BigQuery to targeted solutions like the Recommendation API and Error Reporting tools, Google Cloud's team of UX researchers pursues a deep understanding of customers' workflows and pain points. Between in-person and remote sessions with UX researchers and online surveys, our Google User Experience Research program offers a range of options for customers to share feedback.

Applying insights to our products

Google's researchers work with our product development team to act on user feedback. Insights learned during one of Google Cloud's early customer migrations resulted in the co-creation, in Google Slides, of our beta and general availability versions of Migrate for Windows Containers. Not only did this underscore the importance of proactive and collaborative customer integration, but it was well received in the developer community. The user group also broadened the Google Cloud team's perspective on the Error Reporting system by requesting that the product expand to handle more categories of errors and events. As a result, the Error Reporting system now catches more types of issues, which are also presented to customers, making the reports more useful for them.

Advocating for user insights

Researchers are the champions of usability in applications, a principle that guides UX researchers as they tap into real-world customer experiences to inspire new roadmaps and ways of working. This process is made possible when users share their lived experiences with our researchers. If you are interested in helping Google Cloud become more helpful for everyone, sign up to be part of the Cloud UX team's user participant pool here.
Quelle: Google Cloud Platform

Azure Scales 530B Parameter GPT-3 Model with NVIDIA NeMo Megatron

This post was co-authored by Hugo Affaticati, Technical Program Manager, Microsoft Azure HPC + AI, and Jon Shelley, Principal TPM Manager, Microsoft Azure HPC + AI.

Natural language processing (NLP), automated speech recognition (ASR), and text-to-speech (TTS) applications are becoming increasingly common in today's world. Most companies have leveraged these technologies to create chatbots that manage customer questions and complaints, streamline operations, and remove some of the heavy cost burden that comes with headcount. But what you may not realize is that they're also being used internally to reduce risk, identify fraudulent behavior, reduce customer complaints, increase automation, and analyze customer sentiment. These technologies are prevalent almost everywhere, but especially in industries such as healthcare, finance, retail, and telecommunications.

NVIDIA recently released the latest version of the NVIDIA NeMo Megatron framework, which is now in open beta. This framework can be used to build and deploy large language models (LLMs) with natural language understanding (NLU).

Combining NVIDIA NeMo Megatron with our Azure AI infrastructure offers a powerful platform that anyone can spin up in minutes without having to incur the costs and burden of managing their own on-premises infrastructure. And of course, we have taken our benchmarking of the new framework to a new level, to truly show the power of the Azure infrastructure.

Reaching new milestones with 530B parameters

We used Azure NDm A100 v4-series virtual machines to run a GPT-3 model on the new NVIDIA NeMo Megatron framework and test the limits of this series. NDm A100 v4 virtual machines are Azure's flagship GPU offerings for AI and deep learning, powered by NVIDIA A100 80GB Tensor Core GPUs. These instances have the most GPU memory capacity and bandwidth, backed by NVIDIA InfiniBand HDR connections to support scaling up and out. Ultimately, we ran a 530B-parameter benchmark on 175 virtual machines, resulting in a training time per step as low as 55.7 seconds (Figure 1). This benchmark measures compute efficiency and how it scales by measuring the time taken per step to train the model after steady state is reached, with a mini-batch size of one. Such outstanding speed would not have been possible without InfiniBand HDR, which provides excellent communication between nodes without added latency.

Figure 1: Training time per step on the 530B-parameter benchmark from 105 to 175 virtual machines.

These results highlight an almost linear speed increase, delivering better performance as nodes are added, which is paramount for heavy or time-sensitive workloads; the back-of-the-envelope sketch below shows what near-linear scaling means in efficiency terms. As shown by these runs with billions of parameters, customers can rest assured that Azure's infrastructure can handle even the most difficult and complex workloads, on demand.
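Scaling efficiency is simply the measured speedup divided by the ideal (perfectly linear) speedup. Here is a quick sketch of that arithmetic; only the 55.7-second step time at 175 VMs comes from the benchmark above, while the 105-VM step time is a made-up placeholder for illustration, not a published result:

# Back-of-the-envelope scaling check (illustrative placeholder numbers)
awk 'BEGIN {
  t105 = 90.0; t175 = 55.7          # seconds per step; t105 is a PLACEHOLDER
  actual = t105 / t175              # measured speedup from 105 to 175 VMs
  ideal  = 175.0 / 105.0            # perfectly linear speedup
  printf "speedup %.2fx vs ideal %.2fx -> efficiency %.0f%%\n", actual, ideal, 100 * actual / ideal
}'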

“Speed and scale are both key to developing large language models, and the latest release of the NVIDIA NeMo Megatron framework introduces new techniques to deliver 30 percent faster training for LLMs,” said Paresh Kharya, senior director of accelerated computing at NVIDIA. “Microsoft’s testing with NeMo Megatron 530B also shows that Azure NDm A100 v4 instances powered by NVIDIA A100 Tensor Core GPUs and NVIDIA InfiniBand networking provide a compelling option for achieving linear training speedups at massive scale.”

Showcasing Azure AI capabilities—now and in the future

Azure’s commitment is to make AI and HPC accessible to everyone. It includes, but is not limited to, providing the best AI infrastructure that scales from the smallest use cases to the heaviest workloads. As we continue to innovate to build the best platform for your AI workloads, our promise to you is to use the latest benchmarks to test our AI capabilities. These results help drive our own innovation and showcase that there is no limit to what you can do. For all your AI computing needs, Azure has you covered.

Learn more

To learn more about the results or how to recreate them, please see the following links.

A quick start guide to benchmarking LLM models in Azure: NVIDIA NeMo Megatron—Results.
A quick start guide to benchmarking LLM models in Azure: NVIDIA NeMo Megatron—Steps.

Quelle: Azure

Introducing the Docker+Wasm Technical Preview

The Technical Preview of Docker+Wasm is now available! Wasm has been producing a lot of buzz recently, and this feature will make it easier for you to quickly build applications targeting Wasm runtimes.

As part of this release, we’re also happy to announce that Docker will be joining the Bytecode Alliance as a voting member. The Bytecode Alliance is a nonprofit organization dedicated to creating secure new software foundations, building on standards such as WebAssembly and WebAssembly System Interface (WASI).

What is Wasm?

WebAssembly, often shortened to Wasm, is a relatively new technology that allows you to compile application code written in more than 40 languages (including Rust, C, C++, JavaScript, and Golang) and run it inside sandboxed environments.
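For a concrete sense of that compilation step, here's a minimal sketch using Rust's WASI target (our example, not from the original post; it assumes a local Rust toolchain with rustup and cargo installed):

# Add the WASI compilation target, scaffold a project, and build a .wasm module
rustup target add wasm32-wasi
cargo new hello-wasm && cd hello-wasm
cargo build --release --target wasm32-wasi
# target/wasm32-wasi/release/hello-wasm.wasm can now run on a Wasm runtime such as WasmEdge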

The original use cases were focused on running native code in web browsers, such as Figma, AutoCAD, and Photoshop. In fact, fastq.bio saw a 20x speed improvement when converting their web-based DNA sequence quality analyzer to Wasm. And Disney built their Disney+ Application Development Kit on top of Wasm! The benefits in the browser are easy to see.

But Wasm is quickly spreading beyond the browser thanks to the WebAssembly System Interface (WASI). Companies like Vercel, Fastly, Shopify, and Cloudflare support using Wasm for running code at the edge, and Fermyon is building a platform to run Wasm microservices in the cloud.

Why Docker?

At Docker, our goal is to help developers bring their ideas to life by conquering the complexity of app development. We strive to make it easy to build, share, and run your application, regardless of the underlying technologies. By making containers accessible to all, we proved our ability to make the lives of developers easier and were recognized as the #1 most-loved developer tool.

We see Wasm as a complementary technology to Linux containers where developers can choose which technology they use (or both!) depending on the use case. And as the community explores what’s possible with Wasm, we want to help make Wasm applications easier to develop, build, and run using the experience and tools you know and love.

How do I get the technical preview?

Ready to dive in and try it for yourself? Great! But before you do, a couple quick notes to keep in mind as you start exploring:

Important note #1: This is a technical preview build of Docker Desktop, and things might not work as expected. Be sure to back up your containers and images before proceeding.
Important note #2: This preview has the containerd image store enabled and cannot be disabled. If you're not currently using the containerd image store, then pre-existing images and containers will be inaccessible.

You can download the technical preview build of Docker Desktop here:

macOS Apple Silicon
macOS Intel
Windows AMD64
Linux Arm64 (deb)
Linux AMD64 (deb, rpm, tar)

Are there any known limitations?

Yes! This is an early technical preview and we’re still working on making the experience as smooth as possible. But here are a few things you should be aware of:

Docker Compose may not exit cleanly when interrupted. Workaround: clean up docker-compose processes by sending them a SIGKILL (killall -9 docker-compose).
Pushes to Hub might give an error stating server message: insufficient_scope: authorization failed, even after logging in using Docker Desktop. Workaround: run docker login in the CLI.

Okay, so how does the Wasm integration actually work?

We’re glad you asked! First off, we need to remind you that since this is a technical preview, things may change quite rapidly. But here’s how it currently works.

We're leveraging our recent work to migrate image management to containerd, as it provides the ability to use both OCI-compatible artifacts and containerd shims.
We collaborated with WasmEdge to create a containerd shim. This shim extracts the Wasm module from the OCI artifact and runs it using the WasmEdge runtime.
We added support to declare the Wasm runtime, which will enable the use of this new shim.

Let’s look at an example!

After installing the preview, we can run the following command to start an example Wasm application:

docker run -dp 8080:8080 --name=wasm-example --runtime=io.containerd.wasmedge.v1 --platform=wasi/wasm32 michaelirwin244/wasm-example

Since a few of the flags might be unfamiliar, let’s explain what they’re doing:

--runtime=io.containerd.wasmedge.v1 - This informs the Docker engine that we want to use the Wasm containerd shim instead of the standard Linux container runtime.
--platform=wasi/wasm32 - This specifies the architecture of the image we want to use. By leveraging a Wasm architecture, we don't need to build separate images for different architectures. The Wasm runtime will do the final step of converting the Wasm binary to machine instructions.

After the image is pulled, the runtime reads the ENTRYPOINT of the image to locate and extract the Wasm module. The module is then loaded into the Wasm runtime, started, and networking is configured. We now have a Wasm app running on our machine!

This particular application is a simple web server that says “Hello world!” and echos data back to us. To verify it’s working, let’s first view the logs.

docker logs wasm-example
Server is now running

We can get the “Hello world” message by either opening http://localhost:8080 in a browser or using curl.

curl localhost:8080

And our response will give us a Hello world message:

Hello world from Rust running with Wasm! Send POST data to /echo to have it echoed back to you

To send data to the echo endpoint, we can use curl:

curl localhost:8080/echo -d '{"message":"Hi there"}' -H "Content-type: application/json"

And we'll see the data sent back to us in the response:

{“message”:”Hi there”}

To remove the application, you can remove it as you would any other Docker container:

docker rm -f wasm-example

The new integration means you can run a Wasm application alongside your Linux containers, even with Compose (see the sketch below). To learn more, check out the docs!
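For instance, here's a minimal sketch of the two side by side (our example, reusing the image from above, with nginx standing in for an arbitrary Linux container):

# A Linux container and a Wasm app running side by side on the same engine
docker run -d --name web nginx:alpine
docker run -dp 8080:8080 --name=wasm-example --runtime=io.containerd.wasmedge.v1 --platform=wasi/wasm32 michaelirwin244/wasm-example
docker ps   # both show up as ordinary containers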

What’s next for Wasm and Docker?

Another great question! Wasm is rapidly growing and evolving, including exploration of how to support multi-threading, garbage collection, and more. There are also many still-to-tackle challenges, including shortening the developer feedback loop and possible paths to production.

So try it out yourself and then let us know your thoughts or feedback on the public roadmap. We’d love to hear from you!
Quelle: https://blog.docker.com/feed/