Choosing the right orchestrator in Google Cloud

What is orchestration?

Orchestration often refers to the automated configuration, coordination, and management of computer systems and services. In the context of service-oriented architectures, orchestration can range from simply executing a single service at a specific time and day, to a more sophisticated approach of automating and monitoring multiple services over longer periods of time, with the ability to react to and handle failures as they crop up.

In the data engineering context, orchestration is central to coordinating the services and workflows that prepare, ingest, and transform data. It can go beyond data processing and also involve a workflow to train a machine learning (ML) model from the data. There is no shortage of orchestration tools in Google Cloud. In this blog post, we will explore service and data orchestration tools and help you choose what’s best for your use case.

Orchestration in Google Cloud

Google Cloud Platform offers a number of tools and services for orchestration:

- Cloud Scheduler for schedule-driven single-service orchestration
- Workflows for complex multi-service orchestration
- Cloud Composer for orchestration of your data workloads

Let’s take a closer look at each of these tools.

Cloud Scheduler

Cloud Scheduler is a service for scheduling the execution of a single service on a recurring schedule — this is about as simple as it gets for orchestration in Google Cloud. Cloud Scheduler uses cron scheduling to trigger the execution of HTTP-based services at a schedule you define. We often see customers using Scheduler alongside Pub/Sub and Cloud Functions to execute their code serverlessly on Google Cloud.

Cloud Scheduler is a good fit if you just need to call a single service at regular intervals. But what if you have multiple services that you want to chain together, feeding the output of one service to the next? Or what if you need to apply complex logic to determine how and when services are invoked? Then you should start considering Workflows.

Workflows

Workflows is a service for orchestrating multiple HTTP-based services into a durable and stateful workflow. Like Cloud Scheduler, Workflows enables you to automate the execution of HTTP-based services running on Cloud Functions and Cloud Run, as well as external services and APIs. Unlike Scheduler, Workflows has sophisticated logic that lets you manage the execution of multiple services as part of a wider workflow. You can use either YAML or JSON to express your workflow. You specify the order of services as steps and define how to handle step failures. The result of one step can be used as an input to other steps throughout the workflow, or as a condition to determine which step to execute next.

Workflows is great for chaining microservices together, automating infrastructure tasks such as starting or stopping a VM, and integrating reliably with external systems. It acts as a central source of truth for service integrations, improving observability and error handling in services. It is also completely serverless, so you don’t have to worry about maintaining resources.
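For illustration, here is a minimal sketch of a workflow expressed in YAML. The endpoints and step names are hypothetical placeholders, not a production definition:

```yaml
main:
  steps:
    - fetchData:
        call: http.get
        args:
          url: https://example.com/api/data   # hypothetical upstream service
        result: apiResponse
    - checkStatus:
        # Branch on the result of the previous step.
        switch:
          - condition: ${apiResponse.body.status == "ready"}
            next: processData
        next: skip
    - processData:
        call: http.post
        args:
          url: https://example.com/api/process  # hypothetical downstream service
          body: ${apiResponse.body}
        result: processed
        next: done
    - skip:
        return: "nothing to process"
    - done:
        return: ${processed}
```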
To execute a workflow, you can trigger it manually (via API or UI) or you can set up a recurring schedule with Cloud Scheduler. Workflows is very useful in service-oriented architectures, but if your focus is more on engineering data pipelines or big data processing, then you should consider using Composer.

Cloud Composer

Composer is a service designed to orchestrate data-driven (particularly ETL/ELT) workflows, and is built on the popular open-source Apache Airflow project. Composer is fully managed, so you don’t have to worry about installing or maintaining Airflow deployments, and it supports your pipelines wherever they are, be that on-premises or across multiple cloud platforms.

Like Workflows, you can create a task for each step in your workflow, configure the order of tasks, and specify which task to execute next based on conditions. Your tasks are expressed in a Python Directed Acyclic Graph (DAG) that can be scheduled to run at a time of your choice. You would use Composer to orchestrate the services that make up your data pipelines, for example, triggering a job in BigQuery or starting a Dataflow pipeline. Operators can be used to communicate with services across multiple cloud environments and on-prem; there are over 150 operators for Google Cloud alone. For example, by passing a few parameters to operators in your DAG file you can easily execute BigQuery jobs or schedule and start pipelines in Dataflow or Dataproc, as the sketch below shows.
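Here is a hedged sketch of what such a DAG might look like. The DAG id, schedule, project, table, and SQL are hypothetical placeholders; BigQueryInsertJobOperator is one of the Google provider operators mentioned above:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# A minimal sketch of a Composer (Airflow) DAG; names and SQL are hypothetical.
with DAG(
    dag_id="daily_sales_transform",     # hypothetical DAG name
    schedule_interval="0 6 * * *",      # run every day at 06:00
    start_date=datetime(2021, 1, 1),
    catchup=False,
) as dag:
    # Run a BigQuery job by passing a job configuration to the operator.
    transform = BigQueryInsertJobOperator(
        task_id="run_bq_transform",
        configuration={
            "query": {
                "query": "SELECT * FROM `my-project.sales.raw`",  # hypothetical SQL
                "destinationTable": {
                    "projectId": "my-project",
                    "datasetId": "sales",
                    "tableId": "daily_summary",
                },
                "writeDisposition": "WRITE_TRUNCATE",
                "useLegacySql": False,
            }
        },
    )
```

Dataflow and Dataproc follow the same pattern: import the relevant operator, pass a few parameters, and wire tasks together with dependencies.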
Composer or Workflows?

Both Composer and Workflows support orchestrating multiple services and can handle long-running workflows. Despite some overlap in the capabilities of these products, each has differentiators that make it well suited to particular use cases:

- Composer is most commonly used for orchestrating the transformation of data as part of ELT or data engineering workflows. Workflows, in contrast, is focused on the orchestration of HTTP-based services built with Cloud Functions, Cloud Run, or external APIs.
- Composer is designed for orchestrating batch workloads that can handle a delay of a few seconds between task executions. It wouldn’t be suitable if low latency were required between tasks, whereas Workflows is designed for latency-sensitive use cases.
- While you don’t have to worry about maintaining Airflow deployments in Composer, you do need to specify how many workers you need for a given Composer environment. Workflows is completely serverless; there is no infrastructure to manage or scale.

Summary

So there you have it: a quick overview of the different orchestration tools in Google Cloud and how to choose the right one for your use case. In the end, the systems and services you’re trying to orchestrate will determine the right tool to use. Coming up in the next blog post, we will dive into data orchestration in more detail, so stay tuned! In the meantime, check out our Quickstart guides on Cloud Scheduler, Cloud Composer, and Workflows to get started!

Source: Google Cloud Platform

Solving for more sustainable and resilient value chains

The past year highlighted the fragility of global supply chains and logistics networks and the need to adapt to rapidly changing business models and customer preferences. The importance of flexible, resilient and sustainable customer value chains is critical as organizations recalibrate to a post-pandemic world.

Global supply chains are also subject to environmental risks. In 2020, over 8,000 suppliers disclosing through CDP, a global disclosure system for environmental impacts, reported that US$1.26 trillion of revenue is likely to be at risk over the next five years due to climate change, deforestation, and water insecurity.1 Organizations are also likely to be challenged with reputational and regulatory risks across their value chain. For instance, emissions from supply chains are on average 11.4x higher than emissions from a company’s direct operations.2 In many sectors, supply chains are responsible for over 80% of total greenhouse gas emissions. There is a strong argument for accelerating efforts to tackle these emissions, as suppliers reported combined savings of US$33.7 billion in 2020 by actively cutting emissions.3

Challenges in creating a sustainable customer value chain

Organizations struggle with achieving sustainable value chain goals largely due to three factors:

- Limited visibility across the end-to-end value chain to drive transparency and accelerate adoption of sustainable best practices
- Lack of flexibility to adapt to new business models, as highlighted during the pandemic and the rapid shift to e-commerce
- Limited insight into an organization’s operational decisions and their impact in reducing carbon emissions, preventing data-driven decision making

How Google Cloud can help

Our mission at Google Cloud is to accelerate your organization’s ability to digitally transform your business. That includes supply chains. With better insights from data you can automate processes more intelligently. With smarter ML models you can optimize systems and routing. With an open platform you can integrate partner solutions. And you can connect your workforce in real time to collaborate up and down the value chain. Leveraging these technologies, we’re partnering with our customers to tackle the unique sustainability challenges they face to help transform their supply chains.

- Enabling sustainable sourcing at Unilever: By combining the power of cloud computing with satellite imagery and AI, Google Cloud and Unilever are building a more holistic view of the forests, water cycles, and biodiversity that intersect Unilever’s supply chain—raising sustainable sourcing standards for suppliers and bringing Unilever closer to its goal of ending deforestation and regenerating nature.
- Reducing emissions from last-mile logistics and fleet operations at UPS: UPS leverages Google Cloud’s smart analytics platform to reduce fuel consumption by 10 million gallons a year, reducing carbon emissions and saving up to $400 million a year.
- Reducing manufacturing waste and improving production quality at LG and GlobalFoundries: LG improved defect detection accuracy by 6% and reduced the time to design and train ML models from days to hours using Google Cloud Vision AI.
The Vision AI solution was also able to reduce waste and increase customer satisfaction and quality at GlobalFoundries.
- Reducing packaging at Lush: Lush was able to nearly eliminate plastic packaging by using Google AI to develop an app that leverages AI and augmented reality to recognize products and overlay product information.

In solutions engineering, our goal is to take these unique experiences and scale them to help organizations reduce emissions and meet their sustainability goals. To facilitate collaboration along this journey, we are creating a program to work with our customers and assess where cloud technology can impact their value chains in an environmentally positive way.

Steps for creating more sustainable value chains

We’ve developed innovative models to tackle challenges from reducing IT costs to data center transformations to IT infrastructure emissions. We’re excited to apply them to our customers’ sustainability priorities and pain points across the supply chain. We will partner closely to build proofs of concept (PoCs) to tackle new opportunities that help customers achieve their sustainability goals.

1. We’ll start with benchmarks. We measure where customers are relative to industry benchmarks, to better help them achieve their target goals.
2. Next, we’ll assess sustainability and supply chain processes against a maturity curve. This creates a gap analysis to identify areas for improvement.
3. We’ll prioritize the areas of focus using factors such as cost reduction, productivity improvement, revenue impact, and environmental and financial risks.
4. We’ll map Google Cloud solutions against your areas of focus, identifying where cloud technology can potentially reduce environmental impact and operational costs, and where it can enhance security, compliance, and flexibility.
5. With a short list of opportunities, we’ll partner closely with your teams to build PoCs and test the impact of our solutions and their ability to scale across your end-to-end value chain.

This is just the beginning of our journey to help address the most challenging problems across the supply chain and partner with our customers to help them achieve sustainability goals. You can learn more about our sustainability efforts and how we’re integrating circular economy principles into our own supply chain to make it more sustainable. Contact us to tell us what you’re solving for and get started. A Google Cloud expert will help you find the best engagement for you. We look forward to helping you achieve your sustainable value chain goals.

1. CDP Global Supply Chain Report 2020
2. CDP Global Supply Chain Report 2020
3. CDP Global Supply Chain Report 2020
Source: Google Cloud Platform

Proxyless gRPC adds support for advanced traffic management features

For developers building cloud-native microservices, an increasingly common choice is gRPC, a high-performance, open-source RPC framework that you can use with Traffic Director for your service mesh. Last year, the gRPC community introduced support for the xDS APIs, bringing service mesh capabilities to proxyless gRPC services. With the most recent release of gRPC, we are adding support for three new capabilities: maximum stream duration (timeout), circuit breaking, and fault injection, enabling you to improve the reliability of your microservices. If you’re already using Traffic Director with proxyless gRPC services, you can add these capabilities via a simple API call. Let’s take a look at these new capabilities.

Stop misbehaving calls with maximum stream duration

The addition of maximum stream duration allows you to configure time limits for all RPCs within the mesh. This feature can be used to ensure clients abort operations that are running unusually slowly when it is unlikely they will succeed, enabling them to try another backend or report an error more quickly. The new xDS maximum stream duration is well integrated with gRPC’s per-RPC deadline, which is set by the client application (a short client-side sketch appears at the end of this post). When both are configured, the effective timeout of a request is the smaller of the two, respecting the requirements of both the service manager and the client application.

Prevent request flooding with circuit breakers

Circuit breaking can be used to control the maximum number of simultaneous RPCs allowed from each client to your service. This helps prevent excessive load that, if unchecked, could cause a widespread service outage. With circuit breakers, new RPCs won’t be sent out if the client has reached the limit of outstanding requests specified by the service manager. Compared to server-side rate limiters, which deny incoming requests when the server is experiencing heavy load, excessive RPCs are throttled by the clients, so no network or service resources are consumed.

Provision system robustness with fault injection tests

Fault injection provides a way to test the resiliency of microservices in the presence of different types of failures. With this feature, you can artificially delay or fail a percentage of RPCs to determine how well your system handles these scenarios before they happen due to a real production event. Fault injection can be configured for all RPCs or for a specific set of requests based on headers set by clients. This gives you the ability to stage different failure scenarios without completely bringing down the production service.

If you’ve been thinking about adopting a service mesh, improving performance, or increasing the reliability of your system, proxyless gRPC and Traffic Director can help you do that. Once you have proxyless gRPC applications set up with Traffic Director, you can enable the above reliability features on all your applications via a single API call, as opposed to having to code each application independently. For more information about using gRPC with Traffic Director and these new features, see the following links:

- Traffic Director with proxyless gRPC services overview
- Advanced traffic management overview
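To close, here is a minimal sketch of the client-side per-RPC deadline described above, in Python. The service name and generated stubs are hypothetical; the xds:/// scheme is how proxyless gRPC clients resolve targets through Traffic Director:

```python
import grpc

# Hypothetical generated stubs for an example service.
from example_pb2 import EchoRequest
from example_pb2_grpc import EchoServiceStub

# Proxyless gRPC clients resolve the service through the xDS resolver.
channel = grpc.insecure_channel("xds:///echo-service")  # hypothetical service name
stub = EchoServiceStub(channel)

try:
    # The per-RPC deadline (here, 2 seconds) combines with the xDS maximum
    # stream duration: the smaller of the two is the effective timeout.
    response = stub.Echo(EchoRequest(message="ping"), timeout=2.0)
except grpc.RpcError as err:
    if err.code() == grpc.StatusCode.DEADLINE_EXCEEDED:
        print("RPC timed out; try another backend or surface an error")
```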
Source: Google Cloud Platform

Faster, cheaper, greener? Pick the Google Cloud region that’s right for you

When it comes to sustainability, we get more done when we move together. That’s why Google Cloud partners with nonprofits, research organizations, governments, and businesses to build technology and tools to accelerate meaningful change. Technologies like machine learning are proving to be invaluable for tackling unique challenges like identifying species in biodiversity and restoration projects such as those being done by Wildlife Insights. Data analytics tools like BigQuery can deliver insights into real-time energy consumption data, helping energy managers at E.ON make decisions that reduce costs and CO2 footprint. And hyper-efficient infrastructure is helping customers like Carrefour reduce their energy use. Using all the tools we have at Google Cloud, we’re committed to helping make your digital transformation a sustainable one too.

As we continue to operate the cleanest cloud in the industry, we’re working with a growing group of cloud customers focused on reducing the carbon impact of their operations. Over 90% of global IT leaders plan to or currently report on sustainability metrics, with 26% of those leaders accelerating emissions reduction projects in the past year.1 In the past year we’ve worked with over 50 customers to evaluate their IT estates for their carbon impact. From digital image libraries to huge data lakes, we’ve seen potential net-carbon reductions from a few thousand kg of CO2e to many kilotons, combining the determination of our customers and Google’s net carbon neutral cloud.

While we match all of the electricity we consume on a global and annual basis with wind and solar purchases (which helps zero out the net carbon impact of Google Cloud Platform and Workspace), we’re working toward 24/7 carbon-free energy. Completely decarbonizing the electricity supply is critical to realizing a carbon-free future and averting the worst impacts of climate change. To help our customers do this, last month we shared the average hourly Carbon Free Energy Percentage (CFE%) for the majority of our Google Cloud regions.

Today, we’re sharing a new tool leveraging this data—a Google Cloud region picker—that helps customers assess key inputs like price, latency to their end users, and carbon footprint as they choose which Google Cloud region to run on. Using the region picker, you weigh factors from “Not Important” to “Important” and select the region from where your user traffic emanates, if applicable. For the almost 90% of developers and IT executives we surveyed2 who would move to a more sustainable data center option, this tool should help them make that decision quickly, using just three inputs:

- Carbon footprint is based on the amount of carbon-free energy supply for each region.
- Cost uses the price for generic compute instances in the region.
- Latency is approximated using physical distance between selected countries and the city or country of the region.

The list of recommended Google Cloud regions changes dynamically, stack-ranked based on the values you input into the tool. We know different types of workloads have different requirements, so you can easily test different priorities. In our research3, production workloads serving user traffic most frequently ranked performance or latency as their top requirement; internal systems like HR or billing ranked performance and data residency as top requirements. However, for best-effort workloads like batch jobs or backup, carbon scores were ranked as the top characteristic more than any other factor.
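To make the weighting idea concrete, here is a toy Python sketch of stack-ranking regions. It is an illustrative guess at the approach, not the region picker’s actual algorithm, and the region data is made up:

```python
# Illustrative only: not the actual region picker algorithm; data is made up.
def rank_regions(regions, weights):
    """Rank regions by a weighted score of carbon, cost, and latency.

    weights: importance from 0.0 ("Not Important") to 1.0 ("Important").
    Higher CFE% is better; lower cost index and latency are better.
    """
    def score(region):
        return (weights["carbon"] * region["cfe_pct"] / 100.0
                - weights["cost"] * region["cost_index"]
                - weights["latency"] * region["latency_ms"] / 100.0)
    return sorted(regions, key=score, reverse=True)

regions = [
    {"name": "region-a", "cfe_pct": 90, "cost_index": 1.0, "latency_ms": 120},
    {"name": "region-b", "cfe_pct": 50, "cost_index": 0.8, "latency_ms": 30},
]
for r in rank_regions(regions, {"carbon": 1.0, "cost": 0.5, "latency": 0.5}):
    print(r["name"])
```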
For companies like Snap Inc. and their sustainability lead Emily Barton, reducing the carbon impact of their digital infrastructure is an important sustainability target for the company. “We’re collaborating with Google to make carbon-free energy data and carbon considerations more useful for users at Snap,” said Barton. Similarly, customers like Salesforce are working to decarbonize their own infrastructure and the services they provide to customers, using our CFE data to help reduce their footprint.

We’re working to integrate carbon considerations more commonly into application development, data center migration, and multi-region or multi-cloud design and architecture. Our partners like SADA Systems are joining us in the effort. “As a leading Google Cloud partner committed to bringing innovative solutions to customers, we’re excited to incorporate sustainability into that commitment,” said Brian Suk, Sr. Solutions Architect. “The CFE measurements and the new tooling being introduced today are already influencing how SADA designs its own Google Cloud-based solutions, and we look forward to evolving our strategy to support more sustainable solutions for our customers.”

Building a sustainable future is a team effort. We’re excited to partner with our customers to cut carbon emissions, explore new ways to protect the earth’s resources, better harness renewable energy, and simply improve the sustainability of their IT infrastructure. Be sure to check out what we’re up to in cloud sustainability and across Google, and use the region picker for your next Google Cloud project.

1. IDG, “No Turning Back: How the Pandemic Has Reshaped Digital Business Agendas”, 2021
2. Google Cloud survey of US IT executives and developers on carbon-free energy prioritization.
3. Google Cloud survey of US IT executives and developers on carbon-free energy prioritization.
Source: Google Cloud Platform

What no-code automation looks like with AppSheet

As developers, we talk about automation in the context of technology as often as your friends might talk about Bitcoin — and for good reason: automation has spanned the farthest corners of cloud computing, from infrastructure as code, container orchestration, and DevOps to machine learning and even billing management. It’s making life easier for infrastructure operators, developers, SecOps engineers, and cloud admins by saving them time, which, as we all know, is a finite resource.

More recently, automation has played a pivotal role in not just productivity gains, but also democratizing technology for citizen developers. Google Cloud AutoML, for example, has lowered the barrier to entry for machine learning through pre-built models and automated model creation. And now AppSheet is taking the baton and bringing automation to its platform with the GA release of AppSheet Automation.

AppSheet is Google Cloud’s intent-aware, no-code application development platform that lets you create applications faster and with less investment than a traditional code-based approach. As an extension of the AppSheet platform, AppSheet Automation offers new, integrated workflow automation capabilities. It leverages Google AI to make it easier to automate business processes, and empowers those without coding skills to reshape their work with features like smarter extraction of structured data from documents and compatibility with a wide range of data sources like Google Workspace Sheets and Drive.

AppSheet Automation reclaims developer time

I recently had Jennifer Cadence, Product Marketing Manager for AppSheet, on the GCP Podcast, and she helped paint a clearer picture for me: “Employees are losing an estimated 20% of time to tasks that could be automated.1 It’s also mindspace and time we could use to focus on high impact work.”

Even as developers, we spend time on manual tasks that lengthen development time or add item after item to our list of things to do. We get requests from business groups to create applications to do things like invoice processing, employee onboarding, and integration with third-party systems. My good friend John, for example, is a data scientist who is still responsible every week for reporting data from a dashboard into a Google Sheet — something he finds frustrating and feels should be automated, so he can focus on high-value work like data experimentation. Likewise, in the past I’ve spent hours on approvals, updating metrics from Salesforce into Sheets, or building an automation that extracts text from a Sheet into a Doc. For employees doing the same, it’s not surprising retention can become problematic. The times I’ve felt the most fulfilled by and creative at my job have been when I get to do more impactful work.

AppSheet Automation is designed to give time back to developers and practitioners, and to democratize app development for business users. You can take reasonably complicated ideas and processes and translate them into working apps by using the platform to express them through automation. Technical audiences will see two benefits:

- Reduce development time by implementing solutions without coding.
- Reduce backlog by empowering the line-of-business users who are closest to the problem to build their own solutions.

Our survey of early adopters shows that over 60% of those engaging with AppSheet Automation have an in-house solution but are seeking something more efficient.
Let’s talk about how.

Automation, of course!

Through the Automation tab in AppSheet, the first new addition you’ll find is Bots, which are used to create automations by combining events with processes. This is the core of what makes AppSheet Automation easy to use: you envision a process or idea, and structure it intuitively here.

Bots can be configured to trigger a process on detection of an event (e.g., a new employee record is created in a Google Sheet), or according to a predetermined schedule. Once enabled, bots run in the background, listening for event triggers. Bots work with your connected data source, and can be created or modified with a few clicks. Changes made are reflected in real time in the app editor, so you always know what your automation consists of.

Events occur when data is changed (e.g., the priority of a service ticket is changed from low to high) or periodically on a schedule (e.g., every Monday morning at 9 AM).

A Process is a sequence of steps (control steps and tasks). A simple process may have just a few steps, such as conditional branches or email tasks, and complete quickly. But a process can be complex, have multiple steps, and take hours or days to complete. It can:

- Loop over many records
- Call another process
- Wait for a human to respond to a request

A Task is a simple activity that runs quickly, like “Send an email,” “Change data,” or “Call a webhook.”

Modularity is a key design principle of AppSheet Automation. Tasks, Events, and Processes are reusable components. Tasks can be reused in multiple processes. Processes can call other processes. Processes and Events can be reused in multiple bots. This allows for “create once, reuse multiple times” strategies.

Intelligent document processing

With AppSheet Automation, you can also leverage intelligent document processing, which uses Document AI to automatically detect invoices, receipts, and forms from documents, PDFs, and images, and to trigger automations and natural language processing. AppSheet Automation connects to documents (and folders) in Google Drive. In the case of documents, you can choose from one of three currently supported types:

- Invoices
- Receipts
- W-9 forms

Intent-aware

Unlike code-driven platforms, AppSheet is a no-code, intent-driven platform. The platform infers the intent of the app being built, which makes the creation and customization process much easier than other options. It uses natural language processing, anticipates the intent of the creator based on keywords, and surfaces contextual suggestions. For example, when you type in “Issue Resolved,” the platform presents bot suggestions. After you select one of them, you have a completely implemented bot ready to go, with the event configuration, expressions, the process, and steps. Suggestions are surfaced at multiple levels. When you click to add a new step and start typing in a keyword like “re(turn)”, it presents contextual suggestions that make authoring simple and intuitive.

Integration with external data sources

In many cases, you need to integrate data from both internal and external company sources, like databases, Airtable, or Box, which requires developer expertise. Remember John, the data scientist? He’s one of a handful of scientists, and his developer team is limited.
They can’t afford to spend time learning how to integrate new tools that aren’t part of their primary job, which is why AppSheet and its ability to integrate data sources can be invaluable. It supports various data sources, including Google Workspace, Microsoft Excel, AWS DynamoDB, Dropbox, API management platforms like Apigee, and cloud and on-premises databases. Once you point AppSheet to your data source, it uses machine learning to understand the schema and structure of your data, including dependencies and relationships, and it develops relevant suggestions for bots, events, processes, and steps. You can also tap into Sheets, Drive, and Salesforce through external eventing to create a more streamlined automation experience.

How to get started

AppSheet Automation offers intelligent process authoring and rich connectivity, introduces bots, and addresses the long tail of human-centric processes, document-based workflows, and app integration use cases. The onus of creating human-centric app workflows no longer needs to fall on development teams, and when it does, AppSheet Automation can help significantly reduce development times. Survey results show that AppSheet Automation has reduced the amount of time to develop automated solutions and time spent on manual tasks.

Sample apps: There are a ton of sample apps to get you started. Just log in to AppSheet.com and head to “Sample apps” at the top.

Videos: We also have many videos to help you learn how to automate processes, integrate with Workspace, build UIs, deploy your apps, and more.

Online community: There’s already a vibrant community around AppSheet, so check out community.AppSheet.com to join the conversation and engage with other creators from around the world. You can find helpful answers, submit feature requests, and explore AppSheet resources.

Head to AppSheet.com to start building with no-code! You can find me on Twitter at @stephr_wong.

1. Jobs lost, jobs gained: What the future of work will mean for jobs, skills, and wages; How Much Time Are You Wasting on Manual, Repetitive Tasks?
Source: Google Cloud Platform

How Lumiata democratizes AI in healthcare with Google Cloud

Editor’s note: Today’s guest post comes from AI for healthcare platform Lumiata. Here’s the story of how they use Google Cloud to power their platform—performing data prepping, model building, and deployment to tackle inherent challenges in healthcare organizations.

If ever there was a year for healthcare innovation—2020 was it. At Lumiata, we’ve been on a mission to deliver smarter, more cost-effective healthcare since 2013, but the COVID-19 pandemic added new urgency to our vision of making artificial intelligence (AI) easy and accessible. Using AI in healthcare went from a nice-to-have to a must-have for healthcare organizations. Just imagine how differently you could plan or assess risk if you could identify communities that are more likely to develop long-term effects or co-morbidities and end up in the hospital.

Our Lumiata AI Platform helps healthcare organizations use AI to improve quality of care, minimize risk, and reduce costs, without having to build those predictive analytics capabilities themselves. We’re making it possible for healthcare organizations to benefit from the latest technologies to derive valuable insights from their data at scale—and ultimately offer better care to their patients. As an AI-first organization, we’ve been able to try new things to meet our customers’ changing needs, and our Google Cloud infrastructure lets us experiment and develop solutions quickly (check out our previous post on why we chose Google Cloud to help us deliver AI).

What our customers need: fast, accessible AI

AI provides an enormous opportunity for healthcare organizations, with an almost unlimited number of practical applications. AI isn’t just a solution you can switch on—it requires implementing the right, often purpose-built, solutions to extract the right insights from your data. AI in healthcare is the next frontier, but the transformation can be slow. Without Lumiata’s help, many organizations struggle to operationalize AI, from data prepping to model building and deployment, even if they have identified the problems they would like to solve. Having advanced data science teams isn’t enough—you need to establish a fast, flexible, and resilient infrastructure to deploy projects. Healthcare AI projects are often plagued by a lack of understanding of the complexity of the high-dimensional data and what it takes to simplify it, as well as what’s required from engineering to productionize AI. In addition, it can be difficult to get the appropriate buy-in to make the changes needed to be successful.

Healthcare organizations are also building technology on waterfall methodologies, which lack the feedback loops and continual improvement needed to deliver the promised results. Without fast proof that AI is worth the investment, many projects fail before they’ve even started.

This is where Lumiata comes in. Our goal is to get customers up and running with the ability to perform fast queries and accurate AI- and ML-driven predictions in a few weeks. The wealth of healthcare data is ripe for generating AI-powered insights and predictions, but it’s often trapped in legacy systems. Also, many organizations simply don’t have the resources to build everything themselves. We provide predictive products to healthcare businesses looking to leverage machine learning without the heavy lift by offering low- to no-code data modeling tools and solutions—all based on Google Cloud.
That way, organizations are empowered to get started and run models when they don’t necessarily have the team to do it themselves.

We selected Google Cloud because of its security infrastructure, intuitive AI tools, and multi-cloud application management technologies. BigQuery, Google’s serverless data warehouse, enables us to provide access to huge amounts of data. With Google Cloud Dataflow and Apache Beam, we built a data ingestion and extraction process to join and normalize disparate patient records and datasets. The entire system is built on Google Kubernetes Engine, allowing us to scale quickly to meet infrastructure requirements, and Kubeflow helps us develop and deliver our machine learning pipelines. Additionally, Google Cloud’s fully managed services mean we don’t have to think about building and operating our infrastructure. Instead, we invest our resources in doing more work for our customers and addressing their data needs. Let’s take a look at how Google Cloud helps us deliver AI solutions to our customers through the steps of a typical ML building process.

1. Data prepping—from raw input to a 360-degree view

Healthcare organizations often suffer from information data silos that lack interoperability, so there’s no true understanding of the total amount of data and insights available. Most companies rarely have a comprehensive longitudinal person record (LPR) into every person’s health history. When it comes to machine learning, cleaning and preparing data is where teams spend the majority of their time. It’s slow and incredibly time-consuming. In addition, working with on-premises environments doesn’t provide enough elasticity—you need to move quickly, and only the cloud has the capacity to support data prepping for AI.

We’ve created a data preparation process that takes raw, messy data and transforms it into fully prepped data for machine learning. Our data management pipeline ingests raw, disparate patient datasets and turns them into what we call Lumiata Person360 records. Using BigQuery and Dataflow, we ingest raw data dumps, link them with existing or synthesized identifiers, then validate, clean, and normalize the data based on medications, procedures, diagnosis codes, and lab results. The data is then tied into a single person record and tagged with proprietary disease codes. Our automated pipeline gives us incredible speed to intake data, and Google Cloud ensures we can scale to handle massive datasets seamlessly. For instance, we’ve been able to take 63 million person records (roughly 2.5 terabytes of data) and run them through our entire data management pipeline in less than four hours.

As healthcare organizations handle protected health information and must ensure Health Insurance Portability and Accountability Act (HIPAA) compliance, it’s imperative that we have the highest level of security and compliance at all times. To ensure this, we deploy single-tenant instances of the entire platform, each as its own Google Cloud project with its own Kubernetes clusters, networking, buckets, BigQuery tables, and services.
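To make step 1 concrete, here is a hedged Apache Beam sketch of an ingest-validate-normalize pipeline in this spirit. The bucket, table, and transformation logic are hypothetical stand-ins, not Lumiata’s actual code:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Hypothetical stand-ins for the real parsing/normalization logic.
def parse_record(line):
    return json.loads(line)

def normalize(record):
    record["medications"] = [m.lower() for m in record.get("medications", [])]
    return record

options = PipelineOptions(
    runner="DataflowRunner",       # run on Dataflow; use DirectRunner locally
    project="my-project",          # hypothetical project
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadRawDumps" >> beam.io.ReadFromText("gs://my-bucket/raw/*.json")
        | "Parse" >> beam.Map(parse_record)
        | "Normalize" >> beam.Map(normalize)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:health.person360",   # hypothetical destination table
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```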
2. Removing the burden of training data models

One of the biggest challenges with model building is developing the infrastructure that enables transparent access to various data sources. Infrastructure setup takes time and resources, and often creates complexity when determining how to manage diverse data representations, architecture, and data quality. This is compounded by the fact that ML pipelines must continuously scale as the data for analysis increases. Ultimately, we don’t want our customers to have to worry about the underlying infrastructure.

We use Kubernetes and Kubeflow to build scalable ML pipelines and deep learning architectures that can support massive datasets. Our platform unlocks millions of input variables (machine learning features) from Person360 patient records and mixes them with our own internal 110 million-member data asset. We then use this data for training our complex data models to predict cost, risk, disease onset, and medical events. Google’s AI Platform also makes it easier for us to experiment faster with large training datasets from our 120 million records. For instance, we have shifted from more traditional machine learning (like gradient-boosted decision trees) toward larger deep learning models that can make predictions across more than 140 medical conditions and analyze them across a specific time dimension.

The real value here is the speed at which we can service our customers, from the time they drop their data into our platform to the first datasets. Our automated machine learning pipeline reduces the time it takes to deliver the first outputs—from months to weeks. For instance, we can now train our models with feature matrices containing 11 million people in less than two hours—all without having to waste time setting up infrastructure for distributed training.

3. Deploying and serving models in production

Productizing complex ML models comes with its own host of challenges. After models are trained and ready for deployment, it can be difficult to maintain consistency as you scale model deployments to meet new use cases or requirements across the organization. Our data science and machine learning engineering teams run offline experiments (outside of Kubeflow) using the Google AI Platform, allowing a single team member to run numerous experiments a day. Once we have a model that works, we version the training pipeline, pre-trained models, and inference pipelines before deploying onto Kubeflow.

The Lumiata AI Platform allows us to benefit from serverless and distributed training—our data scientists are training more models per week, and we have made quicker leaps forward using BERT-based deep learning models. Building on top of Kubernetes and Kubeflow gives us a fast, scalable path to deploy and serve models to our customers. Kubeflow’s reusable components allow us to scale without having to build from scratch. There’s no need to worry about the nuances of training, tuning, and deploying models—customers can simply upload their data and get predictions out on the other end.

Running ML and AI in production

The real impact of simplifying AI implementation is that it opens up previously undiscovered paths for improvement. For instance, we recently launched Pharmacy Intelligence, an AI-powered intervention tool that leverages pharmacy visit data to improve chronic disease management. We partnered with FGC Health, a Canadian retail pharmacy chain, to help them identify diabetic patients at risk for cardiovascular complications who have gaps in care. The tool then recommends a simple, actionable intervention, such as a visit to a specialist, drug titration, or adjustments to an existing drug regimen. This is a wonderful example of how using AI to address common gaps in care has the power to save lives.

As a company, we see Google Cloud as the core of our platform, enabling us to innovate more rapidly.
As a result, we’re delivering solutions for new and interesting problems, such as claims payment integrity, predicting hospital admission and readmission, and identifying disease progression fingerprints for more personalized care. We’re helping healthcare companies become smarter, more powerful, and more effective—leveraging the information they already have in new ways to power the next generation of patient care.
Source: Google Cloud Platform

Four consecutive years of 100% renewable energy—and what’s next

We’re proud to announce that in 2020 Google again matched 100 percent of its global electricity use with purchases of renewable energy. We were the first company of our size to achieve this milestone back in 2017, and we’ve repeated the accomplishment every year since. All told, we’ve signed agreements to buy power from more than 50 renewable energy projects, with a combined capacity of 5.5 gigawatts – about the same as a million solar rooftops.

Achieving 100 percent renewable energy year after year is no easy feat, because the amount of computing done in Google data centers continues to grow. This was especially true in 2020 – a year when many people’s work, school, doctor’s appointments, first dates, and visits with loved ones moved online. Even as Google Meet and Duo hosted over a trillion minutes of video calls in 2020, our renewable energy procurement kept pace.

The path to 100% starts with reducing the amount of energy we use in the first place. Researchers recently found that worldwide data center electricity use stayed close to flat in the last decade, even as computing needs grew 550 percent. And Google has led this trend: compared with five years ago, we now deliver around seven times as much computing power with the same amount of electrical power.

Last year’s accomplishment was also due to a global package of renewable energy deals that we announced in late 2019. As those projects came online over the course of 2020, hundreds of new turbines and hundreds of thousands of new solar panels began converting wind and sun into electrons. Our renewable energy projects that started operations in 2020 spanned four continents. Some highlights we’re especially excited about:

- Google’s first offshore wind project, in the blustery North Sea, began contributing electrons to the grid where we operate our Belgium data center.
- In Chile, we began purchasing power from a new solar farm in the Antofagasta region to match our growing load in South America.
- Solar panels distributed across hundreds of public housing rooftops helped us source new clean energy in land-constrained Singapore.
- Across the U.S., large-scale solar and wind projects gave a boost to data centers from Oklahoma to Alabama to Virginia.

So what’s next? Though we’re thrilled to have matched Google’s annual electricity consumption with renewable energy for four years running, we’re now building on our progress to target an even larger ambition: by 2030, Google aims to run entirely on 24/7 carbon-free energy, everywhere we operate. As we discuss in a new explainer, achieving this goal means shifting away from a net-zero model of “emit and compensate” and instead targeting “absolute zero,” where we simply never emit carbon from our operations in the first place. Solving this challenge is not only important for Google, but will also be essential for fully transitioning electric grids to carbon-free energy. We hope our efforts to develop solutions for our own operations can lead the way. For more, check out Google CEO Sundar Pichai’s 2021 Earth Day update on our progress.
Source: Google Cloud Platform

Colossus under the hood: a peek into Google’s scalable storage system

You trust Google Cloud with your critical data, but did you know that Google also relies on the same underlying storage infrastructure for its other businesses as well? That’s right: the same storage system that powers Google Cloud also underpins Google’s most popular products, supporting globally available services like YouTube, Drive, and Gmail. That foundational storage system is Colossus, which backs Google’s extensive ecosystem of storage services, such as Cloud Storage and Firestore, supporting a diverse range of workloads, including transaction processing, data serving, analytics and archiving, boot disks, and home directories. In this post, we take a deeper look at the storage infrastructure behind your VMs, specifically the Colossus file system, and how it helps enable massive scalability and data durability for Google services as well as your applications.

Google Cloud scales because Google scales

Before we dive into how storage services operate, it’s important to understand the single infrastructure that supports both Cloud and Google products. Like any well-designed software system, all of Google is layered with a common set of scalable services. There are three main building blocks used by each of our storage services:

- Colossus is our cluster-level file system, successor to the Google File System (GFS).
- Spanner is our globally consistent, scalable relational database.
- Borg is a scalable job scheduler that launches everything from compute to storage services. It was, and continues to be, a big influence on the design and development of Kubernetes.

These three core building blocks provide the underlying infrastructure for all Google Cloud storage services, from Firestore to Cloud SQL to Filestore and Cloud Storage. Whenever you access your favorite storage services, the same three building blocks are working together to provide everything you need. Borg provisions the needed resources, Spanner stores all the metadata about access permissions and data location, and then Colossus manages, stores, and provides access to all your data. Google Cloud takes these same building blocks and layers on everything needed to provide the level of availability, performance, and durability you need from your storage services. In other words, your own applications will scale just as Google products do, because they rely on the same core infrastructure.

Colossus in a nutshell

Now, let’s take a closer look at how Colossus works. But first, a little background on Colossus:

- It’s the next generation of GFS.
- Its design enhances storage scalability and improves availability to handle the massive growth in data needs of an ever-growing number of applications.
- Colossus introduced a distributed metadata model that delivered a more scalable and highly available metadata subsystem.

But how does it all work? And how can one file system underpin such a wide range of workloads? Here are the key components of the Colossus control plane.

Client library

The client library is how an application or service interacts with Colossus. The client is probably the most complex part of the entire file system. There’s a lot of functionality, such as software RAID, that goes into the client based on an application’s requirements. Applications built on top of Colossus use a variety of encodings to fine-tune performance and cost trade-offs for different workloads.
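As a toy illustration of such trade-offs (generic erasure-coding arithmetic, not Colossus internals), compare the storage overhead of simple replication with a Reed-Solomon style encoding:

```python
# Toy illustration only: generic erasure-coding math, not Colossus's actual encodings.
def storage_overhead(data_chunks: int, code_chunks: int) -> float:
    """Bytes stored per byte of user data for a (data + code)-chunk encoding."""
    return (data_chunks + code_chunks) / data_chunks

# 3x replication can be viewed as 1 data chunk plus 2 extra copies.
print(storage_overhead(1, 2))  # 3.0 -> simple and fast, but expensive
# A Reed-Solomon(6,3)-style encoding also tolerates 3 lost chunks, at lower cost.
print(storage_overhead(6, 3))  # 1.5 -> cheaper durability, more complex reads
```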
Colossus Control Plane

The foundation of Colossus is its scalable metadata service, which consists of many Curators. Clients talk directly to curators for control operations, such as file creation, and curators can scale horizontally.

Metadata database

Curators store file system metadata in Google’s high-performance NoSQL database, Bigtable. The original motivation for building Colossus was to solve scaling limits we experienced with the Google File System (GFS) when trying to accommodate metadata related to Search. Storing file metadata in Bigtable allowed Colossus to scale up by over 100x over the largest GFS clusters.

D File Servers

Colossus also minimizes the number of hops for data on the network. Data flows directly between clients and “D” file servers (our network-attached disks).

Custodians

Colossus also includes background storage managers called Custodians. They play a key role in maintaining the durability and availability of data as well as overall efficiency, handling tasks like disk space balancing and RAID reconstruction.

How Colossus provides rock-solid, scalable storage

To see how this all works in action, let’s consider how Cloud Storage uses Colossus. You’ve probably heard us talk a lot about how Cloud Storage can support a wide range of use cases, from archival storage to high-throughput analytics, but we don’t often talk about the system that lies beneath. With Colossus, a single cluster is scalable to exabytes of storage and tens of thousands of machines. For example, Compute Engine VMs accessing Cloud Storage, YouTube serving nodes, and Ads MapReduce nodes are all able to share the same underlying file system to complete requests. The key ingredient is having a shared storage pool that is managed by the Colossus control plane, providing the illusion that each has its own isolated file system.

Disaggregation of resources drives more efficient use of valuable resources and lowers costs across all workloads. For instance, it’s possible to provision for the peak demand of low-latency workloads, like YouTube video serving, and then run batch analytic workloads more cheaply by having them fill in the gaps of otherwise idle time. Let’s take a look at a few other benefits Colossus brings to the table.

Simplify hardware complexity

As you might imagine, any file system supporting Google services has fairly daunting throughput and scaling requirements and must handle multi-TB files and massive datasets. Colossus abstracts away a lot of physical hardware complexity that would otherwise plague storage-intensive applications. Google data centers have a tremendous variety of underlying storage hardware, offering a mix of spinning disk and flash storage in many sizes and types. On top of this, applications have extremely diverse requirements around durability, availability, and latency. To ensure each application has the storage it requires, Colossus provides a range of service tiers. Applications use these different tiers by specifying I/O, availability, and durability requirements, and then provisioning resources (bytes and I/O) as abstract, undifferentiated units.

In addition, at Google scale, hardware is failing virtually all the time—not because it’s unreliable, but because there’s a lot of it. Failures are a natural part of operating at such an enormous scale, and it’s imperative that the file system provide fault tolerance and transparent recovery.
Colossus steers I/O around these failures and performs fast background recovery to provide highly durable and available storage. The end result is that the complexity headaches of dealing with hardware resources are significantly reduced, making it easy for any application to get and use the storage it requires.

Maximize storage efficiency

As you might imagine, it takes some management magic to ensure that storage resources are available when applications need them, without overprovisioning. Colossus takes advantage of the fact that data has a wide variety of access patterns and frequencies (e.g., hot data that is accessed frequently) and uses a mix of flash and disk storage to meet any need. The hottest data is put on flash for more efficient serving and lower latency. We buy just enough flash to push the I/O density per gigabyte into what disks can typically provide, and buy just enough disks to ensure we have enough capacity. With the right mix, we can maximize storage efficiency and avoid wasteful overprovisioning.

For disk-based storage, we want to keep disks full and busy to avoid excess inventory and wasted disk IOPS. To do this, Colossus uses intelligent disk management to get as much value as possible from available disk IOPS. Newly written (i.e., hotter) data is evenly distributed across all the drives in a cluster. Data is then rebalanced and moved to larger-capacity drives as it ages and becomes colder. This works great for analytics workloads, for example, where data typically cools off as it ages.

Battle-tested to deliver massive scale

So, there you have it—Colossus is the secret scaling superpower behind Google’s storage infrastructure. Colossus not only handles the storage needs of Google Cloud services, but also meets Google’s internal storage needs, helping to deliver content to the billions of people using Search, Maps, YouTube, and more every single day. When you build your business on Google Cloud, you get access to the same super-charged infrastructure that keeps Google running. We’ll keep making our infrastructure better, so you don’t have to.

To learn more about Google Cloud’s storage architecture, check out the Next ‘20 session from which this post was developed, “A peek at the Google Storage infrastructure behind the VM.” And check out the Cloud Storage website to learn more about all our storage offerings.
Source: Google Cloud Platform

How ShareChat built scalable data-driven social media with Google Cloud

Editor’s note: Today’s guest post comes from Indian social media platform ShareChat. Here’s the story of how they improved performance, app development, and analytics for serving regional content to millions of users using Google Cloud.

How do you create a social network when your country has 22 major official languages and countless active regional dialects? At ShareChat, we serve more than 160 million monthly active users who share and view videos, images, GIFs, songs, and more in 15 different Indian languages. We also launched a short video platform in 2020, Moj, which already supports over 80 million monthly active users.

Connecting with people in the language they understand

As mobile data and smartphones have become more affordable in India, we noticed a large new segment of people, many in rural areas, being welcomed onto the internet. However, many of them didn’t speak English, and when it comes to accessing content and information, language plays a significant role. Instead of joining other social media sites where English reigned supreme, new internet users chose to join language- or dialect-specific WhatsApp groups where they felt more comfortable.

So, we set out to build a platform where people can share their opinions, document their lives, and make new friends, all in their native language. ShareChat simplifies content and people discovery by using a personalized content newsfeed to deliver language-specific content to the right audience. Given the high-intensity data and high volume of content and traffic, we rely heavily on IT infrastructure. On top of that, a large number of our users rely on 2G networks to post, like, view, or follow each other. Our platform needs to deliver great experiences to people who are spread out across the country and different networks without any reduction in performance.

The right cloud partner to support future growth

ShareChat was born in the cloud—we already knew how to scale systems to serve a large customer base with our existing cloud provider. But like many companies, we struggled with over-provisioning compute and storage to accommodate unpredictable traffic and avoid running out of storage. With demand rising for local language content and an increase in online interactions in response to the COVID-19 crisis, we realized that we would need a more efficient way to scale dynamically and allocate resources as needed.

Google Cloud was a natural choice for us. We wanted to partner with a technology-first company that would make it easy (and cost-effective) to manage a strong technology portfolio and allow us to build whatever we wanted. Google is at the forefront of technology innovation and provided everything we needed to build, run, and manage our applications (including creating an efficient DevOps pipeline to fix and release new features quickly). We had a few issues in mind at the start of discussions with the Google Cloud team, but over time, as we got information and support from them, we realized that these were the partners we wanted in our corner when it came time to tackle our most challenging problems. In the end, we decided to take our entire infrastructure to Google Cloud.

To support millions of users, we deploy and scale using Google Kubernetes Engine, while we analyze our data using a combination of managed data cloud services: Pub/Sub for data pipelines, BigQuery for analytics, Cloud Spanner for real-time app serving workloads, and Cloud Bigtable for less-indexed databases.
Google Cloud delivers better outcomes at every level

By moving to Google Cloud, we saw major benefits in several key areas.

Zero-downtime migration for users

At the time of migration, we had over 70 terabytes of data across 220 tables, some of which were up to 14 terabytes with nearly 50 billion rows. Due to our data’s interdependencies, moving services over one at a time wasn’t an option for us. Even though we were migrating such large volumes of data, we didn’t want to impact any of our customers. Latency spikes from out-of-sync data might affect message delivery; for instance, if a message or notification was delayed, we didn’t want to risk a bad user experience causing someone to abandon ShareChat.

To prepare for the move, we ran a proof-of-concept cluster for over four months to test database performance in a real-world scenario, handling more than a million queries per second. Using an open-source API gateway, we replicated our legacy data environment into Google Cloud for performance testing and capacity analysis. As soon as we were confident Google Cloud could handle the same traffic as our previous cloud environment, we were ready to execute. Using wrappers, we were able to migrate without changing anything in our existing application code. The entire migration of 60 million users to Google Cloud took five hours, without any data loss or downtime. Today, ShareChat has grown to 160 million users, and Google Cloud continues to give us the support we need.

Scaling globally to meet unexpected demand

We rely on real-time data to drive everything on ShareChat, tracking everything that goes on in our app, from messages and new groups to the content people like or who they follow. Our users create more than a million posts per day, so it’s critical that our systems can process massive amounts of data efficiently. We chose to migrate to Spanner for its global consistency and secondary indexes. Unlike our legacy NoSQL database, we could scale without having to rethink existing tables or schema definitions, and keep our data systems in sync across multiple locations. It’s also cost-effective for us: moving over 120 tables with 17 indexes into Cloud Spanner reduced our costs by 30%.

Spanner also replicates data seamlessly across multiple locations in real time, enabling us to retrieve documents if one region fails. For instance, when our traffic unexpectedly grew by 500% over just a few days, we were able to scale horizontally without changing a single line of code. We were also launching our Moj video app at the same time, and we were able to move it to another region without a single issue.
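Secondary indexes like the ones mentioned above can be queried directly from application code. Here is a minimal sketch using the Cloud Spanner Python client; the instance, database, table, index, and column names are all hypothetical, not ShareChat’s actual schema.

```python
# A minimal sketch of querying Cloud Spanner through a secondary index.
# Instance, database, table, index, and column names are hypothetical.
from google.cloud import spanner

client = spanner.Client(project="my-project")
instance = client.instance("my-instance")
database = instance.database("app-db")

# Assumes a secondary index created with:
#   CREATE INDEX PostsByLanguage ON Posts(language)
with database.snapshot() as snapshot:
    results = snapshot.execute_sql(
        "SELECT post_id, language FROM Posts@{FORCE_INDEX=PostsByLanguage} "
        "WHERE language = @lang",
        params={"lang": "hi"},
        param_types={"lang": spanner.param_types.STRING},
    )
    for row in results:
        print(row)
```

The FORCE_INDEX hint directs the query planner to use the secondary index explicitly; without it, Spanner may still choose the index on its own.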
Simplifying development and deployment

On average, we handle about 80,000 requests per second (RPS), nearly 7 billion requests per day. Daily push notifications sent to the entire user base about trending topics can spike traffic to 130,000 RPS within a few seconds. Instead of over-provisioning, Google Kubernetes Engine (GKE) enables us to pre-scale for traffic spikes around scheduled events, such as holidays like Diwali, when millions of Indians send each other greetings. Migrating to GKE has also enabled us to adopt more agile ways of working, such as automating deployments and saving time through scripting.

Even though we were already using container-based solutions, they lacked transparency and coverage across the entire deployment funnel. Kubernetes features such as sidecar proxies allow us to attach peripheral tasks like logging to the application without requiring code changes. Kubernetes upgrades are managed by default, so we don’t have to worry about maintenance and can stay focused on more valuable work. Clusters and nodes automatically upgrade to run the latest version, minimizing security risks and ensuring we always have access to the latest features.

Low latency and real-time ML predictions

Many of our users access ShareChat outside of metropolitan areas, but that doesn’t make them more patient when the app loads slowly or their messages are delayed. We strive to deliver a high-performance experience regardless of where our users are. We use Cloud CDN to cache data in five Google Cloud Point of Presence (PoP) locations at the edge in India, allowing us to bring content as close as possible to people and speed up load times. Since moving to Cloud CDN, our cache hit ratio has improved from 90% to 98.5%, meaning our edge caches now serve 98.5% of content requests and only 1.5% reach our origin servers.
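In its default origin-headers mode, Cloud CDN decides what to cache based on standard HTTP caching headers returned by the origin. Below is a minimal, hypothetical sketch of an origin handler (written with Flask) marking a response as cacheable; the route and helper are illustrative only.

```python
# A minimal, hypothetical origin handler that marks responses as cacheable
# so an upstream CDN such as Cloud CDN can serve them from the edge.
from flask import Flask, make_response

app = Flask(__name__)

def load_thumbnail(post_id: str) -> bytes:
    # Placeholder for a real image lookup (e.g., from object storage).
    return b"...image bytes..."

@app.route("/posts/<post_id>/thumbnail")
def thumbnail(post_id):
    response = make_response(load_thumbnail(post_id))
    response.headers["Content-Type"] = "image/jpeg"
    # "public" lets shared caches (like a CDN) store the response; max-age
    # controls how long the edge may serve it without revalidating.
    response.headers["Cache-Control"] = "public, max-age=3600"
    return response
```

With headers like these in place, repeat requests for the same content can be served entirely from the edge, which is what drives a high cache hit ratio.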
As we expand globally, we’d like to use machine learning to reach new people with content in different languages. We want to build new algorithms that process real-time datasets in regional languages and accurately predict what people want to see. Google Cloud gives us an infrastructure optimized for compute-intensive workloads that will be useful to us both now and in the future.

The confidence to build the best platform

Our current system performs better than before we migrated, and we are continuously building new features on top of it. Google’s data cloud has provided us with an elegant ecosystem of services that allows us to build whatever we want, more easily and faster than ever before. Perhaps the biggest advantage of partnering with Google Cloud has been the connection we have with the engineers at Google. If we’re working on a specific problem and need help with a particular library or piece of code, we can immediately connect with the team responsible for it. As a result, we have experienced a massive boost in confidence. We know that we can build a really good system because we not only have a good process in place to solve problems, we also have the right support behind us.

Related Article
Database observability for developers: introducing Cloud SQL Insights
New Insights tool helps developers quickly understand and resolve database performance issues on Cloud SQL.
Read Article

Source: Google Cloud Platform

The 5 benefits of Cloud SQL [infographic]

Tired of spending too much time on database maintenance? You’re not alone. Many businesses are turning to fully managed database services to help build and scale infrastructure quickly, freeing up time and resources to spend on innovation instead. At Google Cloud, our managed database services include Cloud SQL, which makes it easy to move MySQL, PostgreSQL, and SQL Server workloads to the cloud. Here are the top five benefits of using Google’s Cloud SQL.

[Infographic: The 5 benefits of Cloud SQL]

Check out more details on how managed services like Cloud SQL work and why they matter.
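To give a flavor of what connecting to a managed instance looks like from application code, here is a minimal sketch using the open-source Cloud SQL Python Connector against a MySQL instance; the instance connection name, credentials, and database name are placeholders, not real values.

```python
# A minimal sketch of connecting to a Cloud SQL for MySQL instance with the
# Cloud SQL Python Connector (pip install "cloud-sql-python-connector[pymysql]").
# Instance connection name, user, password, and database are placeholders.
from google.cloud.sql.connector import Connector

connector = Connector()

conn = connector.connect(
    "my-project:asia-south1:my-instance",  # instance connection name
    "pymysql",                             # database driver to use
    user="app-user",
    password="change-me",
    db="app-db",
)

with conn.cursor() as cursor:
    cursor.execute("SELECT NOW()")
    print(cursor.fetchone())

conn.close()
connector.close()
```

The connector handles TLS encryption and IAM-based authorization for you, which is part of what "fully managed" means in practice: less connection plumbing to build and maintain yourself.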
Source: Google Cloud Platform