Google Cloud and NVIDIA’s enhanced partnership accelerates computing workloads

Companies from startups to multinationals are striving to radically transform the way they solve their data challenges. As they continue to manage increasing volumes of data, these companies are searching for the best tools to help them achieve their goals—without heavy capital expenditures or complex infrastructure management.

Google Cloud and NVIDIA have been collaborating for years to deliver a powerful platform for machine learning (ML), artificial intelligence (AI), and data analytics to help you solve your complex data challenges. Organizations use NVIDIA GPUs on Google Cloud to accelerate machine learning training and inference, analytics, and other high-performance computing (HPC) workloads. From virtual machines to open-source frameworks like TensorFlow, we have the tools to help you tackle your most ambitious projects. For instance, Google Cloud’s Dataproc now lets you use NVIDIA GPUs to speed up ML training and development by up to 44 times and reduce costs by 14 times.

To continue to help you meet your goals, we’re excited to announce forthcoming support for the new NVIDIA Ampere architecture and the NVIDIA A100 Tensor Core GPU. Google Cloud and the new A100 GPUs will come with enhanced hardware and software capabilities to enable researchers and innovators to further advance today’s most important AI and HPC applications, from conversational AI and recommender systems to weather simulation research on climate change. We’ll be making the A100 GPUs available via Google Compute Engine, Google Kubernetes Engine, and Cloud AI Platform, allowing customers to scale up and out with control, portability, and ease of use. In addition, Google Cloud’s Deep Learning VM images and Deep Learning Containers will bring pre-built support for NVIDIA’s new generation of libraries to take advantage of A100 GPUs.
The Google Cloud, NVIDIA, and TensorFlow teams are partnering to provide built-in support for this new software in all TensorFlow Enterprise versions, so TensorFlow users on Google Cloud can use the new hardware without changing any code or upgrading their TensorFlow versions.

Avaya makes customer connections with Google Cloud and NVIDIA

Avaya, a leading global provider of unified communications and collaboration, uses Google Cloud and NVIDIA technology to address customers’ critical business challenges. Avaya Spaces, a born-in-the-cloud video collaboration solution, runs on Google Cloud and is deployed in multiple data centers globally. With COVID-19 changing the way we work, this solution has been especially helpful to organizations as they shift to social distancing and working from home.

“Moving our video processing over to NVIDIA T4s on Google Cloud opens up new innovation opportunities for our platform. Our direction is to infuse real-time AI capabilities in our user experience to create unique value for our end users,” says Paul Relf, Senior Director of Product Management, Cloud Collaboration at Avaya. “We are heavy users of Google Cloud and the value-added capabilities that are available to us. We are also keenly interested in the new AI capabilities coming from NVIDIA and how we can leverage the combined ecosystem to create better outcomes for our Avaya Spaces users.”

There is a wide range of use cases for NVIDIA on Google Cloud solutions, across industries and company sizes. We spoke about some of the AI platform uses—from edge computing to graphics visualization—at NVIDIA’s GTC Digital event.
You can check out some of the on-demand sessions we think are particularly interesting below:

- Building a Scalable Inferencing Platform in GCP
- Google Cloud AutoML Video and Edge Deployment
- GPipe: Efficient Training of Giant Neural Networks Using Pipeline Parallelism
- JAX: Accelerating Machine-Learning Research with Composable Function Transformations in Python
- Artificial and Human Intelligence in Healthcare

If you’re interested in learning more about the new A100 GPUs on Google Cloud, fill out this form and we’ll be in touch.
Source: Google Cloud Platform

Helping manufacturers during and after COVID-19

The impact of COVID-19 has touched every industry—including manufacturing, which relies heavily on skilled, hands-on workers and complex supply chains. According to a March 2020 survey by the U.S. National Association of Manufacturers, four out of five U.S. manufacturing companies expect to be financially impacted by COVID-19, more than half think they’ll need to change how they operate, and over a third anticipate supply chain disruptions. COVID-19 has also compelled manufacturers to address important issues facing their employees, including remote work and social distancing. It’s a critical time for manufacturers to increase the agility and digitization of their supply chains and operations; here’s how cloud can help businesses resume under new norms.

Enabling automation

While the full impact of COVID-19 is still unknown, many factories are already experiencing decreases in workforce capacity and resources. Manufacturers face tough questions around how to quickly understand this new landscape and develop new operating procedures and automation initiatives that let them adapt quickly and allow their employees to safely work on site.

We want to help manufacturers address these challenges by offering tools that automate processes, remotely monitor systems, and extend their capabilities beyond the factory floor. Using Vision AI, for example, manufacturers can train machine-learning models to visually inspect goods and processes for quality and compliance, without putting human inspectors at risk. By connecting operational technology (OT) and information technology (IT) via the cloud, operators can monitor and control specific machines or plants remotely, using dashboards and performance views.
GlobalFoundries, a leader in the semiconductor manufacturing industry, is already using AutoML Vision to build a visual inspection solution that can detect random defects in wafer map and scanning electron microscope (SEM) images, which are essential elements in the semiconductor manufacturing process. AutoML Vision reads in the images of wafers and sample defects, and trains customized models to detect these defects. In the initial pass alone, AutoML Vision successfully classified 80% of the images based on a limited amount of training data. This fast path to high accuracy let GlobalFoundries quickly move to production, start realizing benefits, and scale up. The foundry subsequently improved quality and customer satisfaction, and 40% of the manual inspection workload has already been shifted to the visual inspection solution built on AutoML Vision.

Supporting remote work

Google Meet, our premium, secure video meetings solution, can help manufacturers hold daily meetings, training, and onboarding without the need to be on site, while G Suite tools like Google Docs, Sheets, and Slides let teams collaborate on document authoring remotely. Google Meet can also assist with virtual training and the safe and secure onboarding of new hires.

KAESER Compressors, one of the world leaders in compressed air products, accelerated their deployment of G Suite when they needed to rapidly convert their teams to remote work as a result of COVID-19. “We were deeply impressed by how quickly our employees adopted G Suite and we really believe in cloud’s benefits,” says Falko Lameter, CIO at KAESER Compressors. “We have access to more memory, better machines, and more advanced technology, such as machine learning.
Google Cloud has shown its commitment to innovation in these areas, and we are looking forward to scaling our collaboration in the future.”

Energy solutions provider Viessmann was able to keep up its production, and G Suite helped them convert their employees to remote work within 48 hours. Since then, they have conducted roughly 60,000 Google Meet conferences per month. Additionally, Google Sheets was established as the primary IT dashboard for monitoring KPIs for Viessmann’s IT infrastructure. As one of the first manufacturing companies to choose Chromebooks, Viessmann gave employees who work from home peace of mind, as their accounts are secure and protected through Google’s modern authentication technologies. Viessmann even developed a ventilator in significantly less time than usual; G Suite helped the engineers involved collaborate and exchange ideas on the project with ease, and was essential in speeding up the launch. “The G Suite communications and collaboration infrastructure is self-maintaining, and it has made each of us more independent,” says Alexander Pöllmann, Head of Intranet and Collaboration Services at Viessmann. “We can work together from everywhere across borders and time zones, on any device. This kind of flexibility makes our workforce more agile, and ultimately, happier and more productive.”

Koenig & Bauer, the world’s oldest printing press manufacturer, migrated to G Suite in early 2020 to increase productivity and collaboration amongst teams. In light of COVID-19, the timely switch to G Suite helped Koenig & Bauer significantly in keeping workplaces connected.
Teams in different locations—even at home—have access to tools like video conferencing, calendar functions, word processing, and spreadsheet calculations, and can now share files, all with a single click and on one single platform.1 “Before we moved to G Suite, we had a very heterogeneous, non-collaborative IT environment and the resource consumption was pretty heavy,” says Jürgen Tuffentsammer, CIO of Koenig & Bauer. “Since the migration, our team collaboration has improved and we can share and execute against innovative ideas so much faster than before.”

Managing volatility in supply and demand

The COVID-19 pandemic has disrupted global supply chains and distribution channels. Now more than ever, accurately forecasting demand and optimizing supply in this continuously evolving environment requires integrating data from multiple sources and analyzing it in real time. It’s a job analytical tools like BigQuery were designed for. By applying smart analytics and AI, manufacturers can better predict demand and adapt their operations to meet it.

AI can also help mitigate problems with last-mile delivery, which accounts for more than half of all shipping costs. The ability to optimize routes using real-time weather and traffic data, as well as to deploy ML models to predict where new pickups are likely to come from, will enable companies to minimize operating expenses while maximizing service levels.

Last month, Missouri Governor Mike Parson announced the launch of a new tool developed by Google Cloud to help health care providers connect with Missouri manufacturers and suppliers of personal protective equipment (PPE).
The Missouri PPE Marketplace tool is a joint effort between the state and the Missouri Hospital Association, built to help manufacturers that have shifted production to PPE enter the healthcare market and connect with buyers. Additionally, over the past month, the state’s Department of Economic Development (DED) has gathered interest from more than 200 PPE manufacturers and suppliers, inviting them to register in the system. State healthcare agencies and the Missouri Hospital Association are reaching out to healthcare providers across the state to ensure they have access and can connect directly with suppliers through the new tool.

Optimizing IT spend

Moving to the cloud is an important way businesses can optimize their technology spend and find new efficiencies. For manufacturers, this can translate into more customers served, more issues resolved, and more adaptability for the overall business. The cloud offers the potential for drastically reduced data costs and infrastructure savings, plus increased performance, simplicity, and scalability across IT environments.

The cloud can support manufacturers in many ways, from improving safety, to weathering market uncertainties, to preparing for the future. For example, we recently joined the Rolls-Royce EMER²GENT alliance, which aims to help foster global economic recovery by identifying early signs of rebounding economies (we’re providing access to our public data sets and BigQuery as part of the effort). Whether it’s modernizing infrastructure, increasing agility, or digitizing supply chains and operations, we’re focused on providing the solutions that help manufacturers operate during the pandemic and beyond.

Visit our website to learn more about manufacturing on Google Cloud.

1. Source: Koenig & Bauer Annual Report 2019
Source: Google Cloud Platform

Announcing Google Cloud VMware Engine: Accelerating your cloud journey

VMware technologies form the cornerstone of many customers’ enterprise IT environments, and those same enterprises are eager to run their VMware environments in the cloud to scale quickly and benefit from cloud services. Last summer, we announced support for customers to run VMware workloads on Google Cloud, and we have made significant progress since then. In the fall we acquired CloudSimple to provide customers a fully integrated VMware-based solution, and today we’re proud to announce another significant milestone—Google Cloud VMware Engine, an integrated first-party offering with end-to-end support to migrate and run your VMware environment in Google Cloud. This fully managed service is expected to be generally available this quarter out of two US regions, expanding into additional Google Cloud regions globally in the second half of the year.

Introducing Google Cloud VMware Engine

The service delivers a fully managed VMware Cloud Foundation stack—VMware vSphere, vCenter, vSAN, NSX-T, and HCX for cloud migration—in a dedicated environment on Google Cloud’s highly performant and reliable infrastructure to support enterprise production workloads. With this service, you can migrate or extend your on-premises workloads to Google Cloud in minutes by connecting to a dedicated VMware environment directly through the Google Cloud Console. This allows you to seamlessly migrate to the cloud without the cost or complexity of refactoring applications, and to run and manage workloads consistently with your on-premises environment. By running your VMware workloads on Google Cloud, you reduce your operational burden while benefiting from scale and agility, and maintain continuity with your existing tools, policies, and processes. Importantly, you can quickly meet your business needs by creating a VMware SDDC environment on Google Cloud in a matter of minutes, enabling you to scale business-critical applications on demand.
The service is VMware Cloud Verified, the highest level of validation for VMware-based cloud services, helping ensure compatibility and operational continuity across on-premises and cloud environments. “VMware and Google Cloud are working together to help power customers’ multi-cloud strategies, and the new Google Cloud VMware Engine will enable our mutual customers to drive digital transformation and business resiliency using the same VMware Cloud Foundation running in their data centers today,” said Ajay Patel, senior vice president and general manager, cloud provider software business unit at VMware. “Google Cloud VMware Engine enables organizations to quickly deploy their VMware environment in Google Cloud, delivering scale, agility and access to cloud-native services while leveraging the familiarity and investment in VMware tools and training.”

A differentiated VMware experience

Google Cloud VMware Engine is built on Google Cloud’s highly performant, scalable infrastructure with fully redundant and dedicated 100 Gbps networking, providing 99.99% availability to meet the needs of your most demanding enterprise workloads. Cloud networking services such as Interconnect and VPN provide easy access from your on-premises environments to the cloud, while high-bandwidth connectivity to cloud services optimizes for performance and flexibility and minimizes costs and operational overhead. End-to-end, one-stop support is integrated to provide a seamless experience across this service and the rest of Google Cloud.

Google Cloud VMware Engine is designed to minimize your operational burden so you can focus on your business. We take care of the lifecycle of the VMware software stack and manage all related infrastructure and upgrades. Customers can continue to leverage IT management tools and third-party services consistent with their on-premises environment.
We’re partnering closely with leading storage, backup, and disaster recovery providers such as NetApp, Actifio, Veeam, Zerto, Cohesity, and Dell Technologies to ensure support for third-party solutions, ease the migration journey, and enable business continuity.

An integrated Google Cloud experience

In addition to the ease of migration, you can benefit from full access to innovative Google Cloud services such as BigQuery, Cloud Operations, Cloud Storage, Anthos, and Cloud AI. Billing, identity management, and access control are also fully integrated into Google Cloud to unify the experience with other Google Cloud products and services. As you look to migrate and modernize workloads over time, these cloud-native services allow you to streamline management, surface new data insights, and deliver new and innovative services to your customers.

Unlocking business value

Over the past few months, we’ve engaged with numerous customers through our early access program. Customers have experienced first-hand the rapid and simple migration that Google Cloud VMware Engine enables as they look to extend or migrate workloads into the cloud. Capital markets infrastructure provider Deutsche Börse Group was impressed by the ease and simplicity of migrating VMware workloads to Google Cloud.

“As one of the world’s largest market infrastructure providers, implementing innovative and resilient solutions for financial markets is key when it comes to maintaining efficient, stable and most important secure operations,” says Dr. Christoph Böhm, Member of the Executive Board and Chief Information Officer, Deutsche Börse Group. “As a long-term VMware customer we are keen to extend our large landscape towards hyperscaling options, keeping existing control planes and lifecycle management stable.
Google Cloud VMware Engine allows us now to quickly extend our VMware environment to Google Cloud, one of Deutsche Börse’s public cloud partners, increasing our business agility and building even higher levels of resiliency. The steps we have gone through so far together are hugely encouraging, giving us innovative and flexible ways in running hybrid cloud scenarios.”

QAD, a leading ERP software provider, is also excited about the benefits of running VMware on Google Cloud. “With Google Cloud VMware Engine, we are able to quickly extend our VMware-based platform to Google Cloud to meet our goal of being rapid, agile and effective,” says Scott Lawson, Director, IT Architecture at QAD. “As a leading ERP software provider, partnering with Google Cloud and VMware allows us to reduce our operational burden, improve our disaster recovery capabilities to ensure consistent availability for our customers, and benefit from native Google Cloud services to continuously innovate.”

Enabling customer success through our partner ecosystem

We’re proud to partner closely with regional and global system integrators to simplify and enable the success of our mutual customers’ cloud migration journey. Our partners such as Deloitte, Atos, and WWT are committed to building cloud services to help customers adopt Google Cloud VMware Engine and accelerate their digital transformation through native Google Cloud services. Partners can play an essential role in accelerating migration and helping you achieve faster time-to-value.

“As customers look to simplify their cloud migration journey, we’re committed to build cloud services to help customers benefit from the increased agility and efficiency of running VMware workloads on Google Cloud,” said Bob Black, Dell Technologies Global Lead Alliance Principal, Deloitte Consulting LLP.
“By combining Google Cloud’s technology and Deloitte’s business transformation experience, we can enable our joint customers to accelerate their cloud migration, unify operations, and benefit from innovative Google Cloud services as they look to modernize applications.”

Partners also see Google Cloud VMware Engine as a key offering to help customers accelerate their cloud journey. “Running VMware workloads on Google Cloud is a priority for many enterprise customers as they look to benefit from the scale and agility of the cloud while maintaining consistency across hybrid and multi-cloud environments,” said Peter Cutts, SVP, Digital Transformation Officer, Atos Cloud Enterprise Solutions. “We are excited for the opportunity to reinforce our partnership with Google Cloud by combining all the value Atos brings to VMware and Google to provide a differentiated experience while enabling customers to benefit from turnkey offerings including cloud-native services such as BigQuery, AI & machine learning.”

“As a Google Cloud Premier Partner, we are excited about the addition of Google Cloud VMware Engine to the ever-growing list of services already driving value to our mutual customers,” said Michael Taylor, Chief Technology Officer, World Wide Technology. “Hybrid cloud strategies continue to be a focal point for our customers and this offering substantially accelerates the timeframe for organizations to move their workloads to the cloud and modernize their infrastructure.”

Getting started

Google Cloud VMware Engine is expected to be generally available to customers this quarter in the Northern Virginia (us-east4) and Los Angeles (us-west2) regions.
We plan for the service to be available globally in eight additional regions—London, Frankfurt, Tokyo, Sydney, Montréal, São Paulo, Singapore, and the Netherlands—in the second half of the calendar year. We are excited about this milestone and committed to delivering an optimal platform to run your VMware workloads alongside Google Cloud services to solve business problems and innovate in new areas. You can find more information, including product features and resources, on our website. We also invite you to join us for our upcoming webinar, where we will provide a more detailed overview of the service, dive into key use cases, and discuss how you can accelerate your cloud migration journey. We look forward to connecting with you.
Source: Google Cloud Platform

Cloud cost optimization: principles for lasting success

Cloud is more than just a cost center. Moving to the cloud allows you to enable innovation at a global scale, expedite feature velocity for faster time to market, and drive competitive advantage by quickly responding to customer needs. So it’s no surprise that many businesses are looking to transform their organization’s digital strategy as soon as possible. But while it makes sense to adopt cloud quickly, it’s also important to take time to review key concepts prior to migrating or deploying your applications into the cloud. Likewise, if you already have existing applications in the cloud, you’ll want to audit your environment to make sure you are following best practices. The goal is to maximize business value while optimizing cost, keeping in mind the most effective and efficient use of cloud resources.

We’ve been working side by side with customers with complex environments as they usher in the next generation of applications and services on Google Cloud. When it comes to optimizing costs, there are lots of tools and techniques that organizations can use. But tools can only take you so far. In our experience, there are several high-level principles that organizations, no matter the size, can follow to make sure they’re getting the most out of the cloud. In this blog post, we’ll take a look at some of these concepts, so you can effectively right-size your deployments. Then we’ll consider the three kinds of cloud cost optimization tools, and provide a framework for how to prioritize cost optimization projects. Finally, if you want more, including prescriptive advice about optimizing compute, networking, storage, and data analytics costs on Google Cloud, we’ve regrouped some of our most popular blog posts on the topic into an all-in-one downloadable ebook, “Understanding the principles of cost optimization.”

Cost optimization with people and processes

As with most things in technology, the greatest standards are only as good as how well they are followed.
The limiting factor, more often than not, isn’t the capability of the technology, but the people and processes involved. The interests of executive teams, project leads, finance, and site reliability engineers (SREs) all come into play when it comes to cost optimization. As a first step, these key stakeholders should meet to design a set of standards for the company that outline desired service-level profitability, reliability, and performance. We highly recommend establishing a tiger team to kickstart this initiative.

Using cloud’s enhanced cost visibility

A key benefit of a cloud environment is the enhanced visibility into your utilization data. Each cloud service is tracked and can be measured independently. This can be a double-edged sword: now you have tens of thousands of SKUs, and if you don’t know who is buying what services and why, it becomes difficult to understand the total cost of ownership (TCO) for the applications or services deployed in the cloud. This is a common problem when customers make the initial shift from an on-premises capital expenditures (CapEx) model to cloud-based operational expenditures (OpEx). In the old days, a central finance team set a static budget and then procured the needed resources. Forecasting was based on a metric such as historic growth to determine the needs for the next month, quarter, year, or even multiple years. No purchase was made until everyone across the company had the opportunity to meet and weigh in on whether or not it was needed. Now, in an OpEx environment, an engineering team can spin up resources as desired to optimally run their services. We see that for many cloud customers, it’s often something of a Wild West—where engineering spins up resources without standardized guardrails such as budgets and alerts, appropriate resource labeling, and a frequent cadence for reviewing costs from an engineering and finance perspective.
While that empowers velocity, it’s not a good starting position from which to effectively design a cost-to-value equation for a service—essentially, the value generated by the service—much less optimize spending. We see customers struggling to identify the cost of development vs. production projects in their environments due to a lack of standardized labeling practices. In other cases, we see engineers over-provisioning instances to avoid performance issues, only to see considerable overhead during non-peak times. This leads to wasted resources in the long run. Creating company-wide standards for what types of resources are available and when to deploy them is paramount to optimizing your cloud costs.

We’ve seen this dynamic many times, and it’s unfortunate that one of the most desirable features of the cloud—elasticity—is sometimes perceived as an issue. When there is an unexpected spike in a bill, some customers might see the increase in cost as worrisome. Unless you attribute the cost to business metrics such as transactions processed or number of users served, you are missing the context to interpret your cloud bill. For many customers, it’s easier to see that costs are rising and attribute that increase to a specific business owner or group, but they don’t have enough context to give a specific recommendation to the project owner. The team could be spending more money because they are serving more customers—a good thing. Conversely, costs may be rising because someone forgot to shut down an unneeded high-CPU VM running over the weekend—and it’s pushing unnecessary traffic to Australia. One way to fix this problem is to organize and structure your costs in relation to your business needs. Then, you can drill down into the services using Cloud Billing reports to get an at-a-glance view of your costs.
You can also get more granular cost views of your environment by attributing costs back to departments or teams using labels, and by building your own custom dashboards. This approach allows you to label a resource based on a predefined business metric, then track its spend over time. Longer term, the goal isn’t to understand that you spent “$X on Compute Engine last month,” but that “it costs $X to serve customers who bring in $Y revenue.” This is the type of analysis you should strive to create.

Billing reports in the Google Cloud console let you explore granular cost details

One of the main features of the cloud is that it allows you to expedite feature velocity for faster time to market, and this elasticity is what lets you deploy workloads in a matter of minutes as opposed to waiting months in a traditional on-premises environment. You may not know how fast your business will actually grow, so establishing a cost visibility model up front is essential. And once you go beyond simple cost-per-service metrics, you can start to measure new business metrics like profitability as a performance metric per project.

Understanding value vs. cost

The goal of building a complex cloud system isn’t merely to cut costs. Take your fitness goals as an analogy. When attempting to become more fit, many people fixate on losing weight. But losing weight isn’t always a great key indicator in and of itself. You can lose weight as an outcome of being sick or dehydrated. When we aim for an indicator like weight loss, what we actually care about is our overall fitness, or how we look and feel when being active—the ability to play with your kids, live a long life, dance, that sort of thing. Similarly, in the world of cost optimization, it’s not about just cutting costs. It’s about identifying waste and ensuring you are maximizing the value of every dollar spent.
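To make the label-based cost attribution described above concrete, here is a minimal sketch in Python. The line items, costs, and team names are hypothetical, but the shape of the analysis mirrors what resource labels enable: every cost gets attributed to a label value, and anything unlabeled becomes visible as a bucket you can't explain.

```python
from collections import defaultdict

# Hypothetical billing line items, each carrying the resource labels
# that were applied when the resource was created.
line_items = [
    {"service": "Compute Engine", "cost": 1200.0, "labels": {"team": "search"}},
    {"service": "BigQuery",       "cost": 300.0,  "labels": {"team": "analytics"}},
    {"service": "Compute Engine", "cost": 450.0,  "labels": {"team": "analytics"}},
    {"service": "Cloud Storage",  "cost": 80.0,   "labels": {}},  # unlabeled!
]

def cost_by_label(items, key):
    """Attribute each line item's cost to the value of one label key."""
    totals = defaultdict(float)
    for item in items:
        totals[item["labels"].get(key, "unattributed")] += item["cost"]
    return dict(totals)

print(cost_by_label(line_items, "team"))
# {'search': 1200.0, 'analytics': 750.0, 'unattributed': 80.0}
```

The "unattributed" bucket is the point: the larger it grows, the less of your bill you can explain, which is exactly the argument for agreeing on standardized labels with engineering and finance up front.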
Similarly, our most sophisticated customers aren’t fixated on a specific cost-cutting number; they’re asking a variety of questions to get at their overall operational fitness:

- What are we actually providing for our customers (the unit)?
- How much does it cost me to provide that thing, and only that thing?
- How can I optimize all correlated spend per unit created?

In short, they have created their own unit economics model. They ask these questions up front, and then work to build a system that enables them to answer these key questions as well as audit their behavior. This is not something we typically see in a crawl-state customer, but many of those in the walk state are employing some of these concepts as they design their system for the future.

Implementing standardized processes from the get-go

Ensuring that you implement these recommendations consistently is something that must be designed and enforced systematically. Automation tools like Terraform and Cloud Deployment Manager can help create guardrails before you deploy a cloud resource. It is much more difficult to implement a standard retroactively. We have seen everything from IT Ops shutting off or threatening to shut off untagged resources to established “walls of shame” for people who didn’t adhere to standards. (We’re fans of positive reinforcement, such as a pizza, or a trophy, or even a pizza trophy.)

What’s an example of an optimization process that you might want to standardize early on? Deploying resources, for one. Should every engineer really be able to deploy any amount of any resource? Probably not. We see this as an area where creating a standard up front can make a big difference. Structuring your resources for effective cost management is important too. It’s best to adopt the simplest structure that satisfies your initial requirements, then adjust your resource hierarchy as your requirements evolve.
You can use the setup wizard to guide you through recommendations and steps to create your optimal environment. Within this resource hierarchy, you can use projects, folders, and labels to create logical groupings of resources that support your management and cost attribution requirements.

Example of a resource hierarchy for cloud

In your resource hierarchy, labeling resources is a top priority for organizations interested in managing costs. Labels are essentially your ability to attribute costs back to a specific business, service, unit, leader, and so on. Without labeled resources, it's incredibly difficult to decipher how much it costs you to do any specific thing. Rather than saying you spent $36,000 on Compute Engine, it's preferable to be able to say you spent $36,000 to deliver memes to 400,000 users last month. The second statement is much more insightful than the first. We highly recommend creating standardized labels together with the engineering and finance teams, and applying labels to as many resources as you can.

Review and repeat for best results

As a general practice, you should meet regularly with the appropriate teams to review usage trends and adjust forecasts as necessary. The Cloud Billing console makes it easy to review and audit your cloud spend on a regular basis, while custom dashboards provide more granular cost views. Without regular reviews, appropriate unit economics, and visibility into your spend, it's hard to move beyond reacting when you observe a spike in your bill.

If your spending is stable, you can review it less frequently, since opportunities to adjust your strategy will depend on things like new Google Cloud features rather than changes on your product roadmap. But if you're deploying many new applications and spending millions of dollars per month, a small investment in more frequent cost reviews can lead to big savings in a short amount of time.
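The label-based attribution described above reduces to a simple aggregation once costs are exported. A sketch over a simplified line-item shape (illustrative only, not the actual billing export format):

```python
# Toy label-based cost attribution: roll exported line items up by a label key,
# so spend maps to teams instead of services.
from collections import defaultdict

def cost_by_label(line_items, key):
    """Sum costs per value of the given label key; unlabeled items are bucketed."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item["labels"].get(key, "unlabeled")] += item["cost"]
    return dict(totals)

items = [
    {"cost": 120.0, "labels": {"team": "checkout"}},
    {"cost": 80.0,  "labels": {"team": "search"}},
    {"cost": 15.5,  "labels": {}},  # this is what skipping labels costs you in visibility
]
print(cost_by_label(items, "team"))
```

Note the "unlabeled" bucket: its size is a direct measure of how much spend you cannot attribute, which is a useful metric to drive down over time.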
In some cases, our more advanced customers meet and adjust forecasts as often as every day. When you're spending millions of dollars a month, even a small percentage shift in your overall bill can take money away from things like experimenting with new technologies or hiring additional engineers. Operating efficiently and maximizing the value of the cloud takes multiple teams with various backgrounds working together to design a system catered to your specific business needs. A good practice is to establish a review cadence based on how fast you are building and spending in the cloud. The Iron Triangle is a commonly used framework that weighs cost vs. speed vs. quality; you can work with your teams to agree on a framework that works for your business. From there, you can either tighten your belt or invest more.

The tools of the cost optimization trade

Once you have a firm grasp on how to approach cost optimization in the cloud, it's time to think about the various tools at your disposal. At a high level, cost management on Google Cloud relies on three broad kinds of tools:

Cost visibility: knowing what you spend in detail, how specific services are billed, and being able to show how (or why) you spent a specific amount to achieve a business outcome. Here, keep in mind key capabilities such as creating shared accountability, holding frequent cost reviews, analyzing trends, and visualizing the impact of your actions on a near-real-time basis. With a standardized strategy for organizing your resources, you can accurately map costs to your organization's operational structure to create a showback/chargeback model. You can also use cost controls like budget alerts and quotas to keep your costs in check over time.

Resource usage optimization: reducing waste in your environment by optimizing usage.
The goal is to implement a specific set of standards that strikes an appropriate balance between cost and performance within an environment. This is the lens to look through when reviewing whether there are idle resources, whether there are better services on which to deploy an app, or even whether a custom VM shape might be more appropriate. Most companies that successfully avoid waste optimize resource usage in a decentralized fashion, since individual application owners are usually best equipped to shut down or resize resources, thanks to their intimate familiarity with the workloads. In addition, you can use Recommender to help detect issues like under- or over-provisioned VM instances or idle resources. Enabling your team to surface these recommendations automatically is the aim of any great optimization effort.

Pricing efficiency: capabilities such as sustained use discounts, committed use discounts, flat-rate pricing, per-second billing, and other volume discounting features that let you optimize rates for a specific service. These capabilities are best leveraged by more centralized teams within your company, such as a Cloud Center of Excellence (CCoE) or FinOps team, which can lower the potential for waste while optimizing coverage across all business units. This is something to review both before a cloud migration and regularly once you go live.

Considering both people and processes will go a long way toward making sure your standards are useful and aligned with what your business needs. Similarly, understanding Google Cloud's cost visibility, resource usage optimization, and pricing efficiency features will give you the tools you need to optimize costs across all your technologies and teams.

How to prioritize recommendations

With lots of competing initiatives, it can be difficult to prioritize cost optimization recommendations and ensure your organization makes time to review these efforts consistently.
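As a simplified illustration of the kind of idle-resource signal Recommender surfaces, one could flag instances whose average CPU utilization stays persistently low. The threshold, instance names, and data shape here are all assumptions:

```python
# Illustrative idle-VM detection: flag instances whose mean CPU utilization
# over a sampling window falls below a chosen threshold.
def idle_instances(utilization, threshold=0.05):
    """utilization maps instance name -> list of CPU samples in [0.0, 1.0]."""
    return [name for name, samples in utilization.items()
            if samples and sum(samples) / len(samples) < threshold]

samples = {
    "batch-worker-1":    [0.62, 0.55, 0.70],
    "forgotten-test-vm": [0.01, 0.02, 0.01],
}
print(idle_instances(samples))  # ['forgotten-test-vm']
```

The real service weighs more signals than CPU alone, but this captures the decentralized workflow described above: surface the candidates automatically, and let the application owner decide whether to shut them down.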
Having visibility into the amount of engineering effort as well as the potential cost savings can help your team establish its priorities. Some customers focus solely on innovation and speed of migration for years on end, and over time their bad optimization habits compound, leading to substantial waste. Those funds could have gone toward developing new features, purchasing additional infrastructure, or hiring more engineers to improve feature development velocity. It's important to find a balance between cost and velocity, and to understand the ramifications of leaning too far in one direction.

To help you prioritize one cost optimization recommendation over another, it's a good idea to tag recommendations with an estimate of two characteristics:

Effort: the estimated level of work (in weeks) required to coordinate the resources and implement a cost optimization recommendation.
Savings: the estimated potential savings (as a percentage per service) that you may realize by implementing a cost optimization recommendation.

While it's not always possible to estimate with pinpoint accuracy how much a measure will save before testing it, it's important to make an educated guess for each effort. For instance, knowing that a certain change could potentially save you 60% on Cloud Storage for project X should be enough to inform the prioritization matrix and help establish engineering priorities with your team. Sometimes you can estimate actual savings: with purchasing options especially, a FinOps team can estimate the potential savings from features like committed use discounts for a specific portion of their infrastructure. The point of this exercise is to let the team make informed decisions about where engineering effort is going, so they can focus their energy accordingly.
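The effort-vs-savings tagging above can be turned into a simple ranking. The scoring rule (estimated savings divided by effort weeks) and the example recommendations are one reasonable sketch, not a prescribed formula:

```python
# Sketch of an effort-vs-savings prioritization matrix: rank recommendations
# by estimated savings percentage per week of engineering effort.
def prioritize(recommendations):
    """Sort recommendations by savings-per-effort-week, highest first."""
    return sorted(recommendations,
                  key=lambda r: r["savings_pct"] / r["effort_weeks"],
                  reverse=True)

recs = [
    {"name": "Cloud Storage lifecycle rules", "effort_weeks": 1, "savings_pct": 60},
    {"name": "Committed use discounts",       "effort_weeks": 2, "savings_pct": 30},
    {"name": "Rightsize dev VMs",             "effort_weeks": 4, "savings_pct": 10},
]
for r in prioritize(recs):
    print(f"{r['name']}: {r['savings_pct']}% for {r['effort_weeks']} week(s)")
```

Even a rough ranking like this gives the team a shared, auditable basis for deciding which optimization to tackle first.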
From principles to practice

Optimizing cloud costs isn't a checklist, it's a mindset; you'll have the best results if you think strategically and establish strong processes to help you stay on track. But there are also lots of service-specific steps you can take to get your bill under control. For more tactical advice, check out these posts on how to save on your Google Cloud compute, storage, networking, data analytics, and serverless applications. Or, for a handy reference, download our "Understanding the principles of cost optimization" ebook, which gathers several of these topics in one place.
Source: Google Cloud Platform