Casa dos Ventos advances sustainability mission with SAP S/4HANA on Google Cloud

Sustainability is a key concern for global organizations, and Casa dos Ventos, one of the largest suppliers of wind energy in Brazil, makes sustainability part of its mission: promoting the “environmentally responsible development” of Brazil.

Wind is the second-largest source of power in Brazil’s energy matrix, according to the Brazilian Wind Power Association (ABEEolica). Since its founding in 2007, Casa dos Ventos has been on a path of steady growth—the company now represents about 30% of all wind farms in operation or under construction in Brazil. Combined, these projects will generate approximately 10% of all energy produced in the country.

But growth on this scale generates more than energy: it produces vast amounts of data that need to be processed and analyzed consistently to study wind behavior, control turbines, and forecast power production and climate, to name just a few examples. These tasks are mission-critical to the efficient operation of Casa dos Ventos’ wind farms.

With the company’s continued expansion, it became clear that its on-premises infrastructure no longer had the capacity to process, orchestrate, and analyze such massive amounts of data. Instead of helping to solve business issues, the IT team was spending much of its time maintaining servers and managing databases and systems. According to Roberto Oikawa, CIO of Casa dos Ventos, it was taking 15 days to calculate the amount of energy generated by just one wind farm using in-house servers.

An urgent need to streamline data

To keep pace with its growing number of projects, the company needed a solution that would centralize its workflows while providing scalability and flexibility. “We needed a platform capable of loading tons of data, with streamlined support for resources used to process this data,” Oikawa says, “and allowing us to create prediction models and run machine learning processes quickly and reliably.” After 12 years in business, the company realized it was clearly time to move its data operations to the cloud.

Casa dos Ventos had already been using Google Workspace for collaboration and communication, and had adopted SAP as a business-management solution. These were decisive factors in selecting Google Cloud to host its SAP S/4HANA environment.

“We opted for Google Cloud because we were seeking a strategic, long-term partnership,” says Oikawa. “This isn’t just service delivery. Google Cloud brings its solid knowledge about SAP, which gave us peace of mind for decision-making.” The team ultimately decided on a hybrid environment with a cloud-first strategy that includes using physical servers for simpler, less time-sensitive tasks.

More time to focus on business growth

Over a 12-month period, the company migrated 90% of its processing workloads and fully implemented SAP S/4HANA on Google Cloud. When the migration was complete, data-processing operations became more agile, allowing the company’s IT team to dramatically reduce the time needed to process and analyze data and to respond to customers and regulatory authorities.
The time needed to predict the amount of energy generated by a specific project went from 15 days to just one. Thanks to the new cloud infrastructure and scalable services, the company was also able to process 20 years of data in less than two hours during its weekly data processes. In addition, the company has been able to optimize wind-farm operations using Google Cloud solutions, including AI Platform, and BigQuery for assessing costs for each project. These changes mean that Casa dos Ventos’ IT staff can spend more time working with engineering teams on high-value business issues and less time dealing with infrastructure problems.

Looking ahead, Casa dos Ventos is on track to generate 1.5 GW of wind power over the next year or two, an achievement supported by the company’s transformation to cloud technology. “When looking at the business side, Google Cloud improved our results in terms of timing, reliability, and innovation,” Oikawa says. “Our new Energy Trading unit was conceived as a digital unit from its inception.” Lucas Araripe, Casa dos Ventos’ head of projects and new business, says that innovation, in fact, is the main purpose of his company’s work: “We always want to be ahead of others in terms of knowledge, innovation, and technology. Our partnership with Google Cloud is part of that.”

Learn more about Casa dos Ventos’ SAP deployment on Google Cloud as well as other SAP customer deployments on our YouTube channel.
Source: Google Cloud Platform

Three ways of receiving events in Cloud Run

Cloud Run and Eventarc are a great combination for building event-driven services with different event routing options. There are two trigger types (Audit Logs and Pub/Sub) to choose from in Eventarc. Eventarc uses Pub/Sub as its underlying transport layer and provides convenience and standardization on top of it. If you wanted to, you could skip Eventarc and read messages directly from Pub/Sub in Cloud Run. This blog post details three ways of receiving events in Cloud Run and provides a decision framework for choosing between them.

Three ways of receiving events

The three main ways of receiving events in Cloud Run are:

1. Audit Logs via Eventarc
2. Pub/Sub via Eventarc
3. Pub/Sub direct

Option 1 is for the 90+ Google Cloud services that support Audit Logs. Eventarc reads those Audit Logs, converts them to CloudEvents, and sends them to Cloud Run services. See this guide for more on how to receive Cloud Storage events via Audit Logs using Eventarc.

Option 2 is for custom applications or Google Cloud services that have a Pub/Sub integration. Eventarc reads messages from the Pub/Sub topic, converts them to CloudEvents, and sends them to Cloud Run services, as shown in this guide.

Option 3 is also for custom applications or Google Cloud services that have a Pub/Sub integration. Instead of using Eventarc, you can have Pub/Sub directly HTTP push the Pub/Sub message to the Cloud Run service, as shown in this guide.

Decision framework

Given the three event routing options, how do you decide which one is best for the service you’re interested in reading events from? Here’s a decision framework to help you decide. The reasoning behind it boils down to answering these questions:

- Does the Google Cloud service or application support Audit Logs, Pub/Sub, or both?
- When both Audit Logs and Pub/Sub are supported, does the Audit Log entry have more or different info than the Pub/Sub message, or vice versa?
- In the Pub/Sub case, do you care about standardizing on Eventarc and the CloudEvents format, and the convenience of not having to create Pub/Sub topics and subscriptions?

Let’s explore these questions with some concrete scenarios. In the custom application scenario, there are no Audit Logs because it’s not a Google Cloud service, so you only need to consider Pub/Sub options. The question then becomes whether you want to get Pub/Sub messages directly or via Eventarc. Pub/Sub is probably the more familiar route, whereas Eventarc provides a standard way of receiving events from multiple sources and a standard CloudEvent format. This is useful if you intend to read from multiple event sources and not just from Pub/Sub.

If you want to read events from a Google Cloud service that supports only Audit Logs, then your only option is Eventarc. It becomes more interesting when a Google Cloud service supports both Audit Logs and Pub/Sub (e.g., Cloud Build, Cloud Storage). In these cases, the type and content of the Audit Logs and Pub/Sub messages should drive your choice. For example, Cloud Storage can generate Audit Logs, and it can also send a message to a Pub/Sub topic when a new object is created. The contents (bucket name, object name) and latency of both are roughly the same, so you can choose either.

Cloud Build can also generate both Audit Logs and Pub/Sub messages, but the Pub/Sub-based build notifications include build status (success, failure), whereas Audit Logs of Cloud Build mainly provide info about admin operations such as build creation and deletion. If build status is important, it makes sense to choose Pub/Sub over Audit Logs.
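To make option 3 concrete, here’s a minimal sketch of a Cloud Run service that receives Pub/Sub push messages directly. It assumes a Flask app and the standard Pub/Sub push envelope; the endpoint path and topic wiring are up to you:

import base64

from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=["POST"])
def receive_message():
    # Pub/Sub push deliveries wrap the message in a JSON envelope.
    envelope = request.get_json()
    if not envelope or "message" not in envelope:
        return "Bad Request: invalid Pub/Sub envelope", 400

    # The payload itself is base64-encoded in the "data" field.
    message = envelope["message"]
    data = base64.b64decode(message.get("data", "")).decode("utf-8")
    print(f"Received Pub/Sub message: {data}")

    # Returning a 2xx acknowledges the message so Pub/Sub won't redeliver it.
    return ("", 204)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

With Eventarc in front (options 1 and 2), the service shape is similar, except the request body arrives as a CloudEvent rather than a raw Pub/Sub envelope.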
In both the Cloud Storage and Cloud Build cases, if you go with Pub/Sub, you still need to decide whether you’ll read the messages with or without Eventarc. As already explained in the custom application scenario, this is a choice between a familiar setup with Pub/Sub vs. the convenience and standardization that come with Eventarc and CloudEvents.

I should note that Eventarc aims to improve upon its event sources and the contents of events in future iterations. You should expect to see richer events beyond what Audit Logs provide today.

As always, feel free to reach out to me on Twitter @meteatamel with any questions or feedback.

Related article: Demystifying event filters in Eventarc (learn how to create triggers with the right filters in Eventarc).
Source: Google Cloud Platform

HSBC deploys Dialogflow, easing call burden on policy experts

Banks are among the most highly regulated businesses on the planet, subject to regulations based on geography as well as their own internal policies and procedures. HSBC, a global giant in commercial and personal banking, has experts all over the world who support thousands of their risk management colleagues each time they need an internal policy question answered.

The sheer volume of HSBC’s business means that making everyday risk management decisions can result in tens of thousands of calls to internal policy experts every year. Because questions can come from all over the world, there can naturally be delays due to time zone differences and individual workloads. And depending on the day, people’s responses can vary.

Steve Suarez, HSBC’s Global Head of Innovation, Finance & Risk, and Gareth Butler, Head of Risk Transformation and Innovation Lead for Asia Pacific, thought the bank could use artificial intelligence (AI) to take a fresh approach to operational risk and resilience. The goals would be to answer questions faster and to improve the overall consistency and quality of policy responses.

Building with Contact Center AI to answer questions quickly and consistently

HSBC wanted to use AI and machine learning (ML) to reduce the time employees were spending on manually intensive queries, improve the consistency of policy responses, and understand what kinds of questions were being asked. The bank envisioned building a solution that would support its large-scale, global environment. After evaluating the main AI solutions on the market, HSBC selected Google Cloud for the project, leveraging their existing strategic relationship. The bank worked with the customer engineering team at Google Cloud, who helped it connect with partner KPMG’s Innovation Division. Together they formed a project team to architect and deliver the Operational Resilience and Risk Application (ORRA), the first of what HSBC hopes will be many FAQ and document search-enabled chatbots.

ORRA performs dynamic document search and powers natural conversations with Google Cloud Dialogflow, a core component of Google Cloud Contact Center AI. Easily accessible to all employees from the HSBC intranet, ORRA answers queries on internal policy and framework areas applicable across the bank. And with Dialogflow, HSBC was able to build a conversational platform that quickly and accurately addressed user needs at scale.

HSBC chose Dialogflow as a cost-efficient, feature-rich solution for large-scale conversational FAQ flows. The team created the initial knowledge base consisting of intents, utterances, and answers, and used the bulk upload function to load it into the Dialogflow console. The wide range of native features was used to finesse intents, train responses, and create synonyms and entities. They added small talk to humanize the bot responses and to create a “real-world personality.” Within the Google Cloud chatbot architecture, the team implemented an in-house document search capability that returns search responses through the friendly user interface (UI). Dialogflow coupled with this search capability enabled a groundbreaking solution for HSBC employees. Users now get direct answers to questions, with the flexibility of an app that uses Natural Language Processing (NLP) to interrogate large documents for supplementary answers in milliseconds, all through the same UI.
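For a sense of how a client application talks to a Dialogflow agent like ORRA’s, here’s a minimal sketch using the Dialogflow Python client. The project and session IDs are placeholders, and this is illustrative rather than HSBC’s actual integration:

from google.cloud import dialogflow

def ask_policy_bot(project_id: str, session_id: str, question: str) -> str:
    """Sends a user question to a Dialogflow agent and returns its answer."""
    session_client = dialogflow.SessionsClient()
    session = session_client.session_path(project_id, session_id)

    # Wrap the user's question as a text query.
    text_input = dialogflow.TextInput(text=question, language_code="en")
    query_input = dialogflow.QueryInput(text=text_input)

    # Dialogflow matches the question to an intent and returns a response.
    response = session_client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    return response.query_result.fulfillment_text

# Placeholder project and session IDs:
print(ask_policy_bot("my-project", "user-123", "What is the vendor approval process?"))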
Dialogflow’s native machine learning and NLP technologies improve the user experience and reduce the setup required to develop complex conversation architectures.

Using machine learning to inform decision making

“The knowledge gained from analyzing the type, frequency, and source of queries is in itself valuable business intelligence on internal demands for information,” said Suarez. “ORRA learns from every conversation, and at the most basic level, the more questions employees have about a specific policy, the more likely that policy may be due for simplification or revision.”

“In addition to providing rapid access to information, ORRA also brings important benefits for learning and development and the embedding of policies and procedures,” said Butler. “As query flow increases, the architecture uses machine learning and user feedback to determine the best response to give.”

“It’s about the speed of getting your information,” said Suarez. “Before the FAQ chatbot, somebody would ask you a question and if you didn’t know the answer or you gave an inconsistent answer, you’d have to do a bit of research and then come back to them. People go out and use Google to answer questions every day and receive instant, precise responses. Similarly, we’re now getting information to users in a way that feels familiar to them without having to read through an entire policy document.” He said this gives risk managers access to immediate, accurate policy information and frees up time for subject matter experts to focus on adding value in less routine areas of their jobs.

Creating conversation architecture that scales across the organization

Future versions of ORRA will include guidance on judgment calls. “As we move forward, we’ll be adding in more around risk acceptance, risk issues, and risk relevance,” said Chris Wilson, Head of Architecture, Policy & Regulatory Mapping at HSBC. The next evolution of HSBC’s FAQ chatbot will involve scaling the architecture to capture other policies and procedural data. “This is only the beginning of our conversational AI journey,” adds Butler. The solution can accommodate more policies and documents; it can also be enhanced to support multiple languages, mobile interactions, and speech, and can search numeric as well as text-based data stores. “This opens up more possibilities in this space as more information is hosted on the cloud,” Butler said.

The bank is focused on how the AI/ML solution can be extended to other areas of the organization. Having invested the resources to develop and implement ORRA, they believe future expansion will be simple and cost-effective. Given the pervasiveness of chatbots across all areas of our lives, Suarez sees this as an imperative for the bank. “In the future, we are going to be interacting with chatbots frequently for routine transactions,” he said. “As a leader in the financial services industry and a technology innovator, HSBC is taking the first steps in using cloud-based chatbot technology to get fast, accurate answers to our customers.”
Source: Google Cloud Platform

Open innovations, scaling data science, and amazing data analytics stories from industry leaders

February might be the shortest month of the year, but it was certainly one of our busiest for data analytics at Google! From our partnership announcement with Databricks to the launch of Dataproc Hub and BigQuery BI Engine, and the incredible journeys of Twitter, Verizon Media, and J.B. Hunt—this month was full of great activities for our customers, partners, and the community at large.

Our commitment to an open approach for data analytics

Much has been written about our launches over the past month, and while it would be too much to list all the great reviews and articles, I thought I’d direct you to SiliconANGLE reporter Maria Deutscher’s story from last week on our commitment to an open data analytics approach. Her piece, covering last week’s BI Engine and Materialized Views launches, does a great job highlighting how data analytics, and BigQuery in particular, play a key role in our overall strategy. The average organization has tens (sometimes hundreds) of BI tools. These tools might be ours, our partners’, or custom applications customers have built using packaged and open-source software. We’re delighted by the amazing support this effort has gathered from our partners: from Microsoft to Tableau, Qlik, ThoughtSpot, Superset, and many more.

Getting started with BI Engine Preview

We are committed to creating the best analytics experience for all users by meeting them in the tools they already know and love. That’s why BI Engine works seamlessly with BI tools without requiring any additional changes from end users. We can’t wait to tell you how customers are adopting this new offering. Join our webinar “Delivering fast and fresh data experiences” by registering here.

Running data science securely at scale

Running data science at scale has been a challenge for many organizations. Data scientists want the freedom to use the tools they need, while IT leaders need to set frameworks to govern that work. Dataproc Hub is the solution that provides freedom within a governed framework. This new functionality lets data scientists easily scale their work with templated and reusable configurations and ready-to-use big data frameworks. At the same time, it provides administrators with integrated security controls and the ability to set autoscaling policies, auto-deletions, and timeouts to ensure that permissions are always in sync and that the right data is available to the right people.

Dataproc Hub is both integrated and open. AI Platform Notebooks customers who want to use BigQuery or Cloud Storage data for model training, feature engineering, and preprocessing benefit greatly from this new functionality. With Dataproc Hub, data scientists can leverage APIs like PySpark and Dask without much setup and configuration work, as well as accelerate their Spark XGBoost pipelines with NVIDIA GPUs to process their data 44x faster at a 14x reduction in cost vs. CPUs. You’ll find more information about our Dataproc Hub launch here, and if you’d like to dive into model training with RAPIDS, Dask, and NVIDIA GPUs on AI Platform, this blog is a great place to start.

As Scott McClellan, Sr. Director, Data Science Product Group at NVIDIA, wrote this past week, it’s time to make “data science at scale more accessible.” We’re proud to count NVIDIA as a partner in this journey!

Dataproc in a minute

As I wrote in my post last month, our goal is to democratize access to data science and machine learning for everyone. You don’t have to be a data scientist to take advantage of Google’s data analytics machine learning capabilities.
Any Google Workspace user can use machine learning right from Connected Sheets. To get started, check out this blog: How to use a machine learning model from a Google Sheet using BigQuery ML. That’s right, you can tap into the power of machine learning right from Google Sheets, our spreadsheet application which today counts over 2 billion users. So, don’t be shy, start using data at scale and make an impact!

Building the future together is better

This past month, we were particularly inspired by the guest post from Nikhil Mishra, Sr. Director of Engineering at Verizon Media, about Verizon Media’s migration journey to the cloud. Mishra dives deep into the process that led to their final decision, from identifying problems to solution requirements to the entire proof of concept used to select BigQuery and Google’s Looker. This is a must-read for those looking for practical guidance to modernize and optimize for scale, performance, and cost.

Employing the right cloud strategy is critical to our customers’ transformation journeys, and if you’re looking for straightforward guidance, another great customer example to follow is Twitter. In his interview with VentureBeat, Twitter platform leader Nick Torno explains how the company leverages Google BigQuery, Dataflow, and machine learning to improve the experience of people using Twitter. The piece concludes with guidance for breaking down silos and future-proofing your data analytics environment while delivering value quickly through business use cases.

We were also delighted to support J.B. Hunt, one of the largest transportation logistics companies in North America, in their goal to develop new services to digitally transform the shipping and logistics experience for shippers, carriers, and service providers. Real-time data is a cornerstone of the $1 trillion logistics industry, and today’s carriers rely on a patchwork of IT systems across supply chain, capacity utilization, pricing, and transportation execution. J.B. Hunt’s 360 platform aims to centralize data from across these different systems, helping to reduce waste, friction, and inefficiencies.

You might also find inspiration in hearing about how Google Cloud is helping Ford transform their automotive technologies and enabling BNY Mellon to better predict billions of dollars in daily settlement failures. We also recently agreed to extend our partnership with the U.S. National Oceanic and Atmospheric Administration (NOAA), empowering them to continue sharing their data more broadly than ever—with some pretty cool results.

Feature highlights you might have missed

At Google Cloud, the aim is always to continuously improve and introduce new features and functionality that make a difference for our customers. Last month, we announced the public preview launch of the replication application in Data Fusion to enable low-latency, real-time data replication from transactional and operational databases such as SQL Server and MySQL directly into BigQuery. Data Fusion’s simple, wizard-driven interface lets citizen developers set up replication easily. It comes with an assessment tool that not only identifies schema incompatibilities, connectivity issues, and missing features prior to starting replication, but also provides corrective actions.
Replication in Data Fusion means that you’ll benefit from end-to-end visibility: real-time operational dashboards to monitor throughput, latency, and errors in replication jobs; zero-downtime snapshot replication into BigQuery; and support for CDC streams, so users have access to the latest data in BigQuery for analysis and action. Cloud Data Fusion’s integration with the Google Cloud platform ensures that the highest levels of enterprise security and privacy are observed while making the latest data available in your data warehouse for analytics. This launch includes support for Customer-Managed Encryption Keys (CMEK) and VPC Service Controls (VPC-SC). If you’re new to Data Fusion, I suggest you check out Chapter 1 of our blog series on data lake solution architecture with Data Fusion and Cloud Composer.

Speaking of fast-moving and ever-changing data, you might want to check out the latest best practices for continuous model evaluation with BigQuery ML by Developer Advocates Polong Lin and Sara Robinson. Their post takes us through a full model’s life cycle: from creating it with BigQuery ML and evaluating data with ML.EVALUATE, to creating a stored procedure that assesses incoming data and using it to insert evaluation metrics into a table. This blog shows the power of an integrated platform built with BigQuery and Cloud Scheduler, and what you can achieve—from using Cloud Functions to visualizing model metrics in Data Studio. It has fantastic guidance that I hope you’ll enjoy!
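If you want to try that evaluation loop yourself, here’s a small sketch using the BigQuery Python client; the project, dataset, model, and table names below are placeholders rather than anything from the post, and the target table’s schema is assumed to match the evaluation output:

from google.cloud import bigquery

client = bigquery.Client()

# Evaluate a trained BigQuery ML model against fresh labeled data and
# append the resulting metrics to a monitoring table. Names are illustrative.
query = """
INSERT INTO `my_project.monitoring.model_metrics`
SELECT CURRENT_TIMESTAMP() AS eval_time, *
FROM ML.EVALUATE(
  MODEL `my_project.my_dataset.my_model`,
  (SELECT * FROM `my_project.my_dataset.latest_labeled_rows`)
)
"""

job = client.query(query)  # Runs the SQL job in BigQuery.
job.result()               # Waits for the job to finish.
print("Evaluation metrics appended.")

Scheduling a query like this with Cloud Scheduler is what turns one-off evaluation into the continuous monitoring described in the post.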
Finally, we also covered data traceability this past month with a post on how to architect a data lineage system using BigQuery, Data Catalog, Pub/Sub, and Dataflow. Data lineage is critical for performing data forensics, identifying data dependencies, and above all, securing business data. Data Catalog provides a powerful interface that allows you to sync and tag business metadata to data across Google Cloud services as well as your own on-premises data centers and databases. Read this great article for insights on our recommended architecture for the most common user journeys, and start here to build a data lineage system using BigQuery Streaming, Pub/Sub, ZetaSQL, Dataflow, and Cloud Storage.

See how BlackRock uses Data Catalog: data discovery and metadata management in action!

That’s it for February! I can’t wait to hear back from you about what you think, and I’m looking forward to sharing everything we’ve got coming up in March.

Source: Google Cloud Platform

Transform data to secure it: Use Cloud DLP

When you want to protect data in motion, at rest, or in use, you usually think about data discovery and data loss detection and prevention. Few would immediately consider transforming or modifying data in order to protect it. But doing so can be a powerful and relatively easy tactic to prevent data loss. Our data security vision includes transforming data to secure it, and that’s why our DLP product includes powerful data transformation capabilities.

So what are some data modification techniques that you can use to protect your data, and what are the use cases for them?

Delete sensitive elements

Let’s start with a simple example: one of the best ways to protect payment card data and comply with PCI DSS is to simply delete it. Deleting sensitive data as soon as it’s collected (or better yet, never collecting it in the first place) saves resources on encryption and data access control, and removes, not merely reduces, the risk of data exposure or theft.

More generally, deleting data is one way to practice data minimization. Having less data that attracts attackers is both a security best practice (one of the few that is as true in the 2020s as it was in the 1980s) and a compliance requirement (for example, it serves as one of the core principles of GDPR).

Naturally, there are plenty of types of sensitive data that you can’t simply delete, and for which this strategy will not work, like business secrets or patient information at a hospital. But for many cases, transforming data to protect it satisfies the triad of security, compliance, and privacy use cases.

In many cases, data retains its full value even when sensitive or regulated elements are removed. Customer support chat logs work just as well after an accidentally shared payment card number is removed. A doctor can make a diagnosis without seeing a Social Security Number (SSN) or Medical Record Number (MRN). Transaction trend analysis works just as well when bank account numbers are not included. For many contexts, the sensitive, personal, or regulated parts don’t matter at all. Another area where this works well is when a communication’s purpose is satisfied even with data removed. For example, a support rep can help a customer use an app without knowing that customer’s first and last name.

As another example, our DLP system can clean up the datasets used to train an AI, so that AI systems can learn without being exposed to any personal or sensitive data. Even first and last names can be automatically removed from a stream of data before it’s used to train an AI. Does your DLP do that?

In practice, this tactic can be applied to both structured (databases) and unstructured (email, chats, image captures, voice recordings) data. Removing “toxic” elements that are a target for attackers or subject to regulations reduces the risk and preserves the business value of a dataset.

Transforming data as part of DLP goes beyond just deleting it. Various forms of data masking (both static and dynamic) are key to this approach. DLP can mean simply removing sensitive data from view, like obscuring what is shown to a call center employee. Notably, Cloud DLP works on stored or streamed data including unstructured text, structured tabular data, and even images. Paired with services like Speech-to-Text, Cloud DLP can even be used to redact audio data or transcripts.

Ultimately, the goal of any DLP strategy is to reduce the risk of sensitive data falling into the wrong hands. This is subtly different from and broader than merely securing the data. If we can reduce the risk of holding the data, we in turn reduce the risk of losing it.
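As a concrete illustration of this kind of transformation, here’s a minimal sketch using the Cloud DLP Python client to replace detected sensitive values in free text with their infoType names; the project ID is a placeholder:

import google.cloud.dlp

dlp = google.cloud.dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # Placeholder project ID.

# Detect email addresses and replace each one with the literal
# infoType name ("EMAIL_ADDRESS") instead of the raw value.
inspect_config = {"info_types": [{"name": "EMAIL_ADDRESS"}]}
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {"primitive_transformation": {"replace_with_info_type_config": {}}}
        ]
    }
}

item = {"value": "Please reply to jane.doe@example.com with the report."}

response = dlp.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": inspect_config,
        "item": item,
    }
)
print(response.item.value)  # "Please reply to [EMAIL_ADDRESS] with the report."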
Replace sensitive elements with safe equivalents

Sometimes we can’t remove even small parts of sensitive data, but we can replace them with safer elements through tokenization. This is also a feature of Google Cloud DLP.

One of the advantages of tokenization is that it can be reversible. Tokenization both reduces risk and helps ensure compliance with PCI DSS or other regulations, depending on the data being replaced. We can tokenize sensitive elements of data in storage or during display in order to reduce its risk. An insurance company may collect and use customer driver’s license numbers for record validation, and replace those numbers with a token when they’re displayed elsewhere.

Another situation in which tokenization is particularly helpful is when two datasets need to be joined for analysis, and the best place to join them is a sensitive piece of data like an SSN. For instance, when a patient records database needs to be joined to a lab results database, or loan applications to financial records, we can tokenize the sensitive columns of both datasets using the same algorithm and parameters, and they can be joined without exposing any sensitive data.

Take fraud analysis as another example. Our case study shows that DLP can be used to remove international mobile subscriber identity (IMSI) numbers from the data stored in BigQuery. The data can be restored later, such as when fraud is confirmed and the investigation is ongoing. Note the staggering volumes of data being processed.

Now, some readers may point out that tokenization and DLP are traditionally considered different technologies. Cloud DLP is a broader system that covers both of these functions, as well as several others, in one scalable, cloud-native solution. This allows us to solve for the greater goal of reducing risk while retaining the business value of a dataset.
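Because joins like the ones described above need the same input to always produce the same token, Cloud DLP supports deterministic cryptographic transformations. Here’s a sketch under stated assumptions: the 32-byte key is held in code purely for illustration (in production you’d use a Cloud KMS-wrapped key), and the project ID and surrogate name are placeholders:

import google.cloud.dlp

dlp = google.cloud.dlp_v2.DlpServiceClient()
parent = "projects/my-project"  # Placeholder project ID.

# 32-byte demo key so tokens are repeatable across runs; never hardcode
# real keys -- use a KMS-wrapped key in production.
demo_key = b"0123456789abcdef0123456789abcdef"

# Tokenize US Social Security Numbers deterministically, so the same SSN
# always yields the same surrogate token (which is what makes joins work).
inspect_config = {"info_types": [{"name": "US_SOCIAL_SECURITY_NUMBER"}]}
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {
                "primitive_transformation": {
                    "crypto_deterministic_config": {
                        "crypto_key": {"unwrapped": {"key": demo_key}},
                        "surrogate_info_type": {"name": "SSN_TOKEN"},
                    }
                }
            }
        ]
    }
}

item = {"value": "Patient SSN: 123-45-6789"}
response = dlp.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": inspect_config,
        "item": item,
    }
)
print(response.item.value)  # e.g. "Patient SSN: SSN_TOKEN(44):AbCd..."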
Transform personal data

The risk of losing data is not only that criminals may steal it and use it to defraud your company. There’s also the risk of privacy violations, off-policy use, and other situations that come from the exposure of personal data. The loss of personal data is a twofold risk: both security and privacy.

This makes transformation of data for DLP a worthwhile tactic for both privacy and security purposes. For example, an organization may be sharing data with a partner to run a trend analysis of their mutual customers. Generalizing demographics such as age, zip code, and job title can help reduce the risk of these partial identifiers linking to a specific individual. This is useful both for citizen data collected by government agencies and for healthcare research done by universities, for example.

Similarly, a user may share transactional data that includes dates someone could use to triangulate their location, like travel dates, purchase dates, or calendar information. Cloud DLP can prevent this misuse with a date-shifting technique that shifts dates per “customer,” so that behavioral analysis can still be done but the actual dates are blurred. Again, this is not a feature of any traditional DLP system.

Note that many of these methods are not reversible, and irrevocably change or destroy elements of a dataset. Yet they preserve the value of that dataset for specific business use cases, while reducing the risk inherent in the data. This makes using DLP a worthy consideration for teams looking to reduce both security and privacy risk, while retaining the ability to derive value from a dataset, without having to spend compute resources on encryption and granular access control. The constant balancing act between risk and utility becomes significantly easier when employing this approach.

Google Cloud DLP can help you employ all of these strategies. Read more about the future of DLP in “Not just compliance: reimagining DLP for today’s cloud-centric world.” If you are a Google Cloud customer, go here to get started with DLP.

Related article: New whitepaper: Designing and deploying a data security strategy with Google Cloud (our new whitepaper helps you start a data security program in a cloud-native way and adjust your existing data security program).
Source: Google Cloud Platform

Build your future with GKE

American poet Maya Angelou said, “If you don’t know where you’ve come from, you don’t know where you’re going.” We agree. Today, as we kick off the Build with Google Kubernetes Engine event, and fresh off our GKE Autopilot launch, we wanted to take a step back and reflect on just how far GKE has come. In just six short years, GKE has become one of the most widely adopted services for running modern cloud-native applications, used by startups and Fortune 500 companies alike. This enthusiasm inspires us to push the limits of what’s possible with Kubernetes, making it easier for you to focus on creating great services for your users, while we take care of your Kubernetes clusters. So let’s take a look at where we’ve been with Kubernetes and where we are today—so we can build the future together.

Sustained innovation

A lot has changed in the container orchestration space since we created Kubernetes and opened it up to the world more than six years ago. It’s a little hard to remember, but back when we first designed Kubernetes, there was no industry standard for managing fleets of containerized applications at scale. Because we had developed so many technical innovations for containers already (e.g., a container-optimized OS), it was only natural for us to propose a new approach for managing containers—one based on our experience at the time launching billions of containers every week for our internal needs.

In 2015, we co-founded the Cloud Native Computing Foundation (CNCF) as a vendor-neutral home for the Kubernetes project. Since then, a diverse, global community of developers has contributed to—and benefitted from—the project. Last year alone, developers from 500+ companies contributed to Kubernetes, and all the major cloud providers have followed in our footsteps in offering a managed Kubernetes service. This broad industry support for the technology we developed helps us deliver on our vision: giving customers the choice to run their workloads where and when they want, without being stuck on a legacy cloud provider with proprietary APIs.

Community leadership

Since its inception as an internal Google project, we’ve only continued to invest in Kubernetes. Under the auspices of the CNCF, we’ve made over 680,000 additional contributions to the project, including over 123,000 contributions in 2020. That’s more than all the other cloud providers combined. When you truly want to take advantage of Kubernetes, there’s no match for Google’s expertise—or GKE.

We also actively support the CNCF with credits to host Kubernetes on Google Cloud, enabling 100 million container downloads every day and over 400,000 integration tests per month, totaling over 300,000 core hours on GKE and Google Cloud. (Yes, you read that right: the Kubernetes project itself is built and served from GKE and Google Cloud.)

Customer outcomes

As the creators of Kubernetes, and with all this continued investment, it’s not surprising that we have a great managed Kubernetes service; in fact, I think we can credibly claim it’s the best one on the market. Enterprises flock to GKE to solve for speed, scale, security, and availability. Among the Fortune 500, five of the top 10 telecommunications, media, and gaming companies, six of the top 10 healthcare and life sciences companies, and seven of the top 10 retail and consumer packaged goods companies all use GKE.
Leading technology companies are also embracing GKE; for example, Databricks is enabling customers to leverage a Google Kubernetes Engine-based Databricks service on Google Cloud.

When it comes to scale, GKE is second to none. After all, Google itself operates numerous globally available services like YouTube, Gmail, and Drive, so we know a thing or two about deploying workloads at scale. We bring this expertise to Kubernetes in a way that only Google can. For example, Bayer Crop Science used GKE to seamlessly scale their research workloads over 200x with 15,000-node clusters.

GKE offers native security capabilities such as network policy logging, a hardened sandbox environment, vulnerability scanning, shielded nodes (which use a cryptographically verifiable check), and confidential nodes—all designed to simplify implementing a defense-in-depth approach to security, so you can operate safely at scale. Customers like Shopify trust GKE to help them handle terrific scale with no interruptions. Over the most recent Black Friday/Cyber Monday period, Shopify processed over $5B in transactions!

GKE also offers a series of industry-first capabilities such as release channels, multi-cluster support, and four-way autoscaling, along with node auto-repair to help improve availability. And that’s just its feature set—GKE also helps optimize costs with efficient bin packing and autoscaling. Customers like OpenX are saving up to 45% using GKE.

GKE Autopilot momentum

This leads us to GKE Autopilot, a new mode of operation for GKE that helps reduce the operational cost of managing clusters, optimize your clusters for production, and yield higher workload availability. Since its launch last month, customers like Strabag and Via Transportation report seeing dramatic improvements in the performance, security, and resilience of their Kubernetes environments, all while spending less time managing their clusters.

In short, we’ve worked hard to deliver the most configurable, secure, scalable, and automated Kubernetes service on the market today. And we’re just getting started. With five-plus years of ongoing investment in Kubernetes, you can be confident that GKE will be there to support your success and growth—today and into the future.

Interested in showing just how much you love GKE? Join the Google Cloud {Code_Love_Hack} hackathon to show us how you use GKE, containers, and Cloud Code to spread the love of coding! Registration is open and we’re excited for all the great projects you’ll make using GKE!

Related article: Looking ahead as GKE, the original managed Kubernetes, turns 5 (five ways we’re continuing our work to make GKE the best place to run Kubernetes).
Source: Google Cloud Platform

Introducing #AskGoogleCloud: A community driven YouTube live series

We’re excited to introduce a new series on the Google Cloud Tech YouTube channel connecting the cloud community directly with our Google Cloud product experts. We’ll select questions that use #AskGoogleCloud on Twitter and YouTube and answer them in a featured premiere each quarter, hosted by experts in that area. Each premiere will be paired with a live chat so you can ask your questions live and get them answered by the featured speakers.

Our first segment will air on March 12, 2021 on the topic of serverless architectures, featuring Developer Advocates Stephanie Wong, Martin Omander, and James Ward. They’ll be addressing questions on the best workloads for serverless, the differences between “serverless” and “cloud native,” how to accurately estimate costs for using Cloud Run, and much more.

Serverless content on demand

Serverless Expeditions
We have tons of serverless content on demand on the Google Cloud Tech YouTube channel. Be sure to check out Serverless Expeditions, a series that covers all things serverless, from using Python on Google Cloud with Cloud Run, to Cloud Functions vs. Cloud Run, to securing a REST API with JWT.

Serverless containers with Cloud Run
Check out this video to learn how to deploy serverless containers in three environments using Cloud Run and Knative. In this demo, we deploy a serverless microservice that transforms Word documents to PDFs.

Building APIs for serverless workloads with Google Cloud
This video demonstrates Google Cloud’s API Gateway, a tool that helps you create, secure, and monitor APIs for many of Google Cloud’s serverless backends such as Cloud Functions, Cloud Run, and more. Learn how you can provide secure access to your backend services through a well-defined REST API, providing consistency across all of your services, regardless of the service implementation.

6 strategies for scaling your serverless applications
This video walks you through a few tips that can help you scale serverless workloads while protecting underlying resources. Learn how to configure instance scaling limits, use Cloud Tasks to limit the rate of work done, utilize stateful storage, and more. This is a great episode for understanding how to improve performance and scalability for your serverless applications.

Stay updated with the latest videos

The Google Cloud Tech YouTube channel has new daily videos to help you build what’s next with secure infrastructure, developer tools, APIs, data analytics, and machine learning. Subscribe to get notified of our newest content!
Source: Google Cloud Platform

How Songkick harmonized their infrastructure with Memorystore

Songkick is a U.K.-based concert discovery service and live music platform owned by Warner Music Group, connecting music fans to live events. Annually, we help over 175 million music fans around the world track their favorite artists, discover concerts and live streams, and buy tickets with confidence online and via our mobile apps and website. We have about 15 developers across four teams, all based in London, and my role is to provide support across those teams by helping them make technical decisions and architect solutions.

After migrating to Google Cloud, we wanted a fully managed caching solution that would integrate well with the other Google tools we’d come to love, and free our developers to work on innovative, customer-delighting products. Memorystore, Google’s scalable, secure, and highly available in-memory service for Redis, helped us meet those goals.

Fully managed Memorystore removed hassles

Our original caching infrastructure was built solely with on-premises Memcached, which we found simple to use at the time. Eventually, we turned to Redis to leverage advanced features like dictionaries and increments. In our service-oriented architecture, we had both of these open source data stores working for us. We had two Redis clusters—one for persistent data, and one as a straightforward caching layer between our front end and our services.

When we were making decisions about how to use Google Cloud, we realized there was no real advantage to having two caching technologies (Memcached and Redis), and we decided to use only Redis: everything we used Memcached for could be handled by Redis, and this way we don’t need expertise in both databases. We knew Redis can be more complex to use and manage, but that wasn’t a big concern, because it would be completely managed by Google Cloud once we used Memorystore. With Memorystore automating complex Redis tasks like enabling high availability, failover, patching, and monitoring, we could spend that time on new engineering opportunities.

We considered the hours we spent fixing broken Redis clusters and debugging network problems. Our team is more heavily experienced in developing than in managing infrastructure, so problems with Redis had proven distracting and time-consuming. Also, with a self-managed tool, there would potentially be some user-facing downtime. But Memorystore was a secure, fully managed option that was cost-effective and promised to save us those hassles. It offered the benefits of Redis without the cost of managing it. Choosing it was a no-brainer.

How Memorystore works for us

Let’s look at a couple of our use cases for Memorystore. We have two levels of caching on Memorystore—the front end caches results from API calls to our services, and some services cache database results. Usually, our caching key for the front-end services is the URL and any primitive values that get passed with it. With the URL and the query parameters, the front end checks whether it already has a result, or whether it needs to go talk to the service. We have a few services where we have a caching layer within the service itself that talks to Redis first, before deciding whether it needs to invoke our business logic and talk to the databases. That caching sits in front of the service, operating on the same principle as the front-end caching.
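As a minimal sketch of that cache-aside pattern (with a hypothetical backend-call helper and a placeholder Memorystore host, not Songkick’s actual code), the front-end logic looks roughly like this in Python with redis-py:

import json
import redis

# Memorystore exposes a standard Redis endpoint; the host is a placeholder.
cache = redis.Redis(host="10.0.0.3", port=6379)

def fetch_with_cache(url, params, ttl_seconds=600):
    """Cache-aside lookup keyed on the URL plus its query parameters."""
    key = url + "?" + "&".join(f"{k}={v}" for k, v in sorted(params.items()))

    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # Cache hit: skip the backend service.

    result = call_backend_service(url, params)  # Hypothetical service call.
    cache.setex(key, ttl_seconds, json.dumps(result))  # Expires after the TTL.
    return result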
We also use Fastly as a caching layer in front of our front ends. So, on an individual page level, the whole page may be heavily cached in Fastly, such as when a page is a leaderboard of the top artists on the platform. Memorystore comes in for user-level content, such as an event page that pulls some information about the artist, some information about the event, and maybe some recommendations for the artists. If the Fastly cache on that page has expired, the request goes to the front end, which knows to talk to the various services to display all of the requested information on the page. In this case, there might be three separate bits of data sitting in our Redis cache. Our artist pages have components that are not cached in Fastly, so there we rely much more heavily on Redis.

Our Redis cache TTL (time-to-live) tends to be quite low; sometimes we have just a ten-minute cache. Other times, with very static data, we can cache it in Redis for a few hours. We determine a reasonable caching time for each data item, and then set the TTL based on that determination. A particular artist might be called 100,000 times a day, so even putting just a ten-minute cache on that makes a huge difference in how many calls a day we have to put through to our service. For this use case, we have one highly available Memorystore cluster of about 4 GB of memory, and we use a cache eviction policy of allkeys-lru (least recently used). Right now on that cluster, we’re getting about 400 requests per second at peak. That’s an average day’s busy period, but it’ll spike much higher in certain circumstances.

We had two different Redis clusters in our old infrastructure. The first was as just described. The second was persistent Redis. When considering the migration to Google Cloud, we decided to use Redis in the way it really excels and to simplify, re-architecting the four or five features that used persistent Redis to use either Cloud SQL for MySQL or BigQuery. Sometimes we had used Redis to aggregate data; now that we’re on Google Cloud, we can just use BigQuery and have far better analysis options than we had when aggregating on Redis.

We also use Memorystore as a distributed mutex. There are certain actions in our system where we don’t want the same thing happening concurrently—for example, a migration of data for a particular event, where two admins might be trying to pick up the same piece of work at the same time. If that data migration happened concurrently, it could prove damaging to our system. So we use Redis here as a mutex lock between different processes, to ensure they happen consecutively instead of concurrently.
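A sketch of that kind of Redis-based mutex, using redis-py’s atomic SET with NX (set-if-not-exists) and an expiry so a crashed process can’t hold the lock forever; the key names, timings, and migrate_event helper are illustrative:

import uuid
import redis

cache = redis.Redis(host="10.0.0.3", port=6379)  # Placeholder Memorystore host.

def run_exclusively(lock_key, work, timeout_seconds=60):
    """Runs work() only if this process can acquire the named lock."""
    token = str(uuid.uuid4())  # Unique token identifying this lock holder.

    # SET with nx=True and ex=... is atomic: acquire the lock only if nobody
    # holds it, with an expiry so the lock is released even if we crash.
    if not cache.set(lock_key, token, nx=True, ex=timeout_seconds):
        return False  # Another process is already doing this work.
    try:
        work()
        return True
    finally:
        # Best-effort release: delete only if we still own the lock. (This
        # check-then-delete is not atomic; a Lua script would make it so.)
        if cache.get(lock_key) == token.encode():
            cache.delete(lock_key)

# Example usage (migrate_event is hypothetical):
# run_exclusively("locks:event-migration:123", lambda: migrate_event(123))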
Memorystore and Redis work for us in peaceful harmony

We have not seen any problems with Redis since the migration. We also love the monitoring capabilities you get out of the box with Memorystore. When we deploy a new feature, we can easily check whether it suddenly fills the cache, or whether we have a really low hit ratio that indicates we’ve made an error in our implementation.

Another benefit: the Memorystore interface works exactly as if you’re just talking to Redis. We use ordinary Redis in a Docker container in our development environments, so when we’re running locally, it’s seamless to check that our caching code is doing exactly what it’s meant to. We have both production and staging environments, both Virtual Private Clouds, each with its own Memorystore cluster. We have unit tests, which never really touch Redis, and integration tests, which talk to a local MySQL in Docker and a Redis in Docker as well. And we also have acceptance tests—browser automation tests that run in the staging environment, which talk to Cloud SQL and Memorystore.

Planning encores with Memorystore

As a potential future use case for Memorystore, we’re almost certainly going to be adding Pub/Sub to our infrastructure, and we’ll be using Redis to deduplicate some messages coming from Pub/Sub, such as when we don’t want to send the same email twice in quick succession. We’re looking forward to Pub/Sub’s fully managed service as well, since we’re currently running RabbitMQ, which too often requires debugging. We performed an experiment using Pub/Sub for the same use case, and it worked really well, so it made for another easy decision.

Memorystore is just one of the Google data cloud solutions we use every day. Others include Cloud SQL, BigQuery, and Dataflow, for an ETL pipeline, data warehousing, and our analytics products. There, we aggregate data that an artist is interested in, feed that back into MySQL, and then surface it in our artist products. Once we have Pub/Sub, we’ll be using virtually every type of Google Cloud database. That’s evidence of how we feel about Google Cloud’s tools.

Learn more about the services and products making music at Songkick. Curious to learn more about Memorystore? Check out the Google Cloud blog for a look at performance tuning best practices for Memorystore for Redis.

Related article: Go faster and cheaper with Memorystore for Memcached, now GA (learn about fully managed Memorystore for Memcached, which is compatible with the open-source Memcached protocol and can save database costs).
Source: Google Cloud Platform

Scaling workloads across multiple dimensions in GKE

In Google Kubernetes Engine (GKE), application owners can define multiple autoscaling behaviors for a workload using a single Kubernetes resource: the Multidimensional Pod Autoscaler (MPA).

The challenges of scaling Pods horizontally and vertically

The success of Kubernetes as a widely adopted platform is grounded in its support for a variety of workloads and their many requirements. One of the areas that has continuously improved over time is workload autoscaling. Dating back to the early days of Kubernetes, the Horizontal Pod Autoscaler (HPA) was the primary mechanism for autoscaling Pods. As its name suggests, it gave users the ability to have Pod replicas added when a user-defined threshold of a given metric was crossed. Early on, this was typically CPU or memory usage, though there is now support for custom and external metrics.

A bit further down the line, the Vertical Pod Autoscaler (VPA) added a new dimension to workload autoscaling. Much like its name suggests, VPA can make recommendations on the best amount of CPU or memory that Pods should be requesting, based on usage patterns. Users can then either review those recommendations and decide whether they should be applied, or entrust VPA to apply those changes automatically on their behalf.

Naturally, Kubernetes users have sought the benefits of both of these forms of scaling. While these autoscalers work well independently of one another, running both at the same time can produce unexpected results. Picture this example:

- HPA adjusts the number of replicas for a Pod to maintain a target 50% CPU utilization.
- VPA, when configured to automatically apply recommendations, could fall into a loop of continuously shrinking CPU requests – a direct result of HPA maintaining its relatively low target for CPU utilization!

Part of the challenge here is that when configured to act autonomously, VPA applies changes for both CPU and memory. Thus, the contention can be difficult to avoid as long as VPA is automatically applying changes. Users have since accepted compromises in one of two ways:

- Using HPA to scale on CPU or memory and using VPA only for recommendations, building their own automation to review and actually apply the recommendations
- Using VPA to automatically apply changes to CPU and memory, while using HPA based on custom or external metrics

While these workarounds are suitable for a handful of use cases, there are still workloads that would benefit from autoscaling across the dimensions of both CPU and memory. For example, web applications may require horizontal autoscaling on CPU when CPU-bound, but may also benefit from vertical autoscaling on memory for reliability in the event of misconfigured memory requests that result in OOMKilled events for the container.

Multidimensional Pod Autoscaler

The first feature available in MPA allows users to scale Pods horizontally based on CPU utilization and vertically based on memory, available in GKE clusters running version 1.19.4-gke.1700 or newer. In the MPA schema, there are two critical constructs that let users configure their desired behavior: goals and constraints. See the manifest below for an example MPA resource:
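This is a representative sketch rather than the post’s original manifest; the name, replica bounds, and memory bounds are illustrative, following the goals and constraints described here:

apiVersion: autoscaling.gke.io/v1beta1
kind: MultidimPodAutoscaler
metadata:
  name: php-apache-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  goals:
    metrics:
    - type: Resource
      resource:
        # Goal: keep average CPU utilization at 60% by scaling horizontally.
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
  constraints:
    global:
      # Global constraints: bounds on the replica count.
      minReplicas: 1
      maxReplicas: 5
    # Let the MPA control memory requests vertically, within bounds.
    containerControlledResources: [ memory ]
    container:
    - name: '*'
      requests:
        minAllowed:
          memory: 50Mi
        maxAllowed:
          memory: 500Mi
  policy:
    updateMode: Auto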
Goals allow users to define targets for metrics. The first supported metric is target CPU utilization, similar to how users define target CPU utilization in an HPA resource. The MPA will attempt to ensure that these goals are met by distributing load across additional replicas of a given Pod.

Constraints, on the other hand, are a bit more stringent. They take precedence over goals, and can be applied either to global targets (think min and max replicas of a given Pod) or to specific resources. In the case of vertical autoscaling, this is where users a) specify that memory is controlled by the MPA and b) define the upper and lower boundaries for memory requests for a given Pod, should they need to.

Let’s test this out! We’ll use Cloud Shell as our workstation and create a GKE cluster running a version that supports MPA. We’ll use the standard php-apache example Pods from the Kubernetes documentation on HPA. The manifests create three Kubernetes objects: a Deployment, a Service, and a Multidimensional Pod Autoscaler. The Deployment consists of a php-apache Pod, is exposed via a Service of type LoadBalancer, and is managed by the MPA shown earlier. The Pod template in the Deployment is configured to request 100 millicores of CPU and 50 mebibytes of memory. The MPA is configured to aim for 60% CPU utilization and to adjust Pod memory requests based on usage.
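Here’s a sketch of a Deployment and Service consistent with that description; the demo image follows the standard php-apache example, and the values are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example  # Standard php-apache demo image.
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m      # 100 millicores, as described above.
            memory: 50Mi   # 50 mebibytes, as described above.
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
spec:
  type: LoadBalancer  # Exposes the Pods via an external IP.
  selector:
    run: php-apache
  ports:
  - port: 80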
Source: Google Cloud Platform

Hackathons aren’t just for programmers anymore

Companies and organizations all over the world hold hackathons to help their software developers sharpen skills, improve collaboration, cultivate creativity, and experiment with new ideas. But what about non-technical employees? Is coding ability a prerequisite for participating in hackathons and creating technical solutions? Thanks to no-code application development, the answer is no.

Whereas non-technical employees would have been relegated to sideline roles in hackathons a few years ago, today's businesses are increasingly investing in platforms that let anyone build apps and technology optimizations without writing any code. Hackathons needn't apply only to software developers; with no-code tools, "non-technical" employees can become "citizen developers" who are empowered to use technology to build creative solutions. From apps that manage inventory and logistics, to apps for collecting data in the field, to process optimizations that automate time-consuming steps, scores of solutions are needed but never built because traditional developers have finite resources and time. No-code and citizen developers can change all of that by democratizing the tools of innovation.

AppSheet, Google Cloud's no-code application development platform, was built for this democratization, and we've observed across many of our customers that hackathons are an excellent way to kickstart citizen development programs and inspire the workforce. For example, Globe Telecom, a leading mobile network provider in the Philippines, anticipated 30 or so teams for its first no-code hackathon, but employees were so enthusiastic that over 100 teams applied to participate. The winning app helps optimize workflows for discovering illegal signal booster activity, which can threaten the integrity of Globe's services. Other finalists included inventory management apps for the company's retail outlets, development planning and evaluation apps that help employees collaborate on goals in real time, and even a vehicle maintenance app that includes gas price reporting.

These kinds of projects don't replace traditional IT, but they provide a pathway for apps that IT would never have time to build, picking some of the low-hanging fruit in the IT backlog. As a result, just as no-code can empower citizen developers to build solutions, it can also free up traditional developers to work on more sophisticated or strategic initiatives.

To help your business enjoy these benefits, today we're kicking off a two-part series on running a successful no-code hackathon. In this article, we'll explore some of the benefits you can expect from your hackathon as well as the goals you should keep in mind; in our next article, we'll dive into a framework to help you pull off the big event.

How no-code hackathons drive value

Though no-code hackathons can produce fantastic apps, the goal is not so much to produce an astonishingly innovative app out of the gate as to create a culture in which citizen development can thrive in partnership with classical IT over the long run.

Hackathons promote no-code app development. One of the foremost goals is, of course, defining what citizen development is and creating space for employees to experiment with solutions. Often, non-technical employees are uniquely positioned to help solve common problems simply because they are the ones who encounter these issues each and every day.
Imagine a developer trying to assess an operations lead's request for an application that handles equipment assessment tracking and requires a multi-step approval process for 20 employees on a manufacturing floor. Because the people working on the manufacturing floor are likely the only ones intimately familiar with this task and its pain points, it could take a traditional developer countless cycles to achieve the desired result, and that's if the manufacturing team can make a successful case for adding the request to the IT backlog in the first place. But with a no-code platform, the people working on the manufacturing floor can build the app themselves. Citizen developers are powerful because they are often the subject matter experts who can describe the need and the process; to unleash that power, they simply require a platform and a program to help them execute.

Hackathons drive platform adoption. By communicating the potential of citizen development, hackathon organizers set the stage for the next goal: actually getting employees to adopt the no-code platform. By introducing a no-code platform during a time-boxed event with a goal in mind, hackathons provide a structured balance between experimentation and urgency, and they generate scalable best practices and insights that can be leveraged for future no-code projects. We've found that these efforts, and the "learn by doing" culture they promote, increase the likelihood that hackathon participants will continue to use the platform, and that platform adoption will spread to other parts of the workforce. When the people closest to a problem are empowered to solve it, an organization can remain future-focused rather than scrambling to address deficits or petition for limited IT resources.

Hackathons foster collaboration between IT and business users. As mentioned, citizen development is not meant to replace IT so much as to give line-of-business workers more tools to innovate, remove smaller projects from the IT backlog, provide pathways for apps that otherwise might never have been built, and free up developers to focus on high-value strategic initiatives. Additionally, citizen development should not happen behind IT's back. For example, a strong no-code platform lets employees experiment and build through self-service tools while still giving administrators governance guardrails, as well as visibility into and control over how the databases and APIs behind the no-code platform are accessed. All of this means that IT and citizen developers are not adversarial, "us versus them" communities but rather symbiotic forces. Hackathons can help establish this relationship from the start, both defining best practices for collaboration between business and IT teams and aligning everyone around the shared benefits of the no-code program.

Executing on the promise of no-code hackathons

Now that you're acquainted with the benefits and goals of a no-code hackathon, you're probably wondering, "How do I execute all of this within my specific organization?" We'll address that question in our next installment. Until then, check out this article to see more of the things people are building, without coding, in AppSheet.
Source: Google Cloud Platform