Opinary generates recommendations faster on Cloud Run

Editor’s note: Berlin-based startup Opinary migrated its machine learning pipeline from Google Kubernetes Engine (GKE) to Cloud Run. After making a few architectural changes, the pipeline is now faster and more cost-efficient: the time to generate a recommendation dropped from 20 seconds to about one second, and costs fell by a remarkable 50%. In this post, Doreen Sacker and Héctor Otero Mediero share a detailed and transparent technical report of the migration.

Opinary asks the right questions to increase reader engagement

We’re Opinary, and our reader polls appear in news articles globally. The polls let users share their opinion with one click and see how they compare to other readers. We automatically add the most relevant reader polls using machine learning. We’ve found that the polls help publishers increase reader retention, boost subscriptions, and improve other article success metrics. Advertisers benefit from contextual access to their target groups on premium publishers’ sites, and from high-performing interactions with their audiences.

Let’s look at an example of one of our polls. Imagine reading an article on your favorite news site about whether or not to introduce a speed limit on the highway. As you might know, long stretches of the German Autobahn still don’t have a legal speed limit, and this is a topic of intense debate. Critics of speeding point out the environmental impact and casualty toll. Opinary adds this poll to the article:

Diving into the architecture of our recommendation system

Here’s how we originally architected our system on GKE. Our pipeline starts with an article URL and delivers a recommended poll to add to the article. Let’s take a more detailed look at the components that make this happen. Here’s a visual overview:

First, we push a message with the article URL to a Pub/Sub topic (a message queue). The recommender service pulls the message from the queue in order to process it. Before this service can recommend a poll, it needs to complete a few steps, which we’ve separated out into individual services. The recommender service sends a request to these services one by one and stores the results in a Redis store. These are the steps:

The article scraper service scrapes (downloads and parses) the article text from the URL.
The encoder service encodes the text into text embeddings (we use the universal sentence encoder).
The brand safety service detects whether the article text includes descriptions of tragic events, such as death, murder, or accidents, because we don’t want to add our polls to these articles.

With these three steps completed, the recommendation service can recommend a poll from our database of pre-existing polls and submit it to an internal database we call Rec Store. This is how we end up recommending a poll about introducing a speed limit on the German Autobahn.

Why we decided to move to Cloud Run

Cloud Run looked attractive to us for two reasons. First, because it automatically scales down all the way to zero container instances when there are no requests, we expected we would save costs (and we did!). Second, we liked the idea of running our code on a fully managed platform without having to worry about the underlying infrastructure, especially since our team doesn’t have a dedicated data engineer (we’re both data scientists).

As a fully managed platform, Cloud Run has been designed to make developers more productive.
It’s a serverless platform that lets you run your code in containers, directly on top of Google’s infrastructure. Deployments are fast and automated: fill in your container image URL and seconds later your code is serving requests. Cloud Run automatically adds more container instances to handle all incoming requests or events, and removes them when they’re no longer needed. That’s cost-efficient, and on top of that Cloud Run doesn’t charge you for the resources a container uses when it’s not serving requests. The pay-for-use cost model was the main motivation for us to migrate away from GKE. We only want to pay for the resources we use, not for a large idle cluster during the night.

Enabling the migration to Cloud Run with a few changes

To move our services from GKE to Cloud Run, we had to make two changes:

Change the Pub/Sub subscriptions from pull to push.
Migrate our self-managed Redis database in the cluster to a fully managed Cloud Memorystore instance.

This is how our initial target architecture looks in a diagram:

Changing Pub/Sub subscriptions from pull to push

Since Cloud Run services scale with incoming web requests, your container must have an endpoint to handle requests. Our recommender service originally didn’t have an endpoint to serve requests, because we used the Pub/Sub client library to pull messages. Google recommends using push subscriptions instead of pull subscriptions to trigger Cloud Run from Pub/Sub. With a push subscription, Pub/Sub delivers messages as requests to an HTTPS endpoint. Note that this doesn’t need to be Cloud Run; it can be any HTTPS URL. Pub/Sub guarantees delivery of a message by retrying requests that return an error or are too slow to respond (using a configurable deadline).

Introducing a Cloud Memorystore Redis instance

Cloud Run adds and removes container instances to handle all incoming requests. Redis doesn’t serve HTTP requests, and it likes to have one or a few stateful container instances attached to a persistent volume, instead of disposable containers that start on demand. We created a Memorystore Redis instance to replace the in-cluster Redis instance. Memorystore instances have an internal IP address on the project’s VPC network, and containers on Cloud Run operate outside of the VPC. That means you have to add a connector to reach internal IP addresses on the VPC. Read the docs to learn more about Serverless VPC Access.

Making it faster using Cloud Trace

This first part of our migration went smoothly, but while we were hopeful that our system would perform better, we would still regularly spend almost 20 seconds generating a recommendation. We used Cloud Trace to figure out where requests were spending time. This is what we found:

To handle a single request, our code made roughly 2,000 requests to Redis. Batching all these requests into one round trip was a big improvement (see the sketch below).
The VPC connector has a default maximum limit on network throughput that was too low for our workload. Once we changed it to use larger instances, response times improved.

As you can see below, when we rolled out these changes, we realized a noticeable performance benefit.
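To give a sense of what that batching change can look like, here is a minimal sketch using the redis-py client. The key names and data layout are illustrative assumptions, not Opinary’s actual schema; the point is simply that a pipeline (or MGET) sends many commands in a single network round trip instead of roughly 2,000 sequential requests.

import redis

# Connect to the Memorystore Redis instance (host and port are illustrative).
r = redis.Redis(host="10.0.0.3", port=6379)

# Illustrative keys for the values the recommender needs per request.
embedding_keys = [f"embedding:{i}" for i in range(2000)]

# Before: one network round trip per key -- ~2,000 sequential requests.
# values = [r.get(key) for key in embedding_keys]

# After: queue all GETs locally and send them in a single round trip.
pipe = r.pipeline(transaction=False)
for key in embedding_keys:
    pipe.get(key)
values = pipe.execute()  # results come back in the same order as the queued commands

# For plain GETs, a single MGET call achieves the same effect:
# values = r.mget(embedding_keys)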
Waiting for responses is expensive

The changes described above led to scalable and fast recommendations: we reduced the average recommendation time from 10 seconds to under 1 second. However, the recommendation service was getting very expensive, because it spent a lot of time doing nothing but waiting for other services to return their responses. The recommender service would receive a request and then wait for the other services to respond. As a result, many container instances in the recommender service were running but essentially doing nothing except waiting. With the pay-per-use cost model of Cloud Run, that waiting translates into high costs for this service: our costs went up by a factor of 4 compared with the original setup on Kubernetes.

Rethinking the architecture

To reduce costs, we needed to rethink our architecture. The recommendation service was sending requests to all other services and waiting for their responses. This is called an orchestration pattern. To have the services work independently, we changed to a choreography pattern. We needed the services to execute their tasks one after the other, but without a single service waiting for the others to complete. This is what we ended up doing:

We changed the initial entry point to be the article scraping service, rather than the recommender service. Instead of returning the article text, the scraping service now stores the text in a Cloud Storage bucket. The next step in our pipeline is to run the encoder service, and we invoke it using an Eventarc trigger.

Eventarc lets you asynchronously deliver events from Google services, including those from Cloud Storage. We set an Eventarc trigger to fire an event as soon as the article scraper service adds the file to the Cloud Storage bucket. The trigger sends the object information to the encoder service in an HTTP request. The encoder service does its processing and saves the results to a Cloud Storage bucket again. Each service can now process and save intermediate results in Cloud Storage for the next service to use (see the sketch at the end of this post).

Now that we invoke all services asynchronously with Eventarc triggers, no single service is actively waiting for another service to return results. Compared with the original setup on GKE, our costs are now 50% lower.

Advice and conclusions

Our recommendations are now fast and scalable, and our costs are half of what they were with the original cluster setup.
Migrating from GKE to Cloud Run is easy for container-based applications.
Cloud Trace was useful for identifying where requests were spending time.
Sending a request from one Cloud Run service to another and synchronously waiting for the result turned out to be expensive for us. Asynchronously invoking our services with Eventarc triggers was a better solution.
Cloud Run is under active development and new features are added frequently, which makes for a nice developer experience overall.
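To make the choreography pattern concrete, here is a minimal sketch of what an Eventarc-triggered Cloud Run entry point can look like, in the spirit of the encoder step described above. The bucket names, environment variable, and encode() placeholder are illustrative assumptions, not Opinary’s actual code; the idea is that Eventarc POSTs the Cloud Storage object metadata to the service, which reads the intermediate result and writes its own output back to Cloud Storage for the next step.

import os
from flask import Flask, request
from google.cloud import storage

app = Flask(__name__)
storage_client = storage.Client()

# Illustrative output bucket for this step's results.
OUTPUT_BUCKET = os.environ.get("OUTPUT_BUCKET", "encoded-articles")

def encode(text: str) -> bytes:
    # Placeholder for the real work (e.g., running the universal sentence encoder).
    return text.encode("utf-8")

@app.route("/", methods=["POST"])
def handle_gcs_event():
    # Eventarc delivers the Cloud Storage object metadata as the JSON request body.
    event = request.get_json()
    bucket_name = event["bucket"]
    object_name = event["name"]

    # Read the intermediate result written by the previous step in the pipeline.
    text = storage_client.bucket(bucket_name).blob(object_name).download_as_text()

    # Do this service's work and store the output for the next step to pick up.
    storage_client.bucket(OUTPUT_BUCKET).blob(object_name).upload_from_string(encode(text))

    # Return 2xx so the event delivery is not retried.
    return ("", 204)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))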
Source: Google Cloud Platform

Accelerate integrated Salesforce insights with Google Cloud Cortex Framework

Enterprises across the globe rely on a number of strategic independent software vendors like Salesforce, SAP, and others to help them run their operations and business processes. Now more than ever, the need to sense and respond to new and changing business demands has increased, and the availability of data from these platforms is integral to business decision making. Many companies today are looking for accelerated ways to link their enterprise data with surrounding data sets and sources to gain more meaningful insights and business outcomes. Given the complexity and scale of managing and tying this data together, getting there quickly can be an expensive and challenging proposition.

To embark on this journey, many companies choose Google’s Data Cloud to integrate, accelerate, and augment business insights through a cloud-first data platform approach, with BigQuery powering data-driven innovation at scale. Next, they take advantage of the best practices and accelerator content delivered with Google Cloud Cortex Framework to establish an open, scalable data foundation that can enable connected insights across a variety of use cases. Today, we are excited to announce the next set of accelerators, which expands Cortex Data Foundation to include new packaged analytics solution templates and content for Salesforce.

New analytics content for Salesforce

Salesforce provides a powerful Customer Relationship Management (CRM) solution that is widely recognized and adopted across many industries and enterprises. With increased focus on engaging customers better and improving insights on relationships, this data is highly valuable and relevant, as it spans many business activities and processes including sales, marketing, and customer service. With Cortex Framework, Salesforce data can now be more easily integrated into a single, scalable data foundation in BigQuery to unlock new insights and value.

With this release, we take the guesswork out of the time, effort, and cost of establishing a Salesforce data foundation in BigQuery. You can deploy Cortex Framework for Salesforce content to kickstart customer-centric data analytics and gain broader insights across key areas including accounts, contacts, leads, opportunities, and cases. Take advantage of the predefined data models for Salesforce along with analytics examples in Looker for immediate customer-relationship-focused insights, or easily join Salesforce data with other delivered data sets, such as Google Trends, Weather, or SAP, to enable richer, connected insights. The choice is yours, and the sky’s the limit with the flexibility of Cortex to enable your specific use cases. By bringing Salesforce data together with other public, community, and private data sources, Google Cloud Cortex Framework helps accelerate your ability to optimize and innovate your business with connected insights.

What’s next

This release builds on prior content releases for SAP and other data sources to further enhance the value of Cortex Data Foundation across private, public, and community data sources. Google Cloud Cortex Framework continues to expand its content to better meet the needs of customers on their data analytics transformation journeys.
Stay tuned for more announcements coming soon. To learn more about Google Cloud Cortex Framework, visit our solution page, and try out Cortex Data Foundation today to discover what’s possible.
Source: Google Cloud Platform

Hierarchical Firewall Policy Automation with Terraform

Firewall rules are an essential component of network security in Google Cloud. Firewalls in Google Cloud broadly fall into two categories: network firewall policies and hierarchical firewall policies. While network firewall rules are associated directly with a VPC to allow or deny traffic, hierarchical firewall policies act as a policy engine that uses the resource hierarchy to create and enforce policies across the organization. Hierarchical policies can be enforced at the organization level or at the folder level. Like network firewall rules, hierarchical firewall policy rules can allow or deny traffic, and they can also delegate evaluation to lower-level policies or to the network firewall rules themselves (with a go_next action). Lower-level rules cannot override a rule from a higher place in the resource hierarchy, which lets organization-wide admins manage critical firewall rules in one place.

Let’s consider a few scenarios where hierarchical firewall policies are useful.

1. Reduce the number of network firewall rules

Example: say xyz.com has six Shared VPCs aligned to its business segments, and company security policy forbids SSH access to any VM, i.e., TCP port 22 traffic must be denied. With network firewalls, this rule needs to be enforced in six places (one per Shared VPC). A growing number of granular network firewall rules for each network segment means more touch points, and therefore more chances of drift and accidents. Security admins get busy with hand-holding and almost always become a bottleneck for even simple firewall changes. With hierarchical firewall policies, security admins can create a single policy to deny TCP port 22 traffic and enforce it on the xyz.com organization, or explicitly target one or more Shared VPCs from the policy. This way a single policy defines the broader traffic control posture.

2. Manage critical firewall rules with centralized policies and safely delegate non-critical controls at the VPC level

Example: at xyz.com, SSH to Compute Engine instances is strictly prohibited and non-negotiable; auditors require it. Whether TCP traffic to port 443 is allowed, however, depends on which Shared VPC the traffic is going to. In this case security admins can create a policy to deny TCP port 22 traffic and enforce it on the xyz.com organization. Another rule is created for TCP port 443 traffic that says go_next, deferring the decision to the next lower level, and a network firewall rule then allows or denies 443 traffic at the Shared VPC level. This gives the security admin broad control at a higher level to enforce traffic policies while delegating where possible. Managing the most critical firewall rules in one place also frees project-level administrators (such as project owners, editors, or security admins) from having to keep up with changing organization-wide policies. With hierarchical firewall policies, security admins can centrally enforce, manage, and observe traffic control patterns.

Create, configure, and enforce hierarchical firewall policies

There are three major components of hierarchical firewall policies: rules, policies, and associations. Broadly speaking, a rule is a decision-making construct that declares whether traffic should be allowed, denied, or delegated to the next level for a decision. A policy is a collection of rules; one or more rules can be attached to a policy. An association defines where in the Google Cloud resource hierarchy the policy is enforced.
These concepts are extensively explained on the product page. A simple visualization of rules, policies, and associations looks like this:

Infrastructure as Code (Terraform) for Hierarchical Firewall Policies

There are three Terraform resources that need to be stitched together to build and enforce hierarchical firewall policies.

#1 Policy Terraform resource – google_compute_firewall_policy

In this resource the most important parameter is "parent". Hierarchical firewall policies, like projects, are parented by a folder or organization resource. Remember, this is NOT the folder where the policy is enforced or associated; it is just the folder that owns the policy(s) you are creating. Using a folder to own the hierarchical firewall policies also simplifies the IAM for managing who can create or modify these policies: just assign the IAM roles on that folder. For a scaled environment it is recommended to create a separate "firewall-policy" folder to host all of your hierarchical firewall policies.

Sample:

/*
  Create a Policy
*/
resource "google_compute_firewall_policy" "base-fw-policy" {
  parent      = "folders/<folder-id>"
  short_name  = "base-fw-policy"
  description = "A Firewall Policy Example"
}

You can get the folder ID of the "firewall-policy" folder with the command below:

gcloud resource-manager folders list --organization=<your organization ID> --filter='<name of the folder>'

For example, if your firewall policy folder is called 'firewall-policy', use:

gcloud resource-manager folders list --organization=<your organization ID> --filter='firewall-policy'

#2 Rules Terraform resource – google_compute_firewall_policy_rule

Most of the parameters in this resource definition are self-explanatory, but a couple of them need special consideration:

disabled – Denotes whether the firewall policy rule is disabled. When set to true, the rule is not enforced and traffic behaves as if it did not exist. If this is unspecified, the rule is enabled.
enable_logging – Enabling firewall logging is highly recommended for the operational advantages it brings later. To enable it, set this parameter to true.
target_resources – This parameter comes in handy when you want to target specific Shared VPC(s) with the rule. You need to pass the URI path of the Shared VPC.
To get the URI for the VPC, use these commands:

gcloud config set project <Host Project ID>
gcloud compute networks list --uri

Sample:

Here is some sample Terraform code to create a firewall policy rule with priority 9000 that denies TCP port 22 traffic from the 35.235.240.0/20 CIDR block (used for Identity-Aware Proxy):

/*
  Create a Firewall rule #1
*/
resource "google_compute_firewall_policy_rule" "base-fw-rule-1" {
  firewall_policy = google_compute_firewall_policy.base-fw-policy.id
  description     = "Firewall Rule #1 in base firewall policy"
  priority        = 9000
  enable_logging  = true
  action          = "deny"
  direction       = "INGRESS"
  disabled        = false
  match {
    layer4_configs {
      ip_protocol = "tcp"
      ports       = [22]
    }
    src_ip_ranges = ["35.235.240.0/20"]
  }
  target_resources = ["https://www.googleapis.com/compute/v1/projects/<PROJECT-ID>/global/networks/<VPC-NAME>"]
}

#3 Association Terraform resource – google_compute_firewall_policy_association

In attachment_target, pass the folder ID where you want to enforce this policy; everything under that folder (all projects) will get the policy. In the case of Shared VPCs, the target folder should be the parent of your host project.

Sample:

/*
  Associate the policy
*/
resource "google_compute_firewall_policy_association" "associate-base-fw-policy" {
  firewall_policy   = google_compute_firewall_policy.base-fw-policy.id
  attachment_target = "folders/<Folder ID>"
  name              = "Associate Base Firewall Policy with dummy-folder"
}

Once these policies are enforced, you can see them in the console under "VPC Network -> Firewall". The created hierarchical firewall policy shows up in the firewall policy folder. Remember that four default firewall rules come with each policy, so even when you create a single rule in your policy, the rule count will be 5. Go into the policy to see the rules you created and the policy's association.

Summary

Hierarchical firewall policies simplify the complex process of enforcing consistent traffic control policies across your Google Cloud environment. With the Terraform resources and automation shown in this article, security admins can build guardrails using a policy engine and a familiar Infrastructure as Code platform. Check out the hierarchical firewall policy documentation to learn more about how to use them.
Source: Google Cloud Platform

CISO Survival Guide: Vital questions to help guide transformation success

Part of being a security leader whose organization is taking on a digital transformation is preparing for hard questions, and complex answers, on how to implement a transformation strategy. In our previous CISO Survival Guide blog, we discussed how financial services organizations can more securely move to the cloud. We examined how to organize and think about the digital transformation challenges facing the highly regulated financial services industry, including the benefits of the Organization, Operation, and Technology (OOT) approach, as well as embracing new processes like continuous delivery and the required cultural shifts.

As part of Google Cloud’s commitment to shared fate, today we offer tips on how to ask the right questions that can help create the conversations that lead to better transformation outcomes for your organization. While there is often more than one right answer, a thoughtful, methodical approach to asking targeted questions, and keeping an open mind about the answers you hear back, can help achieve your desired result. These questions are designed to help you figure out where to start and where to end your organization’s security transformation. By asking the following questions, CISOs and business leaders can develop a constructive, focused dialogue that helps determine the proper balance between implementing security controls and fine-tuning the risk tolerance set by executive management and the board of directors.

To start the conversation, begin by asking:
What defines our organization’s culture?
How can we best integrate the culture with our security goals?

CISOs should ask business leaders:
What makes a successful transformation?
What are the key goals of the transformation?
What data is (most) valuable?
What data can be retired, reclassified, or migrated?
What losses can we afford to take and still function?
What is the real risk that the organization is willing to accept?

Business leaders should ask CISOs and the security team:
What are the best practices for protecting our valuable data?
What is the business impact of implementing those controls?
What are the top threats that we need to address?

CISOs and business leaders should ask:
Which threats are no longer as important?
Where could we potentially redirect spending toward more cost-effective controls such as firewalls and antivirus software?
What benefits do we get from refactoring our applications?
Are we really transforming, or lifting and shifting?
How should we perform identity and access management to meet our business objectives?
What are the core controls needed to ensure enterprise-level performance for the first workloads?

CISOs and risk teams should ask:
How can we use the restructuring of an existing body of code to streamline security functions?
How should we monitor our security posture to ensure we are aligned with our risk appetite?

Business and technical teams should ask:
What’s our backup plan?
What do we do if that fails?

Practical advice and the realities of operational transformation

Some organizations have been working in the cloud for more than a decade and have already addressed many operational procedures, sometimes with painful lessons learned along the way. If you’ve been operating in the cloud securely for that long, we recognize that there’s a lot to be gained from understanding your approaches to culture, operational expertise, and technology. However, many organizations still don’t think through how they will operate in a cloud environment until it’s almost ready, and at that point it might be too late. If you can’t detail how a cloud environment will operate before its launch, how will you know who should be responsible for maintaining it? Who are the critical stakeholders, along with those responsible for engineering and maintaining specific systems, who should be identified at the start of the transformation?

There are likely several groups of stakeholders, such as those aligned with operations for the transformation and those focused on control design for the cloud aligned with operations. If you don’t have the operators involved in the design phase, you’re destined to create clever security controls with very little practical value, because those tasked with day-to-day maintenance most likely won’t have the expertise or training to operate these controls effectively. This is complicated by the fact that many organizations are struggling to recruit and retain people with the right skills to operate in the cloud. We believe that training current employees to learn new cloud skills, and giving them time away from other responsibilities to do so, can help build skilled, diverse cloud security teams.

If your organization continually experiences high turnover in security leadership and skilled staff, it’s up to you to navigate your culture to ensure greater consistency. You can, of course, choose to supplement internal knowledge with trusted partners; however, that’s an expensive strategy for ongoing operations.

We met recently with a security organization that turns over skilled staff and leadership every two to three years. This rate of churn results in a continual resetting of security goals. This particular team joked that it’s like “Groundhog Day” as they constantly re-evaluate their best security approaches yet make no meaningful progress. This is not a model to emulate.

Many security controls fail not because they are improperly engineered, but because the people who use them, your security team, are improperly trained and insufficiently motivated. This is especially true for teams with high turnover rates and other organizational misalignments. A security control that blocks 100% of attacks might be engineered correctly, but if you can’t operate it efficiently, its effectiveness will plummet to zero over time. Worse, it then becomes a liability, because you incorrectly assume you have a functioning control.

In our next blog, we will highlight several proven approaches that we believe can help guide your security team through your organization’s digital transformation.
To learn more now, check out:
Previous blog
Podcast: CISO walks into the cloud: Frustrations, successes, lessons… and does the risk change?
Report: CISO’s Guide to Cloud Security Transformation
Source: Google Cloud Platform

Announcing the GA of BigQuery multi-statement transactions

Transactions are mission critical for modern enterprises, supporting payments, logistics, and a multitude of business operations. And in today’s analytics-first, data-driven era, the need for reliable processing of complex transactions extends beyond the traditional OLTP database: businesses also have to trust that their analytics environments process transactional data in an atomic, consistent, isolated, and durable (ACID) manner. So BigQuery set out to support DML statements spanning large numbers of tables in a single transaction, committing the associated changes atomically (all at once) on success or rolling them back atomically on failure. Today, we’d like to highlight the recent general availability launch of multi-statement transactions within BigQuery and the new business capabilities it unlocks.

While in preview, BigQuery multi-statement transactions proved tremendously effective for customer use cases such as keeping BigQuery synchronized with data stored in OLTP environments, complex post-processing of events pre-ingested into BigQuery, and complying with GDPR’s right to be forgotten. One of our customers, PLAID, leverages multi-statement transactions within their customer experience platform KARTE to analyze the behavior and emotions of website visitors and application users, enabling businesses to deliver relevant communications in real time and furthering PLAID’s mission to Maximize the Value of People with the Power of Data.

“We see multi-statement transactions as a valuable feature for achieving expressive and fast analytics capabilities. For developers, it keeps queries simple and less hassle in error handling, and for users, it always gives reliable results.”—Takuya Ogawa, Lead Product Engineer

The general availability of multi-statement transactions not only provides customers with a production-ready means of handling their business-critical transactions comprehensively within a single transaction, but also provides far greater scalability than the preview offered. At GA, multi-statement transactions support mutating up to 100,000 table partitions and modifying up to 100 tables per transaction. This 10x scale in the number of table partitions and 2x scale in the number of tables was made possible by a careful redesign of our transaction commit protocol, which optimizes the size of the transactionally committed metadata.

The GA of multi-statement transactions also introduces full compatibility with BigQuery sessions and procedural language scripting. Sessions are useful because they store state and enable the use of temporary tables and variables, which can then be used across multiple queries when combined with multi-statement transactions. Procedural language scripting gives users the ability to run multiple statements in a sequence with shared state and with complex logic, using programming constructs such as IF … THEN and WHILE loops.

For instance, let’s say we wanted to enhance the current multi-statement transaction example, which uses transactions to atomically manage the existing inventory and the supply of new arrivals for a retail company. Since we’re a retailer monitoring our current inventory on hand, we would now also like to automatically suggest to our sales team which items we should promote with sales offers when our inventory becomes too large.
To do this, it would be useful to include a simple procedural IF statement, which monitors the current inventory and supply of new arrivals and modifies a new PromotionalSales table based on total inventory levels. And let’s validate the results ourselves before committing them as one single transaction to our sales team, by using sessions. Let’s see how we’d do this via SQL.

First, we’ll create our tables using DDL statements:

CREATE OR REPLACE TABLE my_dataset.Inventory
(product string,
 quantity int64,
 supply_constrained bool);

CREATE OR REPLACE TABLE my_dataset.NewArrivals
(product string,
 quantity int64,
 warehouse string);

CREATE OR REPLACE TABLE my_dataset.PromotionalSales
(product string,
 inventory_on_hand int64,
 excess_inventory int64);

Then, we’ll insert some values into our Inventory and NewArrivals tables:

INSERT my_dataset.Inventory (product, quantity)
VALUES('top load washer', 10),
      ('front load washer', 20),
      ('dryer', 30),
      ('refrigerator', 10),
      ('microwave', 20),
      ('dishwasher', 30);

INSERT my_dataset.NewArrivals (product, quantity, warehouse)
VALUES('top load washer', 100, 'warehouse #1'),
      ('dryer', 200, 'warehouse #2'),
      ('oven', 300, 'warehouse #1');

Now, we’ll use a multi-statement transaction and procedural language scripting to atomically merge our NewArrivals table with the Inventory table while taking excess inventory into account to build out our PromotionalSales table. We’ll also run this within a session, which will allow us to validate the tables ourselves before committing the changes for everyone else.

DECLARE average_product_quantity FLOAT64;

BEGIN TRANSACTION;

CREATE TEMP TABLE tmp AS SELECT * FROM my_dataset.NewArrivals WHERE warehouse = 'warehouse #1';
DELETE my_dataset.NewArrivals WHERE warehouse = 'warehouse #1';

#Calculates the average of all product inventories.
SET average_product_quantity = (SELECT AVG(quantity) FROM my_dataset.Inventory);

MERGE my_dataset.Inventory I
USING tmp T
ON I.product = T.product
WHEN NOT MATCHED THEN
  INSERT(product, quantity, supply_constrained)
  VALUES(product, quantity, false)
WHEN MATCHED THEN
  UPDATE SET quantity = I.quantity + T.quantity;

#The procedural script below uses a very simple approach to determine excess_inventory, based on current inventory being above 120% of the average inventory across all products.
IF EXISTS(SELECT * FROM my_dataset.Inventory
          WHERE quantity > (1.2 * average_product_quantity)) THEN
  INSERT my_dataset.PromotionalSales (product, inventory_on_hand, excess_inventory)
  SELECT
    product,
    quantity as inventory_on_hand,
    quantity - CAST(ROUND((1.2 * average_product_quantity),0) AS INT64) as excess_inventory
  FROM my_dataset.Inventory
  WHERE quantity > (1.2 * average_product_quantity);
END IF;

SELECT * FROM my_dataset.NewArrivals;
SELECT * FROM my_dataset.Inventory ORDER BY product;
SELECT * FROM my_dataset.PromotionalSales ORDER BY excess_inventory DESC;
#Note: the multi-statement SQL temporarily stops here within the session. This runs successfully if you've set your SQL to run within a session.

From the results of the SELECT statements, we can see the warehouse #1 arrivals were successfully added to our inventory, and the PromotionalSales table correctly reflects our excess inventory. It looks like these transactions are ready to be committed. However, just in case there were issues with our expected results, anyone querying the tables outside the session we created would not see the changes take effect. Thus, we have the ability to validate our results and could roll them back if needed without impacting others.

#Run in a different tab outside the current session. Results displayed will be consistent with the tables before running the multi-statement transaction.
SELECT * FROM my_dataset.NewArrivals;
SELECT * FROM my_dataset.Inventory ORDER BY product;
SELECT * FROM my_dataset.PromotionalSales ORDER BY excess_inventory DESC;

Going back to our configured session, since we’ve validated that our Inventory, NewArrivals, and PromotionalSales tables are correct, we can go ahead and commit the multi-statement transaction within the session, which will propagate the changes outside the session too.

#Now commit the transaction within the same session configured earlier. Be sure to delete or comment out the rest of the SQL text run earlier.
COMMIT TRANSACTION;

Now that the PromotionalSales table has been updated for all users, our sales team has some ideas of which products they should promote due to our excess inventory.

#Results now propagated for all users.
SELECT * FROM my_dataset.PromotionalSales ORDER BY excess_inventory DESC;

As you can tell, using multi-statement transactions is simple, scalable, and quite powerful, especially when combined with other BigQuery features. Give them a try yourself and see what’s possible.
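If you’d rather drive the same session-based flow from application code instead of the console, here is a minimal sketch using a recent version of the google-cloud-bigquery Python client, which supports sessions via QueryJobConfig. The dataset and SQL mirror the example above; the specific statements shown are abbreviated and illustrative, and error handling is omitted.

from google.cloud import bigquery

client = bigquery.Client()

# Create a session to hold state (temp tables, variables, open transactions) across jobs.
start_job = client.query(
    "SELECT 1;", job_config=bigquery.QueryJobConfig(create_session=True)
)
start_job.result()
session_id = start_job.session_info.session_id

# Subsequent statements join the same session via a connection property.
in_session = bigquery.QueryJobConfig(
    connection_properties=[bigquery.ConnectionProperty("session_id", session_id)]
)

# Open the transaction inside the session and do the work (full SQL elided; see above).
client.query("BEGIN TRANSACTION;", job_config=in_session).result()
client.query(
    "DELETE my_dataset.NewArrivals WHERE warehouse = 'warehouse #1';",
    job_config=in_session,
).result()

# Validate intermediate results within the session; other users won't see them yet.
for row in client.query(
    "SELECT * FROM my_dataset.NewArrivals;", job_config=in_session
).result():
    print(dict(row))

# Commit (or issue ROLLBACK TRANSACTION instead) once the results look right.
client.query("COMMIT TRANSACTION;", job_config=in_session).result()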
Source: Google Cloud Platform

New year, new skills – How to reach your cloud career destination

Cloud is a great place to grow your career in 2023. Opportunity abounds, with cloud roles offering strong salaries and scope for growth in a constantly evolving field.1 Some positions do not require a technical background, like project managers, product owners, and business analysts. For others, like solutions architects, developers, and administrators, coding and technical expertise are a must. Either way, cloud knowledge and experience are required to land that dream job. But where do you start? And how do you keep up with the fast pace of ever-changing cloud technology? Check out the tips below, along with suggested training opportunities to help support your growth, including no-cost options!

Start by looking at your experience

Your experience can be a great way to get into cloud, even if it seems non-traditional. Think creatively about transferable skills and opportunities. Here are a few scenarios where you might find yourself today:

You already work in IT, but in legacy systems or the data center. Forrest Brazeal, Head of Content Marketing at Google Cloud, talks about that in detail in this video.
Use your sales experience to become a sales engineer, or your communications experience to become a developer advocate. Stephanie Wong, Developer Advocate at Google Cloud, discusses that here.
You don’t have the college degree that is included in the job requirements. I’ve talked about that in a recent video here.
Your company has a cloud segment, but your focus is in another area. Go talk to people! Reach out to colleagues who do what you want to do, and get their advice for skilling up.

Define where you need to fill in gaps

If you are looking at a technical position, you will need to show applicable cloud experience, so learn about the cloud and build a portfolio of work. Here are a few key skills we recommend everyone have to start:1

Code is non-negotiable. People who come from software development backgrounds typically find it easier to get into and maneuver through the cloud environment because of their coding experience. Automation, basic data manipulation, and scaling are daily requirements. If you don’t already know a language, learning Python is a great place to begin.
Understand Linux. You’ll need to know the Linux filesystem, basic Linux commands, and the fundamentals of containerization.
Learn core networking concepts like the IP protocol and the others that layer on top of it, DNS, and subnets.
Make sure you understand the cloud itself, and in particular the specifics of Google Cloud for a role at Google.
Familiarity with open source tooling. Terraform for automation and Kubernetes for containers are portable between clouds and are worth taking the time to learn.

Boost your targeted hands-on skills

Check out Google Cloud Skills Boost for a comprehensive collection of training to help you upskill into a cloud role, including hands-on labs that give you real-world experience in Google Cloud. New users can start off with a 30-day no-cost trial.2 Take a look at these recommendations:

No-cost labs and courses
A Tour of Google Cloud Hands-on Labs – 45 minutes
A Tour of Google Cloud Sustainability – 60 minutes
Introduction to SQL for BigQuery and Cloud SQL – 60 minutes
Infrastructure and Application Modernization with Google Cloud – Introductory course with three modules
Preparing for Google Cloud certification – Courses to help you prepare for Google Cloud certification exams

Build hands-on projects

This part is critical for the interview portion.
Take the cloud skills you have learned and create something tangible that you can use as a story during an interview. Consider building a project on GitHub so others can see it working live, and document it well. Be sure to include your decision-making process. Here is an example:

Build an API or a web application.
Develop the code for the application.
Pick the infrastructure to deploy that application in the cloud, choose your storage option, and choose a database with which it will interact.

Get valuable cloud knowledge for non-technical roles

For tech-adjacent roles, like those in business, sales, or administration, having a solid knowledge of cloud principles is critical. We recommend completing the Cloud Digital Leader training courses at no cost, or going the extra mile and taking the Google Cloud Digital Leader certification exam once you complete the training:

No-cost course
Cloud Digital Leader Learning Path – understand cloud capabilities, products, and services and how they benefit organizations

$99 registration fee
Google Cloud Digital Leader Certification – validate your cloud expertise by earning a certification

Commit to learning in the New Year

A couple of other resources we have are the Google Cloud Innovators Program, which will help you grow on Google Cloud and connect you with other community members. There is no cost to join, and it gives you access to resources to build your skills and be part of the future of cloud! Join today.

Start your new year strong, whether you are exploring Google Cloud Data, DevOps, or Networking certifications, by completing Arcade games each week. This January, play to win in The Arcade while you learn new skills and earn prizes on Google Cloud Skills Boost. Each week we will feature a new game to help you show and grow your cloud skills, while sampling certification-based learning paths.

Make 2023 the year to build your cloud career and commit to learning all year with our $299/year annual subscription. The subscription includes $500 of Google Cloud credits (and a bonus $500 of Google Cloud credits after you successfully certify), a $200 certification voucher, and a $299 annual subscription to Google Cloud Skills Boost with access to the entire training catalog, live-learning events, and quarterly technical briefings with executives.

1. Starting your career in cloud from IT – Forrest Brazeal, Head of Content Marketing, Google Cloud
2. Credit card required to activate a 30-day no-cost trial for new users.
Source: Google Cloud Platform

Optimize and scale your startup – A look into the Build Series

At Google Cloud, we want to provide you with access to all the tools you need to grow your business. Through the Google Cloud Technical Guides for Startups, you can leverage industry-leading solutions with how-to video guides and resources curated for startups. This multi-part series contains three chapters, Start, Build, and Grow, which match your startup’s journey:

The Start Series: Begin by building, deploying, and managing new applications on Google Cloud from start to finish.
The Build Series: Optimize and scale existing deployments to reach your target audiences.
The Grow Series: Grow and attain scale with deployments on Google Cloud.

Additionally, at Google we have the Google for Startups Cloud Program, which is designed to help your business get off the ground and enable a sustainable growth plan for the future. The start of the Build Series explains the benefits of the program, the application process, and more to help your business get started on Google Cloud.

A quick recap of the Build Series

Once you have applied for the Google for Startups Cloud Program, there’s so much to explore and try out on Google Cloud. Figuring out a rapid but solid application development process can be key for many businesses in reducing time to market, and deciding which database to use for application data can be tricky. Dive into our Firestore video, which walks through how Firestore can help you unlock application innovation with simplicity and speed.

We then move on to a deep dive into BigQuery and how it can help businesses. BigQuery is designed to support analysis over petabytes of data, regardless of whether it’s structured or unstructured. This video is the go-to video for getting started on BigQuery!

If you are looking to run your Spark and Hadoop jobs faster and on the cloud, look to Dataproc. To learn more about Dataproc and how it has helped other customers with their Hadoop clusters, click the video below to learn all things Dataproc.

Next, we find out what Dataflow can bring to your business: some advantages, sample architectures, demos on the console, and how other customers are using Dataflow. We also talked about machine learning, from selecting the right ML solution, to machine learning APIs on cloud, to exploring Vertex AI. Following that, we look into API management in Google Cloud and how Apigee helps you operate your APIs with enhanced scale, security, and automation.

We ended the series with the last two episodes, focusing on a security deep dive and on using Cloud Tasks and Cloud Scheduler.

Coming up next – The Grow Series

Dive into the next chapter of this multi-part series with our upcoming Grow Series, where we will focus on growing and attaining scale with deployments on Google Cloud. Check out our website and join us by watching the video series on the Google Cloud Tech channel, and subscribe to stay up to date. See you in the cloud!
Source: Google Cloud Platform

New control plane connectivity and isolation options for your GKE clusters

Once upon a time, all Google Kubernetes Engine (GKE) clusters used public IP addressing for communication between nodes and the control plane. Then we heard your security concerns and introduced private clusters, enabled by VPC peering. To consolidate the connectivity types, starting in March 2022 we began using Google Cloud’s Private Service Connect (PSC) for new public clusters’ communication between the GKE cluster control plane and nodes, which has profound implications for how you can configure your GKE environment. Today, we’re presenting a new, consistent PSC-based framework for GKE control plane connectivity from cluster nodes. Additionally, we’re excited to announce a new feature set that includes cluster isolation at the control plane and node pool levels, enabling more scalable, more secure, and cheaper GKE clusters.

New architecture

Starting with GKE version 1.23, all new public clusters created on or after March 15th, 2022 began using Google Cloud’s PSC infrastructure to communicate between the GKE cluster control plane and nodes. PSC provides a consistent framework that helps connect different networks through a service networking approach, and allows service producers and consumers to communicate using private IP addresses internal to a VPC. The biggest benefit of this change is that it sets the stage for using PSC-enabled features with GKE clusters.

Figure 1: Simplified diagram of PSC-based architecture for GKE clusters

The new set of cluster isolation capabilities we’re presenting here is part of the evolution to a more scalable and secure GKE cluster posture. Previously, private GKE clusters were enabled with VPC peering, which imposed specific network architectures. With this feature set, you now have the ability to:

Update the GKE cluster control plane to allow access only through a private endpoint
Create or update a GKE cluster node pool with public or private nodes
Enable or disable GKE cluster control plane access from Google-owned IPs

In addition, the new PSC infrastructure can provide cost savings. Traditionally, control plane communication is treated as normal egress and is charged for public clusters as a normal public IP charge. This is also true if you’re running kubectl for provisioning or other operational reasons. With the PSC infrastructure, we have eliminated the cost of communication between the control plane and your cluster nodes, resulting in one less network egress charge to worry about.

Now, let’s take a look at how this feature set enables these new capabilities.

Allow access to the control plane only via a private endpoint

Private cluster users have long had the ability to create the control plane with both public and private endpoints. We now extend the same flexibility to public GKE clusters based on PSC. With this, if you want private-only access to your GKE control plane but want all your node pools to be public, you can do so. This model provides a tighter security posture for the control plane, while leaving you free to choose the kind of cluster nodes you need, based on your deployment.
To enable access only to a private endpoint on the control plane, use the following gcloud command:

gcloud container clusters update CLUSTER_NAME \
    --enable-private-endpoint

Allow toggling and mixed-mode clusters with public and private node pools

All cloud providers with managed Kubernetes offerings offer both public and private clusters. Whether a cluster is public or private is traditionally enforced at the cluster level and cannot be changed once the cluster is created. Now you have the ability to toggle a node pool between private and public IP addressing. You may also want a mix of private and public node pools. For example, you may be running a mix of workloads in your cluster in which some require internet access and some don’t. Instead of setting up NAT rules, you can deploy a workload on a node pool with public IP addressing to ensure that only those node pool deployments are publicly accessible.

To enable private-only IP addressing on existing node pools, use the following gcloud command:

gcloud container node-pools update POOL_NAME \
    --cluster CLUSTER_NAME \
    --enable-private-nodes

To enable private-only IP addressing at node pool creation time, use the following gcloud command:

gcloud container node-pools create POOL_NAME \
    --cluster CLUSTER_NAME \
    --enable-private-nodes

Configure access from Google Cloud

In some scenarios, users have identified that workloads outside of their GKE cluster, for example applications running in Cloud Run or GCP VMs with Google Cloud public IPs, were allowed to reach the cluster control plane. To mitigate potential security concerns, we have introduced a feature that allows you to toggle access to your cluster control plane from such sources.

To remove access to the control plane from Google Cloud public IPs, use the following gcloud command:

gcloud container clusters update CLUSTER_NAME \
    --no-enable-google-cloud-access

Similarly, you can use this flag at cluster creation time.

Choose your private endpoint address

Many customers like to map IPs to a stack for easier troubleshooting and to track usage. For example: IP block x for infrastructure, IP block y for services, IP block z for the GKE control plane, and so on. By default, the private IP address for the control plane in PSC-based GKE clusters comes from the node subnet. However, some customers treat node subnets as infrastructure and apply security policies against them. To differentiate between infrastructure and the GKE control plane, you can now create a new custom subnet and assign it to your cluster control plane:

gcloud container clusters create CLUSTER_NAME \
    --private-endpoint-subnetwork=SUBNET_NAME

What can you do with this new GKE architecture?

With this new set of features, you can basically remove all public IP communication for your GKE clusters!
This, in essence, means you can make your GKE clusters completely private. You currently need to create the cluster as public to ensure that it uses PSC, but you can then update your cluster using gcloud with the --enable-private-endpoint flag, or the UI, to allow access only via a private endpoint on the control plane, or to create new private node pools. Alternatively, you can control access at cluster creation time with the --master-authorized-networks and --no-enable-google-cloud-access flags to prevent access to the control plane from public addresses.

Furthermore, you can use the REST API or the Terraform provider to build a new PSC-based GKE cluster whose default (that is, first) node pool has private nodes. This can be done by setting the enablePrivateNodes field to true (instead of leveraging the public GKE cluster defaults and then updating afterwards, as currently required with gcloud and UI operations). Lastly, the aforementioned features extend not only to GKE Standard clusters, but also to GKE Autopilot clusters.

When evaluating whether you’re ready to move to these PSC-based GKE cluster types to take advantage of private cluster isolation, keep in mind that the control plane’s private endpoint has the following limitations:

Private addresses in URLs for new or existing webhooks that you configure are not supported. To mitigate this incompatibility and assign an internal IP address to the URL for webhooks, set up a webhook to a private address by URL, and create a headless service without a selector and a corresponding endpoint for the required destination.
The control plane private endpoint is not currently accessible from on-premises systems.
The control plane private endpoint is not currently globally accessible: client VMs in regions other than the cluster region cannot connect to the control plane’s private endpoint.
All public clusters on version 1.25 and later that are not yet PSC-based are currently being migrated to the new PSC infrastructure; therefore, your clusters might already be using PSC to communicate with the control plane.

To learn more about GKE clusters with PSC-based control plane communication, check out these references:

GKE concept page for public clusters with PSC
How-to: Change cluster isolation page
How-to: GKE node pool creation page with the isolation feature flag
How-to: Schedule Pods on GKE Autopilot private nodes
gcloud reference to create a cluster with a custom private subnet
Terraform Providers Google: release v4.45.0 page
Google Cloud Private Service Connect page

Here are the more specific features in the latest Terraform provider, handy to integrate into your automation pipeline:

Terraform Providers Google: release v4.45.0
gcp_public_cidrs_access_enabled
enable_private_endpoint
private_endpoint_subnetwork
enable_private_nodes
Source: Google Cloud Platform

Document AI adds three new capabilities to its OCR engine

Documents are an indispensable part of our professional and personal lives. They give us crucial insights that help us become more efficient, organize and optimize information, and even stay competitive. But as documents become increasingly complex, and as the variety of document types continues to expand, it has become increasingly challenging for people and businesses to sift through the ocean of bits and bytes to extract actionable insights.

This is where Google Cloud's Document AI comes in. It is a unified, AI-powered suite for understanding and organizing documents. Document AI consists of Document AI Workbench (a state-of-the-art custom ML platform), Document AI Warehouse (a managed service with document storage and analytics capabilities), and a rich set of pre-trained document processors. Underpinning these services is the ability to extract text accurately from many types of documents with a world-class Document Optical Character Recognition (OCR) engine. Google Cloud's Document AI OCR takes an unstructured document as input and extracts text and layout (for example, paragraphs and lines) from the document. Covering over 200 languages, Document AI OCR is powered by state-of-the-art machine learning models developed by the Google Cloud and Google Research teams.

Today, we are pleased to announce three new OCR features in Public Preview that can further enhance your document processing workflows.

1. Assess page-level quality of documents with Intelligent Document Quality (IDQ)
With Document AI OCR, Google Cloud customers and partners can programmatically extract key document characteristics, such as word frequency distributions, relative positioning of line items, and the dominant language of the input document, as critical inputs to their downstream business logic. Today, we are adding another important document assessment signal to this toolbox: Intelligent Document Quality (IDQ) scores. IDQ provides page-level quality metrics in the following eight dimensions:
- Blurriness
- Level of optical noise
- Darkness
- Faintness
- Presence of smaller-than-usual fonts
- Document getting cut off
- Text spans getting cut off
- Glare due to lighting conditions

Being able to discern the optical quality of documents helps assess which documents must be processed differently based on their quality, making the overall document processing pipeline more efficient. For example, Gary Lewis, Managing Director of lending and deposit solutions at Jack Henry, noted, "Google's Document AI technology, enriched with Intelligent Document Quality (IDQ) signals, will help businesses to automate the data capture of invoices and payments when sending to our factoring customers for purchasing. This creates internal efficiencies, reduces risk for the factor/lender, and gets financing into the hands of cash-constrained businesses quickly."

Overall, document quality metrics pave the way for more intelligent routing of documents for downstream analytics. The reference workflow below uses document quality scores to split and classify documents before sending them to either the pre-built Form Parser (for high-quality documents) or a Custom Document Extractor trained specifically on lower-quality datasets. A hedged example request that asks for these quality scores appears at the end of this post.

2. Process digital PDF documents with confidence with built-in digital PDF support
The PDF format is popular in business applications such as procurement (invoices, purchase orders), lending (W-2 forms, paystubs), and contracts (leasing or mortgage agreements).
PDF documents can be image-based (for example, a scanned driver's license) or digital, where you can hover over, highlight, and copy and paste embedded text in the PDF the same way you would in a Google Doc or a Microsoft Word file. We are happy to announce digital PDF support in Document AI OCR. The digital PDF feature extracts text and symbols exactly as they appear in the source documents, making our OCR engine highly performant in complex visual scenarios such as rotated text, extreme font sizes or styles, and partially hidden text.

Discussing the importance and prevalence of PDF documents in banking and finance (for example, bank statements and mortgage agreements), Ritesh Biswas, Director, Google Cloud Practice at PwC, said, "The Document AI OCR solution from Google Cloud, especially its support for digital PDF input formats, has enabled PwC to bring digital transformation to the global financial services industry."

3. "Freeze" model characteristics with OCR versioning
As a fully managed, cloud-based service, Document AI OCR regularly upgrades the underlying AI/ML models to maintain its world-class accuracy across over 200 languages and scripts. These model upgrades, while providing new features and enhancements, may occasionally change OCR behavior compared to an earlier version. Today, we are launching OCR versioning, which lets users pin to a historical OCR model behavior (a hedged example request appears at the end of this post). The "frozen" model versions, in turn, give our customers and partners peace of mind by ensuring consistent OCR behavior. For industries with rigorous compliance requirements, this update also helps maintain the same model version, minimizing the need and effort to recertify stacks between releases. According to Jaga Kathirvel, Senior Principal Architect at Mr. Cooper, "Having consistent OCR behavior is mission-critical to our business workflows. We value Google Cloud's OCR versioning capability that enables our products to pin to a specific OCR version for an extended period of time."

With OCR versioning, you have the flexibility to select the versioning option that best fits your business needs.

Getting started with Document AI OCR
Learn more about the new OCR features and tutorials in the Document AI documentation, or try it directly in your browser (no coding required). For more details on what's new with Document AI, don't forget to check out our breakout session from Google Cloud Next 2022.
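To make the OCR versioning and IDQ ideas above concrete, here is a minimal, hedged sketch of a Document AI :process request that pins a specific processor version and asks for page-level quality scores. This is illustrative only: PROJECT_ID, PROCESSOR_ID, VERSION_ID, and sample.pdf are placeholders, and the processOptions.ocrConfig.enableImageQualityScores field reflects our reading of the public Document AI API rather than anything stated in this announcement, so verify the field names against the current API reference.

    # Hedged sketch: process a PDF with a pinned OCR processor version and
    # (assumption) request IDQ image-quality scores via processOptions.
    # Placeholders: PROJECT_ID, PROCESSOR_ID, VERSION_ID, sample.pdf.
    curl -X POST \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://us-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/us/processors/PROCESSOR_ID/processorVersions/VERSION_ID:process" \
      -d '{
        "rawDocument": {
          "mimeType": "application/pdf",
          "content": "'"$(base64 -w0 sample.pdf)"'"
        },
        "processOptions": {
          "ocrConfig": { "enableImageQualityScores": true }
        }
      }'

Pinning the version in the request path is what keeps OCR behavior stable across the regular model upgrades described above; dropping the processorVersions segment and calling the processor directly would use its default version instead. (base64 -w0 is the GNU flag for unwrapped output; adjust for your platform.)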
Source: Google Cloud Platform

Google Cloud wrapped: Top 22 news stories of 2022, according to you

What a year! Over here at Google Cloud, we're winding things down, but not before taking some time to reflect on everything that happened over the past twelve months. Inspired by the custom Spotify Wrapped playlist playing in our earbuds, we pulled the data about the best-read Google Cloud news posts of the year, to better understand which stories resonated most with you. Many of your favorite stories came as no surprise, as they tracked with major news, product launches, and events. But there were some sleeper hits in there too, stories whose viral success and staying power took us a bit by surprise. We also uncovered some fascinating data about the older posts that you keep coming back to, month after month, year after year (stay tuned for more on that in 2023). So, without further ado, here are the top 22 Google Cloud news stories of 2022, according to you, our readers.

1. Here's what to know about changes to kubectl authentication coming in GKE v1.26
2. How Google Cloud blocked the largest Layer 7 DDoS attack at 46 million rps
3. Protecting customers against cryptomining threats with VM Threat Detection in Security Command Center
4. Introducing the next evolution of Looker, your unified business intelligence platform
5. Even more pi in the sky: Calculating 100 trillion digits of pi on Google Cloud
6. Introducing AlloyDB for PostgreSQL: Free yourself from expensive, legacy databases
7. Introducing Blockchain Node Engine: fully managed node-hosting for Web3 development
8. Introducing Google Public Sector
9. Google + Mandiant: Transforming Security Operations and Incident Response
10. Raising the bar in Security Operations: Google Acquires Siemplify
11. The L'Oréal Beauty Tech Data Platform – A data story of terabytes and serverless
12. Build a data mesh on Google Cloud with Dataplex, now generally available
13. Google Cloud launches new dedicated Digital Assets Team
14. Contact Center AI reimagines the customer experience through full end-to-end platform
15. Unveiling the 2021 Google Cloud Partner of the Year Award Winners
16. Automate Public Certificates Lifecycle Management via RFC 8555 (ACME)
17. AlloyDB for PostgreSQL under the hood: Intelligent, database-aware storage
18. Bringing together the best of both sides of BI with Looker and Data Studio
19. Supercharge your event-driven architecture with new Cloud Functions (2nd gen)
20. Announcing the 2022 Accelerate State of DevOps Report: A deep dive into security
21. Making Cobalt Strike harder for threat actors to abuse
22. Securing tomorrow today: Why Google now protects its internal communications from quantum threats

Recognize any of your favorites? We thought you might. See anything you missed? Now's your chance to catch up. Let's take a deeper look at these top posts as they landed throughout the year.

January
Raising the bar in Security Operations: Google Acquires Siemplify (#10)
We set off some new year's fireworks by acquiring security operations specialist Siemplify, combining their proven security orchestration, automation and response technology with our Chronicle security analytics to build a next-generation security operations workflow.

Google Cloud launches new dedicated Digital Assets Team (#13)
News flash: blockchain technology has huge potential.
So it was no big surprise that readers responded with gusto to the news of Google Cloud's new Digital Assets Team, whose charter is to support customers' needs in building, transacting, storing value, and deploying new products on blockchain-based platforms.

February
Protecting customers against cryptomining threats with VM Threat Detection in Security Command Center (#3)
Who wants their VMs to be hijacked by hackers mining crypto? No one. To help, we added a new layer of threat detection to our Security Command Center that can help detect threats such as cryptomining malware inside virtual machines running on Google Cloud.

Here's what to know about changes to kubectl authentication coming in GKE v1.26 (#1)
The open-source Kubernetes community made a big move when it decided to require that all provider-specific code that currently exists in the OSS code base be removed (starting with v1.26). We responded with a blockbuster post (the #1 post of the year, in terms of readership) that outlines how this move impacts the client side.

Supercharge your event-driven architecture with new Cloud Functions (2nd gen) (#19)
Developers eyeing serverless platforms responded with enthusiasm to news of our next-generation Functions-as-a-Service product, which offers more powerful infrastructure, advanced control over performance and scalability, more control around the functions runtime, and support for triggers from over 90 event sources.

Build a data mesh on Google Cloud with Dataplex, now generally available (#12)
Building a data mesh is hard to do. But doing so lets data teams centrally manage, monitor, and govern their data across all manner of data lakes, data warehouses, and data marts, so they can make the data available to various analytics and data science tools. With Dataplex, data teams got a new way to do just that.

March
The L'Oréal Beauty Tech Data Platform – A data story of terabytes and serverless (#11)
Serverless, event-driven architecture, cross-cloud analytics… This customer story from L'Oréal about how it built its Beauty Tech Data Platform had it all.

Contact Center AI reimagines the customer experience through full end-to-end platform (#14)
Customers rely on contact centers for help when they encounter urgent problems with a product or service, but contact centers often struggle to provide timely help. To bridge this gap with the power of AI, Google Cloud built Contact Center AI (CCAI) to streamline and shorten this time to value. CCAI Platform, the addition announced here, expanded this effort by introducing end-to-end call center capabilities.

Automate Public Certificates Lifecycle Management via RFC 8555 (ACME) (#16)
With this announcement, Google Cloud customers were able to acquire public certificates for their workloads that terminate TLS directly, or for their cross-cloud and on-premises workloads, using the Automatic Certificate Management Environment (ACME) protocol. This is the same standard used by certificate authorities to enable automatic lifecycle management of TLS certificates.

April
Bringing together the best of both sides of BI with Looker and Data Studio (#18)
When Google Cloud acquired Looker in 2020 for its business intelligence and analytics platform, inquiring minds instantly began asking what would become of Data Studio, Google's existing self-serve BI solution.
This blog began to answer that question.

May
Introducing AlloyDB for PostgreSQL: Free yourself from expensive, legacy databases (#6)
Live from Shoreline at Google I/O, we made one of our largest product announcements of the year, launching a PostgreSQL database that can handle both transactional and analytical workloads, without sacrificing performance.

AlloyDB for PostgreSQL under the hood: Intelligent, database-aware storage (#17)
Readers couldn't get enough of AlloyDB, piling on to learn about the inner workings of its database-aware storage (not to mention its columnar engine).

June/July
Even more pi in the sky: Calculating 100 trillion digits of pi on Google Cloud (#5)
A follow-up to a reader favorite from 2019, we broke the record (again) by calculating the most digits of pi, leaning into significant advancements in Google Cloud compute, networking, and storage.

Unveiling the 2021 Google Cloud Partner of the Year Award Winners (#15)
Who consistently demonstrates a creative spirit, collaborative drive, and a customer-first approach? Google Cloud partners, of course! With this blog, we were proud to recognize you and to call you our partners!

Introducing Google Public Sector (#8)
The U.S. government had been asking for more choice in cloud vendors who could support its missions and protect the health, safety, and security of its citizens. With the announcement of Google Public Sector, a subsidiary of Google LLC that will bring Google Cloud and Google Workspace technologies to U.S. public sector customers, we delivered.

August
How Google Cloud blocked the largest Layer 7 DDoS attack at 46 million rps (#2)
Distributed denial-of-service (DDoS) attacks have been increasing in frequency and growing in size exponentially. In this post, we described how Cloud Armor protected one Google Cloud customer from the largest DDoS attack ever recorded, an attack so large that it was like receiving all of the requests Wikipedia gets in a day in just 10 seconds.

September
Google + Mandiant: Transforming Security Operations and Incident Response (#9)
Here, we took a moment to reflect on the completion of our acquisition of threat intelligence firm Mandiant. Bringing Mandiant into the Google Cloud fold will allow us to deliver a security operations suite that helps enterprises globally stay protected at every stage of the security lifecycle, and to focus on eliminating entire classes of threats.

Announcing the 2022 Accelerate State of DevOps Report: A deep dive into security (#20)
For eight years now, DevOps professionals have pored over the results of DORA's annual Accelerate State of DevOps Report. This year's installment focused on the relationship between security and DevOps, using the Supply-chain Levels for Software Artifacts (SLSA) and NIST Secure Software Development frameworks.

October
Introducing the next evolution of Looker, your unified business intelligence platform (#4)
In April, we began to lay out our strategy for Looker and Data Studio. At Google Cloud Next '22, we took the next step, consolidating the two under the Looker brand umbrella and adding important new capabilities.

Introducing Blockchain Node Engine: fully managed node-hosting for Web3 development (#7)
Remember how in January we said that blockchain has a lot of potential? About that. News of the fully managed Blockchain Node Engine node-hosting service took readers by storm, catapulting it into the top ten of 2022 with just over two months left in the year.
November/December
Making Cobalt Strike harder for threat actors to abuse (#21)
Cobalt Strike is a very popular red team software tool, but older, cracked versions are often used by malicious hackers to spread malware. We made available to the security community a set of open-source YARA rules that can be deployed to help stop the illicit use of Cobalt Strike.

Securing tomorrow today: Why Google now protects its internal communications from quantum threats (#22)
Google and Google Cloud have taken steps to harden the cryptographic algorithms used to protect our internal communications against quantum computing threats. We explain here why we did it, and what challenges we face in achieving this type of future-proofing.

That's a wrap!
Barring any last-minute surprises, we're pretty confident that what we have here is the definitive list of your favorite news stories of 2022. You've got great taste. We can't wait to see what stories inspire you in the new year. Happy holidays, and thanks for reading!
Source: Google Cloud Platform