Assembling and managing distributed applications using Google Cloud Networking solutions

In the cloud era, modern applications are assembled from services running across different environments. The benefit of this approach is that organizations can choose the services that best serve their needs when building applications. But assembling applications from disparate component services also brings complexity, including:

- How to connect services together in a reliable and secure manner
- How to efficiently manage traffic in a consistent way across distributed services
- How to define clear boundaries between the teams that build services and the teams that consume them

As we discussed at the Google Cloud Networking Spotlight, Next-generation application delivery with Google Cloud, we recently introduced solutions to help you reduce the complexity of assembling and managing distributed applications. They include three core networking solutions that let you more efficiently orchestrate services into cohesive applications:

- New Cloud Load Balancers based on the open source Envoy Proxy. These Load Balancers give you common traffic management capabilities whether you use our fully managed Load Balancers or xDS-based solutions such as the Envoy Proxy. Ultimately, they allow common traffic management capabilities to be applied to services running across different environments.
- Hybrid Load Balancing and Hybrid Connectivity solutions that connect services across hybrid network environments, so that services work together no matter which environment they reside in: Google Cloud, multicloud environments, or on-premises.
- Private Service Connect, which allows you to more seamlessly connect services across different networks. This solution also clearly separates the organizations that develop and maintain services (service producers) from the organizations that use them (service consumers).

The Google Cloud networking stack

fig 1. Overview of core Google Cloud network products

To put these solutions into context, let's first review a high-level overview of core Google Cloud network products. At the foundation of the Google Cloud product stack are connectivity solutions such as Network Connectivity Center, which brings physical Interconnects and VPNs for secure and reliable connectivity to on-premises and multicloud deployments together into a single coherent connectivity layer.

The next layer consists of cloud infrastructure tools that secure your network perimeter, allowing you to make enterprise-wide guarantees about what data can get in and out of your network. Layered on top of that, service networking products let your developers think in services: instead of worrying about lower-level network constructs like IPs and ports, these tools let developers think in terms of assembling reusable services into business applications. At the top of the stack are application delivery solutions, allowing you to deliver applications at massive scale. These include the Cloud Load Balancers, Cloud CDN, and Cloud Armor products.
And wrapped around it all is Network Intelligence Center, a single pane of glass for what's happening with your network. Together, these solutions enable three primary Google Cloud Networking capabilities:

- Universal advanced traffic management with Cloud Load Balancing and the Envoy Proxy
- Connecting services across multicloud and on-premises hybrid network deployments
- Simplifying and securing service connectivity with Private Service Connect

For the remainder of this blog we will explore these solutions in more detail, and how they work together to give your users the best experience consuming your distributed applications, wherever they are in the world.

Advanced traffic management with Cloud Load Balancing and Envoy Proxy

We are excited to introduce our next generation of Google Cloud Load Balancers. They include a new version of our Global External HTTPS Load Balancer and a new Regional External HTTPS Load Balancer. These new load balancers join our existing Internal HTTPS Load Balancer and represent the next generation of our load balancer capabilities. These new Cloud Load Balancers use the Envoy Proxy, a Cloud Native Computing Foundation (CNCF) open source project, providing a consistent data plane and feature set that supports advanced traffic management.

fig 2. Overview of the next generation Google Cloud Load Balancers

Our next-generation Cloud Load Balancers provide new traffic management capabilities such as advanced routing and traffic policies so you can steer traffic with the flexibility required for complex workloads. A few examples of the advanced traffic management capabilities include:

- Request mirroring for use cases such as out-of-path feature validation
- Weighted traffic splitting for use cases such as canary testing
- Fault injection to enable reliability validation such as chaos testing
- New load balancing algorithms and session-state affinity options

And because our next-generation Load Balancers are based on open source technology, they can be used to modernize and efficiently manage services across distributed environments. For example, you can use our Cloud Load Balancers in conjunction with open source Envoy sidecar proxies running in a multicloud or on-premises environment to create a universal traffic control and data-plane solution across heterogeneous architectures. You can optionally use Traffic Director, a fully managed control plane for service mesh architectures, to more efficiently manage traffic across xDS-compatible proxies such as Envoy.

For an example of how to use this universal traffic management architecture across distributed applications, imagine you want to roll out a new service that is used in a distributed system, for example, a shopping cart service used in multiple commerce applications. To properly canary-test the rollout, you can use the weighted backend service capability built into Cloud Load Balancers and into Envoy sidecar proxies managed by Traffic Director. By incrementally varying the weights, you can safely deploy a new feature or version of a service across distributed applications in a coordinated and consistent manner, enabling uniform canary testing of a new service across your full architecture.

fig 3. Canary testing across distributed applications

Here are more resources for learning about advanced traffic management on Google Cloud:

- Overview of Google Cloud load balancers
- Advanced traffic management overview for global external HTTP(S) load balancers
- External HTTPS LB with Advanced Traffic Management (Envoy) codelab

Solutions for managing hybrid architectures, multicloud and on-premises deployments

Consider when you have distributed applications that run on-premises, in Google Cloud, or with other cloud or software-as-a-service (SaaS) providers. Hybrid Load Balancing and Hybrid Connectivity let you bring the distributed pieces together. They help you take a pragmatic approach to adapt to changing market demands and incrementally modernize applications, leveraging the best service available and ultimately providing flexibility to adapt to changing business demands. Hybrid Load Balancing intelligently manages and distributes traffic across a variety of distributed application use cases.

fig 4. Hybrid Load Balancing and Hybrid Connectivity use cases

Google Cloud Hybrid Load Balancing and Hybrid Connectivity solutions include components designed to securely and reliably connect services and applications across different networks. These connectivity options include private Interconnects (Partner and Direct), VPN, or even the public internet, so you can use both private and public connectivity to assemble application services. And our Cloud Load Balancers can manage traffic regardless of where the backend services reside.

fig 5. Hybrid Load Balancing deployments

Hybrid Load Balancing and Connectivity can be combined with our next generation of Google Cloud Load Balancers to provide advanced traffic management across Google Cloud, on-premises, and multicloud distributed application deployments. Check out these resources for more on managing hybrid, multicloud and on-premises architectures:

- Hybrid Load Balancing overview
- External HTTP(S) hybrid load balancer codelab

Simplifying and securing connectivity with Private Service Connect

Services that are used across distributed applications are often authored and maintained by one team (service producers) and used by other teams building individual applications (service consumers). This approach provides significant benefits when assembling distributed applications, as it enables service reuse and separation of roles for the teams building and using services. However, there are also complexities in connecting and managing these services across environments.

Private Service Connect provides a network-agnostic connectivity layer and a built-in service ownership structure to efficiently reuse services across distributed applications. It provides a method of connecting two cloud networks together without peering and without sharing IP address space: a seamless way of communicating with services within Google Cloud or across on-premises or multicloud deployments.

Private Service Connect provides you with a private IP address from inside your VPC. You can use it to privately access Google services such as Cloud Storage or BigQuery, third-party SaaS services such as MongoDB or Snowflake, or even your own services that may be deployed across different VPCs within your organization.

fig 6. Private Service Connect overview

Private Service Connect also lets you separate the concerns of consumers (the teams that initiate a connection to a service) from the producers (the teams offering a service to be connected to).
Because these roles are built into Private Service Connect, you don't have to go through the toil of defining your own organizational structure; you can simply leverage the identity and access permissions already available to you on Google Cloud. Here are more resources on Private Service Connect:

- Private Service Connect overview
- Private Service Connect codelab

Conclusion

We hope the solutions presented here give engineers and cloud architects the tools to build robust distributed applications in hybrid and multicloud environments at scale, allowing you to think less about the details of your network and more about assembling applications from services that deliver the best value to your users. To learn more about these advanced use cases, and to see how our customers use Google Cloud Networking tools in action, register for our Networking Spotlight today, May 24, or watch it on demand thereafter.
Source: Google Cloud Platform

Running a virtual event globally: how Gramercy Tech leveraged serverless technology

Editor's note: In today's guest post we hear from Gramercy Tech on their experience working with Google Cloud's serverless technologies as both a customer and vendor. Large events are always a lot of work. By leveraging the pre-provisioned infrastructure of Cloud Run in multiple regions and global load balancing, the team could focus on content and event experiences.

As the world gradually emerges from the Covid era, the lessons learned from fully online virtual events will have lasting effects on the way events are managed and attended. When Google Cloud approached Gramercy Tech for help running a global internal sales event, we took the opportunity to look at how to enhance the capabilities of our Eventfinity platform to better take advantage of serverless technologies and multi-regional deployments, delivering a better user experience more efficiently and with less operational burden.

With a global audience, the load on an event site varies widely, both across time zones and across different parts of the event lifecycle. Registration windows may see bursts of activity at open and close, and during the event, mixtures of live and recorded content can cause periods of both low and high traffic. Participants in a global conference are all fitting this time commitment into their busy schedules, and should be able to expect a quick and responsive event site throughout the experience. To meet these goals, we were able to easily adapt our platform to the following global serverless architecture:

Deployed architecture across three regions

Using standard containers, it was easy to package and deploy our primary application services in Cloud Run. We were able to deploy to multiple regions and use Cloud Load Balancing to ensure that traffic from anywhere in the world was routed to the nearest instance. With Cloud SQL we were able to establish read replication globally. Core functions in our platform are kept speedy by using Redis and Memcached, which we deployed using Cloud Memorystore. By taking advantage of these managed services we were able to quickly get this architecture deployed and could focus on running our platform; setting up multi-region infrastructure was something we didn't realize could be so simple.

Media and content could be offloaded to Cloud CDN, letting us focus on experiences, not on moving bytes. As the application tier handles many of the direct user interactions with our platform, it sees very elastic demand. Cloud Run was a game changer for us: the speed of deploying updates across environments, as well as the automatic scaling of instances up and down, saves so much time and money.

Overall, this experience has guided our teams toward using container systems more, since so much of GCP leverages containers, whereas our past infrastructure was all bare metal servers running code directly. This push has definitely caused us to replatform our entire design using containers and serverless infrastructure wherever possible, making us faster and more stable all around. After this experience we plan to move the bulk of our systems to GCP. At Gramercy we've constantly evolved our technology to meet the needs of the time, from events going paperless pre-Covid, to fully virtual during Covid, to the new world of hybrid events.
It has been great to work with Google and Google Cloud to keep event management on the cutting edge.
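To make the "standard containers" point concrete, here is a minimal, generic sketch of a containerized Node.js service of the kind Cloud Run runs. It is illustrative only, not Eventfinity code; the only Cloud Run contract it relies on is listening on the port passed in the PORT environment variable, and the route and response text are invented:

```javascript
// A minimal HTTP service suitable for packaging in a container and
// deploying to Cloud Run. Everything here (paths, messages) is illustrative.
const express = require('express');

const app = express();

// A trivial endpoint standing in for an event-site page or API route.
app.get('/', (req, res) => {
  res.send('Event site is up!');
});

// Listen on the port Cloud Run provides (default to 8080 for local runs).
const port = process.env.PORT || 8080;
app.listen(port, () => {
  console.log(`Listening on port ${port}`);
});
```

Packaged into a container image, the same service can be deployed unchanged to multiple regions and fronted by a global load balancer, which is essentially the shape of the architecture described above.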
Source: Google Cloud Platform

Get value from data quickly with Informatica Data Loader for BigQuery

If data is currency in today's digital environment, then organizations should waste no time in making sure every business user has fast access to data-driven insights. Informatica and Google Cloud are working together to make it happen. We're excited to share that Informatica will provide a free service on Google Cloud called Informatica Data Loader for Google BigQuery, which accelerates data uploads and keeps data flowing so that people can get to the insights and answers they need faster. The company made the announcement at Informatica World on May 24, 2022, describing Informatica Data Loader as a tool to mitigate lengthy data upload times and associated high costs, challenges that are only growing as organizations ingest more data from more sources.

Maintaining a healthy data pipeline from multiple platforms, applications, services, and other sources requires more work as the number of sources grows. But with Informatica Data Loader, companies can quickly ingest data for free from over 30 common sources into their Google BigQuery cloud data warehouse while Informatica technology automates pipeline ingestion on the back end. This shortens time-to-value for data projects from what could be weeks or months to just minutes, and it frees people up for more strategic data work. Informatica Data Loader empowers Google Cloud customers to:

- Centralize disparate data sources in BigQuery for better visibility into data resources and faster delivery to whoever needs the data
- Quickly load data into BigQuery in only three steps, with zero setup, zero code, and zero cost
- Operationalize data pipelines with the power, performance, and scale of Informatica's Intelligent Data Management Cloud (IDMC) at no cost
- Reduce maintenance resource requirements by eliminating the need to fix broken pipelines and keep up with changing APIs
- Allow non-technical users across the organization to easily access, manage, and analyze data

Informatica partnership streamlines data transformation

This isn't the first time Google Cloud has partnered with Informatica to help customers get the most value from their data. Google Cloud-validated connectors from Informatica help customers streamline data transformations and quickly move data from any SaaS application, on-premises database, or big data source into Google BigQuery. Our partnership has helped hundreds of customers on Google Cloud.

"Data is fundamental to digital transformation, and we partner closely with Informatica to make it very easy for enterprises to bring their data from across platforms and environments into the cloud," said Gerrit Kazmaier, VP and GM of Databases, Data Analytics, and Business Intelligence at Google Cloud. "The launch of Informatica Data Loader will further simplify the path for customers to bring data into BigQuery for analysis, and accelerate their data-driven business transformations."

According to Informatica, Data Loader is the industry's first zero-cost, zero-code, zero-DevOps, zero-infrastructure-required cloud data management SaaS offering for departmental users. Google Cloud customers can access Informatica Data Loader directly from the Google BigQuery console and ingest data from dozens of common sources, including MongoDB, ServiceNow, Oracle, SQL Server, NetSuite, Microsoft SharePoint, and more. The Informatica IDMC solution is available in the Google Cloud Marketplace, but Informatica is making Informatica Data Loader available to all Google BigQuery customers, whether they use IDMC or not.
Informatica Data Loader shares a common AI-powered metadata intelligence and automation layer with IDMC, but companies can subscribe to each use case individually. "Expanding our strategic partnership with Google Cloud beyond enterprise cloud data management to offer free, fast, and frictionless data loading to all Google customers represents a new chapter in our partnership and brings the power of IDMC to everyone," said Jitesh Ghai, Informatica's Chief Product Officer. "With the launch of Informatica Data Loader for Google BigQuery, we are enabling every organization to put the power of their data in the hands of business users so they can move from data ingestion to insights at a speed never before possible."

Learn more about the Informatica Data Loader for Google BigQuery here.
Source: Google Cloud Platform

Notified team gets smart on MLOps through Advanced Solutions Lab for Machine Learning

Editor's note: Notified, one of the world's largest newswire distribution networks, launched a public relations workbench that uses artificial intelligence to help customers pinpoint relevant journalists and expand media coverage. Here's how they worked with Google Cloud and the Advanced Solutions Lab to train their team on Machine Learning Operations (MLOps).

At Notified, we provide a global newswire service for customers to share their press releases and increase media exposure. Our customers can also search our database of journalists and influencers to discover writers who are likely to write relevant stories about their business. To enhance our offering, we wanted to use artificial intelligence (AI) and natural language processing (NLP) to uncover new journalists, articles, and topics, ultimately helping our customers widen their outreach.

While our team has expertise in data engineering, product development, and software engineering, this was the first time we deployed an NLP API to be applied to other products. The deployment was new territory, so we needed a solid handle on MLOps to ensure a super responsive experience for our customers. That meant nailing down the whole process: from ingesting data, to building machine learning (ML) pipelines, to finally deploying an API so our product team could connect their continuous integration/continuous delivery (CI/CD) pipelines.

First, I asked around to see how other companies solved this MLOps learning gap. But even at digital-first organizations, the problem hadn't been addressed in a unified fashion. They may have used tools to support their MLOps, but I couldn't find a program that trained data scientists and data engineers on the deployment process.

Teaming up with Google Cloud to tailor an MLOps curriculum

Seeing that disconnect, I envisioned a one-week MLOps hackathon to ramp up my team. I reached out to Google Cloud to see if we could collaborate on an immersive MLOps training. Knowing Google's record as an AI pioneer, I was confident the Advanced Solutions Lab (ASL) would have ML engineers who could coach my team to help us build amazing NLP APIs. ASL already had a fully built, deep-dive curriculum on MLOps, so we worked together to tailor our courses and feature a real-world business scenario that would provide my team with the insights they needed for their jobs. That final step of utilization, including deployment and monitoring, was crucial. I didn't want to just build a predictive model that no one could use.

ASL really understood my vision for the hackathon and the outcomes I wanted for my team. They never said it couldn't be done; we collaborated on a way to build on the existing curriculum, add a pre-training component, and complete it with a hackathon. The process was really smooth because ASL had the MLOps expertise I needed, they understood what I wanted, and they knew the constraints of the format. They were able to flag areas that were likely too intensive for a one-week course, and quickly provided design modules we hadn't thought to cover. They really were a true part of our team. In the end, just four months after our initial conversation, we launched our five-week MLOps program. The end product went far beyond my initial hackathon vision to deliver exactly what I wanted, and more.

Starting off with the basics: Pre-work

There was so much we wanted to cover in this curriculum that it made sense to have a prerequisite learning plan ahead of our MLOps deep-dive training with the ASL team.
Through a two-week module, we focused on the basics of data engineering pipelines and ramped up on Kubeflow (an ML toolkit for Kubernetes), as well as NLP and BigQuery, a highly scalable data warehouse on Google Cloud.

Getting back in the classroom: MLOps training

After the prerequisite learning was completed, we transitioned into five days of live, virtual training on advanced MLOps with the ASL team. This was a super loaded program, but the instructors were amazing. For this component, we needed to center on real-world use cases that could connect back to our newswire service, making the learning outcomes actionable for our team. We wanted to be extremely mindful of data governance and security, so we designed a customized lab based on public datasets.

Taking a breather and asking questions: Office hours

After nearly three weeks, our team members needed a few days off to absorb all the new information and process everything they had learned. There was a risk of going into the hackathon burnt out. Office hours solved that. We gave everyone three days to review what they had learned and get into the right headspace to ace the hackathon.

Diving in: Hackathon and deployment

Finally, the hackathon was a chance for our team to implement what they had learned, drill down on our use cases, and actually build a proof of concept or, in the best case, a working model. Our data scientists built an entity extraction API and a topics API using Natural Language AI to target articles housed in our BigQuery environment. On the data engineering side, we built a pipeline by loading data into BigQuery. We also developed a dashboard that tracks pipeline performance metrics such as records processed and key attribute counts.

For our DevOps genius, Donovan Orn, the hackathon was where everything started to click. "After the intensive, instructor-led training, I understood the different stages of MLOps and continuous training, and was ready to start implementing," Orn said. "The hackathon made a huge difference in my ability to implement MLOps and gave me the opportunity to build a proof of concept. ASL was totally on point with their instruction and, since the training, my team has put a hackathon project into production."

Informing OSU curriculum with a new approach to teaching MLOps

The program was such a success that I plan to use the same framework to shape the MLOps curriculum at Oklahoma State University (OSU), where I'm a corporate advisory board member. The format we developed with ASL will inform the way we teach MLOps to students so they can learn the MLOps interactions between data scientists and data engineers that many organizations rely on today. Our OSU students will practice MLOps through real-world scenarios so they can solve actual business problems. And the best part is ASL will lead a tech talk on Vertex AI to help our students put it into practice.

Turning our hackathon exercise into a customer-ready service

In the end, both my team and Notified customers have benefited from this curriculum. Not only did the team improve their MLOps skills, but they also created two APIs that have already gone into production and significantly augmented the offering we deliver to customers. We've doubled the number of related articles we're able to identify, and we're discovering thousands of new journalists and influencers every month. For our customers, that means they can cast a much wider net to share their stories and grow their media coverage.
Up next is our API that will pinpoint more reporters and influencers to add to our database of curated journalists.
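For readers curious about the building blocks behind this kind of work, here is a minimal sketch of calling the Cloud Natural Language API for entity extraction from Node.js. It is illustrative only, not Notified's production code, and the sample text is invented:

```javascript
// Minimal sketch of entity extraction with the Cloud Natural Language API.
// Illustrative only; the sample text is invented.
const {LanguageServiceClient} = require('@google-cloud/language');

async function extractEntities(text) {
  const client = new LanguageServiceClient();
  const [result] = await client.analyzeEntities({
    document: {content: text, type: 'PLAIN_TEXT'},
  });
  // Each entity includes a name, a type (PERSON, ORGANIZATION, ...),
  // and a salience score indicating its importance in the text.
  for (const entity of result.entities) {
    console.log(`${entity.name} (${entity.type}): salience ${entity.salience}`);
  }
}

extractEntities('Acme Corp announced a new partnership in San Francisco.');
```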
Source: Google Cloud Platform

Google Cloud Data Heroes Series: Meet Antonio, a Data Engineer from Lima, Peru

Google Cloud Data Heroes is a series where we share stories of the everyday heroes who use our data analytics tools to do incredible things. Like any good superhero tale, we explore our Google Cloud Data Heroes' origin stories, how they moved from data chaos to a data-driven environment, what projects and challenges they are overcoming now, and how they give back to the community.

In this month's edition, we're pleased to introduce Antonio! Antonio is from Lima, Peru and works as a full-time Lead Data Engineer at Intercorp Retail and a co-founder of Datapath. He's also a part-time data teacher, data writer, and all-around data enthusiast. Outside of his allegiance to data, Antonio is a big fan of the Marvel world and will take any chance to read original comic books and collect Marvel souvenirs. He's also an avid traveler and enjoys the experience of reliving family memories through travel. Antonio is proudly pictured here atop a mountain in Cayna, Peru, where all of his grandparents lived.

When were you introduced to Google Cloud and how did it impact your career?

In 2016, I applied for a Big Data diploma at the Universidad Complutense de Madrid, where I had my first experience with cloud. That diploma opened my eyes to a new world of technology and allowed me to get my first job as a Data Engineer at Banco de Crédito del Perú (BCP), the largest bank and the largest supplier of integrated financial services in Peru, and the first company in Peru using Big Data technologies. At BCP, I developed pipelines using Apache Hadoop, Apache Spark, and Apache Hive on an on-premises platform.

In 2018, while I was teaching Big Data classes at the Universidad Nacional de Ingeniería, I realized that topics like deploying a cluster on a traditional PC were difficult for my students to learn without their own hands-on experience. At the time, only Google Cloud offered free credits, which was fantastic for my students because they could start learning and using cloud tools without worrying about costs.

In 2019, I wanted a change in my career and left on-prem technologies to specialize in cloud technologies. After many hours of study and practice, I got the Associate Cloud Engineer certification at almost the same time I applied for a Data Engineer position at Intercorp, where I would need to use GCP data products. This new job pushed me to build my knowledge and skills on GCP and matched what I was looking for. Months later, I obtained the Professional Data Engineer certification. That certification, combined with good performance at work, allowed me to get a promotion to Data Architect in 2021. In 2022, I started in the role of Lead Data Engineer.

How have you given back to your community with your Google Cloud learnings?

To give back to the community, once a year I organize a day-long conference called Data Day at the Universidad Nacional Mayor de San Marcos, where I talk about data trends, give advice to college students, and call for more people to pursue careers in cloud. I encourage anyone willing to learn, and I have received positive comments from people from India and Latin America. Another way I give back is by writing articles sharing my work experiences and publishing them on sites like Towards Data Science, the Airflow Community, and the Google Cloud Community Blog.

Can you highlight one of your favorite projects you've done with GCP's data products?

At Intercorp Retail, the digital marketing team wanted to increase online sales by giving recommendations to users.
This required the Data & Analytics team to build a solution to publish product recommendations related to an item a customer is viewing on a web page. To achieve this, we built an architecture that looks like the following diagram.

We had several challenges. The first was finding a backend that supports millions of requests per month. After some research, we decided to go with Cloud Run because of the ease of development and deployment. The second decision was to define a database for the backend. Since we needed a database that responds in milliseconds, we chose Firestore (a simplified sketch of this kind of lookup appears at the end of this post).

Finally, we needed to record all the requests made to our API to identify any errors or bad responses. In this scenario, Pub/Sub and Dataflow allowed us to do it in a simple way without worrying about scaling. After two months, we were ready to see it on a real website (see below). For future technical improvements, we're considering using Apigee as our API proxy to gather all the requests and route them to the correct endpoint, and Cloud Build to automate our deployment process.

What's next for you and what do you hope people will take away from your data hero story?

Thanks to the savings that I've collected while working over the past five years, I recently bought a house in Alabama. For me, this was a big challenge because I have only lived and worked outside of the United States. In the future, I hope to combine my data knowledge with the real estate world and build a startup to facilitate the home-buying process for Latin American people.

I'll also focus on gaining more hands-on experience with data products, and on giving back to my community through articles and, soon, videos. I dream of one day presenting a successful case study of my work at a big conference like Google Cloud Next.

If you are reading this and you are interested in the world of data and cloud, you just need an internet connection and some invested effort to kickstart your career. Even if you are starting from scratch and are from a developing country like me, believe that it is possible to be successful. Enjoy the journey and you'll meet fantastic people along the way. Keep learning, just like you have to exercise to keep yourself in shape. Finally, if there is anything that I could help you with, just send me a message and I would be happy to give you any advice.

Begin your own Data Hero journey

Ready to embark on your Google Cloud data adventure? Begin your own hero's journey with GCP's recommended learning path, where you can achieve badges and certifications along the way. Join the Cloud Innovators program today to stay up to date on more data practitioner tips, tricks, and events. If you think you have a good Data Hero story worth sharing, please let us know! We'd love to feature you in our series as well.
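As promised above, here is a simplified sketch of the kind of Firestore lookup a recommendation backend like the one Antonio describes might perform. The collection and field names are hypothetical, invented purely for illustration:

```javascript
// Hypothetical recommendation lookup: given a product ID, fetch a
// precomputed list of related products from Firestore. The collection
// and field names ('recommendations', 'relatedProducts') are invented.
const {Firestore} = require('@google-cloud/firestore');

const firestore = new Firestore();

async function getRecommendations(productId) {
  const doc = await firestore
    .collection('recommendations')
    .doc(productId)
    .get();
  if (!doc.exists) {
    return []; // No precomputed recommendations for this product.
  }
  // e.g. { relatedProducts: ['sku-123', 'sku-456', ...] }
  return doc.data().relatedProducts;
}

getRecommendations('sku-001').then(items => console.log(items));
```

Because Firestore serves single-document reads in milliseconds, a handler like this sits comfortably behind a Cloud Run endpoint that must answer millions of requests per month.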
Source: Google Cloud Platform

Introducing GKE cost estimator, built right into the Google Cloud console

Have you ever wondered what it will cost to run a particular Google Kubernetes Engine (GKE) cluster? How various configurations and feature choices will affect your costs? What the potential impact of autoscaling might be on your bill?

If you've ever tried to estimate this yourself, you know it can be a puzzle, especially if you're just starting out with Kubernetes and don't have many reference points from existing infrastructure to help. Today we are launching the GKE cost estimator in Preview, seamlessly integrated into the Google Cloud console.

This is just the latest of a number of features to help you understand and optimize your GKE environment, for example GKE's built-in workload rightsizing or GKE cost optimization insights. In addition, if you use GKE Autopilot, you pay for the resources requested by your currently scheduled Pods, eliminating the need to manage the cost of nodes.

It's all part of our commitment to making Google Cloud the most cost-effective cloud: offering leading price/performance and customer-friendly licensing, of course, but also predictable, transparent pricing, so that you can feel confident about building your applications with us. Our customers are embracing these cost optimization methods, as 42% of surveyed customers report that Google Cloud saves them up to 30% over three years.

Inside the GKE cost estimator

The new GKE cost estimator is part of the GKE cluster creation flow, and surfaces a number of variables that can affect your compute running costs. See the breakdown of costs between management fees, individual node pools, licenses, and more. You can also use it to learn how enabling autoscaling mechanisms can impact your estimated expenses by changing your expected average cluster size.

While the GKE cost estimator doesn't have visibility into your entire environment (e.g., networking, logging, or certain types of discounts), we believe it still provides a helpful overall estimate and will help you understand GKE's compute cost structure. Combined with the proactive estimator for Cluster autoscaler and Node auto-provisioning, getting a sense for cost has never been easier. Simply input your desired configuration and use the provided sliders to choose the estimated average values that represent your cluster. Try it today!
Source: Google Cloud Platform

Training Deep Learning-based recommender models of 100 trillion parameters over Google Cloud

Training recommender models of 100 trillion parameters

A recommender system is an important component of Internet services today: billion-dollar businesses are directly driven by recommendation services at big tech companies. The current landscape of production recommender systems is dominated by deep learning based approaches, where an embedding layer is first adopted to map extremely large-scale ID-type features to fixed-length embedding vectors; the embeddings are then leveraged by complicated neural network architectures to generate recommendations. The continuing advancement of recommender models is often driven by increasing model sizes: models with billions of parameters have been released, and very recently even trillion-parameter models. Every jump in model capacity has brought significant improvement in quality. The era of 100 trillion parameters is just around the corner.

The scale of training tasks for recommender models creates unique challenges. There is a staggering heterogeneity in the training computation: the model's embedding layer can account for more than 99.99% of the total model size and is extremely memory-intensive, while the remaining dense neural network is increasingly computation-intensive, with more than 100 TFLOPs per training iteration. Thus, it is important to have a sophisticated mechanism that manages a cluster with heterogeneous resources for such training tasks.

Recently, Kwai Seattle AI Lab and the DS3 Lab from ETH Zurich collaborated to propose a novel system named "Persia" to tackle this problem through careful co-design of both the training algorithm and the training system. At the algorithm level, Persia adopts a hybrid training algorithm that handles the embedding layer and the dense neural network modules differently. The embedding layer is trained asynchronously to improve the throughput of training samples, while the dense neural network is trained synchronously to preserve statistical efficiency. At the system level, a wide range of system optimizations for memory management and communication reduction have been implemented to unleash the full potential of the hybrid algorithm.

Deploying large-scale training on Google Cloud

The massive scale required by Persia posed multiple challenges, from the network bandwidth required across components to the amount of RAM required to store the embeddings. Additionally, a sizable number of virtual machines needed to be deployed, automated, and orchestrated to streamline the pipeline and optimize costs. Specifically, the workload runs on the following heterogeneous resources:

- 3,000 cores of compute-intensive virtual machines
- 8 A2 virtual machines with a total of 64 NVIDIA A100 GPUs
- 30 high-memory virtual machines, each with 12 TB of RAM, totalling 360 TB

Orchestration with Kubernetes

All resources had to be launched concurrently in the same zone to minimize network latency. Google Cloud was able to provide the required capacity with very little notice. Given the bursty nature of the training, Google Kubernetes Engine (GKE) was utilized to orchestrate the deployment of the 138 VMs and software containers. Having the workload containerized also allows for portability and repeatability of the training.

The team chose to keep all embeddings in memory during the training. This requires the availability of highly specialized "Ultramem" VMs, though for a relatively short period of time.
This was critical to scale the training up to 100 trillion parameters while keeping the cost and duration of processing under control.

Results and Conclusions

With the support of the Google Cloud infrastructure, the team demonstrated Persia's scalability up to 100 trillion parameters. The hybrid distributed training algorithm introduced elaborate system relaxations for efficient utilization of heterogeneous clusters, while converging as fast as vanilla SGD. Google Cloud was essential to overcome the limitations of on-premises hardware and proved an optimal computing environment for distributed machine learning training on a massive scale. Persia has been released as an open source project on GitHub, with setup instructions for Google Cloud, so everyone in both academia and industry will find it easy to train deep learning recommender models at the 100-trillion-parameter scale.
Source: Google Cloud Platform

Built with BigQuery: Material Security’s novel approach to protecting email

Editor's note: This post is part of a series highlighting our awesome partners, and their solutions, that are Built with BigQuery.

Since the very first email was sent more than 50 years ago, the now-ubiquitous communication tool has evolved into more than just an electronic method of communication. Businesses have come to rely on it as a storage system for financial reports, legal documents, and personnel records. From daily operations to client and employee communications to the lifeblood of sales and marketing, email is still the gold standard for digital communications.

But there's a dark side to email, too: it's a common source of risk and a preferred target for cybercriminals. Many email security approaches try to make it safer by blocking malicious emails, but leave the data in those mailboxes unguarded in case of a breach.

Material Security takes a different approach. As an independent software vendor (ISV), we start with the assumption that a bad actor already has access to a mailbox, and try to reduce the severity of the breach by providing additional protections for sensitive emails. For example, Material's Leak Prevention solution finds and redacts sensitive content in email archives but allows it to be reinstated with a simple authentication step when needed. The company's other products include:

- ATO Prevention, which stops attackers from misusing password reset emails to hijack other services.
- Phishing Herd Immunity, which automates security teams' response to employee phishing reports.
- Visibility and Control, which provides risk analytics, real-time search, and other tools for security analysis and management.

Material's products can be used with any cloud email provider, and allow customers to retain control over their data with a single-tenant deployment model.

Powering data-driven SaaS apps with Google BigQuery

Email is a large unstructured dataset, and protecting it at scale requires quickly processing vast amounts of data: the perfect job for Google Cloud's BigQuery data warehouse. "BigQuery is incredibly fast and highly scalable, making it an ideal choice for a security application like Material," says Ryan Noon, CEO and co-founder of Material. "It's one of the main reasons we chose Google Cloud."

BigQuery provides a complete platform for large-scale data analysis inside Google Cloud, from simplified data ingestion, processing, and storage to powerful analytics, AI/ML, and data sharing capabilities. Together, these capabilities make BigQuery a powerful security analytics platform, enabled via Material's unique deployment model. Each customer gets their own Google Cloud project, which comes loaded with a BigQuery data warehouse full of normalized data across their entire email footprint. Security teams can query the warehouse directly to power internal investigations and build custom, real-time reporting, without the burden of building and maintaining large-scale infrastructure themselves.
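To make that concrete, here is a sketch of the kind of investigation query a security team might run from Node.js against such a warehouse. The dataset, table, and column names are hypothetical, since the post does not describe Material's actual schema:

```javascript
// Hypothetical investigation query against an email-metadata warehouse
// in BigQuery. The dataset, table, and column names are invented.
const {BigQuery} = require('@google-cloud/bigquery');

async function recentMessagesFrom(senderDomain) {
  const bigquery = new BigQuery();
  const query = `
    SELECT sender, subject, received_at
    FROM \`my_project.email_metadata.messages\`
    WHERE ENDS_WITH(sender, @domain)
    ORDER BY received_at DESC
    LIMIT 100`;
  // Named query parameters (@domain) keep the query safe from injection.
  const [rows] = await bigquery.query({
    query,
    params: {domain: senderDomain},
  });
  return rows;
}

recentMessagesFrom('@example.com').then(rows => console.table(rows));
```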
Material's solutions are resonating with a diverse range of customers, including leading organizations such as Mars, Compass, Lyft, DoorDash, and Flexport.

The Built with BigQuery advantage for ISVs

Material's story is about innovative thinking, skillful design, and strategic execution, but BigQuery is also a foundational part of the company's success. Mimicking this formula is now easier for ISVs through Built with BigQuery, which was announced at the Google Data Cloud Summit in April. Through Built with BigQuery, Google is helping tech companies like Material build innovative applications on Google's data cloud with simplified access to technology, helpful and dedicated engineering support, and joint go-to-market programs. Participating companies can:

- Get started fast with a Google-funded, pre-configured sandbox.
- Accelerate product design and architecture through access to designated experts from the ISV Center of Excellence who can provide insight into key use cases, architectural patterns, and best practices.
- Amplify success with joint marketing programs to drive awareness, generate demand, and increase adoption.

BigQuery gives ISVs the advantage of a powerful, highly scalable data warehouse that's integrated with Google Cloud's open, secure, sustainable platform. And with a huge partner ecosystem and support for multicloud, open source tools, and APIs, Google provides technology companies the portability and extensibility they need to avoid data lock-in. Click here to learn more about Built with BigQuery.
Source: Google Cloud Platform

Get more insights with the new version of the Node.js library

We're thrilled to announce the release of a new update to the Cloud Logging Library for Node.js, with the key new features of improved error handling and writing structured logs to standard output, which comes in handy if you run applications in serverless environments like Cloud Functions!

The latest v9.9.0 of the Cloud Logging Library for Node.js makes it even easier for Node.js developers to send and read logs from Google Cloud, providing real-time insight into what is happening in your application through comprehensive tools like Logs Explorer. If you are a Node.js developer working with Google Cloud, now is a great time to try out Cloud Logging.

The latest features of the Node.js library are also integrated and available in other packages which are based on the Cloud Logging Library for Node.js:

- @google-cloud/logging-winston, which integrates Cloud Logging with the Winston logging library.
- @google-cloud/logging-bunyan, which integrates Cloud Logging with the Bunyan logging library.

If you are unfamiliar with the Cloud Logging Library for Node.js, start by running the following command to add the library to your project:

```
npm install @google-cloud/logging
```

Once the library is installed, you can use it in your project. Below, I demonstrate how to initialize the logging library, create a client configured with a project ID, and log a single entry 'Your log message':

```javascript
// Imports the Google Cloud client library
const { Logging } = require('@google-cloud/logging');
// Creates a client with a predefined project ID and a path to a
// credentials JSON file to be used for auth with Cloud Logging
const logging = new Logging({
  projectId: 'your-project-id',
  keyFilename: '/path/to/key.json',
});
// Create a log with the desired log name
const log = logging.log('your-log-name');
// Create a simple log entry without any metadata
const entry = log.entry({}, 'Your log message');
// Log your record!!!
log.info(entry);
```

This code produces a log entry that you can view in Logs Explorer.

Two critical features of the latest Cloud Logging Library for Node.js release are writing structured log entries to standard output and error handling with a default callback. Let's dig in deeper.

Writing structured log entries to standard output

The LogSync class helps users write context-rich structured logs to stdout or any other Writable interface. This class extracts additional log properties, like trace context, from HTTP headers, and can be used to toggle between writing to the Cloud Logging endpoint or to stdout during local development.

In addition, writing structured logs to stdout can be integrated with a Logging agent. Once a log is written to stdout, a Logging agent picks up those logs and delivers them to Cloud Logging out of process. Logging agents can add more properties to each entry before streaming it to the Logging API.

We recommend that serverless applications (i.e., applications running in Cloud Functions and Cloud Run) use the LogSync class, because asynchronous log delivery may drop logs due to lack of CPU or other environmental factors that prevent the logs from being sent immediately to the Logging API. Cloud Functions and Cloud Run applications are by their nature ephemeral and can have a short lifespan, which will cause logging data to be dropped when an instance is shut down before the logs have been sent to Cloud Logging servers.
Today, Google Cloud managed services automatically install Logging agents in the resources that they provision for all Google serverless environments; this means that you can use LogSync in your application to seamlessly deliver logs to Cloud Logging through standard output. Below is a sample of how to use the LogSync class:

```javascript
const { Logging } = require('@google-cloud/logging');
const logging = new Logging({
  projectId: 'your-project-id',
  keyFilename: '/path/to/key.json',
});
// Create a LogSync transport, defaulting to `process.stdout`
const log = logging.logSync('Your-log-name');
const entry = log.entry({}, 'Your log message');
log.write(entry);
```

If you use the @google-cloud/logging-winston or @google-cloud/logging-bunyan library, you can set the redirectToStdout parameter in the LoggingWinston or LoggingBunyan constructor options, respectively. Below is sample code showing how to redirect structured log output to stdout for the LoggingWinston class:

```javascript
// Imports the Google Cloud client library for Winston
const {LoggingWinston} = require('@google-cloud/logging-winston');

// Creates a client that writes logs to stdout
const loggingWinston = new LoggingWinston({
  projectId: 'your-project-id',
  keyFilename: '/path/to/key.json',
  redirectToStdout: true,
});
```

Error Handling with a default callback

The Log class provides users the ability to write and delete logs asynchronously. However, there are cases when log entries cannot be written or deleted and an error is thrown; if the error is not handled properly, it can crash the application. One possible way to handle the error is to await the log write/delete calls and wrap them with try/catch. However, waiting for every write or delete call may introduce delays which could be avoided by simply adding a callback, as shown below:

```javascript
// Asynchronously write the log entry and handle the response or
// any errors in the provided callback
log.write(entry, err => {
  if (err) {
    // The log entry was not written.
    console.log(err.message);
  } else {
    console.log('No error in write callback!');
  }
});
```

Adding a callback to each write or delete call is duplicate code, and remembering to include it for each call may be toilsome, especially if the code handling the error is always the same. To eliminate this burden, we introduced the ability to provide a default callback for the Log class, which can be set through the LogOptions passed to the Log constructor as in the example below:

```javascript
const {Logging} = require('@google-cloud/logging');
const logging = new Logging();

// Create options with a default callback to be called on
// every write/delete response or error
const options = {
  defaultWriteDeleteCallback: function (err) {
    if (err) {
      console.log('Error is: ' + err);
    } else {
      console.log('No error, all is good!');
    }
  },
};

const log = logging.log('my-log', options);
```

If you use the @google-cloud/logging-winston or @google-cloud/logging-bunyan library, you can set the callback through the defaultCallback parameter in the LoggingWinston or LoggingBunyan constructor options, respectively.
Here is an example of how to set a default callback for the LoggingWinston class:

```javascript
// Imports the Google Cloud client library for Winston
const {LoggingWinston} = require('@google-cloud/logging-winston');

// Creates a client
const loggingWinston = new LoggingWinston({
  projectId: 'your-project-id',
  keyFilename: '/path/to/key.json',
  defaultCallback: err => {
    if (err) {
      console.log('Error occurred: ' + err);
    }
  },
});
```

Next Steps

Now, when you integrate the Cloud Logging Library for Node.js in your project, you can start using the latest features. To try the latest Node.js library in Google Cloud, follow the quickstart walkthrough guide. For more information, check out the Cloud Logging Library for Node.js user guide. For any feedback or contributions, feel free to open issues in our Cloud Logging Library for Node.js GitHub repo. Issues can also be opened for bugs, questions about library usage, and new feature requests.
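One detail the samples above leave implicit is that the LoggingWinston transport still needs to be attached to a winston logger before anything is logged. Here is a minimal sketch; the logger configuration and the logged fields are illustrative only:

```javascript
// Attach the Cloud Logging transport to a standard winston logger.
const winston = require('winston');
const {LoggingWinston} = require('@google-cloud/logging-winston');

const loggingWinston = new LoggingWinston({
  // In serverless environments, write structured logs to stdout and let
  // the Logging agent deliver them (see the LogSync discussion above).
  redirectToStdout: true,
});

const logger = winston.createLogger({
  level: 'info',
  transports: [new winston.transports.Console(), loggingWinston],
});

logger.info('Order processed', {orderId: 1234}); // orderId is illustrative
```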
Source: Google Cloud Platform

Run your fault-tolerant workloads cost-effectively with Google Cloud Spot VMs, now GA

Available in GA today, Spot VMs can now be deployed in your Google Cloud projects so you can start saving immediately. For an overview of Spot VMs, see our Preview launch blog, and for a deeper dive, check out our Spot VM documentation.

Modern applications such as microservices, containerized workloads, and horizontally scalable applications are engineered to persist even when the underlying machine does not. This architecture allows you to leverage Spot VMs to access capacity and run applications at a low price: you will save 60–91% off the price of our on-demand VMs with Spot VMs.

To make it even easier to utilize Spot VMs, we've incorporated Spot VM support into a variety of tools.

Google Kubernetes Engine (GKE)

Containerized workloads are often a good fit for Spot VMs, as they are generally stateless and fault tolerant. Google Kubernetes Engine (GKE) provides container orchestration, and now, with native support for Spot VMs, you can use GKE to manage your Spot VMs and capture the cost savings. On clusters running GKE version 1.20 and later, the kubelet graceful node shutdown feature is enabled by default, which allows the kubelet to notice the preemption, gracefully terminate Pods that are running on the node, restart Spot VMs, and reschedule Pods. As part of this launch, Spot VM support in GKE is now GA. For best practices on how to use GKE with Spot VMs, see our architectural walkthrough on running web applications on GKE using cost-optimized Spot VMs, as well as our GKE Spot VM documentation.

GKE Autopilot Spot Pods

Kubernetes is a powerful and highly configurable system. However, not everyone needs that much control and choice. GKE Autopilot provides a new mode of using GKE which automatically applies industry best practices to help minimize the burden of node management operations. When using GKE Autopilot, your compute capacity is automatically adjusted and optimized based on your workload needs. To take your efficiency to the next level, mix in Spot Pods to drastically reduce the cost of your nodes. GKE Autopilot gracefully handles preemption events by redirecting requests away from nodes with preempted Spot Pods, and it manages autoscaling and scheduling to ensure new replacement nodes are created to maintain sufficient resources. Spot Pods for GKE Autopilot are now GA, and you can learn more through the GKE Autopilot and Spot Pods documentation.

Terraform

Terraform makes managing infrastructure as code easy, and Spot VM support is now available for Terraform on Google Cloud. Using Terraform templates to define your entire environment, including the networking, disks, and service accounts to use with Spot VMs, makes continuous spin-up and tear-down of deployments a convenient, repeatable process. Terraform is especially important when working with Spot VMs, as the resources should be treated as ephemeral. Terraform works even better in conjunction with GKE to define and manage a node pool separately from the cluster control plane. This combination gives you the best of both worlds by using Terraform to set up your compute resources while allowing GKE to handle autoscaling and autohealing to make sure you have sufficient VMs after preemptions.

Slurm

Slurm is one of the leading open source HPC workload managers, used in TOP500 supercomputers around the world. Over the past five years, we've worked with SchedMD, the company behind Slurm, to release ever-improving versions of Slurm on Google Cloud.
SchedMD recently released the newest Slurm for Google Cloud scripts, available through the Google Cloud Marketplace and in SchedMD's GitHub repository. This latest version of Slurm for Google Cloud includes support for Spot VMs via the Bulk API. You can read more about the release in the Google Cloud blog post.
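Whichever tool you use to provision Spot VMs, the workload itself should treat preemption as a routine event. For containerized applications this typically means handling SIGTERM and draining work before exit. Here is a minimal, generic Node.js sketch of that pattern; it is illustrative only and not tied to any particular product above:

```javascript
// Illustrative graceful-shutdown pattern for a fault-tolerant workload
// running on Spot capacity: finish in-flight work when SIGTERM arrives.
const http = require('http');

const server = http.createServer((req, res) => {
  res.end('ok');
});
server.listen(process.env.PORT || 8080);

process.on('SIGTERM', () => {
  console.log('Termination signal received; draining connections...');
  // Stop accepting new connections; let in-flight requests complete.
  server.close(() => {
    // Checkpoint or flush any remaining state here, then exit cleanly.
    process.exit(0);
  });
});
```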
Source: Google Cloud Platform