Colossus under the hood: a peek into Google’s scalable storage system

You trust Google Cloud with your critical data, but did you know that Google relies on the same underlying storage infrastructure for its other businesses as well? That's right: the same storage system that powers Google Cloud also underpins Google's most popular products, supporting globally available services like YouTube, Drive, and Gmail. That foundational storage system is Colossus, which backs Google's extensive ecosystem of storage services, such as Cloud Storage and Firestore, supporting a diverse range of workloads, including transaction processing, data serving, analytics and archiving, boot disks, and home directories. In this post, we take a deeper look at the storage infrastructure behind your VMs, specifically the Colossus file system, and how it helps enable massive scalability and data durability for Google services as well as your applications.

Google Cloud scales because Google scales

Before we dive into how storage services operate, it's important to understand the single infrastructure that supports both Cloud and Google products. Like any well-designed software system, all of Google is layered with a common set of scalable services. There are three main building blocks used by each of our storage services:

- Colossus is our cluster-level file system, successor to the Google File System (GFS).
- Spanner is our globally consistent, scalable relational database.
- Borg is a scalable job scheduler that launches everything from compute to storage services. It was, and continues to be, a big influence on the design and development of Kubernetes.

These three core building blocks provide the underlying infrastructure for all Google Cloud storage services, from Firestore to Cloud SQL to Filestore and Cloud Storage. Whenever you access your favorite storage service, the same three building blocks work together to provide everything you need.
Borg provisions the needed resources, Spanner stores all the metadata about access permissions and data location, and Colossus manages, stores, and provides access to all your data. Google Cloud takes these same building blocks and layers on everything needed to provide the level of availability, performance, and durability you need from your storage services. In other words, your own applications scale the same way Google products do, because they rely on the same core infrastructure, and these three services scale to meet your needs.

Colossus in a nutshell

Now, let's take a closer look at how Colossus works. But first, a little background on Colossus:

- It's the next generation of GFS.
- Its design enhances storage scalability and improves availability to handle the massive growth in data needs of an ever-growing number of applications.
- Colossus introduced a distributed metadata model that delivered a more scalable and highly available metadata subsystem.

But how does it all work? And how can one file system underpin such a wide range of workloads? Below is a diagram of the key components of the Colossus control plane.

Client library

The client library is how an application or service interacts with Colossus. The client is probably the most complex part of the entire file system. A lot of functionality, such as software RAID, goes into the client based on an application's requirements. Applications built on top of Colossus use a variety of encodings to fine-tune performance and cost trade-offs for different workloads.

Colossus control plane

The foundation of Colossus is its scalable metadata service, which consists of many Curators. Clients talk directly to curators for control operations, such as file creation, and because there are many curators, the metadata layer scales horizontally.

Metadata database

Curators store file system metadata in Google's high-performance NoSQL database, Bigtable.
The original motivation for building Colossus was to solve the scaling limits we experienced with the Google File System (GFS) when trying to accommodate metadata related to Search. Storing file metadata in Bigtable allowed Colossus to scale up by over 100x over the largest GFS clusters.

D File Servers

Colossus also minimizes the number of network hops for data. Data flows directly between clients and "D" file servers (our network-attached disks).

Custodians

Colossus also includes background storage managers called Custodians. They play a key role in maintaining the durability and availability of data as well as overall efficiency, handling tasks like disk space balancing and RAID reconstruction.

How Colossus provides rock-solid, scalable storage

To see how this all works in action, let's consider how Cloud Storage uses Colossus. You've probably heard us talk a lot about how Cloud Storage can support a wide range of use cases, from archival storage to high-throughput analytics, but we don't often talk about the system that lies beneath.

With Colossus, a single cluster is scalable to exabytes of storage and tens of thousands of machines. In the example above, we have instances accessing Cloud Storage from Compute Engine VMs, YouTube serving nodes, and Ads MapReduce nodes, all of which share the same underlying file system to complete requests. The key ingredient is a shared storage pool managed by the Colossus control plane, which gives each workload the illusion of its own isolated file system. This disaggregation of resources drives more efficient use of valuable resources and lowers costs across all workloads. For instance, it's possible to provision for the peak demand of low-latency workloads, like YouTube video serving, and then run batch analytic workloads more cheaply by having them fill in the gaps of otherwise idle time.

Let's take a look at a few other benefits Colossus brings to the table.
Simplify hardware complexity

As you might imagine, any file system supporting Google services has fairly daunting throughput and scaling requirements and must handle multi-TB files and massive datasets. Colossus abstracts away a lot of the physical hardware complexity that would otherwise plague storage-intensive applications. Google data centers have a tremendous variety of underlying storage hardware, offering a mix of spinning disk and flash storage in many sizes and types. On top of this, applications have extremely diverse requirements around durability, availability, and latency. To ensure each application has the storage it requires, Colossus provides a range of service tiers. Applications use these different tiers by specifying I/O, availability, and durability requirements, and then provisioning resources (bytes and I/O) as abstract, undifferentiated units.

In addition, at Google scale, hardware is failing virtually all the time: not because it's unreliable, but because there's a lot of it. Failures are a natural part of operating at such an enormous scale, and it's imperative that the file system provide fault tolerance and transparent recovery. Colossus steers I/O around these failures and does fast background recovery to provide highly durable and available storage.

The end result is that the complexity headaches of dealing with hardware resources are significantly reduced, making it easy for any application to get and use the storage it requires.

Maximize storage efficiency

Now, as you might imagine, it takes some management magic to ensure that storage resources are available when applications need them without overprovisioning. Colossus takes advantage of the fact that data has a wide variety of access patterns and frequencies (for example, hot data that is accessed frequently) and uses a mix of flash and disk storage to meet any need. The hottest data is put on flash for more efficient serving and lower latency.
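The hot-on-flash placement described above can be sketched as a greedy sizing calculation: keep moving the hottest data to flash until the I/O density of whatever remains is low enough for disks to serve. This is an illustrative model only; the chunk sizes, IOPS figures, and the `flash_gb_needed` helper are made up for the example and are not Colossus internals.

```python
def flash_gb_needed(chunks, disk_iops_per_gb):
    """Greedy flash sizing. chunks is a list of (size_gb, iops) pairs.

    Moves the hottest chunks (highest IOPS per GB) to flash until the
    aggregate I/O density of the data left on disk fits within what
    the disks can serve. Returns the GB of flash required.
    """
    # Hottest first: sort by I/O density (IOPS per GB), descending.
    remaining = sorted(chunks, key=lambda c: c[1] / c[0], reverse=True)
    flash_gb = 0.0
    while remaining:
        disk_gb = sum(size for size, _ in remaining)
        disk_iops = sum(iops for _, iops in remaining)
        if disk_iops / disk_gb <= disk_iops_per_gb:
            break  # disks can absorb the rest of the I/O
        size, _ = remaining.pop(0)  # move the hottest chunk to flash
        flash_gb += size
    return flash_gb

# One very hot 100 GB chunk dominates the I/O; moving it to flash
# leaves a cold 900 GB that spinning disks can serve comfortably.
print(flash_gb_needed([(100, 1000), (900, 100)], disk_iops_per_gb=0.5))
```

With the hot chunk on flash, residual disk density drops from 1.1 to about 0.11 IOPS/GB, which is the "just enough flash, just enough disk" balance the post describes.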
We buy just enough flash to push the I/O density per gigabyte down to what disks can typically provide, and just enough disks to ensure we have enough capacity. With the right mix, we can maximize storage efficiency and avoid wasteful overprovisioning.

For disk-based storage, we want to keep disks full and busy to avoid excess inventory and wasted disk IOPS. To do this, Colossus uses intelligent disk management to get as much value as possible from available disk IOPS. Newly written data (i.e., hotter data) is evenly distributed across all the drives in a cluster. Data is then rebalanced and moved to larger-capacity drives as it ages and becomes colder. This works great for analytics workloads, for example, where data typically cools off as it ages.

Battle-tested to deliver massive scale

So, there you have it: Colossus is the secret scaling superpower behind Google's storage infrastructure. Colossus not only handles the storage needs of Google Cloud services, but also meets Google's internal storage needs, helping to deliver content to the billions of people using Search, Maps, YouTube, and more every single day. When you build your business on Google Cloud, you get access to the same super-charged infrastructure that keeps Google running. We'll keep making our infrastructure better, so you don't have to.

To learn more about Google Cloud's storage architecture, check out the Next '20 session from which this post was developed, "A peek at the Google Storage infrastructure behind the VM." And check out the Cloud Storage website to learn more about all our storage offerings.

Related article: Optimizing object storage costs in Google Cloud: location and classes. Saving on Cloud Storage starts with picking the right storage for your use case, and making sure you follow best practices.
Source: Google Cloud Platform

How ShareChat built scalable data-driven social media with Google Cloud

Editor’s note: Today’s guest post comes from Indian social media platform ShareChat. Here’s the story of how they improved performance, app development, and analytics for serving regional content to millions of users using Google Cloud.

How do you create a social network when your country has 22 major official languages and countless active regional dialects? At ShareChat, we serve more than 160 million monthly active users who share and view videos, images, GIFs, songs, and more in 15 different Indian languages. We also launched a short video platform in 2020, Moj, which already supports over 80 million monthly active users.

Connecting with people in the language they understand

As mobile data and smartphones have become more affordable in India, we noticed a large new segment of people, many in rural areas, being welcomed onto the internet. However, many of them didn’t speak English, and when it comes to accessing content and information, language plays a significant role. Instead of joining other social media sites where English reigned supreme, new internet users chose to join language- or dialect-specific WhatsApp groups where they felt more comfortable.

So, we set out to build a platform where people can share their opinions, document their lives, and make new friends, all in their native language. ShareChat simplifies content and people discovery by using a personalized content newsfeed to deliver language-specific content to the right audience.

Given the data-intensive nature of the platform and the high volume of content and traffic, we rely heavily on IT infrastructure. On top of that, a large number of our users rely on 2G networks to post, like, view, or follow each other.
Our platform needs to deliver great experiences to people who are spread out across the country and across different networks without any reduction in performance.

The right cloud partner to support future growth

ShareChat was born in the cloud—we already knew how to scale systems to serve a large customer base with our existing cloud provider. But like many companies, we struggled with over-provisioning compute and storage to accommodate unpredictable traffic and avoid running out of storage. With demand rising for local-language content and an increase in online interactions in response to the COVID-19 crisis, we realized that we would need a more efficient way to scale dynamically and allocate resources as needed.

Google Cloud was a natural choice for us. We wanted to partner with a technology-first company that would make it easy (and cost-effective) to manage a strong technology portfolio that would allow us to build whatever we wanted. Google is at the forefront of technology innovation and provided everything we needed to build, run, and manage our applications (including creating an efficient DevOps pipeline to fix and release new features quickly). We had a few issues in mind at the start of discussions with the Google Cloud team, but over time, as we got information and support from them, we realized that these were the partners we wanted in our corner when it came time to tackle our most challenging problems. In the end, we decided to take our entire infrastructure to Google Cloud.

To support millions of users, we deploy and scale using Google Kubernetes Engine. We analyze our data using a combination of managed data cloud services: Pub/Sub for data pipelines, BigQuery for analytics, Cloud Spanner for real-time app-serving workloads, and Cloud Bigtable for less-indexed databases. We also rely on Cloud CDN to help us deliver high-quality, reliable content at low latency to our users.
We now use just half the total core consumption of our legacy environment to run ShareChat’s existing workloads.

Google Cloud delivers better outcomes at every level

By moving to Google Cloud, we saw major benefits in several key areas.

Zero-downtime migration for users

At the time of migration, we had over 70 terabytes of data, consisting of 220 tables—some of which were up to 14 terabytes with nearly 50 billion rows. Due to our data’s interdependencies, moving services over one at a time wasn’t an option for us. Even though we were migrating such large volumes of data, we didn’t want to impact any of our customers. Latency spikes from out-of-sync data might affect message delivery. For instance, if a message or notification was delayed, we didn’t want to risk a bad user experience causing someone to abandon ShareChat.

To prepare for the move, we ran a proof-of-concept cluster for over four months to test database performance in a real-world scenario handling more than a million queries per second. Using an open-source API gateway, we replicated our legacy data environment into Google Cloud for performance testing and capacity analysis. As soon as we were confident Google Cloud could handle the same traffic as our previous cloud environment, we were ready to execute.

Using wrappers, we were able to migrate without having to change anything in our existing application code. The entire migration of 60 million users to Google Cloud took five hours, without any data loss or downtime. Today, ShareChat has grown to 160 million users, and Google Cloud continues to give us the support we need.

Scaling globally to meet unexpected demand

We rely on real-time data to drive everything on ShareChat by tracking everything that goes on in our app—from messages and new groups to content people like or who they follow. Our users create more than a million posts per day, so it’s critical that our systems can process massive amounts of data efficiently.
We chose to migrate to Spanner for its global consistency and secondary indexes. Unlike with our legacy NoSQL database, we could scale without having to rethink existing tables or schema definitions, and keep our data systems in sync across multiple locations. It’s also cost-effective for us: moving over 120 tables with 17 indexes into Cloud Spanner reduced our costs by 30%.

Spanner also replicates data seamlessly across multiple locations in real time, enabling us to retrieve documents if one region fails. For instance, when our traffic unexpectedly grew by 500% over just a few days, we were able to scale horizontally with zero lines of code change. We were also launching our Moj video app at the same time, and we were able to move it to another region without a single issue.

Simplifying development and deployment

On average, we handle about 80,000 requests per second (RPS), or nearly 7 billion requests per day. Daily push notifications sent out to the entire user base about trending topics can often result in a spike of 130,000 RPS in just a few seconds. Instead of over-provisioning, Google Kubernetes Engine (GKE) enables us to pre-scale for traffic spikes around scheduled events, such as holidays like Diwali, when millions of Indians send each other greetings.

Migrating to GKE has also enabled us to adopt more agile ways of working, such as automating deployment and saving time by scripting repetitive tasks. Even though we were already using container-based solutions, they lacked transparency and coverage across the entire deployment funnel. Kubernetes patterns, such as the sidecar proxy, allow us to attach peripheral tasks like logging to the application without requiring us to make code changes. Kubernetes upgrades are managed by default, so we don’t have to worry about maintenance and can stay focused on more valuable work.
Clusters and nodes automatically upgrade to run the latest version, minimizing security risks and ensuring we always have access to the latest features.

Low latency and real-time ML predictions

Even though many of our users may be accessing ShareChat outside of metropolitan areas, that doesn’t mean they’re more patient if the app loads slowly or their messages are delayed. We strive to deliver a high-performance experience, regardless of where our users are. We use Cloud CDN to cache data in five Google Cloud point-of-presence (PoP) locations at the edge in India, allowing us to bring content as close as possible to people and speed up load times. Since moving to Cloud CDN, our cache hit ratio has improved from 90% to 98.5%, meaning our cache can serve 98.5% of content requests.

As we expand globally, we’d like to use machine learning to reach new people with content in different languages. We want to build new algorithms to process real-time datasets in regional languages and accurately predict what people want to see. Google Cloud gives us an infrastructure optimized to handle compute-intensive workloads that will be useful to us both now and in the future.

The confidence to build the best platform

Our current system performs better than before we migrated, and we are continuously building new features on top of it. Google’s data cloud has provided us with an elegant ecosystem of services that allows us to build whatever we want, more easily and faster than ever before. Perhaps the biggest advantage of partnering with Google Cloud has been the connection we have with the engineers at Google. If we’re working to solve a specific problem and find a specific solution in a library or a piece of code, we can immediately connect with the team responsible for it. As a result, we have experienced a massive boost in our confidence.
We know that we can build a really good system because we not only have a good process in place to solve problems—we have the right support behind us.

Related article: Database observability for developers: introducing Cloud SQL Insights. The new Insights tool helps developers quickly understand and resolve database performance issues on Cloud SQL.
Source: Google Cloud Platform

The 5 benefits of Cloud SQL [infographic]

Tired of spending too much time on database maintenance? You’re not alone. Lots of businesses are turning to fully managed database services to help build and scale infrastructure quickly, freeing up time and resources to spend on innovation instead. At Google Cloud, our managed database services include Cloud SQL, which makes it easy to move MySQL, PostgreSQL, and SQL Server workloads to the cloud. Here are the top five benefits of using Google’s Cloud SQL.

Click to enlarge the "5 benefits of Cloud SQL" infographic.

Check out more details on how managed services like Cloud SQL work and why they matter.
Source: Google Cloud Platform

Bigtable vs. BigQuery: What’s the difference?

Many people wonder whether they should use BigQuery or Bigtable. While these two services have a number of similarities, including “Big” in their names, they support very different use cases in your big data ecosystem.

At a high level, Bigtable is a NoSQL wide-column database. It’s optimized for low latency, large numbers of reads and writes, and maintaining performance at scale. Bigtable use cases tend to involve a certain scale or throughput with strict latency requirements, such as IoT, AdTech, FinTech, and so on. If high throughput and low latency at scale are not priorities for you, then another NoSQL database like Firestore might be a better fit.

On the other hand, BigQuery is an enterprise data warehouse for large amounts of relational structured data. It is optimized for large-scale, ad hoc SQL-based analysis and reporting, which makes it best suited for gaining organizational insights. You can even use BigQuery to analyze data from Cloud Bigtable.

Characteristics of Cloud Bigtable

Bigtable is a NoSQL database designed to support large, scalable applications. Use Bigtable when you are building any application that needs to scale in a big way in terms of reads and writes per second. Bigtable throughput can be adjusted by adding or removing nodes; each node provides up to 10,000 queries per second (read and write). You can use Bigtable as the storage engine for large-scale, low-latency applications as well as throughput-intensive data processing and analytics. It offers high availability with an SLA of 99.5% for zonal instances. It’s strongly consistent within a single cluster; replication adds eventual consistency across two clusters and increases the SLA to 99.99%.

Cloud Bigtable is a key-value store designed as a sparsely populated table.
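Because throughput scales linearly with node count at roughly 10,000 combined reads and writes per second per node, cluster sizing is a simple back-of-the-envelope calculation. The `nodes_needed` helper below is an illustrative sketch, not part of any Bigtable client library:

```python
import math

def nodes_needed(peak_qps, qps_per_node=10_000):
    """Minimum Bigtable node count for a target peak QPS, using the
    ~10,000 combined reads+writes/sec that a single node provides."""
    return max(1, math.ceil(peak_qps / qps_per_node))

# A 130,000 QPS notification spike would call for 13 nodes.
print(nodes_needed(130_000))
```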
It can scale to billions of rows and thousands of columns, enabling you to store terabytes or even petabytes of data. This design also helps store large amounts of data per row or per item, making it great for machine learning predictions. It is an ideal data source for MapReduce-style operations and integrates easily with existing big data tools such as Hadoop, Dataflow, and Dataproc. It also supports the open-source HBase API standard for easy integration with the Apache ecosystem.

For a real-world example, see how Ricardo, the largest online marketplace in Switzerland, benchmarked Bigtable and concluded that it is much easier to manage and more cost-effective than self-managed Cassandra.

Characteristics of BigQuery

BigQuery is a petabyte-scale data warehouse designed to ingest, store, analyze, and visualize data with ease. Typically, you’ll collect large amounts of data from across your databases and other third-party systems to answer specific questions. You can ingest this data into BigQuery by uploading it in batches or by streaming it directly to enable real-time insights. BigQuery supports a standard SQL dialect that is ANSI-compliant, so if you already know SQL, you are all set. It is safe to say that you would serve an application that uses Bigtable as its database, but most of the time you wouldn’t have applications issuing BigQuery queries directly: Cloud Bigtable shines in the serving path, and BigQuery shines in analytics.

Once your data is in BigQuery, you can start querying it. BigQuery is a great choice when your queries require you to scan a large table or you need to look across the entire dataset. This can include queries such as sums, averages, counts, and groupings, or even queries for creating machine learning models.
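The "sums, averages, counts, groupings" style of query is classic OLAP. Here it is in miniature, with plain Python standing in for SQL over a toy dataset; BigQuery would run the equivalent `SELECT ... GROUP BY` over billions of rows:

```python
from collections import defaultdict

# Toy rows standing in for a fact table. In SQL this would be roughly:
#   SELECT lang, COUNT(*) AS posts, AVG(views) AS avg_views
#   FROM posts GROUP BY lang
rows = [
    {"lang": "hi", "views": 120},
    {"lang": "hi", "views": 80},
    {"lang": "ta", "views": 50},
]

groups = defaultdict(list)
for row in rows:
    groups[row["lang"]].append(row["views"])

# (count, average) per group
report = {lang: (len(v), sum(v) / len(v)) for lang, v in groups.items()}
print(report)  # {'hi': (2, 100.0), 'ta': (1, 50.0)}
```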
Typical BigQuery use cases include large-scale storage and analysis, or online analytical processing (OLAP).

For a real-world example, see how Verizon Media used BigQuery for a media analytics pipeline, migrating massive Hadoop and enterprise data warehouse (EDW) workloads to Google Cloud’s BigQuery and Looker.

Common characteristics

BigQuery and Bigtable are both cloud-native, and they both feature unique, industry-leading SLAs. Because updates and upgrades happen transparently behind the scenes, you don’t have to worry about maintenance windows or planning downtime for either service. In addition, they offer unlimited scale, automatic sharding, and automatic failure recovery (with replication). For fast transactions and faster querying, both BigQuery and Bigtable separate processing and storage, which helps maximize throughput.

Conclusion

If this has piqued your interest and you are excited to learn about the upcoming innovations to support your data strategy, join us at the Data Cloud Summit on May 26. For more information on BigQuery and Bigtable, check out the individual GCP sketchnotes on thecloudgirl.dev. For similar cloud content, follow me on Twitter @pvergadia.

Related article: Spring forward with BigQuery user-friendly SQL. The newest set of user-friendly SQL features in BigQuery are designed to enable you to load and query more data with greater precision…
Source: Google Cloud Platform

Google Cloud blog 101: Full list of topics, links, and resources

Curious to know the latest news, updates, and resources across the full range of Google Cloud products and services? Here’s a resource list that gives you instant access to our blog channels that cover everything under the sun (or cloud).

Solutions & Technologies: AI & Machine Learning, API Management, Application Development, Business Application Platform, Cloud Migration, Compliance, Compute, Containers & Kubernetes, Cost Management, Data Analytics, Databases, DevOps & SRE, HPC, Hybrid & Multicloud, Identity & Security, Infrastructure, Management Tools, Networking, No-code Development, Open Source, Productivity & Collaboration, SAP on Google Cloud, Serverless, Storage & Data Transfer

Products & Services: Google Cloud Platform, Google Workspace, Anthos, BigQuery, Google Kubernetes Engine (GKE)

Industries: Consumer Packaged Goods, Education, Healthcare & Life Sciences, Gaming, Manufacturing, Media & Entertainment, Public Sector, Retail, Supply Chain & Logistics, Telecommunications

Getting started: Developers & Practitioners, Training and Certifications, Public Datasets

Who we work with: Customers, Partners, Startups

Getting to know us better: Inside Google Cloud, Google Cloud in Asia Pacific, Google Cloud in Europe, Google Cloud Next, Regions, Events, Perspectives, Research, Sustainability, Systems

Looking for even more on Google Cloud? Our head of DevRel Greg Wilson put together a comprehensive list that collects resources from around the web. Find it here: A giant list of Google Cloud resources.

Related article: What’s new with Google Cloud. Find our newest updates, announcements, resources, events, learning opportunities, and more in one handy location.
Source: Google Cloud Platform

Tracking index backfill operation progress in Cloud Spanner

One of the cool things about Google Cloud Spanner, a horizontally scalable relational database, is that you can do online schema updates. Your database is never down for schema update operations; Cloud Spanner continues to serve data during ongoing schema updates.

Imagine your application is querying data from a large table in a Cloud Spanner database, and you want to add a secondary index on a column to make data lookup more efficient. Cloud Spanner automatically starts backfilling, or populating, the index to reflect an up-to-date view of the data being indexed. Depending on the size of the dataset, the load on the instance, and other factors, it can take anywhere from several minutes to many hours for that index backfill to complete. While the database continues to serve traffic, you may want to check the progress of the index backfill so you can plan to deploy application changes that rely on the new index once the backfill is complete.

Here is some good news: Cloud Spanner now provides the ability to track the progress of an index backfill. Let’s dive in to understand how you can use the index backfill progress reporting feature.

Index creation

Suppose you want to speed up queries against an example Singers table, and it is common for queries to specify both FirstName and LastName. The schema for the Singers table is shown below. This problem can be solved by creating a secondary index that contains FirstName and LastName as part of the index key. Say you issue the index creation statement for the index SingersByFirstLastNames through the GCP Console. This statement triggers the index backfill operation for a non-interleaved index; the primary key for the secondary index will now contain SingerId, FirstName, and LastName. Once the schema update operation is initiated, you go back to the Indexes tab and see a spinning wheel next to the SingersByFirstLastNames index.
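The schema and DDL statements above appear as console screenshots in the original post. A plausible reconstruction, following the standard Spanner Singers sample (the column sizes are assumptions), looks like this:

```sql
-- Example Singers table (reconstruction of the screenshot)
CREATE TABLE Singers (
  SingerId   INT64 NOT NULL,
  FirstName  STRING(1024),
  LastName   STRING(1024),
  SingerInfo BYTES(MAX),
) PRIMARY KEY (SingerId);

-- Secondary index whose backfill progress we want to track
CREATE INDEX SingersByFirstLastNames ON Singers(FirstName, LastName);
```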
A few minutes go by, and you wonder when the SingersByFirstLastNames index will be available for your queries. How can you tell how much progress has been made on the creation of the secondary index?

Tracking index backfill progress

You can use the gcloud command-line tool, the REST API, or the RPC API to monitor index backfill progress. We are also in the process of adding support for this field elsewhere. In our example, we’ll monitor the progress of the index backfill using the gcloud command. You can view the progress of the index backfill using the OPERATION_ID; if you don’t have the OPERATION_ID, find it with gcloud spanner operations list. Then, to track the progress of the secondary index backfill operation, use gcloud spanner operations describe.

While the backfill is still running, the output of the "operations describe" command shows how far it has progressed; here, the index backfill triggered by the index creation statement has progressed 64%. Once the process completes, the output shows the progress percent as 100%.

The "progress" array is where you will find information related to the progress of the index backfill operation. It contains the "startTime", "progressPercent", and "endTime" (when available) for each schema change statement. This example shows only one index creation statement for simplicity, but there can be multiple schema change statements per schema update operation. For more information on interpreting index backfill progress for multi-statement schema change operations, please refer to the official documentation.
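The progress fields can also be pulled out programmatically from the operation JSON (for example, the output of `gcloud spanner operations describe` with `--format=json`). A small sketch, assuming the metadata layout described above, where `statements` and `progress` are parallel arrays:

```python
def backfill_progress(operation):
    """Map each schema-change statement in a long-running schema
    update to its progressPercent (0 if not reported yet)."""
    meta = operation["metadata"]
    progress = meta.get("progress", [])
    report = {}
    for i, stmt in enumerate(meta.get("statements", [])):
        entry = progress[i] if i < len(progress) else {}
        report[stmt] = entry.get("progressPercent", 0)
    return report

# Trimmed-down example of what `operations describe` returns
op = {
    "metadata": {
        "statements": [
            "CREATE INDEX SingersByFirstLastNames ON Singers(FirstName, LastName)"
        ],
        "progress": [
            {"startTime": "2021-04-01T10:00:00Z", "progressPercent": 64}
        ],
    }
}
print(backfill_progress(op))
```

Polling this until every statement reports 100% is a simple way to gate a deployment on the index becoming usable.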
You can then periodically track the progress of the secondary index backfill by invoking the "gcloud spanner operations describe" command until the operation is complete.

Summary

The new introspection feature, index backfill progress reporting, gives you visibility into the progress of the index backfill long-running operation. You can similarly get visibility into the progress of backup/restore operations, as described in the official documentation.

References
- Managing Long-Running Operations
- Secondary Indexes Documentation

Related article: Introducing request priorities for Cloud Spanner APIs. Today we’re happy to announce that you can now specify request priorities for some Cloud Spanner APIs. By assigning a HIGH, MEDIUM, or LOW…
Source: Google Cloud Platform

5 cheat sheets to help you get started on your Google Cloud journey

Sometimes a picture is worth a thousand words, and that’s where these cheat sheets come in handy. Cloud Developer Advocate Priyanka Vergadia has built a number of guides that help developers visually navigate critical decisions, whether it’s determining the best way to move to the cloud or deciding on the best storage options. Below are five of her top cheat sheets in one handy location.

Google Cloud migration made easy

Migration to the cloud is the first step of digital transformation because it offers a quick, simple path to cost savings and enhanced flexibility. Read the blog to learn about migrating on-premises or public-cloud-hosted infrastructure to Google Cloud, or click the image below.

Migrating Apache Hadoop to Dataproc: a decision tree

Are you using the Apache Hadoop and Spark ecosystem and looking to simplify resource management? You may want to consider Dataproc. Read the blog post to learn four scenarios for migrating Apache Hadoop clusters to Google Cloud, or click the image below.

Google Cloud VMware Engine cheat sheet

If you’ve got VMware workloads and are considering modernizing in the cloud for increased agility and reduced total cost of ownership, VMware Engine may be the service for you. Read the blog post, or expand the image below to learn the benefits, features, and use cases for VMware Engine.

Google Cloud block storage options

Google Cloud offers two options for block storage: Persistent Disks and Local SSD. This cheat sheet helps you choose the right one for your app. Read the blog, or click the image below.

Google Cloud products in 4 words or less

Google Cloud offers lots of products to support a wide variety of use cases. But how do you even know where to start?
This list, originally kicked off by Google Cloud’s head of DevRel Greg Wilson, makes it easy to familiarize yourself with the Google Cloud ecosystem so you can quickly get up to speed and choose the products you want to explore more deeply through documentation or other available resources. To get started, read the blog or visit the GitHub page.

Learn more
We hope these cheat sheets make navigating the cloud easier than ever. For more Google Cloud tips and best practices, check out our Tech blog.

Related article: 13 sample architectures to kickstart your Google Cloud journey
The 13 most popular application architectures on Google Cloud, and how to get started.

In case you missed it: All our free Google Cloud training opportunities from Q1

No-cost training opportunities remain a core part of how we help you build your cloud knowledge and showcase your cloud competencies. Since January, we’ve introduced a number of opportunities for you to grow your skills, and we’ve brought them together into one handy resource so you don’t miss out.

Join the Google Cloud 30-day challenge 2021
We kicked off the new year with our new skills challenge, offering four initial cloud skills tracks: Getting Started, Data Analytics, Kubernetes (previously titled Hybrid and Multicloud), and Machine Learning (ML) and Artificial Intelligence (AI). Learn more in our blog post, or register for the skills challenge today to get 30 days of free access to Google Cloud labs.

Don’t know where to start with Google Cloud? We can help.
Our Google Cloud OnBoard events are a great way to get an introduction from experts on the core components of Google Cloud, as well as an overview of how our tools impact the entire cloud computing landscape. Read more details or watch the training on demand.

Learn how to accelerate data science workflows with Looker
Looker, the modern business intelligence (BI) and analytics platform that is now part of Google Cloud, is more than a BI reporting tool. It’s a full-fledged data application and visualization platform that allows users to curate and publish data, and it integrates with a wide range of endpoints in many different formats, from CSV, JSON, and Excel files to SaaS and in-house-built custom applications. Our recent blog post dives into how data analysts and data scientists can use Looker to help with data governance.
And for a demonstration of real-life examples of how to use Looker to automate and productionize data science workflows, watch this on-demand training.

Earn the new ‘Optimize Costs for Google Kubernetes Engine’ skill badge
We introduced a new skill badge that tests your ability to run a GKE cluster, ensuring it’s optimized to run an application with its many microservices and that it can autoscale appropriately to handle both traffic spikes and traffic lulls (where you’ll want to save on infrastructure costs). Learn more in this blog post or watch this on-demand training to take your first step toward optimizing GKE costs and earning your skill badge.

Looking ahead
This year is only getting started when it comes to learning opportunities. April alone included free AI and machine learning training for fraud detection, chatbots, and more. Check back regularly for the latest updates.

Related article: Free AI and machine learning training for fraud detection, chatbots, and more
These no-cost training opportunities can help you gain the latest AI and machine learning skills from Google Cloud.

Predictable serverless deployments with Terraform

As a software developer, you want your code to work reliably. If your code is deployed in any sort of complex architecture, the code itself may be correct, but a misconfigured deployment can mean the entire system doesn’t work. The ability to reliably deploy complex infrastructure is essential. Detailed documentation helps, but a single misconfiguration can still cause many issues. In these cases, consider Infrastructure as Code as a way to achieve reliable, repeatable deployments. One tool that’s widely used right now is Terraform, which supports many major cloud platforms, including Google Cloud.

Here’s a short example of how Terraform can help. Consider the following gsutil command:

$ gsutil mb gs://my-new-bucket

This command creates a new storage bucket for you, but if you run it again, you get an error message that the bucket already exists! You could add manual checks around this command to ask whether the bucket already exists and create it only if it doesn’t, but when you start adding these checks around all your scripts, they become complex and unmaintainable.

Replacing fallible shell with reliable Terraform
With Terraform, you describe your desired state (in this case, a bucket exists in your project) and Terraform takes the steps required to make sure that state is met. If the bucket already exists, Terraform takes no action. But if the bucket doesn’t exist, Terraform takes the steps required to create it.

You write Terraform manifests in HCL (HashiCorp Configuration Language), which supports constructs like variables and computed fields. Terraform also works out the dependency graph itself when working with multiple resources: some resources have to be created before others, and some resources produce data that other resources consume. For example, if you have a Cloud Run service that relies on that Cloud Storage bucket, the bucket has to exist first, and Terraform will work that out.
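As a hedged sketch, the desired state for that bucket could be written in HCL like this (the project ID and bucket name are placeholders, not values from this post):

```hcl
# Configure the Google provider; the project ID is a placeholder.
provider "google" {
  project = "my-project-id"
}

# Declare the desired state: this bucket should exist.
# `terraform apply` creates it if missing and does nothing
# if it already matches this definition.
resource "google_storage_bucket" "my_new_bucket" {
  name     = "my-new-bucket"
  location = "US"
}
```

Unlike re-running `gsutil mb`, running `terraform apply` a second time is safe: Terraform compares the real bucket to this definition and reports that no changes are needed.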
If you have, say, a deployment of five Cloud Functions that are independent of each other, Terraform will create them in parallel, which is much faster than creating each one individually. This technology also isn’t limited to serverless products: with the Google Terraform provider you can deploy virtual machines, networking, and other complex infrastructure that would be downright annoying and frustrating to deploy manually over and over again.

If you want to use Terraform to provision a development environment, consider what should differ from your production setup. You may want to add some variables to, say, create a smaller Cloud SQL instance rather than a production-spec one; with Terraform you can easily create such duplicate setups.

Be aware of the limitations
Infrastructure as Code is good for infrastructure and for deploying existing assets, like containers and compiled code. There are other tools for building your containers, and Terraform is not one of them. In that setup, Terraform can replace the manual deployment step after your containers are built. Alternatively, it can be integrated into your existing automation, for example as a step in your Cloud Build configuration. This works well if you are doing in-house development with continuous deployments. Check out “Managing infrastructure as code with Terraform, Cloud Build, and GitOps” for an example of how to implement this configuration.

For complex deployments, consider using Terraform. Not only will your deployments become more reliable, you can also store your live infrastructure configuration settings alongside your code in source control.

Terraform in practice
For an example deployment, follow Katie and Martin as they deploy a sample cat identification service with Terraform. The application demonstrated in this video and post is available on GitHub in the Serverless-Expeditions repo under the terraform-serverless folder.
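The smaller-development-instance idea mentioned above could be sketched with an input variable; everything here (variable name, tiers, database version, region) is illustrative, not taken from the original post:

```hcl
# Illustrative environment switch; values are examples only.
variable "environment" {
  type    = string
  default = "dev"
}

# Pick a smaller Cloud SQL tier for development,
# and a larger one only for production.
resource "google_sql_database_instance" "main" {
  name             = "app-db-${var.environment}"
  database_version = "POSTGRES_13"
  region           = "us-central1"

  settings {
    tier = var.environment == "prod" ? "db-custom-8-32768" : "db-f1-micro"
  }
}
```

Running `terraform apply -var="environment=prod"` would then produce a production-spec instance from the same manifest.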
In this video, we deploy a Cloud Function and a Cloud Run service, together with a Cloud Storage bucket and various IAM configurations, to get the project up and running swiftly. We then look around the Google Cloud project to see what was created, and try making some changes that are then re-asserted by Terraform. Finally, we destroy all the Terraform-created resources and re-create them, restoring the application.

Learn more:
- https://cloud.google.com/solutions/managing-infrastructure-as-code
- https://registry.terraform.io/providers/hashicorp/google/latest/docs

Check out more Serverless Expeditions
Serverless Expeditions is a fun and cheeky video series that looks at what serverless means and how to build serverless apps with Google Cloud. Follow the hosts on Twitter at @glasnt and @martinomander.

Related article: A new Terraform module for serverless load balancing
With the new optimized Terraform load balancing module, you can now set up load balancing for serverless applications on Cloud Run, App E…

A Google Cloud block storage options cheat sheet

“Where do virtual machines store data so they can access it when they restart?” We need storage that is persistent in nature, and that’s where Persistent Disks come in. Persistent Disk is a high-performance block storage service that uses solid-state drive (SSD) or hard disk drive (HDD) disks. These disks store data in blocks and are attached to compute; in Google Cloud, that means Compute Engine or Google Kubernetes Engine (GKE). You can attach multiple persistent disks to Compute Engine or GKE simultaneously, configure quick, automatic, incremental backups, and resize storage on the fly without disrupting your application.

Types of Block Storage
You can choose the best Persistent Disk option based on your cost and performance requirements:

- Standard PD is HDD-backed and provides standard throughput. Because it is the most cost-effective option, it is best for cost-sensitive applications and scale-out analytics with Hadoop and Kafka.
- Balanced PD is SSD-backed and offers the best price per GB, which makes it a good fit for common workloads such as line-of-business apps, boot disks, and web serving.
- Performance PD is SSD-backed and provides the best price per IOPS (input/output operations per second). It is best suited for performance-sensitive applications such as databases, caches, and scale-out analytics.
- Extreme PD is SSD-backed and optimized for applications with uncompromising performance requirements, such as SAP HANA, Oracle, and the largest in-memory databases.
- Local SSD is recommended if your apps need very low latency. It is best for hot caches that demand top performance for analytics, media rendering, and other use cases that might require scratch space.

How to pick block storage based on availability needs
You can also choose a Persistent Disk based on the availability needs of your app. Use Local SSD if you just need ephemeral storage for a stateless app that manages replication at the application level or database layer.
For most workloads, Persistent Disk is fine; it is durable and supports automated snapshots. But if your app demands even higher availability and is mission critical, there is an option to use a regional persistent disk, which is replicated across zones for near-zero Recovery Point Objective (RPO) and Recovery Time Objective (RTO) values.

Conclusion
Whatever your application use case may be, if you are using a virtual machine or a Google Kubernetes Engine instance, you will be making a block storage choice. Use the pointers in this post to identify the option that works best for your use case. For a more in-depth look at Persistent Disk, check out the documentation. For more #GCPSketchnote, follow the GitHub repo. For similar cloud content, follow me on Twitter @pvergadia and keep an eye on thecloudgirl.dev.
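For readers who manage infrastructure as code (as in the Terraform post above), here is a hedged sketch of how a balanced persistent disk and a regional (cross-zone) disk might be declared; the names, zones, and sizes are illustrative placeholders, not values from this post:

```hcl
# Illustrative only: names, zones, and sizes are placeholders.
# A zonal balanced persistent disk for a typical workload.
resource "google_compute_disk" "app_data" {
  name = "app-data-disk"
  zone = "us-central1-a"
  type = "pd-balanced" # alternatives: pd-standard, pd-ssd, pd-extreme
  size = 100           # size in GB
}

# A regional disk replicated across two zones, for mission-critical
# apps that need near-zero RPO/RTO.
resource "google_compute_region_disk" "app_data_ha" {
  name          = "app-data-regional"
  region        = "us-central1"
  type          = "pd-ssd"
  size          = 100
  replica_zones = ["us-central1-a", "us-central1-b"]
}
```

The `type` argument is where the cost/performance trade-off from the cheat sheet above gets encoded into your deployment.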