Scale your EDA flows: How Google Cloud enables faster verification

Companies embark on modernizing their infrastructure in the cloud for three main reasons: 1) to accelerate product delivery, 2) to reduce system downtime, and 3) to enable innovation. Chip designers with Electronic Design Automation (EDA) workloads share these goals and can benefit greatly from using the cloud. Chip design and manufacturing involves several tools across the flow, with varied compute and memory footprints. Register Transfer Level (RTL) design and modeling is one of the most time-consuming steps in the design process, accounting for more than half the time needed in the entire design cycle. RTL designers use Hardware Description Languages (HDL) such as SystemVerilog and VHDL to create a design, which then goes through a series of tools. Mature RTL verification flows include static analysis (checks for design integrity without the use of test vectors), formal property verification (mathematically proving or falsifying design properties), dynamic simulation (test vector-based simulation of actual designs) and emulation (a complex system that imitates the behavior of the final chip, especially useful for validating the functionality of the software stack).

Dynamic simulation arguably takes up the most compute in any design team's data center. We wanted to create an easy setup using Google Cloud technologies and open-source designs and solutions to showcase three key points:

- How simulation can accelerate with more compute
- How verification teams can benefit from auto-scaling cloud clusters
- How organizations can effectively leverage the elasticity of cloud to build highly utilized technology infrastructure

OpenPiton Tile architecture (a) and Chipset architecture (b). Source: OpenPiton: An Open Source Manycore Research Framework, Balkind et al.

We did this using a variety of tools: the OpenPiton design verification scripts, the Icarus Verilog simulator, the SLURM workload management solution, and Google Cloud standard compute configurations.

OpenPiton is the world's first open-source, general-purpose, multithreaded manycore processor and framework. Developed at Princeton University, it is portable and can scale up to 500 million cores. It's wildly popular within the research community and comes with scripts for performing the typical steps in the design flow, including dynamic simulation, logic synthesis and physical synthesis.

Icarus Verilog, sometimes known as iverilog, is an open-source Verilog simulation and synthesis tool.

Simple Linux Utility for Resource Management, or SLURM, is an open-source, fault-tolerant and highly scalable cluster management and job scheduling system for Linux clusters. SLURM provides functionality such as enabling user access to compute nodes, managing a queue of pending work, and a framework for starting and monitoring jobs. Auto-scaling of a SLURM cluster refers to the capability of the cluster manager to spin up nodes on demand and shut nodes down automatically after jobs are completed.

SLURM components. Source: slurm.schedmd.com/quickstart.html

Setup

We used a very basic reference architecture for the underlying infrastructure. While simple, it was sufficient to achieve our goals. We used standard N1 machines (n1-standard-2 with 2 vCPUs, 7.5 GB memory) and set up the SLURM cluster to auto-scale to 10 compute nodes. The reference architecture is shown here.
All required scripts are provided in this GitHub repo.

Running the OpenPiton regression

The first step in running the OpenPiton regression is to follow the steps outlined in the GitHub repo and complete the process successfully. The next step is to download the design and verification files; instructions are provided in the GitHub repo. Once downloaded, there are three simple setup tasks to perform:

1. Set up the PITON_ROOT environment variable (%export PITON_ROOT=<location of root of OpenPiton extracted files>).
2. Set up the simulator home (%export ICARUS_HOME=/usr). The scripts provided in the GitHub repo already take care of installing Icarus on the machines provisioned, showing yet another advantage of cloud: simplified machine configuration.
3. Source your required settings (%source $PITON_ROOT/piton/piton_settings.bash).

For the verification run, we used the single-tile setup for OpenPiton, the regression script 'sims' provided in the OpenPiton bundle, and the 'tile1_mini' regression. We tried two runs, sequential and parallel; the parallel runs were managed by SLURM.

We invoked the sequential run using the following command:

%sims -sim_type=icv -group=tile1_mini

And the distributed run using this command:

%sims -sim_type=icv -group=tile1_mini -slurm -sim_q_command=sbatch

Results

The 'tile1_mini' regression has 46 tests. Running all 46 tile1_mini tests sequentially took an average of 120 minutes. The parallel run for tile1_mini with 10 auto-scaled SLURM nodes completed in 21 minutes, nearly a 6X improvement.

View of VM instances on the GCP console; node instances edafarm-compute0-<0-9> are created when the regression is launched.

View of VM instances on the GCP console when the regression was winding down; notice that the number of nodes has decreased.

Further, we wanted to highlight the value of autoscaling. The SLURM cluster was set up with two static nodes and 10 dynamic nodes. The dynamic nodes were up and active soon after the distributed run was invoked. Since nodes are shut down when there are no jobs, the cluster auto-scaled back down to zero dynamic nodes after the run was complete. The additional cost of the dynamic nodes for the duration of the simulation was $8.46.

Report generated to view compute utilization of SLURM nodes; notice the high utilization of the top 5 nodes.

The cost of the extra compute can also be easily viewed in the existing reports on the GCP console.

The above example shows a simple regression run with very standard machines. By scaling to more than 10 machines, further improvements in turnaround time can be achieved. In real life, it is common for project teams to run millions of simulations. With access to elastic compute capacity, you can dramatically shorten the verification process and shave months off verification sign-off.

Other considerations

Typical simulation environments use commercial simulators that extensively leverage multi-core machines and large compute farms. On Google Cloud infrastructure, it's possible to build many different machine types (often referred to as "shapes") with various numbers of cores, disk types, and memory. Further, while a simulation run only tells you whether the simulator completed successfully, verification teams have the subsequent task of validating the results of a simulation. Elaborate infrastructure that captures the simulation results across simulation runs, and provides follow-up tasks based on findings, is an integral part of the overall verification process.
You can use Google Cloud solutions such as Cloud SQL and Bigtable to create a high-performance, highly scalable and fault-tolerant simulation and verification environment. Further, you can use solutions such as AutoML Tables to infuse ML into your verification flows.

Interested? Try it out!

All the required scripts are publicly available, and no cloud experience is necessary to try them out. Google Cloud provides everything you need, including free Google Cloud credits to get you up and running. Click here to learn more about high performance computing (HPC) on Google Cloud.

Related article: HPC made easy: Announcing new features for Slurm on GCP. Scheduling HPC workloads on GCP just got easier, with new integrations to the Slurm HPC workload manager. Read Article
Source: Google Cloud Platform

You asked, we listened—more productivity features to close out Spanner’s year

Cloud Spanner has had a busy year. We've rolled out more enterprise features, including managed backup and restore, a local emulator, and numerous multi-region configurations. It's easier than ever to test and deploy applications across multiple regions without manual sharding, downtime, or patching, and with an industry-leading SLA of 99.999%.

In addition to enterprise features, we continued our focus on making developers more productive. Earlier this year we launched foreign keys, a C++ client library, the query optimizer, and various introspection features. In this post, we'll discuss three recently launched Spanner features: check constraints, generated columns, and the NUMERIC data type. We launched these features to boost your productivity when building an application on Spanner. Read on for a brief description of each feature, and an example of how check constraints and generated columns can be combined to add data integrity to your application.

Check constraints

A check constraint allows you to specify that the values of one or more columns must satisfy a boolean expression. With check constraints, you can specify predicates (boolean expressions) on a table and require that all rows in the table satisfy those predicates. For example, you can require that the end time of a concert is later than its start time: a "start_before_end" constraint requires the value of the StartTime column to be less than EndTime, otherwise any insert or update of a row will fail. Check constraints join foreign keys, NOT NULL constraints, and UNIQUE indexes as methods to add integrity constraints to your database.

Generated columns

A generated column is a column whose value is computed from other columns in the same row. This is useful for pushing critical data logic down into the database instead of relying on the application layer. Generated columns can make a query simpler or save the cost of evaluating an expression at query time. Like other columns, they can also be indexed or used as a foreign key. For example, you can create a generated column that concatenates the first and last name fields into a new column: the value of "FullName" is computed when a new row is inserted or when "FirstName" and/or "LastName" is updated for an existing row. The computed value is stored and accessed the same way as other columns in the table, but can't be updated on its own.

NUMERIC data type

Customer requests are a crucial part of how we prioritize features, and the NUMERIC data type was a common request. NUMERIC provides high precision, useful in industries and functions such as financial, scientific, or engineering applications, where a precision of 30 digits or more is commonly required. Spanner's NUMERIC has a precision of 38 and a scale of 9, meaning it can store a number with a total of 38 digits, 9 of which can be fractional (i.e., to the right of the decimal). When you need to store an arbitrary-precision number in a Spanner database, and you need more precision than NUMERIC provides, we recommend that you store the value as its decimal representation in a STRING column.

Using new Spanner features in an example use case

In the following example, we look at an ecommerce application and see how check constraints, generated columns, and basic indexing can help improve application performance and reliability. A common pattern for an ecommerce site might be creating a case-insensitive search for a product.
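Before walking through the ecommerce example, here is a minimal sketch of what DDL using these three features might look like in Spanner. The table and column names (Concerts, Singers, TicketPrice, and so on) are illustrative assumptions, not the exact snippets from the original post:

  CREATE TABLE Concerts (
    ConcertId   INT64 NOT NULL,
    VenueName   STRING(MAX),
    StartTime   TIMESTAMP,
    EndTime     TIMESTAMP,
    -- NUMERIC column: up to 38 digits of precision, 9 of which can be fractional.
    TicketPrice NUMERIC,
    -- Check constraint: reject any row whose start time is not before its end time.
    CONSTRAINT start_before_end CHECK (StartTime < EndTime)
  ) PRIMARY KEY (ConcertId);

  CREATE TABLE Singers (
    SingerId  INT64 NOT NULL,
    FirstName STRING(1024),
    LastName  STRING(1024),
    -- Generated column: computed from FirstName and LastName and stored with the row.
    FullName  STRING(MAX) AS (ARRAY_TO_STRING([FirstName, LastName], " ")) STORED
  ) PRIMARY KEY (SingerId);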
In many cases, product names are stored case-sensitively, and capitalization practices can vary greatly. To improve the performance of searching over product names, we can create a generated column that converts the product name to uppercase and then create an index on it. The product table's schema then includes a generated column that stores the product name in all capital letters. This generated column is stored like any other column in the table and can be indexed for faster lookups. To search for a product name, convert the search term into uppercase first, and instruct Spanner to use the index we just created. Now when we query over the product name, we get fast, consistent results regardless of how the product is capitalized in the listing.

An ecommerce company may also keep a separate pricing table for its products. A few simple check constraints can make sure that product prices don't enter an invalid state and that discount programs don't overlap. For example, a check constraint can ensure that a product price can never be negative. The pricing table also has two columns for discounts: a seasonal discount that applies during certain seasonal holidays, and another that is provided for new customer registrations. Here we can add a check constraint to the existing table to ensure that only one discount is active at a time. (A sketch of the corresponding DDL appears at the end of this post.)

For many customers, check constraints, generated columns, and the NUMERIC data type will be welcome additions to the toolbox for defining schemas and creating high-performing applications. We hope that this year's launches have made it easier to build, test, and deploy your applications using Spanner as your globally consistent database. We look forward to your feedback, and stay tuned for a busy 2021.

Learn more

To get started with Spanner, create an instance or try it out with a Spanner Qwiklab.

Related article: Automatically right-size Spanner instances with the new Autoscaler. From the Google Cloud blog: new Autoscaler lets you scale Spanner instances up and down easily to optimize costs and usage based on utili… Read Article
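For reference, here is a minimal sketch of the ecommerce schema, index, query, and constraints described above. The table, column, and index names (Products, ProductNameUpper, Pricing, and so on) and the @search_term parameter are illustrative assumptions, not the exact statements from the original post:

  CREATE TABLE Products (
    ProductId        INT64 NOT NULL,
    ProductName      STRING(MAX),
    -- Generated column holding the uppercase product name for case-insensitive search.
    ProductNameUpper STRING(MAX) AS (UPPER(ProductName)) STORED
  ) PRIMARY KEY (ProductId);

  -- Index the generated column for faster lookups.
  CREATE INDEX ProductNameUpperIndex ON Products(ProductNameUpper);

  -- Uppercase the search term and instruct Spanner to use the index.
  SELECT ProductId, ProductName
  FROM Products@{FORCE_INDEX=ProductNameUpperIndex}
  WHERE ProductNameUpper = UPPER(@search_term);

  CREATE TABLE Pricing (
    ProductId                 INT64 NOT NULL,
    Price                     NUMERIC,
    SeasonalDiscountActive    BOOL,
    NewCustomerDiscountActive BOOL,
    -- A product price can never be negative.
    CONSTRAINT price_not_negative CHECK (Price >= 0)
  ) PRIMARY KEY (ProductId);

  -- Add a constraint to the existing table: only one discount may be active at a time.
  ALTER TABLE Pricing
    ADD CONSTRAINT one_discount_at_a_time
    CHECK (NOT (SeasonalDiscountActive AND NewCustomerDiscountActive));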
Source: Google Cloud Platform

Google Cloud’s no-code year in review

At the start of 2020, Google Cloud set out to reimagine the application development space by acquiring AppSheet, an intelligent no-code application development platform that equips both IT and line-of-business users with the tools they need to quickly build apps and automation without writing any code. In the months that followed, we've experienced change, growth and a few surprises along the way. Let's take a look back at 2020 and examine how AppSheet has helped organizations and individuals across the globe create new ways to work.

Responding to the COVID-19 pandemic

In retrospect, the timing of the AppSheet acquisition, which happened right as the pandemic's impact was becoming better understood, placed Google Cloud in a unique position to support individuals and organizations responding to the crisis. People all around the world, many of whom had no experience writing code, built powerful applications on the AppSheet platform that helped their organizations and communities respond in these uncertain times:

- USMEDIC, a provider of comprehensive equipment maintenance solutions to healthcare and medical research communities, built a medical equipment tracking and management solution to support various healthcare organizations, including overrun hospitals struggling to locate equipment.
- The Mthunzi Network, a not-for-profit organization that distributes aid to vulnerable populations, built an easy-to-use app to automate the distribution and redemption of digital food vouchers.
- The AppSheet Community at large rallied around a particular app that was created for local communities to organize their efforts to support those in need. This single app was built in a matter of days and translated into over 100 languages to make support accessible for anyone who needed it.

It has been humbling and inspiring to witness how no-code app creators have risen to this year's many challenges. As the issues surrounding the pandemic continue, we are extending AppSheet's COVID-19 support through June 2021.

Reimagining work

History has demonstrated that innovation is born from necessity. The Gutenberg press, for example, rose to prominence in the wake of the plague, driven by both social and cultural demands. So too has 2020 provided the ultimate forcing function to accelerate digital innovation. It has forced organizations to reimagine collaboration, productivity, and success, demanding that everyone, not just IT, find new ways to get things done.

For example, Globe Telecom, a leading mobile network provider in the Philippines, adopted AppSheet to accelerate application development. In June, the company announced a no-code hackathon open to all teams, originally planned in 2019 as an in-person event but changed in the wake of the pandemic to an online-only event. Despite the change, organizers were surprised when over 100 teams entered the hackathon, a signal that employees across the organization had an appetite to contribute to the company's culture of innovation.

The winning team created an app that reports illegal signal boosting. The app captures field data and, if the data shows malfeasance, it triggers automated reports that alert the right employees to handle the problem, reducing the reporting time from two days to two hours and enabling faster resolution for reported incidents.

We also saw app creators at small businesses and universities build useful no-code solutions with AppSheet.
A fifth-generation family business operator created a customer retention app and an inventory management app for his jewelry store. An event coordinator built multiple apps to manage registration and logistics for his company's world-class athletic racing events. A medical student built a flashcard app with a little extra customization and functionality he couldn't find elsewhere.

Preparing for the future

On our end, we've worked tirelessly to improve the platform, with nearly 200 releases this year. We've made great strides in making AppSheet easier to use for even more users:

- The platform's integrations with Google Workspace, as well as AppSheet's inclusion in Google Workspace enterprise SKUs, allow people to redefine tasks and processes. They also add more governance control, boosting AppSheet's ability to accelerate innovation while avoiding the risks of shadow IT.
- Easy-to-use app templates help people get started faster and incorporate Google Workspace functionality into their AppSheet-powered apps.
- Customization features such as Color Picker give app builders more control over their apps.
- With new connectors, like the Apigee API connector, app creators can link AppSheet to new data sources, opening up a new realm of possibilities.

Finally, we would be remiss if we didn't mention AppSheet capabilities that we announced in September at Google Cloud Next '20 OnAir, such as the Apigee data source for AppSheet, which lets AppSheet users harness Apigee APIs, and AppSheet Automation, which offers a natural language interface and contextual recommendations that let users automate business processes. These efforts, combined with the ongoing integration of Google technologies into AppSheet, give the platform an even better understanding of an app creator's intent, through a more human-centric approach that makes it easier than ever to build apps without writing any code.

While 2020 has been a challenging year for everyone, we're proud of what we've accomplished. At Google Cloud, we will continue to support the transformative solutions created by citizen developers, people who, because they don't have traditional coding abilities, may otherwise not have been able to build apps. We look forward to seeing what you build in 2021!
Source: Google Cloud Platform

Rethinking business resilience with Google Cloud

During my last 20 years of IT projects and consultancy across enterprise businesses, I have seen recurring patterns being implemented to deliver a resilient IT strategy. Challenges across security, operations, innovation, data strategy, and insights are very common for enterprises looking to modernize. 2020 has so far proven to be a very challenging year for everyone, and enterprises want to plan, architect, and build resilient strategies to keep their business successful. My job as an Enterprise Cloud Architect at Google Cloud is to help enterprises on this journey, and I have collected different critical success factors to navigate such challenging times. In this article, I share my experience, those critical success factors, and the methodology I developed to achieve a resilient IT strategy through Google Cloud.

What is a Resilient IT Strategy (RIS)?

Achieving business resilience, one of the main asks from management boards, means keeping your business successful both during these difficult times and in the future. This means protecting employees, keeping the core business running, and accelerating your digital transformation. I've grouped the success factors (or pillars) needed to achieve business resilience into the following methodology that I call RIS: Resilient IT Strategy.

RIS-Model: How to build a Resilient IT Strategy with Google Cloud

The "house of resiliency" helps reimagine a business model to achieve business resilience. The foundation is laid out at the bottom of the house with "keep the lights on," the first of the six pillars of the RIS-Model. The vertical pillars revolve around best practices on security, innovation, resiliency, and data strategy. The last pillar, shown on top, is "deliver actionable data insights." Let's go through each of the pillars and see how Google Cloud can help solve these challenges.

1. Keep the lights on

IT is not only a business enabler but also a business driver, and IT operations truly is the "engine" of your business. This is why "keeping the lights on" with IT operations is such an important foundation to our house of resiliency: without it, your business engine stops. In this section, I would like to share how enterprises can achieve that goal when using Google Cloud.

The first step is to follow our best practices for enterprise organizations. This article helps enterprise architects and technology stakeholders plan a solid foundation across identity and access management, networking, security, cloud architecture, billing, and management. To expand on the foundation, our Solving for Operational Efficiency with Google Cloud whitepaper is full of insights on how to optimize IT operations. It also highlights how to optimize costs and plan for a more agile and scalable future.

2. Protect your organization

Legacy enterprises that are used to monolithic applications and are starting their cloud journey are often challenged by different aspects of security in the cloud: application security, secure access to internal apps, or security analytics. Protecting the organization, and detecting and stopping threats, are always a priority. In this pillar, I would like to share a few modern approaches that can make a difference. We at Google Cloud believe they can significantly help make it easier to defend your business. Let's go over three areas where cloud technologies can be transformative:
1. Security analytics and operations

Analytics and operations is a security area that presents significant challenges, as many enterprises don't have the ability to store and analyze large amounts of security telemetry. The goal is to reduce the cost of analyzing and storing this data and to increase the speed of gaining insight from it. Cost is often a reason why enterprises limit the retention of their security telemetry data, reducing their ability to analyze and detect threats. Our solution based on Chronicle and VirusTotal offers painless scalability combined with intelligent identification. The result is that you can analyze many risks at the speed of Google search while reducing cost. Read this Chronicle customer case study of a global healthcare company to learn more.

2. Application security

Protecting users and applications requires constantly updated knowledge of new security threats. Our Application Security solution, which includes reCAPTCHA Enterprise and our Web Risk product, relies on Google's years of experience in defending our own services. This means you can better protect your apps from fraud and abuse with an enterprise solution that easily integrates into sites or mobile apps, in the cloud or on-premises.

3. BeyondCorp Remote Access

VPNs were not intended to be used for always-on remote access, and in these times where remote work is the norm, they can be a significant productivity sink for end users and IT admins. BeyondCorp Remote Access delivers simple and secure access to internal apps for your employees and extended workforce without the use of a VPN. Learn more about BeyondCorp, the zero-trust security model, and how Airbnb implemented it using Google Cloud in this video.

3. Drive digital innovation

Innovation has always been a key driver to accelerate growth and prevent future disruptions. Driving digital innovation is the third pillar of our RIS-Model. Unfortunately, innovation is often done in iterative cycles without taking the entire IT strategy into account. I believe that innovation is a holistic topic that must be driven consciously from the top of the enterprise and by the architecture teams. The goal is to let the business units drive innovation on behalf of their customers. That's why digital innovation at Google is part of our DNA. You can find out how Google Cloud helps you solve for innovation here.

This third pillar lets you build business capabilities and services to remain relevant in the future. Google Cloud is tackling these challenges by focusing on reimagining how to manage apps and infrastructure with an agile and open architecture. We introduced the new Google Cloud Application Modernization Program (CAMP) during our recent Next conference. It is based on our experience of driving application delivery at speed and scale, and relies on the principles developed by the DevOps Research and Assessment (DORA) team, which are based on six years of research and data from over 31,000 professionals. Read more about CAMP and how to get to the future faster.

4. Architect for resiliency

The fourth pillar of the RIS-Model is designing IT architectures that are flexible, resilient, and scalable. In other words, part of this mission means building flexibility into your architecture, staying agile, and adopting an open cloud architecture in order to respond to change. Furthermore, architect for scalable and resilient solutions across your organization to build resilient systems.
Scalability, to adjust capacity to meet demand, and resiliency, to withstand failures, are two essential goals within this pillar. Google Cloud has published patterns and practices for scalable and resilient apps that help you architect with these principles in mind.

In addition, Google has built the Site Reliability Engineering (SRE) methodology through years of running global apps serving billions of users. You can see SRE as an implementation of DevOps principles in your organization. Culture has always been a key part of the DevOps, Agile, and Lean movements, driving both software delivery performance and organizational performance. We found that SRE principles like "Accept Failure as Normal," "Reduce Organisational Silos," "Implement Gradual Change," "Leverage Tooling & Automation," and "Measure Everything" provide a common theme for working across organizations. These principles also support a culture and a way to work. Read more on how to start and assess your journey for SRE.

5. Update your data strategy

A clear data management strategy that generates business value is a critical success factor for any business. Enterprise businesses today often struggle to achieve this goal because CRM, ERP, consumer, and customer data often reside in data silos. That's why the fifth pillar of our RIS-Model is about your data strategy and how you can update and build it for the modern era. Each customer's journey is different and, with that, so is the data strategy that you define. Enterprise leaders are keen to find new ways to use data to improve customer experiences, find new business opportunities, and generate unmatched value and competitive advantage. They are looking for opportunities to embrace technologies like open source and multi-cloud to avoid vendor lock-in.

Google Cloud introduced Anthos, a new platform for managing applications in today's multi-cloud world. However, if executed poorly, a multi-cloud strategy increases the data silos problem. For this reason, you should strive for a consistent data experience across clouds. A modern multi-cloud data strategy is needed to let you extract value from data regardless of its type and size, and independently of where it is stored. This ability brings the power of analytics to where your data lives, in different public clouds (Google Cloud, Amazon Web Services, and Azure), and provides a single pane of glass for your data. In a multi-cloud world, businesses need to break down data silos without moving data between providers and gain critical business insights seamlessly across clouds. Google Cloud recently introduced BigQuery Omni to help you build this pillar: it lets you query your data in Google Cloud, AWS and Azure (coming soon).

Here's a step-by-step guide that provides insights on how to update your data strategy. The report shares different aspects of how to expand your data strategy, along with customer case studies that complement this pillar.

6. Deliver actionable predictive insights

Unlocking data's value is every company's crucial and challenging mission, and it is covered in the sixth pillar of the RIS-Model. Leveraging data, becoming data-driven, and empowering predictive analytics: these are the goals of many enterprise businesses. The previous pillars highlighted the main steps required to build a good framework to really benefit from all the value that lives inside your data. This pillar is about gaining insights from your business, partners, and customers.
By using artificial intelligence (AI) and machine learning on Google Cloud, you can implement many different use cases to solve your business pain points and deliver business value to your customers. With that, you can gain real-time insights that improve decision-making and accelerate innovation, in order to reimagine your business. For example, learn how the State of Illinois is using AI to help residents who lost their jobs because of COVID-19 file hundreds of thousands of unemployment claims.

To summarize, the methodology helps you navigate these difficult times by focusing your work on a resilient IT strategy through Google Cloud. The six pillars represent disciplines that you need to build, maintain, and evolve to transform your business across the key success factors and keep it successful. Ultimately, the approach is a guideline for business and IT stakeholders, which should return stability, value, and growth on your journey with Google Cloud.

Thank you to Théo Chamley, Solutions Architect at Google Cloud, for his help on this article.

Related article: 4 steps to a successful cloud migration. Download this white paper to help guide your migration to Google Cloud. Read Article
Source: Google Cloud Platform

Application rationalization: Why you need to take this first step in your migration journey

Application rationalization is the process of going over the application inventory to determine which applications should be retired, retained, rehosted, replatformed, refactored or reimagined. This is an important process for every enterprise when making investment or divestment decisions. Application rationalization is critical for maintaining the overall hygiene of the app portfolio irrespective of where you are running your applications, whether in the cloud or not. However, if you are looking to migrate to the cloud, it serves as the first step towards a cloud adoption or migration journey. In this blog we will explore drivers and challenges while providing a step-by-step process to rationalize and modernize your application portfolio. This is also the first blog post in a series that we will publish on the app rationalization and modernization topic.

There are several drivers for application rationalization for organizations, mostly centered around reducing redundancies, paying down technical debt, and getting a handle on growing costs. Some specific examples include:

- Enterprises going through M&A (mergers and acquisitions), which introduces the applications and services of a newly acquired business, many of which may duplicate those already in place.
- Siloed lines of business independently purchasing software that exists outside the scrutiny and control of the IT organization.
- Embarking on a digital transformation and revisiting existing investments with an eye towards operational improvements and lower maintenance costs. See the CIO guide for app modernization to maximize business value and minimize risk.

What are the challenges associated with application rationalization? We see a few:

- Sheer complexity and sprawl can limit visibility, making it difficult to see where duplication is happening across a vast application portfolio.
- Zombie applications exist! There are often applications running simply because retirement plans were never fully executed or completed successfully.
- Unavailability of an up-to-date application inventory. Are newer applications and cloud services accounted for?
- Even if you know where all your applications are, and what they do, you may be missing a formal decisioning model or heuristics to decide the best approach for a given application.
- Without proper upfront planning and goal setting, it can be tough to measure ROI and TCO of the whole effort, leading to multiple initiatives getting abandoned midway through the transformation process.

Taking an application inventory

Before we go any further on app rationalization, let's define application inventory. An application inventory is a catalog of all applications that exist in the organization. It has all relevant information about the applications, such as business capabilities, application owners, workload categories (e.g., business critical, internal, etc.), technology stacks, dependencies, MTTR (mean time to recovery), contacts, and more. Having an authoritative application inventory is critical for IT leaders to make informed decisions and rationalize the application portfolio. If you don't have an inventory of your apps, don't despair: start with a discovery process and catalog all the app inventory, assets, and repos in one place. The key to successful application rationalization and modernization is approaching it like an engineering problem: crawl, walk, run, with an iterative process and a feedback loop for continuous improvement.
Create a blueprint

A key concept in application rationalization and modernization is figuring out the right blueprint for each application:

- Retain: Keep the application as is, i.e., host it in the current environment.
- Retire: Decommission the application and the compute at the source.
- Rehost: Migrate it to similar compute elsewhere.
- Replatform: Upgrade the application and re-install it on the target.
- Refactor: Make changes to the application to move towards cloud-native traits.
- Reimagine: Re-architect and rewrite.

6 steps to application modernization

The six-step process outlined below is a structured, iterative approach to application modernization. Steps 1-3 cover the application rationalization aspects of the modernization journey.

Fig. Application modernization process, including steps 1-3 for app rationalization

Step 1: Discover—Gather the data

Data is the foundation of the app rationalization process. Gather app inventory data for all your apps in a consistent way across the board. If you have multiple formats of data across lines of business, you may need to normalize the data. Typically, some form of (albeit outdated) app inventory can be found in CMDB databases or IT spreadsheets. If you do not have an application inventory in your organization, then you need to build one, either in an automated way or manually. For automated app discovery there are tools you can use such as StratoZone and the M4A Linux and Windows assessment tools; APM tools such as Splunk, Dynatrace, New Relic, and AppDynamics may also be helpful to get you started. App assessment tools specific to workloads, like the WebSphere Application Migration Toolkit, the Red Hat Migration Toolkit for Applications, the VMware cloud suitability analyzer, and the .NET Portability Analyzer, can help paint a picture of technical quality across the infrastructure and application layers. As a bonus, similar rationalization can be done at the data, infrastructure, and mainframe tiers too. Watch this space.

Fig. Discovery process

At Google, we think of problems as software first and automate across the board (SRE thinking). If you can build an automated discovery process for your infrastructure, applications, and data, it helps track and assess the state of the app modernization program systematically over the long run. Instrumenting the app rationalization program with DORA metrics enables organizations to measure engineering efficiency and optimize the velocity of software development by focusing on performance.

Step 2: Create cohorts—Group applications

Once you have the application inventory, categorize applications based on value and effort. Applications with low effort (e.g., stateless applications, microservices, or applications with simple dependencies) and high business value give you the first-wave candidates to modernize or migrate.

Fig. Creating and mapping cohorts

Step 3: Map out the modernization journey

For each application, understand its current state to map it to the right destination on its cloud journey. For each application type, we outline the set of possible modernization paths below. Watch out for more content in this section in upcoming blogs.

Not cloud ready (Retain, Rehost, Reimagine): These are typically monolithic, legacy applications which run on a VM, take a long time to restart, and are not horizontally scalable. These applications sometimes depend on the host configuration and may also require elevated privileges.

Container ready (Rehost, Refactor and Replatform): These applications can restart, have readiness and liveness probes, and log to stdout.
These applications can be easily containerized.

Cloud compatible (Replatform): In addition to being container ready, these applications typically have externalized configuration, secret management, and good observability baked in. The apps can also scale horizontally.

Cloud friendly: These apps are stateless, can be disposed of, have no session affinity, and expose metrics using an exporter.

Cloud native: These apps are API-first and easy to integrate with cloud authentication and authorization. They can scale to zero and run in serverless runtimes.

The picture below shows where each of these categories lands on the modernization journey and a recommended way to start modernizing. This will drive your cloud migration journey, e.g., lift and shift, move and improve, etc.

Fig. Application migration/modernization map

Once you have reached this stage, you have established a migration or change path for your applications. It is useful to think of this transition to cloud as a journey; an application can go through multiple rounds of migration and modernization, or vice versa, as different layers of abstraction become available after every migration or modernization activity.

Step 4: Plan and Execute

At this stage you have gathered enough data about the first wave of applications. You are ready to put together an execution plan along with the engineering, DevOps, and operations/SRE teams. Google Cloud offers solutions for modernizing applications; one such example for Java is here. At the end of this phase, you will have the following (not an exhaustive list):

- An experienced team who can run and maintain the production workloads in the cloud
- Recipes for app transformation and repeatable CI/CD patterns
- A security blueprint and data (in transit and at rest) guidelines
- Application telemetry (logging, metrics, alerts, etc.) and monitoring
- Apps running in the cloud, plus old apps turned off, realizing infrastructure and license savings
- A runbook for day 2 operations
- A runbook for incident management

Step 5: Assess ROI

ROI calculations include a combination of:

- Direct costs: hardware, software, operations, and administration
- Indirect costs: end-user operations and downtime

It is best to capture the current (as-is) ROI and the projected ROI after the modernization effort. Ideally this lives in a dashboard and is tracked with metrics that are collected continuously as applications flow across environments to prod and savings are realized. The Google CAMP program puts in place a data-driven assessment and benchmarking, and brings together a tailored set of technical, process, measurement, and cultural practices along with solutions and recommendations to measure and realize the desired savings.

Step 6: Rinse and Repeat

Capture the feedback from going over the app rationalization steps and repeat for the rest of your applications to modernize your application portfolio. With each subsequent iteration it is critical to measure key results and set goals to create a self-propelling, self-improving flywheel of app rationalization.

Summary

App rationalization is not a complicated process. It is a data-driven, agile, continuous process that can be implemented and communicated within the organization with executive support.
Stay tuned: as a next step, we will be publishing a series of blog posts detailing each step in the application rationalization and modernization journey and how Google Cloud can help.

Related article: Google Cloud Application Modernization Program: Get to the future faster. The Google Cloud App Modernization Program (CAMP) can help you accelerate your modernization process and adopt DevOps best practices. Read Article
Source: Google Cloud Platform

Unlocking the mystery of stronger security key management

One of the "classic" data security mistakes involving encryption is encrypting the data and failing to secure the encryption key. To make matters worse, a sadly common issue is leaving the key "close" to the data, such as in the same database or on the same system as the encrypted files. Such practices reportedly were a contributing factor in some prominent data breaches. Sometimes, an investigation revealed that encryption was implemented for compliance and without clear threat model thinking; key management was an afterthought or not even considered.

One could argue that the key must be better protected than the data it encrypts (or, more generally, that the key has to have stronger controls on it than the data it protects). If the key is stored close to the data, the implication is that the controls that secure the key are not, in fact, better.

Regulations do offer guidance on key management, but few give precise advice on where to keep the encryption keys relative to the encrypted data. Keeping the keys "far" from data is obviously a good security practice, but one that is sadly misunderstood by many organizations. How do you even measure "far" in IT land?

Now, let's add cloud computing to the equation. One particular line of thinking that emerged in recent years was: "just like you cannot keep the key in the same database, you cannot keep it in the same cloud." The expected reaction here is that half of readers will say "Obviously!" while the other half may say "What? That's crazy!" This is exactly why this is a great topic for analysis!

Now, first, let's point out the obvious: there is no "the cloud." And, no, this is not about the popular saying about it being "somebody else's computer." Here we are talking about the lack of anything monolithic that is called "the cloud."

For example, when we encrypt data at rest, there is a range of key management options. In fact, we always use our default encryption and store keys securely (versus specific threat models and requirements) and transparently. You can read about it in detail in this paper. What you will notice, however, is that keys are always separated from encrypted data by many, many boundaries of many different types. For example, in application development, a common best practice is keeping your keys in a separate project from your workloads. This introduces additional boundaries such as network, identity, configuration, service, and likely other boundaries as well. The point is that keeping your keys "in the same cloud" does not necessarily mean you are making the same mistake as keeping your keys in the same database… except for a few special cases where it does (these are discussed below).

In addition, cloud introduces a new dimension to the risk of keeping the key "close to" the data: where the key is stored physically versus who controls the key. For example, is the key close to the data if it is located inside a secure hardware device (i.e., an HSM) that sits on the same network (or in the same cloud data center) as the data? Or is the key close to the data if it is located inside a system in another country, but people with credentials to access the data can also access the key with them? This also raises the question of who is ultimately responsible if the key is compromised, which complicates the matter even more.
All these raise interesting dimensions to explore. Finally, keep in mind that most of the discussion here focuses on data at rest (and perhaps a bit on data in transit, but not on data in use).

Risks

Now that we understand that the concept of "in the same cloud" is nuanced, let's look at the risks and requirements that are driving behavior regarding encryption key storage. Before we start, note that if you have a poorly architected on-premises application that stores the keys in the same database or on the same disk as your encrypted data, and this application is migrated to the cloud, the problem of course migrates to the cloud as well. The solution to this challenge can be to use cloud-native key management mechanisms (and, yes, that does involve changing the application). That said, here are some of the relevant risks and issues:

Human error: First, one very visible risk is of course non-malicious human error leading to key disclosure, loss, theft, etc. Think developer mistakes, use of a poor source of entropy, misconfigured or loose permissions, and so on. There is nothing cloud-specific about these, but their impact tends to be more damaging in the public cloud. In theory, cloud provider mistakes leading to potential key disclosure are in this bucket as well.

External attacker: Second, key theft by an external attacker is also a challenge dating back to the pre-cloud era. Top-tier actors have been known to attack key management systems (KMS) to gain wider access to data. They also know how to access and read application logs as well as observe application network traffic, all of which may provide hints as to where keys are located. Instinctively, many security professionals who gained most of their experience before the cloud feel better about a KMS sitting behind layers of firewalls. External attackers tend to find the above-mentioned human errors and turn these weaknesses into compromises.

Insider threat: Third, and this is where things get interesting: what about the insiders? Cloud computing implies two different insider models: insiders from the cloud user organization and those from the cloud provider. While much of the public attention focuses on CSP insiders, it's the customer insider who usually has the valid credentials to access the data. While some CSP employees could (theoretically, and subject to many security controls, with massive collusion levels needed) access the data, it is the cloud customers' insiders who actually have direct access to their data in the cloud via valid credentials. From a threat modeling perspective, most bad actors will find the weakest link, probably at the cloud user organization, to exploit first before exerting more effort.

Compliance: Fourth, there may be mandates and regulations that prescribe key handling in a particular manner. Many of them predate cloud computing, hence they will not offer explicit guidance for the cloud case. It is useful to differentiate explicit requirements, implied requirements, and what can be called "interpreted" or internal requirements. For example, an organization may have a policy to always keep encryption keys in a particular system, secured in a particular manner. Such internal policies may have been in place for years, and their exact risk-based origin is often hard to trace because that origin may be decades old.
In fact, complex, often legacy, security systems and practices might actually be made simpler (and more comprehensible) with the more modern techniques afforded by cloud computing resources and practices. Furthermore, some global enterprises may have been subject to a legal matter settled and sealed with a state or government entity, separate from any type of regulatory compliance activity. In these cases, the obligations might require technical safeguards that cannot be broadly shared within the organization.

Data sovereignty: Finally, and this is where things rapidly veer outside of the digital domain, there are risks that sit outside of the cybersecurity realm. These may be connected to various issues of data sovereignty and digital sovereignty, and even geopolitical risks. To make this short, it does not matter whether these risks are real or perceived (or whether merely holding the key would ultimately prevent such a disclosure). They do drive requirements for direct control of the encryption keys. For example, it was reported that fear of "blind or third party subpoenas" has been driving some organizations' data security decisions.

Are these five risks above "real"? Does it matter, if the risks are not real, but an organization plans to act as if they are? And if an organization were to take them seriously, what architectural choices do they have?

Architectures and Approaches

First, a sweeping statement: modern cloud architectures actually make some of these encryption mistakes less likely to be committed. If a particular user role has no access to cloud KMS, there is no way to "accidentally" get the keys (equivalent to finding them on disk in a shared directory, for example). In fact, identity does serve as a strong boundary in the cloud. It is notable that trusting, say, a firewall (network boundary) more than a well-designed authentication system (identity boundary) is a relic of pre-cloud times. Moreover, cloud access control, or cloud logs of each time a key is used, how, and by whom, may be better security than most on-prem environments could aspire to.

Cloud Encryption Keys Stored in Software-Based Systems

For example, if there is a need to apply specific key management practices (internal compliance, risks, location, revocation, etc.), one can use Google Cloud KMS with CMEK. Now, taking the broad definition, the key is in the same cloud (Google Cloud), but the key is definitely not in the same place as the data (see the details of how the keys are stored). People who can get to the data (such as via valid credentials for data access, i.e., client insiders) cannot get to the key unless they have specific permissions to access KMS (identity serves as a strong boundary). So, no app developer can accidentally get the keys or design the app with embedded keys. This addresses most of the above risks but, quite obviously, does not address some of them. Note that while the cloud customer does not control the safeguards separating the keys from data, they can read up on them.

Cloud Encryption Keys Stored in Hardware-Based Systems

Next, if there is a need to make sure a human cannot get to the key, no matter what their account permissions are, Cloud HSM is a way to store keys inside a hardware device. In this case, the boundary that separates keys from data is not just identity, but the security characteristics of a hardware device and all the validated security controls applied to and around the device location. This addresses nearly all of the above risks, but does not address all of them.
It also incurs some costs and possible frictions. Here, too, although the cloud customer can request assurance of the use of a hardware security device and other controls, the cloud customer does not control the safeguards separating the keys from data, still relying on trust in the cloud service provider's handling of the hardware. So, although access to the key material is more restricted with HSM keys than with software keys, access to the use of the keys is not inherently more secure. Also, the key inside an HSM hosted by the provider is seen as being under the logical or physical control of the cloud provider, hence not fitting the true Hold Your Own Key (HYOK) requirement in letter or spirit.

Cloud Encryption Keys Stored Outside Provider Infrastructure

Finally, there is a way to actually address the risks above, including the last item related to geopolitical issues. The decision is simply to practice Hold Your Own Key (HYOK), implemented using technologies such as Google Cloud External Key Manager (EKM). In this scenario, provider bugs, mistakes, external attacks on provider networks, and cloud provider insiders don't matter, as the key never appears there. A cloud provider cannot disclose the encryption key to anybody because they do not have it. This addresses all of the above risks, but incurs some costs and possible frictions. Here, the cloud customer controls the safeguards separating the keys from data, and can request assurance of how the EKM technology is implemented. Naturally, this approach is critically different from any other approach, as even customer-managed HSM devices located at the cloud provider data center do not provide the same level of assurance.

Key takeaways

- There is no blanket ban on keeping keys with the same cloud provider as your data or "in the same cloud." The very concept of "key in the same cloud" is nuanced and needs to be reviewed in light of your regulations and threat models; some risks may be new, but some will be wholly mitigated by a move to cloud. Review your risks, risk tolerances, and the motivations that drive your key management decisions.
- Consider taking an inventory of your keys and note how far from or close to your data they are. More generally, are they better protected than the data? Do the protections match the threat model you have in mind? If new potential threats are uncovered, deploy the necessary controls in the environment.
- Advantages of key management using Google Cloud KMS include comprehensive and consistent IAM, policy, access justification, and logging, as well as likely higher agility for projects that use cloud-native technologies. So, use your cloud provider's KMS for most situations that don't call for externalized trust.
- Cases where you do need to keep keys off the cloud are clearly specified by regulation or business requirements; a set of common situations for this will be discussed in the next blog. Stay tuned!

Related article: The cloud trust paradox: To trust cloud computing more, you need the ability to trust it less. Cloud providers should build technologies that allow organizations to benefit from cloud computing while decreasing the amount of trust t… Read Article
Source: Google Cloud Platform

Expanding our global footprint with new cloud regions

Our global network of Google Cloud Platform (GCP) regions underpins all of the important work we do for our customers. With 24 regions and 73 zones in 17 countries, Google Cloud delivers high-performance, low-latency cloud services to our customers no matter where they are around the globe. Google Cloud regions are designed and dedicated to providing enterprise services and products for Google Cloud Platform customers. (Separately, for information about our data centers that support Google's consumer services, please visit this page.) Our Cloud regions provide faster service in a given location so organizations can deliver their products and services more reliably and at higher speeds.

In 2020, we launched four new cloud regions—Jakarta (Indonesia), Las Vegas (U.S.), Salt Lake City (U.S.), and Seoul (South Korea)—and announced more to come, including Doha (Qatar), Madrid (Spain), Milan (Italy), and Paris (France). Google Cloud customers around the world, like Carrefour Spain, Lufthansa Group, Nokia, and Procter & Gamble, depend on us to drive operational efficiencies, reduce IT costs, and accelerate their digital transformations in order to better serve end consumers. A growing number of Google Cloud regions puts our trusted infrastructure everywhere our customers are today and need to be tomorrow.

"Our focus is on our customers," says Jose Antonio Santana, CIO at Carrefour Spain. "We want to meet them wherever they are. Google Cloud has given us the flexibility to adapt our infrastructure and the agility to make changes very quickly. We've been able to adopt and standardize industry best practices across the whole company and provide the highest possible service to our customers."

"Google Cloud is a strategic partner as we optimize our operations performance to better serve our customers around the world," says Henning Krüger, VP Ops Suite at Lufthansa Group. "We're digitizing our operations atop Google Cloud's global infrastructure, and we're using their machine learning capabilities to combine previously disparate systems and data feeds into one unified platform."

"Nokia is accelerating its digital ambitions by modernizing and migrating all its on-premises data centers and servers across multiple countries on to Google Cloud," says Ravi Parmasad, VP Global IT Infrastructure at Nokia. "The scale, breadth, reliability, and security of Google Cloud's growing platform ensures we can fundamentally change how we operate and do business; and drive meaningful operational efficiencies and cost savings in a long-term, sustainable way."

"P&G has a long history of innovation, and today, we're delivering more personalized experiences to consumers than ever before, thanks to the help of partners like Google Cloud," said Vittorio Cretella, CIO at Procter & Gamble. "We're implementing agility at scale, and Google Cloud's continued commitment to scale and grow its global platform is an important enabler of these efforts."

Our expanding Google Cloud region roadmap

Today, we're excited to announce the expansion of our global network with new cloud regions in Chile, Germany and Saudi Arabia. When launched, each region will have three zones to protect against service disruptions, and will include a portfolio of key Google Cloud products, while offering lower latency to nearby users and a more robust global network of regions for multinational enterprises.
Chile

In South America, Chilean businesses both large and small are accelerating their cloud adoption and transforming their businesses to bring new products and services to customers. Companies like Red Salud, one of the leading networks of private healthcare providers in Chile, have migrated their infrastructure to the cloud to increase the resilience and flexibility of their services as demand continues to evolve and grow.

“The announcement of a new Google Cloud region in Chile is welcome news to those of us in the healthcare space, where providers need quick access to complete information in order to make informed decisions about patient care,” says Daniel de la Maza, Manager of Systems and Technology at Red Salud. “With this new cloud region, we will be physically closer to the resources that Google Cloud has to offer, and we will be able to access cloud technology in a faster and more complete way, which will help us strengthen our mission: to bring high quality, accessible care to our patients.”

Germany

In Germany, this second cloud region will complement our existing region in Frankfurt, expanding our ability to meet growing demand for cloud services in the country. With this additional region, customers like Deutsche Börse, one of the world’s leading exchange organizations, and German wholesaler METRO, can continue to scale their businesses atop our world-class cloud infrastructure.

“Adopting public cloud as part of our strategic focus on new technologies helps us to improve agility, drive efficiency, and gain access to cutting-edge analytics and AI tools,” says Michael Girg, Chief Cloud Officer at Deutsche Börse Group. “As one of the largest market infrastructure providers, Deutsche Börse Group stands for innovative and stable solutions, which are essential for creating trust in today’s and tomorrow’s markets. Google Cloud impressed us with its technical capabilities, robust security posture, and partner mindset. A second Google Cloud region in Germany will provide even more capacity and technical runway to scale our business.”

“METRO is leading a digital transformation in the wholesale travel industry not only through our own move to the cloud, but also by making digital solutions available to our customers,” says Timo Salzsieder, Chief Solution Officer and Chief Information Officer at METRO AG. “We’ve increased the stability of our ecommerce platform and reduced infrastructure costs with Google Cloud, and the new Google Cloud region in Germany will help us serve our millions of customers across 34 countries with even greater reliability.”

Saudi Arabia

In 2018, we announced a memorandum of understanding (MoU) with Aramco to jointly explore establishing cloud services in the region. Building on that MoU, we concluded an agreement in December 2020, and Google Cloud will now deploy and operate a Cloud region in Saudi Arabia, while a local strategic reseller, sponsored by Aramco, will offer cloud services to customers, with a particular focus on businesses in the Kingdom. This new Cloud region will enable Google Cloud customers to confidently grow and scale their offerings in this market and support companies like Noon and Snap, Inc. as they deliver their products and services to consumers.

“In order to deliver a positive user experience, it is important to put our infrastructure as close as possible to our customers,” says Karl D’Adamo, Senior Director for Infrastructure at Snap, Inc.
“Google Cloud’s continuing expansion into additional regions will enable us to better serve our hundreds of millions of customers around the world, no matter where they may be.”

“We chose Google Cloud because of the scalability and resilience of its products and infrastructure,” says Hisham Zarka, CTO and Managing Director at Noon. “With the new Google Cloud region in the Kingdom of Saudi Arabia, we will be able to securely deliver services to our nearby customers at higher speeds and with greater flexibility.”

Our second region in Germany, first in Saudi Arabia, and first in Chile will join our current network of 24 regions, shown here, along with nine other forthcoming regions worldwide.

The cleanest cloud in the industry

We do all of this while operating the cleanest cloud in the industry, matching 100 percent of the electricity we use with renewable energy. This commitment to sustainability enables our customers to meet their own cloud computing needs with zero net carbon emissions. You can learn more about our global infrastructure, including new and upcoming regions, here.
Source: Google Cloud Platform

Multicloud analytics powers queries in life sciences, agritech and more

In the 2020 Gartner Cloud End-User Buying Behavior survey, nearly 80% of respondents who cited use of public, hybrid or multi-cloud indicated that they worked with more than one cloud provider.¹ Multi-cloud has become a reality for most, and in order to outperform their competition, organizations need to empower their people to access and analyze data, regardless of where it is stored. At Google, we are committed to delivering the best multi-cloud analytics solution that breaks down data silos and allows people to run analytics at scale and with ease. We believe this commitment has been called out in the new Gartner 2020 Magic Quadrant for Cloud Database Management Systems, where Google was recognized as a Leader.² If you, too, need to enable your people to analyze data across Google Cloud, AWS and Azure (coming soon) on a secure and fully managed platform, take a look at BigQuery Omni.

BigQuery natively decouples compute and storage so organizations can grow elastically and run their analytics at scale. With BigQuery Omni, we are extending this decoupled approach to move the compute resources to the data, making it easier for every user to get the insights they need right within the familiar BigQuery interface. We are thrilled with the incredible demand we have seen since we announced BigQuery Omni earlier this year. Customers have adopted BigQuery Omni to solve their unique business problems, and this blog highlights a few use cases we’re seeing. This set of use cases should help guide you on your journey towards adopting a modern, multi-cloud analytics solution. Let’s walk through three of them:

Biomedical data analytics use case: Many life science companies are looking to deliver a consistent analytics experience for their customers and internal stakeholders. Because biomedical data typically resides as large datasets that are distributed across clouds, getting holistic insights from a single pane of glass is difficult. With BigQuery Omni, The Broad Institute of MIT and Harvard is able to analyze biomedical data stored in repositories across major public clouds right from within the familiar BigQuery interface, thus making this data available to enable search and extraction of genomic variants. Previously, running the same kind of analytics required ongoing data extraction and loading processes that created a growing technical burden. With BigQuery Omni, The Broad Institute has been able to reduce egress costs, while improving the quality of their research.

Agritech use case: Data wrangling continues to be a big bottleneck for agriculture technology organizations that are looking to become data-driven. One such organization aims to reduce the amount of time and money spent by their data analysts, scientists, and engineers on data wrangling activities. Their R&D datasets, stored in AWS, describe the key characteristics of their plant breeding pipeline and their plant biotechnology testing operations. All of their critical datasets reside in Google BigQuery. With BigQuery Omni, this customer plans to enable secure, SQL-based access to their data living across both clouds, and help improve data discoverability for richer insights. They will be able to develop agricultural and market-focused analytical models within BigQuery’s single, cohesive interface for their data consumers, irrespective of the cloud platform where the dataset resides.

Log analytics use case: Many organizations are looking for ways to tap into their logs data and unlock hidden insights.
One media and entertainment company has their user activity log data in AWS and their user profile information in Google Cloud. Their goal was to better predict media content demand by analyzing user journeys and their content consumption patterns. Because their AWS and Google Cloud datasets were updated constantly, they were challenged with aggregating all the information while still maintaining data freshness. With BigQuery Omni, the customer has been able to dynamically combine their log data from AWS and Google Cloud without needing to move or copy entire datasets from one cloud to another, thus reducing the effort of writing custom scripts to query data stored in another cloud.

A similar example that blends well with this use case is the challenge of aggregating billing data across multiple clouds. One public sector company has been testing multiple ways to create a single, convenient view of all their billing data across Google Cloud, AWS and Azure in real time. With BigQuery Omni, they aim to break down their data silos with minimum effort and cost and run their analytics from a single pane of glass.

To get started with BigQuery Omni and simplify your journey toward multi-cloud analytics, sign up here. BigQuery Omni is currently in preview for AWS, and Azure is coming soon.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s Research & Advisory organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

1. Gartner, “2021 Planning Guide for Data Management”, Sanjeev Mohan, Joe Maguire, October 9, 2020.
2. Gartner, “Magic Quadrant for Cloud Database Management Systems”, Donald Feinberg, Merv Adrian, Rick Greenwald, Henry Cook, Adam Ronthal, November 23, 2020.
Source: Google Cloud Platform

Most popular public datasets to enrich your BigQuery analyses

From rice genomes to historical hurricane data, Google Cloud Public Datasets offer a world of exploration and insight. The more than 20 PB across 200+ datasets in our Public Dataset Program helps you explore big data and data analytics without a lot of cost, setup, or overhead. You can explore up to 1 TB per month at no cost, and you don’t even need a billing account to start using BigQuery sandbox. Joining public datasets with your own data gets you insights right away, such as adding location data for better transportation management or incorporating NOAA’s climate data into forecasting models. Retailers can use census demographics for market analysis, and analysts at those companies can map users with census block, zip code, and county boundary geometries.

These datasets can help you start exploring and layering data points, and they also make data analytics a lot easier for enterprise customers. These utility datasets let you start with a set of valid, clean data, rather than having to start from scratch.

You can access Google Cloud’s public datasets through BigQuery and Cloud Storage using either legacy or standard SQL queries. Researchers can also use BigQuery ML to train advanced machine learning models with this data right inside BigQuery at no additional cost. BigQuery GIS provides convenient, built-in capabilities to ingest, process, and analyze geospatial data when you want a location component in your data analysis.

Here, we’ll explore some common datasets and how they’re used.

Expanding access to data for healthcare and research. This year, COVID-19 public datasets have been incredibly important to researchers looking to understand and combat the virus. As the pandemic began in March, we announced an initial set of free public datasets to help researchers, data scientists, and analysts combat the coronavirus. These include the COVID-19 Open Data dataset, the Global Health Data from the World Bank, and OpenStreetMap data. The COVID-19 datasets are free to access and query through September 15, 2021. Looker customers can also install the COVID-19 block, which includes the Community Mobility Data Block, from the Marketplace, where they can accelerate their analyses of the public datasets using curated explore environments and purposeful dashboards. Anyone can go ahead and access the dashboards and explore environments here. The Looker Demographic data block contains demographic information from the American Community Survey.

Building the right tools to bring COVID-19 data to all. Google Cloud and partner SADA also collaborated earlier this year on building the National Response Portal, an open data platform that combines multiple datasets for an on-the-ground view of the pandemic. The Oklahoma State Department of Health and governor’s office used COVID-19 public datasets and Looker data blocks to build a dashboard on the state website to monitor cases and update residents.

Layering weather, climate, and GIS datasets for a better understanding of nature. Weather and climate are popular datasets to explore. Within BigQuery, you can explore climate simulation data from a collaboration with the Lamont-Doherty Earth Observatory of Columbia University and the Pangeo Project. In addition, the World Climate Research Programme released the Coupled Model Intercomparison Project Phase 6 (CMIP6) data archive. This dataset will be continuously updated and may eventually contain 20 PB of data.
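To make the "join public data with your own data" idea concrete, here is a minimal sketch in standard SQL. It averages July 2020 temperatures from the NOAA GSOD public dataset per station and joins the result onto a hypothetical table of your own sites; the my_project.ops.site_stations table and its columns are invented for illustration, and the public table and column names should be confirmed in the dataset listing before use.

-- Average July 2020 temperature per site, joining NOAA GSOD public data
-- onto a hypothetical customer-owned station mapping table.
SELECT
  s.site_name,
  ROUND(AVG(g.temp), 1) AS avg_temp_f
FROM `bigquery-public-data.noaa_gsod.gsod2020` AS g
JOIN `my_project.ops.site_stations` AS s   -- hypothetical table in your own project
  ON g.stn = s.noaa_station_id
WHERE g.mo = '07'                          -- July readings
GROUP BY s.site_name
ORDER BY avg_temp_f DESC;

A query like this runs against the public data in place, so only your own mapping table needs to live in your project.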
Other climate-related datasets include those from NOAA on lightning and hurricanes, and Looker’s Weather data block that contains daily weather reporting in the United States at the zip code level from 1920 until now. You can see how GlideFinder built a platform that ingests satellite data to monitor wildfires, using data characteristics like temperature. And here’s how to use a Colab notebook to analyze data on daily temperature readings from around the world. In Looker, users can leverage the weather block to analyze weather data and join it back onto their own data sources to get an entire picture of how climate may be impacting their business.

Using genomics data to improve food security. Our rice genome dataset derives from the Rice 3K dataset, which analyzes genetic variation, population structure, and diversity among more than 3,000 diverse Asian cultivated rice genomes. Our researchers then used DeepVariant to re-analyze that dataset with the goal of improving food security by speeding up genetic enhancement to increase rice crop yield.

Get to know cryptocurrencies using blockchain datasets. Our Public Datasets Program includes a set of cryptocurrency blockchain datasets, so you can start to better understand this modern concept. The datasets consist of the blockchain transaction history of Bitcoin and Ethereum, plus others, and you’ll also find a set of queries and views to enable multi-chain meta analysis and integration with conventional financial record processing systems.

Putting public datasets to use

We’re always interested to hear all the ways that analysts and researchers use public datasets to further understanding of so many different causes and topics. 2020 has brought fascinating, hopeful stories of how data has helped fight COVID-19, including our COVID-specific datasets and other public health datasets. Google Cloud has been able to help with COVID-19 academic research by offering high-performance compute and other technology resources along with public datasets. One important note is that the contents of these datasets are provided to the public strictly for educational and research purposes only. We are not onboarding or managing PHI or PII data as part of our COVID-19 public datasets. Google has practices and policies in place to ensure that data is handled in accordance with widely recognized patient privacy and data security policies.

What will you do with public datasets on BigQuery? Dive into the BigQuery sandbox to get started. Have an idea for a dataset? Add it to our request tracker.

Learn more:
Get started with geospatial data exploration in this beginner’s guide to BigQuery GIS.
Explore Looker’s blocks here and request a demo to learn more.
See how a cross-industry team of AI practitioners ramped up data use to fight COVID.
Check out the latest Kaggle competitions to test your skills.
Source: Google Cloud Platform

Extensions for connectivity and new data types now available in Cloud SQL for PostgreSQL

Open source database PostgreSQL is designed to be easily extensible through its support of extensions. When an extension is loaded into a database, it can function just like a built-in feature. This adds functionality to your PostgreSQL instances, allowing you to use enhanced features in your database on top of the existing PostgreSQL capabilities.

Cloud SQL for PostgreSQL has added support for more than ten extensions this year, allowing our customers to leverage the benefits of Cloud SQL managed databases along with the extensions built by the PostgreSQL community. We introduced support for these new extensions to:
access foreign tables across instances (postgres_fdw)
remove bloat from tables and indexes and optionally restore the physical order of clustered indexes (pg_repack)
manage pages in memory from PostgreSQL (pgfincore)
inspect the contents of database pages at a low level (pageinspect)
examine the free space map, the visibility map, and page-level visibility info (pg_freespacemap and pg_visibility)
make remote procedure calls among PostgreSQL databases through a procedural language handler (PL/Proxy)
support the postgresql-hll data type

Now, we’re adding extensions to support connectivity within databases and to support new data types that make it easier to store and query IP addresses and phone numbers.

New extension: dblink

dblink functionality is complementary to the cross-database connectivity capabilities we introduced earlier this year with the PL/Proxy and postgres_fdw extensions. Depending on your database architecture, you might come across situations when you need to query data outside of your application’s database, or query the same database with an independent (autonomous) transaction inside a local transaction. dblink allows you to query remote databases, giving you more flexibility and better connectivity in your environment.

You can use dblink as part of a SELECT statement for every SQL statement that returns results. For repetitive queries and future use, we recommend creating a view to avoid multiple code modifications in case of changes in connection string or name info.

Even with dblink available, in most use cases we still recommend keeping the data you need to query in the same database and leveraging schemas where possible, because of the complexity and performance overhead of remote queries. Another alternative is the postgres_fdw extension, which offers more transparency, standards compliance, and better performance.

New data types: ip4r and prefix

Internet protocols IPv4 and IPv6 are both commonly used today; IPv4 is Internet Protocol Version 4, while IPv6 is the next generation of Internet Protocol allowing a broader range of IP addresses. IPv6 was introduced in 1998 with the purpose of replacing IPv4. ip4r allows you to use six data types to store IPv4 and IPv6 addresses and address ranges. These data types provide better functionality and performance than the built-in inet and cidr data types, and they can leverage PostgreSQL capabilities such as primary keys, unique keys, b-tree indexes, constraints, and so on.

The prefix data type supports phone number prefixes, allowing customers with call centers and phone systems who are interested in routing calls and matching phone numbers and operators to store prefix data easily and perform operations efficiently.
With the prefix extension available, you can use the prefix_range data type for table and index creation, use its cast functions, and query the table with the following operators: <=, <, =, <>, >=, >, @>, <@, &&, |, &

Try out the new extensions

dblink, ip4r and prefix extensions are now available for you to use along with the other supported extensions on Cloud SQL for PostgreSQL. Learn more about PostgreSQL extensions and what’s available.
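As a rough illustration of how the three extensions fit together, here is a minimal sketch you might run from psql. The connection details, table names, and sample values are hypothetical, and operator behavior should be checked against each extension's documentation.

-- dblink: run a query on a remote PostgreSQL instance and use the result locally.
CREATE EXTENSION IF NOT EXISTS dblink;
SELECT sku, qty
FROM dblink('host=10.0.0.5 dbname=inventory user=reporter password=secret',
            'SELECT sku, qty FROM stock')
     AS remote_stock(sku text, qty integer);   -- column list is required for dblink results

-- ip4r: store address ranges and test containment against a single address.
CREATE EXTENSION IF NOT EXISTS ip4r;
CREATE TABLE allow_list (net ip4r PRIMARY KEY);
INSERT INTO allow_list VALUES ('10.0.0.0/8'), ('192.168.1.0/24');
SELECT * FROM allow_list WHERE net >>= '10.1.2.3';   -- ranges that contain this address

-- prefix: index phone-number prefixes and match a dialed number against them.
CREATE EXTENSION IF NOT EXISTS prefix;
CREATE TABLE routing (pfx prefix_range, carrier text);
CREATE INDEX routing_pfx_idx ON routing USING gist (pfx);
INSERT INTO routing VALUES ('1415', 'Carrier A'), ('1416', 'Carrier B');
SELECT carrier FROM routing WHERE pfx @> '14155551234';  -- prefixes matching this number

Wrapping the dblink call in a view, as recommended above, keeps the connection string in one place if it ever changes.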
Source: Google Cloud Platform