SAP backup, the blended way

An SAP architect designing a backup solution faces questions from the business (what needs to be done) and from delivery (how it can be accomplished). This blog gives an overview of the challenges, the available options, and their advantages and disadvantages. As a conclusion, a blended backup concept is proposed that uses cloud technology to combine best-of-market approaches to satisfy business requirements without inflating complexity or costs. Finally, we introduce Actifio GO, the future Google Cloud Backup and DR. It is Google's enterprise-scale backup solution that provides centralized, policy-based protection of multiple workloads. We describe its features, how it can help find out when exactly a logical error occurred, and how to even repair databases.

Terms used in this document

It is a good idea to clarify certain central terms using a diagram of a restore process: a corruption or a logical error occurs during normal operation. Then the database needs to be restored. After the restore, its logs need to be replayed; this step is sometimes called roll-forward. Then the database starts, again taking time to complete. Downtime arguably includes corruption and detection. The maximum allowable downtime is called the RTO (recovery time objective) and the maximum data loss that can be tolerated is the RPO (recovery point objective). For file systems, the diagram would look the same, except that the replay of logs and the database start would be void.

Many customers rely on HA nodes to reduce downtime, and they can help, but not in the case of logical errors like deleted tables. To recover from those, full backups or snapshots are the solution. In this article, we speak of backups when we mean either full backups or snapshots.

Snapshots are fast and cost-efficient. They are a mechanism to store only changes while keeping the original disk state. So their backup ("snapshot") and restore ("revert") can happen in a very short, size-independent time frame, while their size corresponds to the amount of data changes. Snapshots represent the disk content at the time the snapshot was taken. If you take a snapshot during normal operation, it will be crash-consistent, just as if the power had been switched off. Databases and file systems will have to recover when you revert to this snapshot. If you want to avoid this time-consuming task for the database, the database needs to place itself into a state that is ready for an application-consistent database snapshot. All databases supported by SAP NetWeaver have mechanisms to support this. For HANA, it is the prepare step of a HANA snapshot. From a conceptual perspective, it involves forcing all required DB storage write activity to disk and then quiescing disk activity at the OS level so the snapshot can be created. On Google Cloud, you can have snapshots that build on each other. For example, you could have 24 snapshots that are each one hour apart. Snapshots reside by default on multi-regional storage, which guarantees that they can tolerate a regional outage.

A full database backup will typically take around 0.6 x RAM in the case of HANA. The size of snapshots, on the other hand, starts at 0 and grows with incoming data changes.

Ransomware attacks typically encrypt the company's data with a key that is only known to the attacker. To recover from such an attack, customers need to restore a backup without this infection – which is hard if the attacker has had the opportunity to infect the backups. But with Google Cloud's Bucket Lock feature, backups can be made immutable for a retention period of up to 100 years.
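For illustration, here is one way such a retention policy could be scripted with the Cloud Storage Python client. This is a minimal sketch, not a prescribed setup: the bucket name sap-backups and the 30-day retention period are assumptions to adapt to your own policy, and note that locking a retention policy is irreversible.

from google.cloud import storage

client = storage.Client(project="my-project")
bucket = client.get_bucket("sap-backups")  # illustrative backup bucket

# Set a retention period (in seconds): objects cannot be deleted or
# overwritten until they are older than this.
bucket.retention_period = 30 * 24 * 60 * 60  # 30 days, adjust to your policy
bucket.patch()

# Locking the policy makes it permanent: it can no longer be reduced or removed.
bucket.lock_retention_policy()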
Backup Solutions by SAP Components

A typical discussion is whether there should be more resources for backups of productive systems than for, e.g., DEV and QAS. Using snapshots, this regulates itself, as resource consumption is determined by the amount of changes in the respective system. In other words, in the past there was the idea of running daily full backups in production and weekly full backups in non-production. This implicitly assumes that production is seven times as important as non-production. By taking snapshots instead of full backups, a lower data change rate automatically saves storage costs. A distinction between the backup SLA for production and non-production is no longer needed.

Blended Backup approach

As discussed, snapshots provide a low RTO and can be performed frequently, which means they also provide a low RPO. On the other hand, full database backups provide an integrity check by SAP and allow separating the backup from the location and storage infrastructure it was created on. To achieve low costs through low storage consumption and a low RTO/RPO at the same time, we propose:

- Make sure you can quickly create application servers and database servers with their root file systems using, e.g., Terraform scripts. Being that agile will not only speed up recovery, but also allow you to scale faster on the application layer and envision leaner concepts for DR.
- Take a PD snapshot of the application servers' and database servers' root file systems every day and delete (merge) the old one. This is stored in a multi-regional bucket by default. Storage consumption will only be the data changes since the last day.
- Before an operating system or software update, take a Persistent Disk snapshot so you can revert in a matter of seconds.
- Shared file systems: take daily snapshots using the shared storage's own means. In case of high interface usage, this can be done more frequently. Overwrite the existing snapshot, so storage consumption will only be the data changes since the last snapshot.
- Databases: all SAP databases have similar support for taking storage snapshots. For productive and non-productive databases (using HANA as an example) we recommend the following approach as a starting point in your considerations:
  - As the primary mechanism, use DB-consistent snapshots orchestrated from HANA Studio at a frequency as low as every 10 minutes, and retain a series of snapshots. This gives you a fully DB-consistent backup with very quick restore times that is at the same time very efficient on storage consumption. The additional load on operations will be very low, and it is by default replicated to other regions. (A scripted sketch of the prepare/confirm steps behind such a snapshot follows below.)
  - As the secondary mechanism, use a weekly full database backup at the time of lowest activity, e.g. midnight, overwriting the previous one to multi-regional Cloud Storage. This gives you a DB-checked consistent backup, also replicated to other regions. It provides additional protection against DB-level block errors.

This approach achieves an RPO < 10 minutes while retaining only one full backup and a series of snapshots. Restore speed will be very high, as it just means reverting to a snapshot. Storage consumption will be low: one full backup, one week (at most) of changes, and the log backups from one week.
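For reference, the prepare/confirm sequence that an application-consistent HANA snapshot relies on can also be scripted outside of HANA Studio. The following is a minimal sketch, not the officially documented orchestration: it assumes SAP's hdbcli Python driver, a single-container system, and illustrative connection details, and the Persistent Disk snapshot itself would be triggered between the two steps (for example with gcloud or the Compute Engine API).

from hdbcli import dbapi

# Illustrative connection details -- adjust to your system.
conn = dbapi.connect(address="hana-host", port=30015,
                     user="BACKUP_OPERATOR", password="***")
cursor = conn.cursor()

# 1. Prepare: HANA writes a consistent savepoint and keeps the snapshot open.
#    On multitenant (MDC) systems, use BACKUP DATA FOR FULL SYSTEM ... on the system database.
cursor.execute("BACKUP DATA CREATE SNAPSHOT COMMENT 'pd-snapshot'")

# Look up the backup ID of the prepared snapshot in the backup catalog.
cursor.execute(
    "SELECT BACKUP_ID FROM M_BACKUP_CATALOG "
    "WHERE ENTRY_TYPE_NAME = 'data snapshot' AND STATE_NAME = 'prepared' "
    "ORDER BY BACKUP_ID DESC"
)
backup_id = cursor.fetchone()[0]

# 2. Take the Persistent Disk snapshot of the HANA data volume here
#    (e.g. gcloud compute disks snapshot ...), then confirm it in HANA.
cursor.execute(
    f"BACKUP DATA CLOSE SNAPSHOT BACKUP_ID {backup_id} SUCCESSFUL 'pd-snapshot-id'"
)
conn.close()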
The design can be adapted to the customer's preferences. The weekly frequency of full backups can be changed to daily without causing more storage consumption, since previous backups are overwritten. To save costs, single-regional backups can also be chosen; in that case the strong recommendation is to keep them outside of the region where the system is running. Log backups can be added to further reduce the RPO.

So how many snapshots of the database should you retain? If you snapshot every 15 minutes, chances are high that the latest snapshot already contains the error you want to recover from. In this case you must be able to go further back, so you need to manage several snapshots. And this is where Actifio proves helpful.

IMPORTANT: A number of SAP systems have cross-system data synchronicity requirements (e.g. SAP ECC and SAP CRM) and can be considered so closely coupled that data consistency across all of these systems needs to be ensured. Performing recovery activities for any single system would trigger similar recovery activities in the other systems. Depending on the customer-specific environment, additional backup mechanisms may be required to meet cross-system data consistency requirements.

The Actifio backup software

Actifio (soon to be Google Cloud Backup and DR) is Google's software for managing backups. It supports GCP-native PD snapshots and the SAP-supported databases DB2, Oracle, SAP ASE, SAP HANA, SAP IQ, SAP MaxDB, and SQL Server. For SAP customers, the following benefits are of special interest:

- It provides a single management interface for database and file system backups, not limited to SAP data.
- It allows backups at the VM level instead of the disk level.
- It backs up directly to the Sky server ("backup appliance") with no need for intermediate storage.

With Actifio it is also possible to determine the point in time at which a logical error occurred. It is possible to spin up 10 virtual machines, each holding a mount to a different snapshot. Administrators can then check when the error occurred – for example, between snapshot 7 and 8. This reduces the data loss to a minimum. But the options do not stop there. It is also possible to "repair" a database. Take the above example and mount snapshot 7 to a virtual machine. It contains a table that has been dropped in snapshot 8 and newer. Now it is possible to export the single table and import it into the production database. Note that this may lead to inconsistencies – but the option is there.

See also:
- How to do HANA snapshots
- How HANA savepoints relate to snapshots
- FAQ about HANA snapshots

Related article: Using Pacemaker for SAP high availability on Google Cloud – Part 1
Source: Google Cloud Platform

How Google Cloud SecOps can help solve these 6 key MSSP conundrums

Editor's note: This blog was originally published by Siemplify on October 6, 2021.

The COVID-19 pandemic accelerated many organizations' timelines to transition to the cloud and advance their digital transformation efforts. The potential attack surfaces for those organizations also grew as newly distributed workforces used unmanaged technologies. While some organizations thrived, the transition further exacerbated many of the key challenges security teams were already facing, such as an overload of alerts, the need for more detection tools, and security skill shortages. The pandemic has also played a role in increasing SecOps automation, or is expected to in the near future, according to 76% of respondents in a Siemplify report from February 2021.

Managed security services providers (MSSPs) and managed detection and response (MDR) vendors have emerged as big winners because of their ability to help organizations overcome these challenges while providing agility, scale, and cost savings. Outsourcing arrangements also free up customers to eventually gain the internal knowledge they were originally lacking – the gap that led them to call on a provider in the first place. This is promising news for the MSSP space and points to likely continued strong growth, but it doesn't do away with the obstacles providers face in meeting increasingly demanding customer expectations. As a result, not all security service providers are created equal.

In a competitive marketplace, one way to shed a sometimes-spurious reputation and stand apart from rivals is by ensuring your security operations are optimized and delivering maximum outcomes for customers. To accomplish that, providers must overcome six modern MSSP obstacles:

1) Increasing Customer Acquisition Costs
With the proliferation of security technology options, customers' security stacks are more diverse than ever before. To compete, MSSPs must be willing and able to sufficiently support a broad set of technology, which often results in higher acquisition costs, as well as increased training requirements for security analysts.

2) Lack of Centralized Visibility
MSSP analyst teams who manage and monitor a large customer base often lack visibility into the allocation of resources, which hinders their ability to balance productivity and risk. This visibility void often extends to the customer as well. Clients are yearning for greater visibility into their expanding network, more transparency around what is happening within it, and the ability for a third-party provider to do more than merely notify them about a threat. Customers care about positive outcomes from their providers, which means finding and stopping adversaries – and helping get their business back on its feet as quickly as possible.

3) Multiple Delivery Models
The range of MSSP delivery models is increasingly diverse and includes always-on outsourced SOC, managed SIEM, MDR, and staff augmentation, as well as numerous hybrid models. These various models are converging – a single MSSP may provide multiple models in various configurations, adding cost and complexity to operations.

4) Meeting SLA Commitments
MSSP analyst teams who manage multiple systems and interfaces across a diverse set of clients strain to meet rigorous SLA expectations.

5) Round-the-Clock Operations
To meet customer demands, MSSPs work around the clock, requiring multiple shifts and handoffs.
It's crucial to maintain consistency in response from one analyst to the next, and variability in staff knowledge and capability places added pressure on analysts. Driving consistency in processes and workflow to ensure optimal handling of alerts and incidents is paramount to balancing productivity and risk.

6) Personnel Turnover
Shortages and high turnover of personnel add to the challenges of managing a 24/7 operation. Meanwhile, reliance on manual processes and the need to retain expert knowledge further intensify the pressure.

The Power of Automation and Orchestration
MSSPs are engaged in a constant struggle to ensure their existing security team keeps up with growing customer expectations. Due to an ever-expanding digital footprint, heavy investment in detection, and a growing list of security tools to monitor, the industry is at a tipping point.

SIEM and SOAR can help MSSPs under pressure by detecting and ingesting aggregated alerts and indicators of compromise (IOCs) and then executing automatable, process-driven playbooks to enrich and respond to these incidents. These playbooks coordinate across technologies, security teams, and external users for centralized data visibility and action – for both internal analysts and external customers.

For more information on how an automated and integrated SecOps suite can help you, visit chronicle.security.

Related article: How to overcome 5 common SecOps challenges
Source: Google Cloud Platform

Using Cloud Bigtable with IAM Conditions and Tags

Cloud Bigtable is a low-latency, high-throughput NoSQL database. Bigtable users store terabytes of data in their tables, and exposing that data securely is essential. If you are an administrator or developer responsible for securing access to your Bigtable data, you are likely using Google Cloud security features to lock down your Bigtable resources and customize your authorization model.

In this article, we are going to learn how to control access to certain Bigtable resources, create a narrow scope of your resources to apply permissions to, and set permissions depending on the development environment. We'll accomplish this using Identity and Access Management (IAM), IAM Conditions, and Tags to secure Bigtable data.

Identity and Access Management

IAM provides fine-grained access control and visibility for centrally managing Google Cloud resources. A complex cloud organization can have many resources with IAM policies bound to them, configured by administrators who want to control access based on roles.

The following diagram shows an administrator binding an IAM policy to a resource. An IAM policy consists of one or more principals – otherwise known as members – and an administrator can grant one or more roles to each principal. A principal can be a user, a group, or a service account. A role is a collection of permissions that allow a principal to perform some actions on Google Cloud resources.

Resources inherit IAM policies from their parents in the resource tree. When a user sends a request, IAM checks whether the user has permission to perform the action on that particular resource. If the IAM policy bound to the resource can grant the permission to the user, the permission is granted. If not, IAM goes up the resource tree to search for a policy that can grant the permission. If no policy in the resource tree can grant the permission, the request is rejected with a permission-denied error.

The following Cloud Console example grants the Bigtable Reader role to the user 222larabrown@gmail.com and binds the policy to the my-project project. This can also be done through the gcloud CLI:

gcloud projects add-iam-policy-binding my-project --member='user:222larabrown@gmail.com' --role='roles/bigtable.reader'

After the binding is created, the Bigtable Reader role is granted to 222larabrown@gmail.com within the my-project project. This means that 222larabrown@gmail.com has read access to data in existing tables and to metadata for instances, clusters, and tables, including column families.

There are three types of roles in IAM: basic, predefined, and custom. The Bigtable Reader role is a predefined role. See Understanding roles to learn more about IAM roles.
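Because IAM policies can be attached at different levels of the resource tree, the same role can also be granted lower down, for example directly on a Bigtable instance. The snippet below is a minimal sketch using the Bigtable admin client in Python; the project ID mirrors the example above, the instance name is illustrative, and the exact policy helpers should be checked against the client library version you use.

from google.cloud import bigtable
from google.cloud.bigtable.policy import Policy, BIGTABLE_READER_ROLE

# The admin client is required for IAM operations on instances.
client = bigtable.Client(project="my-project", admin=True)
instance = client.instance("my-instance")  # illustrative instance name

# Read the current policy, add the Bigtable Reader role for the user,
# and write the policy back to the instance.
policy = instance.get_iam_policy()
members = set(policy.get(BIGTABLE_READER_ROLE, set()))
members.add(Policy.user("222larabrown@gmail.com"))
policy[BIGTABLE_READER_ROLE] = members
instance.set_iam_policy(policy)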
IAM Conditions

IAM Conditions is a feature that allows defining and enforcing conditional, attribute-based access control for Google Cloud resources. In addition to the role binding on a resource, access is granted to a principal only if the configured condition is met. The following illustrates how IAM Conditions works.

The following Cloud Console example grants the Bigtable Reader role to the user 222larabrown@gmail.com with the "Report tables" condition and binds the policy to the my-project project. With the Report tables condition, 222larabrown@gmail.com has read access only to Bigtable tables whose table ID has the prefix report- within the specific Bigtable instance.

The condition Report tables is defined so that:
- The resource type has to be the Bigtable table type: bigtableadmin.googleapis.com/Table.
- The resource (table) name has to have the prefix projects/my-project/instances/my-instance/tables/report-.
- The service has to be the Bigtable Admin service: bigtableadmin.googleapis.com.

This can also be done through the gcloud CLI:

gcloud projects add-iam-policy-binding my-project --member='user:222larabrown@gmail.com' --role='roles/bigtable.reader' --condition-from-file=CONDITION_FROM_FILE

The CONDITION_FROM_FILE should be a path to a local JSON or YAML file that defines the following condition:

"title": "Report tables",
"description": "Tables with 'report-' prefix.",
"expression": "resource.type == 'bigtableadmin.googleapis.com/Table' && resource.name.startsWith('projects/my-project/instances/my-instance/tables/report-') && resource.service == 'bigtableadmin.googleapis.com'"

IAM Conditions with Tags

What if 222larabrown@gmail.com should only be allowed to read data in the Test or Staging environment because Prod has sensitive data that should not be exposed to this user? One way to achieve this is to bind environment tag values to the right resources and limit access to resources with the associated tag values using IAM Conditions.

Tags are a good way to manage the resources in your organization hierarchy by adding additional business dimensions. You can use tags to group certain resources for different purposes, such as access control. We are going to explore using tags to group resources for different environments: Test, Staging, and Prod.

First, in the Cloud Console, you can create a tag at the organization level to represent the environments. The new tag has the values Test, Staging, and Prod. Once the tag is created, it generates one tag key ID for the tag and three tag value IDs for the three tag values.

Let's say you want to use a Bigtable instance, my-instance, for your Test environment. You can bind the Test tag value of the Environment tag to the instance as follows, using the gcloud CLI:

gcloud resource-manager tags bindings create --tag-value=tagValues/260761697116 --parent=//bigtable.googleapis.com/projects/my-project/instances/my-instance

Note: Currently there is no Cloud Console support for binding tags to Bigtable instances.

Once the binding is in effect, you can add a condition and only grant the role to the principal if the resource has a tag value that matches the Test tag value. Now the user 222larabrown@gmail.com has access only to the Test environment.
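From the user's point of view, the effect of these bindings can be sanity-checked by simply attempting a read with the Bigtable data client. This is a minimal sketch run under 222larabrown@gmail.com's credentials; the table name and row key are illustrative, and the same read against a Prod-tagged instance (or against a table without the report- prefix under the earlier condition) would fail with a permission-denied error.

from google.cloud import bigtable

# Data access only -- no admin permissions needed for reads.
client = bigtable.Client(project="my-project", admin=False)
table = client.instance("my-instance").table("report-daily")  # illustrative table name

row = table.read_row(b"row-key-1")  # illustrative row key
if row is not None:
    for family, columns in row.cells.items():
        print(family, {qualifier: cells[0].value for qualifier, cells in columns.items()})
else:
    print("Row not found (but the read itself was permitted).")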
Note: Combining tags and other attributes in the same condition is currently not allowed. See Tags and access control for more information.

Summary

In this article you learned:
- IAM fundamentals
- How to set IAM roles for Bigtable resources
- How to limit the scope of an IAM role further with IAM Conditions
- How to add an environment requirement for permissions using Tags

Learn More

To learn more about using IAM and IAM Conditions to secure your Bigtable data, see Access control with IAM.

Related article: Cloud SQL – SQL Server Performance Analysis and Query Tuning
Source: Google Cloud Platform

The Invisible Cloud: How this Googler keeps the internet moving worldwide

Editor's note: Google Cloud runs on people like Stacey Cline. As Global Contract Management Lead in our Global Logistics Operations, she enables the worldwide movement of global technical infrastructure – the servers, the storage, the artificial intelligence pods, and everything else that keeps Google Cloud serving enterprises and individuals 24/7. A native of Trinidad & Tobago, she came to Google near the onset of COVID after years at IBM and BP. Sound intense? "My kids say I'm happy again."

For many people, the cloud is kind of invisible: computing comes over the internet, and out from a plug in the wall. You probably see it a little differently.
We see the guts. There are machines getting made, warehouses, data centers, forklifts, trucks, air freight. It's different around the world, depending on what customers in different locations need. Google Cloud's leg up is that we design, build, and deploy the majority of what we run, so we can support key customers in growth markets like Africa and the Asia-Pacific region in ways our competitors can't. We can build entirely new things and also outfit our data centers to meet demand a lot quicker.

Tell us about coming to Google.
I grew up in Trinidad & Tobago, where my mom was the cook at an insurance company's employee cafeteria. There were always different kinds of people coming through from other countries, so it was pretty diverse. I moved to the US to attend Howard University, and later got an MBA there. I started doing supply chain work for IBM about 20 years ago. Then I moved into oil and gas, first at BP, then into a refinery, when I got the call from Google.

How was it, starting your job at Google Cloud in the pandemic?
There are now thousands of us here who started during COVID. Like a lot of Nooglers, I had some imposter syndrome – "I can't believe I'm here!" – but for me there was a lot of work to do right away. Demand for cloud services skyrocketed, which meant building up data centers and the warehouses to support them when the supply chain wasn't optimal. We were shipping by plane, elevating and upskilling people with remote learning, seeing that people could work safely in the warehouses. People acted brilliantly, and as a team we recognized all this great effort. It was a year before I had a badge or was in a Google office. Work from home was great for learning the ropes and building confidence, but when I came to Mountain View, I finally felt like a Googler.

You coordinate a global system. What's that been like?
It can be a lot of fun. Something new is happening almost every day, and you have to react quickly in order for our operations people to continue doing their jobs. We're most of the way to having this work perfectly, but like with a lot of things, getting that bit right can go on forever.

Related article: How one Googler uses talking tulips to connect with customers
Source: Google Cloud Platform

Standing shoulder to shoulder – building a resilient healthcare ecosystem with Health-ISAC

Building a resilient healthcare ecosystem is not something done in a vacuum. It takes motivated organizations and people working together in a community of trust to build and defend effectively. We believe the more diverse these communities are, the more effective they can be.

Last August, Google announced its commitment to invest at least $10 billion over the next 5 years to advance cybersecurity. We're making good on our commitment to support the security and digital transformation of government, critical infrastructure, enterprise, small business, and in time, consumers and society overall through community partnerships and other means.

As part of this initiative, Google Cloud is announcing today that it is partnering with the Health Information Sharing and Analysis Center (Health-ISAC) as an Ambassador partner. Google Cloud is the first and only major cloud provider to join the organization.

Health-ISAC is a trusted community of critical infrastructure owners and operators within the global Healthcare and Public Health sector (HPH). The community is primarily focused on sharing timely, actionable, and relevant information with each other – including intelligence on threats, incidents, vulnerabilities, mitigation strategies, and best practices. Health-ISAC also encourages building relationships and networking through worldwide educational events and research papers. Working groups and committees focus on topics of importance to the sector, and its member-vetted Community Services offer enhanced services that leverage the Health-ISAC community to help better secure the intersection of healthcare and technology.

As an Ambassador partner, Google Cloud will bring experts and resources, including our Threat Horizon Report and the Google Cybersecurity Action Team, to partner with the healthcare community and its leadership. Googlers will work with defenders and leaders in the global health sector, sharing knowledge we've learned building and deploying secure technology at Google.

"Partnering with Health-ISAC just makes sense. We share a common vision that building a safe and reliable health ecosystem is our collective responsibility and keeps with values of respecting and protecting each other. This partnership should inspire other organizations with skills and capabilities that can contribute to these outcomes to join us," said Phil Venables, CISO, Google Cloud.

We're excited to be working with organizations like Health-ISAC and those on the forefront of building communities and protecting societies. On the journey to being resilient, one thing is for certain: working together can make all the difference.

"We're thrilled to have Google Cloud as an Ambassador sponsor with Health-ISAC," said Errol Weiss, Chief Security Officer at Health-ISAC. "This partnership will help with Health-ISAC's expansion in Europe and Asia-Pacific, while also leveraging Google Cloud's vast resources and data for the benefit of improving the security and resilience of the health and public health sector globally," Weiss added.

Related article: Helping global governments and organizations adopt Zero Trust architectures
Source: Google Cloud Platform

A new Google Cloud region is coming to Mexico

Last month, Google announced a five-year, $1.2 billion USD commitment to Latin America to expand digital infrastructure, support digital skills, foster an entrepreneurial ecosystem, and help create inclusive and sustainable communities. To build on these initiatives and meet the growing demand for cloud services around the world, we are excited to announce that a new Google Cloud region is coming to Mexico.

When it launches, this new cloud region will be our third in Latin America, joining Santiago, Chile, and São Paulo, Brazil, among the 34 regions and 103 zones currently in operation around the world, delivering high-performance, low-latency cloud services to customers of all sizes and industries. It will help support the digital transformations of enterprises, cloud-native companies, and public sector organizations.

"We are very excited about the announcement of a new cloud region in Mexico. It shows the commitment that Google Cloud has with its customers," said Antonio Guichard Gonzalez, Liverpool's Digital Executive Director. "In Liverpool, we will continue to work with Google Cloud to find solutions to our biggest challenges and accelerate our digital capabilities."

"The cloud region in Mexico will unlock new possibilities for the use of cloud technologies by public sector organizations in the country. Different public entities would benefit from interoperating in an efficient and secure way, facilitating access to computing power and information technologies. It is important to mention that the computer developments in Mexico are highly specialized so they can become important references for other Spanish-speaking countries," stated Dr. Juan Carlos Sarmiento Tovilla, Director General of Information Systems at the Federal Court of Administrative Justice.

Francisco Martha, General Director of Digital Business Development at Grupo Financiero Banorte, added: "For Banorte, this is undeniably a fundamental milestone that will allow us to accelerate our digital transformation and boost initiatives that we are exploring with Google Cloud within the regulatory framework. For Mexico, it is a turning point in the digitization process that we already see taking place in many of our clients, partners and suppliers."

From retail and media & entertainment to financial services, healthcare and public sector, leading organizations come to Google Cloud as their trusted innovation partner. We help organizations digitally transform and become the best tech company in their industry, across five key areas:

- Understanding and using data: Google Cloud helps customers become smarter and make better decisions with a unified data platform. We help customers reduce complexity and combine unstructured and structured data – wherever it resides – to quickly and easily produce valuable insights.
- Establishing an open foundation for growth: When customers move to Google Cloud, they get a flexible, secure, and open platform that evolves with their organization. Our commitment to multicloud, hybrid cloud, and open source offers organizations the freedom of choice, allowing their developers to build faster and more intuitively.
- Creating a collaborative environment: In today's hybrid work environment, Google Cloud provides the tools needed to transform how people connect, create, and collaborate.
- Securing systems and users: As every company rethinks its security posture, we help customers protect their data using the same infrastructure and security services that Google uses for its own operations.
- Building a cleaner, more sustainable future: Google has been carbon-neutral since 2007, and we are working toward a revolutionary goal to operate entirely on carbon-free energy by 2030. Today, when customers run on Google Cloud – the cleanest cloud in the industry – their workloads are already matched with 100% renewable energy. We help customers decarbonize their applications and infrastructure with technologies like Carbon Footprint and Active Assist.

Local customers will benefit from key controls that enable them to maintain low latency and the highest security, data residency, and compliance standards, including specific data storage requirements. We will work with our local and regional customers to ensure the cloud region fits their unique and evolving needs.

This cloud region will be the latest investment to support the digital transformation of Mexican organizations. In the last year, we've opened a support center to boost local companies, as well as global companies with operations in Mexico. We also opened a delivery center and grew our team in Monterrey to support the local ecosystem. Meanwhile, with initiatives such as Capacita+, Grow with Google for Women in STEM, and learning programs in the southeast of the country, we are working to grow digital skills and training opportunities for technologists in Mexico.

Learn more about our global cloud infrastructure, including new and upcoming regions.

Related article: Ciao, Milano! New cloud region in Milan now open
Source: Google Cloud Platform

The next generation of Dataflow: Dataflow Prime, Dataflow Go, and Dataflow ML

By the end of 2024, 75% of enterprises will shift from piloting to operationalizing artificial intelligence, according to IDC, yet the growing complexity of data types, heterogeneous data stacks, and programming languages makes this a challenge for all data engineers. In the current economic climate, doing more at lower cost and with higher efficiency has also become a key consideration for many organizations.

Today, we are pleased to announce three major releases that bring the power of Google Cloud's Dataflow to more developers for expanded use cases and higher data processing workloads, while keeping costs low, as part of our goal to democratize the power of big data, real-time streaming, and ML/AI for all developers, everywhere. The three big Dataflow releases we're thrilled to announce in general availability are:

- Dataflow Prime – Dataflow Prime takes the serverless, no-operations benefits of Dataflow to a totally new level. Dataflow Prime allows users to take advantage of both horizontal autoscaling (more machines) and vertical autoscaling (larger machines with more memory) automatically for streaming data processing workloads, with batch coming in the near future. With Dataflow Prime, pipelines are more efficient, enabling you to apply insights in real time.
- Dataflow Go – Dataflow Go provides native support for Go, a rapidly growing programming language thanks to its flexibility, ease of use, and differentiated concepts, for both batch and streaming data processing workloads. With Apache Beam's unique multi-language model, Dataflow Go pipelines can leverage the well-adopted, best-in-class performance provided by the wide range of Java I/O connectors, with ML transforms and I/O connectors from Python coming soon.
- Dataflow ML – Speaking of ML transforms, Dataflow now has out-of-the-box support for running PyTorch and scikit-learn models directly within the pipeline. The new RunInference transform enables simplicity by allowing models to be used in production pipelines with very little code (see the short sketch at the end of this post). These features are in addition to Dataflow's existing ML capabilities such as GPU support and the pre- and post-processing system for ML training, used either directly or via frameworks such as TensorFlow Extended (TFX).

We're so excited to make Dataflow even better. With the world's only truly unified batch and streaming data processing model provided by Apache Beam, the wide support for ML frameworks, and the unique cross-language capabilities of the Beam model, Dataflow is becoming ever easier, faster, and more accessible for all data processing needs.

Getting started

To get started with Dataflow Go easily, see the Quickstart and download the Go SDK. To learn more about Dataflow Prime, see the documentation. To learn more about Dataflow ML and RunInference, read about the new RunInference Beam transform on the Apache Beam website. Interested in running a proof of concept using your own data? Talk to your Google Cloud sales contact for hands-on workshop opportunities or sign up here.
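As promised above, here is what a RunInference pipeline can look like for a scikit-learn model. This is a minimal sketch, assuming a model that was pickled with scikit-learn and stored at an illustrative Cloud Storage path; on Dataflow you would add the usual pipeline options, but the transform itself stays the same.

import apache_beam as beam
import numpy as np
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Illustrative path to a pickled scikit-learn model.
model_handler = SklearnModelHandlerNumpy(model_uri="gs://my-bucket/models/model.pkl")

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | "CreateExamples" >> beam.Create([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
        | "RunInference" >> RunInference(model_handler)  # yields PredictionResult elements
        | "PrintPredictions" >> beam.Map(print)
    )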
Related article: Dataflow Prime: bring unparalleled efficiency and radical simplicity to big data processing

Source: Google Cloud Platform

Streamline data management and governance with the unification of Data Catalog and Dataplex

Today, we are excited to announce that Google Cloud Data Catalog will be unified with Dataplex into a single user interface. With this unification, customers have a single experience to search and discover their data, enrich it with relevant business context, organize it by logical data domains, and centrally govern and monitor their distributed data with built-in data intelligence and automation capabilities. Customers now have access to an integrated metadata platform that connects technical and operational metadata with business metadata, and then uses this augmented and active metadata to drive intelligent data management and governance.

The enterprise data landscape is becoming increasingly diverse and distributed, with data across multiple storage systems, each having its own way of handling metadata, security, and governance. This creates a tremendous amount of operational complexity and thus generates strong market demand for a metadata platform that can power consistent operations across distributed data.

Dataplex provides a data fabric to automate data management, governance, discovery, and exploration across distributed data at scale. With Dataplex, enterprises can easily organize their data into data domains and delegate ownership, usage, and sharing of data to data owners who have the right business context, while still maintaining a single pane of glass to consistently monitor and govern data across the various data domains in their organization.

Prior to this unification, data owners, stewards, and governors had to use two different interfaces: Dataplex to organize, manage, and govern their data, and Data Catalog to discover, understand, and enrich their data. Now, with this unification, we are creating a single coherent user experience where customers can automatically discover and catalog all the data they own, understand data lineage, check for data quality, augment that metadata with relevant business context, organize data into business domains, and then use that combined metadata to power data management. Together we provide an integrated experience that serves the full spectrum of data governance needs in an organization, enabling data management at scale.

"With Data Catalog now being part of Dataplex, we get a unified, simplified, and streamlined experience to effectively discover and govern our data, which enables team productivity and analytics agility for our organization. We can now use a single experience to search and discover data with relevant business context, organize and govern this data based on business domains, and enable access to trusted data for analytics and data science – all within the same platform," said Elton Martins, Senior Director of Data Engineering at Loblaw Companies Limited.

Getting started

Existing Data Catalog and Dataplex customers, as well as new customers, can now start using Dataplex for metadata discovery, management, and governance. Please note that while the user experience is unified with this release, all existing APIs and feature functionality of both products will continue to work as before. To learn more, please refer to the technical documentation or contact the Google Cloud sales team.

Related article: Scalable Python on BigQuery using Dask and NVIDIA GPUs
Source: Google Cloud Platform

Using Pacemaker for SAP high availability on Google Cloud – Part 1

Problem Statement

Maintaining business continuity for your mission-critical systems usually demands high availability (HA) solutions that fail over without human intervention. If you are running SAP HANA or SAP NetWeaver (SAP NW) on Google Cloud, the OS-native high availability (HA) cluster capability provided by Red Hat Enterprise Linux (RHEL) for SAP and SUSE Linux Enterprise Server (SLES) for SAP is often adopted as the foundational functionality to provide business continuity for your SAP system. This blog will introduce some basic terminology and concepts about the Red Hat and SUSE HA implementation of the Pacemaker cluster software for the SAP HANA and NetWeaver platforms.

Pacemaker Terminology

Resource
A resource in Pacemaker is the service made highly available by the cluster. For SAP HANA, there are two resources: HANA and HANA Topology. For SAP NetWeaver Central Services, there are also two resources: one for the Central Services instance that runs the Message Server and Enqueue Server (ASCS in NW ABAP or SCS in NW Java) and another for the Enqueue Replication Server (ERS). In the Pacemaker cluster, we also configure other resources for serving other functions, such as the Virtual IP (VIP) or the Internal Load Balancer (ILB) health check mechanism.

Resource agent
A resource agent manages each resource. It defines the logic for resource operations called by the Pacemaker cluster to start, stop, or monitor the health of resources. Resource agents are usually Linux bash or Python scripts that implement functions for the resource agent operations. Resource agents managing SAP resources are co-developed by SAP and the OS vendors. They are open sourced on GitHub, and the OS vendors downstream them into the SAP resource agent package for their Linux distro.
- For HANA scale-up, the resource agents are "SAPHana" and "SAPHanaTopology".
- For HANA scale-out, the resource agents are "SAPHanaController" and "SAPHanaTopology".
- For NetWeaver Central Services, the resource agent is "SAPInstance".
Why are there two resource agents to manage HANA? "SAPHanaTopology" is responsible for monitoring the HANA topology status on all cluster nodes and updating the HANA-relevant cluster properties. These attributes are read by "SAPHana" as part of its HANA monitoring function. Resource agents are usually installed in the directory /usr/lib/ocf/resource.d/.

Resource operation
A resource can have what is called a resource operation. Resource operations are the major types of actions: monitor, start, stop, promote, and demote. These work as described; for example, a "promote" operation will promote a resource in the cluster. The actions are built into the respective resource agent scripts.
Properties of an operation:
- interval – if set to a nonzero value, defines how frequently the operation occurs after the first monitor action completes.
- timeout – defines the amount of time the operation has to complete before it is aborted and considered failed.
- on-fail – defines the action to be executed if the operation fails. The default action for the 'stop' operation is 'fence' and the default for all others is 'restart'.
- role – run the operation only on the node that the cluster thinks should be in the specified role. A role can be master or slave, started or stopped.
The role provides context for Pacemaker to make resource location and operation decisions.

Resource group
Resources can be grouped into administrative units that are dependent on one another and need to be started sequentially and stopped in the reverse order. While technically each cluster resource fails over one at a time, logically (to simplify cluster configuration) failover of resource groups is configured. For SAP HANA, for example, there is typically one resource group containing both the VIP resource and the ILB health check resource.

Resource constraints
Constraints determine the behavior of a resource in a cluster. The categories of constraints are location, order, and colocation. The list below includes the constraints in SLES and RHEL.
- Location constraint – determines on which nodes a resource can run; e.g., pins each fence device to the other host VM.
- Order constraint – determines the order in which resources run; e.g., first start the SAPHanaTopology resource, then start the SAPHana resource.
- Colocation constraint – determines that the location of one resource depends on the location of another resource; e.g., the IP address resource group should be on the same host as the primary HANA instance.

Fencing and fence agent
A fencing or fence agent is an abstraction that allows a Pacemaker cluster to isolate problematic cluster nodes or cluster resources whose state cannot be determined. Fencing can be performed at either the cluster node level or the cluster resource/resource group level. Fencing is most commonly performed at the cluster node level by remotely power-cycling the problematic cluster node or by disabling its access to the network. Similar to resource agents, these agents are also usually bash or Python scripts. The two commonly used fence agents within GCP are "gcpstonith" and "fence_gce", with "fence_gce" being the more robust successor of "gcpstonith". Fence agents leverage the Compute Engine reset API in order to fence problematic nodes. The fencing resource "gcpstonith" is usually downloaded and saved in the directory /usr/lib64/stonith/plugins/external. The resource "fence_gce" comes with the RHEL and SLES images with the HA extension.

Corosync
Corosync is an important piece of a Pacemaker cluster whose effect on the cluster is often undervalued. Corosync enables servers to interact as a cluster, while Pacemaker provides the ability to control how the cluster behaves. Corosync provides messaging and membership functionality along with other functions:
- It maintains the quorum information.
- It is used by all cluster nodes to communicate and coordinate cluster tasks.
- It stores its configuration in the default location /etc/corosync/corosync.conf.
If there is a communication failure or timeout within Corosync, then a membership change or fencing action will be performed.

Clones and Clone Sets
Clones represent resources that can become active on multiple hosts without requiring the creation of unique resource definitions for them. When resources are grouped across hosts, we call this a clone set. There are different types of cloned resources. The main clone set of interest for SAP configurations is the stateful clone, which represents a resource with a particular role.
In the context of the SAP HANA database, the primary and secondary database instances would be contained within the SAPHana clone set.

Conclusion
Now that you have read through the terminology, let's see how an SAP Pacemaker cluster looks on each OS.

SLES:
There are two nodes in the cluster and both are online:
* Online: [ node-x node-y ]
The STONITH resource is started on each node and is using the "gcpstonith" fence agent:
* STONITH-node-x (stonith:external/gcpstonith): Started node-y
* STONITH-node-y (stonith:external/gcpstonith): Started node-x
There is a resource group called g-primary that contains both the IPaddr2 resource agent, which adds the ILB forwarding rule IP address to the NIC of the active node, and the anything resource agent, which starts the program 'socat' to respond to ILB health check probes:
* rsc_vip_int-primary (ocf::heartbeat:IPaddr2): Started node-y
* rsc_vip_hc-primary (ocf::heartbeat:anything): Started node-y
There is a Clone Set for the SAPHanaTopology resource agent containing the two nodes:
* Clone Set: cln_SAPHanaTopology_TST_HDB00 [rsc_SAPHanaTopology_TST_HDB00]
There is a Clone Set for the SAPHana resource agent containing a master and a slave node:
* Clone Set: msl_SAPHana_TST_HDB00 [rsc_SAPHana_TST_HDB00] (promotable)
Note: You can see that one of the clone sets is marked as promotable. If a clone is promotable, its instances can perform a special role that Pacemaker will manage via the promote and demote operations of the resource agent.

RHEL:
There are two nodes in the cluster and both are online:
* Online: [ rhel182ilb01 rhel182ilb02 ]
The STONITH resource is started on the opposite node and is using the more robust "fence_gce" fence agent:
* STONITH-rhel182ilb01 (stonith:fence_gce): Started rhel182ilb02
* STONITH-rhel182ilb02 (stonith:fence_gce): Started rhel182ilb01
There is a resource group called g-primary that contains both the IPaddr2 resource agent, which adds the ILB forwarding rule IP address to the NIC of the active node, and the haproxy resource agent, which starts the program 'haproxy' to respond to ILB health check probes:
* rsc_healthcheck_R82 (service:haproxy): Started rhel182ilb02
* rsc_vip_R82_00 (ocf::heartbeat:IPaddr2): Started rhel182ilb02
There is a Clone Set for the SAPHanaTopology resource agent containing the two nodes:
* Clone Set: SAPHanaTopology_R82_00-clone [SAPHanaTopology_R82_00]
There is a Clone Set for the SAPHana resource agent containing a master and a slave node:
* Clone Set: SAPHana_R82_00-clone [SAPHana_TST_HDB00] (promotable)

If you compare the SLES and RHEL clusters above, even though they are completely different clusters, you can see the similarities and the technologies used to perform cluster operations.

Congratulations – you should now have a firm grasp of the key areas and terms of an SAP cluster running on Google Cloud Platform. Where to go from here? Review our other blogs to become an expert in understanding your cluster and its behavior:
- What's happening in your SAP systems? Find out with Pacemaker Alerts
- Analyze Pacemaker events in Cloud Logging

Related article: What's happening in your SAP systems? Find out with Pacemaker Alerts
Source: Google Cloud Platform

Scalable Python on BigQuery using Dask and GPUs

BigQuery is Google Cloud's fully managed serverless data platform that supports querying using ANSI SQL. BigQuery also has a data lake storage engine that unifies SQL queries with other open source processing frameworks such as Apache Spark, TensorFlow, and Dask. BigQuery storage provides an API layer for OSS engines to process data. This API enables mixing and matching programming in languages like Python with structured SQL in the same data platform. This post provides an introduction to using BigQuery with one popular distributed Python framework, Dask, an open source library that makes it easy to scale Python tools to BigQuery-sized datasets. We will also show you how to extend Dask with RAPIDS, a suite of open-source libraries and APIs to execute GPU-accelerated pipelines directly on BigQuery storage.

Integrating Dask and RAPIDS with BigQuery storage

A core component of the BigQuery architecture is the separation of compute and storage. BigQuery storage can be accessed directly over a highly performant Storage Read API, which enables users to consume data in multiple streams and provides both column projections and filtering at the storage level. Coiled, a Google Cloud Partner that provides enterprise-grade Dask in your GCP account, developed an open-source Dask-BigQuery connector (GitHub) that enables Dask processing to take advantage of the Storage Read API and governed access to BigQuery data.

RAPIDS is an open-source library spawned from NVIDIA that uses Dask to distribute data and computation over multiple NVIDIA GPUs. The distributed computation can be done on a single machine or in a multi-node cluster. Dask integrates with RAPIDS cuDF, XGBoost, and RAPIDS cuML for GPU-accelerated data analytics and machine learning.

To start using Dask on BigQuery data, you can install the dask-bigquery connector from any Python IDE. You simply install dask-bigquery with pip or conda, authenticate with Google Cloud, and then use the few lines of Python code shown below to pull data from a BigQuery table.

import dask_bigquery

ddf = dask_bigquery.read_gbq(
    project_id="your_project_id",
    dataset_id="your_dataset",
    table_id="your_table",
)
ddf.head()

Achieving Python scalability on BigQuery with Dataproc

While Dask and the BigQuery connector can essentially be installed anywhere Python can run and can scale to the number of cores available on that machine, the real power of scaling comes when you can use an entire cluster of virtual machines. An easy way to do this on Google Cloud is by using Dataproc. Using the initialization actions outlined in this GitHub repo, setting up Dask and RAPIDS on a Dataproc cluster with NVIDIA GPUs is fairly straightforward. Let's walk through an example using the NYC taxi dataset.
As a first step, let's create a RAPIDS-accelerated Dask YARN cluster object on Dataproc by running the following code:

from dask.distributed import Client
from dask_yarn import YarnCluster

cluster = YarnCluster(worker_class="dask_cuda.CUDAWorker",
                      worker_gpus=1, worker_vcores=4, worker_memory='24GB',
                      worker_env={"CONDA_PREFIX": "/opt/conda/default/"})
cluster.scale(4)
client = Client(cluster)

Now that we have a Dask client, we can use it to read the NYC taxi dataset from a BigQuery table through the Dask-BigQuery connector:

d_df = dask_bigquery.read_gbq(
    project_id="k80-exploration",
    dataset_id="spark_rapids",
    table_id="nyc_taxi_0",
)

Next, let's use the RAPIDS Dask cuDF library to accelerate the preprocessing with GPUs:

taxi_df = dask_cudf.from_dask_dataframe(d_df)
taxi_df = clean(taxi_df, remap, must_haves)
taxi_df = taxi_df.query(' and '.join(query_frags))

Finally, we can use a feature of the Dask dataframe to split the data into two datasets – one for training and one for testing. These datasets can also be converted to XGBoost DMatrix objects and sent into XGBoost for training on GPU:

xgb_clasf = xgb.dask.train(client,
                           params,
                           dmatrix_train,
                           num_boost_round=2000,
                           evals=[(dmatrix_train, 'train'), (dmatrix_test, 'test')])

The complete notebook can be accessed at this GitHub link. Currently, the Dask-BigQuery connector doesn't support writing back to BigQuery natively; users need to work around this through Cloud Storage: with Dask or Dask-RAPIDS, write to GCS first with to_parquet("gs://temp_path/"), then have BigQuery load from GCS with bigquery.Client.load_table_from_uri("gs://temp_path/"). A short sketch of this workaround follows at the end of this post.

What's next

In this blog, we introduced a few key components that allow BigQuery users to scale their favorite Python libraries through Dask to process large datasets. With the broad portfolio of NVIDIA GPUs embedded across Google Cloud data analytics services like BigQuery and Dataproc, and the availability of GPU-accelerated software like RAPIDS, developers can significantly accelerate their analytics and machine learning workflows.

Acknowledgements: Benjamin Zaitlen, Software Engineer Manager, NVIDIA; Jill Milton, Senior Partnership Manager, NVIDIA; Coiled Developer Team.
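As referenced above, here is a minimal sketch of the write-back workaround. The bucket path, dataset, and destination table names are illustrative assumptions; the Dask dataframe is written to Cloud Storage as Parquet and then loaded into BigQuery with a load job.

from google.cloud import bigquery

# 1. Write the Dask / Dask-cuDF dataframe out to Cloud Storage as Parquet.
#    taxi_df is the dataframe from the training example above; the GCS prefix is illustrative.
taxi_df.to_parquet("gs://temp_path/nyc_taxi_out/")

# 2. Load the Parquet files from GCS into a BigQuery table.
bq_client = bigquery.Client(project="k80-exploration")
job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET)
load_job = bq_client.load_table_from_uri(
    "gs://temp_path/nyc_taxi_out/*.parquet",
    "k80-exploration.spark_rapids.nyc_taxi_scored",  # illustrative destination table
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete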
Related article: Learn how BI Engine enhances BigQuery query performance

Source: Google Cloud Platform