Running AlphaFold batch inference with Vertex AI Pipelines

Today, to accelerate research in the bio-pharma space, from the creation of treatments for diseases to the production of new synthetic biomaterials, we are announcing a new Vertex AI solution that demonstrates how to use Vertex AI Pipelines to run DeepMind’s AlphaFold protein structure predictions at scale. Once a protein’s structure is determined and its role within the cell is understood, scientists can develop drugs that can modulate the protein function based on its role in the cell. DeepMind, an AI research organization within Alphabet, created the AlphaFold system to advance this area of research by helping data scientists and other researchers to accurately predict protein geometries at scale.

In 2020, in the Critical Assessment of Techniques for Protein Structure Prediction (CASP14) experiment, DeepMind presented a version of AlphaFold that predicted protein structures so accurately that experts declared the “protein-folding problem” solved. The next year, DeepMind open sourced the AlphaFold 2.0 system. Soon after, Google Cloud released a solution that integrated AlphaFold with Vertex AI Workbench to facilitate interactive experimentation. This made it easier for many data scientists to efficiently work with AlphaFold, and today’s announcement builds on that foundation.

Last week, AlphaFold took another significant step forward when DeepMind, in partnership with the European Bioinformatics Institute (EMBL-EBI), released predicted structures for nearly all cataloged proteins known to science. This release expands the AlphaFold database from nearly 1 million structures to over 200 million structures, and potentially increases our understanding of biology to a profound degree. Between this continued growth in the AlphaFold database and the efficiency of Vertex AI, we look forward to the discoveries researchers around the world will make.
In this article, we’ll explain how you can start experimenting with this solution, and we’ll also survey its benefits, which include lower costs through optimized selection of hardware, reproducibility through experiment tracking, lineage and metadata management, and faster run time through parallelization.

Background for running AlphaFold on Vertex AI

Generating a protein structure prediction is a computationally intensive task. It requires significant CPU and ML accelerator resources and can take hours or even days to compute. Running inference workflows at scale can be challenging: you have to optimize inference elapsed time, optimize hardware resource utilization, and manage experiments. Our new Vertex AI solution is meant to address these challenges.

To better understand how the solution addresses these challenges, let’s review the AlphaFold inference workflow:

1. Feature preprocessing. You use the input protein sequence (in the FASTA format) to search through genetic sequences across organisms and protein template databases using common open source tools. These tools include JackHMMER with MGnify and UniRef90, HHBlits with Uniclust30 and BFD, and HHSearch with PDB70. The outputs of the search (which consist of multiple sequence alignments (MSAs) and structural templates) and the input sequences are processed as inputs to an inference model. The feature preprocessing steps run only on a CPU platform. If you’re using full-size databases, the process can take a few hours to complete.

2. Model inference. The AlphaFold structure prediction system includes a set of pretrained models, including models for predicting monomer structures, models for predicting multimer structures, and models that have been fine-tuned for CASP. At inference time, you independently run the five models of a given type (such as monomer models) on the same set of inputs. By default, one prediction is generated per model when folding monomers, and five predictions are generated per model when folding multimers. This step of the inference workflow is computationally very intensive and requires GPU or TPU acceleration.

3. (Optional) Structure relaxation. To resolve any structural violations and clashes in the structure returned by the inference models, you can perform a structure relaxation step. In the AlphaFold system, you use the OpenMM molecular mechanics simulation package to perform a restrained energy minimization procedure. Relaxation is also very computationally intensive, and although you can run the step on a CPU-only platform, you can also accelerate the process by using GPUs.

The Vertex AI solution

The AlphaFold batch inference with Vertex AI solution lets you efficiently run AlphaFold inference at scale by focusing on the following optimizations:

Optimizing the inference workflow by parallelizing independent steps.

Optimizing hardware utilization (and as a result, costs) by running each step on the optimal hardware platform. As part of this optimization, the solution automatically provisions and deprovisions the compute resources required for a step.

Providing a robust and flexible experiment tracking approach that simplifies the process of running and analyzing hundreds of concurrent inference workflows.

The following diagram shows the architecture of the solution.

[Diagram: architecture of the solution]

The solution encompasses the following:

A strategy for managing genetic databases. The solution includes high-performance, fully managed file storage. In this solution, Cloud Filestore is used to manage multiple versions of the databases and to provide high-throughput, low-latency access.

An orchestrator to parallelize, orchestrate, and efficiently run steps in the workflow. Predictions, relaxations, and some feature engineering can be parallelized. In this solution, Vertex AI Pipelines is used as the orchestrator and runtime execution engine for the workflow steps.

Optimized hardware platform selection for each step. The prediction and relaxation steps run on GPUs, and feature engineering runs on CPUs. The prediction and relaxation steps can use multi-GPU node configurations. This is especially important for the prediction step because memory usage grows approximately quadratically with the number of residues, so predicting a large protein structure can exceed the memory of a single GPU device.

Metadata and artifact management. The solution includes management for running and analyzing experiments at scale. In this solution, Vertex AI Metadata is used to manage metadata and artifacts.

The basis of the solution is a set of reusable Vertex AI Pipelines components that encapsulate core steps in the AlphaFold inference workflow: feature preprocessing, prediction, and relaxation. In addition to those components, there are auxiliary components that break down the feature engineering step into tools, and helper components that aid in the organization and orchestration of the workflow.

The solution includes two sample pipelines: the universal pipeline and a monomer pipeline. The universal pipeline mirrors the settings and functionality of the inference script in the AlphaFold GitHub repository. It tracks elapsed time and optimizes compute resource utilization. The monomer pipeline further optimizes the workflow by making feature engineering more efficient. You can customize the pipeline by plugging in your own databases.

Next steps

To learn more and to try out this solution, check our GitHub repository, which contains the components and the universal and monomer pipelines. The artifacts in the repository are designed so that you can customize them. In addition, you can integrate this solution into your upstream and downstream workflows for further analysis. To learn more about Vertex AI, visit our product page.
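To make the shape of the inference workflow described earlier more concrete, here is a minimal Python sketch of its three stages: one feature preprocessing pass, then independent predictions fanned out in parallel, then optional relaxation. It is not the solution’s code and does not use the Vertex AI Pipelines API. A local thread pool stands in for the orchestrator, and every function name and returned dictionary below is a hypothetical placeholder. Only the counts (five models per type, one prediction per model for monomers, five for multimers) come from the text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

NUM_MODELS = 5  # five models of a given type are run on the same inputs
PREDICTIONS_PER_MODEL = {"monomer": 1, "multimer": 5}  # defaults stated in the article


def prediction_jobs(model_type: str) -> List[Tuple[int, int]]:
    """List the independent (model, sample) prediction jobs for one input."""
    per_model = PREDICTIONS_PER_MODEL[model_type]
    return [(m, s) for m in range(1, NUM_MODELS + 1) for s in range(per_model)]


def preprocess_features(sequence: str) -> Dict:
    """Placeholder for the CPU-only MSA and template search step."""
    return {"sequence": sequence, "num_residues": len(sequence)}


def predict(features: Dict, model: int, sample: int) -> Dict:
    """Placeholder for one accelerator-backed model inference run."""
    return {"model": model, "sample": sample,
            "num_residues": features["num_residues"], "relaxed": False}


def relax(prediction: Dict) -> Dict:
    """Placeholder for the optional structure relaxation step."""
    return {**prediction, "relaxed": True}


def run_workflow(sequence: str, model_type: str = "monomer",
                 run_relaxation: bool = True) -> List[Dict]:
    # Stage 1: features are computed once and shared by every prediction job.
    features = preprocess_features(sequence)
    jobs = prediction_jobs(model_type)
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # Stage 2: predictions do not depend on each other, so they fan out.
        predictions = list(pool.map(lambda job: predict(features, *job), jobs))
        # Stage 3 (optional): each relaxation depends only on its own prediction.
        if run_relaxation:
            predictions = list(pool.map(relax, predictions))
    return predictions


if __name__ == "__main__":
    results = run_workflow("MKTAYIAKQR", model_type="multimer")
    print(len(results), "predictions")
```

In a real pipeline each placeholder would be a separate step scheduled on its own hardware (CPU for preprocessing, GPU for prediction and relaxation), which is what lets the orchestrator provision resources per step.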
Acknowledgements

We would like to thank the following people for their collaboration: Shweta Maniar, Sampath Koppole, Mikhail Chrestkha, Jasper Wong, Alex Burdenko, Meera Lakhavani, Joan Kallogjeri, Dong Meng (NVIDIA), Mike Thomas (NVIDIA), and Jill Milton (NVIDIA).

Finally and most importantly, we would like to thank our Solution Manager Donna Schut for managing this solution from start to finish. This would not have been possible without Donna.
Source: Google Cloud Platform

Five must-know security and compliance features in Cloud Logging

As enterprise and public sector cloud adoption continues to accelerate, having an accurate picture of who did what in your cloud environment is important for security and compliance purposes. Logs are critical when you are attempting to detect a breach, investigating ongoing security issues, or performing forensic investigations. These five must-know Cloud Logging security and compliance features can help customers create logs to best conduct security audits. The first three features were launched recently in 2022, while the last two features have been available for some time.

1. Cloud Logging is a part of Assured Workloads.

Google Cloud’s Assured Workloads helps customers meet compliance requirements with a software-defined community cloud. Cloud Logging and external log data are in scope for many regulations, which is why Cloud Logging is now part of Assured Workloads. Cloud Logging with Assured Workloads can make it even easier for customers to meet the log retention and audit requirements of NIST 800-53 and other supported frameworks. Learn how to get started by referring to this documentation.

2. Cloud Logging is now FedRAMP High certified.

FedRAMP is a U.S. government program that promotes the adoption of secure cloud services by providing a standardized approach to security and risk assessment for federal agencies adopting cloud technologies. The Cloud Logging team has received certification for implementing the controls required for compliance with FedRAMP at the High Baseline level. This certification will allow customers to store sensitive data in cloud logs and use Cloud Logging to meet their own compliance control requirements. Below are the controls that Cloud Logging has implemented as required by NIST for this certification. Each capability is followed, in parentheses, by the control it maps to:

Event Logging (AU-2): A wide variety of events are captured. Examples of events as specified include password changes, failed logons or failed accesses related to systems, security or privacy attribute changes, administrative privilege usage, Personal Identity Verification (PIV) credential usage, data action changes, query parameters, or external credential usage.

Making Audits Easy (AU-3): To provide users with all the information needed for an audit, we capture the type of event, time occurred, location of the event, source of the event, outcome of the event, and identity information.

Extended Log Retention (AU-4): We support the outlined policy for log storage capacity and retention to provide support for after-the-fact investigations of incidents. We help customers meet their regulatory and organizational information retention requirements by allowing them to configure their retention period.

Alerts for Log Failures (AU-5): A customer can create alerts when a log failure occurs.

Create Evidence (AU-16): A system-wide (logical or physical) audit trail composed of audit records in a standardized format is captured. Cross-organizational auditing capabilities can be enabled.

Check out this webinar to learn how Assured Workloads can help support your FedRAMP compliance efforts.

3. “Manage your own Keys,” also known as customer-managed encryption keys (CMEK), can encrypt Cloud Logging log buckets.

For customers with specific encryption requirements, Cloud Logging now supports CMEK via Cloud KMS. CMEK can be applied to individual logging buckets and can be used with the log router. Cloud Logging can be configured to centralize all logs for the organization into a single bucket and router if desired, which makes applying CMEK to the organization’s log storage simple. Learn how to enable CMEK for Cloud Logging buckets here.

4. Setting a high bar for cloud provider transparency with Access Transparency.

Access Transparency logs can help you to audit actions taken by Google personnel on your content, and can be integrated with your existing security information and event management (SIEM) tools to help automate your audits on the rare occasions that Google personnel may access your content. While Cloud Audit logs tell you who in your organization accessed data in Google Cloud, Access Transparency logs tell you if any Google personnel accessed your data. These Access Transparency logs can help you:

Verify that Google personnel are accessing your content only for valid business reasons, such as fixing an outage or attending to your support requests.

Review actual actions taken by personnel when access is approved.

Verify and track Assured Workload Support compliance with legal or regulatory obligations.

Learn how to enable Access Transparency for your organization here.

5. Track who is accessing your log data with Access Approval logs.

Access Approvals can help you restrict access to your content by Google personnel according to predefined characteristics. While this is not a logging-specific feature, it is one that many customers ask about. If a Google support person or engineer needs to access your content for support or debugging purposes (in the event a service request is created), you would use the access approval tool to approve or reject the request. Learn about how to set up access approvals here.

We hope that these capabilities make adoption and use of Cloud Logging easier, more secure, and more compliant. With additional features on the way, your feedback on how Cloud Logging can help meet additional security or compliance obligations is important to us. Learn more about Cloud Logging with our qwiklab quest and join us in our discussion forum. As always, we welcome your feedback. To share feedback, contact us here.
Source: Google Cloud Platform

5 steps to prepare developers for cloud modernization

If you’re thinking about what it takes to modernize your applications, you’re not alone. Companies everywhere now understand that migrating applications to the cloud and shifting to a cloud-first approach is critical to business competitiveness. The purpose of modernizing applications is to better align them to current and future business needs. By deploying enterprise applications to the cloud, you gain greater ability to innovate, improve security, scale to meet demand, manage costs, and deliver rich and consistent customer experiences anywhere in the world more quickly.

But as you move to the cloud, there are many options to choose from and skills to gain. One of the most important parts of this effort is understanding how to prepare developers for cloud modernization—and one of the trickiest parts is knowing where to start.

According to research on Developer Velocity, the number one driver of business performance is best-in-class developer tools [1]. Companies that create the right environment, by providing strong tools and removing points of friction for developers to innovate, have 47 percent higher developer satisfaction and retention rates than those in the lowest quartile for Developer Velocity. With Microsoft Azure, you’ll find not only the tools and technologies that you need to move to the cloud, but also extensive developer support for cloud modernization.

In this article, we’ll walk you through technical documentation, educational resources, and step-by-step guidance to help you build the skills and strategy needed to successfully modernize your applications. We use Azure App Service as our example, but the same concepts apply to other tools you might use in your modernization efforts.

Here are five steps to take to start preparing for cloud modernization:

1. Watch how application migration works.

Migrating existing, on-premises applications to the cloud is often the focus of initial application modernization efforts. Once the business case has been made to migrate an application to the cloud, you’ll need to assess the application for all the dependencies that can affect whether it can be successfully migrated without modifying the application. In the case of App Service, a migration assistant guides you through the assessment. Then, if the assessment indicates that the application can be migrated, the migration assistant performs the migration. To get an introduction to how the assessment and migration process works, watch the overview video on how to migrate web apps to App Service.

2. Learn to migrate an on-premises application to the cloud.

The best way to understand what it takes to migrate an application is to try it for yourself. To learn how to migrate an on-premises web application to App Service, take the step-by-step online course—including a hands-on lab—that guides you through migration and post-migration. Using a sandbox environment and access to free resources, you’ll get an in-depth walkthrough of how to migrate your web application, from assessment through post-migration tasks. You’ll also get background on why the assessment phase is so important, what types of problems it’s intended to identify, and what to do if any problems are found. Next, the course takes you through the migration process and provides guidance on the settings you’ll need to choose from, and it prepares you for additional tasks that might be necessary to get the web app in working order.

3. Build a web app in the language of your choice.

Learning how to build a cloud-native application is another important step in preparing yourself to shift to a cloud-first approach. To give it a try, sign up for an Azure free account, which gives you access to dozens of free services, including App Service. Along with access to a wide range of cloud resources, you get developer support for cloud modernization through quickstart guides that walk you through creating and deploying a web app in App Service using the language of your choice, including .NET, Node.js, Java, Python, and other languages. This is also a great time to explore other Azure cloud capabilities and use the $200 credit that you get with the Azure free account.

4. Assess your own web apps for modernization readiness.

Once you understand the basics of migrating and deploying applications in the cloud, it’s time to get to work on the process of assessing and migrating your own web apps. Use the free App Service migration tool to run a scan on your web app’s public URL. The tool provides a compatibility report on the technologies your app uses and whether App Service fully supports them. If your app is compatible, the tool guides you through downloading the migration assistant, which automates the migration with minimal or no code changes.

5. Download the App Migration Toolkit.

With a solid background in how to prepare for modernization, you’re in a good position to start putting the full range of Azure developer support for cloud modernization to work. Download the App Migration Toolkit to find the resources you need to successfully modernize your ASP.NET applications from start to finish. From building your business case to best practices and help gaining skills, the toolkit provides practical guidance and support to help you turn your application modernization plans into reality.

While application modernization is a significant initiative that requires strategy, planning, skill-building, and investment of time and resources, the benefits to the business are worth the effort. Fortunately, Azure simplifies the process of figuring out how to prepare developers for cloud modernization. The App Migration Toolkit gives you the skills and knowledge needed to help your organization innovate and stay competitive.

[1] Developer Velocity: How software excellence fuels business performance.
Source: Azure

AWS Outposts rack is now supported in the AWS Asia Pacific (Jakarta) Region

AWS Outposts rack is now supported in the AWS Asia Pacific (Jakarta) Region. AWS Outposts rack is a fully managed service that offers the same AWS infrastructure, AWS services, APIs, and tools to virtually any on-premises data center or co-location space for a truly consistent hybrid experience.
Source: aws.amazon.com

AWS Systems Manager announces a simplified onboarding experience for Application Manager

Application Manager, a capability of AWS Systems Manager, today announces a simplified onboarding experience for customers. Application Manager is a central hub on AWS for creating, viewing, and operating applications from a single console. With Application Manager, customers can discover and manage their resources across multiple AWS services, such as AWS CloudFormation, AWS Launch Wizard, AWS Service Catalog AppRegistry, AWS Resource Groups, Amazon Elastic Kubernetes Service (Amazon EKS), and Amazon Elastic Container Service (Amazon ECS). With this feature, IT professionals can now follow a simple guided process to set up the Application Manager dashboards.
Source: aws.amazon.com

AWS Cloud Map is now available in two new AWS Regions

AWS Cloud Map is now available in the AWS Asia Pacific (Osaka) and Asia Pacific (Jakarta) Regions. AWS Cloud Map is a cloud resource discovery service. With AWS Cloud Map, you can define custom names for your application resources, such as Amazon Elastic Container Service (Amazon ECS) tasks, Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon DynamoDB tables, or other cloud resources. You can then use these custom names to discover the location and metadata of cloud resources from your applications using the AWS SDK and authenticated API queries.
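As a rough illustration of discovering resources by custom name through the SDK, the sketch below wraps the Cloud Map `DiscoverInstances` call as exposed by a boto3 `servicediscovery` client. The namespace and service names (`example.local`, `backend`) are hypothetical, and the helper `discover_endpoints` is not part of any AWS library. The client is passed in as a parameter, so the function itself does not require AWS credentials until it is called with a real client.

```python
from typing import Any, Dict, List


def discover_endpoints(client: Any, namespace: str, service: str) -> List[Dict[str, str]]:
    """Return one dict per registered instance: its ID plus its attributes.

    `client` is expected to behave like boto3.client("servicediscovery").
    """
    response = client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service,
    )
    return [
        {"id": instance["InstanceId"], **instance.get("Attributes", {})}
        for instance in response.get("Instances", [])
    ]


if __name__ == "__main__":
    import boto3  # requires the boto3 package and configured AWS credentials

    cloud_map = boto3.client("servicediscovery", region_name="ap-southeast-3")
    # "example.local" and "backend" are placeholder names for illustration.
    for endpoint in discover_endpoints(cloud_map, "example.local", "backend"):
        print(endpoint)
```

The attributes returned for each instance depend on how it was registered. For IP-based instances they typically include keys such as `AWS_INSTANCE_IPV4` and `AWS_INSTANCE_PORT`.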
Source: aws.amazon.com

AWS Console Mobile Application adds support for the recently visited services feature

Users of the AWS Console Mobile Application can now easily view and access their recently visited AWS services that are supported by the mobile app for iOS and Android. A user’s most recently visited AWS services are synchronized between the mobile and web applications. The recently visited services feature is located at the bottom of the dashboard screen and shows users a list of their ten most recently visited AWS services. Users can browse this list by swiping. Tapping an AWS service in the list takes the user to that service’s detail screen in the mobile app.
Source: aws.amazon.com