Amazon Transcribe Medical now offers automatic identification of Protected Health Information (PHI)

Amazon Transcribe Medical is a HIPAA-eligible automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capabilities to applications for healthcare and life sciences. Starting today, we are pleased to support automatic identification of Protected Health Information (PHI) in your medical transcriptions. With automatic PHI identification, customers can reduce the cost, time and effort of identifying PHI content through manual processes. PHI entities are clearly labeled in every output transcript, so that additional downstream processing for a variety of purposes, such as redaction before text analytics, is straightforward.
Source: aws.amazon.com

Introducing HPC VM images—pre-tuned for optimal performance

Today, we're excited to announce the Public Preview of a CentOS 7-based Virtual Machine (VM) image optimized for high performance computing (HPC) workloads, with a focus on tightly-coupled MPI workloads.

In 2020, we introduced several features and best-practice tunings to help achieve optimal MPI performance on Google Cloud. With these best practices, we demonstrated that MPI ping-pong latency falls into single-digit microseconds (us) and that small MPI messages are delivered in 10us or less. Improved MPI performance translates directly to improved application scaling, expanding the set of HPC workloads that run efficiently on Google Cloud. However, building a VM image that includes these best practices requires systems expertise and knowledge of Google Cloud. Starting with an HPC-optimized image can make it easier to maintain an image.

The HPC VM image makes it easy and quick to instantiate VMs that are tuned to achieve optimal CPU and network performance on Google Cloud. The HPC VM image is available at no additional cost via the Google Cloud Marketplace. Continue reading below for details about the HPC VM image and its benefits, or skip ahead to our documentation and quickstart guide to start creating instances using the HPC VM image today!

Benefits of using the HPC VM image

The HPC VM image is pre-configured and regularly maintained, providing the following advantages to HPC customers on Google Cloud:

- Easily create HPC-ready VMs out of the box that incorporate our best practices for tightly-coupled HPC applications. You can quickly create HPC-ready VMs and always stay up to date with the latest tunings.
- Networking optimizations for tightly-coupled workloads help reduce latency for small messages, and benefit applications that are heavily dependent on point-to-point and collective communications.
- Compute optimizations for HPC workloads allow more predictable single-node high performance by reducing system jitter that can lead to performance variation.
- Consistent and reproducible multi-node performance by using a set of tunings which have been tested across a range of HPC workloads.

Using the HPC VM image is simple and easy, as it is a drop-in replacement for the standard CentOS 7 image.

Customer story: Scaling SDPB solver using CloudyCluster and HPC VM image

Walter Landry is a research software engineer in the Caltech Particle Theory Group working with the international Bootstrap Collaboration. The collaboration uses SDPB, a semidefinite program solver, to study Quantum Field Theories, with application to a wide variety of problems in theoretical physics, such as early universe inflation, superconductors, quantum Hall fluids, and phase transitions.

To expand the collaboration's computation capabilities, Landry wanted to see how SDPB would scale on Google Cloud. Working with Omnibond CloudyCluster and leveraging the HPC VM image, Landry achieved comparable performance and scaling to an on-premises cluster at Yale, based on Intel Xeon Gold 6240 processors and Infiniband FDR.

Google Cloud's C2-Standard-60 instance type is based on the second-generation Intel Xeon Scalable Processor. The C2 family of instances can utilize placement policies to reduce inter-node latency, ideal for tightly-coupled MPI workloads. CloudyCluster leverages the HPC VM image and placement policy for the C2 family out of the box, making it seamless for the researcher. These tests show the ability to scale low latency workloads across many instances in Google Cloud.

If you would like to try out the HPC VM image with Omnibond CloudyCluster, an updated version of Omnibond CloudyCluster using the HPC VM image is available in the Google Cloud Marketplace. This version also comes complete with NSF-funded Open OnDemand, led by Ohio Supercomputing Center, making it easy for system administrators to provide web access to HPC resources.

What's included in the HPC VM image? Tunings and optimizations

The current release of the HPC VM image focuses on tunings for tightly coupled HPC workloads and implements the following best practices for optimal MPI application performance:

- Disable Hyper-Threading: Intel Hyper-Threading is disabled by default in the HPC VM image. Turning off Hyper-Threading allows more predictable performance and can decrease execution time for some HPC jobs.
- MPI collective tunings: The choice of MPI collective algorithms can have a significant impact on MPI application performance. The HPC VM image includes recommended Intel MPI collective algorithms to use in the most common MPI job configurations.
- Increase tcp_*mem settings: C2 machines can support up to 32 Gbps bandwidth, and they benefit from larger TCP memory than the Linux defaults.
- Enable busy polling: Busy polling can help reduce latency in the network receive path by allowing socket-layer code to poll the receive queue of a network device and by disabling network interrupts.
- Raise user limits: Default limits on system resources—like open files and numbers of processes that any one user can use—are typically unnecessary for HPC jobs where compute nodes in a cluster aren't shared between users.
- Disable Linux firewalls, disable SELinux: For Google Cloud CentOS Linux images, SELinux and the firewall are turned on by default. The HPC VM image disables Linux firewalls and SELinux to improve MPI performance.
- Disable CPUIdle: C2 machines support CPU C-states to enter low-power mode and save energy. Disabling CPUIdle can help reduce jitter and provide consistent low latency.

Note that turning off the host firewall and SELinux removes two host-level controls; nodes built from the image then depend on VPC firewall rules and network isolation for their perimeter, which should be verified as part of the cluster design. The benefits of these tunings can vary from application to application and we recommend that you benchmark your applications to find the most efficient or cost-effective configuration.

Performance measurement using HPC benchmarks

We have compared the performance of the HPC VM image vs. the default CentOS 7 image across both the Intel MPI Benchmarks and real application benchmarks for Finite Element Analysis (ANSYS LS-DYNA), Computational Fluid Dynamics (ANSYS Fluent) and Weather Modeling (WRF). The following versions of the HPC VM image and CentOS image were used for the benchmarks in this section:

- HPC VM image: hpc-centos-7-v20210119 (with --nomitigation applied and mpitune configs installed as suggested in the HPC VM image documentation)
- CentOS image: centos-7-v20200811

Intel MPI Benchmark (IMB) Ping-Pong

IMB Ping-Pong measures the ping-pong latency of transferring a fixed-size message between two ranks over a pair of VMs. On average, we saw that the HPC VM image reduces inter-node ping-pong latency by up to 50% compared to the default CentOS 7 image (baseline).

Benchmark setup
- 2x C2-standard-60 VMs with compact placement policy
- MPI Library: Intel MPI Library 2018 update 4
- Command line: mpirun -genv I_MPI_PIN=1 -genv I_MPI_PIN_PROCESSOR_LIST=0 -hostfile <hostfile> -np 2 -ppn 1 IMB-MPI1 Pingpong -iter 50000

Results

Intel MPI Benchmark (IMB) AllReduce

The IMB AllReduce benchmark measures the collective latency among multiple ranks across VMs. It reduces a vector of a fixed length with the MPI_SUM operation. We show 1 PPN (process-per-node) results to represent the case where there is 1 MPI rank/node and 30 threads/rank, and 30 PPN results where there are 30 MPI ranks/node and 1 thread/rank. We saw that the HPC VM image reduces AllReduce latency by up to 40% for 240 MPI ranks across 8 nodes (30 processes per node) compared to the default CentOS 7 image (baseline).

Benchmark setup
- 8x C2-standard-60 VMs with compact placement policy
- MPI Library: Intel MPI Library 2018 update 4
- Command line: mpirun -tune -genv I_MPI_PIN=1 -genv I_MPI_FABRICS 'shm:tcp' -hostfile <hostfile> -np <#vm*ppn> -ppn <ppn> IMB-MPI1 AllReduce -iter 50000 -npmin <#vm*ppn>

Results

HPC application benchmarks: LS-DYNA, Fluent and WRF

At an application level, the HPC VM image yielded up to a 25% performance improvement on the ANSYS LS-DYNA "3 cars" vehicle collision simulation benchmark when running on 240 MPI ranks across 8 Intel Xeon processor based C2 instances. With ANSYS Fluent and WRF, we observed up to 6% performance improvement using the HPC VM image in comparison with the default CentOS image.

Benchmark setup
- ANSYS LS-DYNA ("3-cars" model): 8 C2-standard-60 VMs with compact placement policy, using the LS-DYNA MPP binary compiled with AVX-2
- ANSYS Fluent ("aircraft_wing_14m" model): 12 C2-standard-60 VMs with compact placement policy
- WRF V3 Parallel Benchmark (12 KM CONUS): 16 C2-standard-60 VMs with compact placement policy
- MPI Library: Intel MPI Library 2018 update 4

Results

What's next? SchedMD Slurm support and additional Linux distributions

We are continuing to work with our HPC partners to integrate the HPC VM image with partner offerings by default. Starting next month, HPC customers who use Slurm will be able to start HPC-ready clusters that make use of the HPC VM image by default (a preview version is available here). For customers who are looking for HPC Enterprise Linux options and support, SUSE is working with Google on a SUSE Enterprise HPC VM image that has been optimized for Google Cloud. If you're interested in learning more about the SUSE Enterprise HPC VM image, or have a requirement for additional integrations or Linux distributions, please contact us.

Get started today!

The HPC VM image is available in Preview for all customers through the Google Cloud Marketplace today. Check out our documentation and quickstart guide for more details on creating instances using the HPC VM image. Special thanks to Jiuxing Liu, Tanner Love, Jian Yang, Hongbo Lu and Pallavi Phene for their contributions.

Related article: Getting higher MPI performance for HPC applications on Google Cloud. You can reduce MPI latency in HPC workloads running on Google Cloud by following these best practices.
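The tunings list above trades several host-level controls (firewalld, SELinux, and, for the benchmarked image, CPU vulnerability mitigations via --nomitigation) for latency. The following audit script is an added example and is not part of the Google Cloud post or of the HPC VM image tooling: it takes a snapshot of the relevant settings from a Linux host, or from a constructed dictionary, and separates the findings a security reviewer must see (a control is off and needs a compensating control) from the ones an HPC admin cares about (a tuning is missing). Which kernel parameters --nomitigation sets is not stated on the page, so the mitigation check reads the kernel's own report under /sys/devices/system/cpu/vulnerabilities instead. The snapshot keys, the sample busy_poll and tcp_*mem values, and the two constructed host snapshots are assumptions made for illustration.

```python
"""Audit a Linux host for the HPC VM image tunings and flag the security-relevant ones.

Added example, not Google Cloud code. Snapshot keys and sample values are assumptions.
"""
from __future__ import annotations

import glob
import os
import subprocess
from dataclasses import dataclass

# Upstream Linux defaults for the max field of tcp_rmem / tcp_wmem (bytes); a tuned
# host is expected to be above these.
DEFAULT_TCP_RMEM_MAX = 6_291_456
DEFAULT_TCP_WMEM_MAX = 4_194_304


@dataclass(frozen=True)
class Finding:
    check: str
    security_relevant: bool  # True: a host security control is off; False: a tuning is missing
    message: str


def _read(path: str, default: str = "") -> str:
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return default


def _run(cmd: list[str], default: str = "") -> str:
    try:
        return subprocess.run(cmd, capture_output=True, text=True, timeout=5).stdout.strip()
    except (OSError, subprocess.SubprocessError):
        return default


def _to_int(value: str) -> int:
    return int(value) if value.strip().isdigit() else 0


def _max_field(sysctl_value: str) -> int:
    """'4096 131072 6291456' -> 6291456; 0 when missing or malformed."""
    parts = sysctl_value.split()
    return _to_int(parts[-1]) if parts else 0


def collect_live_snapshot() -> dict:
    """Read the settings from the running Linux host; anything unreadable comes back empty."""
    vuln_dir = "/sys/devices/system/cpu/vulnerabilities"
    return {
        "selinux": _run(["getenforce"], "unknown"),
        "firewalld": _run(["systemctl", "is-active", "firewalld"], "unknown"),
        "smt_control": _read("/sys/devices/system/cpu/smt/control", "unknown"),
        "busy_poll": _read("/proc/sys/net/core/busy_poll", "0"),
        "tcp_rmem": _read("/proc/sys/net/ipv4/tcp_rmem"),
        "tcp_wmem": _read("/proc/sys/net/ipv4/tcp_wmem"),
        "cpu_vulnerabilities": {os.path.basename(p): _read(p) for p in glob.glob(vuln_dir + "/*")},
    }


def audit(snapshot: dict) -> list[Finding]:
    findings: list[Finding] = []
    # Controls the HPC VM image turns off: each one needs a compensating control elsewhere.
    selinux = snapshot.get("selinux", "unknown")
    if selinux.lower() != "enforcing":
        findings.append(Finding("selinux", True, f"SELinux is '{selinux}'; mandatory access control is off"))
    if snapshot.get("firewalld") != "active":
        findings.append(Finding("host_firewall", True, "firewalld is not active; only VPC firewall rules filter traffic"))
    vulns = snapshot.get("cpu_vulnerabilities", {})
    unmitigated = sorted(name for name, state in vulns.items() if state.startswith("Vulnerable"))
    if unmitigated:
        findings.append(Finding("cpu_mitigations", True, "kernel reports unmitigated CPU issues: " + ", ".join(unmitigated)))
    # Performance tunings the image is supposed to apply.
    if snapshot.get("smt_control") != "off":
        findings.append(Finding("hyperthreading", False, "SMT is not disabled"))
    if _to_int(snapshot.get("busy_poll", "0")) <= 0:
        findings.append(Finding("busy_poll", False, "net.core.busy_poll is 0; busy polling is off"))
    if (_max_field(snapshot.get("tcp_rmem", "")) <= DEFAULT_TCP_RMEM_MAX
            or _max_field(snapshot.get("tcp_wmem", "")) <= DEFAULT_TCP_WMEM_MAX):
        findings.append(Finding("tcp_mem", False, "tcp_rmem/tcp_wmem max not raised above Linux defaults"))
    return findings


def security_findings(snapshot: dict) -> list[Finding]:
    return [f for f in audit(snapshot) if f.security_relevant]


# Constructed snapshots (illustrative values, not measured from real images).
HPC_IMAGE_HOST = {
    "selinux": "Disabled", "firewalld": "inactive", "smt_control": "off", "busy_poll": "50",
    "tcp_rmem": "4096 87380 16777216", "tcp_wmem": "4096 16384 16777216",
    "cpu_vulnerabilities": {"spectre_v2": "Vulnerable", "meltdown": "Not affected"},
}
STOCK_CENTOS_HOST = {
    "selinux": "Enforcing", "firewalld": "active", "smt_control": "on", "busy_poll": "0",
    "tcp_rmem": "4096 131072 6291456", "tcp_wmem": "4096 16384 4194304",
    "cpu_vulnerabilities": {"spectre_v2": "Mitigation: Full generic retpoline", "meltdown": "Mitigation: PTI"},
}

if __name__ == "__main__":
    for label, snap in (("hpc-image host", HPC_IMAGE_HOST), ("stock centos host", STOCK_CENTOS_HOST)):
        for f in audit(snap):
            print(f"[{label}] {'SECURITY' if f.security_relevant else 'perf'} {f.check}: {f.message}")
```

Run it on a node with `python3 audit.py` for the constructed snapshots, or call `audit(collect_live_snapshot())` on a cluster node to see which controls the image actually left off; the three security findings on the HPC-image snapshot are the ones to map to VPC firewall rules, network isolation and a documented decision on CPU mitigations.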
Source: Google Cloud Platform

The cloud trust paradox: 3 scenarios where keeping encryption keys off the cloud may be necessary

As we discussed in "The Cloud trust paradox: To trust cloud computing more, you need the ability to trust it less" and hinted at in "Unlocking the mystery of stronger security key management," there are situations where the encryption keys must be kept away from the cloud provider environment. While we argue that these are rare, they absolutely do exist. Moreover, when these situations materialize, the data in question or the problem being solved is typically hugely important.

Here are three patterns where keeping the keys off the cloud may in fact be truly necessary, or where doing so outweighs the benefits of cloud-based key management.

Scenario 1: The last data to go to the cloud

As organizations migrate data processing workloads to the cloud, there usually is a pool of data "that just cannot go." It may be the data that is the most sensitive, the most strictly regulated, or the data with the toughest internal security control requirements.

Examples of such highly sensitive data vary by industry and even by company. One global organization states that if they presented the external key approach to any regulator in the world, they would expect approval due to their robust key custody processes. Another organization was driven by its interpretation of PCI DSS and internal requirements to maintain control of its own master keys in FIPS 140-2 Level 3 HSMs that it owns and operates.

This means that risk, compliance or policy reasons make it difficult if not impossible to send this data set to the public cloud provider for storage or processing. This use case often applies to a large organization that is heavily regulated (financial, healthcare and manufacturing come to mind). It may be data about specific "priority" patients or data related to financial transactions of a specific kind. However, the organization may be willing to migrate this data set to the cloud as long as it is encrypted and they have sole possession of the encryption keys. Thus, a specific decision to migrate may be made involving a combination of risk, trust and auditor input. Or, customer key possession may be justified by the customer's interpretation of specific compliance mandates.

Now, some of you may say "but we have data that really should never go to the cloud." This may indeed be the case, but there is also general acceptance that digital transformation projects require the agility of the cloud, so an acceptable, if not entirely agreeable, solution must be found.

Scenario 2: Regional regulations and concerns

As cloud computing evolves, regional requirements are playing a larger role in how organizations migrate to the cloud and operate workloads in public cloud. This scenario focuses on a situation where an organization in one country wants to use a cloud based in a different country, but is not comfortable with the provider having access to the encryption keys for all stored data. Note that if the unencrypted data is processed in the same cloud, the provider will have access to the data at some point anyhow.

Some of these organizations may be equally uncomfortable with keys stored in any cryptographic device (such as an HSM) under the logical or physical control of the cloud provider. They reasonably conclude that such an approach is not really Hold Your Own Key (HYOK). This may be due to issues with the regulations they are subject to, with their government, or all of the above. Furthermore, regulators in Europe, Japan, India, Brazil and other countries are considering or strengthening mandates for keeping unencrypted data and/or encryption keys within their boundaries. Examples may include specific industry mandates (such as TISAX in Europe) that either state or imply that the cloud provider cannot have access to data under any circumstances, which may necessitate not having any way for the provider to access the encryption keys.

However, preliminary data indicates that some may accept models where the encryption keys are in the sole possession of the customer and located in their country, and hence off the cloud provider's premises (while the encrypted data may be outside). Another variation is the desire to have the keys for each country-specific data set in the respective country, under the control of that country's personnel or citizens. This may apply to banking data and will necessitate the encryption keys for each data set being stored in each country. An example may be a bank that insists that all its encryption keys are stored under one particular mountain in Switzerland. Yet another example covers the requirement (whether regulatory or internal) to have complete knowledge and control over the administrators of the keys, and a local audit log of all key access activity.

As Thomas Kurian states here, "data sovereignty provides customers with a mechanism to prevent the provider from accessing their data, approving access only for specific provider behaviors that customers think are necessary. Examples of customer controls provided by Google Cloud include storing and managing encryption keys outside the cloud, giving customers the power to only grant access to these keys based on detailed access justifications, and protecting data-in-use. With these capabilities, the customer is the ultimate arbiter of access to their data."

Therefore, this scenario allows organizations to utilize Google Cloud while keeping their encryption keys in the location of their choice, under their physical and administrative control.

Scenario 3: Centralized encryption key control

With this use case, there are no esoteric threats to discuss or obscure audit requirements to handle. The focus here is on operational efficiency. As Gartner recently noted, the need to reduce the number of key management tools is a strong motivation for keeping all the keys within one system that covers multiple cloud and on-premises environments.

It may sound like a cliché, but complexity is very much the enemy of security. Multiple "centralized" systems for any task, be it log management or encryption key management, add complexity and introduce new points for security to break.

In light of this, a desire to use one system for the majority of encryption keys, cloud or not, is understandable. Given that few organizations are 100% cloud-based today for workloads that require encryption, the natural course of action is to keep all the keys on-prem. Additional benefits may stem from using the same vendor as an auxiliary access control and policy point. A single set of keys reduces complexity, and a properly implemented system with adequate security and redundancy outweighs the need to have multiple systems.

Another variant of this is a motivation to retain absolute control over data processing by means of controlling encryption key access. After all, if a client can push the button and instantly cut off the cloud provider from key access, the data at rest cannot be accessed or stolen by anybody else (copies that were already decrypted into memory or are in use are not covered by that switch, which is why the protecting-data-in-use capability quoted above matters alongside it).

Finally, centralizing key management gives the cloud user a central location to enforce policies around access to keys and hence access to data-at-rest.

Next steps

To summarize, these scenarios truly call for encryption keys being both physically away from the cloud provider and outside its physical and administrative control. This means that a customer-managed HSM at the CSP location won't do. Please review Unlocking the mystery of stronger security key management for a broader review of key management in the cloud.

- Assess your data risks in regard to attackers, regulations, geopolitical risks, etc.
- Understand the three scenarios discussed in this post and match your requirements to them.
- Apply threat model thinking to your cloud data processing and see if you truly need to remove the keys from the cloud.
- Review services covered by Google EKM and partners that deliver encryption key management for keeping the keys away from the cloud, on premises (Ionic, Fortanix, Thales, etc.).
Source: Google Cloud Platform