SageMaker Notebook Instances now support P5en.48xl instance types

We are pleased to announce general availability of Amazon EC2 P5en.48xl instances on SageMaker notebook instances.
Amazon EC2 P5en instances feature 8 H200 GPUs, which have 1.7x the GPU memory size and 1.4x the GPU memory bandwidth of the H100 GPUs featured in P5 instances. P5en instances pair the H200 GPUs with high-performance custom 4th Generation Intel Xeon Scalable processors, enabling Gen5 PCIe between CPU and GPU, which provides up to 4x the bandwidth between CPU and GPU and boosts AI training and inference performance. P5en, with up to 3200 Gbps of third-generation EFA using Nitro v5, shows up to a 35% improvement in latency compared to P5, which uses the previous generation of EFA and Nitro. This helps improve collective communications performance for distributed training workloads such as deep learning, generative AI, real-time data processing, and high-performance computing (HPC) applications.
Amazon EC2 P5en.48xl instances are available on SageMaker notebook instances in the AWS US East (N. Virginia and Ohio), US West (Oregon), and Asia Pacific (Tokyo) regions.
Visit the developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
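As a rough illustration of launching a notebook instance on this instance type, the sketch below assembles the arguments for the SageMaker CreateNotebookInstance API via boto3. The instance type string `ml.p5en.48xlarge`, the 500 GB volume size, and the helper names `build_notebook_request` and `create_notebook` are assumptions made for this example and are not taken from the announcement; check the developer guides for the exact values in your account and Region.

```python
from typing import Any, Dict


def build_notebook_request(name: str, role_arn: str,
                           instance_type: str = "ml.p5en.48xlarge",  # assumed type string
                           volume_gb: int = 500) -> Dict[str, Any]:
    """Assemble keyword arguments for a CreateNotebookInstance call."""
    if not instance_type.startswith("ml."):
        raise ValueError("SageMaker instance types carry an 'ml.' prefix")
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "VolumeSizeInGB": volume_gb,
    }


def create_notebook(region: str, **request: Any) -> str:
    """Send the request to SageMaker; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("sagemaker", region_name=region)
    response = client.create_notebook_instance(**request)
    return response["NotebookInstanceArn"]
```

A call such as `create_notebook("us-east-1", **build_notebook_request("demo", role_arn))` would then submit the request, provided the role ARN is valid and the account has quota for the instance type.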
Source: aws.amazon.com

Amazon Bedrock expands support for Service Quotas

Amazon Bedrock is a fully managed service that provides secure, enterprise-grade access to high-performing foundation models from leading AI companies, enabling you to build and scale generative AI applications. Amazon Bedrock customers can now view inference quotas for the bedrock-mantle endpoint through AWS Service Quotas. This gives customers a familiar, consistent way to track limits for this endpoint, the same way they already do for the bedrock-runtime endpoint and other AWS services, and gives them clear visibility into the limits that apply to their workloads.
The bedrock-mantle endpoint supports the OpenAI Responses API, the OpenAI Chat Completions API, and the Anthropic Messages API, letting customers run existing OpenAI-based or Anthropic-based applications on Amazon Bedrock with minimal code changes. AWS Service Quotas now exposes per-model input-tokens-per-minute and output-tokens-per-minute quotas for supported models on the endpoint. With this launch, customers can see the limits that apply to them on the bedrock-mantle endpoint and can proactively plan for production scale.
To get started, open the AWS Service Quotas console, choose Amazon Bedrock, and search for “Bedrock Mantle” to view your current quotas. To request an increase to any of these quotas, follow the standard Amazon Bedrock limit increase process. Service Quotas support for the bedrock-mantle endpoint is available in all AWS Regions where the endpoint is offered: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Mumbai, Tokyo, Sydney, Jakarta), Europe (Frankfurt, Ireland, London, Milan, Stockholm), and South America (São Paulo). To learn more, see Quotas for Amazon Bedrock.
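The console search described above can also be approximated programmatically. The following is a minimal sketch, assuming the quotas are listed under the Service Quotas service code `bedrock` and that the relevant quota names contain the text "Bedrock Mantle" (an assumption based on the console search term, not a documented naming rule). The helper names `filter_quotas` and `fetch_bedrock_quotas` are made up for this example.

```python
from typing import Any, Dict, Iterable, List


def filter_quotas(quotas: Iterable[Dict[str, Any]],
                  needle: str = "Bedrock Mantle") -> List[Dict[str, Any]]:
    """Keep quotas whose name contains `needle` (case-insensitive)."""
    needle_lower = needle.lower()
    return [
        {"name": q["QuotaName"], "code": q["QuotaCode"], "value": q["Value"]}
        for q in quotas
        if needle_lower in q["QuotaName"].lower()
    ]


def fetch_bedrock_quotas(region: str) -> List[Dict[str, Any]]:
    """List all Amazon Bedrock quotas; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so filter_quotas works without boto3

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    quotas: List[Dict[str, Any]] = []
    # "bedrock" is assumed to be the service code for Amazon Bedrock.
    for page in paginator.paginate(ServiceCode="bedrock"):
        quotas.extend(page["Quotas"])
    return quotas
```

Calling `filter_quotas(fetch_bedrock_quotas("us-east-1"))` would return the matching quotas with their current values, which can then be compared against expected token throughput before a production rollout.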
Source: aws.amazon.com

Announcing Region Expansion of P6-B200 instances on SageMaker Notebook Instances

We are pleased to announce general availability of Amazon EC2 P6-B200 instances in AWS US East (N. Virginia) on SageMaker notebook instances.
Amazon EC2 P6-B200 instances are powered by 8 NVIDIA Blackwell GPUs with 1440 GB of high-bandwidth GPU memory and 5th Generation Intel Xeon processors (Emerald Rapids). These instances deliver up to 2x better performance compared to P5en instances for AI training. Customers can use P6-B200 instances to interactively develop and fine-tune large foundation models, including LLMs, mixture of experts models, and multi-modal reasoning models. These instances enable efficient experimentation with larger models directly in JupyterLab or CodeEditor environments for generative AI applications such as enterprise copilots and content generation across text, images, and video.
Visit the developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
Source: aws.amazon.com

Amazon EC2 X8i instances are now available in additional regions

Starting today, Amazon Elastic Compute Cloud (Amazon EC2) X8i instances are available in the Asia Pacific (Singapore), Asia Pacific (Sydney), and AWS GovCloud (US-West) Regions. These instances are powered by custom Intel Xeon 6 processors available only on AWS. X8i instances are SAP-certified and deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. They deliver up to 43% higher performance, 1.5x more memory capacity (up to 6TB), and 3.3x more memory bandwidth compared to previous generation X2i instances. X8i instances are designed for memory-intensive workloads like SAP HANA, large databases, data analytics, and Electronic Design Automation (EDA). Compared to X2i instances, X8i instances offer up to 50% higher SAPS performance, up to 47% faster PostgreSQL performance, 88% faster Memcached performance, and 46% faster AI inference performance. X8i instances come in 14 sizes, from large to 96xlarge, including two bare metal options. To get started, visit the AWS Management Console. X8i instances can be purchased via Savings Plans, On-Demand instances, and Spot instances. For more information, visit the X8i instances page.
Source: aws.amazon.com

Amazon SageMaker HyperPod Slurm clusters now support specifying minimum capacity requirements with continuous provisioning

Amazon SageMaker HyperPod now supports minimum capacity requirements (MinCount) for clusters using Slurm orchestration with continuous provisioning. With continuous provisioning, HyperPod provisions clusters with available partial capacity so you can start your AI/ML jobs quickly, while continuing to provision remaining instances asynchronously in the background. While this provides flexibility, some training workloads require a guaranteed minimum number of nodes before they can start effectively. MinCount lets you specify the minimum number of instances that must be successfully provisioned before an instance group transitions to InService status, giving you greater control over when your cluster becomes available for job scheduling.
This is particularly useful for distributed training workloads using frameworks such as PyTorch FSDP, Megatron-LM, or NVIDIA NeMo, where training jobs are commonly configured with a fixed number of participating nodes and may not start efficiently or correctly with partial cluster capacity. It also benefits teams that need to guarantee a baseline GPU count to meet SLA or cost-efficiency targets before committing to a training run.
To set a minimum capacity threshold for an instance group, specify MinInstanceCount in the CreateCluster or UpdateCluster API request. The instance group remains in Creating or Updating status until the threshold is met, then transitions to InService, and its nodes become available for Slurm job scheduling. HyperPod continues launching additional instances beyond MinCount until the target count is reached. If MinCount cannot be satisfied within 3 hours, the system automatically rolls back the instance group to its last known good state. MinCount for Slurm clusters with continuous provisioning is available in all AWS Regions where Amazon SageMaker HyperPod is supported.
To get started on specifying minimum capacity requirements for your cluster, see Minimum capacity requirements (MinCount) in the Amazon SageMaker AI documentation.
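The threshold behaviour described above can be sketched in a few lines. This is a simplified model for illustration only, not HyperPod's implementation: `build_instance_group` produces just a fragment of an instance group specification (a real CreateCluster request needs further fields such as the lifecycle configuration and execution role), the rule that the minimum may not exceed the target count is an assumption, and `"RolledBack"` is a made-up label standing in for the rollback to the last known good state.

```python
from typing import Any, Dict


def build_instance_group(name: str, instance_type: str,
                         target_count: int, min_count: int) -> Dict[str, Any]:
    """Build a partial instance group spec carrying MinInstanceCount."""
    # Assumed constraint: the minimum cannot exceed the target count.
    if not 0 <= min_count <= target_count:
        raise ValueError("min_count must be between 0 and target_count")
    return {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": target_count,
        "MinInstanceCount": min_count,
    }


def group_status(provisioned: int, min_count: int,
                 hours_elapsed: float, timeout_hours: float = 3.0) -> str:
    """Model the status of a new instance group under MinCount."""
    if provisioned >= min_count:
        return "InService"   # threshold met; nodes can take Slurm jobs
    if hours_elapsed >= timeout_hours:
        return "RolledBack"  # hypothetical label for the automatic rollback
    return "Creating"        # still waiting for the minimum capacity
```

For a group with a target of 16 nodes and a minimum of 12, `group_status(8, 12, 1.0)` stays in `"Creating"`, while `group_status(12, 12, 1.0)` reports `"InService"` even though four more nodes are still being provisioned in the background.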
Source: aws.amazon.com

Amazon Connect Customer now uses generative AI to automatically evaluate self-service interactions

Amazon Connect Customer now enables managers to use generative AI to automatically evaluate self-service interactions, and get aggregated insights to help improve customer experience. Managers can define custom evaluation criteria in natural language within evaluation forms — such as “Were all of the customer issues resolved by the AI agent?” — which generative AI uses to help assess the quality of the self-service interaction. Connect provides detailed reasoning for the evaluation along with relevant reference points from the conversation transcript. Managers can review these insights in aggregate and on individual contacts, alongside self-service interaction recordings and transcripts, to identify opportunities to improve AI agent performance.
This feature is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Europe (Frankfurt). To learn more, please visit our documentation and our webpage. For information about Amazon Connect Customer pricing, please visit our pricing page.
Source: aws.amazon.com

Amazon GuardDuty Malware Protection for AWS Backup supports Amazon S3 continuous backups

Amazon GuardDuty Malware Protection for AWS Backup is now available for Amazon S3 continuous backups. You can now scan your S3 continuous backups for malware and identify clean points in time across your entire backup timeline for safe recovery.
You can enable full or incremental malware scans for S3 continuous backups within your backup plan, and run on-demand scans up to any restorable point in time. You can now query the malware scan status at any point in time within your continuous backup using the new GetPITRMalwareScanResults API, allowing you to verify whether a specific recovery time is clean before initiating a restore.
Support for S3 continuous backups is available in all AWS Regions where Amazon GuardDuty Malware Protection for AWS Backup is supported. You can get started using the AWS Backup console, API, or CLI. To learn more, visit the AWS Backup documentation and Amazon GuardDuty Malware Protection documentation.
Source: aws.amazon.com

Amazon EC2 R8i and R8i-flex instances are now available in AWS GovCloud (US-East) Region

Starting today, Amazon Elastic Compute Cloud (Amazon EC2) R8i and R8i-flex instances are available in the AWS GovCloud (US-East) Region. These instances are powered by custom Intel Xeon 6 processors, available only on AWS, delivering the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. The R8i and R8i-flex instances offer up to 15% better price-performance, and 2.5x more memory bandwidth compared to previous generation Intel-based instances. They deliver 20% higher performance than R7i instances, with even higher gains for specific workloads. They are up to 30% faster for PostgreSQL databases, up to 60% faster for NGINX web applications, and up to 40% faster for AI deep learning recommendation models compared to R7i. R8i-flex, our first memory-optimized Flex instances, are the easiest way to get price performance benefits for a majority of memory-intensive workloads. They offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don’t fully utilize all compute resources. R8i instances are a great choice for all memory-intensive workloads, especially for workloads that need the largest instance sizes or continuous high CPU usage. R8i instances offer 13 sizes including 2 bare metal sizes and the new 96xlarge size for the largest applications. R8i instances are SAP-certified and deliver 142,100 aSAPS, delivering exceptional performance for mission-critical SAP workloads. To get started, sign in to the AWS Management Console. For more information about the R8i and R8i-flex instances visit the AWS News blog.
Source: aws.amazon.com

Amazon EC2 M8i and M8i-flex instances are now available in AWS GovCloud (US-East) Region

Starting today, Amazon EC2 M8i and M8i-flex instances are available in the AWS GovCloud (US-East) Region. These instances are powered by custom Intel Xeon 6 processors, available only on AWS, delivering the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. The M8i and M8i-flex instances offer up to 15% better price-performance and 2.5x more memory bandwidth compared to previous generation Intel-based instances. They deliver up to 20% better performance than M7i and M7i-flex instances, with even higher gains for specific workloads. The M8i and M8i-flex instances are up to 30% faster for PostgreSQL databases, up to 60% faster for NGINX web applications, and up to 40% faster for AI deep learning recommendation models compared to M7i and M7i-flex instances. M8i-flex instances are the easiest way to get price-performance benefits for a majority of general-purpose workloads like web and application servers, microservices, small and medium data stores, virtual desktops, and enterprise applications. They offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don’t fully utilize all compute resources. M8i instances are a great choice for all general-purpose workloads, especially for workloads that need the largest instance sizes or continuous high CPU usage. The SAP-certified M8i instances offer 13 sizes, including 2 bare metal sizes and the new 96xlarge size for the largest applications. To get started, sign in to the AWS Management Console. For more information about the new instances, visit the M8i and M8i-flex instance page or the AWS News blog.
Source: aws.amazon.com

Amazon SageMaker Unified Studio adds interactive interface for managing Feature Store in IAM Domains

Amazon SageMaker Unified Studio IAM domains now include an interactive interface for creating and managing feature groups in SageMaker Feature Store, eliminating the need to write code for common feature management tasks. This launch makes feature management accessible to data scientists, ML engineers, and business analysts from a single collaborative environment.
Features are the inputs to ML models used during training and inference. For example, a music recommendation app might use features like song ratings, listening duration, and listener demographics to personalize which songs are suggested to each user. With this interactive interface for creating and managing features, you can now discover and search existing features, create and modify feature groups, view definitions and schemas, and monitor data ingestion status, all without writing API calls. Features created elsewhere appear immediately in SageMaker Unified Studio when they share the same IAM role, ensuring seamless workflows across your ML development lifecycle.
To learn more about using the interactive interface for creating and managing features in SageMaker Unified Studio, visit the Amazon SageMaker Unified Studio User Guide.
Source: aws.amazon.com