Amazon EKS announces 99.99% Service Level Agreement and new 8XL scaling tier for Provisioned Control Plane clusters

Amazon Elastic Kubernetes Service (Amazon EKS) now offers a 99.99% Service Level Agreement (SLA) for clusters running on Provisioned Control Plane, up from the 99.95% SLA offered on the standard control plane. Amazon EKS is also introducing the 8XL scaling tier, the largest available Provisioned Control Plane tier. Provisioned Control Plane gives you the ability to select your cluster’s control plane capacity from a set of well-defined scaling tiers, ensuring the control plane is pre-provisioned and ready to handle traffic spikes or unpredictable bursts.

The higher 99.99% SLA is measured in 1-minute intervals, providing a more granular and stringent availability commitment for mission-critical workloads. The new 8XL tier offers double the Kubernetes API server request processing capacity of the next-lower 4XL tier, enabling workloads such as ultra-scale AI/ML training, high-performance computing (HPC), and large-scale data processing.

Both the 99.99% SLA and the 8XL tier are available today in all AWS Regions where Amazon EKS Provisioned Control Plane is offered. To learn more about the SLA, see the Amazon EKS Service Level Agreement. For 8XL pricing and capabilities, see the EKS pricing and EKS Provisioned Control Plane documentation.
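For context on what the tighter commitment means in practice, here is a back-of-the-envelope comparison of the monthly downtime budgets implied by each SLA level. This is illustrative arithmetic only; the binding terms (measurement intervals, exclusions, service credits) are defined in the Amazon EKS Service Level Agreement itself.

```python
# Illustrative only: maximum unavailability per 30-day month that
# stays within each SLA percentage. The real SLA terms (1-minute
# measurement intervals, exclusions, credits) are in the Amazon EKS
# Service Level Agreement.

MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes in a 30-day month

def downtime_budget_minutes(sla_percent: float) -> float:
    """Minutes of unavailability per 30-day month within the SLA."""
    return MINUTES_PER_MONTH * (1 - sla_percent / 100)

standard = downtime_budget_minutes(99.95)     # standard control plane
provisioned = downtime_budget_minutes(99.99)  # Provisioned Control Plane

print(f"99.95% SLA: up to {standard:.1f} min/month of downtime")
print(f"99.99% SLA: up to {provisioned:.2f} min/month of downtime")
```

The jump from 99.95% to 99.99% shrinks the monthly downtime budget from roughly 21.6 minutes to about 4.3 minutes.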
Source: aws.amazon.com

Amazon Bedrock AgentCore Runtime adds WebRTC support for real-time bidirectional streaming

Amazon Bedrock AgentCore Runtime now supports WebRTC for real-time bidirectional streaming between clients and agents, adding to the existing WebSocket protocol support. With WebRTC, developers can build voice agents for browser and mobile applications that stream audio and video bidirectionally with low latency using peer-to-peer, UDP-based transport, enabling natural, real-time conversational experiences.
WebRTC joins WebSocket as the second bidirectional streaming protocol supported by AgentCore Runtime. While WebSocket provides persistent, full-duplex connections for text and audio streaming over TCP, WebRTC is optimized for real-time media delivery where low latency is critical, such as voice agents in browser and mobile applications. WebRTC requires a TURN relay for media traffic, and AgentCore Runtime gives you flexibility in how you set that up: Amazon Kinesis Video Streams managed TURN for a fully managed experience with native AWS IAM integration, a third-party provider, or your own self-hosted TURN infrastructure. Both protocols benefit from AgentCore Runtime session isolation, observability, and scaling.
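Whichever TURN option you pick, the client side looks the same: the relay is handed to the browser's RTCPeerConnection as an ICE server entry. The sketch below builds that standard configuration shape as a plain dict; all hostnames and credentials are hypothetical placeholders, and how you obtain real short-lived TURN credentials (for example, from Kinesis Video Streams managed TURN) is covered in the AgentCore Runtime documentation.

```python
# Sketch of the standard WebRTC ICE-server configuration a client
# passes to its RTCPeerConnection, regardless of which TURN option
# (KVS managed, third-party, or self-hosted) supplies the relay.
# All URIs and credentials below are hypothetical placeholders.

def ice_config(turn_uri: str, username: str, credential: str) -> dict:
    """Build an RTCConfiguration-style dict with a TURN relay."""
    return {
        "iceServers": [
            # STUN for address discovery (placeholder URI).
            {"urls": ["stun:stun.example.com:3478"]},
            # The TURN relay that carries media when direct paths fail.
            {
                "urls": [turn_uri],
                "username": username,
                "credential": credential,
            },
        ]
    }

cfg = ice_config(
    "turn:turn.example.com:443?transport=udp",  # hypothetical relay URI
    "short-lived-user",                         # hypothetical credential
    "short-lived-password",
)
print(len(cfg["iceServers"]))  # 2 entries: STUN + TURN
```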
WebRTC is supported in AgentCore Runtime across fourteen AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Canada (Central), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and Europe (Stockholm).
To get started, see Bidirectional streaming in the Amazon Bedrock AgentCore documentation, which includes ready-to-deploy examples for both protocols: an Amazon Nova Sonic voice agent with KVS TURN server, Pipecat voice agents with WebSocket, WebRTC, and Daily transport, a LiveKit voice agent, and a Strands Agents SDK voice agent.
Source: aws.amazon.com

AWS DataSync now supports AWS Secrets Manager for all location types

AWS DataSync now supports AWS Secrets Manager for credential management across all location types, including Hadoop Distributed File System (HDFS), Amazon FSx for Windows File Server, and Amazon FSx for NetApp ONTAP. Previously, Secrets Manager integration was limited to a subset of location types, requiring you to provide credentials directly through the DataSync API or console.
You can centralize credential management for all DataSync locations in Secrets Manager, providing a single, consistent approach across all your data transfers. You can also encrypt credentials with your own AWS KMS key instead of the default AWS-owned key, helping you meet your organization’s security requirements and governance policies. All secrets are stored in your account, allowing you to update credentials as needed, independent of the DataSync service.
DataSync supports two approaches for credential management. You can provide a secret ARN referencing credentials you manage in Secrets Manager for full control over rotation, auditing, and access policies. Alternatively, DataSync can automatically create and manage secrets on your behalf.
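The customer-managed approach can be sketched as follows: store the location's credentials in Secrets Manager (encrypted with your own KMS key), then reference the resulting secret ARN when creating the DataSync location. The secret name, credential fields, and KMS key below are hypothetical, and the exact DataSync parameter that accepts the secret ARN varies by location type; see the DataSync documentation for specifics.

```python
import json

# Sketch of the customer-managed approach: build a CreateSecret
# request for Secrets Manager, encrypting with your own KMS key
# instead of the AWS-owned default. All names, credentials, and
# ARNs below are hypothetical placeholders.

def build_create_secret_request(name: str, username: str,
                                password: str, kms_key_id: str) -> dict:
    """Request body for Secrets Manager's CreateSecret API."""
    return {
        "Name": name,
        "SecretString": json.dumps({"username": username,
                                    "password": password}),
        "KmsKeyId": kms_key_id,  # your own key, not the AWS-owned default
    }

req = build_create_secret_request(
    "datasync/fsx-windows-credentials",                 # hypothetical name
    "svc-datasync", "example-password",                 # hypothetical creds
    "arn:aws:kms:us-east-1:111122223333:key/example",   # hypothetical key
)
# With boto3 you would pass this to:
#   boto3.client("secretsmanager").create_secret(**req)
# and supply the returned secret ARN to the DataSync location.
print(sorted(req))
```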
This capability is available in the majority of AWS Regions where AWS DataSync is offered. For the full list of supported Regions, visit the AWS Capabilities tool in Builder Center. To get started, visit the AWS DataSync console. For more information, see Managing credentials with AWS Secrets Manager in the AWS DataSync documentation.
Source: aws.amazon.com

AWS Neuron announces support for Dynamic Resource Allocation with Amazon EKS

AWS announces the Neuron Dynamic Resource Allocation (DRA) driver for Amazon Elastic Kubernetes Service (EKS), bringing Kubernetes-native hardware-aware scheduling to AWS Trainium-based instances. The Neuron DRA driver publishes rich device attributes directly to the Kubernetes scheduler, enabling topology-aware placement decisions without custom scheduler extensions.

Deploying AI workloads on Kubernetes requires ML engineers to make infrastructure decisions that are not directly related to model development, such as determining device counts, understanding hardware and network topologies, and writing accelerator-specific manifests. This creates friction, slows iteration, and tightly couples workloads to underlying infrastructure. As use cases expand to distributed training, long-context inference, and disaggregated architectures, this complexity becomes a scaling bottleneck.

The Neuron DRA driver removes this burden by separating infrastructure concerns from ML workflows. Infrastructure teams define reusable ResourceClaimTemplates that capture device topology, allocation, and networking policies. ML engineers can simply reference these templates in their manifests, without needing to reason about hardware details. This enables consistent deployment across workload types while allowing per-workload configuration so multiple workloads can efficiently share the same nodes.

The Neuron DRA driver supports all AWS Trainium instance types and is available in all AWS Regions where AWS Trainium is available.
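The template-based division of labor can be sketched as follows: an infrastructure team publishes a ResourceClaimTemplate, and workloads reference it by name. The field layout below follows the upstream Kubernetes DRA API (resource.k8s.io); the API version shown and the device class name are illustrative assumptions, so use the names from the Neuron DRA documentation and sample templates rather than these placeholders.

```python
# Sketch of a reusable ResourceClaimTemplate an infrastructure team
# might define, built as a plain dict. Field layout follows the
# upstream Kubernetes DRA API; the API version and device class name
# are hypothetical stand-ins -- take real values from the Neuron DRA
# sample templates.

def neuron_claim_template(name: str, device_class: str, count: int) -> dict:
    """Build a ResourceClaimTemplate manifest as a plain dict."""
    return {
        "apiVersion": "resource.k8s.io/v1beta1",  # assumed API version
        "kind": "ResourceClaimTemplate",
        "metadata": {"name": name},
        "spec": {
            "spec": {  # template for the per-pod ResourceClaim
                "devices": {
                    "requests": [{
                        "name": "neuron-devices",
                        "deviceClassName": device_class,  # hypothetical
                        "count": count,
                    }]
                }
            }
        },
    }

tmpl = neuron_claim_template("trn-training", "neuron.example.com", 16)
print(tmpl["kind"])  # ResourceClaimTemplate
```

An ML engineer's pod spec would then reference the template by name in its resourceClaims section, without restating any of the hardware detail.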
For documentation, sample templates, and implementation guides, visit the Neuron DRA documentation.
Learn more:

Neuron EKS DRA templates
Neuron EKS documentation
Amazon EKS documentation

Source: aws.amazon.com

Amazon Redshift supports federated permissions with IAM Identity Center in multiple AWS Regions

Amazon Redshift federated permissions are now supported with AWS IAM Identity Center (IdC) in multiple AWS Regions. You can extend IdC from your primary AWS Region to additional Regions for improved reliability and better performance through proximity to users. In the additional Regions, you can now administer Redshift fine-grained access controls at the table and column level using your existing workforce identities with IdC. When a new Region is added in IdC, you can create Redshift and Lake Formation Identity Center applications in that Region without replicating identities from the primary Region, enabling you to use existing workforce identities to query data across warehouses in the new Region. Regardless of which warehouse is used for querying, row-level, column-level, and masking controls always apply automatically, delivering fine-grained access compliance. You can also access Amazon Redshift with single sign-on in these new Regions from Amazon QuickSight, Amazon Redshift Query Editor, or third-party SQL tools.

To get started with Redshift federated permissions using IdC, read the blog and documentation. To extend IdC support to multiple Regions, read the IdC, Redshift, and Lake Formation documentation, and see the Region availability.
Source: aws.amazon.com

Amazon EC2 Fleet now supports interruptible Capacity Reservations

Amazon EC2 Fleet now supports interruptible Capacity Reservations. EC2 Fleet allows you to launch instances across multiple instance types and Availability Zones. Starting today, you can specify interruptible Capacity Reservation IDs across your Launch Templates to provision instances in a single EC2 Fleet call.
When On-Demand Capacity Reservations are not in use, customers can make them temporarily available as interruptible reservations within their AWS Organization to improve utilization and save costs. When these interruptible reservations are available to your account, you can now use EC2 Fleet to easily consume them.
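Consuming such reservations in a single call can be sketched as a CreateFleet request that points at launch templates, each of which targets a Capacity Reservation by ID through its CapacityReservationSpecification. The launch template names, versions, and capacity figures below are hypothetical placeholders.

```python
# Sketch of an EC2 CreateFleet request body (as a plain dict) that
# provisions across several launch templates in one call. Each named
# launch template is assumed to target a Capacity Reservation ID via
# its CapacityReservationSpecification. Names and numbers here are
# hypothetical placeholders.

def fleet_request(template_configs: list,
                  target_capacity: int) -> dict:
    """Build an EC2 CreateFleet request body as a plain dict."""
    return {
        "Type": "instant",  # one-shot, synchronous provisioning
        "LaunchTemplateConfigs": [
            {"LaunchTemplateSpecification":
                {"LaunchTemplateName": name, "Version": version}}
            for name, version in template_configs
        ],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": target_capacity,
            "DefaultTargetCapacityType": "on-demand",
        },
    }

# One hypothetical launch template per reservation/AZ pairing:
req = fleet_request([("lt-cr-use1-az1", "1"), ("lt-cr-use1-az2", "1")], 8)
print(len(req["LaunchTemplateConfigs"]))  # 2
```

With boto3 this dict would be passed to `ec2.create_fleet(**req)`.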
This feature is available in all commercial AWS Regions. To get started, refer to the EC2 Fleet documentation. To learn more about interruptible Capacity Reservations, visit the EC2 Capacity Reservations user guide.
Source: aws.amazon.com

Amazon RDS Custom for SQL Server adds ability to view and schedule new operating system updates

Amazon RDS Custom for SQL Server now provides customers with the ability to view and schedule new operating system (OS) updates for RDS-provided engine versions (RPEV). With RPEV, RDS Custom provides a SQL Server version pre-installed on an Amazon Machine Image (AMI). When new operating system updates are available for RPEV, customers can now view upcoming updates, apply them immediately, or schedule them for the next maintenance window using RDS Custom APIs. To view available OS updates, customers can use the describe-pending-maintenance-actions API, or subscribe to RDS-EVENT-0230 to receive an alert when new updates become available for their database instance. Customers can use the apply-pending-maintenance-action API to apply the updates immediately or schedule them within their next maintenance window.
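Triaging the describe-pending-maintenance-actions output can be sketched as a small filter over the response. The top-level field names below (ResourceIdentifier, PendingMaintenanceActionDetails, Action) match the RDS API response shape; the "system-update" action value used here is an illustrative assumption, so check what your instances actually report for OS updates.

```python
# Sketch of filtering a describe-pending-maintenance-actions response
# for OS updates on one instance. Field names match the RDS API; the
# "system-update" Action value is an assumption for illustration.

def pending_os_updates(response: dict, resource_arn: str) -> list:
    """Return pending actions for one resource that look like OS updates."""
    actions = []
    for resource in response.get("PendingMaintenanceActions", []):
        if resource.get("ResourceIdentifier") != resource_arn:
            continue
        for detail in resource.get("PendingMaintenanceActionDetails", []):
            if detail.get("Action") == "system-update":  # assumed value
                actions.append(detail)
    return actions

# To apply a found update immediately rather than waiting for the
# maintenance window, you would then call apply-pending-maintenance-action
# with OptInType="immediate" (vs. "next-maintenance").
sample = {"PendingMaintenanceActions": [{
    "ResourceIdentifier": "arn:aws:rds:us-east-1:111122223333:db:example",
    "PendingMaintenanceActionDetails": [
        {"Action": "system-update", "Description": "New OS update"}],
}]}
print(len(pending_os_updates(
    sample, "arn:aws:rds:us-east-1:111122223333:db:example")))  # 1
```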
Using these features, customers can efficiently track and apply OS updates. To learn more, refer to the Amazon RDS Custom for SQL Server User Guide. These features are available in all the AWS Regions where RDS Custom for SQL Server is available.
Source: aws.amazon.com

AWS Lambda now supports Availability Zone metadata

AWS Lambda now provides Availability Zone (AZ) metadata through a new metadata endpoint in the Lambda execution environment. With this capability, developers can determine the AZ ID (e.g., use1-az1) of the AZ their Lambda function is running in, enabling them to build functions that make AZ-aware routing decisions, such as preferring same-AZ endpoints for downstream services to reduce cross-AZ latency. This capability also enables operators to implement AZ-aware resilience patterns like AZ-specific fault injection testing.
Lambda automatically provisions and maintains execution environments ready to serve function invocations across multiple AZs within an AWS Region to provide high availability and fault tolerance without any additional configuration or management overhead for customers. As development teams scale their serverless applications, their functions often need to interact with other AWS services like Amazon ElastiCache and Amazon RDS that provide endpoints specific to each AZ. Until now, Lambda did not provide a way for functions to determine which AZ they were running in. With the new metadata endpoint, functions can now retrieve their AZ ID with a simple HTTP request, making it easy to implement AZ-aware logic without building and maintaining custom solutions.
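The same-AZ routing decision itself is simple once the AZ ID is in hand, as this sketch shows. How the AZ ID is retrieved (the Powertools metadata utility or the metadata endpoint via Lambda-set environment variables) is described in the Lambda documentation; the environment variable name and endpoint hostnames below are hypothetical placeholders.

```python
import os

# Sketch of AZ-aware endpoint selection: given the function's AZ ID
# (e.g., "use1-az1") and a map of per-AZ endpoints for a downstream
# service, prefer the same-AZ endpoint and fall back to a default.
# The env var name and hostnames are hypothetical; obtain the real
# AZ ID via the metadata endpoint or Powertools metadata utility.

def pick_endpoint(az_id: str, endpoints_by_az: dict, default: str) -> str:
    """Prefer the endpoint in our own AZ to avoid cross-AZ hops."""
    return endpoints_by_az.get(az_id, default)

endpoints = {
    "use1-az1": "replica-az1.cache.example.internal",  # hypothetical hosts
    "use1-az2": "replica-az2.cache.example.internal",
}
az = os.environ.get("MY_AZ_ID", "use1-az1")  # placeholder AZ-ID source
print(pick_endpoint(az, endpoints, "primary.cache.example.internal"))
```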
To get started, use the Powertools for AWS Lambda metadata utility or call the metadata endpoint directly using the environment variables that Lambda automatically sets in the execution environment. This capability is supported for all Lambda runtimes, including custom runtimes and functions packaged as container images, and integrates seamlessly with Lambda capabilities like SnapStart and provisioned concurrency, regardless of whether your functions are VPC-enabled. 
AZ metadata support is available at no additional cost in all commercial AWS Regions where Lambda is available. To learn more, visit Lambda documentation.
Source: aws.amazon.com

Amazon EC2 C8gn instances are now available in additional regions

Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C8gn instances, powered by the latest-generation AWS Graviton4 processors, are available in the Asia Pacific (Jakarta), Asia Pacific (Hyderabad), Asia Pacific (Tokyo), South America (Sao Paulo), and Europe (Zurich) AWS Regions. The new instances provide up to 30% better compute performance than Graviton3-based Amazon EC2 C7gn instances. Amazon EC2 C8gn instances feature the latest 6th-generation AWS Nitro Cards and offer up to 600 Gbps network bandwidth, the highest network bandwidth among network-optimized EC2 instances.
Take advantage of the enhanced networking capabilities of C8gn to scale performance and throughput while optimizing the cost of running network-intensive workloads such as network virtual appliances, data analytics, and CPU-based artificial intelligence and machine learning (AI/ML) inference.

For increased scalability, C8gn instances offer instance sizes up to 48xlarge, up to 384 GiB of memory, and up to 60 Gbps of bandwidth to Amazon Elastic Block Store (EBS). C8gn instances support Elastic Fabric Adapter (EFA) networking on the 16xlarge, 24xlarge, 48xlarge, metal-24xl, and metal-48xl sizes, which enables lower latency and improved cluster performance for workloads deployed on tightly coupled clusters.

C8gn instances are available in the following AWS Regions: US East (N. Virginia, Ohio), US West (Oregon, N. California), Europe (Frankfurt, Stockholm, Ireland, London, Spain, Zurich), Asia Pacific (Singapore, Malaysia, Sydney, Thailand, Mumbai, Seoul, Melbourne, Jakarta, Hyderabad, Tokyo), Middle East (UAE), Africa (Cape Town), Canada (Central), Canada West (Calgary), South America (Sao Paulo), and AWS GovCloud (US-East, US-West).

To learn more, see Amazon C8gn Instances. To begin your Graviton journey, visit the Level up your compute with AWS Graviton page. To get started, see the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs.
Source: aws.amazon.com

AWS adds support for NIXL with EFA to accelerate LLM inference at scale

AWS announces support for the NVIDIA Inference Xfer Library (NIXL) with Elastic Fabric Adapter (EFA) to accelerate disaggregated large language model (LLM) inference on Amazon EC2. This integration enhances disaggregated inference serving through three key improvements: increased KV-cache throughput, reduced inter-token latency, and optimized KV-cache memory utilization. NIXL with EFA enables high-throughput, low-latency KV-cache transfer between prefill and decode nodes, as well as efficient KV-cache movement between various storage layers. NIXL is interoperable with all EFA-enabled EC2 instances and integrates natively with frameworks including NVIDIA Dynamo, SGLang, and vLLM. Combined, NIXL with EFA enables flexible integration with your EC2 instance and framework of choice, providing performant disaggregated inference at scale. AWS supports NIXL version 1.0.0 or higher with EFA installer version 1.47.0 or higher on all EFA-enabled EC2 instance types in all AWS Regions at no additional cost. For more information, visit the EFA documentation.
Source: aws.amazon.com