Amazon SageMaker AI now supports EAGLE speculative decoding

Amazon SageMaker AI now supports EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) speculative decoding to improve large language model inference throughput by up to 2.5x. This capability enables models to predict and validate multiple tokens simultaneously rather than one at a time, improving response times for AI applications.

As customers deploy AI applications to production, they need to serve models with low latency and high throughput to deliver responsive user experiences. Data scientists and ML engineers lack efficient methods to accelerate token generation without sacrificing output quality or requiring complex model re-architecture, making it hard to meet performance expectations under real-world traffic. Teams spend significant time optimizing infrastructure rather than improving their AI applications.

With EAGLE speculative decoding, SageMaker AI lets models generate and verify multiple tokens in parallel rather than one at a time, maintaining the same output quality while dramatically increasing throughput. SageMaker AI automatically selects between EAGLE 2 and EAGLE 3 based on your model architecture, and provides built-in optimization jobs that use either curated datasets or your own application data to train specialized prediction heads. You can then deploy optimized models through your existing SageMaker AI inference workflow without infrastructure changes, enabling you to deliver faster AI applications with predictable performance.

You can use EAGLE speculative decoding in the following AWS Regions: US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Tokyo), Europe (Ireland), Asia Pacific (Singapore), and Europe (Frankfurt).

To learn more about EAGLE speculative decoding, visit the AWS News Blog and the SageMaker AI documentation.
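As a rough illustration of the workflow described above, the sketch below starts an inference optimization job with boto3. The general CreateOptimizationJob call exists in the SageMaker API, but the speculative-decoding configuration key and its values are assumptions made for illustration, not confirmed EAGLE parameters; all ARNs, bucket names, and instance types are placeholders.

```python
import boto3

sm = boto3.client("sagemaker", region_name="us-east-1")

# Launch an optimization job for a model stored in S3.
# NOTE: "ModelSpeculativeDecodingConfig" is a hypothetical key used only to
# show where EAGLE options would go; check the SageMaker API reference for
# the actual configuration structure.
sm.create_optimization_job(
    OptimizationJobName="llama-eagle-optimization",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    ModelSource={"S3": {"S3Uri": "s3://my-bucket/models/llama/"}},
    DeploymentInstanceType="ml.g5.12xlarge",
    OptimizationConfigs=[
        {"ModelSpeculativeDecodingConfig": {"Technique": "EAGLE"}}  # hypothetical
    ],
    OutputConfig={"S3OutputLocation": "s3://my-bucket/optimized-models/"},
    StoppingCondition={"MaxRuntimeInSeconds": 36000},
)
```

Once the job completes, the optimized model artifacts in the output location can be deployed through the existing SageMaker AI endpoint workflow, as noted in the announcement.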
Source: aws.amazon.com

AWS Lambda adds support for Node.js 24

AWS Lambda now supports creating serverless applications using Node.js 24. Developers can use Node.js 24 as both a managed runtime and a container base image, and AWS will automatically apply updates to the managed runtime and base image as they become available. Node.js 24 is the latest long-term support release of Node.js and is expected to be supported for security and bug fixes until April 2028.

With this release, Lambda has simplified the developer experience by focusing on the modern async/await programming pattern; callback-based function handlers are no longer supported. You can use Node.js 24 with Lambda@Edge (in supported Regions), allowing you to customize low-latency content delivered through Amazon CloudFront. Powertools for AWS Lambda (TypeScript), a developer toolkit for implementing serverless best practices and increasing developer velocity, also supports Node.js 24. You can use the full range of AWS deployment tools, including the Lambda console, AWS CLI, AWS Serverless Application Model (AWS SAM), AWS CDK, and AWS CloudFormation, to deploy and manage serverless applications written in Node.js 24.

The Node.js 24 runtime is available in all AWS Regions, including the AWS GovCloud (US) Regions and China Regions. For more information, including guidance on upgrading existing Lambda functions, see our blog post. For more information about AWS Lambda, visit our product page.
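As a hedged sketch of picking up the new runtime with one of the deployment tools mentioned above, the boto3 call below creates a function on Node.js 24. The runtime identifier "nodejs24.x" is assumed to follow the naming pattern of earlier Node.js runtimes and should be verified against the Lambda documentation; the role ARN and deployment package are placeholders.

```python
import boto3

lambda_client = boto3.client("lambda", region_name="us-east-1")

with open("function.zip", "rb") as f:
    zipped_code = f.read()  # deployment package containing index.mjs

# NOTE: "nodejs24.x" mirrors the existing identifiers (nodejs20.x, nodejs22.x);
# confirm the exact value in the Lambda runtime documentation.
lambda_client.create_function(
    FunctionName="hello-node24",
    Runtime="nodejs24.x",
    Role="arn:aws:iam::111122223333:role/LambdaExecutionRole",  # placeholder
    Handler="index.handler",  # index.mjs exports an async handler (no callbacks)
    Code={"ZipFile": zipped_code},
)
```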
Source: aws.amazon.com

Manage Amazon SageMaker HyperPod clusters with the new Amazon SageMaker AI MCP Server

The Amazon SageMaker AI MCP Server now supports tools that help you set up and manage HyperPod clusters. Amazon SageMaker HyperPod removes the undifferentiated heavy lifting involved in building generative AI models by quickly scaling model development tasks such as training, fine-tuning, or deployment across a cluster of AI accelerators. The SageMaker AI MCP Server now empowers AI coding assistants to provision and operate AI/ML clusters for model training and deployment. MCP servers in AWS provide a standard interface that enhances AI-assisted application development by equipping AI code assistants with real-time, contextual understanding of various AWS services.

The SageMaker AI MCP Server comes with tools that streamline end-to-end AI/ML cluster operations using the AI assistant of your choice, from initial setup through ongoing management. It enables AI agents to reliably set up HyperPod clusters orchestrated by Amazon EKS or Slurm, complete with prerequisites, powered by CloudFormation templates that optimize networking, storage, and compute resources. Clusters created via this MCP server are fully optimized for high-performance distributed training and inference workloads, leveraging best-practice architectures to maximize throughput and minimize latency at scale. Additionally, it provides comprehensive tools for cluster and node management, including scaling operations, applying software patches, and performing various maintenance tasks. When used in conjunction with the AWS API MCP Server, AWS Knowledge MCP Server, and Amazon EKS MCP Server, you gain complete coverage of all SageMaker HyperPod APIs and can effectively troubleshoot common issues, such as diagnosing why a cluster node became inaccessible.

For cluster administrators, these tools streamline daily operations. For data scientists, they make it possible to set up AI/ML clusters at scale without requiring infrastructure expertise, allowing you to focus on what matters most: training and deploying models.

You can manage your AI/ML clusters through the SageMaker AI MCP Server in all AWS Regions where SageMaker HyperPod is available. To get started, visit the AWS MCP Servers documentation.
Source: aws.amazon.com

Introducing AWS Network Firewall Proxy in preview

AWS introduces Network Firewall Proxy in public preview. You can use it to exert centralized controls against data exfiltration and malware injection. You can set up your Network Firewall Proxy in explicit mode in just a few clicks and filter the traffic going out from your applications as well as the responses these applications receive.

Network Firewall Proxy enables customers to efficiently manage and secure web and inter-network traffic. It protects your organization against attempts to spoof the domain name or the server name indication (SNI) and offers the flexibility to set fine-grained access controls. You can use Network Firewall Proxy to restrict access from your applications to trusted domains or IP addresses, or to block unintended responses from external servers. You can also turn on TLS inspection and set granular filtering controls on HTTP header attributes. Network Firewall Proxy offers comprehensive logs for monitoring your applications. You can enable them and send them to Amazon S3 and Amazon CloudWatch for detailed analysis and auditing.

Try out AWS Network Firewall Proxy in your test environment today in the US East (Ohio) Region. The proxy is available for free during the public preview. For more information, see the AWS Network Firewall Proxy documentation.
Source: aws.amazon.com

Amazon OpenSearch Service introduces Agentic Search

Amazon OpenSearch Service launches Agentic Search, transforming how users interact with their data through intelligent, agent-driven search. Agentic Search introduces an intelligent agent-driven system that understands user intent, orchestrates the right set of tools, generates OpenSearch DSL (domain-specific language) queries, and provides transparent summaries of its decision-making process through a simple ‘agentic’ query clause and natural language search terms.

Agentic Search automates OpenSearch query planning and execution, eliminating the need for complex search syntax. Users can ask questions in natural language like “Find red cars under $30,000” or “Show last quarter’s sales trends.” The agent interprets intent, applies optimal search strategies, and delivers results while explaining its reasoning process. The feature provides two agent types: conversational agents, which handle complex interactions with the ability to store conversations in memory, and flow agents for efficient query processing. The built-in QueryPlanningTool uses large language models (LLMs) to create DSL queries, making search accessible regardless of technical expertise. Users can manage Agentic Search through APIs or OpenSearch Dashboards to configure and modify agents. Agentic Search’s advanced settings allow you to connect with external MCP servers and use custom search templates.

Support for Agentic Search is available for OpenSearch Service version 3.3 and later in all AWS Commercial and AWS GovCloud (US) Regions where OpenSearch Service is available. See the AWS Region Table for a full listing of Regions. Build agents and run agentic searches using the new Agentic Search use case available in the AI Search Flows plugin. To learn more about Agentic Search, visit the OpenSearch technical documentation.
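For context, the sketch below shows roughly what a natural-language search might look like when sent from Python to an OpenSearch Service 3.3 domain. The 'agentic' clause name is taken from the announcement, but the field names ("query_text", "agent_id"), the domain endpoint, index name, and credentials are assumptions or placeholders; check the OpenSearch documentation for the exact request shape.

```python
import requests

# Ask a natural-language question against a product index using the new
# 'agentic' query clause. Field names below are assumptions inferred from
# the announcement, not a confirmed request schema.
query = {
    "query": {
        "agentic": {
            "query_text": "Find red cars under $30,000",
            "agent_id": "<your-registered-agent-id>",  # placeholder
        }
    }
}

response = requests.post(
    "https://my-domain.us-east-1.es.amazonaws.com/products/_search",  # placeholder
    json=query,
    auth=("master-user", "master-password"),  # or SigV4 request signing
    timeout=30,
)
print(response.json())
```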
Source: aws.amazon.com

AWS Glue Data Quality now supports preprocessing queries

Today, AWS announces the general availability of preprocessing queries for AWS Glue Data Quality, enabling you to transform your data before running data quality checks through AWS Glue Data Catalog APIs. This feature allows you to create derived columns, filter data based on specific conditions, perform calculations, and validate relationships between columns directly within your data quality evaluation process.
Preprocessing queries provide enhanced flexibility for complex data quality scenarios that require data transformation before validation. You can create derived metrics, such as total fees calculated from tax and shipping columns; limit the number of columns considered for data quality recommendations; or filter datasets to focus quality checks on specific data subsets. This capability eliminates the need for separate preprocessing steps, streamlining your data quality workflows.
AWS Glue Data Quality preprocessing queries are available through the AWS Glue Data Catalog APIs start-data-quality-rule-recommendation-run and start-data-quality-ruleset-evaluation-run, in all commercial AWS Regions where AWS Glue Data Quality is available. To learn more about preprocessing queries, see the AWS Glue Data Quality documentation.
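As a hedged example of the evaluation-run API named above, the boto3 call below runs a ruleset against a Data Catalog table. The start_data_quality_ruleset_evaluation_run operation and the GlueTable data source shape are existing API elements, but the "preprocessingQuery" option key is hypothetical and only indicates where a preprocessing query might be supplied; see the documentation for the actual parameter name and syntax.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# Evaluate a ruleset against a Catalog table, transforming the data first.
# The "preprocessingQuery" key below is a hypothetical placeholder.
glue.start_data_quality_ruleset_evaluation_run(
    DataSource={
        "GlueTable": {
            "DatabaseName": "sales_db",
            "TableName": "orders",
            "AdditionalOptions": {
                "preprocessingQuery": (  # hypothetical option key
                    "SELECT *, tax + shipping AS total_fees "
                    "FROM orders WHERE region = 'EMEA'"
                ),
            },
        }
    },
    Role="arn:aws:iam::111122223333:role/GlueDataQualityRole",  # placeholder
    RulesetNames=["orders_ruleset"],
)
```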
Source: aws.amazon.com

Amazon Quick Suite introduces scheduling for Quick Flows

Amazon Quick Flows now supports scheduling, enabling you to automate repetitive workflows without manual intervention. You can now configure Quick Flows to run automatically at specified times or intervals, improving operational efficiency and ensuring critical tasks execute consistently. You can schedule Quick Flows to run daily, weekly, monthly, or on custom intervals.

This capability is great for automating routine and administrative tasks such as generating recurring reports from dashboards, summarizing open items assigned to you in external services, or generating daily meeting briefings before you head out to work. You can schedule any flow you have access to, whether you created it or it was shared with you. To schedule a flow, click the scheduling icon and configure your desired date, time, and frequency.

Scheduling in Quick Flows is available now in the US East (N. Virginia), US West (Oregon), and Europe (Ireland) Regions. There are no additional charges for using scheduled execution beyond standard Quick Flows usage. To learn more about configuring scheduled Quick Flows, please visit our documentation.
Source: aws.amazon.com

AWS Glue Data Quality now supports rule labeling for enhanced reporting

Today, AWS announces the general availability of rule labels, a feature of AWS Glue Data Quality that enables you to apply custom key-value pair labels to your data quality rules for improved organization, filtering, and targeted reporting. This enhancement allows you to categorize data quality rules by business context, team ownership, compliance requirements, or any custom taxonomy that fits your data quality and governance needs.

Rule labels provide an effective way to organize and analyze data quality results. You can query results by specific labels to identify failing rules within particular categories, count rule outcomes by team or domain, and create focused reports for different stakeholders. For example, you can apply the label "team=finance" to all rules that pertain to the finance team and generate a customized report showing quality metrics specific to that team, or label high-priority rules with "criticality=high" to prioritize remediation efforts. Labels are authored as part of the Data Quality Definition Language (DQDL). You can query the labels as part of rule outcomes, row-level results, and API responses, making it easy to integrate with your existing monitoring and reporting workflows.

AWS Glue Data Quality rule labeling is available in all commercial AWS Regions where AWS Glue Data Quality is available. See the AWS Region Table for more details. To learn more about rule labeling, see the AWS Glue Data Quality documentation.
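To make the idea concrete, here is a sketch of registering a labeled ruleset with boto3. The create_data_quality_ruleset call is an existing Glue API, but the "with labels {...}" annotation inside the DQDL string is an invented placeholder for whatever the actual label grammar is; treat it purely as illustration and consult the DQDL reference for the real syntax.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# DQDL ruleset with key-value labels attached to individual rules.
# The 'with labels {...}' syntax is hypothetical, shown only to indicate
# where per-rule labels are authored.
ruleset = """
Rules = [
    IsComplete "invoice_id" with labels {"team": "finance", "criticality": "high"},
    ColumnValues "total_fees" >= 0 with labels {"team": "finance"}
]
"""

glue.create_data_quality_ruleset(
    Name="finance_ruleset",
    Description="Quality rules owned by the finance team",
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "billing_db", "TableName": "invoices"},  # placeholder
)
```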
Source: aws.amazon.com

Amazon MSK Replicator is now available in five additional AWS Regions

You can now use Amazon MSK Replicator to replicate streaming data across Amazon Managed Streaming for Apache Kafka (Amazon MSK) clusters in five additional AWS Regions: Asia Pacific (Thailand), Mexico (Central), Asia Pacific (Taipei), Canada West (Calgary), and Europe (Spain).

MSK Replicator is a feature of Amazon MSK that enables you to reliably replicate data across Amazon MSK clusters in the same or different AWS Regions in a few clicks. With MSK Replicator, you can easily build regionally resilient streaming applications for increased availability and business continuity. MSK Replicator provides automatic asynchronous replication across MSK clusters, eliminating the need to write custom code, manage infrastructure, or set up cross-Region networking. MSK Replicator automatically scales the underlying resources so that you can replicate data on demand without having to monitor or scale capacity. MSK Replicator also replicates the necessary Kafka metadata, including topic configurations, access control lists (ACLs), and consumer group offsets. If an unexpected event occurs in one Region, you can fail over to the other AWS Region and seamlessly resume processing.

You can get started with MSK Replicator from the Amazon MSK console or the AWS CLI. To learn more, visit the MSK Replicator product page, pricing page, and documentation.
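For reference, a minimal boto3 sketch of creating a replicator between two clusters follows. The parameter shapes reflect the CreateReplicator API as best understood at the time of writing and should be verified against the current Amazon MSK API reference; every ARN, subnet, and security group is a placeholder.

```python
import boto3

kafka = boto3.client("kafka", region_name="eu-south-2")  # Europe (Spain)

# Replicate all topics and consumer groups from a source cluster to a
# target cluster in another Region. All identifiers are placeholders.
kafka.create_replicator(
    ReplicatorName="orders-dr-replicator",
    ServiceExecutionRoleArn="arn:aws:iam::111122223333:role/MskReplicatorRole",
    KafkaClusters=[
        {
            "AmazonMskCluster": {"MskClusterArn": "arn:aws:kafka:us-east-1:111122223333:cluster/source-cluster/uuid-1"},
            "VpcConfig": {"SubnetIds": ["subnet-aaa"], "SecurityGroupIds": ["sg-aaa"]},
        },
        {
            "AmazonMskCluster": {"MskClusterArn": "arn:aws:kafka:eu-south-2:111122223333:cluster/target-cluster/uuid-2"},
            "VpcConfig": {"SubnetIds": ["subnet-bbb"], "SecurityGroupIds": ["sg-bbb"]},
        },
    ],
    ReplicationInfoList=[
        {
            "SourceKafkaClusterArn": "arn:aws:kafka:us-east-1:111122223333:cluster/source-cluster/uuid-1",
            "TargetKafkaClusterArn": "arn:aws:kafka:eu-south-2:111122223333:cluster/target-cluster/uuid-2",
            "TargetCompressionType": "NONE",
            "TopicReplication": {
                "TopicsToReplicate": [".*"],
                "CopyTopicConfigurations": True,
                "CopyAccessControlListsForTopics": True,
                "DetectAndCopyNewTopics": True,
            },
            "ConsumerGroupReplication": {
                "ConsumerGroupsToReplicate": [".*"],
                "DetectAndCopyNewConsumerGroups": True,
                "SynchroniseConsumerGroupOffsets": True,
            },
        }
    ],
)
```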
Source: aws.amazon.com

Amazon CloudFront announces support for mutual TLS authentication

Amazon CloudFront announces support for mutual TLS authentication (mTLS), a security protocol that requires both the server and client to authenticate each other using X.509 certificates, enabling customers to validate client identities at CloudFront’s edge locations. Customers can now ensure only clients presenting trusted certificates can access their distributions, helping protect against unauthorized access and security threats.

Previously, customers had to spend ongoing effort implementing and maintaining their own client access management solutions, leading to undifferentiated heavy lifting. Now, with support for mutual TLS, customers can easily validate client identities at the AWS edge before connections are established with their application servers or APIs. Example use cases include B2B secure API integrations for enterprises and client authentication for IoT. For B2B API security, enterprises can authenticate API requests from trusted third parties and partners using mutual TLS. For IoT use cases, enterprises can validate that devices are authorized to receive proprietary content such as firmware updates. Customers can leverage their existing third-party certificate authorities or AWS Private Certificate Authority to sign the X.509 certificates. With mutual TLS, customers get the performance and scale benefits of CloudFront for workloads that require client authentication.

Mutual TLS authentication is available to all CloudFront customers at no additional cost. Customers can configure mutual TLS with CloudFront using the AWS Management Console, CLI, SDK, CDK, and CloudFormation. For detailed implementation guidance and best practices, visit the CloudFront Mutual TLS (viewer) documentation.
Source: aws.amazon.com