Amazon SageMaker HyperPod now supports automatic Slurm topology management

Amazon SageMaker HyperPod now automatically selects and continuously maintains the optimal network topology configuration for Slurm clusters based on the GPU instance types in the cluster. Network topology directly impacts distributed training performance: when jobs are placed on nodes that are topologically close, GPU-to-GPU communication is faster, NCCL collective operations are more efficient, and training throughput improves. HyperPod dynamically adapts the topology as the cluster evolves through scaling operations and node replacements, so job placement remains optimized throughout the cluster lifecycle without requiring manual updates to topology files or Slurm reconfiguration.
HyperPod inspects the instance types across all instance groups at cluster creation, identifies the networking and interconnect characteristics of each instance type, and automatically selects the best-fit topology model. HyperPod supports tree topology for instance types with hierarchical interconnects such as ml.p5.48xlarge, ml.p5e.48xlarge, and ml.p5en.48xlarge, and block topology for instance types with uniform high-bandwidth connectivity such as ml.p6e-gb200.NVL72. For clusters with mixed instance types, HyperPod selects a compatible topology that works across all nodes. As the cluster changes through scale-up, scale-down, or node replacement events, HyperPod automatically updates the topology configuration without manual intervention, so the topology always reflects the actual state of the cluster.
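The selection step described above can be pictured with a small sketch. This is not HyperPod's implementation: the instance-type sets only repeat the examples named in the announcement, the function name is hypothetical, and the fallback to tree for mixed clusters is an assumption, since the announcement does not say which topology counts as "compatible". The return values are the names of Slurm's tree and block topology plugins.

```python
# Sketch only: choose a Slurm topology plugin from a cluster's instance types.
# The sets below repeat the examples from the announcement; they are not an
# exhaustive or authoritative list.
TREE_TYPES = {"ml.p5.48xlarge", "ml.p5e.48xlarge", "ml.p5en.48xlarge"}
BLOCK_TYPES = {"ml.p6e-gb200.NVL72"}


def select_topology(instance_types):
    """Return a Slurm TopologyPlugin value for the given instance types."""
    types = set(instance_types)
    if not types:
        raise ValueError("cluster has no instance types")
    unknown = types - TREE_TYPES - BLOCK_TYPES
    if unknown:
        raise ValueError(f"no topology model known for: {sorted(unknown)}")
    if types <= BLOCK_TYPES:
        return "topology/block"
    # Assumption: tree is used for all-tree clusters and as the fallback
    # for mixed clusters. The announcement does not specify the fallback.
    return "topology/tree"
```

Re-running a function like this after every scale-up, scale-down, or node replacement is the kind of work the feature removes from the operator.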
To get started, create a SageMaker HyperPod Slurm cluster with supported GPU instance types. Topology-aware scheduling is enabled by default and requires no configuration.
This feature is available in all AWS Regions where Amazon SageMaker HyperPod is supported. To learn more about topology-aware scheduling, visit the Amazon SageMaker HyperPod documentation.
Source: aws.amazon.com

AWS Parallel Computing Service now supports Slurm 25.11

AWS Parallel Computing Service (AWS PCS) now supports Slurm version 25.11, with support for a Prometheus-compatible OpenMetrics endpoint, and introduces new log types including scheduler audit logs. This release of Slurm 25.11 introduces expedited re-queue, which can automatically reschedule jobs affected by node issues at the highest priority to help your workloads recover faster. You can enable a new OpenMetrics endpoint for real-time visibility into jobs, nodes, and scheduling using your existing monitoring tools. AWS PCS can now also send Slurm database daemon (slurmdbd) and REST API daemon (slurmrestd) logs to Amazon CloudWatch Logs, Amazon S3, or Amazon Data Firehose, helping diagnose accounting issues and debug API integrations. Scheduler audit logs, previously included in operational logs, are now delivered as a dedicated log type, providing independent control over ingestion and storage costs.
AWS PCS is a managed service that makes it easier for you to run and scale your high performance computing (HPC) workloads and build scientific and engineering models on AWS using Slurm. You can use AWS PCS to build complete, elastic environments that integrate compute, storage, networking, and visualization tools. AWS PCS simplifies cluster operations with managed updates and built-in observability features, helping to remove the burden of maintenance. You can work in a familiar environment, focusing on your research and innovation instead of worrying about infrastructure.
These features are available in all AWS Regions where AWS PCS is available. Standard charges apply for log delivery destinations. To learn more about AWS PCS, refer to the service documentation.
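To give a feel for what a Prometheus-compatible endpoint returns, here is a minimal sketch that parses text-format metrics into a dictionary. The metric names in SAMPLE are hypothetical placeholders, not the names Slurm or AWS PCS actually exports. The parser is deliberately simplified: it assumes each sample line ends with its value and carries no trailing timestamp.

```python
# Sketch only: parse Prometheus/OpenMetrics text exposition into a dict.
# The metric names in SAMPLE are hypothetical placeholders.
SAMPLE = """\
# HELP slurm_jobs_pending Hypothetical gauge of pending jobs.
# TYPE slurm_jobs_pending gauge
slurm_jobs_pending 12
slurm_nodes{state="idle"} 40
slurm_nodes{state="down"} 2
# EOF
"""


def parse_metrics(text):
    """Map each sample line 'name{labels} value' to its float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE metadata, comments, and the EOF marker
        # Simplification: the value is taken to be the last space-separated
        # field, so lines with a trailing timestamp are not handled.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples
```

In practice an existing monitoring stack would scrape the endpoint directly; a parser like this is mainly useful for quick ad hoc checks, such as alerting when a down-node gauge is non-zero.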
Source: aws.amazon.com

Amazon Athena simplifies federated queries with managed connectors

Amazon Athena now offers managed connectors for 12 data sources, including Amazon DynamoDB, PostgreSQL, MySQL, and Snowflake. Managed connectors are AWS Glue Data Catalog federated connectors that Athena creates and manages on your behalf, so you can query data outside Amazon S3 without deploying or maintaining connector resources in your AWS account. With Athena, you can interactively query relational, non-relational, object, and custom data sources without moving or duplicating data. To get started with managed connectors, you create a connection for your data source in Athena. Athena automatically sets up and manages connector resources on your behalf, registering the data source as a federated catalog in AWS Glue Data Catalog. You can then query the data source alongside your Amazon S3 data and optionally set up fine-grained access controls through AWS Lake Formation. Federated queries with managed connectors are available in all AWS Regions where Athena is available, except the AWS GovCloud (US) Regions and the China Regions. To learn more, visit Use Amazon Athena Federated Query in the Athena User Guide.
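Querying a federated source alongside Amazon S3 data comes down to addressing each table by catalog, database, and table name. The sketch below builds such a join as a SQL string. The catalog, database, table, and column names in the usage note are hypothetical, and the helper functions are illustrative rather than part of any AWS SDK.

```python
# Sketch only: build an Athena SQL join between a federated catalog and a
# table in the default Glue Data Catalog. All names passed in are examples.
def quote_ident(name):
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified(catalog, database, table):
    """Return a fully qualified "catalog"."database"."table" reference."""
    return ".".join(quote_ident(part) for part in (catalog, database, table))


def build_join(federated, s3, key):
    """Join a federated table (f) to an S3-backed table (s) on a shared key."""
    return (
        f"SELECT * FROM {qualified(*federated)} AS f "
        f"JOIN {qualified(*s3)} AS s "
        f"ON f.{quote_ident(key)} = s.{quote_ident(key)}"
    )
```

For example, build_join(("pg_demo", "public", "customers"), ("awsdatacatalog", "sales", "orders"), "customer_id") yields a query that could then be submitted through the Athena console or the StartQueryExecution API.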
Source: aws.amazon.com

Trivy, KICS, and the shape of supply chain attacks so far in 2026

Catching the KICS push: what happened, and the case for open, fast collaboration

In the past few weeks we’ve worked through two supply chain compromises on Docker Hub with a similar shape: first Trivy, now Checkmarx KICS. In both cases, stolen publisher credentials were used to push malicious images through legitimate publishing flows. In both cases, Docker’s infrastructure was not breached. And in both cases, the software supply chain of everyone who pulled the compromised tags was briefly exposed.

This is our account of what happened with KICS, what affected users should do, and what the pattern says about where defenders need to invest.

What happened

On April 22, 2026 at approximately 12:35 UTC, a threat actor authenticated to Docker Hub using valid Checkmarx publisher credentials and pushed malicious images to the checkmarx/kics repository. Five existing tags were overwritten to malicious digests (latest, v2.1.20, v2.1.20-debian, alpine, debian) and two new tags (v2.1.21, v2.1.21-debian) were created. The images were built from an attacker-controlled source repository, not from Checkmarx’s.

The poisoned binary kept the legitimate scanning surface intact and added a quiet exfiltration path. Scan output was collected, encrypted, and sent to attacker-controlled infrastructure at audit.checkmarx[.]cx, with the User-Agent KICS-Telemetry/2.0. Because KICS scans Terraform, CloudFormation, Kubernetes and similar configuration files, its output routinely contains secrets, credentials, cloud resource names, and internal topology. 

Affected malicious digests (any one of these in your pull history should be treated as malicious):

For alpine, v2.1.20, v2.1.21 -> Index manifest digest: sha256:2588a44890263a8185bd5d9fadb6bc9220b60245dbcbc4da35e1b62a6f8c230d

Image digest (amd64): sha256:d186161ae8e33cd7702dd2a6c0337deb14e2b178542d232129c0da64b1af06e4
Image digest (arm64): sha256:415610a42c5b51347709e315f5efb6fffa588b6ebc1b95b24abf28088347791b

For debian, v2.1.20-debian, v2.1.21-debian -> Index manifest digest: sha256:222e6bfed0f3bb1937bf5e719a2342871ccd683ff1c0cb967c8e31ea58beaf7b

Image digest (amd64): sha256:a6871deb0480e1205c1daff10cedf4e60ad951605fd1a4efaca0a9c54d56d1cb
Image digest (arm64): sha256:ff7b0f114f87c67402dfc2459bb3d8954dd88e537b0e459482c04cffa26c1f07

For latest -> Index manifest digest: sha256:a0d9366f6f0166dcbf92fcdc98e1a03d2e6210e8d7e8573f74d50849130651a0

Image digest (amd64): sha256:26e8e9c5e53c972997a278ca6e12708b8788b70575ca013fd30bfda34ab5f48f
Image digest (arm64): sha256:7391b531a07fccbbeaf59a488e1376cfe5b27aef757430a36d6d3a087c610322
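One way to check a pull history against this list is to scan text for any of the nine digests. The sketch below is an illustration, not an official tool. It assumes you can export image listings or registry logs as plain text lines (for example, the output of docker images --digests), and the function name is hypothetical.

```python
import re

# The nine digests listed above (index manifests plus per-architecture images).
MALICIOUS_DIGESTS = frozenset({
    "sha256:2588a44890263a8185bd5d9fadb6bc9220b60245dbcbc4da35e1b62a6f8c230d",
    "sha256:d186161ae8e33cd7702dd2a6c0337deb14e2b178542d232129c0da64b1af06e4",
    "sha256:415610a42c5b51347709e315f5efb6fffa588b6ebc1b95b24abf28088347791b",
    "sha256:222e6bfed0f3bb1937bf5e719a2342871ccd683ff1c0cb967c8e31ea58beaf7b",
    "sha256:a6871deb0480e1205c1daff10cedf4e60ad951605fd1a4efaca0a9c54d56d1cb",
    "sha256:ff7b0f114f87c67402dfc2459bb3d8954dd88e537b0e459482c04cffa26c1f07",
    "sha256:a0d9366f6f0166dcbf92fcdc98e1a03d2e6210e8d7e8573f74d50849130651a0",
    "sha256:26e8e9c5e53c972997a278ca6e12708b8788b70575ca013fd30bfda34ab5f48f",
    "sha256:7391b531a07fccbbeaf59a488e1376cfe5b27aef757430a36d6d3a087c610322",
})

DIGEST_PATTERN = re.compile(r"sha256:[0-9a-f]{64}")


def find_malicious(lines):
    """Return (line_number, digest) for every known-bad digest in the input."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for digest in DIGEST_PATTERN.findall(line):
            if digest in MALICIOUS_DIGESTS:
                hits.append((number, digest))
    return hits
```

Any hit should be treated as a pull of a malicious image. An empty result only covers the text you scanned, so caches, CI runners, and mirrors need checking separately.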

If your CI ran kics against any repository with credentials in scope during the exposure window, rotate those credentials now. Re-pull checkmarx/kics by digest, not tag, and pin your CI to the digest so a future overwrite cannot silently affect you again. Purge the malicious digests from local caches, CI runners, pull-through registries, and mirrors: a clean pull won’t remove what’s already been cached. Check egress logs for connections to audit.checkmarx[.]cx, or outbound traffic with the KICS-Telemetry/2.0 User-Agent, which are strong indicators that exfiltration occurred on your infrastructure.
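Two of these checks are easy to script. The sketch below flags egress log lines that contain either indicator, and tests whether an image reference is pinned by digest. The log format is an assumption (one request per line, with the URL and User-Agent somewhere in the text), and the function names are hypothetical.

```python
import re

# Indicators from the incident. The domain is stored defanged, as published,
# and re-fanged only for matching.
DEFANGED_DOMAIN = "audit.checkmarx[.]cx"
IOC_DOMAIN = DEFANGED_DOMAIN.replace("[.]", ".")
IOC_USER_AGENT = "KICS-Telemetry/2.0"

# An image reference pinned by digest ends in @sha256:<64 hex characters>.
PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def scan_egress(lines):
    """Return log lines that mention the exfiltration domain or User-Agent."""
    return [
        line.rstrip("\n")
        for line in lines
        if IOC_DOMAIN in line.lower() or IOC_USER_AGENT in line
    ]


def is_pinned(image_ref):
    """True if the reference uses a digest rather than a mutable tag."""
    return PINNED.search(image_ref) is not None
```

A reference such as checkmarx/kics:latest fails is_pinned, which is exactly the case where a tag overwrite silently changes what CI runs.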

The affected digests are disabled, the repository has been restored to its last known-good state, and pulls of checkmarx/kics today return the legitimate March 3, 2026 image. The publisher account used to push the malicious images has been suspended, and we’ve notified the small number of users our telemetry shows pulled the compromised digests.

Socket’s technical analysis of the issue is linked under Further reading below. Their post also covers what appears to be a broader Checkmarx compromise, including recent VS Code extension releases, which is worth reading if your developers use those extensions.

How we caught this breach

Within about half an hour of the push, a new image on a repository we monitor triggered a review. A check against the upstream source found no matching release, and the provenance showed the image had been built from a different source repository created one day before the push. That was enough to quarantine the repository and start forensics with Socket and Checkmarx.

The defense is in correlation, not any single signal. In this episode, we found a new tag without an upstream release, provenance from an unfamiliar source, and a timing pattern that did not appear to match normal publishing behavior. Because we saw these signals together, they bought us a narrow window in which to act. Layered defense shortens the window between push and takedown; it does not prevent the push.
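To make the idea of correlation concrete, here is a rough sketch of combining weak signals into one decision. It is not Docker's detection logic. The field names, the 30-day repository age cutoff, and the two-signal threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PushSignals:
    """Observations about one image push (all fields are hypothetical)."""
    has_upstream_release: bool      # tag matches a release in the upstream repo
    source_repo_known: bool         # provenance points to the expected source
    source_repo_age_days: int       # age of the repo named in the provenance
    matches_publish_cadence: bool   # timing fits the publisher's usual pattern


def suspicious_signals(signals, min_repo_age_days=30):
    """List the reasons a push looks unusual. Each is weak on its own."""
    reasons = []
    if not signals.has_upstream_release:
        reasons.append("tag has no matching upstream release")
    if not signals.source_repo_known:
        reasons.append("provenance points to an unfamiliar source repository")
    if signals.source_repo_age_days < min_repo_age_days:
        reasons.append("source repository was created recently")
    if not signals.matches_publish_cadence:
        reasons.append("push timing does not match normal publishing behavior")
    return reasons


def should_quarantine(signals, threshold=2):
    """Escalate only when several independent signals line up."""
    return len(suspicious_signals(signals)) >= threshold
```

A push shaped like the KICS one (no upstream release, unfamiliar source, repository one day old, odd timing) trips every check, while a single anomaly such as unusual timing stays below the threshold.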

The bar for this kind of attack has collapsed

The uncomfortable thing about this incident, and Trivy before it, is how little sophistication attacks like these now require. A stolen credential from an IDE extension compromise, a target chosen from a public profile, a push through the normal publishing flow, and the attacker is inside the software supply chain of every organization that pulls that tag. Our assumption is that this attack did not require any zero-days, novel tradecraft, or nation-state-level budgets. The ingredients are stolen credentials and time, and both are abundant right now.

Every registry, every package manager, and every publisher of any consequence is in the firing line, including Docker. This isn’t a Checkmarx problem or a Hub problem or an npm problem. It’s the new baseline, and defenders who aren’t planning for it as the default case are already behind.

There are two implications for our ecosystem.

First, credential hygiene at the publishing boundary matters more than it used to: fine-grained tokens scoped to a single registry, shorter credential lifetimes, clean separation between personal and publisher identities.

Second, no single layer will catch all of this. Publishing-time verification, provenance, signatures, registry-side monitoring, deep package inspection (the kind Socket does to catch malicious behavior in dependencies), runtime egress controls, and cross-registry signal correlation each have to do some of the work, because any of them alone will miss cases the others catch.

A note on where this is structurally harder

In the Docker Hardened Images catalog, images are built by Docker from source, with verified provenance and signed releases produced through a hardened build pipeline. The class of attack described above, where a valid publisher credential pushes a tag that diverges from its upstream source, is structurally much harder to execute against an image built this way. There is no external credential that can substitute its way in; the provenance and the signatures have to match, or the image doesn’t ship. The DHI catalog is expanding, and we’re investing in this layer precisely because of the scenario explored in this post.

No one catches this alone

The reason this incident got caught quickly, the reason Socket was able to produce a technical analysis within hours, and the reason Checkmarx’s response could move in parallel with ours, is that all three teams shared signals and samples in real time. The Trivy response looked the same, as did the rapid notification to GitHub about the attacker-controlled source repository.

This is the posture the ecosystem needs more of, not less. Supply chain attackers are moving across registries, IDE marketplaces, source hosts, and CI systems in hours. Defenders who don’t share signals across those same boundaries are operating at a disadvantage. Formal standards for cross-registry coordination are still emerging, and they will matter eventually. What’s kept the windows short so far has been teams working in a spirit of openness, willingly sharing what they’re discovering in real time.

Docker will keep investing in layered defenses on Hub, keep extending publishing-time verification to more of the catalog, and keep showing up to share signals, whether that is in a partner’s incident channel, a peer registry’s investigation, or the rooms where a more durable framework for coordination eventually takes shape.

We want to thank the Socket research team for fast, independent analysis, and Checkmarx for moving alongside us on a tight timeline.

Further reading

Socket blog: https://socket.dev/blog/checkmarx-supply-chain-compromise

Docker Hardened Images on Docker Hub: https://hub.docker.com/hardened-images/catalog

Source: https://blog.docker.com/feed/

Amazon Quick now supports multiple owners for admin-managed SharePoint and Google Drive knowledge bases

Amazon Quick now enables you to add co-owners to knowledge bases and data source connections for admin-managed Microsoft SharePoint Online and Google Drive integrations. This makes it easier to collaborate across teams and reuse existing connections without re-entering credentials. Knowledge base owners can share their knowledge bases with two roles: Owner (full management access including editing, syncing, sharing, and deleting) and Viewer (query-only access). Co-owner sharing with the Owner role is available exclusively for admin-managed SharePoint and Google Drive knowledge bases. All other knowledge base types support Viewer sharing only. To share, navigate to the actions menu next to any knowledge base or use the Permissions tab. Administrators can also share data source connections, allowing other users to create knowledge bases from the same connection. Data source sharing supports Owner (create knowledge bases and edit connection details) and Viewer (create knowledge bases only) roles. To share a data source, go to Manage account > Manage assets > Data sources and select the connection to share. This feature is available in all AWS Regions where Amazon Quick is available. For more information, see Knowledge Base Sharing in the Amazon Quick User Guide. Amazon Quick is available in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (London), and Europe (Ireland). For more information, visit the Amazon Quick page.
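The sharing rules above can be summarized as a small permission model. The sketch below is an illustration of the announcement's wording, not Amazon Quick's API: the action names, type identifiers, and function name are hypothetical, and including query access in the Owner role is an assumption.

```python
from enum import Enum


class Role(Enum):
    OWNER = "Owner"
    VIEWER = "Viewer"


# Actions per role, paraphrased from the announcement. Action names are
# hypothetical. Assumption: Owners can also query.
KB_PERMISSIONS = {
    Role.OWNER: {"query", "edit", "sync", "share", "delete"},
    Role.VIEWER: {"query"},
}
DATA_SOURCE_PERMISSIONS = {
    Role.OWNER: {"create_knowledge_base", "edit_connection"},
    Role.VIEWER: {"create_knowledge_base"},
}

# Knowledge base types that support co-owners when admin-managed.
CO_OWNER_TYPES = {"sharepoint", "google_drive"}


def can_share_kb_as(role, kb_type, admin_managed):
    """Viewer sharing is always allowed; Owner sharing only for
    admin-managed SharePoint and Google Drive knowledge bases."""
    if role is Role.VIEWER:
        return True
    return admin_managed and kb_type in CO_OWNER_TYPES
```

Under this model, can_share_kb_as(Role.OWNER, "sharepoint", True) is allowed, while the same call for any other knowledge base type is limited to Viewer sharing.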
Source: aws.amazon.com