Fighting money laundering together: How EuroDaT enables the secure exchange of sensitive financial data

A guest post by Dr. Alexander Alldridge, Managing Director of EuroDaT

Fighting money laundering is a team effort. Banks, governments, and technology partners have to work closely together to uncover criminal networks effectively. In the heavily regulated financial sector, this challenge is particularly complex: how does data matching work when the data in question is highly sensitive? In this blog post, Dr. Alexander Alldridge, Managing Director of EuroDaT, explains the role a data trustee can play, and how EuroDaT has used Google Cloud solutions to build a scalable, GDPR-compliant infrastructure for exactly this purpose.
When a bank notices a suspicious transaction, a delicate coordination process begins. To trace possible money flows, it asks other banks for information about specific transactions or accounts. Today this usually happens by phone, not because digital alternatives don't exist, but because passing on sensitive financial data such as IBANs or account activity is only permitted under very narrow legal conditions.

This back and forth over the phone is not only tedious but also error-prone. A digital data exchange that gives only authorized parties access to exactly the information they need in a specific suspicious case would be much faster and more secure.

Here at EuroDaT, a subsidiary of the German state of Hesse, we offer exactly that: as Europe's first transaction-based data trustee, we enable a controlled, case-by-case exchange of sensitive financial data that protects confidential information and meets all legal requirements.

safeAML: A new approach to data exchange in the financial sector

With safeAML, we have worked with Commerzbank, Deutsche Bank, and N26 to develop a system that digitizes the exchange of information between financial institutions. Instead of laboriously calling other institutions one by one, each bank will be able to pull in the relevant data from other banks itself to better assess suspicious transactions.

The exchange is controlled and compliant with data protection law: the data is processed in pseudonymized form and passed on in such a way that only the requesting bank can ultimately link it back to the people involved. As the data trustee, we at EuroDaT never have access to personal data.

The safeAML application

The highest security and compliance standards with Google Cloud

safeAML is a cloud-native application, meaning it is developed and operated entirely in the cloud. That requires an infrastructure that is not only technically capable but also meets the strict requirements of the financial sector, from the GDPR to industry-specific security and cyber-resilience rules. Google Cloud provides a strong foundation for this, because the Google Cloud team laid the right technical and contractual groundwork for such sensitive use cases early on. For us, that was a decisive advantage over other providers.

Our entire infrastructure is built on Google Kubernetes Engine (GKE). On top of it, we set up secure, isolated environments in which each request can be processed traceably and separately from all others. All technical resources, including our Virtual Private Clouds (VPCs), are defined in the Google Cloud environment as infrastructure as code. This means EuroDaT's entire infrastructure is built up in an automated, repeatable way, including the rules governing which data may flow where.

This transparent, easily reproducible architecture also helps us meet the strict compliance requirements of the financial sector: we can demonstrate at any time that security-related requirements are implemented and checked automatically.
Banks use safeAML for faster reviews of suspicious activity

safeAML is now being piloted at the first German banks to help them assess suspicious transactions faster and more accurately. Instead of having to reach for the phone as before, investigators can now request targeted supplementary information from other institutions without exposing sensitive data.

This not only speeds up the review but also reduces false positives, which until now have tied up a great deal of time and capacity. Whether a suspected case of money laundering is reported remains a human, case-by-case decision, as German law requires.

The fact that banks can exchange data in a controlled way via safeAML for the first time is already a big step for anti-money-laundering efforts in Germany. But we are still at the beginning: the next step is to bring more banks on board, expand the network nationally and internationally, and make the process as simple as possible. The more institutions take part, the more complete a picture of suspicious money flows we can draw. In the future, this new data foundation can also help classify and assess suspected cases more accurately.
Sustainable data protection: Secure exchange of ESG data

Our solution is not limited to the financial sector, though. As a data trustee, we can apply the same basic principle, making sensitive data accessible only in a targeted and controlled way between authorized parties, to many other areas as well. We always work with partners who build their application ideas on EuroDaT, while we ourselves remain neutral as the data trustee.

EuroDaT's service portfolio

A current example is ESG data: not only large companies but also small and medium-sized enterprises are under growing pressure to disclose sustainability metrics, whether because of new legal requirements or because business partners such as banks and insurers demand them.

Meeting these requirements is especially difficult for smaller companies. They often lack the structures or resources to provide ESG data in a standardized form, and understandably they do not want to simply make sensitive information such as consumption data public.

This is where EuroDaT comes in: as a trusted intermediary, we ensure that sustainability data is passed on securely without companies losing control over it. We are currently in talks with the German Sustainability Code (Deutscher Nachhaltigkeitskodex, DNK) about a solution that could make it easier for small companies to transmit ESG data to banks, insurers, and investors via EuroDaT as the data trustee.
Research in the healthcare sector: Sensitive data, secure insights

We also see great potential for our technology in the healthcare sector. Health data is, of course, especially sensitive and may only be processed under strict conditions. Even so, there are many cases in which health data needs to be combined, for example for basic research, the design of clinical trials, and policy decisions.

On behalf of the German federal government, the consultancy d-fine has now shown how health data can be put to use with the help of EuroDaT, for example to analyze the impact of post-COVID conditions on employment. This requires combining health data with employment data that is just as sensitive, which EuroDaT makes possible: as the data trustee, we ensure that the data remains confidential while still being used in a meaningful way.

Data sovereignty as the key to digital collaboration

When data cannot simply be shared, there are usually good reasons. In finance and healthcare in particular, data protection and confidentiality are non-negotiable. That makes it all the more important that, when an exchange of this data does become necessary, it can take place in a legally sound and controlled way.

As a data trustee, we therefore not only enable secure data exchange in sensitive industries but also strengthen the data sovereignty of everyone involved. Together with Google Cloud, we are anchoring data protection firmly at the core of digital collaboration between companies, public authorities, and research institutions.
Source: Google Cloud Platform

Agent Factory: Top 5 agent observability best practices for reliable AI

This blog post is the third in a six-part series called Agent Factory, which shares best practices, design patterns, and tools to help guide you through adopting and building agentic AI.

Seeing is knowing—the power of agent observability

As agentic AI becomes more central to enterprise workflows, ensuring reliability, safety, and performance is critical. That’s where agent observability comes in. Agent observability empowers teams to:

Detect and resolve issues early in development.

Verify that agents uphold standards of quality, safety, and compliance.

Optimize performance and user experience in production.

Maintain trust and accountability in AI systems.

With the rise of complex, multi-agent and multi-modal systems, observability is essential for delivering AI that is not only effective, but also transparent, safe, and aligned with organizational values. Observability empowers teams to build with confidence and scale responsibly by providing visibility into how agents behave, make decisions, and respond to real-world scenarios across their lifecycle.

Learn more about building agentic AI in Azure AI Foundry

What is agent observability?

Agent observability is the practice of achieving deep, actionable visibility into the internal workings, decisions, and outcomes of AI agents throughout their lifecycle—from development and testing to deployment and ongoing operation. Key aspects of agent observability include:

Continuous monitoring: Tracking agent actions, decisions, and interactions in real time to surface anomalies, unexpected behaviors, or performance drift.

Tracing: Capturing detailed execution flows, including how agents reason through tasks, select tools, and collaborate with other agents or services. This helps answer not just “what happened,” but “why and how did it happen?”

Logging: Recording agent decisions, tool calls, and internal state changes to support debugging and behavior analysis in agentic AI workflows.

Evaluation: Systematically assessing agent outputs for quality, safety, compliance, and alignment with user intent—using both automated and human-in-the-loop methods.

Governance: Enforcing policies and standards to ensure agents operate ethically, safely, and in accordance with organizational and regulatory requirements.

Traditional observability vs agent observability

Traditional observability relies on three foundational pillars: metrics, logs, and traces. These provide visibility into system performance, help diagnose failures, and support root-cause analysis. They are well-suited for conventional software systems where the focus is on infrastructure health, latency, and throughput.

However, AI agents are non-deterministic and introduce new dimensions—autonomy, reasoning, and dynamic decision making—that require a more advanced observability framework. Agent observability builds on traditional methods and adds two critical components: evaluations and governance. Evaluations help teams assess how well agents resolve user intent, adhere to tasks, and use tools effectively. Agent governance can ensure agents operate safely, ethically, and in compliance with organizational standards.

This expanded approach enables deeper visibility into agent behavior—not just what agents do, but why and how they do it. It supports continuous monitoring across the agent lifecycle, from development to production, and is essential for building trustworthy, high-performing AI systems at scale.

Azure AI Foundry Observability provides end-to-end agent observability

Azure AI Foundry Observability is a unified solution for evaluating, monitoring, tracing, and governing the quality, performance, and safety of your AI systems end to end in Azure AI Foundry—all built into your AI development loop. From model selection to real-time debugging, Foundry Observability capabilities empower teams to ship production-grade AI with confidence and speed. It’s observability, reimagined for the enterprise AI era.

With built-in capabilities like the Agents Playground evaluations, Azure AI Red Teaming Agent, and Azure Monitor integration, Foundry Observability brings evaluation and safety into every step of the agent lifecycle. Teams can trace each agent flow with full execution context, simulate adversarial scenarios, and monitor live traffic with customizable dashboards. Seamless CI/CD integration enables continuous evaluation on every commit, and governance integrations with Microsoft Purview, Credo AI, and Saidot help enable alignment with regulatory frameworks like the EU AI Act, making it easier to build responsible, production-grade AI at scale.

Five best practices for agent observability

1. Pick the right model using benchmark-driven leaderboards

Every agent needs a model, and choosing the right one is foundational to agent success. While planning your AI agent, you need to decide which model is best for your use case in terms of safety, quality, and cost.

You can pick the best model by either evaluating models on your own data or using Azure AI Foundry's model leaderboards to compare foundation models out-of-the-box by quality, cost, and performance, backed by industry benchmarks. With Foundry model leaderboards, you can find model leaders across various selection criteria and scenarios, visualize trade-offs among the criteria (e.g., quality vs. cost or safety), and dive into detailed metrics to make confident, data-driven decisions.

Azure AI Foundry’s model leaderboards gave us the confidence to scale client solutions from experimentation to deployment. Comparing models side by side helped customers select the best fit—balancing performance, safety, and cost with confidence.
—Mark Luquire, EY Global Microsoft Alliance Co-Innovation Leader, Managing Director, Ernst & Young, LLP*

2. Evaluate agents continuously in development and production

Agents are powerful productivity assistants. They can plan, make decisions, and execute actions. Agents typically first reason through user intents in conversations, select the correct tools to call and satisfy the user requests, and complete various tasks according to their instructions. Before deploying agents, it’s critical to evaluate their behavior and performance.

Azure AI Foundry makes agent evaluation easier with several agent evaluators supported out-of-the-box, including Intent Resolution (how accurately the agent identifies and addresses user intentions), Task Adherence (how well the agent follows through on identified tasks), Tool Call Accuracy (how effectively the agent selects and uses tools), and Response Completeness (whether the agent’s response includes all necessary information). Beyond agent evaluators, Azure AI Foundry also provides a comprehensive suite of evaluators for broader assessments of AI quality, risk, and safety. These include quality dimensions such as relevance, coherence, and fluency, along with comprehensive risk and safety checks that assess for code vulnerabilities, violence, self-harm, sexual content, hate, unfairness, indirect attacks, and the use of protected materials. The Azure AI Foundry Agents Playground brings these evaluation and tracing tools together in one place, letting you test, debug, and improve agentic AI efficiently.

The robust evaluation tools in Azure AI Foundry help our developers continuously assess the performance and accuracy of our AI models, including meeting standards for coherence, fluency, and groundedness.
—Amarender Singh, Director, AI, Hughes Network Systems

3. Integrate evaluations into your CI/CD pipelines

Automated evaluations should be part of your CI/CD pipeline so every code change is tested for quality and safety before release. This approach helps teams catch regressions early and can help ensure agents remain reliable as they evolve.

Azure AI Foundry integrates with your CI/CD workflows using GitHub Actions and Azure DevOps extensions, enabling you to auto-evaluate agents on every commit, compare versions using built-in quality, performance, and safety metrics, and leverage confidence intervals and significance tests to support decisions—helping to ensure that each iteration of your agent is production ready.

We’ve integrated Azure AI Foundry evaluations directly into our GitHub Actions workflow, so every code change to our AI agents is automatically tested before deployment. This setup helps us quickly catch regressions and maintain high quality as we iterate on our models and features.
—Justin Layne Hofer, Senior Software Engineer, Veeam

4. Scan for vulnerabilities with AI red teaming before production

Security and safety are non-negotiable. Before deployment, proactively test agents for security and safety risks by simulating adversarial attacks. Red teaming helps uncover vulnerabilities that could be exploited in real-world scenarios, strengthening agent robustness.

Azure AI Foundry’s AI Red Teaming Agent automates adversarial testing, measuring risk and generating readiness reports. It enables teams to simulate attacks and validate both individual agent responses and complex workflows for production readiness.

Accenture is already testing the Microsoft AI Red Teaming Agent, which simulates adversarial prompts and detects model and application risk posture proactively. This tool will help validate not only individual agent responses, but also full multi-agent workflows in which cascading logic might produce unintended behavior from a single adversarial user. Red teaming lets us simulate worst-case scenarios before they ever hit production. That changes the game.
—Nayanjyoti Paul, Associate Director and Chief Azure Architect for Gen AI, Accenture

5. Monitor agents in production with tracing, evaluations, and alerts

Continuous monitoring after deployment is essential to catch issues, performance drift, or regressions in real time. Using evaluations, tracing, and alerts helps maintain agent reliability and compliance throughout its lifecycle.

Azure AI Foundry observability enables continuous agentic AI monitoring through a unified dashboard powered by Azure Monitor Application Insights and Azure Workbooks. This dashboard provides real-time visibility into performance, quality, safety, and resource usage, allowing you to run continuous evaluations on live traffic, set alerts to detect drift or regressions, and trace every evaluation result for full-stack observability. With seamless navigation to Azure Monitor, you can customize dashboards, set up advanced diagnostics, and respond swiftly to incidents—helping to ensure you stay ahead of issues with precision and speed.

Security is paramount for our large enterprise customers, and our collaboration with Microsoft allays any concerns. With Azure AI Foundry, we have the desired observability and control over our infrastructure and can deliver a highly secure environment to our customers.
—Ahmad Fattahi, Sr. Director, Data Science, Spotfire

Get started with Azure AI Foundry for end-to-end agent observability

To summarize, traditional observability includes metrics, logs, and traces. Agent observability needs metrics, traces, logs, evaluations, and governance for full visibility. Azure AI Foundry Observability is a unified solution for agent governance, evaluation, tracing, and monitoring—all built into your AI development lifecycle. With tools like the Agents Playground, smooth CI/CD, and governance integrations, Azure AI Foundry Observability empowers teams to ensure their AI agents are reliable, safe, and production ready. Learn more about Azure AI Foundry Observability and get full visibility into your agents today!

What’s next

In part four of the Agent Factory series, we’ll focus on how you can go from prototype to production faster with developer tools and rapid agent development.

Did you miss these posts in the series?

Agent Factory: The new era of agentic AI—common use cases and design patterns.

Agent Factory: Building your first AI agent with the tools to deliver real-world outcomes.


*The views reflected in this publication are the views of the speaker and do not necessarily reflect the views of the global EY organization or its member firms.
Source: Azure

Secure by Design: A Shift-Left Approach with Testcontainers, Docker Scout, and Hardened Images

In today’s fast-paced world of software development, product teams are expected to move quickly: building features, shipping updates, and reacting to user needs in real-time. But moving fast should never mean compromising on quality or security.

Thanks to modern tooling, developers can now maintain high standards while accelerating delivery. In a previous article, we explored how Testcontainers supports shift-left testing by enabling fast and reliable integration tests within the inner dev loop. In this post, we’ll look at the security side of this shift-left approach and how Docker can help move security earlier in the development lifecycle, using practical examples.

A Shift-Left Approach: Testing a Movie Catalog API

We’ll use a simple demo project to walk through our workflow. This is a Node.js + TypeScript API backed by PostgreSQL and tested with Testcontainers.

Movie API Endpoints:

Method   Endpoint              Description
POST     /movies               Add a new movie to the catalog
GET      /movies               Retrieve all movies, sorted by title
GET      /movies/search?q=…    Search movies by title or description (fuzzy match)
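
To make the search behavior concrete, here is a minimal sketch of how the search endpoint might be implemented with Express and the pg driver. The route and its semantics come from the table above; the table name, column names, and the ILIKE-based fuzzy matching are assumptions for illustration and may differ from the actual demo project:

import express from "express";
import { Pool } from "pg";

const app = express();
// Connection settings are read from PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD
const pool = new Pool();

// Hypothetical implementation of GET /movies/search?q=...
// Fuzzy matching is approximated with a case-insensitive substring match;
// the real project might use pg_trgm or full-text search instead.
app.get("/movies/search", async (req, res) => {
  const q = String(req.query.q ?? "");
  const { rows } = await pool.query(
    "SELECT * FROM movies WHERE title ILIKE $1 OR description ILIKE $1 ORDER BY title",
    [`%${q}%`]
  );
  res.json(rows);
});

app.listen(3000);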

Before deploying this app to production, we want to make sure it functions correctly and is free from critical vulnerabilities.

Shift-Left Testing with Testcontainers: Recap

We verify the application against a real PostgreSQL instance by using Testcontainers to spin up containers for both the database and the application. A key advantage of Testcontainers is that it creates these containers dynamically during test execution. Another feature of the Testcontainers libraries is the ability to start containers directly from a Dockerfile. This allows us to run the containerized application along with any required services, such as databases, effectively reproducing the local environment needed to test the application at the API or end-to-end (E2E) level. This approach provides an additional layer of quality assurance and brings even more testing into the inner development loop.

For a more detailed explanation of how Testcontainers enables a shift-left testing approach into the developer inner loop, refer to the introductory blog post.

Here's a beforeAll setup that prepares our test environment, including PostgreSQL and the application under development, started from the Dockerfile:

beforeAll(async () => {
  const network = await new Network().start();

  // 1. Start Postgres
  db = await new PostgreSqlContainer("postgres:17.4")
    .withNetwork(network)
    .withNetworkAliases("postgres")
    .withDatabase("catalog")
    .withUsername("postgres")
    .withPassword("postgres")
    .withCopyFilesToContainer([
      {
        source: path.join(__dirname, "../dev/db/1-create-schema.sql"),
        target: "/docker-entrypoint-initdb.d/1-create-schema.sql"
      },
    ])
    .start();

  // 2. Build movie catalog API container from the Dockerfile
  const container = await GenericContainer
    .fromDockerfile("../movie-catalog")
    .withTarget("final")
    .withBuildkit()
    .build();

  // 3. Start movie catalog API container with environment variables for DB connection
  app = await container
    .withNetwork(network)
    .withExposedPorts(3000)
    .withEnvironment({
      PGHOST: "postgres",
      PGPORT: "5432",
      PGDATABASE: "catalog",
      PGUSER: "postgres",
      PGPASSWORD: "postgres",
    })
    .withWaitStrategy(Wait.forListeningPorts())
    .start();
}, 120000);

We can now test the movie catalog API:

it("should create and retrieve a movie", async () => {
const baseUrl = `http://${app.getHost()}:${app.getMappedPort(3000)}`;
const payload = {
title: "Interstellar",
director: "Christopher Nolan",
genres: ["sci-fi"],
releaseYear: 2014,
description: "Space and time exploration"
};

const response = await axios.post(`${baseUrl}/movies`, payload);
expect(response.status).toBe(201);
expect(response.data.title).toBe("Interstellar");
}, 120000);
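
Building on the same running containers, a second test could exercise the search endpoint from the endpoints table. This is a sketch that assumes the endpoint returns HTTP 200 with a JSON array of matching movies; adjust the assertions to the actual response shape of your API:

it("should find movies by fuzzy title search", async () => {
  const baseUrl = `http://${app.getHost()}:${app.getMappedPort(3000)}`;

  // Relies on the movie created in the previous test still being present
  const response = await axios.get(`${baseUrl}/movies/search`, {
    params: { q: "interstel" },
  });

  // Assumed response shape: 200 with an array of matching movies
  expect(response.status).toBe(200);
  expect(response.data.some((m) => m.title === "Interstellar")).toBe(true);
}, 120000);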

This approach allows us to validate that:

The application is properly containerized and starts successfully.

The API behaves correctly in a containerized environment with a real database.

However, that’s just one part of the quality story. Now, let’s turn our attention to the security aspects of the application under development.

Introducing Docker Scout and Docker Hardened Images 

To follow modern best practices, we want to containerize the app and eventually deploy it to production. Before doing so, we must ensure the image is secure by using Docker Scout.

Our Dockerfile takes a multi-stage build approach and is based on the node:22-slim image.

###########################################################
# Stage: base
# This stage serves as the base for all of the other stages.
# By using this stage, it provides a consistent base for both
# the dev and prod versions of the image.
###########################################################
FROM node:22-slim AS base
WORKDIR /usr/local/app
RUN useradd -m appuser && chown -R appuser /usr/local/app
USER appuser
COPY --chown=appuser:appuser package.json package-lock.json ./

###########################################################
# Stage: dev
# This stage is used to run the application in a development
# environment. It installs all app dependencies and will
# start the app in a dev mode that will watch for file changes
# and automatically restart the app.
###########################################################
FROM base AS dev
ENV NODE_ENV=development
RUN npm ci --ignore-scripts
COPY --chown=appuser:appuser ./src ./src
EXPOSE 3000
CMD ["npx", "nodemon", "src/app.js"]

###########################################################
# Stage: final
# This stage serves as the final image for production. It
# installs only the production dependencies.
###########################################################
# Deps: install only prod deps
FROM base AS prod-deps
ENV NODE_ENV=production
RUN npm ci --production --ignore-scripts && npm cache clean --force
# Final: clean prod image
FROM base AS final
WORKDIR /usr/local/app
COPY --from=prod-deps /usr/local/app/node_modules ./node_modules
COPY ./src ./src
EXPOSE 3000
CMD [ "node", "src/app.js" ]

Let's build our image with SBOM and provenance metadata. First, make sure that the containerd image store is enabled in Docker Desktop. We'll also use the buildx command (a Docker CLI plugin that extends docker build) with the --provenance=true and --sbom=true flags. These options attach build attestations to the image, which Docker Scout uses to provide more detailed and accurate security analysis.

docker buildx build --provenance=true --sbom=true -t movie-catalog-service:v1 .

Then set up a Docker organization with security policies and scan the image with Docker Scout: 

docker scout config organization demonstrationorg
docker scout quickview movie-catalog-service:v1

Figure 1: Docker Scout CLI quickview output for the node:22-based movie-catalog-service image

Docker Scout also offers a visual analysis via Docker Desktop.

Figure 2: Image layers and CVEs view in Docker Desktop for node:22 based movie-catalog-service image

In this example, no vulnerabilities were found in the application layer. However, several CVEs were introduced by the base node:22-slim image, including a high-severity CVE-2025-6020, a vulnerability present in Debian 12. This means that any Node.js image based on Debian 12 inherits this vulnerability. A common way to address this is by switching to an Alpine-based Node image, which does not include this CVE. However, Alpine uses musl libc instead of glibc, which can lead to compatibility issues depending on your application’s runtime requirements and deployment environment.

So, what’s a more secure and compatible alternative?

That’s where Docker Hardened Images (DHI) come in. These images follow a distroless philosophy, removing unnecessary components to significantly reduce the attack surface. The result? Smaller images that pull faster, run leaner, and provide a secure-by-default foundation for production workloads:

Near-zero exploitable CVEs: Continuously updated, vulnerability-scanned, and published with signed attestations to minimize patch fatigue and eliminate false positives.

Seamless migration: Drop-in replacements for popular base images, with -dev variants available for multi-stage builds.

Up to 95% smaller attack surface: Unlike traditional base images that include full OS stacks with shells and package managers, distroless images retain only the essentials needed to run your app.

Built-in supply chain security: Each image includes signed SBOMs, VEX documents, and SLSA provenance for audit-ready pipelines.

For developers, DHI means fewer CVE-related disruptions, faster CI/CD pipelines, and trusted images you can use with confidence.

Making the Switch to Docker Hardened Images

Switching to a Docker Hardened Image is straightforward. All we need to do is replace the base image node:22-slim with a DHI equivalent.

Docker Hardened Images come in two variants:

Dev variant (demonstrationorg/dhi-node:22-dev) – includes a shell and package managers, making it suitable for building and testing.

Runtime variant (demonstrationorg/dhi-node:22) – stripped down to only the essentials, providing a minimal and secure footprint for production.

This makes them perfect for use in multi-stage Dockerfiles. We can build the app in the dev image, then copy the built application into the runtime image, which will serve as the base for production.

Here’s what the updated Dockerfile would look like:

###########################################################
# Stage: base
# This stage serves as the base for all of the other stages.
# By using this stage, it provides a consistent base for both
# the dev and prod versions of the image.
###########################################################
# Changed node:22 to dhi-node:22-dev
FROM demonstrationorg/dhi-node:22-dev AS base
WORKDIR /usr/local/app
# DHI comes with nonroot user built-in.
COPY --chown=nonroot package.json package-lock.json ./

###########################################################
# Stage: dev
# This stage is used to run the application in a development
# environment. It installs all app dependencies and will
# start the app in a dev mode that will watch for file changes
# and automatically restart the app.
###########################################################
FROM base AS dev
ENV NODE_ENV=development
RUN npm ci --ignore-scripts
# DHI comes with nonroot user built-in.
COPY --chown=nonroot ./src ./src
EXPOSE 3000
CMD ["npx", "nodemon", "src/app.js"]

###########################################################
# Stage: final
# This stage serves as the final image for production. It
# installs only the production dependencies.
###########################################################
# Deps: install only prod deps
FROM base AS prod-deps
ENV NODE_ENV=production
RUN npm ci --production --ignore-scripts && npm cache clean --force
# Final: clean prod image
# Changed base to dhi-node:22
FROM demonstrationorg/dhi-node:22 AS final
WORKDIR /usr/local/app
COPY --from=prod-deps /usr/local/app/node_modules ./node_modules
COPY ./src ./src
EXPOSE 3000
CMD [ "node", "src/app.js" ]

Let’s rebuild and scan the new image:

docker buildx build --provenance=true --sbom=true -t movie-catalog-service-dhi:v1 .
docker scout quickview movie-catalog-service-dhi:v1

Figure 3: Docker Scout CLI quickview output for the dhi-node:22-based movie-catalog-service image

As you can see, all critical and high CVEs are gone, thanks to the clean and minimal footprint of the Docker Hardened Image.

One of the key benefits of using DHI is the security SLA it provides. If a new CVE is discovered, the DHI team commits to resolving:

Critical and high vulnerabilities within 7 days of a patch becoming available,

Medium and low vulnerabilities within 30 days.

This means you can significantly reduce your CVE remediation burden and give developers more time to focus on innovation and feature development instead of chasing vulnerabilities.

Comparing images with Docker Scout

Let’s also look at the image size and package count advantages of using distroless Hardened Images.

Docker Scout offers a helpful command, docker scout compare, that allows you to analyze and compare two images. We'll use it to evaluate the difference in size and package footprint between the node:22-slim-based and dhi-node:22-based images.

docker scout compare local://movie-catalog-service:v1 --to local://movie-catalog-service-dhi:v1

Figure 4: Comparison of the node:22 and dhi-node:22 based movie-catalog-service images

As you can see, the original node:22-slim based image was 80 MB in size and included 427 packages, while the dhi-node:22 based image is just 41 MB with only 123 packages. 

By switching to a Docker Hardened Image, we reduced the image size by nearly 50 percent and cut the number of packages by a factor of more than three, significantly reducing the attack surface.

Final Step: Validate with local API tests

Last but not least, after migrating to a DHI base image, we should verify that the application still functions as expected.

Since we’ve already implemented Testcontainers-based tests, we can easily ensure that the API remains accessible and behaves correctly.

Let’s run the tests using the npm test command. 

Figure 5: Local API test execution results

As you can see, the container was built and started successfully. In less than 20 seconds, we were able to verify that the application functions correctly and integrates properly with Postgres.

At this point, we can push the changes to the remote repository, confident that the application is both secure and fully functional, and move on to the next task. 

Further integration with external security tools

In addition to providing a minimal and secure base image, Docker Hardened Images include a comprehensive set of attestations. These include a Software Bill of Materials (SBOM), which details all components, libraries, and dependencies used during the build process, as well as Vulnerability Exploitability eXchange (VEX). VEX offers contextual insights into vulnerabilities, specifying whether they are actually exploitable in a given environment, helping teams prioritize remediation.

Let’s say you’ve committed your code changes, built the application, and pushed a container image. Now you want to verify the security posture using an external scanning tool you already use, such as Grype or Trivy. That requires vulnerability information in a compatible format, which Docker Scout can generate for you.

First, you can view the list of available attestations using the docker scout attest command:

docker scout attest list demonstrationorg/movie-catalog-service-dhi:v1 --platform linux/arm64

This command returns a detailed list of attestations bundled with the image. For example, you might see two OpenVEX files: one for the DHI base image and another for any custom exceptions (like no-dsa) specific to your image.

Then, to integrate this information with external tools, you can export the VEX data into a vex.json file. Starting with Docker Scout v1.18.3, you can use the docker scout vex get command to get the merged VEX document from all VEX attestations:

docker scout vex get demonstrationorg/movie-catalog-service-dhi:v1 --output vex.json

This generates a vex.json file containing all VEX statements for the specified image. Tools that support VEX can then use this file to suppress known non-exploitable CVEs.

To use the VEX information with Grype or Trivy, pass the --vex flag during scanning:

trivy image demonstrationorg/movie-catalog-service-dhi:v1 --vex vex.json
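
Grype can consume the same VEX document. As a sketch, assuming a recent Grype release with OpenVEX support and its --vex flag:

grype demonstrationorg/movie-catalog-service-dhi:v1 --vex vex.json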

This ensures your security scanning results are consistent across tools, leveraging the same set of vulnerability contexts provided by Docker Scout.

Conclusion

Shifting left is about more than just early testing. It’s a proactive mindset for building secure, production-ready software from the beginning.

This shift-left approach combines:

Real infrastructure testing using Testcontainers

End-to-end supply chain visibility and actionable insights with Docker Scout

Trusted, minimal base images through Docker Hardened Images

Together, these tools help catch issues early, improve compliance, and reduce security risks in the software supply chain.

Learn more and request access to Docker Hardened Images!
Source: https://blog.docker.com/feed/