AWS Config now supports 52 new resource types

AWS Config now supports 52 additional AWS resource types across key services including Amazon EC2, Amazon Bedrock, and Amazon SageMaker. This expansion provides greater coverage of your AWS environment, enabling you to discover, assess, audit, and remediate an even broader range of resources. With this launch, if you have enabled recording for all resource types, AWS Config automatically tracks these new additions. The newly supported resource types are also available in Config rules and Config aggregators. You can now use AWS Config to monitor the following newly supported resource types in all AWS Regions where those resources are available:

Resource Types

AWS::ApiGateway::DomainName
AWS::ApiGateway::Method
AWS::ApiGateway::UsagePlan
AWS::AppConfig::Extension
AWS::Bedrock::ApplicationInferenceProfile
AWS::Bedrock::Prompt
AWS::BedrockAgentCore::BrowserCustom
AWS::BedrockAgentCore::CodeInterpreterCustom
AWS::BedrockAgentCore::Runtime
AWS::CloudFormation::LambdaHook
AWS::CloudFormation::StackSet
AWS::Comprehend::Flywheel
AWS::Config::AggregationAuthorization
AWS::DataSync::Agent
AWS::Deadline::Fleet
AWS::Deadline::QueueFleetAssociation
AWS::EC2::IPAMPoolCidr
AWS::EC2::SubnetNetworkAclAssociation
AWS::EC2::VPCGatewayAttachment
AWS::ECR::RepositoryCreationTemplate
AWS::ElasticLoadBalancingV2::TargetGroup
AWS::EMR::Studio
AWS::EMRContainers::VirtualCluster
AWS::EMRServerless::Application
AWS::EntityResolution::MatchingWorkflow
AWS::Glue::Registry
AWS::IAM::GroupPolicy
AWS::IAM::RolePolicy
AWS::IAM::UserPolicy
AWS::IoTCoreDeviceAdvisor::SuiteDefinition
AWS::MediaPackageV2::Channel
AWS::MediaPackageV2::ChannelGroup
AWS::MediaTailor::LiveSource
AWS::MSK::ServerlessCluster
AWS::PaymentCryptography::Alias
AWS::PaymentCryptography::Key
AWS::RolesAnywhere::CRL
AWS::RolesAnywhere::Profile
AWS::S3::AccessGrant
AWS::S3::AccessGrantsInstance
AWS::S3::AccessGrantsLocation
AWS::SageMaker::DataQualityJobDefinition
AWS::SageMaker::MlflowTrackingServer
AWS::SageMaker::ModelBiasJobDefinition
AWS::SageMaker::ModelExplainabilityJobDefinition
AWS::SageMaker::ModelQualityJobDefinition
AWS::SageMaker::MonitoringSchedule
AWS::SageMaker::StudioLifecycleConfig
AWS::SecretsManager::RotationSchedule
AWS::SES::DedicatedIpPool
AWS::SES::MailManagerTrafficPolicy
AWS::SSM::ResourceDataSync

To view the complete list of AWS Config supported resource types, see the supported resource types page.
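As a quick check that the new types are being picked up, you can query the Config inventory from the AWS CLI. A minimal sketch, assuming the AWS CLI is configured and recording is already enabled in the Region; the two resource types shown are just examples from the new list:

# List discovered resources of one newly supported type
aws configservice list-discovered-resources \
  --resource-type AWS::ApiGateway::DomainName

# Or run an advanced query against the recorded configuration items
aws configservice select-resource-config \
  --expression "SELECT resourceId, resourceType WHERE resourceType = 'AWS::S3::AccessGrant'"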
Source: aws.amazon.com

Amazon CloudWatch Synthetics adds multi-browser support in AWS GovCloud Regions

Amazon CloudWatch Synthetics multi-browser support is now available in the AWS GovCloud (US-East, US-West) Regions. This expansion enables customers in these two regions to test and monitor their web applications using both Chrome and Firefox browsers. With this launch, you can run the same canary script across Chrome and Firefox when using Playwright-based canaries or Puppeteer-based canaries. CloudWatch Synthetics automatically collects browser-specific performance metrics, success rates, and visual monitoring results while maintaining an aggregate view of overall application health. This helps development and operations teams quickly identify and resolve browser compatibility issues that could affect application reliability. To learn more about configuring multi-browser canaries, see the canary docs in the Amazon CloudWatch Synthetics User Guide. 
Source: aws.amazon.com

How scientists can leverage AI agents using Gemini Enterprise, Gemini Code Assist, and Gemini CLI

Scientific inquiry has always been a journey of curiosity, meticulous effort, and groundbreaking discoveries. Today, that journey is being redefined, fueled by the incredible capabilities of AI. It’s moving beyond simply processing data to actively participating in every stage of discovery, and Google Cloud is at the forefront of this transformation, building the tools and platforms that make it possible. 
The sheer volume of data generated by modern research is immense, often too vast for human analysis alone. This is where AI steps in, not just as a tool, but as a collaborative force. We’re seeing powerful new models and AI agents assist with everything from identifying relevant literature and generating novel hypotheses to designing experiments, running simulations, and making sense of complex results. This collaboration doesn’t replace human intellect; it amplifies it, allowing researchers to explore more avenues, more quickly, and with greater precision. 
At Google Cloud, we’re bringing together high-performance computing (HPC) and advanced AI on a single, integrated platform. This means you can seamlessly move from running massive-scale simulations to applying sophisticated machine learning models, all in one environment. 
So, how can you leverage these capabilities to get to insights faster? The journey begins at the foundation of scientific inquiry: the hypothesis.
AI-enhanced scientific inquiry
Every great discovery starts with a powerful hypothesis. With millions of research papers published annually, identifying novel opportunities is a monumental task. To overcome this information overload, scientists can now turn to AI as a powerful research partner.
Our Deep Research agent tackles the first step: performing a comprehensive analysis of published literature to produce detailed reports on a given topic that would otherwise take months to compile. Building on that foundation, our Idea Generation agent then deploys an ensemble of AI collaborators to brainstorm, evaluate, propose, debate, and rank novel hypotheses. This powerful combination, available in Gemini Enterprise, transforms the initial phase of scientific inquiry, empowering researchers to augment their expertise and find connections they might otherwise miss.
Go from hypothesis to results, faster
Once a hypothesis is formed, the work of translating it into executable code begins. This is where AI coding assistants, such as Gemini Code Assist, excel. They automate the tedious tasks of writing analysis scripts and simulation models by generating code from natural language and providing real-time suggestions, dramatically speeding up the core development process. 
But modern research is more than just a single script; it’s a complete workflow of data, environments, and results managed from the command line. For this, Gemini CLI brings that same conversational power directly to your terminal. It acts as the ultimate workflow accelerator, allowing you to instantly synthesize research and generate hypotheses with simple commands, then seamlessly transition to experimentation by generating sophisticated analysis scripts, and debugging errors on the fly, all without ever breaking your focus. Gemini CLI can further accelerate your path to impact by transforming raw results into publication-ready text, generating the code for figures and tables, and refining your work for submission. 
This capability extends to automating the entire research environment. Beyond single commands, Gemini CLI can manage complex, multi-step processes like cloning a scientific application, installing its dependencies, and then building and testing it—all with a simple prompt, maximizing your productivity.
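To make this concrete, here is a minimal sketch of what such a prompt-driven workflow can look like in a terminal session; the repository URL and the wording of the instruction are purely illustrative:

# Start an interactive Gemini CLI session in your working directory
gemini
# Then, at the prompt, a single natural-language instruction can drive the whole workflow, e.g.:
# > Clone https://github.com/example/science-app, install its dependencies, build the project, and run its test suite.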
The new era of discovery: Your expertise, AI agents, and Google Cloud
The new era of scientific discovery is here. By embedding AI into every stage of the scientific process – from sparking the initial idea to accelerating the final analysis – Google Cloud provides a single, unified platform for discovery. This new era of AI-enhanced scientific inquiry is built on a robust, intelligent infrastructure that combines the strengths of HPC simulation and AI. This includes purpose-built solutions like our H4D VMs optimized for scientific simulations, alongside our A4 and A4X VMs powered by the latest NVIDIA GPUs, and Google Cloud Managed Lustre, a parallel file system that eliminates storage bottlenecks and allows your HPC and AI workloads to create and analyze massive datasets simultaneously. We provide the power to streamline the entire process so you can focus on scientific creativity – and changing the world!
Join the Google Cloud Advanced Computing Community to connect with other researchers, share best practices, and stay up to date on the latest advancements in AI for scientific and technical computing, or contact sales to get started today.
Source: Google Cloud Platform

How to Use Multimodal AI Models With Docker Model Runner

One of the most exciting advances in modern AI is multimodal support, the ability for models to understand and generate multiple types of input, such as text, images, or audio. 

With multimodal models, you’re no longer limited to typing prompts; you can show an image or play a sound, and the model can understand it. This opens a world of new possibilities for developers building intelligent, local AI experiences. In this post, we’ll explore how to use multimodal models with Docker Model Runner, walk through practical examples, and explain how it all works under the hood.

What Is Multimodal AI?

Most language models only understand text, but multimodal models go further. They can analyze and combine text, image, and audio data. That means you can ask a model to:

Describe what’s in an image

Identify or reason about visual details

Transcribe or summarize an audio clip

This unlocks new ways to build AI applications that can see and listen, not just read.

How to use multimodal models

Not every model supports multimodal inputs, so your first step is to choose one that does.

On Docker Hub, each model’s card indicates which inputs it supports. For example:

Moondream2, Gemma3, and Smolvlm support text and image as input, while GPT-OSS supports text only.

The easiest way to start experimenting is with the CLI. Here’s a simple example that asks a multimodal model to describe an image:

docker model run gemma3 "What's in this image? /Users/ilopezluna/Documents/something.jpg"
The image shows the logo for **Docker**, a popular platform for containerization.

Here's a breakdown of what you see:

* **A Whale:** The main element is a stylized blue whale.
* **A Shipping Container:** The whale's body is shaped like a shipping container.
* **A Stack of Blocks:** Inside the container are a stack of blue blocks, representing the layers and components of an application.
* **Eye:** A simple, white eye is featured.

Docker uses this iconic whale-container image to represent the concept of packaging and running applications in isolated containers.

Using the Model Runner API for More Control

While the CLI is great for quick experiments, the API gives you full flexibility for integrating models into your apps. Docker Model Runner exposes an OpenAI-compatible API, meaning you can use the same client libraries and request formats you already know, just point them to Docker Model Runner.

Here’s an example of sending both text and image input:

curl --location 'http://localhost:12434/engines/llama.cpp/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "model": "ai/gemma3",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "describe the image"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII"
          }
        }
      ]
    }
  ]
}'

Run Multimodal models from Hugging Face

Thanks to our friends at Hugging Face (special shout-out to Adrien Carreria), you can also run multimodal models directly from Hugging Face in Docker Model Runner.
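If the model is not already available locally, you can pull it from Hugging Face first with the Model Runner CLI (the same reference is then used in the API request below):

docker model pull hf.co/ggml-org/ultravox-v0_5-llama-3_1-8b-gguf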

Here’s an example using a model capable of audio transcription:

curl --location 'http://localhost:12434/engines/llama.cpp/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "hf.co/ggml-org/ultravox-v0_5-llama-3_1-8b-gguf",
"temperature": 0,
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "transcribe the audio, one word"
},
{
"type": "input_audio",
"input_audio": {
"data": "//PoxAB8RA5OX53xAkRAKBwMBQLRYMhmLDEXQQI0CT8QFQNMawxMYiQtFQClEDCgjDHhCzjtkQuCpb4tQY+IgZ3bGZtttcm+GnGYNIBBgRAAGB+AuYK4N4wICLCPmW0GqZsapZtCnzmq4V4AABgQA2YGoAqQ5gXgWDoBRgVBQCQWxgXAXgYEswTQHR4At2IpfL935ePAKpgGACAAvlkDP2ZrfMBkAdpq4kYTARAHCoABgWAROIBgDPhUGuotB/GkF1EII2i6BgGwwAEVAAMAoARpqC8TRUMFcD2Ly6AIPTnuLEMAkBgwVALjBsBeMEABxWAwUgSDAIAPMBEAMwLAPy65gHgDmBgBALAOPIYDYBYVARMB0AdoKYYYAwYAIAYNANTcMBQAoEAEmAcApBRg+g5mCmBGIgATAPBFMEsBUwTwMzAXAHfMwNQKTAPAPDAHwcCoCAGkHAwBNYRHhYBwWhhwBEPyQuuHAJwuSmAOAeEAKLBSmQR2zbhC81/ORKWnsDhhrjlrxWcBgI2+hCBiAOXzMGLoTHb681deaxoLMAUAFY5gHgCxaTQuIZUsmnpTXVsglpKonHlejAXAHXHOJ0QxnHJyafakpJ+CJAziA/izoImFwFIO/E37iEYij/0+8s8c/9YJAiAgADAHADa28k4sSA3vhE9GrcCw/lPpTEFNRQMACQgABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB6jBZABd3gAADj6KbWk8H+MCSbKw3Jgvlxg+JpjWF5uKl4QJgiEcw5EIyCSY3E4IyvS0wFHwz3E8wrG0yzIU8E7Q5zK8xTFwwbE0HDmYaXhvZCqGoKGEgIFRyZzQxmZ2mXBOY1Aw4LDDyIN/F47SVzdzIMqAowELgCszjgFMvmMxiHzE4hMLicyaGQUaTCoDfuaSaIhgLAsuAFQSYRC4sMxISiQtMGi0ymWTKYvCoHMyjUwAJDIBIMbAhPIKgsdACDpgkFoVGLB8Fg+YTHpkoEGFxCFh0DBeYeECPyEBgQEGDxSCRijDSLJYGwBE4wOBjDYABwWLAMC4fXCFiYHEsuGCQcY2BIcQBIqGAhGYjD5iAKGNwOYgLplAAv2OgJEsCBUwsHDBILBQuEAiMnBcDHIw4AgsACgIlAJDkGY6OICSsgEh2FBOYfCwMBLcJuHE/0elvMKaw1UHBNFB9IdQDxxxH2V/AvvK9cPSJonarWZcyeYd2XQ3BLhUD0yvrpQK0hscP0UesPM0FgDjoAEb1VntaO5MPzDYnJpn4fd9EnS5isDTQSGQoAAEFzAwhBQLTQAQIdi1Arwvo4z6t9FoCcdw2/biq9fDTQ4NrsKBCFwGRDYIAiB7PFPPczALAJS4UAK3G7Sle95iVl+qKL00NXaWsmIKaigYAEhAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxABxxBZIBua3UAy9KUzYdYeFxCZJEQOExhYDGHg4bB7J5W0GV0QYdAhig3G9IQbDT53SunFlCZmoh0DsGmVOc6bZmZqnK0Ga7ABrYqmWUsZSNZeMwgDhYJlULjAfFQGOlAwYfTGYBMMDUmZgYazW4aNBM8w0XD5JMPDxo1KQjLilMbBA24QLviy5lAxhwzWwaFaIk+YIKg5cIAQKKgw4bI6DjJsEqERhHZtCBCdNInOgRVnMAEWNARfMuyIAxnwAJGGlBA0YFiQYSFggKBmHDlAcxIUmQsEX9HF/R1YUeDNzJiKZKgMLBwsAhE5pSCQiDiK6bJfUOCCxswBgmKo4IjrwAoWCQ8wgtMpUjOYEZE/DAYEgwNGxIIMMAzBAAdAbK/qVDxv2wWN3WXNJX0opEXta2XUQBMrAACNAhh4IECTV4CRXaQzqUsScKOypSqiemQTMxelkY6/ucCu1QwxfuSajv1pSzmXrJRxZK4Hxb2Fr7dJR+H2mlYcXmFQEmCEzR6BFCAxxTDjIRDANCVLW3LR0MKaE2N41VmiIpO+UB4sFpfoK1TFB0HCiwKBgkqhx0YKCDQQjWXlXmBgQLg6mSLCbSv2Gs8i0OL4h56926SbxTEFNRQMACQgABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB19BY4BOd00gCrS9rgyDTAxEMTgsQDA0HNzlqJNvgM31dzOAbHRAYyXZqNHG8TwaPCBhMHmn1oaiThh8gmQXQatWOcLz8dTEgaBoaYkmwYvDcaECZBZg8FJiQQhekwbCcwtE8tUYSCeYRAkJECY9BoZQjoYgmEY3r8Y2hEYsnaZuryaPT4ba1aZqr8bjGkZCmSAQqMSALCAeBoFmCoFllgUEQdJB0cCuhaSYJcYowRIjkmjWizNGTDLTOjzQRUigKFb1TktU4iqIGCF6QI1CAIWDgEAgUZUYTJwoZDwhqCpsTpCFEA8s+utVJYcQNwaPMzTDI4hRmVAmICGXOm5FmDEIBCak2hg3Znx50Z5k4o07SAAAMFHBATAWIR8gpFNonBX8xH0zxcAhw40a5aaAqYQ+Y1CYdWHIk2n/SkVUWRLJAomXnZu8CKb+iwxE+Wui1JZZgRTvzzPonOOxYoYGgNmyuGTKnfSRxaTu3duS57aaNvtMSt4qaVxqYdWKwcytpaiDNbb4Sq1UoGwOU5bKJYoGmUwNEx3VzCMIoHSMMTnmHaL/Splj9MZZs3MOBgwWSDKhMYS0WFLvGADiEimQXbFCLuIcVGgsOgd9AcUTCfyOKFLEWLAsafeOQmnpbMWpUxBTUUDAAkIAAQAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAB0zAI0AO7fcaG1AIUAUDAI3ph2SpjEhxkiG4yVAGjcysGkaCkwaG4xgFcLA8BhFMSxnMAyVAoRiMkzGlnDKqtT66qTfpcjBtgTE2UjUxyzaRVmsmXA1GWASGAIbmA4cgkDDAMjjEkFDDMCAgdjFANjDEUjCcRzJUlRIrTW8/TIsDjwdgDKlbjetyTIw4TdoxRpojFwBDFwAwgIQqCQCBMwxEIxcvSzGFI1MuCzUUBpoQOI3QIQTAEEVOjZQUysGMYDDBgsoC2ENGGAFMsEAAJJQIEIC37MBHBwKCDcxJCNTOTBCF5DTg0i3zKQwRiJh4NfJAIwV1OTKjThszB+N4vwgCNfbDSyMxs+NFLjJV438TN2OwcsklwZovmLFRkqsQioiIwhTHZ8wQ0MihzXkU2iKFGF5gQaAwaMJAxIqMqSTGBYwZrNEMTHyBREaAwACLpkCXgGD08Q+gJoJEbxNlwyk9To0S2hXloiTaSYuS92ZGqdQKITZRsrgm0XROrCGZdztDUiI7o7Hpf08ex8P0LFiTByoa+P1SF0sgHBCFceqNS0N0bpYFcN8K8XNclxOsfQpSeNwviFGgk1KRAtiKJsW00VWnamIPEzblsJwfCsQs60gjPPi8sOcBrEpiCmooGABIQAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA//PoxAByRBYwBt800Am7elzDkQl9GYCJm2CfcMHAmBjsIcZfGLogAETDRMwkkAJURaYURjMjkBX5lwqBl85jBzsAHNIK0w/RjyJSN8L804rDXpaMXBQx6fAERzN4XM8FkwiDDDYqQDGBgKBSsYPD4YOjArTN7Gkxq7TcJKNpKw+2GzsFXNEYU5cCzIoDMcDsDAZOYwaDEjgcAgSTM8LABMyAw1qZIY0wdBgmZJ7AkOEDSAaELUlF1sajKOyX6mCH4ECjoZLN+RCEMaBMCQBVMWWtbNjKUGDAQUBqNjgJfxsyB8AYOQgS8bW4RezPPwURRWRILoFgOUcTGljLtwFjAxoza446IEAwsLMYHFqJn1xwGJsD5j3CKaNAkBbQHElGF4CFAEGggs6TIH+f5yGXvCsKncX7JAqcjSFlszX9QqprCR6/Ik9oCUtjc1yJ138kr+P/QMfdymbpDLPJxrVPYhDouhDbPU7lAmpP68cWcVsqqqLC+s5Q5DWJtwlBl9LSUqLAJg6TrGVyEQtK8wgAqizFDEo1xLQW8vd2WMLte5xkqBoEwVZRVDqKIghCllhfZwyYIhCniDB4QRayY9IY4i5S1k5FIq3InM6aVLZbGWuwK2SVUl9MQU1FAwAJCAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
"format": "mp3"
}
}
]
}
]
}'

This model can listen and respond, literally.

Multimodal AI example: A real-time webcam vision model

We have created a couple of demos in the Docker Model Runner repository, and of course, we couldn’t miss a demo based on ngxson’s example.

You can run this same demo by following the instructions; spoiler alert: it’s just Docker Model Runner and an HTML page.

How multimodal AI works: Understanding audio and images

How are these large language models capable of understanding images or audio? The key is something called a multimodal projector file.

This file acts as an adapter, a small neural network layer that converts non-text inputs (like pixels or sound waves) into a token representation the language model can understand. Think of it as a translator that turns visual or auditory information into the same kind of internal “language” used for text.

In simpler terms:

The projector takes an image or audio input

It processes it into numerical embeddings (tokens)

The language model then interprets those tokens just like it would words in a sentence

This extra layer allows a single model architecture to handle multiple input types without retraining the entire model.

Inspecting the Projector in OCI Artifacts

In Docker Model Runner, models are packaged as OCI artifacts, so everything needed to run the model locally (weights, configuration, and extra layers) is contained in a reproducible format.

You can actually see the multimodal projector file by inspecting the model’s OCI layers. For example, take a look at ai/gemma3.

You’ll find a layer with the media type: “application/vnd.docker.ai.mmproj”.

This layer is the multimodal projector, the component that makes the model multimodal-capable. It’s what enables gemma3 to accept images as input in addition to text.
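One way to see this for yourself is to dump the model’s manifest with any OCI registry client; here is a sketch using crane (the tool choice and the latest tag are assumptions, not part of Docker Model Runner itself):

# Print the OCI manifest for ai/gemma3 and look for the projector layer's media type
crane manifest ai/gemma3:latest | grep -i mmproj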

We’re Building This Together!

Docker Model Runner is a community-friendly project at its core, and its future is shaped by contributors like you. If you find this tool useful, please head over to our GitHub repository. Show your support by giving us a star, fork the project to experiment with your own ideas, and contribute. Whether it’s improving documentation, fixing a bug, or adding a new feature, every contribution helps. Let’s build the future of model deployment together!

Learn more

Check out the Docker Model Runner General Availability announcement

Visit our Model Runner GitHub repo! Docker Model Runner is open-source, and we welcome collaboration and contributions from the community!

Get started with Docker Model Runner with a simple hello GenAI application

Source: https://blog.docker.com/feed/

Mountpoint for Amazon S3 and Mountpoint for Amazon S3 CSI driver add monitoring capability

You can now monitor Mountpoint operations in observability tools such as Amazon CloudWatch, Prometheus, and Grafana. With this launch, Mountpoint emits near real-time metrics such as request count or request latency using OpenTelemetry Protocol (OTLP), an open source data transmission protocol. This means you can use applications such as the CloudWatch agent or the OpenTelemetry (OTel) collector to publish the metrics into observability tools and create dashboards for monitoring and troubleshooting. Previously, Mountpoint emitted operational data into log files, and you needed to create custom tools to parse the log files for insights.

Now, when you mount your Amazon S3 bucket, you can configure Mountpoint to publish the metrics to an observability tool to proactively monitor issues that might impact your applications. For example, you can check whether an application is unable to access S3 due to permission issues by analyzing the S3 request error metric, which provides error types at Amazon EC2 instance granularity. Follow the step-by-step instructions to set up the CloudWatch agent or the OTel collector and configure Mountpoint to publish metrics into an observability tool. For more information, visit the Mountpoint for Amazon S3 GitHub repository, the Mountpoint product page, and the Mountpoint for Amazon S3 CSI driver GitHub page.
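For reference, here is a minimal sketch of an OpenTelemetry Collector setup that receives the OTLP metrics Mountpoint emits and exposes them for Prometheus to scrape; it assumes the Collector’s contrib distribution, the default OTLP ports, and an illustrative scrape endpoint:

# Write a minimal Collector config: receive OTLP from Mountpoint, expose metrics for Prometheus
cat > otel-collector-config.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
EOF
# Point the contrib distribution of the OTel Collector at this file, then configure Mountpoint
# to emit OTLP metrics to localhost:4317 as described in the step-by-step instructions above.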
Source: aws.amazon.com