Docker Desktop 4.24: Compose Watch, Resource Saver, and Docker Engine

We’re excited to share this month’s highlights that will further improve your Docker experience. Our commitment to supporting your development journey has led to enhancements across our tools, and today, we’re pleased to announce the official General Availability of Docker Compose Watch and Resource Saver. Combined with our new enhancements to managing Docker Engine in Docker Desktop, these updates will help you be more efficient and make your software development experience more enjoyable.

Docker Compose Watch is now Generally Available

The Docker Compose Watch GA release marks a significant milestone in our journey. First introduced as an alpha feature, the tool is now faster, more resilient, and ready to support your development needs.

We’ve been listening to your feedback since its initial alpha launch (introduced in Compose v2.17 and bundled with Docker Desktop 4.18). Our goal was to make it faster and more robust, ensuring a smoother development experience.

We created Docker Compose Watch to enhance your workflow by providing native support for common development tasks, such as hot reloading for front-end development.

Figure 1: Docker Compose Watch configuration.

Figure 2: Docker Compose Watch gives developers more control over how local file changes sync into the container.
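To give a concrete picture, here is a minimal sketch of what such a watch configuration might look like in a Compose file; the service name, image tag, and paths are placeholders rather than anything from a specific project:

services:
  web:
    build: .
    image: example/my-web-app   # hypothetical image name
    develop:
      watch:
        # Sync local source edits into the running container
        - action: sync
          path: ./src
          target: /app/src
        # Rebuild the image when dependency manifests change
        - action: rebuild
          path: package.json

With a setup like this, edits under ./src are pushed into the container so a framework’s hot reload can pick them up, while a change to package.json triggers a rebuild of the image.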

These improvements mean fewer hiccups during everyday tasks, such as merging branches or switching codebases. Docker Compose Watch now intelligently manages changes, allowing you to focus on what matters most — building great software.

As Docker Compose Watch transitions to General Availability, we thank you for your support and feedback. Your insights have been invaluable in shaping this tool.

Resource Saver is now Generally Available

Resource Saver, our performance-enhancement feature, is now Generally Available, bringing automatic low-memory mode to Mac, Windows, and Linux.

This new feature automatically detects when Docker Desktop is not running containers and dramatically reduces its memory footprint by 10x, freeing up valuable resources on developers’ machines for other tasks and minimizing the risk of lag when navigating across different applications. Memory allocation can now be quick and efficient, resulting in a seamless and performant development experience.

Figure 3: Docker Desktop resource saver settings tab.

Resource Saver is enabled by default for all Docker Desktop users and can be configured from the Resources tab in Settings. For more information, refer to the Docker Desktop Resource Saver mode documentation.

Docker Desktop streamlines Docker Engine control: A user-centric upgrade

At Docker, we value your feedback, and one of the most frequently requested features has been an enhancement to Docker Engine’s status and associated actions in Docker Desktop. Listening to your input, we’ve made some straightforward yet impactful UX improvements:

Constant engine status: You’ll now see the engine status at all times, eliminating the need to hover for tooltips.

One-click actions: Common engine and desktop actions like start, pause, and quit are now easily accessible from the dashboard, reducing clicks for everyday tasks.

Enhanced menu visibility: We’ve revamped the menu for greater prominence, making it easier to find essential features, such as Troubleshoot.

What’s in it for you? A more user-friendly Docker experience that minimizes clicks, reduces cognitive load, and provides quicker access to essential actions. We want to hear your thoughts on these improvements, so don’t hesitate to share your feedback via the Give Feedback option in the whale menu!

Figure 4: Docker Engine status interactive interface supporting stop, start, and pause.

Conclusion

Upgrade now to explore what’s new in the 4.24 release of Docker Desktop. Have feedback? Share it on our public GitHub roadmap, and let us know what else you’d like to see in upcoming releases.

Learn more

Read the Docker Desktop Release Notes.

Get the latest release of Docker Desktop.

Learn more about Resource Saver Mode in Docker Desktop. 

Learn more about Docker Compose Watch.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Announcing Docker Compose Watch GA Release

Docker Compose Watch, a tool to improve the inner loop of application development, is now generally available. Hot reload is one of those engineering workflow features that’s seemingly minor and simple but has cumulative benefits. If you can trust your app will update seamlessly as you code, without losing state, it’s one less thing pulling your focus from the work at hand. You can see your frontend components come to life while you stay in your IDE. 

With containerized application development, there are more steps than Alt+Tab and hitting reload in your browser. Even with caching, rebuilding the image and re-creating the container — especially after waiting on stop and start time — can disrupt focus.

We built Docker Compose Watch to smooth away these workflow papercuts. We have learned from many people using our open source Docker Compose project for local development. Now we are natively addressing common workflow friction we observe, like the use case of hot reload for frontend development. 

Bind mount vs. Watch

A common workaround to get hot reload to work is to set up a bind mount to mirror file changes between the local system and a container. This method uses operating system and hypervisor APIs to expose a local directory to the otherwise isolated file system in the container.
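As a rough sketch (the service name and paths are illustrative), the workaround usually looks like a volume entry in the Compose file:

services:
  web:
    build: .
    volumes:
      # Bind-mount the local source directory into the container
      - ./src:/app/src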

The workaround is not trivial engineering, since bind mounts behave differently in Docker Desktop than in Docker Engine on Linux. For parity, Docker Desktop must provide seamless and efficient file sharing between your machine and its virtual machine (VM), ensuring permissions, replicating file notifications, and maintaining low-level filesystem consistency to prevent corruption.

In contrast, Docker Compose Watch is specifically targeting development use cases. You may want to ensure your code changes sync into the container, allowing React or NextJS to kick off its own live reload. However, you don’t want the changes you’ve made in the container for ad-hoc testing to reflect in your local directory. For this reason, the tradeoffs we make for Docker Compose Watch favor fine-grained control for common development workflows with Docker Compose (Figures 1 and 2).

Figure 1: Docker Compose Watch configuration.

Figure 2: Docker Compose Watch gives developers more control over how local file changes sync into the container.
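For comparison, a hedged sketch of the watch-based equivalent: changes flow one way, from your editor into the container, and noisy paths can be excluded (the names and paths below are placeholders):

services:
  web:
    build: .
    develop:
      watch:
        - action: sync
          path: ./src
          target: /app/src
          # Skip churn from installed dependencies
          ignore:
            - node_modules/

Because the sync is one-way, ad-hoc edits made inside the container never leak back into your local working tree.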

Improving Watch for development

Since the alpha launch (in Compose v2.17, bundled with Docker Desktop 4.18), we’ve responded to early feedback by making Docker Compose Watch faster and more robust. This improvement avoids hiccups on common development tasks that kick off many changes, such as merging in the latest main or switching branches. 

Your code sync operation now batches, debounces, and ignores unimportant changes:

What previously would be many API calls are now batched as a single API call to the Docker Engine.

We’ve fine-tuned the streaming of changes to your containers for improved transfer performance. A new built-in debounce mechanism prevents unnecessary transfers in case of back-to-back writes to the same file. This optimizes CPU usage by preventing unnecessary incremental compiles.

The built-in filters have been refined to ignore standard temporary files generated by common code editors and integrated development environments (IDEs).

Previously, Docker Compose Watch required attaching to an already running Compose project. Docker Compose Watch now automatically builds and starts all required services at launch. One command is all you need: docker compose watch.
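In practice, assuming a Compose file with a watch section like the sketches above, the whole loop boils down to:

# Build images, start the services, and begin watching for file changes
docker compose watch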

Try Docker Compose Watch in Docker Desktop 4.24

As of Compose 2.22.0, bundled with Docker Desktop 4.24, Docker Compose Watch is now Generally Available. Make sure to upgrade to the latest version of Docker Desktop and develop more efficiently with docker compose watch.
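If you’re not sure which Compose version your installation bundles, a quick check is:

docker compose version
# Expect v2.22.0 or later for the GA watch command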

Let us know how Docker Compose Watch supports your use case and where it can improve. Or you can contribute directly to the open source Docker Compose project.

Learn more

Read the Docker Desktop Release Notes.

Learn about Docker Desktop 4.24.

Get the latest release of Docker Desktop.

Contribute to the open source Docker Compose project.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Announcing Docker AI/ML Hackathon 

With the return of DockerCon, held October 4-5 in Los Angeles, we’re excited to announce the kick-off of a Docker AI/ML Hackathon. Join us at DockerCon — in-person or virtually — to learn about the latest Docker product announcements. Then, bring your innovative artificial intelligence (AI) and machine learning (ML) solutions to life in the hackathon for a chance to win cool prizes.

The Docker AI/ML Hackathon is open from October 3 – November 7, 2023. DockerCon in-person attendees are invited to the dedicated hackspace, where you can chat with fellow developers, Dockhands, and our partners Datastax, Navan.ai, Neo4J, OctoML, and Ollama. 

We’ll also host virtual webinars, Q&A, and engaging chats throughout the next five weeks to keep the ideas flowing. Register for the Docker AI/ML Hackathon to participate and to be notified of event activities.

Hackathon tips

Docker AI/ML Hackathon participants are encouraged to build solutions that are innovative, applicable in real life, use Docker technology, and have an impact on developer productivity.  Submissions can also be non-code proof-of-concepts, extensions that improve Docker workflows, or integrations to improve existing AI/ML solutions.  

Solutions should be AI/ML projects or models built using Docker technology and distributed through Docker Hub, AI/ML integrations into Docker products that improve the developer experience, or extensions of Docker products that make working with AI/ML more productive.

Submissions should be a working application or a non-code proof of concept. We would like to see submissions as close to a real-world implementation as possible, but we will accept submissions that are not fully functional with a strong proof of concept. Additionally, all submissions should include a 3-5 minute video that showcases the hack along with background and context (we will not judge the submission on the quality or editing of the video itself). 

After submitting your solution, you’ll be in the running for $20,000 in cash prizes and exclusive Docker swag. Judging will be based on criteria such as the applicability of the solution, innovativeness of the solution, incorporation of Docker tooling, and impact on the developer experience and productivity.

Get started 

Follow the #DockerHackathon hashtag on social media platforms and join the Docker AI/ML Hackathon Slack channel to connect with other participants.

Check out the site for full details about the Docker AI/ML Hackathon and register to start hacking today! 

Submissions close on November 7, 2023, at 5 PM Pacific Time (November 8 at 1 AM UTC).

Learn more

Register for the Docker AI/ML Hackathon.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Get Started with the Microcks Docker Extension for API Mocking and Testing

In the dynamic landscape of software development, collaborations often lead to innovative solutions that simplify complex challenges. The Docker and Microcks partnership is a prime example, demonstrating how the relationship between two industry leaders can reshape local application development.

This article delves into the collaborative efforts of Docker and Microcks, spotlighting the emergence of the Microcks Docker Desktop Extension and its transformative impact on the development ecosystem.

What is Microcks?

Microcks is an open source Kubernetes and cloud-native tool for API mocking and testing. It has been a Cloud Native Computing Foundation Sandbox project since summer 2023.  

Microcks addresses two primary use cases: 

Simulating (or mocking) an API or a microservice from a set of descriptive assets (specifications or contracts) 

Validating (or testing) the conformance of your application against your API specification by conducting contract tests

The unique thing about Microcks is that it offers a uniform and consistent approach for all kinds of request/response APIs (REST, GraphQL, gRPC, SOAP) and event-driven APIs (currently supporting eight different protocols) as shown in Figure 1.

Figure 1: Microcks covers all kinds of APIs.

Microcks speeds up the API development life cycle by shortening the feedback loop from the design phase and easing the pain of provisioning environments with many dependencies. All these features make Microcks a great help in enforcing backward compatibility of your API and microservice interfaces.

So, for developers, Microcks brings consistency, convenience, and speed to your API lifecycle.

Why run Microcks as a Docker Desktop Extension?

Although Microcks is a powerhouse, running it as a Docker Desktop Extension takes the developer experience, ease of use, and rapid iteration in the inner loop to new levels. With Docker’s containerization capabilities seamlessly integrated, developers no longer need to navigate complex setups or wrestle with compatibility issues. It’s a plug-and-play solution that transforms the development environment into a playground for innovation.

The simplicity of running Microcks as a Docker extension is a game-changer. Developers can effortlessly set up and deploy Microcks in their existing Docker environment, eliminating the need for extensive configurations. This ease of use empowers developers to focus on what they do best — building and testing APIs rather than grappling with deployment intricacies.

In agile development, rapid iterations in the inner loop are paramount. Microcks, as a Docker extension, accelerates this process. Developers can swiftly create, test, and iterate on APIs without leaving the Docker environment. This tight feedback loop ensures developers identify and address issues early, resulting in faster development cycles and higher-quality software.

The combination of two best-of-breed projects, Docker and Microcks, provides: 

Streamlined developer experience

Ease of use at its core

Rapid iterations in the inner loop

Extension architecture

The Microcks Docker Desktop Extension has an evolving architecture depending on your enabling features. The UI that executes in Docker Desktop manages your preferences in a ~/.microcks-docker-desktop-extension folder and starts/stops/cleans the needed containers.

At its core, the architecture (Figure 2) embeds two minimal elements: the Microcks main container and a MongoDB database. The different containers of the extension run in an isolated Docker network where only the HTTP port of the main container is bound to your local host.

Figure 2: Microcks extension default architecture.

Through the Settings panel offered by the extension (Figure 3), you can tune the port binding and enable more features, such as:

Support for mocking and testing asynchronous APIs via AsyncAPI, using Kafka and WebSocket

The ability to run Postman collection tests in Microcks

Figure 3: Microcks extension Settings panel.

When applied, your settings are persistent in your ~/.microcks-docker-desktop-extension folder, and the extension augments the initial architecture with the required services. Even though the extension starts with additional containers, they are carefully crafted and chosen to be lightweight and consume as few resources as possible. For example, we selected the Redpanda Kafka-compatible broker for its super-light experience. 

The schema shown in Figure 4 illustrates such a “maximal architecture” for the extension:

Figure 4: Microcks extension maximal architecture.

The Docker Desktop Extension architecture encapsulates the convergence of Docker’s containerization capabilities and Microcks’ API testing prowess. This collaborative endeavor presents developers with a unified interface to toggle between these functionalities seamlessly. The architecture ensures a cohesive experience, enabling developers to harness the power of both Docker and Microcks without the need for constant tool switching.

Getting started

Getting started with the Docker Desktop Extension is a straightforward process that empowers developers to leverage the benefits of unified development. The extension can be easily integrated into existing workflows, offering a familiar interface within Docker. This seamless integration streamlines the setup process, allowing developers to dive into their projects without extensive configuration.

Here are the steps for installing Microcks as a Docker Desktop Extension:

1. Choose Add Extensions in the left sidebar (Figure 5).

Figure 5: Add extensions in the Docker Desktop.

2. Switch to the Browse tab.

3. In the Filters drop-down, select the Testing Tools category.

4. Find Microcks and then select Install (Figure 6).

Figure 6: Find and open Microcks.
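If you prefer the command line, the Docker Extensions CLI can install it as well; note that the image name below is our assumption of the published extension image, so verify it on Docker Hub first:

# Install the Microcks extension (image name assumed)
docker extension install microcks/microcks-docker-desktop-extension:latest

# Confirm the extension is installed
docker extension ls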

Launching Microcks

The next step is to launch Microcks (Figure 7).

Figure 7: Launch Microcks.

The Settings panel allows you to configure some options, such as whether to enable the asynchronous API features (disabled by default) and whether to set an offset for the ports used to access the services (Figures 8 and 9).

Figure 8: Microcks is up and running.

Figure 9: Access asynchronous APIs and services.

Sample app deployment

To illustrate the real-world implications of the Docker Desktop Extension, consider a sample application deployment. As developers embark on local application development, the Docker Desktop Extension enables them to create, test, and iterate on their containers while leveraging Microcks’ API mocking and testing capabilities.

This combined approach ensures that the application’s containerization and API aspects are thoroughly validated, resulting in a higher quality end product. Check out the three-minute “Getting Started with Microcks Docker Desktop Extension” video for more information.

Conclusion

The Docker and Microcks partnership, exemplified by the Docker Desktop Extension, signifies a milestone in collaborative software development. By harmonizing containerization and API testing, this collaboration addresses the challenges of fragmented workflows, accelerating development cycles and elevating the quality of applications.

By embracing the capabilities of Docker and Microcks, developers are poised to embark on a journey characterized by efficiency, reliability, and collaborative synergy.

Remember that Microcks is a Cloud Native Computing Foundation Sandbox project supported by an open community, which means you, too, can help make Microcks even greater. Come and say hi on our GitHub discussion or Zulip chat 🐙, send some love through GitHub stars ⭐️, or follow us on Twitter, Mastodon, LinkedIn, and our YouTube channel.

Learn more

Try the Microcks Docker Extension.

Learn about Docker Extensions.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Let’s DockerCon!

For the last three years, DockerCon, our annual global developer event, was 100% virtual. Still, we were humbled by the interest and response — tens of thousands of developer participants from around the world each year. Wow! (If you missed any of ’em, they’re available on YouTube: 2020, 2021, 2022!)

With our collective global return to the “new normal,” DockerCon 2023 will be hybrid — both live (in Los Angeles, California) and virtual. Our desire is to once again experience the live magic of the hallway track, the serendipitous developer-to-developer sharing of tips and tricks, and the celebration of our community’s accomplishments … all while looking forward together toward a really exciting future. And for members of our community who can’t attend in person, we hope you’ll join us virtually!

In the spirit of keeping this post brief, I’ll share a few community highlights here, but expect much more news and updates next week at DockerCon! 

Our open source projects — containerd, Compose, BuildKit, moby/moby, and others — continue to scale in terms of contributions, contributors, and stars. Thank you! 

And overall, our developer community is now at 20M monthly active IPs, 16M registered developers, 15M Docker Hub repos, and 16B image pulls per month from Docker Hub. Again, we’re humbled by this continued growth, engagement, and enthusiasm of our developer community.

And in terms of looking forward to what’s next … well, you gotta tune in to DockerCon to find out! 😀 But, seriously, there’s never been a better time to be a developer. To wit, with the digitization of All The Things, there’s a need for more than 750 million apps in the next couple of years. That means there’s a need for more developers and more creativity and innovation. And at DockerCon you’ll hear how our community plans to help developers capitalize on this opportunity.

Specifically, and without revealing too much here: We see a chance to bring the power of the cloud to accelerate the developer’s “inner loop,” before the git commit and CI. Furthermore, we see an untapped opportunity to apply GenAI to optimize the non-code gen aspects of the application. By some accounts, this encompasses 85% or more of the overall app.

Piqued your interest? Hope so! 😀 Looking forward to seeing you at DockerCon!

sj

Learn more

Register for DockerCon.

Register for DockerCon workshops.

Watch past DockerCon videos: 2020, 2021, 2022.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Docker’s Journey Toward Enabling Lightning-Fast Developer Innovation: Unveiling Performance Milestones

Our journey has been remarkable. Recently, Docker shifted focus from Docker Swarm, specializing in container orchestration, to the “inner loop” — the foundation — of the Software Development Life Cycle (SDLC). Today, in the early planning, coding, and building stages, we are setting the stage for container development success, ensuring development teams can rapidly and consistently build innovative containerized applications with confidence.

At Docker, we’re dedicated to optimizing the “inner loop” to ensure your journey from development to deployment and in-production software management is flawless. Whether you’re developing locally or in the cloud, our commitment to delivering all this with top-tier performance and security remains unwavering.

In this post, I’ll highlight our focus on performance and walk you through the milestones of the past year. We’re thrilled about the momentum we’re building in providing you with a robust, performant, agile, and secure container application development platform.

These achievements are more than just numbers; they demonstrate the positive impact and return on investment we deliver to you. Our work continues, but let’s explore these improvements and what they mean for you — the driving force behind innovation. 

Improving startup performance by up to 75% 

In 2022, we embarked on a journey that transformed how macOS users experience Docker. At that time, experimental virtualization was the norm, resulting in startup times that tested your patience, often exceeding 30 seconds. We knew we had to improve this, so we made several adjustments to reduce startup time significantly, such as adding support for the Mac virtualization framework and optimizing the Docker Desktop Linux VM boot sequence.

Now, when you fire up Docker Desktop 4.23, brace yourself for a lightning-fast launch, taking a mere 3.481 seconds. That’s right, a startup time that’s not just improved but slashed by 75% (Figure 1).

Figure 1: Startup time improvements across dev environments from Docker Desktop 4.12 to 4.23.

Mac users aren’t the only ones celebrating. Windows Hyper-V and Windows WSL2 users have their reason to cheer. Startup times have decreased from 20.257 seconds (with 4.12) to just 10.799 seconds (with 4.23). That 47% performance boost provides a smoother and more efficient development experience.

And the startup performance journey continues. We want to achieve sub-three-second startup times for all supported development environments. We’re looking forward to delivering this additional advancement soon, and we anticipate our startup performance to continue improving with each release.

Accelerating network performance by 85x

Downloading and uploading container images can be time-consuming. On Mac, with Docker Desktop 4.23, we’ve accelerated the process, delivering speeds over 30GB/s (bytes/sec), ensuring swift development workflows. We did this by entirely replacing the Docker Desktop network stack with a newer, modernized version that is much more efficient. This change resulted in an 85x improvement in upload speed compared to previous versions (4.12) (Figure 2). Think of it as upgrading from a horse-drawn carriage to a bullet train. Your data can now move seamlessly without delays.

Figure 2: Host-to-container use case occurs when a service hosted inside the container is accessed from outside the VM (for example, when a web developer accesses a website they are working on using a browser). Container-to-host use case occurs when the container accesses a service provided from the host (for example, when a package is installed as part of a build using internet access).

On Windows, downloading an image has never been faster. Docker Desktop 4.23 now achieves a speed of 1.1Gbits/s, enhancing developer efficiency. This achievement represents a 650% improvement compared to the previous version (4.12).

For real-time downloading speed, such as you would expect for video games and movies, Docker Desktop 4.23 on macOS offers UDP streaming improvements, soaring to 4.75GB/s (bytes/sec), a 5,800% increase in streaming speed compared to the previous version (4.12). These numbers translate to a faster, smoother digital experience, helping to keep your digital world at the speed of your ideas.

Optimizing host file sharing performance by more than 2x 

File sharing may not always be in the spotlight. Still, it’s an unsung hero of modern development that can make or break your development experience, and we’ve even made improvements here.

Imagine this scenario: Not too long ago, working with Docker Desktop 4.11 on your trusty Mac host, building Redis from within a container (where your Redis source code resided on your local host) was a patience-testing ordeal. It demanded 7 minutes and 25 seconds of your valuable time, primarily because the container’s access to the host files introduced frustrating delays. 

Today, with Docker Desktop 4.23, we’ve revolutionized the game. Thanks to groundbreaking improvements in virtiofs, that same Redis build now takes only 2 minutes and 6 seconds. That’s an impressive 71% reduction in build time.

Since macOS 12.5+, virtiofs is now the default in Docker Desktop as the standard to deliver substantial performance gains when sharing files with containers (Figure 3). You can read more about this in “Docker Desktop 4.23: Updates to Docker Init, New Configuration Integrity Check, Quick Search Improvements, Performance Enhancements, and More.”

Figure 3: Docker Desktop 4.11 compared to 4.22 with virtiofs enabled.

But wait, there’s more to come. Expect even more progress in the file-sharing arena soon. We continue working toward seamless collaboration and faster development cycles because we know that every minute saved is a minute gained for innovation.

Increasing efficiency and reducing idle memory usage by 10x

Let’s talk about efficiency and a little touch of green innovation.

In Docker Desktop 4.22, we introduced the Resource Saver mode, which is like having your development environment on standby, ready to jump into action when needed and conserving resources when it’s not. Resource Saver mode works on Mac, Windows, and Linux; on Mac and Windows it massively reduces Docker Desktop’s memory and CPU footprint when Docker Desktop is idle (i.e., not running containers for a configurable period of time), cutting memory utilization on host machines by 2GB and allowing developers to multitask uninterrupted (Figure 4).

Figure 4: Idle memory usage improvements since Docker Desktop 4.20.

Besides improving developer multitasking, what else is so remarkable about this feature? Well, let me paint the picture. We’re saving 38,500 CPU hours daily across all our Docker Desktop users. To put that in perspective, that’s enough to power 1,000 American homes for an entire month.

We have also made significant improvements while Docker Desktop is active (i.e., running containers), resulting in a 52.85% reduction in footprint. These improvements make Docker Desktop lighter and free up resources on your machine to leverage other tools and applications efficiently (Figure 5).

Figure 5: Docker Desktop active memory usage improvements since 4.20.

This means we’re not just optimizing your development workflow but doing so efficiently, reducing energy costs, and positively impacting the environment — an area we will continue to invest in. The reduced footprint is one small way of giving back while helping you build the future — a win-win.

Streamlining the build process, delivering up to a 40% compression improvement

Imagine your containers are digital backpacks; the heavier the bag, the harder to carry it around while you work. We’ve introduced support for Zstandard (zstd) compression of Docker container images in Docker Desktop 4.19 to lighten the load, reducing container image sizes with remarkable results. 

Look at the data for a debian:12.1 container image in Figure 6. Zstandard delivers a ~40% improvement in compression compared to the traditional gzip method. And for the Docker Engine 24.0 image, we are achieving a ~20% enhancement.

Figure 6: Data for a debian:12.1 container image and Docker Engine 24.0 with improved compression.

In practical terms, your container images become leaner and faster to transfer, allowing you to work more swiftly and effectively. With Docker Desktop, it’s like fitting your backpack with a magical compression spell, making every byte count. Your containers are lighter, your image pulls and pushes are faster, and your development is smoother — optimizing your journey, one compression at a time.
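If you want to experiment with zstd yourself, one possible approach (a sketch, with a placeholder image name) is to request it from BuildKit’s image exporter when building and pushing:

# Build and push an image with Zstandard-compressed layers
docker buildx build \
  --output type=image,name=registry.example.com/my-app:latest,push=true,compression=zstd,compression-level=3,oci-mediatypes=true \
  .

Keep in mind that older runtimes and registries may not handle zstd layers, which is why OCI media types are enabled in this sketch.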

Enterprise-level security (and peace of mind)

When we talk about speed and performance, there’s a crucial aspect we mustn’t overlook: security. At Docker, we understand that speed without security is like a ship without a compass — it may move fast but won’t stay on course.

While we’ve been investing heavily in accelerating your development journey, we haven’t lost sight of our commitment to enterprise-level security and governance. In fact, it’s quite the opposite. Our goal is to create a seamless union between velocity and vigilance.

Here’s how we do it:

Unprivileged users: Unlike with the native Docker Engine on Linux, unprivileged users can run Docker Desktop. This is because Docker Desktop runs Docker Engine inside a Linux VM, isolated from the underlying host machine.

Enhanced Container Isolation: ECI runs containers in rootless mode by default, vets sensitive system calls in containers, and restricts sensitive mounts, thereby adding an extra layer of isolation between containers and the host. It does this without changing developer workflows, so you can continue using Docker as usual with an extra layer of peace of mind.

Settings management: With settings management, IT admins can manipulate security settings in Docker Desktop per organization security policies to better secure developer environments.

Robust security model: Our security model is designed for safety and optimal performance. The two should go hand in hand. So, while protecting your environment, we ensure it runs efficiently.

Continuous security audits: Our commitment to security goes beyond features and tools. We are dedicated to safeguarding the platform, user community, and customers from various modern-day threats. We invest in regular security audits to scrutinize every nook and cranny of our applications and services. Vulnerabilities are swiftly identified and mitigated.

We aim to provide you with a holistic platform, an enterprise-grade offering that seamlessly integrates performance and security. In this fast-paced world, the perfect blend of speed and security truly empowers innovation. At Docker, we’re here to ensure you have both every step of the way.

Continuing our journey

At Docker, our unwavering commitment to performance and innovation is crystal clear. The achievements showcased here are just the beginning. So, as you embark on your development endeavors, know that we’re right there with you, making the seconds count and ensuring your confidence and ability to focus energy on what truly matters — creating and innovating. Together, we’re rewriting the story of development across the SDLC, one build, container, and application at a time.

I hope you will join us at DockerCon 2023, in person or virtually, to explore what we have planned for Docker Desktop, Docker Hub, and Docker Scout. Upgrade to the latest Docker Desktop version and check out our Docker 101 webinar: What Docker can do for your business.

Learn more

Register for DockerCon.

Register for DockerCon workshops.

See the DockerCon program.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Changes to How Docker Handles Personal Authentication Tokens 

A personal access token (PAT) is a replacement for a password that can have specific scopes for repository access. Docker is improving the visibility of Docker Desktop and Hub users’ personal access tokens. Specifically, we are changing how tokens are handled across sessions between the two tools. Read on to learn more about this security improvement.

What is changing with PATs and Docker?

To authenticate with Docker Hub, the Docker CLI uses PATs. To gain authenticated access to Hub from Docker CLI after a successful login from Docker Desktop, an API creates PATs on behalf of a Desktop user. These tokens were created after a user had successfully authenticated to Docker Hub through the login flow they have active for their organization (and thus had the required bearer tokens). 
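For reference, a manually created PAT can be used in place of a password when logging in from the CLI; for example, a token kept in an environment variable can be piped to docker login without typing it into the terminal (the username and variable name are placeholders):

# Authenticate the Docker CLI against Docker Hub using a personal access token
echo $DOCKER_PAT | docker login --username your-username --password-stdin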

Within Docker Hub, if you navigate to your profile, select Edit > Security, you can see all of your access tokens, including ones created by Docker Desktop for the CLI on your behalf with Docker Hub (Figure 1).

Figure 1: Auto-generated and manual access tokens displayed.

Docker has improved the visibility of these auto-generated tokens, and now all PATs are displayed inside a user’s profile for their active access tokens. 

Users will be able to see if the tokens are auto-generated or if they were manually created. Users can also deactivate or delete these auto-generated session tokens just as they can with other PATs. 

For security reasons, Docker encourages users to check their active tokens regularly. Only the five most recently used auto-generated tokens are maintained; any auto-generated tokens beyond those five will be deleted (Figure 2).

Figure 2: Regularly check active tokens.

Note that using Docker Single Sign-On (SSO) functionality, requiring multi-factor authentication (MFA), and enforcing sign-in for Docker Desktop significantly reduces the risk of an account becoming compromised where any of a user’s personal access tokens could be exploited. 
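As a reference point, enforcing sign-in is driven by a small registry.json file placed on each developer machine; a minimal sketch, with the organization name as a placeholder (the file location varies by operating system, for example /Library/Application Support/com.docker/registry.json on macOS):

{
  "allowedOrgs": ["your-docker-org"]
}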

Appropriate monitoring around your software development lifecycle (SDLC) is essential, as all images should be scanned for malware and viruses as part of secure code analysis and on an ongoing basis.  

Conclusion

Docker Hub, Docker Desktop, and the Docker CLI will continue to behave how users expect.

We encourage you to use the latest Docker Desktop and Docker CLI versions to get the newest features and security releases.

We also encourage you to use your new visibility into these PATs for Docker CLI and include all of your PATs in the regular security review for your organization and Docker accounts. 

As always, we encourage security best practices for Docker users and will continue strengthening Docker’s tooling as we update and add new features.

Learn more

Learn about Docker SSO.

Enable two-factor authentication.

Enforce Docker Desktop sign-in.

Get the latest release of Docker Desktop.

Try Docker Scout.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

How IKEA Retail Standardizes Docker Images for Efficient Machine Learning Model Deployment

What do Docker and IKEA Retail have in common? Both companies have changed how products are built, stored, and shipped. In IKEA Retail’s case, they created the market of flat-packed furniture items, which made everything from shipping, warehousing, and delivering their furniture to the end location much easier and more cost effective. This parallels what Docker has done for developers. Docker has changed the way that software is built, shipped, and stored, with Docker images taking up much less “shelf” space.

In this post, contributing authors Karan Honavar and Fernando Dorado Rueda from IKEA Retail walk through their MLOps solution, built with Docker.

Machine learning (ML) deployment, the act of shifting an ML model from the developmental stage to a live production environment, is paramount to translating complex algorithms into real-world solutions. Yet, this intricate process isn’t without its challenges, including:

Complexity and opacity: With ML models often veiled in complexity, deciphering their logic can be taxing. This obscurity impedes trust and complicates the explanation of decisions to stakeholders.

Adaptation to changing data patterns: The shifting landscape of real-world data can deviate from training sets, causing “concept drift.” Addressing this requires vigilant retraining, an arduous task that wastes time and resources.

Real-time data processing: Handling the deluge of data necessary for accurate predictions can burden systems and impede scalability.

Varied deployment methods: Whether deployed locally, in the cloud, or via web services, each method brings unique challenges, adding layers of complexity to an already intricate procedure.

Security and compliance: Ensuring that ML models align with rigorous regulations, particularly around private information, necessitates a focus on lawful implementation.

Ongoing maintenance and monitoring: The journey doesn’t end with deployment. Constant monitoring is vital to sustain the model’s health and address emerging concerns.

These factors represent substantial obstacles, but they are not insurmountable. We can streamline the journey from the laboratory to the real world by standardizing Docker images for efficient ML model deployment. 

This article will delve into the creation, measurement, deployment, and interaction with Dockerized ML models. We will demystify the complexities and demonstrate how Docker can catalyze cutting-edge concepts into tangible benefits.

Standardization deployment process via Docker

In the dynamic realm of today’s data-driven enterprises, such as our case at IKEA Retail, the multitude of tools and deployment strategies serves both as a boon and a burden. Innovation thrives, but so too does complexity, giving rise to inconsistency and delays. The antidote? Standardization. It’s more than just a buzzword; it’s a method to pave the way to efficiency, compliance, and seamless integration.

Enter Docker, the unsung hero in this narrative. In the evolving field of ML deployment, Docker offers agility and uniformity. It has reshaped the landscape by offering a consistent environment from development to production. The beauty of Docker lies in its containerization technology, enabling developers to wrap up an application with all the parts it needs, such as libraries and other dependencies, and ship it all out as one package.

At IKEA Retail, diverse teams — including hybrid data scientist teams and R&D units — conceptualize and develop models, each selecting drivers and packaging libraries according to their preferences and requirements. Although virtual environments provide a certain level of support, they can also present compatibility challenges when transitioning to a production environment. 

This is where Docker becomes an essential tool in our daily operations, offering simplification and a marked acceleration in the development and deployment process. Here are key advantages:

Portability: With Docker, the friction between different computing environments melts away. A container runs uniformly, regardless of where it’s deployed, bringing robustness to the entire pipeline.

Efficiency: Docker’s lightweight nature ensures that resources are optimally utilized, thereby reducing overheads and boosting performance.

Scalability: With Docker, scaling your application or ML models horizontally becomes a breeze. It aligns perfectly with the orchestrated symphony that large-scale deployment demands.

Then, there’s Seldon-Core, a solution chosen by IKEA Retail’s forward-thinking MLOps (machine learning operations) team. Why? Because it transforms ML models into production-ready microservices, regardless of the model’s origin (TensorFlow, PyTorch, H2O, etc.) or language (Python, Java, etc.). But that’s not all. Seldon-Core scales precisely, enabling everything from advanced metrics and logging to explainers and A/B testing.

This combination of Docker and Seldon-Core forms the heart of our exploration today. Together, they sketch the blueprint for a revolution in ML deployment. This synergy is no mere technical alliance; it’s a transformative collaboration that redefines deploying, monitoring, and interacting with ML models.

Through the looking glass of IKEA Retail’s experience, we’ll unearth how this robust duo — Docker and Seldon-Core — can turn a convoluted task into a streamlined, agile operation and how you can harness real-time metrics for profound insights.

Dive into this new MLOps era with us. Unlock efficiency, scalability, and a strategic advantage in ML production. Your innovation journey begins here, with Docker and Seldon-Core leading the way. This is more than a solution; it’s a paradigm shift.

In the rest of this article, we will cover deployment steps, including model preparation, encapsulating the model into a Docker image, and testing. Let’s get started.

Prerequisites

The following items must be present to replicate this example:

Docker: Ensure Docker is up and running, easily achievable through solutions like Docker Desktop

Python: Have a local installation at the ready (3.7+)

Model preparation

Model training and simple evaluation

Embarking on the journey to deploying an ML model is much like crafting a masterpiece: The canvas must be prepared, and every brushstroke must be deliberate. However, the focus of this exploration isn’t the art itself but rather the frame that holds it — the standardization of ML models, regardless of their creation or the frameworks used.

The primary objective of this demonstration is not to augment the model’s performance but rather to elucidate the seamless transition from local development to production deployment. It is imperative to note that the methodology we present is universally applicable across different models and frameworks. Therefore, we have chosen a straightforward model as a representative example. This choice is intentional, allowing readers to concentrate on the underlying process flows, which can be readily adapted to more sophisticated models that may require refined hyperparameter tuning and meticulous model selection. 

By focusing on these foundational principles, we aim to provide a versatile and accessible guide that transcends the specificities of individual models or use cases. Let’s delve into this process.

To align with our ethos of transparency and consumer privacy and to facilitate your engagement with this approach, a public dataset is employed for a binary classification task.

In the following code excerpt, you’ll find the essence of our training approach, reflecting how we transform raw data into a model ready for real-world challenges:

import os
import pickle
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the breast cancer dataset
X, y = datasets.load_breast_cancer(return_X_y=True)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.9, random_state=0)

# Combine X_test and y_test into a single DataFrame
X_test_df = pd.DataFrame(X_test, columns=[f"feature_{i}" for i in range(X_test.shape[1])])
y_test_df = pd.DataFrame(y_test, columns=["target"])

df_test = pd.concat([X_test_df, y_test_df], axis=1)

# Define the path to store models
model_path = "models/"

# Create the folder if it doesn't exist
if not os.path.exists(model_path):
    os.makedirs(model_path)

# Define a list of classifier parameters
parameters = [
    {"clf": LogisticRegression(solver="liblinear", multi_class="ovr"), "name": f"{model_path}/binary-lr.joblib"},
    {"clf": Perceptron(eta0=0.1, random_state=0), "name": f"{model_path}/binary-percept.joblib"},
]

# Iterate through each parameter configuration
for param in parameters:
    clf = param["clf"]  # Retrieve the classifier from the parameter dictionary
    clf.fit(X_train, y_train)  # Fit the classifier on the training data
    # Save the trained model to a file using pickle
    model_filename = f"{param['name']}"
    with open(model_filename, 'wb') as model_file:
        pickle.dump(clf, model_file)
    print(f"Model saved in {model_filename}")

# Simple Model Evaluation
model_path = 'models/binary-lr.joblib'
with open(model_path, 'rb') as model_file:
    loaded_model = pickle.load(model_file)

# Make predictions using the loaded model
predictions = loaded_model.predict(X_test)

# Calculate metrics (accuracy, precision, recall, f1-score)
accuracy = accuracy_score(y_test, predictions)
precision = precision_score(y_test, predictions)
recall = recall_score(y_test, predictions)
f1 = f1_score(y_test, predictions)

Model class creation

With the model files primed, the task at hand shifts to the crafting of the model class — an essential architectural element that will later reside within the Docker image. Like a skilled sculptor, we must shape this class, adhering to the exacting standards proposed by Seldon:

import joblib
import logging

class Score:
    """
    Class to hold metrics for binary classification, including true positives (TP), false positives (FP),
    true negatives (TN), and false negatives (FN).
    """
    def __init__(self, TP=0, FP=0, TN=0, FN=0):
        self.TP = TP  # True Positives
        self.FP = FP  # False Positives
        self.TN = TN  # True Negatives
        self.FN = FN  # False Negatives

class DockerModel:
    """
    Class for loading and predicting using a pre-trained model, handling feedback to update metrics,
    and providing those metrics.
    """
    result = {}  # Dictionary to store input data

    def __init__(self, model_name="models/binary-lr.joblib"):
        """
        Initialize DockerModel with metrics and model name.
        :param model_name: Path to the pre-trained model.
        """
        self.scores = Score(0, 0, 0, 0)
        self.loaded = False
        self.model_name = model_name

    def load(self):
        """
        Load the model from the provided path.
        """
        self.model = joblib.load(self.model_name)
        logging.info(f"Model {self.model_name} Loaded")

    def predict(self, X, features_names=None, meta=None):
        """
        Predict the target using the loaded model.
        :param X: Features for prediction.
        :param features_names: Names of the features, optional.
        :param meta: Additional metadata, optional.
        :return: Predicted target values.
        """
        self.result['shape_input_data'] = str(X.shape)
        logging.info(f"Received request: {X}")
        if not self.loaded:
            self.load()
            self.loaded = True
        predictions = self.model.predict(X)
        return predictions

    def send_feedback(self, features, feature_names, reward, truth, routing=""):
        """
        Provide feedback on predictions and update the metrics.
        :param features: Features used for prediction.
        :param feature_names: Names of the features.
        :param reward: Reward signal, not used in this context.
        :param truth: Ground truth target values.
        :param routing: Routing information, optional.
        :return: Empty list as return value is not used.
        """
        predicted = self.predict(features)
        logging.info(f"Predicted: {predicted[0]}, Truth: {truth[0]}")
        if int(truth[0]) == 1:
            if int(predicted[0]) == int(truth[0]):
                self.scores.TP += 1
            else:
                self.scores.FN += 1
        else:
            if int(predicted[0]) == int(truth[0]):
                self.scores.TN += 1
            else:
                self.scores.FP += 1
        return []  # Ignore return statement as it's not used

    def calculate_metrics(self):
        """
        Calculate the accuracy, precision, recall, and F1-score.
        :return: accuracy, precision, recall, f1_score
        """
        total_samples = self.scores.TP + self.scores.TN + self.scores.FP + self.scores.FN

        # Check if there are any samples to avoid division by zero
        if total_samples == 0:
            logging.warning("No samples available to calculate metrics.")
            return 0, 0, 0, 0  # Return zeros for all metrics if no samples

        accuracy = (self.scores.TP + self.scores.TN) / total_samples

        # Check if there are any positive predictions to calculate precision
        positive_predictions = self.scores.TP + self.scores.FP
        precision = self.scores.TP / positive_predictions if positive_predictions != 0 else 0

        # Check if there are any actual positives to calculate recall
        actual_positives = self.scores.TP + self.scores.FN
        recall = self.scores.TP / actual_positives if actual_positives != 0 else 0

        # Check if precision and recall are non-zero to calculate F1-score
        if precision + recall == 0:
            f1_score = 0
        else:
            f1_score = 2 * (precision * recall) / (precision + recall)

        # Return the calculated metrics
        return accuracy, precision, recall, f1_score

    def metrics(self):
        """
        Generate metrics for monitoring.
        :return: List of dictionaries containing accuracy, precision, recall, and f1_score.
        """
        accuracy, precision, recall, f1_score = self.calculate_metrics()
        return [
            {"type": "GAUGE", "key": "accuracy", "value": accuracy},
            {"type": "GAUGE", "key": "precision", "value": precision},
            {"type": "GAUGE", "key": "recall", "value": recall},
            {"type": "GAUGE", "key": "f1_score", "value": f1_score},
        ]

    def tags(self):
        """
        Retrieve custom metadata when generating predictions.
        :return: Dictionary with the intermediate information.
        """
        return self.result

Let’s delve into the details of the functions and classes within the DockerModel class that encapsulates these four essential aspects:

Loading and predicting:

load(): This function is responsible for importing the pretrained model from the provided path. It’s usually called internally before making predictions to ensure the model is available.

predict(X, features_names=None, meta=None): This function deploys the loaded model to make predictions. It takes in the input features X, optional features_names, and optional metadata meta, returning the predicted target values.

Feedback handling:

send_feedback(features, feature_names, reward, truth, routing=""): This function is vital in adapting the model to real-world feedback. It accepts the input data, truth values, and other parameters to assess the model’s performance. The feedback updates the model’s understanding, and the metrics are calculated and stored for real-time analysis. This facilitates continuous retraining of the model.

Metrics calculation:

calculate_metrics(): This function calculates the essential metrics of accuracy, precision, recall, and F1-score. These metrics provide quantitative insights into the model’s performance, enabling constant monitoring and potential improvement.

Score class: This auxiliary class is used within the DockerModel to hold metrics for binary classification, including true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). It helps keep track of these parameters, which are vital for calculating the aforementioned metrics.

Monitoring assistance:

metrics(): This function generates the metrics for model monitoring. It returns a list of dictionaries containing the calculated accuracy, precision, recall, and F1 score. These metrics are compliant with Prometheus Metrics, facilitating real-time monitoring and analysis.

tags(): This function is designed to retrieve custom metadata when generating predictions, aiding in monitoring and debugging. It returns a dictionary, which can help track and understand the nature of the requests.

Together, these functions and classes form a cohesive and robust structure that supports the entire lifecycle of an ML model. From the moment of inception (loading and predicting) through its growth (feedback handling) and evaluation (metrics calculation), to its continuous vigilance (monitoring assistance), the architecture is designed to standardize and streamline the process of deploying and maintaining ML models in a real-world production environment.

This model class is more than code; it’s the vessel that carries our ML model from a local environment to the vast sea of production. It’s the vehicle for standardization, unlocking efficiency and consistency in deploying models.

At this stage, we’ve prepared the canvas and outlined the masterpiece. Now, it’s time to dive deeper and explore how this model is encapsulated into a Docker image, an adventure that blends technology and strategy to redefine ML deployment. 

Testing model locally

Before venturing into creating a Docker image, testing the model locally is vital. This step acts as a rehearsal before the main event, providing a chance to ensure that the model is performing as expected with the testing data.

The importance of local testing lies in its ability to catch issues early, avoiding potential complications later in the deployment process. Following the example code provided below, it can confirm that the model is ready for its next phase if it provides the expected prediction in the expected format:

from DockerModel import DockerModel
demoModel = DockerModel()
demoModel.predict(X_test) # Can take the entire testing dataset or individual predictions

The expected output should match the format of the class labels you anticipate from the model. If everything works correctly, you’re assured that the model is well prepared for the next grand step: encapsulation within a Docker image.

Local testing is more than a technical process; it’s a quality assurance measure that stands as a gatekeeper, ensuring that only a well-prepared model moves forward. It illustrates the meticulous care taken in the deployment process, reflecting a commitment to excellence that transcends code and resonates with the core values of standardization and efficiency.

With the local testing accomplished, we stand on the threshold of a new frontier: creating the Docker image. Let’s continue this exciting journey, knowing each step is a stride toward innovation and mastery in ML deployment.

Encapsulating the model into a Docker image

In our IKEA Retail MLOps view, a model is not simply a collection of code. Rather, it is a sophisticated assembly comprising code, dependencies, and ML artifacts, all encapsulated within a versioned and registered Docker image. This composition is carefully designed, reflecting the meticulous planning of the physical infrastructure.

What is Docker’s role in MLOps?

Docker plays a vital role in MLOps, providing a standardized environment that streamlines the transition from development to production:

Streamlining deployment: Docker containers encapsulate everything an ML model needs to run, easing the deployment process.

Facilitating collaboration: Using Docker, data scientists and engineers can ensure that models and their dependencies remain consistent across different stages of development.

Enhancing model reproducibility: Docker provides a uniform environment that enhances the reproducibility of models, a critical aspect in machine learning.

Integrating with orchestration tools: Docker can be used with orchestration platforms like Kubernetes, enabling automated deployment, scaling, and management of containerized applications.

Docker and containerization are more than technology tools; they catalyze innovation and efficiency in MLOps. Ensuring consistency, scalability and agility, Docker unlocks new potential and opens the way for a more agile and robust ML deployment process. Whether you are a developer, a data scientist, or an IT professional, understanding Docker is critical to navigating the complex and multifaceted landscape of modern data-driven applications.

Dockerfile creation

Creating a Dockerfile is like sketching the architectural plan of a building. It outlines the instructions for creating a Docker image to run the application in a coherent, isolated environment. This design ensures that the entire model — including its code, dependencies, and unique ML artifacts — is treated as a cohesive entity, aligning with the overarching vision of IKEA Retail’s MLOps approach.

In our case, we have created a Dockerfile with the express purpose of encapsulating not only the code but all the corresponding artifacts of the model. This deliberate design facilitates a smooth transition to production, effectively bridging the gap between development and deployment.

We used the following Dockerfile for this demonstration, which represents a tangible example of how IKEA Retail’s MLOps approach is achieved through thoughtful engineering and strategic implementation.

# Use an official Python runtime as a parent image.
# Using a slim image for a smaller final size and reduced attack surface.
FROM python:3.9-slim

# Set the maintainer label for metadata.
LABEL maintainer="fernandodorado.rueda@ingka.com"

# Set environment variables for a consistent build behavior.
# Disabling the buffer helps to log messages synchronously.
ENV PYTHONUNBUFFERED=1

# Set a working directory inside the container to store all our project files.
WORKDIR /app

# First, copy the requirements file to leverage Docker’s cache for dependencies.
# By doing this first, changes to the code will not invalidate the cached dependencies.
COPY requirements.txt requirements.txt

# Install the required packages listed in the requirements file.
# It’s a good practice to include the --no-cache-dir flag to prevent the caching of dependencies
# that aren’t necessary for executing the application.
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the code and model files into the image.
COPY DockerModel.py DockerModel.py
COPY models/ models/

# Expose ports that the application will run on.
# Port 5000 for GRPC
# Port 9000 for REST
EXPOSE 5000 9000

# Set environment variables used by the application.
ENV MODEL_NAME DockerModel
ENV SERVICE_TYPE MODEL

# Change the owner of the directory to user 8888 for security purposes.
# It can prevent unauthorised write access by the application itself.
# Make sure to run the application as this non-root user later if applicable.
RUN chown -R 8888 /app

# Use the exec form of CMD so that the application you run will receive UNIX signals.
# This is helpful for graceful shutdown.
# Here we’re using seldon-core-microservice to serve the model.
CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE

This Dockerfile contains different parts:

FROM python:3.9-slim: This line chooses the official Python 3.9 slim image as the parent image. It is favored for its reduced size and attack surface, enhancing both efficiency and security.

LABEL maintainer="fernandodorado.rueda@ingka.com": A metadata label that specifies the maintainer of the image, providing contact information.

ENV PYTHONUNBUFFERED=1: Disabling Python’s output buffering ensures that log messages are emitted synchronously, aiding in debugging and log analysis.

WORKDIR /app: Sets the working directory inside the container to /app, a centralized location for all project files.

COPY requirements.txt requirements.txt: Copies the requirements file into the image. Doing this before copying the rest of the code leverages Docker’s caching mechanism, making future builds faster. This file must contain the “seldon-core” package:

pandas==1.3.5
requests==2.28.1
numpy==1.20
seldon-core==1.14.1
scikit-learn==1.0.2

RUN pip install --no-cache-dir -r requirements.txt: Installs required packages as listed in the requirements file. The flag --no-cache-dir prevents unnecessary caching of dependencies, reducing the image size.

COPY DockerModel.py DockerModel.py: Copies the main Python file into the image.

COPY models/ models/: Copies the model files into the image.

EXPOSE 5000 9000: Exposes ports 5000 (GRPC) and 9000 (REST), allowing communication with the application inside the container.

ENV MODEL_NAME DockerModel: Sets the environment variable for the model name.

ENV SERVICE_TYPE MODEL: Sets the environment variable for the service type.

RUN chown -R 8888 /app: Changes the owner of the directory to user 8888. Running the application as a non-root user helps mitigate the risk of unauthorized write access (an optional USER instruction for this is sketched after this list).

CMD exec seldon-core-microservice $MODEL_NAME --service-type $SERVICE_TYPE: Executes the command to start the service using seldon-core-microservice. It also includes the model name and service type as parameters. Using exec ensures the application receives UNIX signals, facilitating graceful shutdown.
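As an optional hardening step (an assumption on our part, not part of the Dockerfile shown above), you could also switch to that user with a USER instruction placed after the chown step, so the seldon-core-microservice process itself runs as the non-root user:

# Hypothetical addition: run subsequent instructions and the container process as UID 8888.
# A numeric UID works here even without a matching passwd entry in the image.
USER 8888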

Building and pushing Docker image

1. Installing Docker Desktop

If not already installed, Docker Desktop is recommended for this task. Docker Desktop provides a graphical user interface that simplifies the process of building, running, and managing Docker containers. Docker Desktop also supports Kubernetes, offering an easy way to create a local cluster.

2. Navigating to the Project directory

Open a terminal or command prompt.

Navigate to the folder where the Dockerfile and other necessary files are located.

3. Building the Image

Execute the command: docker build . -t docker-model:1.0.0.

docker build . instructs Docker to build the image using the current directory (.).

-t docker-model:1.0.0 assigns a name (docker-model) and tag (1.0.0) to the image.

The build process will follow the instructions defined in the Dockerfile, creating a Docker image encapsulating the entire environment needed to run the model.
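Once the build completes, you can confirm that the image exists in your local image list before moving on:

docker image ls docker-model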

4. Pushing the image

If needed, the image can be pushed to a container registry like Docker Hub, or a private registry within an organization.

For this demonstration, the image is kept in the local Docker image store, simplifying the process and removing the need for authentication with an external registry.
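For reference, if you later decide to publish the image, the usual flow is to re-tag it with your registry and namespace (the registry host and namespace below are placeholders, not values from this demonstration) and push it:

docker tag docker-model:1.0.0 registry.example.com/my-team/docker-model:1.0.0
docker push registry.example.com/my-team/docker-model:1.0.0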

Deploy ML model using Docker: Unleash it into the world

Once the Docker image is built, running it is relatively straightforward. Let’s break down this process:

docker run --rm --name docker-model -p 9000:9000 docker-model:1.0.0

Components of the command:

docker run: This is the base command to run a Docker container.

--rm: This flag ensures that the Docker container is automatically removed once it’s stopped. It helps keep the environment clean, especially when you run containers for testing or short-lived tasks.

--name docker-model: Assigns a name to the running container.

-p 9000:9000: This maps port 9000 on the host machine to port 9000 on the Docker container. The format is -p <host_port>:<container_port>. Because the Dockerfile exposes port 5000 for GRPC and port 9000 for REST, this command makes sure the REST endpoint is available to external users or applications through port 9000 on the host (a variant that also maps the GRPC port is shown after this list).

docker-model:1.0.0: This specifies the name and tag of the Docker image to run. docker-model is the name, and 1.0.0 is the version tag we assigned during the build process.
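If you also want the GRPC endpoint reachable from the host, a variant of the same command (optional, and not needed for the REST examples that follow) maps both exposed ports:

docker run --rm --name docker-model -p 9000:9000 -p 5000:5000 docker-model:1.0.0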

What happens next

On executing the command, Docker will initiate a container instance from the docker-model:1.0.0 image.

The application within the Docker container will start and begin listening for requests on port 9000 (as specified).

With the port mapping, any incoming requests on port 9000 of the host machine will be forwarded to port 9000 of the Docker container.

The application can now be accessed and interacted with as if it were running natively on the host machine.
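Before sending any requests, you can optionally confirm that the container is up and inspect its startup logs:

docker ps --filter name=docker-model
docker logs docker-model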

Test deployed model using Docker

With the Docker image in place, it’s time to see the model in action.

Generate predictions

The path from model to prediction is a delicate process, requiring an understanding of the specific input-output type that Seldon accommodates (e.g., ndarray, JSON data, STRDATA).

In our scenario, the model anticipates an array, and thus, the key in our payload is “ndarray.” Here’s how we orchestrate this:

import requests
import json

URL = "http://localhost:9000/api/v1.0/predictions"

def send_prediction_request(data):

    # Create the headers for the request
    headers = {'Content-Type': 'application/json'}

    try:
        # Send the POST request
        response = requests.post(URL, headers=headers, json=data)

        # Check if the request was successful
        response.raise_for_status()  # Will raise HTTPError if the HTTP request returned an unsuccessful status code

        # If successful, return the JSON data
        return response.json()
    except requests.ConnectionError:
        raise Exception("Failed to connect to the server. Is it running?")
    except requests.Timeout:
        raise Exception("Request timed out. Please try again later.")
    except requests.RequestException as err:
        # For any other requests exceptions, re-raise it
        raise Exception(f"An error occurred with your request: {err}")


# Define the data payload (We can also use X_test[0:1].tolist() instead of the raw array)
data_payload = {
"data": {
"ndarray": [
[
1.340e+01, 2.052e+01, 8.864e+01, 5.567e+02, 1.106e-01, 1.469e-01,
1.445e-01, 8.172e-02, 2.116e-01, 7.325e-02, 3.906e-01, 9.306e-01,
3.093e+00, 3.367e+01, 5.414e-03, 2.265e-02, 3.452e-02, 1.334e-02,
1.705e-02, 4.005e-03, 1.641e+01, 2.966e+01, 1.133e+02, 8.444e+02,
1.574e-01, 3.856e-01, 5.106e-01, 2.051e-01, 3.585e-01, 1.109e-01
]
]
}
}

# Get the response and print it
try:
    response = send_prediction_request(data_payload)
    pretty_json_response = json.dumps(response, indent=4)
    print(pretty_json_response)
except Exception as err:
    print(err)

The prediction of our model will be similar to this dictionary:

{
"data": {
"names": [],
"ndarray": [
0
]
},
"meta": {
"metrics": [
{
"key": "accuracy",
"type": "GAUGE",
"value": 0
},
{
"key": "precision",
"type": "GAUGE",
"value": 0
},
{
"key": "recall",
"type": "GAUGE",
"value": 0
},
{
"key": "f1_score",
"type": "GAUGE",
"value": 0
}
],
"tags": {
"shape_input_data": "(1, 30)"
}
}
}

The response from the model will contain several keys:

“data”: Provides the output generated by our model. In our case, it’s the predicted class.

“meta”: Contains metadata and model metrics. It shows the actual values of the classification metrics, including accuracy, precision, recall, and f1_score.

“tags”: Contains intermediate metadata. This could include anything you want to track, such as the shape of the input data.

The structure outlined above ensures that not only can we evaluate the final predictions, but we also gain insights into intermediate results. These insights can be instrumental in understanding predictions and debugging any potential issues.
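For instance, a small helper (a sketch based on the response structure shown above, not part of the original client code) can pull out the predicted class and the input-shape tag for logging or debugging; it assumes the response variable from the request above:

def summarize_response(response):
    # Extract the predicted class and the intermediate metadata from the Seldon response
    predicted_class = response["data"]["ndarray"][0]
    input_shape = response["meta"]["tags"].get("shape_input_data")
    return {"predicted_class": predicted_class, "input_shape": input_shape}

print(summarize_response(response))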

This stage marks a significant milestone in our journey from training a model to deploying and testing it within a Docker container. We’ve seen how to standardize an ML model and how to set it up for real-world predictions. With this foundation, you’re well-equipped to scale, monitor, and further integrate this model into a full-fledged production environment.

Send feedback in real-time and calculate metrics

The provisioned /feedback endpoint facilitates continuous learning by allowing truth values to be sent back to the model once they are available. As these truth values are received, the model’s metrics are updated and can be scraped by other tools for real-time analysis and monitoring. In the following code snippet, we iterate over the test dataset and send the truth value to the /feedback endpoint using a POST request:

import requests
import json

URL = "http://localhost:9000/api/v1.0/feedback"

def send_prediction_feedback(data):

    # Create the headers for the request
    headers = {'Content-Type': 'application/json'}

    try:
        # Send the POST request
        response = requests.post(URL, headers=headers, json=data)

        # Check if the request was successful
        response.raise_for_status()  # Will raise HTTPError if the HTTP request returned an unsuccessful status code

        # If successful, return the JSON data
        return response.json()
    except requests.ConnectionError:
        raise Exception("Failed to connect to the server. Is it running?")
    except requests.Timeout:
        raise Exception("Request timed out. Please try again later.")
    except requests.RequestException as err:
        # For any other requests exceptions, re-raise it
        raise Exception(f"An error occurred with your request: {err}")

for i in range(len(X_test)):
    payload = {'request': {'data': {'ndarray': [X_test[i].tolist()]}},
               'truth': {'data': {'ndarray': [int(y_test[i])]}}}

    # Get the response and print it
    try:
        response = send_prediction_feedback(payload)
        pretty_json_response = json.dumps(response, indent=4)  # Pretty-print JSON
        print(pretty_json_response)
    except Exception as err:
        print(err)

After processing the feedback, the model calculates and returns key metrics, including accuracy, precision, recall, and F1-score. These metrics are then available for analysis:

{
"data": {
"ndarray": []
},
"meta": {
"metrics": [
{
"key": "accuracy",
"type": "GAUGE",
"value": 0.92607003
},
{
"key": "precision",
"type": "GAUGE",
"value": 0.9528302
},
{
"key": "recall",
"type": "GAUGE",
"value": 0.9294478
},
{
"key": "f1_score",
"type": "GAUGE",
"value": 0.9409938
}
],
"tags": {
"shape_input_data": "(1, 30)"Ω
}
}
}

What makes this approach truly powerful is that the model’s evolution is no longer confined to the training phase. Instead, it’s in a continual state of learning, adjustment, and refinement, based on real-world feedback.

This way, we’re not just deploying a static prediction engine but fostering an evolving intelligent system that can better align itself with the changing landscape of data it interprets. It’s a holistic approach to machine learning deployment that encourages continuous improvement and real-time adaptation.

Conclusions

At IKEA Retail, Docker has become an indispensable element in our daily MLOps activities, serving as a catalyst that accelerates the development and deployment of models, especially when transitioning to production. The transformative impact of Docker unfolds through a spectrum of benefits that not only streamlines our workflow but also fortifies it:

Standardization: Docker orchestrates a consistent environment during the development and deployment of any ML model, fostering uniformity and coherence across the lifecycle.

Compatibility: With support for diverse environments and seamless multi-cloud or on-premise integration, Docker bridges gaps and ensures a harmonious workflow.

Isolation: Docker ensures that applications and resources are segregated, offering an isolated environment that prioritizes efficiency and integrity.

Security: Beyond mere isolation, Docker amplifies security by completely segregating applications from each other. This robust separation empowers us with precise control over traffic flow and management, laying a strong foundation of trust.

These attributes translate into tangible advantages in our MLOps journey, sculpting a landscape that’s not only innovative but also robust:

Agile development and deployment environment: Docker ignites a highly responsive development and deployment environment, enabling seamless creation, updating, and deployment of ML models.

Optimized resource utilization: Utilize compute/GPU resources efficiently within a shared model, maximizing performance without compromising flexibility.

Scalable deployment: Docker’s architecture allows for the scalable deployment of ML models, adapting effortlessly to growing demands.

Smooth release cycles: Integrating seamlessly with our existing CI/CD pipelines, Docker smoothens the model release cycle, ensuring a continuous flow of innovation.

Effortless integration with monitoring tools: Docker’s compatibility extends to monitoring stacks like Prometheus + Grafana, creating a cohesive ecosystem fully aligned with our MLOps approach when creating and deploying models in production.

The convergence of these benefits elevates IKEA Retail’s MLOps strategy, transforming it into a symphony of efficiency, security, and innovation. Docker is not merely a tool: Docker is a philosophy that resonates with our pursuit of excellence. Docker is the bridge that connects creativity with reality, and innovation with execution.

In the complex world of ML deployment, we’ve explored a path less trodden but profoundly rewarding. We’ve tapped into the transformative power of standardization, unlocking an agile and responsive way to deploy and engage with ML models in real-time.

But this is not a conclusion; it’s a threshold. New landscapes beckon, brimming with opportunities for growth, exploration, and innovation. The following steps will continue the current approach: 

Scaling with Kubernetes: Unleash the colossal potential of Kubernetes, a beacon of flexibility and resilience, guiding you to a horizon of unbounded possibilities.

Applying real-time monitoring and alerting systems based on open source technologies, such as Prometheus and Grafana.

Connecting a data-drift detector for real-time detection: Deployment and integration of drift detectors to detect changes in data in real-time.

We hope this exploration will empower you to redefine your paths, ignite new ideas, and push the boundaries of what’s possible. The gateway to an extraordinary future is open, and the key is in our hands.

Learn more

Read our Docker AI/ML article collection.

Visit Docker Hub, the world’s largest container image registry.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/


How to Get Started with the Weaviate Vector Database on Docker

Vector databases have been getting a lot of attention since the developer community realized how they can enhance large language models (LLMs). Weaviate is an open source vector database that enables modern search capabilities, such as vector search, hybrid search, and generative search. With Weaviate, you can build advanced LLM applications, next-level search systems, recommendation systems, and more.

This article explains what vector databases are and highlights key features of the Weaviate vector database. Learn how to install Weaviate on Docker using Docker Compose so you can take advantage of semantic search within your Dockerized environment.

Introducing the Weaviate vector database

The core feature of vector databases is storing vector embeddings of data objects. This functionality is especially helpful with the growing amount of unstructured data (e.g., text or images), which is difficult to manage and process with traditional relational databases. The vector embeddings are a numerical representation of the data objects — usually generated by a machine learning (ML) model — and enable the search and retrieval of data based on semantic similarity (vector search).

Vector databases do much more than just store vector embeddings: As you can imagine, retrieving data based on similarity requires a lot of comparing between objects and thus can take a long time. In contrast to other types of databases that can store vector embeddings, a vector database can retrieve data fast. To enable low-latency search queries, vector databases use specific algorithms to index the data.

Additionally, some vector databases, like Weaviate, store the vector embeddings and the original data object, which lets you combine traditional search with modern vector search for more accurate search results.

With these functionalities, vector databases are usually used in search or similar tasks (e.g., recommender systems). With the recent advancements in the LLM space, however, vector databases have also proven effective at providing long-term memory and domain-specific context to conversational LLMs. This means that you can leverage LLM capabilities on your private data or your specific field of expertise.

Key highlights of the Weaviate vector database include:

Open source: Weaviate is open source and available for anybody to use wherever they want. It is also available as a managed service with SaaS and hybrid SaaS options.

Horizontal scalability: You can scale seamlessly into billions of data objects for your exact needs, such as maximum ingestion, largest possible dataset size, maximum queries per second, etc.

Lightning-fast vector search: You can perform lightning-fast pure vector similarity search over raw vectors or data objects, even with filters. Weaviate typically performs nearest-neighbor searches of millions of objects in considerably less than 100ms (see our benchmark).

Combined keyword and vector search (hybrid search): You can store both data objects and vector embeddings. This approach allows you to combine keyword-based and vector searches for state-of-the-art search results.

Optimized for cloud-native environments: Weaviate has the fault tolerance of a cloud-native database, and the core Docker image is comparatively small at 18 MB.

Modular ecosystem for seamless integrations: You can use Weaviate standalone (aka “bring your own vectors”) or with various optional modules that integrate directly with OpenAI, Cohere, Hugging Face, etc., to enable easy use of state-of-the-art ML models. These modules can be used as vectorizers to automatically vectorize any media type (text, images, etc.) or as generative modules to extend Weaviate’s core capabilities (e.g., question answering, generative search, etc.).

Prerequisites

Ensure you have both the docker and the docker-compose CLI tools installed. For the following section, we assume you have Docker 17.09.0 or higher and Docker Compose V2 installed. If your system has Docker Compose V1 installed instead of V2, use docker-compose instead of docker compose. You can check your Docker Compose version with:

$ docker compose version

How to configure the Docker Compose file for Weaviate

To start Weaviate with Docker Compose, you need a Docker Compose configuration file, typically called docker-compose.yml. Usually, there’s no need to obtain individual images, as we distribute entire Docker Compose files.

You can obtain a Docker Compose file for Weaviate in two different ways:

Docker Compose configurator on the Weaviate website (recommended): The configurator allows you to customize your docker-compose.yml file for your purposes (including all module containers) and directly download it.

Manually: Alternatively, if you don’t want to use the configurator, copy and paste one of the example files from the documentation and manually modify it.

This article will review the steps to configure your Docker Compose file with the Weaviate Docker Compose configurator.

Step 1: Version

First, define which version of Weaviate you want to use (Figure 1). We recommend always using the latest version.

Figure 1: The first step when using the Weaviate Docker Compose configurator, suggesting that the latest version be used.

The following shows a minimal example of a Docker Compose setup for Weaviate:

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.20.5
    ports:
      - 8080:8080
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
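
If you save this snippet as docker-compose.yml, you can optionally check that it parses correctly before starting anything:

$ docker compose config

This prints the fully resolved configuration; if the file contains a syntax or indentation error, Compose reports the problem instead of the rendered output.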

Step 2: Persistent volume

Configure a persistent volume in the Docker Compose file (Figure 2):

Figure 2: Weaviate Docker Compose configurator “Persistent Volume” configuration options.

We recommend setting up a persistent volume to avoid data loss when you restart the container and to improve read and write speeds.

You can set a persistent volume in two ways:

With a named volume: Docker will create a named volume weaviate_data and mount it to the PERSISTENCE_DATA_PATH inside the container after starting Weaviate with Docker Compose:

services:
  weaviate:
    volumes:
      - weaviate_data:/var/lib/weaviate
    # etc.

volumes:
  weaviate_data:

With host binding: Docker will mount ./weaviate_data on the host to the PERSISTENCE_DATA_PATH inside the container after starting Weaviate with Docker Compose:

services:
  weaviate:
    volumes:
      - ./weaviate_data:/var/lib/weaviate
    # etc.
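
Whichever option you choose, you can verify the volume once the stack is running. A quick check for the named-volume variant (note that Compose typically prefixes the volume with the project name, which defaults to the folder name):

$ docker volume ls
$ docker volume inspect <project-name>_weaviate_data

For the host-binding variant, simply confirm that the ./weaviate_data directory exists next to your docker-compose.yml and grows as you import data.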

Step 3: Modules

Weaviate can be used with various modules, which integrate directly with inferencing services like OpenAI, Cohere, or Hugging Face. These modules can be used to vectorize any media type at import and search time automatically or to extend Weaviate’s core capabilities with generative modules.

You can also use Weaviate without any modules (standalone). In this case, no model inference is performed at import or search time, meaning you need to provide your own vectors in both scenarios. If you don’t need any modules, you can skip to Step 4: Runtime.

Configure modules for the Docker Compose file (Figure 3):

Figure 3: The Weaviate Docker Compose configurator step to define if modules will be used, or if running standalone is desired.

Currently, Weaviate integrates three categories of modules:

Retriever and vectorizer modules automatically vectorize any media type (text, images, etc.) at import and search time. There are also re-ranker modules available for re-ranking search results.

Reader and generator modules can be used to extend Weaviate’s core capabilities after retrieving the data for generative search, question answering, named entity recognition (NER), and summarization.

Other modules are available for spell checking or for enabling the use of your own custom modules.

Note that many modules (e.g., transformer models) are neural networks built to run on GPUs. Although you can run them on a CPU, enabling GPU support with `ENABLE_CUDA=1`, if a GPU is available, will result in faster inference.

The following shows an example of a Docker Compose setup for Weaviate with the sentence-transformers model:

version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.20.5
    restart: on-failure:0
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
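
The environment block of the t2v-transformers service is left empty in the example above; this is where GPU inference can be toggled. A minimal sketch, using the ENABLE_CUDA variable mentioned earlier (keep it at '0', or omit the block entirely, to stay on CPU):

  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: '1'  # set to '0' to run inference on CPU; GPU use also requires GPU access to be configured for the container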

Step 4: Runtime

In the final step of the configurator, select Docker Compose for your runtime (Figure 4):

Figure 4: The final step of the Weaviate Docker Compose configurator where “Docker Compose” can be selected as the runtime.

Step 5: Download and further customization

Once your configuration is complete, the configurator shows a snippet similar to the following for downloading a docker-compose.yml file adjusted to your selected configuration.

$ curl -o docker-compose.yml "https://configuration.weaviate.io/v2/docker-compose/docker-compose.yml?<YOUR-CONFIGURATION>"

After downloading the Docker Compose file from the configurator, you can directly start Weaviate on Docker or customize it further.

You can set additional environment variables to further customize your Weaviate setup (e.g., by defining authentication and authorization). Additionally, you can create a multi-node setup with Weaviate by defining a founding member and other members in the cluster.
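
For the authentication piece, a minimal sketch of switching from anonymous access to API-key authentication (the variable names follow Weaviate’s environment-variable documentation; verify them against the docs for your Weaviate version, and note that the key and user values below are placeholders):

    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'false'
      AUTHENTICATION_APIKEY_ENABLED: 'true'
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: 'replace-with-your-secret-key'
      AUTHENTICATION_APIKEY_USERS: 'admin-user'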

Founding member: Set up one node as a “founding” member by configuring CLUSTER_GOSSIP_BIND_PORT and CLUSTER_DATA_BIND_PORT:

weaviate-node-1: # Founding member service name
… # truncated for brevity
environment:
CLUSTER_HOSTNAME: ‘node1′
CLUSTER_GOSSIP_BIND_PORT: ‘7100’
CLUSTER_DATA_BIND_PORT: ‘7101’

Other members in the cluster: For each additional node, configure CLUSTER_GOSSIP_BIND_PORT and CLUSTER_DATA_BIND_PORT, and use the CLUSTER_JOIN variable to join the founding member’s cluster:

weaviate-node-2:
  … # truncated for brevity
  environment:
    CLUSTER_HOSTNAME: 'node2'
    CLUSTER_GOSSIP_BIND_PORT: '7102'
    CLUSTER_DATA_BIND_PORT: '7103'
    CLUSTER_JOIN: 'weaviate-node-1:7100'  # This must be the service name of the "founding" member node.

Optionally, you can set a hostname for each node using CLUSTER_HOSTNAME.

Note that it’s a Weaviate convention to set the CLUSTER_DATA_BIND_PORT to 1 higher than CLUSTER_GOSSIP_BIND_PORT.
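
Once the multi-node stack is up, you can confirm that the nodes have joined the cluster through Weaviate’s nodes endpoint (assuming the first node is published on localhost port 8080):

$ curl http://localhost:8080/v1/nodes

The response lists each node by name along with its status and version.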

How to run Weaviate on Docker

Once you have your Docker Compose file configured to your needs, you can run Weaviate in your Docker environment.

Start Weaviate

Before starting Weaviate on Docker, ensure that the Docker Compose file is named exactly docker-compose.yml and that you are in the same folder as the Docker Compose file.

Then, you can start the whole setup by running:

$ docker compose up -d

The -d option runs containers in detached mode. This means that your terminal will not attach to the log outputs of all the containers.

If you want to attach to the logs of specific containers (e.g., Weaviate), you can run the following command:

$ docker compose up -d && docker compose logs -f weaviate
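
You can also list the state of all services defined in the Compose file:

$ docker compose ps

This shows, for each container, whether it is running and which ports are published.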

Congratulations! Weaviate is now running and is ready to be used.
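
As a quick smoke test, you can query two of Weaviate’s REST endpoints (assuming the default port mapping of 8080):

$ curl -i http://localhost:8080/v1/.well-known/ready   # returns an HTTP 2xx status once Weaviate is ready
$ curl http://localhost:8080/v1/meta                   # returns the version and the enabled modules

If both calls succeed, the instance is reachable and ready to accept data.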

Stop Weaviate

To avoid losing your data, shut down Weaviate with the following command:

$ docker compose down

This will write all the files from memory to disk.
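
Note that docker compose down leaves named volumes (such as weaviate_data from the earlier example) in place, so your data survives the shutdown. Adding the --volumes flag also removes those named volumes, which deletes the stored data, so only use it when you intentionally want a clean slate:

$ docker compose down            # stops and removes the containers, keeps named volumes
$ docker compose down --volumes  # also removes named volumes and therefore the Weaviate data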

Conclusion

This article introduced vector databases and how they can enhance LLM applications. Specifically, we highlighted the open source vector database Weaviate, whose advantages include fast vector search at scale, hybrid search, and integration modules to state-of-the-art ML models from OpenAI, Cohere, Hugging Face, etc.

We also provided a step-by-step guide on how to install Weaviate on Docker using Docker Compose, noting that you can obtain a docker-compose.yml file from the Weaviate Docker Compose configurator, which helps you to customize your Docker Compose file for your specific needs.

Visit our AI/ML page and read the article collection to learn more about how developers are using Docker to accelerate the development of their AI/ML applications.

Learn more

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/