5 Benefits of a Container-First Approach to Software Development

Cargo containers completely transformed the shipping industry and enabled the global commerce experience of today. Similarly, software containers simplify application development and deployment, which helps enable the cloud-native software architecture that powers the modern technology we all rely on. Although you can get benefits from containerizing your applications after the fact, you get the most value when you take a container-first approach. 

In this blog post, we discuss insights from Cracking the Code: Effectively Managing All of Those Applications, highlighting five benefits of embracing a container-first approach to software development.

Consistent and reliable software performance

Inconsistency can be a major roadblock to progress. The all too familiar frustration of “it works on my machine” can delay software delivery and hinder collaboration. But containers bring standardization, ensuring that software performs consistently across the entire development process, regardless of the underlying environment.

Developers and infrastructure engineers also save a lot of time and cognitive energy on configuring and maintaining their environments and workstations. Containers have a small resource footprint, which means your infrastructure can do more with less. And, because each container includes the exact versions of software it needs, you don’t have to worry about conflicting dependencies.
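To make that concrete, here is a minimal sketch of a Dockerfile that pins everything it needs (the image tag, file names, and app entry point are illustrative):

# Pin the base image to an exact version so every build starts the same
FROM python:3.11-slim

WORKDIR /app

# requirements.txt pins exact dependency versions, e.g. flask==2.3.2
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "app.py"]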

Fewer bugs

Bugs are the bane of every software developer’s existence. However, a container-first approach provides environmental parity. This means that the development, staging, and production environments remain consistent, reducing the likelihood of encountering bugs caused by disparities in underlying conditions. With containers, businesses can significantly reduce debugging time and enhance the overall quality of their software, leading to higher user satisfaction and a stronger competitive edge.

Faster developer onboarding

The learning curve for new developers can often be steep, especially when dealing with complex software environments. Containers revolutionize developer onboarding by providing a replica of the exact environment in which an application will be tested and executed, irrespective of the developer’s local operating system or installed libraries. With containers, developers can hit the ground running, accelerating their productivity and contributing to the project’s success from day one.

A more secure supply chain

The Consortium for Information & Software Quality estimates that poor software quality cost the United States economy $2.41 trillion in 2022. Two of the top causes are criminals exploiting vulnerabilities and supply chain problems with third-party software. Containers can help.

Because the Dockerfile is a recipe for creating the container, you can use it to produce a software bill of materials (SBOM). This makes clear what dependencies — including the specific version — go into building the container. Cryptographically signed SBOMs let you verify the provenance of your dependencies, so you can be sure that the upstream library you’re using is the actual one produced by the project.
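For example, with the docker sbom CLI plugin you can generate an SBOM directly from an image (the image name here is illustrative, and format support may vary by plugin version):

# Print a human-readable SBOM for an image
docker sbom myorg/my-app:1.0.0

# Emit SPDX JSON that downstream tooling can consume
docker sbom --format spdx-json myorg/my-app:1.0.0 > sbom.spdx.json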

Using the SBOM, you can also monitor your fleet of containers for known vulnerabilities. When a new vulnerability is discovered, you can quickly tell which of your containers are affected, speeding up your response. Containers also provide isolation, micro-segmentation, and other zero-trust techniques, which reduce your attack surface and limit the impact of exploited vulnerabilities.
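As a quick sketch of what that monitoring can look like from the command line, Docker Scout can list the known CVEs it detects in an image (the image name is illustrative):

# Summarize known vulnerabilities found in an image
docker scout cves myorg/my-app:1.0.0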

Improved productivity for faster time-to-market

The standardization, consistency, and security containers bring directly impact software delivery time. With fewer issues to deal with (bugs, compatibility issues, maintenance, etc.), developers can focus on more meaningful tasks and ultimately deliver solutions to customers faster. All of this helps development teams work more efficiently, collaborate effectively, and deliver higher-quality software.

Learn more

Dive deeper into the world of containers and the benefits of adopting a container-first model in your software development by downloading the full white paper, Cracking the Code: Effectively Managing All of Those Applications.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Docker Desktop 4.22: Resource Saver, Compose ‘include’, and Enhanced RBAC Functionality

Docker Desktop 4.22 is now available. This release includes a new Resource Saver feature that massively reduces idle memory and CPU utilization to ensure efficient use of your machine’s resources. Docker Compose now supports include, which lets you split complex Compose projects into modular sub-Compose files. Role-based access control (RBAC) has also been enhanced with a new Editor role that allows admins to delegate repository management tasks.

Resource Saver 

In 4.22, we have added a new Resource Saver feature for Mac and Windows that detects when Docker Desktop has been idle, with no active containers, for 30 seconds and massively reduces its memory and CPU footprint (WSL has CPU optimizations only at this stage). This optimizes Docker Desktop for your system and helps free up resources on your machine for other tasks. When a container needs resources, they’re quickly allocated on demand.

To see this feature in action, start Docker Desktop and leave it idle for 30 seconds with no containers running. A leaf will appear over the whale icon in your Docker Desktop menu and the sidebar of the Docker Desktop dashboard, indicating that Resource Saver mode is activated.

Figure 1: The Docker Desktop menu and the macOS menu bar show Resource Saver mode running.

Docker Desktop previously introduced the CPU optimizations behind Resource Saver which, at the time of writing, already save up to a staggering 38,500 CPU hours every single day across all Docker Desktop users.

Split complex Compose projects into multiple subprojects with ‘include’

If you’re working with complex applications, use the new include section in your Compose file to split your project into manageable subprojects. Compared to merging files with CLI flags or using extends to share common attributes of a single service from another file, include loads external Compose files as self-contained building blocks, making it easier to collaborate on services across teams and share common dependency configurations within your organization.
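As a minimal sketch (the paths and team boundaries are illustrative), a top-level compose.yaml might pull in subprojects like this:

include:
  - ../database/compose.yaml   # maintained by the infrastructure team
  - ../monitoring/compose.yaml # maintained by the observability team

services:
  webapp:
    depends_on:
      - database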

For more on how you can try out this feature, read “Improve Docker Compose Modularity with `include`” or refer to the documentation.

Figure 2: A compose.yaml file that is using the new ‘include’ section to define subprojects.

Editor role available for organizations

With the addition of the Editor role, admins can provision users to manage repositories without full administrator privileges. Users assigned to the Editor role can:

Create public and private repositories

Pull, push, view, edit, and delete a repository

Update repository description

Assign team permissions to repos

Update scanning settings

Delete tags

Add webhooks

Change repo visibility settings

For further details on roles and permissions, refer to the documentation. 

Organization owners can assign the Editor role to a member of their organization in either Docker Hub or Docker Admin.

Figure 3: The Editor role functionality in Docker Hub.

Conclusion

Upgrade now to explore what’s new in the 4.22 release of Docker Desktop. Have feedback? Leave it on our public GitHub roadmap and let us know what else you’d like to see in upcoming releases.

Learn more

Read Improve Docker Compose Modularity with include.

Get the latest release of Docker Desktop.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Sentiment Analysis and Insights on Cryptocurrencies Using Docker and Containerized AI/ML Models

It is always exciting to learn about and share the ways the Docker community leverages Docker to drive innovation. We are hearing about many interesting AI/ML solutions that use Docker to accelerate development and simplify deployment.

In this blog post, Saul Martin shares how Prometeo.ai leverages Docker to deploy and manage the machine learning models behind its cryptocurrency sentiment analysis, letting its developers focus on innovation rather than infrastructure and deployment management while delivering insights for traders.

The digital assets market, which is famously volatile and swift, necessitates tools that can keep up with its speed and provide real-time insights. At the forefront of these tools is Prometeo.ai, which has harnessed the power of Docker to build a sophisticated, high-frequency sentiment analysis platform. This tool sifts through the torrent of emotions that drive the cryptocurrency market, providing real-time sentiment for the top 100 assets, a valuable resource for hedge funds and financial institutions.

By leveraging Docker’s containerization capabilities, Prometeo.ai deploys and manages complex machine learning models with ease, making it an example of modern, robust, and scalable architecture.

In this blog post, we will delve into how Prometeo.ai is utilizing Docker for its sentiment analysis tool, highlighting the critical aspects of its data collection, machine learning model implementations, storage, and deployment processes. This exploration will give you a clear understanding of how Docker can transform machine learning application deployment, presenting a case study in the form of Prometeo.ai.

Data collection and processing: High-frequency sentiment analysis with Docker

Prometeo.ai’s comprehensive sentiment analysis capability hinges on an agile, near real-time data collection and processing infrastructure. This framework captures, enriches, and publishes an extensive range of sentiment data from diverse platforms:

Stream/data access: Platform-specific data pipelines, hosted in siloed Docker containers, harvest cryptocurrency-related discussions in real time.

Tokenization and sentiment analysis: The harvested data undergoes tokenization, transforming each content piece into a format suitable for analysis. An internal Sentiment Analysis API further enriches this tokenized data, inferring sentiment attributes from the raw information.

Publishing: Enriched sentiment data is published within one minute of collection, facilitating near real-time insights for users. During periods of content unavailability from a data source, the system generates and publishes an empty dictionary.

All these operations transpire within Docker containers, guaranteeing the necessary scalability, isolation, and resource efficiency to manage high-frequency data operations.

For efficient data storage, Prometeo.ai relies on:

NoSQL database: DynamoDB is used for storing minute-level sentiment aggregations. The primary key is defined such that it allows quick access to data based on time-range queries. These aggregations are critical for providing real-time insights to users and for hourly and daily aggregation calculations.

Object storage: For model retraining and data backup purposes, the raw data, including raw content, is exported in batches and stored in Amazon S3 buckets. This robust storage mechanism ensures data durability and aids in maintaining data integrity.

Relational database: Metadata related to different assets, including links, tickers, IDs, descriptions, and others, are hosted in PostgreSQL. This provides a relational structure for asset metadata and promotes accessible, structured access when required.

NLP models

Prometeo.ai makes use of two Bidirectional Encoder Representations from Transformers (BERT) models, both of which operate within a Docker environment for natural language processing (NLP). The following models run multi-label classification pipelines that have been fine-tuned on an in-house dataset of 50k manually labeled tweets.

proSENT model: This model specializes in identifying 28 unique emotional sentiments. It owes its comprehensive language understanding to training on a corpus of more than 1.5 million unique cryptocurrency-related social media posts.

proCRYPT model: This model is fine-tuned for crypto sentiment analysis, classifying sentiments as bullish, bearish, or neutral. The deployment architecture encapsulates both these models within a Docker container alongside a FastAPI server. This internal API acts as the conduit for running inferences.

To ensure a seamless and efficient build process, Hugging Face’s model hub is used to store the models. The models and their binary files are retrieved directly from Hugging Face during the Docker container’s build phase, so the model weights never need to live in the source repository. This keeps the repository lean, streamlines the build process, and contributes to the overall operational efficiency of the application.
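A minimal sketch of that build-time fetch, assuming a transformers-based setup (the model ID is hypothetical):

FROM python:3.11-slim

RUN pip install --no-cache-dir transformers torch

# Download model weights into the image at build time so no download
# happens when the container starts ('prometeo/procrypt' is hypothetical)
RUN python -c "from transformers import AutoTokenizer, AutoModelForSequenceClassification; \
AutoTokenizer.from_pretrained('prometeo/procrypt'); \
AutoModelForSequenceClassification.from_pretrained('prometeo/procrypt')"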

Deployment

Prometeo.ai’s deployment pipeline is composed of GitHub Actions, AWS CodeDeploy, and accelerated computing instances. This pipeline forms a coherent system for efficiently handling application updates and resource allocation:

GitHub Actions: The pipeline begins with GitHub Actions workflows, which automatically trigger a fresh deployment whenever modifications are pushed to the production branch. This ensures the application continually operates on the most recent, vetted code version.

AWS CodeDeploy: The subsequent phase involves AWS CodeDeploy, which is triggered once GitHub Actions have successfully compiled and transferred the Docker image to the Elastic Container Registry (ECR). CodeDeploy is tasked with the automatic deployment of this updated Docker image to the GPU-optimized instances. This robust orchestration ensures smooth rollouts and establishes a reliable rollback plan if necessary.

Accelerated computing: Prometeo leverages NVIDIA Tesla GPUs for the computational prowess needed for executing complex BERT models. These GPU-optimized instances are tailored for NVIDIA-CUDA Docker image compatibility, thereby facilitating GPU acceleration, which significantly expedites the processing and analysis stages.

Below is a snippet demonstrating the configuration to exploit the GPU capabilities of the instances:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          capabilities: [gpu]

Note that the CUDA version of the base image must match the CUDA version reported on the host by nvidia-smi:

FROM nvidia/cuda:12.1.0-base-ubuntu20.04

To maintain optimal performance under fluctuating load conditions, an autoscaling mechanism is incorporated. This solution perpetually monitors CPU utilization, dynamically scaling the number of instances up or down as dictated by the load. This ensures that the application always has access to the appropriate resources for efficient execution.

Conclusion

By harnessing Docker’s containerization capabilities and compatibility with NVIDIA-CUDA images, Prometeo.ai successfully manages intensive, real-time emotion analysis in the digital assets domain. Docker’s role in this strategy is pivotal, providing an environment that enables resource optimization and seamless integration with other services.

Prometeo.ai’s implementation demonstrates Docker’s potential to handle sophisticated computational tasks. The orchestration of Docker with GPU-optimized instances and cloud-based services exhibits a scalable and efficient infrastructure for high-frequency, near-real-time data analysis.

Do you have an interesting use case or story about Docker in your AI/ML workflow? We would love to hear from you and maybe even share your story.

Learn more

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Container Security and Why It Matters

Are you thinking about container security? Maybe you are on a security team trying to manage rogue cloud resources. Perhaps you work in the DevOps space and know what container security is, but you want to figure out how to make security triage of your containers less painful for everyone involved.

In this post, we’ll look at security for containers in a scalable environment, how deployment to that environment can affect your rollout of container security, and how Docker can help.

What is container security?

Container security is knowing that a container image you run in your environment includes only the libraries, base image, and any custom bits you declare in your Dockerfile, and not malware or known vulnerabilities. (We’d also love to say no zero days, but such is the nature of the beast.)

You want to know that those libraries used to build your image and any base image behind it come from sources you expect — open source or otherwise — and are free from critical vulnerabilities, malware, and other surprises. 

The base image is usually a common image (for example, Alpine Linux, Ubuntu, or BusyBox) that is a building block upon which other companies add their own image layers. Think of an image layer as a step in the install process. Whenever you take a base image and add new libraries or steps to it for creation, you are essentially creating a new image.  

We’ve talked about the most immediate piece of container security, the image layers, but how is the image built and what is the source of those image layers?

Container image provenance

Here’s where container security gets tricky: the image build and source tracking process. You want assurances that your images, libraries, and any base images you depend on contain what you expect them to and not anything nefarious. So you should care about image provenance: where an image gets built, who builds it, and where it gets stored. 

You should pay attention to any infrastructure or automation used to build your images, which typically means continuous integration (CI) tooling such as GitHub Actions, AWS CodeBuild, or CircleCI. You need to ensure any workloads running your image builds are on build environments with minimal access and potential attack surfaces. You need to consider who has access to your GitHub Actions runners, for example. Do you need to create a VPN connection from your runner to your cloud account? If so, what are the security protections on that VPN connection? Consider the confidentiality and integrity of your image pipeline carefully.

To put it more directly: Managing container provenance in cloud workloads can make deployments easier, but it can also make it easier to deploy malware at scale if you aren’t careful. The nature of the cloud is that it adds complexity, not necessarily security.

Software Bill of Materials (SBOM) attestations can also help ensure that only what you want is inside your images. With an SBOM, you can review a list of all the libraries and dependencies used to build your image and ensure the versioning and content match what you expect by viewing an SBOM attestation. The Docker CLI provides this with the docker sbom command, and Docker BuildKit can generate SBOM attestations in versions 0.11 and newer.
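For images built with BuildKit, a sketch of generating those attestations at build time (the image name is illustrative; requires BuildKit 0.11 or newer):

# Build and push an image with SBOM and provenance attestations attached
docker buildx build --sbom=true --provenance=true -t myorg/my-app:1.0.0 --push .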

Other considerations with SBOM attestations include attestation provider trust and protection from man-in-the-middle attacks, such as replacing libraries in the image. Docker is working to create signed SBOM attestations for images to create strong assurances around SBOM to help strengthen this part of image security.

You also want to consider software composition analysis (SCA) against your images to ensure open source tooling and licenses are as expected. Docker Official Images, for example, have a certified seal of provenance behind them, which provides assurance around the base image you might be using.

Vulnerability and malware scanning

And what about potential CVEs and malware? How do you scan your images at scale for those issues? 

A number of static scanning tools are available for CVE scanning, and some provide dynamic malware scanning. When researching tools in this space, consider what you use for your image repository, such as Docker Hub, Amazon Elastic Container Registry (ECR), Artifact Registry, or an on-premises/in-colocation option like Nexus. Depending on the dynamics and security controls you have in place on your registry, one tooling option might make more sense than another. For example, AWS ECR offers some static vulnerability scanning out of the box. Some other options bundle software composition analysis (SCA) scanning of images as well. 

The trick is to find a tool with the right signal-to-noise mix for your team. For example, you might want static scanning but minimal false positives and the ability to create exclusions. 

As with any static vulnerability scanning tool, the Common Vulnerability Scoring System (CVSS) score of a vulnerability is just a starting point. Only you and your team can determine the exploitability, possible risks, and attack surface of a particular vulnerability and whether those factors outweigh the potential effects of upgrading or changing an image deployed at scale in your environment.

In other words, a scanning tool might find some high or critical (per CVSS scoring) vulnerabilities in some of your images. Still, those vulnerabilities might not be exploitable because the affected images are only used internally inside a virtual private cloud (VPC) in your environment with no external access. But you’d want to ensure that the image stays internal and isn’t used for production. So guardrails, monitoring, and gating around the use of that image and it staying in internal workloads only is a must. 

Finally, imagine an image that is pervasive and used across all your workloads. The effort to upgrade that image might take several sprint cycles for your engineering teams to safely deploy and require service downtime as you unravel the library dependencies. Regarding vulnerability rating for the two examples — an internal-only image and a pervasive image that is difficult to upgrade — you might want to lower the priority of the vulnerability in the former and slowly track progress toward remediating the latter. 

Docker’s Security Team is intimately familiar with two of the biggest blockers security teams face: time and resources. Your team might not be able to triage and remediate all vulnerabilities across production, development, and staging environments, especially if your team is just starting its journey with container security. So start with what you can and must do something about: production images.

Production vs. non-production

Only container images that have gone through appropriate approval and automation workflows should be deployed in production. Like any mature CI/CD workflow, this means thorough testing in non-production environments and scanning before release to production. It also means monitoring and guardrails around images that are already live in production, with things like cloud resource tagging, version control, and appropriate role-based access control over who can approve an image’s deployment to production.

At its root, this means that security teams that have not previously waded into the infrastructure or DevOps team’s ocean of work in your company’s cloud accounts should do so now. Just as DevOps culture has shifted infrastructure, scaling, and service decisions in the cloud onto developers, the same shift is happening in the security community with DevSecOps culture and security engineering. Container security resides in the middle of this intersection.

Not only does your tool choice matter in terms of best fit for your environment’s landscape, but your ability to collaborate with your infrastructure, engineering, and DevOps teams matters even more. To reiterate: to get a good handle on gating production deployments, with solid automation and monitoring tied to those deployments and resources, security teams must familiarize themselves with this space and get comfortable in this intersection. Good tooling can make all the difference in fostering that culture of collaboration, especially for a security team new to this space.

Container security tools: What to look for

Like any well-thought-out tool selection, sometimes what matters most is not the number of bells and whistles a tool offers but the tool’s fit to your organization’s needs and gaps.

Avoid container security tools that promise to be the silver bullet. Instead, think of tools that will help your team conquer small challenges today and work to build on goals for the larger challenges down the road. (Security folks know that any tool on the market promising to be a silver bullet is just selling something and isn’t a reality with the ever-changing threat landscape.)

In short, container security tools should enable your workflow, build trust, and facilitate cross-team collaboration from engineering to security to DevOps, not become a landscape of noise and overwhelming visuals for your engineers. Here’s where Docker Scout can help.

Docker Scout

Docker engineers have been working on a new product to help increase container security: Docker Scout. Scout gives you the list of discovered vulnerabilities in your container images and offers guidance for remediation in an iterative, small-improvements style. You can compare your scores from one deployment to the next and show improvement, creating a sense of accomplishment for your teams instead of an overwhelming bombardment of vulnerabilities and risk that seems insurmountable.

Docker Scout lets you set target goals for your images and markers for iterative improvement. You can define different goals for production images versus development or staging images so that each environment gets the level of security it needs.
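A sketch of what that iteration can look like from the CLI (image names are illustrative, and exact command availability depends on your Docker Scout version):

# Quick health summary of the latest build
docker scout quickview myorg/my-app:v2

# Compare the new build against the previous deployment
docker scout compare --to myorg/my-app:v1 myorg/my-app:v2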

Conclusion

As with most security problems, there is no silver bullet with container security. The technical, operational, and organizational moving pieces that go into protecting your company’s container images often reside at the boundaries between teams, functions, and responsibilities. This adds complexity to an already complex problem. Rather than further adding to the burdens created by this complexity, you should look for tools that enable your teams to work together and reach a deeper understanding of where goals, risks, and priorities overlap and coexist.

Even more importantly, look for container security solutions that are clear about what they can offer you and extend help in areas where they do not have offerings. 

Whether you are a security team member new to the ocean of DevOps and container security or have been in these security waters for a while, Docker is here to support you and help you get to more stable waters. We are beside you in this ocean, trying to make the space better for ourselves, our customers, and developers who use Docker all over the world.

Learn more

Get the latest release of Docker Desktop.

Try Docker Scout.

Learn about Docker Security.

Generate the SBOM for Docker images.

Learn about SBOM attestations.

Check out Docker Official Images.

Visit Docker Hub.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Memgraph Docker Extension: Empowering Real-Time Analytics with High Performance

Memgraph is an open source, in-memory graph database designed with real-time analytics in mind. Providing a high-performance solution, Memgraph caters to developers and data scientists who require immediate, actionable insights from complex, interconnected data.

What sets Memgraph apart is its high-speed data processing, which makes it significantly faster than other graph databases. This, however, is not achieved at the expense of data integrity or reliability. Memgraph is committed to providing accurate and dependable insights as fast as you need them.

Built entirely on a C++ codebase, Memgraph leverages in-memory storage to handle complex real-time use cases effectively. Support for ACID transactions guarantees data consistency, while the Cypher query language offers a robust toolset for data structuring, manipulation, and exploration. 

Graph databases have a broad spectrum of applications. In domains as varied as cybersecurity, credit card fraud detection, energy management, and network optimization, Memgraph can efficiently analyze and traverse complex network structures and relationships within data. This analytical prowess facilitates real-time, in-depth insights across many industries and areas of study.

In this article, we’ll show how using Memgraph as a Docker Extension offers a powerful and efficient way to leverage real-time analytics from a graph database. 

Architecture of Memgraph

The high-speed performance of Memgraph can be attributed to its unique architecture (Figure 1). Centered around graph models, the database represents data as nodes (entities) and edges (relationships), enabling efficient management of deeply interconnected data essential for a range of modern applications.

In terms of transactions, Memgraph upholds the highest standard. It uses the standardized Cypher query language over the Bolt protocol, facilitating efficient data structuring, manipulation, and exploration.

Figure 1: Components of Memgraph’s architecture.

The key components of Memgraph’s architecture are:

In-memory storage: Memgraph stores data in RAM for low-latency access, ensuring high-speed data retrieval and modifications. This is critical for applications that require real-time insights.

Transaction processing: Memgraph supports ACID (Atomicity, Consistency, Isolation, Durability) transactions, which means it guarantees that all database transactions are processed reliably and in a way that ensures data integrity, including when failures occur.

Query engine: Memgraph uses Cypher, a popular graph query language that’s declarative and expressive, allowing for complex data relationships to be easily retrieved and updated.

Storage engine: While Memgraph primarily operates in memory, it also provides a storage engine that takes care of data durability by persisting data on disk. This ensures that data won’t be lost even if the system crashes or restarts.

High availability and replication: Memgraph’s replication architecture can automatically replicate data across multiple machines, and it supports replication to provide high availability and fault tolerance.

Streaming and integration: Memgraph can connect with various data streams and integrate with different types of data sources, making it a versatile choice for applications that need to process and analyze real-time data.

To provide users with the utmost flexibility and control, Memgraph comprises several key components, each playing a distinct role in delivering seamless performance and user experience:

MemgraphDB — MemgraphDB is the heart of the Memgraph system. It deals with all concurrency problems, consistency issues, and scaling, both in terms of data and algorithm complexity. Using the Cypher query language, MemgraphDB allows users to query data and run algorithms. It also supports both push and pull operations, which means you can query data and run algorithms and get notified when something changes in the data.

Mgconsole — mgconsole is a command-line interface (CLI) used to interact with Memgraph from any terminal or operating system. 

Memgraph Lab — Memgraph Lab is a visual user interface for running queries and visualizing graph data. It provides a more interactive experience, enabling users to apply different graph styles, import predefined datasets, and run example queries. It makes data analysis and visualization more intuitive and user-friendly.

MAGE (Memgraph Advanced Graph Extensions) — MAGE is an open source library of graph algorithms and custom Cypher procedures. It enables high-performance processing of demanding graph algorithms on streaming data. With MAGE, users can run a variety of algorithms, from PageRank or community detection to advanced machine learning techniques using graph embeddings. Moreover, MAGE does not limit users to a specific programming language.

Based on those four components, Memgraph offers four different Docker images:

memgraph-platform — Installs MemgraphDB, mgconsole, Memgraph Lab, and MAGE

memgraph-mage — Installs MemgraphDB, mgconsole, and MAGE

memgraph — Installs MemgraphDB and mgconsole

MAGE × NVIDIA cuGraph — Installs everything that you need to run MAGE in combination with NVIDIA cuGraph

With more than 10K downloads from Docker Hub, Memgraph Platform is the most popular Memgraph Docker image, so the team decided to base the Memgraph Docker extension on it. Instructions are available in the documentation if you want to use any of the other images. Let’s look at how to install Memgraph Docker Extension.
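If you would rather run the same all-in-one image directly with the Docker CLI instead of the extension, here is a sketch based on Memgraph’s published defaults at the time of writing (port 7687 serves Bolt, 3000 serves Memgraph Lab, and 7444 streams logs):

docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 memgraph/memgraph-platform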

Why run Memgraph as a Docker Extension?

Running Memgraph as a Docker Extension offers a streamlined experience to users who are already familiar with Docker Desktop, simplifying the deployment and management of the graph database. Docker provides an ideal environment to bundle, ship, and run Memgraph in a lightweight, isolated setup. This encapsulation not only promotes consistent performance across different systems but also simplifies the setup process.

Moreover, Docker Desktop is the only prerequisite to run Memgraph as an extension. This means that once you have Docker installed, you can easily set up and start using Memgraph, eliminating the need for additional software installations or complex configuration steps.

Getting started

Working with Memgraph as a Docker Extension begins with opening the Docker Desktop (Figure 2). Here are the steps to follow:

Choose Extensions in the left sidebar.

Switch to the Browse tab.

In the Filters drop-down, select the Database category.

Find Memgraph and then select Install. 

Figure 2: Installing Memgraph Docker Extension.

That’s it! Once the installation is finished, select Connect now (Figure 3).

Figure 3: Connecting to Memgraph database using Memgraph Lab.

What you see now is Memgraph Lab, a visual user interface designed for running queries and visualizing graph data. With a range of pre-prepared datasets, Memgraph Lab provides an ideal starting point for exploring Memgraph, gaining proficiency in Cypher querying, and effectively visualizing query results.  

Importing the Pandora Papers dataset

For the purposes of this article, we will import the Pandora Papers dataset. To import the dataset, choose Datasets in the Memgraph Lab sidebar and then Load Dataset (Figure 4).

Figure 4: Importing the Pandora Papers dataset.

Once the dataset is loaded, select Explore Query Collection to access a selection of predefined queries (Figure 5).

Figure 5: Exploring the Pandora Papers dataset query collection.

Choose one of the queries and select Run Query (Figure 6).

Figure 6: Running the Cypher query.

And voilà! Welcome to the world of graphs. You now have the results of your query (Figure 7). Now that you’ve run your first query, feel free to explore other queries in the Query Collection, import a new dataset, or start adding your own data to the database.

Figure 7: Displaying the query result as a graph.

Conclusion

Memgraph, as a Docker Extension, offers an accessible, powerful, and efficient solution for anyone seeking to leverage real-time analytics from a graph database. Its unique architecture, coupled with a streamlined user interface and a high-speed query engine, allows developers and data scientists to extract immediate, actionable insights from complex, interconnected data.

Moreover, with the integration of Docker, the setup and use of Memgraph become remarkably straightforward, further expanding its appeal to both experienced and novice users alike. The best part is the variety of predefined datasets and queries provided by the Memgraph team, which serve as excellent starting points for users new to the platform.

Whether you’re diving into the world of graph databases for the first time or are an experienced data professional, Memgraph’s Docker Extension offers an intuitive and efficient solution. So, go ahead and install it on Docker Desktop and start exploring the intriguing world of graph databases today. If you have any questions about Memgraph, feel free to join Memgraph’s vibrant community on Discord.

Learn more

Install Memgraph’s Docker Extension.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Protecting Secrets with Docker

Modern software is interconnected. When you develop an application, it has to communicate with other services — on your infrastructure, cloud infrastructure services, or third-party applications. Of course, you don’t want just anyone to masquerade as you, so you use secrets like SSH keys or API tokens to make the communication secure. But having these secrets means you have to keep them secret.

Unfortunately, sometimes the secrets escape. When this happens, it can allow bad actors to maliciously use the secrets or post them on the “dark web” for others to use. They can insert vulnerabilities into your code. They can impersonate you or deny legitimate users access to resources. And, if the secret is for something billable (like public cloud infrastructure), they can cost you a lot of money. No matter what other costs you face, the public relations impact can cause your users to lose trust in you.

In this article, we’ll cover a few ways that Docker can help keep secrets from leaking.

Before you read on: If your secrets have been exposed, the first step is to immediately invalidate them and check for compromises.

Controlling access with Docker Hub

The principle of least privilege is a powerful part of your security posture. If someone doesn’t need access to your Docker Hub images, they shouldn’t have access. Docker Hub provides private repositories so that you can keep your images to yourself. Docker Personal subscribers can create one private repository, while Docker Pro, Docker Team, and Docker Business subscriptions offer unlimited private repositories.

Keep in mind that even with private repositories, Docker Hub is not for storing account secrets. Private repositories are a layer in your defense-in-depth model.

Of course, sometimes you want to selectively share your images. Docker Pro, Docker Team, and Docker Business subscribers can add collaborators — accounts that can push or pull images in a private repository. Docker Pro subscribers can add one collaborator to a repository, while Docker Team and Docker Business subscribers can add as many collaborators as their organization has members. This means you can share images with the people who need them — and no one else.

Keeping secrets out

What’s better than protecting the secrets on your Docker image? Not having them in the image in the first place! While there are cases where you need to store a secret in order to make the proper connections, many cases of secret leakage involve secrets that were added accidentally.

The best way to avoid accidentally adding secrets is to use a secret manager, such as AWS Secrets Manager, HashiCorp Vault, or 1Password (which offers CLI options). If you have to keep secrets in a local environment, you can prevent files from accidentally winding up in your image by adding them to the .dockerignore file. For example, if you’re worried about accidentally adding SSH keys to your image, you can include a pattern like *id_rsa*.

This approach works well for secrets in files with predictable names. If you’re always storing your cloud credentials in a file called cloud_key.txt, then you’re well-covered. But you won’t catch cloud_credentials.txt.
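A sketch of what such a .dockerignore might contain (the patterns are illustrative; tailor them to how your team names secret files):

# .dockerignore
*id_rsa*
*.pem
*.key
.env
cloud_key.txt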

You can add another layer of security with secret scanners. Tools like Aqua Security Trivy, Anchore, and JFrog Xray search your files for things that look like secrets. If you run the scanner before pushing your image, then you can catch the secret before it escapes. Many secrets scanners can be tied into a Git commit hook as well to prevent secrets from being included in your code.
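For example, a pre-push scan with Trivy might look like this (the image name is illustrative, and the --scanners flag reflects recent Trivy releases, so check your installed version):

# Scan an image for known vulnerabilities and embedded secrets before pushing
trivy image --scanners vuln,secret myorg/my-app:1.0.0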

Conclusion

Keeping your secrets secret is an ongoing process but worth the effort. Like everything in cybersecurity, there’s no one magic solution, but Docker provides features that you can use to help prevent leaking secrets.

To get the most from private repositories and collaborators, check out our subscription offerings. We’re considering adding secret scanning to Docker Scout. If you’d like to see this capability, upvote the issue on our public roadmap.

Learn more

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Improve Docker Compose Modularity with “include”

This blog post discusses a feature available now in Compose v2.20.0 and in our upcoming Docker Desktop 4.22 release.

The docker command line supports many flags to fine-tune your container, and it’s difficult to remember them all when replicating an environment. Doing so is even harder when your application is not a single container but a combination of many containers with various dependency relationships. Based on this, Docker Compose quickly became a popular tool because it lets users declare all the infrastructure details for a container-based application into a single YAML file using a simple syntax directly inspired by the docker run command line.

Still, an issue persists for large applications using dozens, maybe hundreds, of containers, with ownership distributed across multiple teams. When using a monorepo, teams often have their own “local” Docker Compose file to run a subset of the application, but then they need to rely on other teams to provide a reference Compose file that defines the expected way to run their own subset.

This issue is not new and was debated in 2014 when Docker Compose was a fresh new project and issue numbers had only three digits. Now we’ve introduced the ability to “compose compose files” to address this need — even before this issue reaches its 10th anniversary!

In this article, we’ll show how the new include attribute, introduced in Docker Compose 2.20, makes Compose files more modular and reusable.

Extend a Compose file

Docker Compose lets you reuse an existing Compose file using the extends mechanism. This special configuration attribute lets you refer to another Compose file and select a service you want to also use in your own application, with the ability to override attributes for your own needs.

services:
  database:
    extends:
      file: ../commons/compose.yaml
      service: db

That’s a good solution as long as you only need a single service to be shared, and you know about its internal details so that you know how to tweak its configuration. But it is not acceptable when you want to reuse someone else’s configuration as a black box without knowing about its own dependencies.

Merge Compose files

Another option is to merge a set of Compose files together. Docker Compose accepts a set of files and will merge and override the service definition to eventually create a composite Compose application model.

This approach has been utilized for years, but it comes with a specific challenge: Docker Compose resolves relative paths for the many resources included in the application model, such as the build context for service images, the location of files defining environment variables, and the path to a local directory used in a bind-mounted volume.

With such a constraint, code organization in a monorepo can become difficult. A natural choice would be dedicated folders per team or component, but then the Compose files’ relative paths would no longer resolve correctly.

Let’s play the role of the “database” team and define a Compose file for the service we are responsible for. Next, we build our own image from a Dockerfile and have a reference environment set as an env file:

services:
  database:
    build: .
    env_file:
      - ./db.env

Now, let’s switch to another team and build a web application, which requires access to the database:

services:
  webapp:
    depends_on:
      - database

Sounds good, until we try to combine those, running the following from the webapp directory: docker compose -f compose.yaml -f ../database/compose.yaml.

When we do, the relative paths in the second Compose file are resolved from the local working directory, not from the file’s own location as its authors intended. Thus, the resulting application won’t work as expected.

Reuse content from other teams

The include attribute was introduced for this exact need. As the name suggests, this new top-level attribute gets a whole Compose file included in your own application model, as if you had copied and pasted its content. The only difference is that include manages relative path references so the included Compose file is parsed the way its author expects, running from its original location. This makes it much easier to reuse content from another team without needing to know the exact details.

include:
  - ../database/compose.yaml

services:
  webapp:
    depends_on:
      - database

In this example, an infrastructure team has prepared a Compose file to manage a database service, maybe including some replicas, web UI to inspect data, isolated networks, volumes for data persistence, etc.

An application relying on this service doesn’t need to know about those infrastructure details and will consume the Compose file as a building block it can rely on. Thus, the infrastructure team can refactor its own database component to introduce additional services without the dependent teams being impacted.

This approach also comes with the bonus that the dependency on another Compose file is now explicit, and users don’t need to include additional flags on each Compose command they run. They can instead rely on the beloved docker compose up command without any additional knowledge of the application architecture.

Conclusion

With microservices and monorepos, it has become common for an application to be split into dozens of services, with complexity moving from code into infrastructure and configuration files. Docker Compose fits simple applications well but is harder to use in such a context. At least it was, until now.

With include support, Docker Compose makes it easier to modularize complex applications into sub-Compose files, keeping application configuration simpler and more explicit. It also helps the organization of config files mirror the engineering teams responsible for the code: with each team able to express in configuration how it depends on others’ work, a natural organization for compose.yaml files emerges.

Read more about this new include feature on the dedicated Compose specification page and experiment with it by upgrading Docker Compose to v2.20 or later.

Learn more

Read the include documentation.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Build and Deploy a LangChain-Powered Chat App with Docker and Streamlit

We are happy to have another great AI/ML story to share from our community. In this blog post, MA Raza, Ph.D., provides a guide to building and deploying a LangChain-powered chat app with Docker and Streamlit.

This article reinforces the value that Docker brings to AI/ML projects — the speed and consistency of deployment, the ability to build once and run anywhere, and the time-saving tools available in Docker Desktop that accelerate the overall development workflow.

In this article, we will explore the process of creating a chat app using LangChain, OpenAI API, and Streamlit frameworks. We will demonstrate the use of Docker and Docker Compose for easy deployment of the app on either in-house or cloud servers.

We have created and deployed a demo app (Figure 1) on Streamlit Public Cloud and Google App Engine for a quick preview.

Figure 1: Screenshot of the chat app: LangChain demo.

We’ve developed a GitHub project (Figure 2) that includes comprehensive instructions and the demo chat app that runs on LangChain. We’ve also configured Poetry as the Python environment manager.

Figure 2: Information on the project on GitHub.

Chat app components and technologies

We’ll briefly describe the app components and frameworks utilized to create the template app.

LangChain Python framework

The LangChain framework enables developers to create applications using powerful large language models (LLMs). Our demo chat app is built on a Python-based framework, with the OpenAI model as the default option. However, users have the flexibility to choose any LLM they prefer.

The LangChain framework effortlessly manages input prompts and establishes connections between responses generated from LLM APIs.

OpenAI model

For demonstration purposes, we are using OpenAI API to generate responses upon submission of prompts.

Frontend Streamlit UI

Streamlit is a lightweight and fast way of building and sharing data apps. A simple UI built with the Streamlit framework is used to interact with the chat app.

Deployment with Docker

Docker is useful for developing and deploying apps to any server without worrying about dependencies and environments. After developing the demo app and verifying it runs fine locally, we added Docker support.

FROM python:3.10-slim-bullseye

ENV HOST=0.0.0.0
ENV LISTEN_PORT=8080

EXPOSE 8080

RUN apt-get update && apt-get install -y git

COPY ./requirements.txt /app/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /app/requirements.txt

WORKDIR /app

COPY ./demo_app /app/demo_app
COPY ./.streamlit /app/.streamlit

CMD ["streamlit", "run", "demo_app/main.py", "--server.port", "8080"]

The previous code shows the contents of the Dockerfile used to generate the Docker image of the demo app. To build the image, we use:

docker build -t langchain-chat-app .
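Once the image is built, you can run it locally; this sketch assumes the demo app reads your OpenAI key from an OPENAI_API_KEY environment variable (check the project README for the exact name):

# Run the chat app on http://localhost:8080
docker run -p 8080:8080 -e OPENAI_API_KEY=your-key langchain-chat-app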

Docker optimization for light and fast builds

When deploying apps for enterprise use, we have to be mindful of the resources being utilized and of the compute consumed across the build and deployment life cycle.

We have also addressed how to optimize the Docker build process to keep the image small and to rebuild quickly on every iteration of source changes. Refer to the “Blazing fast Python Docker builds with Poetry” article for details on various Docker optimization tricks.

# The builder image, used to build the virtual environment
FROM python:3.11-buster as builder

RUN apt-get update && apt-get install -y git

RUN pip install poetry==1.4.2

ENV POETRY_NO_INTERACTION=1 \
    POETRY_VIRTUALENVS_IN_PROJECT=1 \
    POETRY_VIRTUALENVS_CREATE=1 \
    POETRY_CACHE_DIR=/tmp/poetry_cache

ENV HOST=0.0.0.0
ENV LISTEN_PORT=8080
EXPOSE 8080

WORKDIR /app

COPY pyproject.toml poetry.lock ./

RUN poetry install --without dev --no-root && rm -rf $POETRY_CACHE_DIR

# The runtime image, used to just run the code provided its virtual environment
FROM python:3.11-slim-buster as runtime

ENV VIRTUAL_ENV=/app/.venv \
    PATH="/app/.venv/bin:$PATH"

COPY --from=builder ${VIRTUAL_ENV} ${VIRTUAL_ENV}

COPY ./demo_app ./demo_app
COPY ./.streamlit ./.streamlit

CMD ["streamlit", "run", "demo_app/main.py", "--server.port", "8080"]

This Dockerfile has two stages. The first, builder, creates a Poetry-managed virtual environment. The second, runtime, runs the application inside the virtual environment built in the first stage, activated by placing the environment’s bin directory on PATH.

Next, we’ll build the Docker image using DOCKER_BUILDKIT, which offers modern tooling to create Docker images quickly and securely.

DOCKER_BUILDKIT=1 docker build --target=runtime . -t langchain-chat-app:latest

The docker-compose.yml file

To run the app, we have also included the docker-compose.yml with the following contents:

version: '3'
services:
  langchain-chat-app:
    image: langchain-chat-app:latest
    build: ./app
    command: streamlit run demo_app/main.py --server.port 8080
    volumes:
      - ./demo_app/:/app/demo_app
    ports:
      - 8080:8080

To run the app on a local server, use the following command:

docker-compose up

Infrastructures

With support for Docker, the app can be deployed to any cloud infrastructure by following basic guides. We deployed the app on the following infrastructures.

Streamlit Public Cloud

Deploying a Streamlit app on Streamlit’s public cloud is straightforward with a GitHub account and repository. The deployed app can be accessed as the LangChain demo.

Google App Engine

We have also deployed the app on Google App Engine using Docker. The repo includes an app.yaml configuration file with the following contents:

# With Dockerfile
runtime: custom
env: flex

# This sample incurs costs to run on the App Engine flexible environment.
# The settings below are to reduce costs during testing and are not appropriate
# for production use. For more information, see:
# https://cloud.google.com/appengine/docs/flexible/python/configuring-your-app-with-app-yaml

manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10

To deploy the chat app on Google App Engine, we used the following commands after installing the gcloud SDK:

gcloud app create --project=[YOUR_PROJECT_ID]
gcloud config set project [YOUR_PROJECT_ID]
gcloud app deploy app.yaml

A sample app deployed on Google App Engine can be seen in Figure 3.

Figure 3: Link to a sample app deployed on Google App Engine.

Deploying the app using Google Cloud Run

We can also deploy the app on Google Cloud using the Cloud Run Service of GCP. Deploying an app using Cloud Run is faster than Google App Engine.

Here are the relevant features of adopting this method:

Package the application in a container.

Push the container to Artifact Registry.

Deploy the service from the pushed container.

Let’s go through the steps being followed to deploy the app using Google Cloud Run. We assume a project is created on Google Cloud already.

1. Enable the services: You can enable the required services using the gcloud SDK:

gcloud services enable cloudbuild.googleapis.com
gcloud services enable run.googleapis.com

2. Create and add roles to a service account: With the following set of commands, we create a service account and set appropriate permissions. Modify the SERVICE_ACCOUNT and PROJECT_ID values to match your setup:

gcloud iam service-accounts create langchain-app-cr \
    --display-name="langchain-app-cr"

gcloud projects add-iam-policy-binding langchain-chat \
    --member="serviceAccount:langchain-app-cr@langchain-chat.iam.gserviceaccount.com" \
    --role="roles/run.invoker"

gcloud projects add-iam-policy-binding langchain-chat \
    --member="serviceAccount:langchain-app-cr@langchain-chat.iam.gserviceaccount.com" \
    --role="roles/serviceusage.serviceUsageConsumer"

gcloud projects add-iam-policy-binding langchain-chat \
    --member="serviceAccount:langchain-app-cr@langchain-chat.iam.gserviceaccount.com" \
    --role="roles/run.admin"

3. Generate and push the Docker image: Using the following commands, we can generate the image and push it to Artifact Registry. If this is the first time, however, we need to create the repository and configure Docker authentication first:

DOCKER_BUILDKIT=1 docker build --target=runtime . -t australia-southeast1-docker.pkg.dev/langchain-chat/app/langchain-chat-app:latest
docker push australia-southeast1-docker.pkg.dev/langchain-chat/app/langchain-chat-app:latest

Here are the required commands to create the Artifact Registry repository and configure authentication:

gcloud auth configure-docker australia-southeast1-docker.pkg.dev

gcloud artifacts repositories create app \
    --repository-format=docker \
    --location=australia-southeast1 \
    --description="A LangChain Streamlit App" \
    --async

With the image in Artifact Registry, the final step is to deploy the Cloud Run service from it.
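As a minimal sketch, deploying the service from the pushed image typically looks like this (the service name, region, and public-access flag are assumptions to adapt to your project):

gcloud run deploy langchain-chat-app \
    --image=australia-southeast1-docker.pkg.dev/langchain-chat/app/langchain-chat-app:latest \
    --region=australia-southeast1 \
    --allow-unauthenticated

Once the command completes, Cloud Run prints the URL where the chat app is reachable.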

Conclusion

This article delves into the various tools and technologies required for developing and deploying a chat app powered by LangChain, the OpenAI API, and Streamlit, with Docker used to package and deploy it.

A demonstration of the application is available on both Streamlit Public Cloud and Google App Engine. Thanks to Docker support, developers can deploy it on any cloud platform they prefer.

This project can serve as a foundational template for rapidly developing apps that utilize the capabilities of LLMs. You can find more of Raza’s projects on his GitHub page.

Do you have an interesting use case or story about Docker in your AI/ML workflow? We would love to hear from you and maybe even share your story.

This post was originally published on Level Up Coding and is reprinted with permission.

Learn more

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Supercharging AI/ML Development with JupyterLab and Docker

JupyterLab is an open source application built around the concept of a computational notebook document. It supports sharing and executing code, data processing, and visualization, and offers a range of interactive features for creating graphs.

The latest version, JupyterLab 4.0, was released in early June. Compared to its predecessors, this version features a faster Web UI, improved editor performance, a new Extension Manager, and real-time collaboration.

If you have already installed the standalone 3.x version, evaluating the new features would require upgrading your current environment, which can be labor-intensive and risky. However, wherever Docker runs, such as with Docker Desktop, you can start an isolated JupyterLab 4.0 in a container, access it on a different port, and leave your existing JupyterLab installation untouched.

In this article, we show how to quickly evaluate the new features of JupyterLab 4.0 using Jupyter Docker Stacks on Docker Desktop, without affecting the host PC side.

Why containerize JupyterLab?

Users have downloaded the base image of the JupyterLab Notebook stack Docker Official Image more than 10 million times from Docker Hub. What’s driving this significant download rate? There’s an ever-increasing demand for Docker containers to streamline development workflows, while allowing JupyterLab developers to innovate with their choice of project-tailored tools, application stacks, and deployment environments. Our JupyterLab notebook stack official image also supports both AMD64 and Arm64/v8 platforms.

Containerizing the JupyterLab environment offers numerous benefits, including the following:

Containerization ensures that your JupyterLab environment remains consistent across different deployments. Whether you’re running JupyterLab on your local machine, in a development environment, or in a production cluster, using the same container image guarantees a consistent setup. This approach helps eliminate compatibility issues and ensures that your notebooks behave the same way across different environments.

Packaging JupyterLab in a container allows you to easily share your notebook environment with others, regardless of their operating system or setup. This eliminates the need for manually installing dependencies and configuring the environment, making it easier to collaborate and share reproducible research or workflows. And this is particularly helpful in AI/ML projects, where reproducibility is crucial.

Containers enable scalability, allowing you to scale your JupyterLab environment based on the workload requirements. You can easily spin up multiple containers running JupyterLab instances, distribute the workload, and take advantage of container orchestration platforms like Kubernetes for efficient resource management. This becomes increasingly important in AI/ML development, where resource-intensive tasks are common.
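As a quick illustration of that last point, here is a minimal sketch that runs two isolated JupyterLab instances side by side using the jupyter/base-notebook image (introduced below), simply by mapping different host ports:

# Two independent JupyterLab containers, one per host port
# (use docker logs CONTAINER_ID to retrieve each access token)
docker container run -d -p 10000:8888 jupyter/base-notebook
docker container run -d -p 10001:8888 jupyter/base-notebook

Each container gets its own isolated environment, so experiments in one cannot break the other.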

Getting started

To use JupyterLab on your computer, one option is to use the JupyterLab Desktop application. It’s based on Electron, so it operates with a GUI on Windows, macOS, and Linux. Indeed, using JupyterLab Desktop makes the installation process fairly simple. In a Windows environment, however, you’ll also need to set up the Python language separately, and, to extend the capabilities, you’ll need to use pip to set up packages.

Although such a desktop solution may be simpler than building from scratch, we think the combination of Docker Desktop and Jupyter Docker Stacks is still the more straightforward option. With JupyterLab Desktop, you cannot mix multiple versions or easily delete them after evaluation. Above all, it does not provide a consistent user experience across Windows, macOS, and Linux.

On a Windows command prompt, execute the following command to launch a basic notebook: 

docker container run -it --rm -p 10000:8888 jupyter/base-notebook

This command utilizes the jupyter/base-notebook Docker image, maps the host’s port 10000 to the container’s port 8888, and enables command input and a pseudo-terminal. Additionally, an option is added to delete the container once the process is completed.

After waiting for the Docker image to download, access and token information will be displayed on the command prompt. To open JupyterLab in the browser, rewrite the displayed URL http://127.0.0.1:8888 to http://127.0.0.1:10000, keeping the token at the end of the URL. In this example, the output looks like this:

Displayed on screen: http://127.0.0.1:8888/lab?token=6e302b2c99d56f1562e082091f4a3236051fb0a4135e10bb

To be entered in the browser address: http://127.0.0.1:10000/lab?token=6e302b2c99d56f1562e082091f4a3236051fb0a4135e10bb

Note that this token is specific to my environment, so copying it will not work for you. You should replace it with the one actually displayed on your command prompt.

Then, after waiting for a short while, JupyterLab will launch (Figure 1). From here, you can start a Notebook, access Python’s console environment, or utilize other work environments.

Figure 1. The page after entering the JupyterLab token. The left side is a file list, and the right side allows you to open Notebook creation, Python console, etc.

The port 10000 on the host side is mapped to port 8888 inside the container, as shown in Figure 2.

Figure 2. The host port 10000 is mapped to port 8888 inside the container.

In the Password or token input form on the screen, enter the token displayed in the command line or in the container logs (the string following token=), and select Log in, as shown in Figure 3.

Figure 3. Enter the token that appears in the container logs.

By the way, in this environment, the data will be erased when the container is stopped. If you want to reuse your data even after stopping the container, create a volume by adding the -v option when launching the Docker container.

To stop this container environment, press Ctrl-C on the command prompt, then respond to the Jupyter server’s prompt Shutdown this Jupyter server (y/[n])? with y and press Enter. If you are using Docker Desktop, stop the target container from the Containers view.

Shutdown this Jupyter server (y/[n])? y
[C 2023-06-26 01:39:52.997 ServerApp] Shutdown confirmed
[I 2023-06-26 01:39:52.998 ServerApp] Shutting down 5 extensions
[I 2023-06-26 01:39:52.998 ServerApp] Shutting down 1 kernel
[I 2023-06-26 01:39:52.998 ServerApp] Kernel shutdown: 653f7c27-03ff-4604-a06c-2cb4630c098d

Once the display changes as shown above, the container is terminated and, because of the --rm option, its data is deleted.

When the container is running, data is saved in the /home/jovyan/work/ directory inside the container. You can either bind mount this as a volume or allocate it as a volume when starting the container. By doing so, even if you stop the container, you can use the same data again when you restart the container:

docker container run -it -p 10000:8888 \
-v "%cd%":/home/jovyan/work \
jupyter/base-notebook

Note: The \ symbol signifies that the command line continues on the command prompt. You may also write the command in a single line without using the symbol. However, in the case of the Windows command prompt, you need to use the ^ symbol instead.

With this setup, the JupyterLab container mounts the folder where the docker container run command was executed onto the /home/jovyan/work/ directory inside the container. Because the data persists even when the container is stopped, you can continue using your Notebook data as it is when you start the container again.
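The alternative mentioned above, allocating a volume rather than bind mounting a folder, looks like the following sketch; the volume name jupyter-data is our own choice:

docker volume create jupyter-data
docker container run -it -p 10000:8888 \
-v jupyter-data:/home/jovyan/work \
jupyter/base-notebook

Docker manages where the volume’s data is stored, so this variant behaves the same no matter which folder you launch the container from.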

Plotting using the famous Iris flower dataset

In the following example, we’ll use the Iris flower dataset, which consists of 150 records in total, with 50 samples from each of three types of Iris flowers (Iris setosa, Iris virginica, Iris versicolor). Each record consists of four numerical attributes (sepal length, sepal width, petal length, petal width) and one categorical attribute (type of iris). This data is included in the Python library scikit-learn, and we will use matplotlib to plot this data.
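For a concrete picture, here is a minimal Python sketch in the spirit of the scikit-learn example (not the exact code from that page) that scatter-plots two of the four attributes, colored by species. Note that running it in the plain base-notebook container produces exactly the missing-module errors described next:

# Scatter-plot sepal length vs. sepal width, colored by iris species
import matplotlib.pyplot as plt
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target  # (150, 4) attribute matrix and 0/1/2 species labels

plt.scatter(X[:, 0], X[:, 1], c=y)
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])
plt.title("Iris dataset")
plt.show()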

When trying to input the sample code from the scikit-learn page (the code is at the bottom of the page, and you can copy and paste it) into IPython, the following error occurs (Figure 4).

Figure 4. Error message caused by the missing “matplotlib” module.

This is an IPython error message stating that the “matplotlib” module is not installed. The “scikit-learn” module is also needed.

To avoid these errors and enable plotting, run the following command. Here, !pip signifies running the pip command within the IPython environment:

!pip install matplotlib scikit-learn

By pasting the earlier sample code into the next cell in IPython and executing it, you can plot and display the Iris dataset as shown in Figure 5.

Figure 5. When the sample code runs successfully, two images will be output.

Note that it can be cumbersome to use the !pip command to add modules every time. Fortunately, you can also add modules in the following ways:

By creating a dedicated Dockerfile

By using an existing group of images called Jupyter Docker Stacks

Building a Docker image

If you’re familiar with Dockerfile and building images, this five-step method is easy. Also, this approach can help keep the Docker image size in check. 

Step 1. Creating a directory

To build a Docker image, the first step is to create and navigate to the directory where you’ll place your Dockerfile and context:

mkdir myjupyter && cd myjupyter

Step 2. Creating a requirements.txt file

Create a requirements.txt file and list the Python modules you want to add with the pip command:

matplotlib
scikit-learn

Step 3. Writing a Dockerfile

FROM jupyter/base-notebook
COPY ./requirements.txt /home/jovyan/work/requirements.txt
RUN python -m pip install --no-cache -r /home/jovyan/work/requirements.txt

This Dockerfile specifies a base image jupyter/base-notebook, copies the requirements.txt file from the local directory to the /home/jovyan/work directory inside the container, and then runs a pip install command to install the Python packages listed in the requirements.txt file.

Step 4. Building the Docker image

docker image build -t myjupyter .

Step 5. Launching the container

docker container run -it -p 10000:8888 \
-v "%cd%":/home/jovyan/work \
myjupyter

Here’s what each part of this command does:

The docker run command instructs Docker to run a container.

The -it option attaches an interactive terminal to the container.

The -p 10000:8888 maps port 10000 on the host machine to port 8888 inside the container. This allows you to access Jupyter Notebook running in the container via http://localhost:10000 in your web browser.

The -v "%cd%":/home/jovyan/work mounts the current directory (%cd%) on the host machine to the /home/jovyan/work directory inside the container. This enables sharing files between the host and the Jupyter Notebook.

In this example, myjupyter is the name of the Docker image you want to run. Make sure you have the appropriate image available on your system. The operation after startup is the same as before. You don’t need to add libraries with the !pip command because the necessary libraries are included from the start.

How to use Jupyter Docker Stacks’ images

To run the JupyterLab environment, we will use a Docker image called jupyter/scipy-notebook from the Jupyter Docker Stacks. First, stop the Notebook that is currently running: press Ctrl-C on the command prompt and respond with y to the shutdown prompt.

Then, enter the following to run a new container:

docker container run -it -p 10000:8888 \
-v "%cd%":/home/jovyan/work \
jupyter/scipy-notebook

This command will run a container using the jupyter/scipy-notebook image, which provides a Jupyter Notebook environment with additional scientific libraries.

Here’s a breakdown of the command:

The docker run command starts a new container.

The -it option attaches an interactive terminal to the container.

The -p 10000:8888 maps port 10000 on the host machine to port 8888 inside the container, allowing access to Jupyter Notebook at http://localhost:10000.

The -v "%cd%":/home/jovyan/work mounts the current directory (%cd% on the Windows command prompt; use "$(pwd)" on macOS or Linux) on the host machine to the /home/jovyan/work directory inside the container. This enables sharing files between the host and the Jupyter Notebook.

The jupyter/scipy-notebook is the name of the Docker image used for the container. Make sure you have this image available on your system.

The previous JupyterLab image was a minimal Notebook environment. The image we are using this time includes many packages used in the scientific field, such as numpy and pandas, so downloading it may take some time; the image is close to 4 GB in size.

Once the container is running, you should be able to run the Iris dataset sample immediately without having to execute pip like before. Give it a try.

Some images include TensorFlow’s deep learning library, and there are variants for the R language, the Julia programming language, and Apache Spark. See the image list page for details.

In a Windows environment, you can easily run and evaluate the new version of JupyterLab 4.0 using Docker Desktop. Doing so will not affect or conflict with the existing Python language environment. Furthermore, this setup provides a consistent user experience across other platforms, such as macOS and Linux, making it the ideal solution for those who want to try it.

Conclusion

By containerizing JupyterLab with Docker, AI/ML developers gain numerous advantages, including consistency, easy sharing and collaboration, and scalability. It enables efficient management of AI/ML development workflows, making it easier to experiment, collaborate, and reproduce results across different environments. With JupyterLab 4.0 and Docker, the possibilities for supercharging your AI/ML development are limitless. So why wait? Embrace containerization and experience the true power of JupyterLab in your AI/ML projects.

References

JupyterLab documentation

Docker Stacks documentation 

Learn more

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

4 Reasons I’m Excited to Attend DockerCon 2023

DockerCon is back! DockerCon is back! DockerCon is back! I can’t tell you how excited this makes me. Now, you may recognize that I work at Docker, but I haven’t always. Although I’ve worked at Docker for about 18 months, I’ve been in the Docker space for a long time. In fact, my first presentation about Docker was all the way back in December 2015.

Since then, I’ve helped organize, run, and speak at many meetups, and I was recognized as a Docker Captain in March 2017. I even received the inaugural Community Leader of the Year award for the North America region in 2018. As I look back throughout my career, many of my fondest memories can be attributed to my time at DockerCon. This will be my sixth in-person DockerCon, and here are four reasons I’m happy to be back in person this year.

Let’s go!

Michael Irwin at DockerCon EU 2018 in Barcelona.

#1 — Developer-focused content

We’ve all been to many “developer-focused” conferences, only to find out most of the sessions are sponsored, the keynotes are relatively boring, and there really isn’t much focus on developers. I remember going to DockerCons and learning about Docker’s latest features, scaling what I learned to my team and across the organization, deepening my understanding of various cloud-native design patterns and architectures, and helping my team be as productive as possible. Especially earlier in my career, this experience helped me become the developer I am today.

As I’m helping plan DockerCon this year, I’ll admit we want all of the same things from the past, just updated. We want to help each and every developer better their craft and better deliver results for their customers… whoever they might be.

A selfie before my “Containers for Beginners” talk at DockerCon 2019 in San Francisco.

#2 — The hallway track

Honestly, this is probably one of my favorite parts of DockerCon. The Hallway Track is a special track of DockerCon in which attendees can network and learn from each other. If you want to learn about something, simply make a request! If you want to teach others, submit a session! Then, small groups get together and just chat. These hallway moments have truly been some of the best moments of DockerCon, both for learning and for teaching. There’s simply no better way to learn than from others who have walked the same journey.

The hallway track offers many chances to learn and connect.

#3 — Reconnecting with and making new friends

During my time as a Docker Captain from 2017-2022 (I had to semi-retire when I joined Docker), DockerCon was such a fun time to get together and spend time with my fellow Captains. In many ways, this felt like a family reunion. We learned together, taught each other, and provided insight and direction to the Docker product and executive teams. 

Although connecting with old friends was great, I always made new friends every year. Many of those came from the Hallway Track, but random conversations at meals, the conference party, and other one-offs have provided me with friendships and contacts I still use to this day. Whenever I’m stuck with any problem, there’s a good chance I can reach out to someone that I met at DockerCon.

Docker Captains gathered at DockerCon EU 2017 in Copenhagen.

Group selfie taken during a pre-conference bike ride at DockerCon 2019 in San Francisco.

#4 — Fun all around!

I may or may not be known for roaming around the DockerCon EU 2017 vendor hall in an inflatable dinosaur suit or using that same suit to start my “Containers for Beginners” talk at DockerCon 2019. Why? To be completely honest, because it’s fun! And while a conference isn’t only about having fun, it’s certainly a lot easier to be a part of a community when you’re doing so. DockerCon is not afraid to have a little bit of fun.

Me wearing a dino suit at the Docker booth at DockerCon EU 2017 in Copenhagen.

These are some of the reasons I’m excited to have DockerCon back in person this year, and I’m sure there are tons more! We’d love to hear what makes you excited. Tweet with #DockerCon to tell us why you’re excited, and we just might highlight you.

Learn more at the DockerCon 2023 website and register by August 31 to take advantage of early bird pricing. 

Learn more

Register for DockerCon 2023.

Get the latest release of Docker Desktop.

Vote on what’s next! Check out our public roadmap.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/