Using Docker AI Tools for Devs to Provide Context for Better Code Fixes

This ongoing Docker Labs GenAI series explores the exciting space of AI developer tools. At Docker, we believe there is a vast scope to explore, openly and without the hype. We will share our explorations and collaborate with the developer community in real time. Although developers have adopted autocomplete tooling like GitHub Copilot and use chat, there is significant potential for AI tools to assist with more specific tasks and interfaces throughout the entire software lifecycle. Therefore, our exploration will be broad. We will be releasing software as open source so you can play, explore, and hack with us, too.

At Docker Labs, we’ve been exploring how LLMs can connect different parts of the developer workflow, bridging gaps between tools and processes. A key insight is that LLMs excel at fixing code issues when they have the right context. To provide this context, we’ve developed a process that maps out the codebase using linting violations and the structure of top-level code blocks. 

By combining these elements, we teach the LLM to construct a comprehensive view of the code, enabling it to fix issues more effectively. Containerization makes integrating these tools much simpler.

Previously, my linting process felt a bit disjointed. I’d introduce an error, run Pylint, and receive a message that was sometimes cryptic, forcing me to consult Pylint’s manual to understand the issue. When OpenAI released ChatGPT, the process improved slightly. I could run Pylint, and if I didn’t grasp an error message, I’d copy the code and the violation into GPT to get a better explanation. Sometimes, I’d ask it to fix the code and then manually paste the solution back into my editor.

However, this approach still required several manual steps: copying code, switching between applications, and integrating fixes. How might we improve this process?

Docker’s AI Tools for Devs prompt runner is an architecture that allows us to integrate tools like Pylint directly into the LLM’s workflow through containerization. By containerizing Pylint and creating prompts that the LLM can use to interact with it, we’ve developed a system where the LLM can access the necessary tools and context to help fix code issues effectively.

Understanding the cognitive architecture

For the LLM to assist effectively, it needs a structured way of accessing and processing information. In our setup, the LLM uses the Docker prompt runner to interact with containerized tools and the codebase. The project context is extracted using tools such as Pylint and Tree-sitter that run against the project. This context is then stored and managed, allowing the LLM to access it when needed.

By having access to the codebase, linting tools, and the context of previous prompts, the LLM can understand where problems are, what they are, and have the right code fragments to fix them. This setup replaces the manual process of finding issues and feeding them to the LLM with something automatic and more engaging.

Streamlining the workflow

Now, within my workflow, I can ask the assistant about code quality and violations directly. The assistant, powered by an LLM, has immediate access to a containerized Pylint tool and a database of my code through the Docker prompt runner. This integration allows the LLM to use tools to assist me directly during development, making the programming experience more efficient.

This approach helps us rethink how we interact with our tools. By enabling a conversational interface with tools that map code to issues, we’re exploring possibilities for a more intuitive development experience. Instead of manually finding problems and feeding them to an AI, we can turn the tools themselves into conversational partners that automatically detect issues, understand the context, and provide solutions.

Walking through the prompts

Our project is structured around a series of prompts that guide the LLM through the tasks it needs to perform. These prompts are stored in a Git repository, where they can be versioned, tracked, and shared, and they form the backbone of the project. Each prompt corresponds to a specific task in the workflow, and Docker containers ensure a consistent environment for running the tools and scripts the prompts rely on.

Workflow steps

An immediate and existential challenge we encountered was that this class of problem can easily overwhelm the LLM’s context window. Want to read a source code file? It has to be small enough to fit in context. Need to work on more than one file? Your realistic limit is three to four files at once. To solve this, we can instruct the LLM to automate its own workflow with tools, where each step runs in a Docker container.

Each step in this workflow runs in a Docker container, which ensures a consistent and isolated environment for running tools and scripts. The first four steps prepare the agent to extract the right context for fixing violations; once the agent has that context, the LLM can effectively fix the code issues in step 5.

1. Generate a violations report using Pylint:

Run Pylint to produce a violation report.
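As a rough sketch of this step (assuming Pylint is installed in the container and using the example file path that appears later in this post), the report can be captured as JSON:

import json
import subprocess

# Run Pylint with machine-readable output; the target path is illustrative.
result = subprocess.run(
    ["pylint", "cloned_repo/naming_conventions/block_listed_name.py",
     "--output-format=json"],
    capture_output=True,
    text=True,
)
violations = json.loads(result.stdout)  # a list of violation records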

2. Create a SQLite database:

Set up the database schema to store violation data and code snippets.
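The actual schema is defined by the project’s prompts; the sketch below only illustrates the idea, with made-up table and column names. It stores violations separately from the line ranges they refer to, which is what step 3 relies on:

import sqlite3

conn = sqlite3.connect("violations.db")  # illustrative database name
conn.executescript("""
CREATE TABLE IF NOT EXISTS violations (
    id         INTEGER PRIMARY KEY,
    path       TEXT,
    symbol     TEXT,
    message    TEXT,
    message_id TEXT
);
CREATE TABLE IF NOT EXISTS ranges (
    id           INTEGER PRIMARY KEY,
    violation_id INTEGER REFERENCES violations(id),
    start_line   INTEGER,
    end_line     INTEGER
);
""")
conn.commit()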

3. Generate and run INSERT statements:

Decouple violations from the range they represent.

Use a script to convert every violation and range from the report into SQL insert statements.

Run the statements against the database to populate it with the necessary data.
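Continuing the same illustrative sketch, the violations parsed from Pylint’s JSON report in step 1 can be written into those two tables:

# 'violations' is the list parsed from the Pylint JSON report in step 1.
for v in violations:
    cur = conn.execute(
        "INSERT INTO violations (path, symbol, message, message_id) VALUES (?, ?, ?, ?)",
        (v["path"], v["symbol"], v["message"], v["message-id"]),
    )
    conn.execute(
        "INSERT INTO ranges (violation_id, start_line, end_line) VALUES (?, ?, ?)",
        (cur.lastrowid, v["line"], v.get("endLine") or v["line"]),
    )
conn.commit()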

4. Index code in the database:

Generate an abstract syntax tree (AST) of the project with Tree-sitter (Figure 1).

Figure 1: Generating an abstract syntax tree.

Find all second-level nodes (Figure 2). In Python’s grammar, second-level nodes are statements inside of a module.

Figure 2: Extracting content for the database.

Index these top-level ranges into the database.

Populate a new table to store the source code at these top-level ranges.
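To make the indexing step concrete, here is a small stand-in sketch that uses Python’s built-in ast module in place of Tree-sitter. The real workflow parses with Tree-sitter inside a container, but what it stores is the same: each module-level statement’s line range and source text. The code_blocks table name is, again, illustrative, and end_lineno requires Python 3.8 or later:

import ast

path = "cloned_repo/naming_conventions/block_listed_name.py"
source = open(path).read()
lines = source.splitlines()

conn.execute("""
CREATE TABLE IF NOT EXISTS code_blocks (
    path       TEXT,
    start_line INTEGER,
    end_line   INTEGER,
    source     TEXT
)""")

# Module-level statements play the role of the second-level Tree-sitter nodes.
for node in ast.parse(source).body:
    snippet = "\n".join(lines[node.lineno - 1 : node.end_lineno])
    conn.execute(
        "INSERT INTO code_blocks (path, start_line, end_line, source) VALUES (?, ?, ?, ?)",
        (path, node.lineno, node.end_lineno, snippet),
    )
conn.commit()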

5. Fix violations based on context:

Once the agent has gathered and indexed the necessary context, use prompts to instruct the LLM to query the database and fix the code issues (Figure 3).

Figure 3: Instructions for fixing violations.

Each step from 1 to 4 builds the foundation for step 5, where the LLM, with the proper context, can effectively fix violations. The structured preparation ensures that the LLM has all the information it needs to address code issues with precision.
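With that index in place, the query the LLM is prompted to write in step 5 boils down to joining each violation to the top-level block that encloses it. A hypothetical query against the illustrative schema above:

# Pair every violation with the enclosing top-level code block.
rows = conn.execute("""
    SELECT v.message, b.source
    FROM violations AS v
    JOIN ranges AS r ON r.violation_id = v.id
    JOIN code_blocks AS b
      ON b.path = v.path
     AND b.start_line <= r.start_line
     AND b.end_line   >= r.end_line
""").fetchall()

Each row gives the LLM a violation message alongside enough surrounding code to propose a fix.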

Refining the context for LLM fixes

To understand how our system improves code fixes, let’s consider a specific violation flagged by Pylint. Say we receive a message that there’s a violation on line 60 of our code file block_listed_name.py:

{
"type": "convention",
"module": "block_listed_name",
"obj": "do_front",
"line": 60,
"column": 4,
"endLine": 60,
"endColumn": 7,
"path": "cloned_repo/naming_conventions/block_listed_name.py",
"symbol": "disallowed-name",
"message": "Disallowed name "foo"",
"message-id": "C0104"
}

From this Pylint violation, we know that the variable foo is a disallowed name. However, if we tried to ask the LLM to fix this issue based solely on this snippet of information, the response wouldn’t be as effective. Why? The LLM lacks the surrounding context — the full picture of the function in which this violation occurs.

This is where indexing the codebase becomes essential.

Because we’ve mapped out the codebase, we can now ask the LLM to query the index and retrieve the surrounding code that includes the do_front function. The LLM can even generate the SQL query for us, thanks to its knowledge of the database schema. Once we’ve retrieved the full function definition, the LLM can work with a more complete view of the problem:

def do_front(front_filename, back_filename):
    """
    Front strategy: loop over front image,
    detect blue pixels there,
    substitute in pixels from back.
    Return changed front image.
    """
    foo = SimpleImage(front_filename)
    back = SimpleImage(back_filename)
    for y in range(foo.height):
        for x in range(foo.width):
            pixel = foo.get_pixel(x, y)
            # Detect blue pixels in front and replace with back pixels
            if pixel[2] > 2 * max(pixel[0], pixel[1]):
                back_pixel = back.get_pixel(x, y)
                foo.set_pixel(x, y, back_pixel)
    return foo

Now that the LLM can see the whole function, it’s able to propose a more meaningful fix. Here’s what it suggests after querying the indexed codebase and running the fix:

def do_front(front_filename, back_filename):
    """
    Front strategy: loop over front image,
    detect blue pixels there,
    substitute in pixels from back.
    Return changed front image.
    """
    front_image = SimpleImage(front_filename)
    back_image = SimpleImage(back_filename)
    for y in range(front_image.height):
        for x in range(front_image.width):
            pixel = front_image.get_pixel(x, y)
            # Detect blue pixels in front and replace with back pixels
            if pixel[2] > 2 * max(pixel[0], pixel[1]):
                back_pixel = back_image.get_pixel(x, y)
                front_image.set_pixel(x, y, back_pixel)
    return front_image

Here, the variable foo has been replaced with the more descriptive front_image, making the code more readable and understandable. The key step was providing the LLM with the correct level of detail — the top-level range — instead of just a single line or violation message. With the right context, the LLM’s ability to fix code becomes much more effective, which ultimately streamlines the development process.

Remember, all of this information is retrieved and indexed by the LLM itself through the prompts we’ve set up. Through this series of prompts, we’ve reached a point where the assistant has a comprehensive understanding of the codebase. 

At this stage, not only can I ask for a fix, but I can even ask questions like “what’s the violation at line 60 in naming_conventions/block_listed_name.py?” and the assistant responds with:

On line 60 of naming_conventions/block_listed_name.py, there's a violation: Disallowed name 'foo'. The variable name 'foo' is discouraged because it doesn't convey meaningful information about its purpose.

Although Pylint has been our focus here, this approach points to a new conversational way to interact with many tools that map code to issues. By integrating LLMs with containerized tools through architectures like the Docker prompt runner, we can enhance various aspects of the development workflow.

We’ve learned that combining tool integration, cognitive preparation of the LLM, and a seamless workflow can significantly improve the development experience, allowing an LLM to use tools to help you directly while you develop.

To follow along with this effort, check out the GitHub repository for this project.

For more on what we’re doing at Docker, subscribe to our newsletter.

Learn more

Read the Docker Labs GenAI series.

Subscribe to the Docker Newsletter. 

Get the latest release of Docker Desktop.

Have questions? The Docker community is here to help.

New to Docker? Get started.

Source: https://blog.docker.com/feed/

Announcing IBM Granite AI Models Now Available on Docker Hub

We are thrilled to announce that Granite models, IBM’s family of open source and proprietary models built for business, as well as Red Hat InstructLab model alignment tools, are now available on Docker Hub. 

Now, developer teams can easily access, deploy, and scale applications using IBM’s AI models specifically designed for developers.

This news will be officially announced during the AI track of the keynote at IBM TechXchange on October 22. Attendees will get an exclusive look at how IBM’s Granite models on Docker Hub accelerate AI-driven application development across multiple programming languages.

Why Granite on Docker Hub?

With a principled approach to data transparency, model alignment, and security, IBM’s open source Granite models represent a significant leap forward in natural language processing. The models are available under an Apache 2.0 license, empowering developer teams to bring generative AI into mission-critical applications and workflows. 

Granite models deliver superior performance in coding and targeted language tasks at lower latencies, all while requiring a fraction of the compute resources and reducing the cost of inference. This efficiency allows developers to experiment, build, and scale generative AI applications both on-premises and in the cloud, all within departmental budgetary limits.

Here’s what this means for you:

Simplified deployment: Pull the Granite image from Docker Hub and get up and running in minutes.

Scalability: Docker offers a lightweight and efficient method for scaling artificial intelligence and machine learning (AI/ML) applications. It allows you to run multiple containers on a single machine or distribute them across different machines in a cluster, enabling horizontal scalability.

Flexibility: Customize and extend the model to suit your specific needs without worrying about underlying infrastructure.

Portability: By creating Docker images once and deploying them anywhere, you eliminate compatibility problems and reduce the need for configuration.

Community support: Leverage the vast Docker and IBM communities for support, extensions, and collaborations.

In addition to the IBM Granite models, Red Hat also made the InstructLab model alignment tools available on Docker Hub. Developers using InstructLab can adapt pre-trained LLMs using far less real-world data and computing resources than alternative methodologies. InstructLab is model-agnostic and can be used to fine-tune any LLM of your choice by providing additional skills and knowledge.

With IBM Granite AI models and InstructLab available on Docker Hub, Docker and IBM enable easy integration into existing environments and workflows.

Getting started with Granite

You can find the following images available on Docker Hub:

InstructLab: Ideal for desktop or Mac users looking to explore InstructLab, this image provides a simple introduction to the platform without requiring specialized hardware. It’s perfect for prototyping and testing before scaling up.

InstructLab with CUDA support: Designed for running full training workflows on GPU-equipped Linux servers, this image accelerates the synthetic data generation and training process by leveraging NVIDIA GPUs.

Granite-7b-lab: This image is optimized for model serving and inference on desktop or Mac environments, using the Granite-7B model. It allows for efficient and scalable inference tasks without needing a GPU, perfect for smaller-scale deployments or local testing.

Granite-7b-lab with CUDA support: For those with GPU-equipped Linux servers, this image supports faster model inference and serving through CUDA acceleration. This is ideal for high-performance AI applications where response times and throughput are critical.

How to pull and run IBM Granite images from Docker Hub 

Follow these steps to pull and run an IBM Granite image using Docker and the CLI. You can follow similar steps for the Red Hat InstructLab images.

Authenticate to Docker Hub
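To log in, run the following:

docker login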

Enter your Docker username and password when prompted.

Pull the IBM Granite image

Pull the IBM Granite image from Docker Hub. There are two versions of the image: 

redhat/granite-7b-lab-gguf: For Mac/desktop users with no GPU support

redhat/granite-7b-lab-gguf-cuda: For Linux NVIDIA® CUDA® support
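For example, to pull the CPU-only image, run the following:

docker pull redhat/granite-7b-lab-gguf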

Run the image in a container

Start a container with the IBM Granite image. The container can be started in two modes: CLI (default) and server.

To start the container in CLI mode, run the following:

docker run --ipc=host -it redhat/granite-7b-lab-gguf

This command opens an interactive bash session within the container, allowing you to use the tools.

To run the container in server mode, run the following command:

docker run --ipc=host -it redhat/granite-7b-lab-gguf -s

You can check IBM Granite’s documentation for details on using IBM Granite Models.

Join us at IBM TechXchange

Granite on Docker Hub will be officially announced at the IBM TechXchange Conference, which will be held October 21-24 in Las Vegas. Our head of technical alliances, Eli Aleyner, will give a live demonstration during the AI track of the keynote at IBM TechXchange. Oleg Šelajev, Docker’s staff developer evangelist, will show how app developers can test their GenAI apps with local models. Additionally, you’ll learn how Docker’s collaboration with Red Hat is improving developer productivity.

The availability of Granite on Docker Hub marks a significant milestone in making advanced AI models accessible to all. We’re excited to see how developer teams will harness the power of Granite to innovate and solve complex challenges.

Stay anchored for more updates, and as always, happy coding!

Learn more

Read the Docker Labs GenAI series.

Subscribe to the Docker Newsletter. 

What is InstructLab?

What are Granite Models?

Accelerating AI Development with IBM Granite AI Models and Docker — IBM TechXchange session with Eli Aleyner.

Developer productivity for apps with AI – IBM TechXchange session with Oleg Šelajev.

Source: https://blog.docker.com/feed/