We ❤️ Open Source

A community education resource

6 min read

4 ways to extend your local AI co-pilot framework

Extend the framework of your AI co-pilot tools to fit your personal needs.

Welcome to the second part of this tutorial on extending your framework. Once you’re off the ground with the basic setup (part one of this tutorial), there are lots of great ways to extend the framework to fit your personal needs.

Setting up custom commands

One of the great features of continue is the ability to develop your own prompt-engineered commands. This can all be done in the "customCommands" section of the core config.json.

As an example, I created the /list-comprehension command to help with refactoring python code to use list/dict comprehensions wherever possible:

  "customCommands": [
   ...
   {
     "name": "list-comprehension",
     "prompt": "{{{ input }}}\n\nRefactor the selected python code to use list comprehensions wherever possible. Present the output as a python code snippet.",
     "description": "Refactor to use list comprehensions"
   }
 ]

You can then call your custom command from the chat window by selecting code and adding it to the context with Ctrl/Cmd-L, followed by invoking your command (/list-comprehension).

Calling custom commands from chat screenshot.

Experimenting with different models

Another nice feature of continue is the ability to easily toggle between different models in the chat panel. You can configure this using the "models" section of the core config.json. For me, this was useful to experiment with the difference between the various sizes in the Granite Code family.

Toggle between different models screenshot.

To set this up, you simply have to add additional entries in the "models" list:

  "models": [
   {
     "title": "Granite Code 34b",
     "provider": "ollama",
     "model": "granite-code:34b"
   },
   {
     "title": "Granite Code 20b",
     "provider": "ollama",
     "model": "granite-code:20b"
   },
   {
     "title": "Granite Code 8b",
     "provider": "ollama",
     "model": "granite-code:8b"
   },
   {
     "title": "Granite Code 3b",
     "provider": "ollama",
     "model": "granite-code:3b"
   }
 ],

Here are some other models on ollama that may be worth trying out. Many of these models do not have standard open source licenses, but may be worth experimenting with:

ModelSizesLicenseLink
llama 38B, 70BMETA LLAMA 3https://ollama.com/library/llama3
codellama7B, 13B, 34B, 70BLLAMA 2https://ollama.com/library/codellama
deepseek-coder-v216B, 236BDEEPSEEKhttps://ollama.com/library/deepseek-coder-v2
codegemma2B, 7BGemmahttps://ollama.com/library/codegemma
starcoder23B, 7B, 15BBigCode Open RAIL-M v1https://ollama.com/library/starcoder2
codestral22BMistralhttps://github.com/open-assistant/codestral

Read more from We Love Open Source: Getting started with Llamafile tutorial

Managing ollama serve

One of the things that caused some concern with this setup is the need to manage a global ollama serve process. I work on many different projects, each with its own VS Code instance and local terminals, so launching ollama serve from a single project’s terminal felt unnatural and often required me to hunt-and-peck through all of my workspaces to figure out where I’d launched it.

To solve this, I took a detour into macOS application development (which is waaay outside my wheelhouse!) For me, one of the beauties of the ollama interface is its similarity to docker, so naturally, I wanted to extend the similarities to match the ease of Docker Desktop / Rancher Desktop / Podman Desktop, all of which provide a convenient menu-bar app to manage the global server process. While not nearly as feature-rich as these ecosystems, the ollama-bar tool I built makes it seamless to manage a single instance of ollama serve from your menu bar.

ollama-bar tool screenshot.

To set up ollama-bar as a desktop app, you can build it from source:

git clone git@github.com:IBM/ollama-bar.git
cd ollama-bar

# Build and install the app in /Applications
make install-app

# Open the app
open /Applications/ollama-bar.app

You can also set the app up to launch on login:

  1. Open System Settings.
  2. Go to General > Login Items.
  3. Under Open at Login, click the + icon.
  4. Select /Applications/ollama-bar and click Open.

Import local models

While the ollama library is a great tool to manage your models, many of us also have numerous model files already downloaded on our machines that we don’t want to duplicate. The ollama Modelfile is a powerful tool that can be used to create customized model setups by deriving from known models and customizing the inference parameters, including the ability to add (Q)LoRA Adapters (see the docs for more details).

For our purpose, we only need the simple FROM statement, which can point to a known model in the ollama library or a local file on disk. This makes it really easy to wrap the process into an import-to-ollama bash script:

#!/usr/bin/env bash

file_path=""
model_name=""
model_label="local"
while [[ $# -gt 0 ]]
do
   key="$1"

   case $key in
       -f|--file)
           file_path="$2"
           shift
           ;;
       -m|--model-name)
           model_name="$2"
           shift
           ;;
       -l|--model-label)
           model_label="$2"
           shift
           ;;
       *)
           echo "Unknown option: $key"
           exit 1
           ;;
   esac
   shift
done

if [ "$file_path" == "" ]
then
   echo "Missing required argument -f|--file"
   exit 1
fi
file_path="$(realpath $file_path)"

# Check if model_name is empty and assign file name as model_name if true
if [ "$model_name" == "" ]
then
   model_name=$(basename $file_path)
   model_name="${model_name%.*}"
fi

# Append the model label to the model name
model_name="$model_name:$model_label"
echo "model_name: $model_name"

# Create a temporary directory for working
tempdir=$(mktemp -d)
echo "Working Dir: $tempdir"

# Write the file path to Modelfile in the temporary directory
echo "FROM $file_path" > $tempdir/Modelfile

# Import the model using ollama create command
echo "importing model $model_name"
ollama create $model_name -f $tempdir/Modelfile

Local Large Language Model (LLM) web user interface

There are numerous additional AI applications, use cases, and patterns that can be adapted to work with local Large Language Models (LLMs). Exploring LLMs locally can be greatly accelerated with a local web user interface (UI). The Open WebUI project (spawned out of ollama originally) works seamlessly with ollama to provide a web-based LLM workspace for experimenting with prompt engineering, retrieval augmented generation (RAG), and tool use.

To set up Open WebUI, follow the steps in their documentation. The simplest versions are:

Docker

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Pip

pip install open-webui
open-webui serve

Once running, you can open the UI at http://localhost:8080.

open http://localhost:8080

The first time you log in, you’ll need to set up an “account.” Since this entirely local, you can fill in garbage values (foo@bar.com/asdf) and be off to the races!

Web user interface screenshot.

Next steps


This article is adapted from “Build a local AI co-pilot using IBM Granite Code, Ollama, and Continue” by Gabe Goodhart, and is republished with permission from the author.

About the Author

OS Developer, Software Engineer, Software Developer

Read Gabe Goodhart's Full Bio

The opinions expressed on this website are those of each author, not of the author's employer or All Things Open/We Love Open Source.

Want to contribute your open source content?

Contribute to We ❤️ Open Source

Help educate our community by contributing a blog post, tutorial, or how-to.

This year we're hosting two world-class events!

Join us for AllThingsOpen.ai, March 17-18, and All Things Open 2025, October 12-14.

Open Source Meetups

We host some of the most active open source meetups in the U.S. Get more info and RSVP to an upcoming event.