We ❤️ Open Source

A community education resource

September 12, 2024

5 min read

Create your own AI coding assistant with Continue and Ollama

Learn how to set up, explore, and use Continue and Ollama together.

By Ty Dunn

Abstract blue waves — Image by Stefan Schweihofer from Pixabay

Continue enables you to easily create your own coding assistant directly inside Visual Studio Code and JetBrains with open source LLMs. All this can run entirely on your own laptop or have Ollama deployed on a server to remotely power code completion and chat experiences based on your needs.

To get set up, you’ll want to install

Continue for VS Code or JetBrains
Ollama for macOS, Linux, or Windows

Once you have them downloaded, here’s what we recommend exploring.

Try out Mistral AI’s Codestral 22B model for autocomplete and chat

As of the now, Codestral is our current favorite model capable of both autocomplete and chat. This model demonstrates how LLMs have improved for programming tasks. However, with 22B parameters and a non-production license, it requires quite a bit of VRAM and can only be used for research and testing purposes, so it might not be the best fit for daily local usage.

a. Download and run Codestral in your terminal by running

ollama run codestral

b. Click on the gear icon in the bottom right corner of Continue to open your config.json and add

{
 "models": [
   {
     "title": "Codestral",
     "provider": "ollama",
     "model": "codestral"
   }
 ],
 "tabAutocompleteModel": {
   "title": "Codestral",
   "provider": "ollama",
   "model": "codestral"
 }
}

Use DeepSeek Coder 6.7B for autocomplete and Llama 3 8B for chat

Depending on how much VRAM you have on your machine, you might be able to take advantage of Ollama’s ability to run multiple models and handle multiple concurrent requests by using DeepSeek Coder 6.7B for autocomplete and Llama 3 8B for chat. If your machine can’t handle both at the same time, then try each of them and decide whether you prefer a local autocomplete or a local chat experience. You can then use a remotely hosted or SaaS model for the other experience.

a. Download and run DeepSeek Coder 6.7B in your terminal by running

ollama run deepseek-coder:6.7b-base

b. Download and run Llama 3 8B in another terminal window by running

ollama run llama3:8b

c. Click on the gear icon in the bottom right corner of Continue to open your config.json and add

{
 "models": [
   {
     "title": "Llama 3 8B",
     "provider": "ollama",
     "model": "llama3:8b"
   }
 ],
 "tabAutocompleteModel": {
   "title": "DeepSeek Coder 6.7B",
   "provider": "ollama",
   "model": "deepseek-coder:6.7b-base"
 }
}

Use `nomic-embed-text` embeddings with Ollama to power `@codebase`

Continue comes with an @codebase context provider built-in, which lets you automatically retrieve the most relevant snippets from your codebase. Assuming you have a chat model set up already (e.g. Codestral, Llama 3), you can keep this entire experience local thanks to embeddings with Ollama and LanceDB. As of now, we recommend using nomic-embed-text embeddings.

a. Download nomic-embed-text in your terminal by running

ollama pull nomic-embed-text

b. Click on the gear icon on the bottom right corner of Continue to open your config.json and add

{
 "embeddingsProvider": {
   "provider": "ollama",
   "model": "nomic-embed-text"
 }
}

c. Depending on the size of your codebase, it might take some time to index and then you can ask it questions with important codebase sections automatically being found and used in the answer (e.g. “@codebase what is the default context length for Llama 3?”)

Fine-tune StarCoder 2 on your development data and push it to the Ollama model library

When you use Continue, you automatically generate data on how you build software. By default, this development data is saved to .continue/dev_data on your local machine. When combined with the code that you ultimately commit, it can be used to improve the LLM that you or your team use (if you allow). For example, you can use accepted autocomplete suggestions from your team to fine-tune a model like StarCoder 2 to give you better suggestions.

a. Extract and load the “accepted tab suggestions” into Hugging Face Datasets

b. Use Hugging Face Supervised Fine-tuning Trainer to fine-tune StarCoder 2

c. Push the model to the Ollama model library for your team to use and measure how your acceptance rate changes

Learn more about Ollama by using `@docs` to ask questions with the help of Continue

Continue also comes with an @docs context provider built-in, which lets you index and retrieve snippets from any documentation site. Assuming you have a chat model set up already (e.g. Codestral, Llama 3), you can keep this entire experience local by providing a link to the Ollama README on GitHub and asking questions to learn more with it as context.

a. Type @docs in the chat sidebar, select “Add Docs”, copy and paste “https://github.com/ollama/ollama” into the URL field, and type “Ollama” into the title field

b. It should quickly index the Ollama README and then you can ask it questions with important sections automatically being found and used in the answer (e.g. “@Ollama how do I run Llama 3?”)

Join our Discord

Now that you have tried these different explorations, you should hopefully have a much better sense of what is the best way for you to use Continue and Ollama. If you ran into problems along the way or have questions, join the Continue Discord or the Ollama Discord to get some help and answers.

More from We Love Open Source

This article is adapted from “An entirely open-source AI code assistant inside your editor” by Ty Dunn, and is republished with permission from the author.

About the Author

Co-founder & CEO @ Continue

Read Ty Dunn's Full Bio

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

The opinions expressed on this website are those of each author, not of the author's employer or All Things Open/We Love Open Source.

Want to contribute your open source content?

Get Started

Contribute to We ❤️ Open Source

Help educate our community by contributing a blog post, tutorial, or how-to.

This year we're hosting two world-class events!

Join us for AllThingsOpen.ai, March 17-18, and All Things Open 2025, October 12-14.

Open Source Meetups

We host some of the most active open source meetups in the U.S. Get more info and RSVP to an upcoming event.