Welcome, AI enthusiasts! Today we’re diving into self-hosting large language models with Ollama. Let’s explore how to harness these models right within your familiar development environment—Visual Studio Code.
Setting Up Your Self-Hosting Environment
Ollama supports Windows, Mac & Linux; here I’m using Linux. They also have an option to install it bare metal, but I’m using Docker to host it for ease of use.
To install on bare metal, Ollama provides us with a one-line command: curl -fsSL https://ollama.com/install.sh | sh
(Note: piping a script straight from the internet into your shell is not recommended; inspect it first.)
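If you’d rather not pipe straight into a shell, a safer pattern (same script, just split into reviewable steps) is:

curl -fsSL https://ollama.com/install.sh -o install.sh
less install.sh    # read what the script actually does first
sh install.sh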
As I’m poor (no GPU here), I’ll install ollama-cpu, using the compose file from big bear’s GitHub repo. After making all the necessary changes, run docker compose up -d
Docker Compose
name: big-bear-ollama-cpu
services:
  big-bear-ollama-cpu:
    container_name: big-bear-ollama-cpu
    devices:
      # AMD GPU device nodes; these can be removed for a CPU-only setup
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    environment:
      - PORT=11434
    image: ollama/ollama:0.5.7
    ports:
      - mode: ingress
        target: 11434
        published: "11434"
        protocol: tcp
    restart: unless-stopped
    volumes:
      - type: bind
        source: /AppData/big-bear-ollama-cpu/.ollama
        target: /root/.ollama
        bind:
          create_host_path: true
    networks:
      - default
    privileged: false
    hostname: big-bear-ollama-cpu
networks:
  default:
    name: big-bear-ollama-cpu_default
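Once the container is up, a quick sanity check that the server is reachable (assuming the default port mapping above):

docker compose ps                          # the container should show as running
curl http://localhost:11434/api/version    # Ollama answers with its version as JSON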
Connecting the model to VS Code with the ‘Continue’ extension
First we need to download the model we’ll use with ollama pull $model_name. I’m using qwen2.5-coder.
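For example, to fetch the 1.5b tag used in the config below and give it a quick test against the API (if Ollama runs inside the container above, run the pull through docker exec):

ollama pull qwen2.5-coder:1.5b
# or, for the Docker setup above:
docker exec -it big-bear-ollama-cpu ollama pull qwen2.5-coder:1.5b
# quick smoke test against the API:
curl http://localhost:11434/api/generate -d '{"model":"qwen2.5-coder:1.5b","prompt":"Say hi","stream":false}'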
Install the Continue extension in Visual Studio Code
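If you prefer the terminal, it can also be installed with the code CLI (assuming code is on your PATH; Continue.continue is the extension’s Marketplace ID):

code --install-extension Continue.continue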
Configure ‘Continue’
Add this entry under both the models and tabAutocompleteModel sections of the config file, and use the 'apiBase' option in case the model is hosted on another device.
"title": "Qwen2.5 Coder",
"provider": "ollama",
"model": "qwen2.5-coder:1.5b",
"apiBase": "http://server_ip:11434"
Now we have auto code completion & suggestions like GitHub Copilot—but bear in mind that this small model is noticeably weaker than Copilot unless you are hosting a full-size model.