Welcome, AI enthusiasts! Today we’re diving into self-hosting large language models with Ollama. Let’s explore how to harness these models right within your familiar development environment—Visual Studio Code.
Setting Up Your Self-Hosting Environment
Ollama supports Windows, Mac & Linux; here I’m using Linux. They also have an option to install it bare metal, but I’m using Docker to host it for ease of use.
To install on bare metal, Ollama provides us with a one-line command: curl -fsSL https://ollama.com/install.sh | sh
(Note: piping a script straight from the internet into your shell is not recommended; inspect it first.)
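If you’d rather not pipe straight into a shell, a safer pattern (same script, just split into reviewable steps) is:

curl -fsSL https://ollama.com/install.sh -o install.sh
less install.sh    # read what the script actually does first
sh install.sh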
As I’m poor (no GPU here), I’ll install ollama-cpu, using the compose file from big bear’s GitHub repo. After making all the necessary changes, run docker compose up -d
Docker Compose
name: big-bear-ollama-cpu
services:
  big-bear-ollama-cpu:
    container_name: big-bear-ollama-cpu
    devices:
      # AMD GPU device nodes; these can be removed for a CPU-only setup
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    environment:
      - PORT=11434
    image: ollama/ollama:0.5.7
    ports:
      - mode: ingress
        target: 11434
        published: "11434"
        protocol: tcp
    restart: unless-stopped
    volumes:
      - type: bind
        source: /AppData/big-bear-ollama-cpu/.ollama
        target: /root/.ollama
        bind:
          create_host_path: true
    networks:
      - default
    privileged: false
    hostname: big-bear-ollama-cpu
networks:
  default:
    name: big-bear-ollama-cpu_default
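Once the container is up, a quick sanity check that the server is reachable (assuming the default port mapping above):

docker compose ps                          # the container should show as running
curl http://localhost:11434/api/version    # Ollama answers with its version as JSON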
Connecting the model to VS Code with the ‘Continue’ extension
First we need to download the model we’ll use with ollama pull $model_name. I’m using qwen2.5-coder.
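For example, to fetch the 1.5b tag used in the config below and give it a quick test against the API (if Ollama runs inside the container above, run the pull through docker exec):

ollama pull qwen2.5-coder:1.5b
# or, for the Docker setup above:
docker exec -it big-bear-ollama-cpu ollama pull qwen2.5-coder:1.5b
# quick smoke test against the API:
curl http://localhost:11434/api/generate -d '{"model":"qwen2.5-coder:1.5b","prompt":"Say hi","stream":false}'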
Install the Continue extension in Visual Studio Code
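If you prefer the terminal, it can also be installed with the code CLI (assuming code is on your PATH; Continue.continue is the extension’s Marketplace ID):

code --install-extension Continue.continue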
Configure ‘Continue’
Add this entry under both the models and tabAutocompleteModel sections of the config file, and use the 'apiBase' option in case the model is hosted on another device.
"title": "Qwen2.5 Coder",
"provider": "ollama",
"model": "qwen2.5-coder:1.5b",
"apiBase": "http://server_ip:11434"
Now we have auto code completion & suggestions like GitHub Copilot—but bear in mind that this small model is noticeably weaker than Copilot unless you are hosting a full-size model.