Guide · Local dev tools · ~20 min read

Build a local Claude Code alternative with OpenCode + Qwen Coder

OpenCode is an open-source terminal coding agent: it reads files, edits code, runs shell commands, and iterates in your repo, similar in spirit to Claude Code or Cursor’s agent mode. Pair it with a Qwen Coder model on your own GPU (via LM Studio or Ollama) and you get private, subscription-free coding assistance at home.

What you are building

Hardware and model picks

Coding agents need strong tool calling (read file, write patch, run tests). Prefer a dedicated coder model, not a general chat model.

Step 1 — Run the model locally (LM Studio)

  1. Install LM Studio (macOS, Windows, Linux).
  2. Open the Discover tab and search for qwen2.5-coder or qwen3-coder.
  3. Download a Q4_K_M (or similar) quant suited to your VRAM.
  4. Go to Developer → load the model → Start server.
  5. Note the server URL (default http://127.0.0.1:1234) and the exact model id shown in the server panel.
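
To make "suited to your VRAM" in step 3 more concrete, here is a rough back-of-the-envelope sketch. It assumes a Q4_K_M quant averages about 4.85 bits per weight and adds a flat allowance for the KV cache and runtime buffers; both numbers are approximations chosen for illustration, and the function name is hypothetical. Real usage grows with context length, so treat the result as a floor rather than a guarantee.

```python
def estimate_vram_gib(params_billion: float,
                      bits_per_weight: float = 4.85,  # assumed average for Q4_K_M
                      overhead_gib: float = 1.5) -> float:  # assumed flat allowance
    """Rough VRAM estimate: quantized weights plus a fixed overhead, in GiB."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30 + overhead_gib


if __name__ == "__main__":
    for size in (7, 14, 32):
        print(f"{size}B model: roughly {estimate_vram_gib(size):.1f} GiB")
```

If the estimate lands close to your card's total memory, a smaller model or a smaller quant is the safer pick.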

Verify the API

# List models exposed by the server
curl http://127.0.0.1:1234/v1/models

# Quick completion test
curl http://127.0.0.1:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"YOUR_MODEL_ID","messages":[{"role":"user","content":"Say OK"}],"max_tokens":16}'

Replace YOUR_MODEL_ID with the id from LM Studio (e.g. qwen/qwen2.5-coder-14b). On another machine over Tailscale, use your 100.x.x.x address instead of 127.0.0.1.
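
The same two checks can be scripted, which is handy when you want to pick up the model id automatically. The sketch below uses only the Python standard library and assumes the server returns the usual OpenAI-style shapes (a `data` list of objects with `id`, and `choices[0].message.content`); the function names are illustrative, not part of LM Studio or OpenCode.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:1234/v1"  # LM Studio default from the steps above


def list_model_ids(base_url: str) -> list[str]:
    """Return the ids from GET /models (assumes an OpenAI-style response)."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/models", timeout=10) as resp:
        return [entry["id"] for entry in json.load(resp)["data"]]


def build_chat_request(base_url: str, model_id: str) -> urllib.request.Request:
    """Build the same 'Say OK' request as the curl example, without sending it."""
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": "Say OK"}],
        "max_tokens": 16,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def main() -> None:
    try:
        ids = list_model_ids(BASE_URL)
        if not ids:
            print("Server is up but no model is loaded.")
            return
        print("models:", ids)
        with urllib.request.urlopen(build_chat_request(BASE_URL, ids[0]), timeout=60) as resp:
            reply = json.load(resp)
        print(reply["choices"][0]["message"]["content"])
    except OSError as exc:  # URLError and connection errors are OSError subclasses
        print("Could not reach the server:", exc)


if __name__ == "__main__":
    main()
```

If the first call fails, the server is not running or the address is wrong; if only the second fails, the model id is the likely culprit.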

Alternative — Ollama (CLI-friendly)

If you prefer a headless Linux box or minimal setup:

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

ollama pull qwen2.5-coder:7b
ollama pull qwen2.5-coder:14b   # if you have the VRAM

# API is on port 11434
curl http://127.0.0.1:11434/v1/models

Use http://127.0.0.1:11434/v1 as baseURL in OpenCode instead of LM Studio’s port.
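
Since the only differences between the two backends are the port and, for a remote box, the host, a small helper can keep the baseURL consistent. This is a sketch: the backend names and the function are hypothetical, the ports are the defaults quoted above, and the Tailscale address in the usage line is a made-up placeholder.

```python
DEFAULT_PORTS = {"lmstudio": 1234, "ollama": 11434}  # defaults quoted in this guide


def base_url(backend: str, host: str = "127.0.0.1") -> str:
    """Build the OpenAI-compatible base URL for a local or remote backend."""
    try:
        port = DEFAULT_PORTS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return f"http://{host}:{port}/v1"


if __name__ == "__main__":
    print(base_url("ollama"))                  # same machine
    print(base_url("lmstudio", "100.64.0.1"))  # placeholder Tailscale address
```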

Step 2 — Install OpenCode

OpenCode is the terminal agent. Pick one install method:

macOS / Linux (recommended installer)

curl -fsSL https://opencode.ai/install | bash

npm (Node.js 18+)

npm install -g opencode-ai

Homebrew (macOS / Linux)

brew install anomalyco/tap/opencode

Windows

The npm method above also works on Windows.

Verify the install on any platform:

opencode --version

Step 3 — Point OpenCode at your local Qwen Coder

Create or edit the OpenCode config. Global config path (typical): ~/.config/opencode/opencode.json. You can also add opencode.json in a project root.

Example for LM Studio on the same machine:

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "lmstudio": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "LM Studio (local)",
      "options": {
        "baseURL": "http://127.0.0.1:1234/v1"
      },
      "models": {
        "qwen/qwen2.5-coder-14b-instruct": {
          "name": "Qwen2.5 Coder 14B (local)",
          "limit": {
            "context": 32768,
            "output": 8192
          }
        }
      }
    }
  },
  "model": "lmstudio/qwen/qwen2.5-coder-14b-instruct",
  "small_model": "lmstudio/qwen/qwen2.5-coder-14b-instruct",
  "share": "disabled"
}

Adjust the keys under models to match the ids returned by curl …/v1/models. Set both model and small_model to provider-id/model-key so OpenCode does not call cloud models for session titles. Setting share to "disabled" keeps conversations off public URLs.
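
Because the model keys have to match the server's ids exactly, it can be less error-prone to generate the file from the /v1/models response than to type it by hand. The sketch below mirrors the structure of the example config; `build_config` and its parameters are hypothetical names, the listing is assumed to be OpenAI-style (a `data` list with `id` fields), and the optional limit block is left out, so add it back if you want explicit context and output limits.

```python
import json


def build_config(provider_id: str, provider_name: str, base_url: str,
                 listing: dict, default_model: str) -> dict:
    """Build an opencode.json-shaped dict from a /v1/models listing."""
    model_ids = [entry["id"] for entry in listing.get("data", [])]
    if default_model not in model_ids:
        raise ValueError(f"{default_model!r} is not served; available: {model_ids}")
    ref = f"{provider_id}/{default_model}"  # provider-id/model-key
    return {
        "$schema": "https://opencode.ai/config.json",
        "provider": {
            provider_id: {
                "npm": "@ai-sdk/openai-compatible",
                "name": provider_name,
                "options": {"baseURL": base_url},
                "models": {mid: {"name": f"{mid} (local)"} for mid in model_ids},
            }
        },
        "model": ref,
        "small_model": ref,  # keeps session titles on the local model too
        "share": "disabled",
    }


if __name__ == "__main__":
    # Sample listing for illustration; in practice, paste the JSON from /v1/models.
    sample_listing = {"data": [{"id": "qwen/qwen2.5-coder-14b-instruct"}]}
    config = build_config("lmstudio", "LM Studio (local)",
                          "http://127.0.0.1:1234/v1", sample_listing,
                          "qwen/qwen2.5-coder-14b-instruct")
    print(json.dumps(config, indent=2))
```

Review the printed JSON before saving it to your opencode.json.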

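Ollama variant

Swap the options and models keys inside the provider entry for the following, and update model and small_model to point at the new model key:

"options": {
  "baseURL": "http://127.0.0.1:11434/v1"
},
"models": {
  "qwen2.5-coder:14b": {
    "name": "Qwen2.5 Coder 14B (Ollama)"
  }
}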

Step 4 — First run in a project

  1. Start LM Studio server (or ollama serve) and keep the coder model loaded.
  2. Open a repo: cd /path/to/your-project
  3. Launch OpenCode: opencode
  4. Run /models and select your local Qwen Coder entry.
  5. Run /init once per project — OpenCode writes AGENTS.md with project context.
  6. Press Tab to switch between Plan mode (read-only planning) and Build mode (applies edits).

Example prompts that work well:

# Explain
How does auth work in @src/api/middleware.ts ?

# Small change
Add a --dry-run flag to the deploy script in @scripts/deploy.sh

# Tests
Run the unit tests for @pkg/foo and fix any failures

Useful built-in commands: /undo, /redo, /connect (cloud providers, optional).

LM Studio + Qwen3.5 thinking models

If you use a Qwen3.5 model that emits only a reasoning block through LM Studio’s native API, point OpenCode at the OpenAI-compatible endpoint (/v1/chat/completions) instead, or disable thinking in LM Studio’s model settings. Coding agents need direct JSON/tool output, not long hidden chains of thought.

Troubleshooting

Symptom | Fix
OpenCode can’t reach the model | Confirm the server is running; test with curl …/v1/models; check the firewall; on Tailscale use 100.x.x.x:1234.
Tool calls fail / loop forever | Use a Coder model (Qwen-Coder, DeepSeek-Coder), not a base chat model. Increase the context value under limit in the config.
Out of memory | Use a smaller quant (Q4), a smaller model (7B), or unload other GPU apps.
Slow responses | Normal on local 7B–14B models; use Plan mode for big tasks; give file paths with @path includes.
Still hits the cloud | Set both model and small_model in opencode.json; avoid /connect cloud providers.
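
For the "still hits the cloud" case, a quick sanity check on the config can catch a missing or mistyped reference. This is a minimal sketch under stated assumptions: `check_local_only` is a hypothetical helper, it only checks that model and small_model resolve to a provider and model defined in the same file and that share is disabled, and it reads the typical global path mentioned earlier as plain JSON.

```python
import json
from pathlib import Path


def check_local_only(config: dict) -> list[str]:
    """Return a list of problems that could send requests to a non-local model."""
    problems = []
    providers = config.get("provider", {})
    for key in ("model", "small_model"):
        ref = config.get(key)
        if not ref:
            problems.append(f"{key} is not set")
            continue
        # References look like provider-id/model-key; the key may contain slashes.
        provider_id, _, model_key = ref.partition("/")
        provider = providers.get(provider_id)
        if provider is None:
            problems.append(f"{key} uses provider {provider_id!r}, not defined in this file")
        elif model_key not in provider.get("models", {}):
            problems.append(f"{key} uses model {model_key!r}, not listed under {provider_id!r}")
    if config.get("share") != "disabled":
        problems.append("share is not set to 'disabled'")
    return problems


if __name__ == "__main__":
    path = Path.home() / ".config" / "opencode" / "opencode.json"
    if path.exists():
        issues = check_local_only(json.loads(path.read_text(encoding="utf-8")))
        print("\n".join(issues) if issues else "Config looks local-only.")
    else:
        print(f"No config found at {path}")
```

An empty result does not prove nothing leaves the machine; it only confirms the two model references and the share setting match the local setup described here.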

How this compares to Claude Code

Summary

Install OpenCode, run Qwen2.5-Coder or Qwen3-Coder in LM Studio (or Ollama), wire opencode.json to http://127.0.0.1:1234/v1, then /init your repo and work in Plan/Build mode. You get a capable, local coding agent without sending your codebase to a subscription API.
