
Using AI Agents in VS Code — A Practical 2026 Guide

Papan Sarkar

Why AI Agents in VS Code Matter Right Now

In early 2025, GitHub Copilot was a smart autocomplete tool. By April 2026, it is an autonomous multi-step agent that can read your codebase, run terminal commands, edit files across your project, call external tools via MCP (Model Context Protocol), and iterate on its own output — all from inside your editor.

This is not a marginal upgrade. The gap between “AI suggests a line” and “AI completes a feature end-to-end” is the gap between a calculator and a junior developer. Teams that understand how agent mode works — and where it breaks — are shipping faster without accumulating chaos.

This article is a practical walkthrough for intermediate developers who already use Copilot but want to harness agent mode intelligently: what it actually does, how to configure it well, real-world patterns, and the traps that will cost you time if you miss them.


What Agent Mode Actually Is

Traditional Copilot (inline completions, chat sidebar) is reactive. You ask, it answers. You accept or reject.

Agent mode is agentic: it plans, acts, observes results, and replans autonomously across multiple steps. When you give it a task, it can:

  • Read relevant files in your workspace
  • Edit multiple files simultaneously
  • Run terminal commands (tests, linters, build scripts)
  • Call MCP-connected tools (GitHub, Postgres, Figma, custom APIs)
  • Notice when something fails and self-correct
  • Ask you a clarifying question before a destructive action

Internally, it operates on a loop:

Think → Act → Observe → Think → Act → ...

This continues until the task is done, it hits a dead end, or it needs your input.
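
The loop above can be sketched as a toy, self-contained illustration. The `think` and `act` functions here are stand-ins that simulate planning and tool use; this is not Copilot's actual implementation:

```python
from dataclasses import dataclass


@dataclass
class Plan:
    done: bool = False
    summary: str = ""
    action: str = ""


def think(history: list[str]) -> Plan:
    # Toy planner: declare the task done after three observed actions.
    observations = [h for h in history if h.startswith("observed:")]
    if len(observations) >= 3:
        return Plan(done=True, summary="task complete")
    return Plan(action=f"step-{len(observations) + 1}")


def act(action: str) -> str:
    # In a real agent this edits files, runs commands, or calls MCP tools.
    return f"observed: result of {action}"


def run_agent(task: str, max_steps: int = 20) -> str:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        plan = think(history)               # Think: propose the next action
        if plan.done:
            return plan.summary
        history.append(act(plan.action))    # Act, then Observe the result
    return "stopped: step limit reached"
```

The `max_steps` cap mirrors the real behavior: an agent that cannot converge eventually stops and hands control back to you.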

VS Code shipped agent mode to all users in April 2025, with MCP support rolling out at the same time. The January 2026 update then formalized VS Code as a full AI agent platform, introducing custom agent creation, deeper tool orchestration, and richer workspace context controls.


Setting Up Agent Mode

Step 1: Enable GitHub Copilot

If you have a Copilot subscription (Individual, Business, or Enterprise), agent mode is already available. Open VS Code, open the Copilot Chat panel (Ctrl+Alt+I on Windows/Linux, Ctrl+Cmd+I on macOS), and look for the mode selector at the top of the chat input.

Switch from Ask or Edit to Agent.

[Ask]  [Edit]  [Agent ✓]

Step 2: Choose Your Model

VS Code Copilot lets you pick the underlying model. As of April 2026, the options include GPT-4o, Claude 3.7 Sonnet, Gemini 2.0 Flash, and o3. Each has trade-offs:

| Model | Strength | Best For |
| --- | --- | --- |
| GPT-4o | Balanced speed + quality | General feature development |
| Claude 3.7 Sonnet | Deep reasoning, long context | Large refactors, architecture tasks |
| Gemini 2.0 Flash | Very fast, cheap | High-frequency small tasks |
| o3 | Complex multi-step reasoning | Algorithmic problems, debugging chains |

For most day-to-day feature work, GPT-4o or Claude 3.7 Sonnet is the right default. Switch models based on task complexity, not habit.

Step 3: Configure Workspace Context

Agent mode reads your workspace, but you control what it sees. Add a .github/copilot-instructions.md file (also called a custom instructions file) to give the agent persistent context about your project:

# Project: GyanBeej API

## Stack
- Backend: FastAPI (Python 3.12)
- Database: PostgreSQL via SQLAlchemy (async)
- Auth: JWT with role-based access
- Tests: pytest + httpx

## Conventions
- All endpoints return a `BaseResponse` wrapper schema
- Use `async def` for all route handlers
- Never hardcode secrets — always use `os.getenv()`
- Run `pytest -x` before marking any task complete

## Off-limits
- Do not modify `alembic/` migration files directly
- Do not change `core/config.py` without asking first

This file is injected into every agent session. Without it, the agent will make plausible-but-wrong assumptions about your stack and patterns on every run.
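
Conceptually, the injection amounts to prepending the file to whatever task you type. A toy sketch of that idea (`build_session_prompt` is a made-up helper for illustration, not Copilot's API):

```python
from pathlib import Path


def build_session_prompt(task: str, repo_root: str = ".") -> str:
    # If the instructions file exists, prepend it so the agent sees
    # project context on every run; otherwise fall back to the bare task.
    instructions = Path(repo_root) / ".github" / "copilot-instructions.md"
    context = instructions.read_text() if instructions.exists() else ""
    return f"{context}\n\nUser task: {task}".strip()
```

The takeaway: every prompt your team sends is silently prefixed with this context, which is why a stale or missing instructions file degrades every session at once.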


Real-World Workflow: Building a FastAPI Endpoint with Agent Mode

Here is a concrete example of agent mode doing meaningful work on a FastAPI + PostgreSQL project.

Prompt given to agent:

Add a GET /api/v1/lessons/{lesson_id}/progress endpoint.
It should return the authenticated user's progress for that lesson.
Use the existing UserProgress model. Include a 404 if the lesson doesn't exist.
Write the route, schema, and a pytest test for it.

What agent mode does:

  1. Reads models/user_progress.py and models/lesson.py
  2. Reads schemas/ directory to understand existing schema patterns
  3. Reads routers/ to understand routing conventions
  4. Writes the new route in routers/lessons.py
  5. Adds a LessonProgressResponse schema in schemas/lesson.py
  6. Writes a test in tests/test_lessons.py
  7. Runs pytest tests/test_lessons.py -x in the terminal
  8. Reads the failure output, fixes the import it missed, reruns

The result is a working, tested, convention-consistent endpoint — without you writing a single line. This is the productivity multiplier.

The route it writes looks like this:

# routers/lessons.py

from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.ext.asyncio import AsyncSession

from core.database import get_db
from core.security import get_current_user
from models.lesson import Lesson
from models.user_progress import UserProgress
from schemas.lesson import LessonProgressResponse
from services.lesson_service import get_lesson_progress

router = APIRouter(prefix="/api/v1/lessons", tags=["lessons"])


@router.get("/{lesson_id}/progress", response_model=LessonProgressResponse)
async def get_lesson_progress_endpoint(
    lesson_id: int,
    current_user=Depends(get_current_user),
    db: AsyncSession = Depends(get_db),
):
    lesson = await db.get(Lesson, lesson_id)
    if not lesson:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Lesson not found",
        )

    progress = await get_lesson_progress(db, current_user.id, lesson_id)
    return LessonProgressResponse.model_validate(progress)

This is clean, idiomatic FastAPI. It follows the project’s async pattern, uses the correct dependency injection style, and raises a typed 404. The agent inferred all of this from your existing code — which is exactly why .github/copilot-instructions.md and a well-structured codebase pay dividends.


MCP: Giving Agents Real-World Tools

Model Context Protocol (MCP) is the mechanism by which agent mode connects to external tools and data sources. It is an open standard that turned VS Code from a code editor into an agent platform.

MCP servers expose capabilities like:

  • GitHub MCP: Create PRs, read issues, comment on code reviews — from inside the agent
  • PostgreSQL MCP: Query your actual database during development
  • Figma MCP: Pull design tokens and component specs directly into your prompt
  • Custom MCP: Any REST API you wrap in the protocol

Configuring an MCP Server

MCP servers are configured in .vscode/mcp.json:

{
  "servers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${env:GITHUB_TOKEN}"
      }
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {
        "DATABASE_URL": "${env:DATABASE_URL}"
      }
    }
  }
}

Once configured, you can prompt the agent naturally:

Look at the open GitHub issues labelled "bug". 
Pick the highest-priority one and write a fix for it.

The agent reads the issue, understands the description, navigates to the affected file, and writes the fix — all in one agentic loop. This is what “practical” looks like in 2026.

Security note: MCP servers run as local processes with the permissions you grant them. Never give a database MCP server write access to production. Use read-only credentials for any MCP server that touches live data.


Custom Agents: Specialised AI for Your Team

Beyond the built-in Copilot agent, VS Code now supports custom agents — purpose-built agents you define with a specific prompt persona, tool set, and scope. These live in .vscode/agents/ as .agent.json files.

A custom agent for a code reviewer might look like:

{
  "name": "PR Reviewer",
  "description": "Reviews staged changes for quality, security, and convention adherence",
  "model": "claude-3.7-sonnet",
  "instructions": "You are a senior engineer reviewing a pull request. Check for: missing input validation, SQL injection risks, inconsistent error handling, missing tests, and deviation from project conventions in .github/copilot-instructions.md. Be specific. Do not approve if any critical issue exists.",
  "tools": ["read_file", "run_terminal_command", "search_workspace"],
  "scope": "workspace"
}

This agent can be invoked from chat with @pr-reviewer and will consistently apply your team’s review standards — not the generic standards of a language model trained on the entire internet.

Custom agents shine for:

  • Security audits: Scoped to check only auth-related files for specific vulnerability patterns
  • Documentation agents: Generate or update API docs whenever routes change
  • Migration agents: Assist with a specific refactor (e.g., Django ORM → SQLAlchemy) with domain context baked in
  • Onboarding agents: Answer “how does this codebase work?” questions with full workspace context

Common Pitfalls (And How to Avoid Them)

1. Letting the Agent Touch Files It Shouldn’t

Agent mode will edit any file it thinks is relevant unless you constrain it. A migration gone wrong, a config file silently changed, or a shared utility broken — these are real risks.

Fix: Use .copilotignore (analogous to .gitignore) to mark files the agent should never touch:

alembic/
core/config.py
.env*
infrastructure/

Also, always review the agent’s planned actions before confirming. VS Code shows a diff and tool-call summary before applying — read it.

2. Trusting Output Without Running Tests

Agent mode will generate code that looks correct and compiles cleanly but has subtle logic errors — especially in edge cases around authentication, permissions, and data transforms.

Fix: Tell the agent to run tests as part of every task. Add it to your custom instructions:

## Agent Rules
- Always run `pytest -x` after making backend changes
- Always run `npx tsc --noEmit` after making TypeScript changes
- Do not mark a task complete if any test fails

This turns the agent into a self-validating loop rather than a “generate and hope” pipeline.

3. Undescriptive Prompts Leading to Generic Code

The single biggest quality lever is prompt quality. Vague prompts produce vague code.

| Weak Prompt | Strong Prompt |
| --- | --- |
| "Add authentication" | "Add JWT auth to the `/api/v1/lessons` router. Use the existing `get_current_user` dependency. Return 401 for missing token, 403 for insufficient role." |
| "Fix the bug" | "The `/progress` endpoint returns 500 when `lesson_id` doesn't exist. Add a 404 guard using `db.get(Lesson, lesson_id)` before the progress lookup." |
| "Refactor this file" | "Refactor `routers/lessons.py` to extract DB queries into `services/lesson_service.py`. Keep route handlers thin — only request parsing and response formatting." |

The more context you give, the less correction you do.

4. Context Window Overflow on Large Codebases

Agent mode reads workspace files, but it has a finite context window. On a large monorepo, the agent may pull in irrelevant files, diluting its focus and producing worse results.

Fix:

  • Keep your .github/copilot-instructions.md pointing the agent to relevant directories
  • Use #file: references in your prompt to pin exact files: Using #file:routers/lessons.py as a reference, add a similar router for quizzes
  • Break large tasks into scoped sub-tasks rather than one mega-prompt
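
To see why pinning helps, here is a toy resolver for `#file:` references. It is purely illustrative: VS Code resolves these natively, and this is not its implementation.

```python
import re
from pathlib import Path


def expand_file_pins(prompt: str, root: str = ".") -> str:
    """Inline the contents of each #file:<path> reference into the prompt."""
    def inline(match: re.Match) -> str:
        rel = match.group(1)
        # Replace the token with a delimited copy of the pinned file, so the
        # model receives exactly this file instead of a workspace-wide guess.
        return f"\n--- {rel} ---\n{Path(root, rel).read_text()}\n"

    return re.sub(r"#file:(\S+)", inline, prompt)
```

Pinning swaps a fuzzy retrieval step for a deterministic one: the context window is spent on the files you chose, not on whatever the search heuristic surfaced.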

5. Skipping the Approval Step on Destructive Actions

Agent mode will ask before running rm, dropping tables, or making irreversible changes — but only if the model correctly classifies the action as destructive. It sometimes gets this wrong.

Fix: Keep “Require approval for terminal commands” enabled in VS Code Copilot settings. The 3-second pause to approve a terminal command is always worth it.


Best Practices for Production-Grade AI-Assisted Development

Treat Every Agent Output as Untrusted Code

This mirrors the principle from general AI product development: model output is a draft, not a decision. Read every diff. Run every test. Merge nothing you have not understood.

Maintain a Human-Owned Architecture Doc

Agent mode is excellent at implementing within a known pattern. It is poor at deciding which pattern to use. Keep a short ARCHITECTURE.md that defines your layer boundaries, service contracts, and non-negotiables. Point your custom instructions at it. The agent implements — you architect.

Use Feature Flags for Agent-Generated Features

When shipping code where significant chunks were agent-generated, wrap new features in feature flags. This gives you a fast rollback path if a subtle bug surfaces in production — without a full redeploy.

# core/flags.py
import os

FEATURE_FLAGS = {
    "lesson_progress_v2": os.getenv("FLAG_LESSON_PROGRESS_V2", "false") == "true",
}
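
A request-time check reads the environment on each call, so flipping the variable takes effect without a restart. A minimal sketch (the `is_enabled` helper and flag name are illustrative, not part of any framework):

```python
import os


def is_enabled(flag: str) -> bool:
    # Read at call time so a toggled env var applies without a redeploy.
    return os.getenv(f"FLAG_{flag.upper()}", "false") == "true"


# Inside a route handler (sketch):
# if not is_enabled("lesson_progress_v2"):
#     raise HTTPException(status_code=status.HTTP_404_NOT_FOUND)
```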

Version Your Prompt Templates

Custom instructions and agent configurations are code. Commit them. Review changes to .github/copilot-instructions.md with the same rigour as a config change — they directly affect every agent session your team runs.

Build a “Sanity Check” Step Into Your Workflow

After an agent session, before pushing:

# Quick sanity check (assumes these targets exist in your Makefile)
make lint        # ruff + mypy for Python
make test        # pytest
make typecheck   # tsc --noEmit for TypeScript
git diff --stat  # Review what actually changed

This takes 60 seconds and catches 90% of agent-introduced regressions.


When NOT to Use Agent Mode

Agent mode is powerful, but it is not always the right tool.

  • Security-critical code (auth flows, cryptography, access control): Write this by hand. Agent-generated security code needs expert review regardless of how polished it looks.
  • Novel architectural decisions: If you are deciding between event-driven and request-response architecture, a database schema design, or a new service boundary — this is human work. Agents implement; they do not design well under ambiguity.
  • Unfamiliar codebases you are trying to learn: If you are a junior developer trying to understand a system, let agents explain — but do not let them write. You will merge code you do not understand and own bugs you cannot debug.
  • Tiny one-liner edits: Inline Copilot completions are faster. Do not reach for agent mode when Tab is the right tool.

Key Takeaways

AI agents in VS Code have moved from novelty to a genuine shift in how software gets built. The teams winning with them share a few traits:

  • They invest in workspace context (custom instructions, project conventions, scoped agents)
  • They treat agent output as a starting point, not a final answer
  • They constrain what the agent can touch via .copilotignore and approval gates
  • They run tests on every agent task — making self-validation a built-in expectation
  • They keep humans in charge of architecture and let agents handle implementation

The developer who understands agent mode deeply — its capabilities, its failure modes, and its boundaries — will outpace one who uses it naively. Speed without discipline creates noise. Speed with discipline creates products.

Use agents aggressively. Trust them carefully.


Written for developers building real products in 2026. Stack references: FastAPI, Python 3.12, TypeScript, Next.js, PostgreSQL.
