Vibe Coding

AI-assisted coding workflows and techniques


Vibe coding is an emerging paradigm where developers leverage AI assistants to write, refactor, and debug code through natural language prompts and iterative collaboration. Rather than manually typing every line, developers describe intent, review AI-generated output, and refine through conversation. This approach shifts the developer's role from code author to code architect and reviewer, emphasizing specification clarity, critical evaluation, and iterative refinement over raw typing speed.

Key Concepts

Prompt-to-Code Workflow

The core vibe coding workflow involves: (1) writing a clear specification or intent, (2) having the AI generate initial code, (3) reviewing the output for correctness and security, (4) iteratively refining through follow-up prompts. Effective practitioners develop skills in writing precise specifications, identifying subtle bugs in AI output, and knowing when to accept or reject suggestions. Tools like GitHub Copilot, Cursor, and Claude Code each offer a different interaction model: inline suggestions, chat-based generation, or agentic file manipulation.

Specification Writing for AI

Writing specs for AI differs from traditional documentation. Key elements include: explicit input/output examples, edge case enumeration, constraint specification (performance, memory, security), and context about the surrounding codebase. A well-written spec might read: 'Create a function that validates email addresses. Accept user@domain.tld format. Reject consecutive dots and special characters in the local part except . _ + -. Return {valid: boolean, error?: string}. Must handle 10K validations/second.' The more precise the spec, the less iteration is needed.
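The sample spec above could translate into a sketch like the following. This is a minimal illustration, not a production validator; the exact regex and error messages are assumptions made for the example.

```javascript
// Minimal sketch of the email-validation spec above.
// The specific rules and error wording are illustrative assumptions;
// tighten them to your real requirements.
function validateEmail(email) {
  if (typeof email !== "string" || email.length === 0) {
    return { valid: false, error: "email must be a non-empty string" };
  }
  if (email.includes("..")) {
    return { valid: false, error: "consecutive dots are not allowed" };
  }
  // Local part: letters, digits, and . _ + -  /  Domain: labels, then a TLD.
  const pattern = /^[A-Za-z0-9._+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}$/;
  if (!pattern.test(email)) {
    return { valid: false, error: "does not match user@domain.tld format" };
  }
  return { valid: true };
}
```

Notice how each rejection case in the spec maps to an explicit check, which makes the generated code easy to diff against the original requirements.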

Reviewing AI-Generated Code

AI code review requires skepticism. Common issues include: hallucinated APIs (methods that don't exist), subtle logic errors in edge cases, security vulnerabilities (SQL injection, XSS), hardcoded values instead of configuration, missing error handling, and over-engineering. Reviewers should: run tests immediately, check imports against actual dependencies, trace data flow for logic errors, and validate against the original spec. Tools like ESLint, TypeScript, and unit tests catch many issues, but human judgment remains essential for security and architectural decisions.
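One of those checks, verifying imports against actual dependencies, can be partially automated. The sketch below is a simplified illustration (the regex handles only plain `import ... from "pkg"` and `require("pkg")` forms; a real checker would parse the AST):

```javascript
// Sketch: flag imported package names missing from a dependency map.
// Covers simple `import ... from "pkg"` and `require("pkg")` forms only;
// relative imports (starting with ".") are ignored.
function findUndeclaredImports(sourceCode, dependencies) {
  const pattern = /(?:from\s+|require\()["']([^."'][^"']*)["']/g;
  const missing = new Set();
  let match;
  while ((match = pattern.exec(sourceCode)) !== null) {
    // Scoped packages like @scope/pkg keep two segments; others keep one.
    const name = match[1].startsWith("@")
      ? match[1].split("/").slice(0, 2).join("/")
      : match[1].split("/")[0];
    if (!(name in dependencies)) missing.add(name);
  }
  return [...missing];
}
```

Running this against a generated file with the `dependencies` object from package.json quickly surfaces hallucinated packages before you ever execute the code.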

Iterative Refinement Patterns

Vibe coding rarely produces perfect code in one shot. Effective patterns include: 'Fix this specific bug' (targeted correction), 'Add error handling for X case' (incremental improvement), 'Refactor for readability' (non-functional improvement), 'Explain why this might fail' (proactive debugging), and 'Write tests for this' (validation). Each iteration should have a clear goal. Over-iteration (fixing non-problems) wastes time; under-iteration (accepting first output) risks bugs. The optimal number depends on code complexity and stakes.

Context Management

AI assistants have limited context windows (4K–200K tokens depending on model and tool). Effective context management includes: referencing specific files/functions rather than entire codebases, using @file or similar syntax to include relevant context, compacting long conversations before critical generations, and organizing code so related logic is co-located. Tools like Cursor's 'codebase' feature and Claude Code's CLAUDE.md files help maintain context across sessions. Overloading context leads to confused outputs; under-providing leads to hallucinated assumptions.

Trust Calibration

Knowing when to trust AI output is a meta-skill. High-trust scenarios: well-known patterns (CRUD operations), standard library usage, repetitive boilerplate, and test case generation. Medium-trust: business logic translation, refactoring suggestions, and documentation writing. Low-trust: novel algorithms, security-critical code, performance optimization, and unfamiliar domains. Trust should increase with: passing tests, matching documentation, linting cleanly, and following established patterns in the codebase. Trust should decrease with: unusual imports, overly complex solutions, and code that 'looks right' but wasn't explicitly tested.

Tool-Specific Patterns

Different tools have different strengths. GitHub Copilot excels at inline completion and following patterns from nearby code; it is best for extending existing code. Cursor combines chat with file editing and codebase context; it is best for exploration and multi-file changes. Claude Code offers agentic file manipulation with tool use; it is best for complex tasks like debugging, refactoring, and test generation. Choosing the right tool for the task improves outcomes: Copilot for speed, Cursor for discovery, Claude Code for complexity. Many practitioners use multiple tools in combination.

Debugging AI-Generated Code

When AI code fails, debugging strategies differ from traditional debugging. First, share the error message and relevant code with the AI for diagnosis. AI often identifies issues faster than manual tracing. If AI can't fix it, try: simplifying the problem (minimal reproduction), checking assumptions (AI may have misinterpreted the spec), and examining generated tests (they reveal AI's understanding). Common AI-specific bugs: wrong library version assumptions, misnamed variables across files, and logic that works for the given example but not edge cases. Always validate fixes with new test cases.

Solved Examples

Problem 1:

You prompt an AI assistant: 'Write a function to calculate the factorial of a number.' It returns:

function factorial(n) {
  return n <= 1 ? 1 : n * factorial(n - 1);
}

What issues should you check before accepting this code?

Solution:

Step 1: Check input validation. The function doesn't handle negative numbers or non-integer inputs. For n < 0, the base case n <= 1 fires immediately and it returns 1 (incorrect). For n = 1.5, it quietly returns 1.5 (via 1.5 * factorial(0.5), where 0.5 hits the base case) instead of rejecting the input.
Step 2: Check for potential stack overflow. Large n (e.g., 100000) will cause stack overflow due to recursion depth.
Step 3: Check type safety. In JavaScript/TypeScript, large factorials exceed Number.MAX_SAFE_INTEGER (2^53 - 1, roughly 9 quadrillion). factorial(18) is already ~6.4 quadrillion, so precision loss begins at factorial(19).
Step 4: Check the spec match. The original prompt was vague; it didn't specify language, error handling, or range expectations.
Answer: Issues include: (1) No input validation for negatives/non-integers, (2) Stack overflow risk for large n, (3) Precision loss for n > 18 in JavaScript, (4) Missing spec clarity; requirements should have been specified first. Either fix the code or refine the spec before accepting.
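A hardened version addressing these issues might look like the sketch below. The choice of BigInt (to avoid precision loss) and iteration (to avoid stack overflow), along with the specific error messages, are illustrative decisions, not requirements stated in the original prompt.

```javascript
// Sketch: factorial with input validation, an iterative loop
// (no recursion-depth limit), and BigInt arithmetic so results
// past Number.MAX_SAFE_INTEGER stay exact.
function factorial(n) {
  if (!Number.isInteger(n)) {
    throw new TypeError("factorial requires an integer");
  }
  if (n < 0) {
    throw new RangeError("factorial is undefined for negative numbers");
  }
  let result = 1n;
  for (let i = 2n; i <= BigInt(n); i++) {
    result *= i;
  }
  return result; // a BigInt, e.g. factorial(20) === 2432902008176640000n
}
```

Returning BigInt is itself a trade-off worth flagging back to the spec author: callers now receive `120n` rather than `120`, which a vague prompt never settled.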

Problem 2:

You're using Cursor to generate a React component. The AI produces code that uses a custom hook 'useUserData' that doesn't exist in your codebase. What is this phenomenon called, and how do you handle it?

Solution:

Step 1: Identify the phenomenon. This is 'hallucination': the AI invented an API that doesn't exist, likely because it was trained on codebases that did have such a hook.
Step 2: Check for alternatives. The hook might exist under a different name. Search your codebase for similar functionality: user data fetching, authentication context, or React Query wrappers.
Step 3: Decide on a resolution path:
- If similar functionality exists: Ask AI to 'refactor to use [actual hook/API]'
- If no equivalent: Ask AI to 'implement the useUserData hook' or 'use standard approach instead'
- If the generated approach is better: Create the hook, but review its implementation carefully
Step 4: Add the missing dependency. If creating the hook, ensure it handles loading states, error states, and caching appropriately.
Answer: This is hallucination. Resolution options: (1) Replace with existing equivalent API, (2) Have AI implement the missing hook (review carefully), or (3) Request a standard library approach instead. Always verify generated imports against your actual dependencies.

Problem 3:

You've iteratively refined AI-generated code through 8 rounds of prompts. Each time, the AI fixed one issue but introduced another. Describe a strategy to break this cycle.

Solution:

Step 1: Recognize the pattern. The 'whack-a-mole' iteration pattern suggests the AI lacks a coherent model of the requirements; it's patching locally without global understanding.
Step 2: Reset with comprehensive context. Instead of incremental fixes, provide:
- Complete specification with all requirements explicit
- All edge cases enumerated
- Relevant existing code as context
- Test cases that define success
Step 3: Request a complete rewrite. 'Rewrite this function from scratch considering all these requirements: [list]. Include input validation, error handling, and these test cases.'
Step 4: Validate comprehensively. Run all tests, check edge cases, and compare against the full spec, not just the most recent fix.
Step 5: If issues persist, decompose. Break the problem into smaller, independent pieces that AI can handle correctly, then compose results.
Answer: Break the cycle by: (1) Halting incremental fixes, (2) Providing complete context and specification, (3) Requesting a fresh rewrite, (4) Validating against the full requirement set, not just the last fix. Iteration limit: 3-4 rounds before considering a reset.

Problem 4:

Evaluate this statement: 'AI-generated code is secure because the AI was trained on production codebases.' Identify the flaw in reasoning.

Solution:

Step 1: Examine the premise. Production codebases contain both secure and insecure code. Training data includes vulnerabilities, deprecated patterns, and legacy code.
Step 2: Identify common security issues in AI output:
- SQL injection: String concatenation for queries
- XSS: Unescaped user input in HTML
- Hardcoded secrets: API keys in source
- Insecure defaults: Missing authentication checks
- Outdated patterns: MD5 for hashing, HTTP instead of HTTPS
Step 3: Consider training data skew. Popular patterns are overrepresented. If a vulnerable pattern is common in training data, AI will suggest it.
Step 4: Recognize the absence of security intent. AI predicts likely code, not secure code. It has no concept of 'attacker' or 'threat model.'
Answer: The flaw is assuming training on production code implies security. Production code contains vulnerabilities, and AI predicts probable continuations-not secure ones. AI-generated code requires the same security review as human code: input validation, output encoding, authentication checks, secret management, and dependency auditing. Never trust AI output for security-critical code without thorough review.
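The SQL injection point can be made concrete. Below, the naive string-concatenation style often seen in AI output is contrasted with a placeholder-based shape; the object returned by `safeQuery` is a simplified stand-in for the parameterized-query APIs real drivers (pg, mysql2, etc.) expose, not any specific library's interface.

```javascript
// Vulnerable pattern: user input concatenated directly into SQL,
// letting a crafted value rewrite the query's meaning.
function unsafeQuery(username) {
  return "SELECT * FROM users WHERE name = '" + username + "'";
}

// Safer pattern: keep the SQL text fixed and pass values separately,
// so the driver treats input strictly as data. (Simplified stand-in
// for a real driver's parameterized-query API.)
function safeQuery(username) {
  return { text: "SELECT * FROM users WHERE name = $1", values: [username] };
}

const attack = "' OR '1'='1";
// unsafeQuery(attack) produces a WHERE clause that is always true:
// SELECT * FROM users WHERE name = '' OR '1'='1'
```

The concatenated version is exactly the kind of "probable continuation" a model reproduces from training data, which is why the review checklist above treats any string-built query as a red flag.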

Tips & Tricks

  • Write specifications before generating code. A vague prompt ('write a login function') produces vague code. A precise prompt ('write a login function that accepts email/password, uses bcrypt for hashing, returns {success, token?, error?}, and logs failed attempts') produces targeted code requiring fewer iterations.
  • Never accept the first output for non-trivial code. At minimum: run tests, check imports, and trace the happy path. For security-critical code: also trace error paths, check input validation, and review for hardcoded values.
  • Use AI to generate tests alongside code. Ask: 'Write unit tests for this function including edge cases.' Tests reveal the AI's understanding of the spec and catch bugs you might miss. However, review tests too; they can be wrong in ways that match wrong code.
  • Learn your tool's context mechanics. In Cursor, use @file to include specific files. In Copilot, keep relevant code visible in the editor. In Claude Code, use CLAUDE.md for persistent context. Providing too little context causes hallucination; providing too much causes confusion.
  • Develop a 'first pass' review checklist: (1) Do all imports exist in your dependencies? (2) Does the code match the spec's input/output? (3) Are there obvious edge case failures? (4) Does it pass linting/type checking? (5) Do generated tests actually pass? These quick checks catch the bulk of issues.
  • When AI code works but you don't understand why, don't commit it. Either have the AI explain the logic until you understand, or rewrite it yourself. Code you don't understand is unmaintainable code-future you (or teammates) will struggle with bugs AI can't help with.

Ready to practice?

Test your understanding with questions and get instant feedback.

Start Exercise →