
Chain-of-Thought (CoT): Making Reasoning Visible

Chain-of-Thought (CoT) prompting is less common with reasoning models, since they already perform an explicit reasoning step. With SLMs and other non-reasoning models, however, CoT can still make a meaningful difference. That said, it's still valuable to learn CoT techniques: they help you understand how these models think and how to influence their behavior effectively.
The Problem: What if you need to debug a wrong answer? Without CoT, you can't see the model's reasoning. A sketch of a CoT prompt and a placeholder expected response follows the list below.
In Production:
  • Use CoT for complex reasoning; avoid for deterministic extraction/classification at temperature=0.
  • Consider privacy/compliance: avoid logging sensitive intermediate reasoning.
  • Cost/latency rise with longer outputs—use selectively.
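The sketch referenced above (the task and the response are illustrative placeholders, not real API output):

Prompt
An employee submitted $1,240 in expenses: $420 hotel, $640 flights, and $180 meals.
The meal limit is $150 per trip. Is the report compliant?
Think step by step before giving your final answer.

Expected response (placeholder)
Step 1: Hotel is $420 - within policy.
Step 2: Flights are $640 - within policy.
Step 3: Meals are $180, but the limit is $150 - over by $30.
Final answer: Not compliant; meals exceed the limit by $30.

Because every step is written out, a reviewer can see exactly which step went wrong when the final answer is incorrect.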
Why It Works:
  • Often improves performance on reasoning tasks (magnitude varies by task/model)
  • Creates “intermediate tokens” that guide the model
  • Makes errors debuggable
Production Pattern: see the sketch after the list below for one way to wrap a CoT prompt in application code.
Real-World Impact:
  • Code generation: 35% fewer bugs with CoT
  • Math problems: 50-70% accuracy improvement
  • Medical diagnosis: More reliable clinical reasoning
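A minimal sketch of that production pattern, assuming the hypothetical claude.generate helper used later on this page and a simple "FINAL ANSWER:" convention for parsing:

async function solveWithCoT(question: string): Promise<{
    answer: string;
    reasoning: string;
}> {
    // `claude.generate` is a hypothetical helper; swap in your actual client call.
    const response = await claude.generate(`
${question}

Reason step by step, then end with a line starting with "FINAL ANSWER:".
`);

    // Split the reasoning from the final answer so downstream code only
    // consumes the answer, while the reasoning stays available for debugging.
    const marker = "FINAL ANSWER:";
    const idx = response.lastIndexOf(marker);
    const answer = idx >= 0 ? response.slice(idx + marker.length).trim() : response.trim();
    const reasoning = idx >= 0 ? response.slice(0, idx).trim() : "";

    // Log the reasoning only where policy allows (see "In Production" above).
    return { answer, reasoning };
}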

Self-Consistency: Voting for Reliability

The Problem: A single response might be wrong due to sampling non-determinism, ambiguous tasks, or multiple valid solution paths.
The Solution: Generate several responses and take a majority vote (a sketch follows the list below).
When to Use:
  • High-stakes decisions (medical, financial, legal)
  • Complex reasoning where errors are costly
  • Classification tasks where confidence matters
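A minimal sketch of self-consistency voting, again assuming the hypothetical claude.generate helper and the "FINAL ANSWER:" convention from the CoT pattern above:

async function selfConsistentAnswer(question: string, samples = 5): Promise<string> {
    // Sample several independent answers; use a non-zero temperature so they differ.
    // `claude.generate` is a hypothetical helper.
    const responses = await Promise.all(
        Array.from({ length: samples }, () =>
            claude.generate(`${question}\n\nReason step by step, then end with "FINAL ANSWER:".`)
        )
    );

    // Count votes over the final answers only, ignoring the reasoning text.
    const votes = new Map<string, number>();
    for (const r of responses) {
        const idx = r.lastIndexOf("FINAL ANSWER:");
        const answer = (idx >= 0 ? r.slice(idx + "FINAL ANSWER:".length) : r).trim().toLowerCase();
        votes.set(answer, (votes.get(answer) ?? 0) + 1);
    }

    // Return the most frequent answer (the majority vote).
    return [...votes.entries()].sort((a, b) => b[1] - a[1])[0][0];
}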
Cost Consideration:
  • 5 samples per task ≈ 5x cost
  • Use only when accuracy justifies expense
Performance Data:
  • CoT often improves performance on reasoning benchmarks; magnitude varies by task/model (see Wei et al., 2022)
  • Combining CoT + Self-Consistency can yield additional gains; magnitude varies by task/model (see Wang et al., 2022)
  • Always validate on your evaluation set; do not assume universal gains

Extended Thinking: Anthropic’s Secret Weapon

Claude-Specific Feature: Claude can expose its “thinking” before answering using special tags.
Prompt
<thinking>
Let me analyze this complex legal document...
- First, I'll identify the key clauses
- Then, I'll look for any conflicting terms
- Finally, I'll assess risk level
</thinking>

[Your actual task here]
Why This Matters:
  1. Debugging: See where reasoning went wrong
  2. Quality: Forces model to think before answering
  3. Transparency: Clients can audit AI decisions
Thinking tags can also be used to guide Claude's steps. In the sketch below, claude.generate, extractBetweenTags, and extractJson are assumed helper functions rather than real SDK calls:
async function analyzeContract(contractText: string): Promise<{
    analysis: any;
    reasoning: string;
}> {
    const prompt = `
<document>
${contractText}
</document>

<thinking>
I need to analyze this contract for:
1. Key obligations
2. Termination clauses
3. Liability limits
4. Red flags

Let me work through each section...
</thinking>

Provide a JSON response with:
- obligations: list of key obligations
- risks: list of potential risks
- recommendations: list of recommended actions
`;
    
    const response = await claude.generate(prompt);
    
    // Parse thinking section for audit trail
    const thinking = extractBetweenTags(response, "thinking");
    const result = extractJson(response);
    
    return {
        analysis: result,
        reasoning: thinking  // Store for compliance/review
    };
}
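
A usage sketch (the contract text and the audit store are placeholders):

const { analysis, reasoning } = await analyzeContract(contractText);
console.log(analysis.risks);           // structured output for the application
await auditStore.save({ reasoning });  // hypothetical store; retain only where compliance allows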

Prompt Chaining: Breaking Complex Tasks

Single Prompt Limitations:
  • Context window fills up
  • Errors compound
  • Hard to debug
  • Expensive to retry
Chaining Solution: Break one complex task into a sequence of simple tasks (a sketch follows the list below).
Benefits:
  • Each step is simple → fewer errors
  • Failed steps can retry independently
  • Cheaper: Only call expensive steps when needed
  • Easier to evaluate and improve
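A minimal sketch of a chain, again assuming the hypothetical claude.generate helper; each step gets a small, focused prompt and only the intermediate outputs flow forward:

async function summarizeAndDraftReply(email: string): Promise<string> {
    // Step 1: extract only the facts the later steps need.
    const keyPoints = await claude.generate(
        `List the key requests and deadlines in this email as short bullets:\n\n${email}`
    );

    // Step 2: classify urgency from the extracted points, not the raw email.
    const urgency = await claude.generate(
        `Given these points, answer with exactly one word (low, medium, high):\n\n${keyPoints}`
    );

    // Step 3: draft the reply from the small intermediate outputs.
    // Each step is cheap to retry on its own if it fails validation.
    return claude.generate(
        `Draft a reply email.\nUrgency: ${urgency}\nPoints to address:\n${keyPoints}`
    );
}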
Trade-off:
  • More latency (sequential calls)
  • More complex code
  • Multiple LLM calls (though each is smaller, so total cost is often comparable or lower)