Model Context Protocol (MCP)
The Three Primitives (Sourced)
MCP has exactly three core primitives; knowing the distinction shows you've done the homework:
| Primitive | Control | Purpose | Example |
|---|---|---|---|
| Tools | Model-controlled | Actions the model can invoke | `search_database()`, `send_email()` |
| Resources | App-controlled | Read-only data exposed to context | User profile, config files |
| Prompts | User-controlled | Pre-crafted instruction templates | "Format as markdown table" |
Transport Mechanisms (Sourced)
- stdio: standard I/O streams, requires an initialization handshake
- Streamable HTTP: Server-Sent Events (SSE) for server→client, HTTP for client→server
Production insight: Stateless HTTP enables horizontal scaling with load balancers; stateful (stdio) enables server-initiated requests but limits scaling.
"MCP shifts the tool definition burden from your application code to specialized servers. Instead of writing JSON schemas manually, you use Python decorators with type hints and Field descriptions."
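The schema-generation idea in that quote can be sketched in plain Python. The `tool_schema` helper below is a toy stand-in, not the FastMCP API: it derives a JSON Schema tool definition from a function's type hints, which is roughly the busywork an MCP server framework automates for you.

```python
import inspect
from typing import get_type_hints

# Toy sketch: derive a JSON Schema tool definition from a plain
# Python function, the way MCP server frameworks do. Real frameworks
# (e.g. FastMCP) also read Field descriptions; this minimal version
# maps type hints only.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_schema(fn):
    """Build a tool-definition dict from a function's signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    params = inspect.signature(fn).parameters
    properties = {name: {"type": PY_TO_JSON[hints[name]]} for name in params}
    required = [n for n, p in params.items()
                if p.default is inspect.Parameter.empty]
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

def search_database(query: str, limit: int = 10) -> list:
    """Search the product database for matching rows."""

schema = tool_schema(search_database)
# schema["input_schema"]["required"] is ["query"]; "limit" has a default
```

The point for interviews: the function signature is the single source of truth, so the schema can never drift out of sync with the implementation.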
Prompt Caching (Cost Critical)
The Economics (Sourced)
This is the highest-ROI optimization topic. Know these numbers:
| Operation | Cost Multiplier | Notes |
|---|---|---|
| Cache write (5 min) | 1.25× base input | Default; sufficient for most use cases |
| Cache write (1 hr) | 2× base input | For long-running workflows |
| Cache read | 0.1× base input | 90% discount; this is the win |
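The multipliers in the table imply a very fast break-even, which is worth being able to derive on the spot. A quick arithmetic sketch (using the table's numbers in normalized units):

```python
# Break-even for a 5-minute cache: writing costs 1.25x base input,
# each subsequent read of the same prefix costs 0.1x instead of 1.0x.
BASE = 1.0           # cost of processing the prefix once, uncached
WRITE = 1.25 * BASE  # first request pays the cache-write premium
READ = 0.10 * BASE   # every cache hit afterwards

def prefix_cost(n_requests: int, cached: bool) -> float:
    """Total prefix-processing cost over n_requests."""
    if not cached:
        return n_requests * BASE
    return WRITE + (n_requests - 1) * READ

# Caching loses on a single request (1.25 vs 1.0) but wins from the
# second request on: 1.25 + 0.1 = 1.35 vs 2.0 uncached.
```

So any prefix reused even once within the TTL pays for its write premium.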
Implementation Patterns (Sourced)
- Automatic caching: add `cache_control: {"type": "ephemeral"}` at the request level; the system auto-caches up to the last cacheable block
- Explicit breakpoints: place `cache_control` on specific content blocks for fine-grained control
- Cache order: tools → system → messages (the prefix must match exactly)
"For multi-turn conversations, automatic caching is usually sufficient. The cache breakpoint advances as the conversation grows, so you're caching the growing message history automatically."
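Concretely, an explicit breakpoint with the required ordering might look like the request body below. The field layout follows the public Messages API shape, but the model id, tool, and prompt contents are invented for illustration.

```python
# Sketch of a Messages API request body with one explicit cache
# breakpoint. Ordering matters: tools, then system, then messages --
# the cached prefix must match exactly on later requests.
request_body = {
    "model": "claude-sonnet-4-5",  # example model id
    "max_tokens": 1024,
    "tools": [
        {
            "name": "lookup_order",  # hypothetical tool for this sketch
            "description": "Look up an order by id",
            "input_schema": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
            },
        }
    ],
    "system": [
        {
            "type": "text",
            "text": "You are a support agent. <long policy document here>",
            # Everything up to and including this block gets cached:
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "Where is my order?"}],
}
```

Note the breakpoint sits at the end of the stable prefix (tools + system), while the per-turn messages stay outside it.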
Claude Code & Agent Skills
Claude Code Architecture (Sourced)
- Terminal-based coding assistant with native filesystem access
- Uses multiple tools in combination for multi-step tasks
- Extensible via MCP servers (e.g., browser automation)
- Supports parallelization: multiple instances for different features
- Thinking/planning modes for different complexity levels
Agent Skills (SKILL.md) (Sourced)
Skills are reusable markdown instructions that Claude automatically applies:
- Frontmatter with `description` triggers semantic matching
- `allowed-tools` restricts tool access per skill
- Scripts can execute without consuming context tokens
- Distributable via plugins or enterprise managed settings
Key distinction: CLAUDE.md = project-level customization. Skills = reusable, shareable, semantically triggered instruction sets.
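As an illustration, a SKILL.md might look like the following. The skill name, tool filters, and steps are invented for this sketch; check the current docs for the exact frontmatter schema.

```markdown
---
name: changelog-writer
description: Draft a CHANGELOG entry from recent commits when the user asks to summarize changes
allowed-tools: Bash(git log:*), Read
---

# Changelog Writer

1. Run `git log --oneline -20` to collect recent commits.
2. Group commits by type (feat, fix, chore).
3. Write the entry in Keep a Changelog format.
```

The `description` is what the semantic matcher sees, so it should read like the user requests that ought to trigger the skill.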
RAG Implementation
The Full Pipeline (Sourced)
- Chunking strategies: Sentence-level, paragraph-level, semantic boundaries
- Embeddings: Vector representations for semantic similarity
- BM25: Lexical/keyword search (complements embeddings)
- Multi-index pipeline: Combine semantic + lexical for better recall
- Reranking: Second-pass scoring to improve precision
- Contextual retrieval: Add surrounding context to chunks before embedding
"The Anthropic curriculum emphasizes multi-index RAG: combining BM25 lexical search with embedding-based semantic search. Pure semantic misses exact keyword matches; pure lexical misses synonyms. Hybrid catches both."
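One common way to merge the two result lists is reciprocal rank fusion (RRF); the curriculum doesn't mandate a specific fusion method, but RRF is a standard, score-free choice. Each document earns 1/(k + rank) from every list it appears in, so documents ranked well by both retrievers float to the top.

```python
# Reciprocal rank fusion: fuse several ranked doc-id lists into one.
# k=60 is the conventional damping constant from the RRF literature.
def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Return all doc ids, best fused score first."""
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from the two indexes:
bm25_hits = ["doc_exact_match", "doc_a", "doc_b"]      # lexical ranking
embedding_hits = ["doc_a", "doc_paraphrase", "doc_b"]  # semantic ranking
merged = rrf_merge([bm25_hits, embedding_hits])
# doc_a appears high in both lists, so it leads the fused ranking
```

RRF is attractive in practice because BM25 scores and cosine similarities live on incompatible scales; fusing by rank sidesteps any score normalization.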
Agent & Workflow Patterns
Workflows (Sourced)
- Parallelization: Independent subtasks run concurrently
- Chaining: Output of one step feeds into next
- Routing: Classifier directs to specialized handlers
Deterministic, predictable, easier to debug
Agents (Sourced)
- Model decides which tools to use
- Environment inspection capabilities
- Autonomous multi-step execution
Flexible, harder to predict, needs guardrails
Workflow vs Agent Decision (Inferred)
Use workflows when the task structure is known upfront. Use agents when the model needs to adaptively explore. Production systems often use workflows with agent sub-components.
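The routing pattern above can be sketched in a few lines. The classifier here is a keyword stub standing in for a cheap LLM call; the handler names are invented for the example.

```python
# Minimal routing workflow: a cheap classifier step directs each
# request to a specialized handler. Deterministic and easy to test,
# which is the whole appeal of workflows over free-form agents.
def classify(request: str) -> str:
    """Stand-in for a cheap LLM classification call."""
    text = request.lower()
    if "refund" in text:
        return "billing"
    if "crash" in text or "error" in text:
        return "technical"
    return "general"

HANDLERS = {
    "billing": lambda r: f"[billing workflow] {r}",
    "technical": lambda r: f"[technical workflow] {r}",
    "general": lambda r: f"[general workflow] {r}",
}

def route(request: str) -> str:
    """Classify, then dispatch to the matching specialized handler."""
    return HANDLERS[classify(request)](request)
```

Each handler could itself be a chain, or an agent sub-component, which is the hybrid shape the paragraph above describes.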
Cloud Deployment Paths
| Platform | Key Differences | When to Choose |
|---|---|---|
| Direct API | Anthropic-hosted, simplest integration | Default choice, fastest iteration |
| Amazon Bedrock | AWS IAM, VPC integration, boto3 client | Existing AWS infrastructure |
| Google Vertex AI | GCP IAM, same API structure | Existing GCP infrastructure |
Enterprise Considerations (Inferred)
- Cloud deployments satisfy data residency requirements
- Same Claude models, different authentication/billing
- Feature parity may lag the direct API slightly
Cost Optimization Strategies
From the Training Materials (Sourced)
- Prompt caching: 90% discount on repeated prefixes
- Streaming: Reduces perceived latency, same cost
- Temperature=0: deterministic outputs, which also makes caching identical responses at the application layer practical
- Structured output: use stop sequences to cut off verbose explanations
- Model selection: Haiku for simple tasks, Opus for complex reasoning
Advanced Patterns (Inferred)
- Batch API: Lower cost for non-time-sensitive workloads
- Classifier routing: Use cheap model to route to expensive only when needed
- Context pruning: Strip tool outputs after extracting key info
- MCP sampling: Shifts LLM cost from server to client
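Classifier routing is easy to justify with a back-of-envelope calculation. The prices below are illustrative placeholders in relative units, not published rates.

```python
# Back-of-envelope for classifier routing: every request goes through
# a cheap model first; only a fraction escalates to the expensive one.
CHEAP_COST = 1.0       # per request, small model (relative units)
EXPENSIVE_COST = 15.0  # per request, large model (relative units)

def routed_cost(n_requests: int, escalation_rate: float) -> float:
    """Total cost when all requests pay the cheap classifier and a
    fraction escalation_rate also pays the expensive model."""
    escalated = n_requests * escalation_rate
    return n_requests * CHEAP_COST + escalated * EXPENSIVE_COST

# 1000 requests, 20% escalation, vs sending everything to the big model:
all_expensive = 1000 * EXPENSIVE_COST   # 15000.0
with_routing = routed_cost(1000, 0.20)  # 1000 + 3000 = 4000.0
```

Even with a pessimistic escalation rate, routing wins as long as the cost ratio between models is large, which is exactly the Haiku-vs-Opus situation described above.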
Interview Talking Points
"MCP's three primitives map cleanly to control boundaries: tools are model-controlled actions, resources are app-controlled read-only data, and prompts are user-triggered templates. This separation makes it easier to reason about security and permissions."
"For RAG, I'd start with a hybrid BM25 + embedding approach. Pure semantic search misses exact matches that users expect to find, while pure lexical can't handle synonyms or paraphrasing."
"Prompt caching at 0.1× read cost makes it economical to include large system prompts or few-shot examples. The 5-minute TTL refreshes automatically on use, so active conversations stay cached."
"The workflow vs agent distinction matters for reliability. Workflows are deterministic: easier to test, debug, and explain. Agents are flexible but need careful guardrails. Production systems often combine both."
Smart Questions to Ask Interviewers
"How does the team think about the tradeoff between MCP's stdio transport (richer features, stateful) vs HTTP (stateless, scalable)?"
Shows you understand production deployment considerations beyond "make it work"
"What's the typical prompt caching hit rate you see in production multi-turn conversations?"
Demonstrates cost-awareness and practical optimization mindset
"How do you approach the workflow vs agent decision for new features? Any internal heuristics?"
Shows architectural thinking about reliability vs flexibility tradeoffs
"What's the Skills adoption been like internally? Any patterns that worked better than expected?"
Specific to recent Anthropic tooling, shows you've looked at their latest releases
Content Limitations
What's NOT in this guide
The Skilljar courses require login/registration to access video content and exercises. This guide is based only on:
- Public course descriptions and curricula
- Learning objectives listed on each course page
- Anthropic's public documentation (docs.anthropic.com)
Recommendation: If you have time, register and complete at least "Building with the Claude API" and "Introduction to MCP" before interviewing.