Decision Guides
Practical answers to common questions
These guides are designed to help you make practical decisions. Instead of explaining concepts, they answer specific questions with concrete recommendations.
Which Model Fits on Which Hardware?
VRAM requirements by model size and quantization. Quick lookup tables and the math behind them.
Why Is My Setup Slow?
The model fits but it's crawling. Common causes: bandwidth limits, offloading, thermal throttling, wrong settings.
When Does Multi-GPU Help?
Adding a second GPU: when it scales well, when it doesn't, and the role of interconnects.
When Is RAM Offload Too Slow?
Offloading to system RAM: how slow is too slow, and when it's still worthwhile.
Apple Silicon vs NVIDIA
Unified memory vs discrete GPU. Different architectures, different sweet spots.
AMD vs NVIDIA
ROCm support, driver maturity, price/performance. When AMD makes sense.
Buy vs Build
Prebuilt systems, DIY rigs, used enterprise hardware. Cost vs convenience tradeoffs.
Builds by Use Case
Recommended setups for: hobbyist, serious enthusiast, developer workstation, production server.
Quick Decision Trees
What Hardware Should I Get?
My Setup Is Slow — Why?
Common Mistakes
Buying for TFLOPS instead of Bandwidth
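Single-stream LLM inference is usually limited by memory bandwidth, not compute, because generating each token means reading the model's weights from memory. A card with lower TFLOPS but higher memory bandwidth will often be faster for inference. Don't optimize for the wrong metric.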
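To make the bandwidth argument concrete, here is a minimal sketch of a back-of-the-envelope ceiling on generation speed. It assumes a dense model whose weights are read once per generated token, and it ignores KV cache reads, batching, and kernel overhead, so real throughput will be lower. The two cards and their numbers are hypothetical, not real product specs.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on single-stream generation speed.

    Assumes every generated token requires reading all weights once, so
    throughput cannot exceed memory bandwidth divided by model size.
    """
    if bandwidth_gb_s <= 0 or weights_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / weights_gb


# Hypothetical cards: illustrative numbers only, not real product specs.
cards = [
    {"name": "Card A", "tflops": 80.0, "bandwidth_gb_s": 500.0},
    {"name": "Card B", "tflops": 40.0, "bandwidth_gb_s": 1000.0},
]
weights_gb = 20.0  # assumed in-memory size of the quantized model

for card in cards:
    ceiling = decode_ceiling_tokens_per_s(card["bandwidth_gb_s"], weights_gb)
    print(f"{card['name']}: {card['tflops']:.0f} TFLOPS, "
          f"ceiling ~{ceiling:.0f} tokens/s")
```

In this sketch the TFLOPS figure never enters the calculation, which is the point: under these assumptions the card with half the compute but twice the bandwidth has twice the ceiling.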
Assuming "It Fits" Means "It's Fast"
A model that barely fits in VRAM leaves no room for the KV cache, which grows with context length, and may force constant swapping to much slower system RAM. You need headroom. Target 80-90% VRAM utilization, not 100%.
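A rough headroom check can be sketched as follows. The KV cache formula (keys plus values, per layer, per KV head, per token) is the standard one for transformer attention. The model shape, weight size, and VRAM capacity below are hypothetical values chosen for illustration. The sketch also ignores activation buffers and runtime overhead, so treat the result as optimistic.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2,
                   batch_size: int = 1) -> int:
    """Size of the KV cache at full context.

    The leading 2 accounts for storing one key and one value tensor per layer.
    bytes_per_value=2 assumes an FP16 cache.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * context_len * bytes_per_value * batch_size)


def vram_utilization(weights_bytes: int, kv_bytes: int, vram_bytes: int) -> float:
    """Fraction of VRAM taken by weights plus KV cache."""
    return (weights_bytes + kv_bytes) / vram_bytes


# Hypothetical model shape and hardware, for illustration only.
kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=8192)
weights = 6 * GIB   # assumed size of the loaded weights
vram = 8 * GIB      # assumed card capacity

used = vram_utilization(weights, kv, vram)
print(f"KV cache: {kv / GIB:.2f} GiB, utilization: {used:.1%}")
print("within the 80-90% target" if used <= 0.90 else "too tight, expect trouble")
```

Doubling the context length doubles the KV cache, so a setup that passes this check at one context size can fail it at another.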
Ignoring Power and Cooling
High-end GPUs pull 300-450W each under load. Two 3090s plus the rest of the system draw enough to justify a dedicated circuit. Four of them need serious electrical work. Factor this into total cost of ownership.
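The electrical side can be estimated with simple arithmetic. The sketch below makes several assumptions that you should replace with your own figures: 350W per GPU (within the range quoted above), 300W for the rest of the system, 90% power supply efficiency, a 120V/15A household circuit, and the common rule of thumb that continuous load should stay under 80% of the breaker rating. Local electrical codes vary, so this is a planning aid, not electrical advice.

```python
def wall_draw_watts(gpu_count: int, gpu_watts: float, system_watts: float,
                    psu_efficiency: float = 0.9) -> float:
    """Estimated draw at the wall socket.

    Component load is divided by PSU efficiency because the power supply
    pulls more from the wall than it delivers to the components.
    """
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in (0, 1]")
    return (gpu_count * gpu_watts + system_watts) / psu_efficiency


def continuous_limit_watts(volts: float, breaker_amps: float,
                           derate: float = 0.8) -> float:
    """Sustained load a circuit can carry under the assumed 80% rule of thumb."""
    return volts * breaker_amps * derate


limit = continuous_limit_watts(volts=120.0, breaker_amps=15.0)  # assumed circuit
for gpus in (1, 2, 4):
    draw = wall_draw_watts(gpus, gpu_watts=350.0, system_watts=300.0)
    verdict = "fits on one circuit" if draw <= limit else "exceeds one circuit"
    print(f"{gpus} GPU(s): ~{draw:.0f} W at the wall, {verdict}")
```

Anything else plugged into the same circuit counts against the limit, which is why a multi-GPU machine is usually best given a circuit of its own.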
Comparing Quantized vs Unquantized Benchmarks
A 70B model at Q4 is not the same as a 70B model at FP16: memory footprint, speed, and output quality all differ. When reading benchmarks or comparisons, check the quantization level. A smaller model at higher precision may outperform a larger model at aggressive quantization.
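One way to keep comparisons fair is to line models up by memory footprint rather than by parameter count alone. The sketch below uses nominal bits per weight. Real quantization formats store extra scaling data, so actual files are somewhat larger, and the 34B model is a hypothetical example.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Nominal weight footprint in GB (10^9 bytes): parameters x bits / 8."""
    return params_billion * bits_per_weight / 8


# (label, parameters in billions, nominal bits per weight)
candidates = [
    ("70B at FP16", 70.0, 16.0),
    ("70B at Q4", 70.0, 4.0),
    ("34B at Q8 (hypothetical)", 34.0, 8.0),
]

for label, params, bits in candidates:
    print(f"{label}: ~{weights_gb(params, bits):.0f} GB of weights")
```

Under these nominal figures, the 70B model at Q4 and the 34B model at Q8 need about the same memory, so those two are the meaningful head-to-head on a given card, not the Q4 and FP16 versions of the same 70B model.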