Model Routing — Haiku for Speed, Sonnet for Reasoning, Opus for Architecture

Domain 5 · 15% · ~12 min

Scenario anchor

You are designing a loan-origination pipeline for a BFSI client of Aaranya IT. Every request carries 80,000 tokens of regulatory document context, and the SLA is a sub-3-second response. If you route every call to Opus, cost will balloon and you will miss the latency SLA. This lesson answers the question of which model to use when (Haiku, Sonnet, or Opus) so that you can build a production-grade routing layer that balances cost, speed, and reasoning quality together.
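To make the idea of a routing layer concrete, here is a minimal Python sketch of the decision this scenario calls for. It is an illustration only, not a definitive implementation: the task labels (`extract`, `analyze`, `design`), the `Request` fields, and the 3-second cutoff constant are all assumptions introduced for this example, and the tier names are plain labels rather than real API model identifiers. The sketch also does not establish that any tier actually meets the SLA with 80,000 tokens of context; that has to be measured.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"    # speed-first tier
    SONNET = "sonnet"  # reasoning tier
    OPUS = "opus"      # architecture-level tier


@dataclass
class Request:
    task_type: str           # hypothetical labels: "extract", "analyze", "design"
    latency_budget_s: float  # SLA for this call, in seconds


# Hypothetical mapping from task type to tier, following the lesson title.
DEFAULT_TIER = {
    "extract": Tier.HAIKU,
    "analyze": Tier.SONNET,
    "design": Tier.OPUS,
}

# Assumed cutoff: calls at or under this budget count as interactive.
INTERACTIVE_BUDGET_S = 3.0


def route(req: Request) -> Tier:
    """Pick a tier from the task type, then apply the latency guard."""
    # Unknown task types fall back to the middle tier.
    tier = DEFAULT_TIER.get(req.task_type, Tier.SONNET)
    # Interactive calls are never sent to the slowest tier.
    if tier is Tier.OPUS and req.latency_budget_s <= INTERACTIVE_BUDGET_S:
        return Tier.SONNET
    return tier


if __name__ == "__main__":
    print(route(Request("extract", 3.0)).value)
    print(route(Request("design", 3.0)).value)
    print(route(Request("design", 60.0)).value)
```

The design choice worth noting is that the latency guard runs after the task-based choice, so an offline design review can still reach Opus while anything inside the interactive SLA is capped at Sonnet.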

Key Takeaways