How to Add LLM Model Fallbacks in Python in 5 Min

Source: DEV Community
Your agent calls gpt-4o. OpenAI returns a 429. Your agent crashes, and your user sees nothing. LLM APIs fail more often than you think: rate limits, outages, content-policy refusals. A single-provider agent is a single point of failure. The fix takes five minutes: a fallback chain that automatically tries the next model.

The Code

```python
import os

from openai import OpenAI, APIError, APITimeoutError, RateLimitError

# Each entry: (base_url, api_key_env, model_name)
MODEL_CHAIN = [
    ("https://api.openai.com/v1", "OPENAI_API_KEY", "gpt-4o"),
    ("https://api.anthropic.com/v1", "ANTHROPIC_API_KEY", "claude-3-5-sonnet-20241022"),
    ("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY", "meta-llama/llama-3-70b-instruct"),
]


def chat_with_fallback(messages: list[dict], temperature: float = 0.7) -> str:
    """Try each model in MODEL_CHAIN until one succeeds."""
    errors = []
    for base_url, key_env, model in MODEL_CHAIN:
        try:
            client = OpenAI(
                base_url=base_url,
                api_key=os.environ[key_env],
            )
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                temperature=temperature,
            )
            return response.choices[0].message.content
        except (RateLimitError, APITimeoutError, APIError) as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models in the chain failed:\n" + "\n".join(errors))
```
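The core loop is provider-agnostic: try each candidate in order, catch the failure, move on, and only raise once every option is exhausted. A minimal network-free sketch of that pattern (the `first_success`, `flaky`, and `backup` names here are illustrative, not part of the article's code):

```python
def first_success(callables, exceptions=(Exception,)):
    """Run each callable in order; return the first result that doesn't raise."""
    errors = []
    for fn in callables:
        try:
            return fn()
        except exceptions as exc:
            # Remember why this attempt failed, then try the next one.
            errors.append(exc)
    raise RuntimeError(f"All {len(errors)} attempts failed: {errors}")


def flaky():
    # Stands in for a provider that is rate-limited or down.
    raise TimeoutError("primary provider down")


def backup():
    # Stands in for the next model in the chain.
    return "hello from the backup model"


print(first_success([flaky, backup]))  # falls through to the backup
```

The same shape is what `chat_with_fallback` does with real clients: the only differences are that each "callable" builds an `OpenAI` client against a different base URL, and the caught exceptions are narrowed to the SDK's error types so genuine bugs still surface immediately.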