# Configuration

llm-gateway-bench can be driven by a YAML file (e.g. `bench.yaml`) when using `lgb compare`.
This page documents the full schema and practical tips for reproducible results.
## Example config

```yaml
prompts:
  - "Write a haiku about the ocean."
providers:
  - name: openai
    model: gpt-4.1-mini
    api_key: ${OPENAI_API_KEY}
  - name: deepseek
    model: deepseek-chat
    base_url: https://api.deepseek.com/v1
    api_key: ${DEEPSEEK_API_KEY}
settings:
  requests: 20
  concurrency: 3
  timeout: 30
```
Run it:
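A plausible invocation is shown below; the exact shape of the config argument is an assumption, so check `lgb compare --help` for the real form:

```shell
lgb compare --config bench.yaml
```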
## Schema overview

Top-level mapping:

- `prompts`: list of prompt strings
- `providers`: list of providers/models to benchmark
- `settings`: benchmark parameters (`requests`, `concurrency`, `timeout`)

Validation is performed using Pydantic models. Unknown top-level fields raise a `ConfigError`.
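For orientation, the models might look roughly like this. This is a minimal sketch assuming Pydantic v2; the class names and structure are inferred from this page, not taken from lgb's actual source:

```python
# Illustrative sketch of the config schema (assumes Pydantic v2).
# Class names are assumptions; defaults mirror the values documented on this page.
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class Provider(BaseModel):
    # Extra fields are kept rather than rejected (see "Extra fields" below).
    model_config = ConfigDict(extra="allow")

    name: str                        # provider identifier
    model: str                       # model id used by the provider
    base_url: Optional[str] = None   # OpenAI-compatible API base URL
    api_key: Optional[str] = None    # usually ${ENV_NAME} in YAML


class Settings(BaseModel):
    requests: int = Field(default=20, gt=0)
    concurrency: int = Field(default=3, gt=0)
    timeout: int = Field(default=30, gt=0)


class BenchConfig(BaseModel):
    # Unknown top-level fields are rejected (surfaced to users as ConfigError).
    model_config = ConfigDict(extra="forbid")

    prompts: list[str] = ["Say hello."]
    providers: list[Provider] = []
    settings: Settings = Settings()
```

Note the asymmetry: the top-level model forbids unknown keys, while provider entries allow them.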
## prompts

- Type: `list[str]`
- Default: `['Say hello.']`

Notes:

- The current runner uses only the first prompt (`prompts[0]`).
- Keep prompt text stable when comparing runs over time.
## providers

- Type: `list[Provider]`
- Default: `[]`

Each provider entry supports these fields:

### Required

- `name` (str): provider identifier (lowercased internally)
- `model` (str): model id used by the provider

### Optional

- `base_url` (str): OpenAI-compatible API base URL (e.g. `https://.../v1`)
- `api_key` (str): secret, usually referenced as `${ENV_NAME}`

Example:

```yaml
providers:
  - name: openrouter
    model: meta-llama/llama-3.3-70b-instruct
    base_url: https://openrouter.ai/api/v1
    api_key: ${OPENROUTER_API_KEY}
```
### Extra fields

The provider model is configured with `extra="allow"`, so you may include extra fields for future extensions:

```yaml
providers:
  - name: openrouter
    model: meta-llama/llama-3.3-70b-instruct
    api_key: ${OPENROUTER_API_KEY}
    headers:
      HTTP-Referer: https://example.com
      X-Title: llm-gateway-bench
```

The current runner ignores these extra keys.
### Important: where the API key is read from

The runner resolves API keys in this order:

1. provider-level `api_key` from YAML
2. provider-specific environment variable from `PROVIDER_DEFAULTS`
3. a dummy key for local providers that do not require authentication

This means `${ENV_NAME}` values in YAML are expanded by the loader and then used directly by the runner.
## settings

- Type: `settings`

Fields:

### requests

- Type: `int` (> 0)
- Default: `20`

How many total requests to send per provider.

### concurrency

- Type: `int` (> 0)
- Default: `3`

Maximum number of in-flight requests.

### timeout

- Type: `int` (> 0)
- Default: `30`

Per-request timeout in seconds.
## Environment variables

### .env

This project calls `dotenv.load_dotenv()`, so a local `.env` file is loaded automatically.
Example .env:
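An illustrative `.env` matching the providers in the example config above (the values shown are placeholders):

```ini
OPENAI_API_KEY=sk-...
DEEPSEEK_API_KEY=sk-...
```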
### YAML env var expansion

In YAML, you can reference env vars as `${ENV_NAME}`:
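For example:

```yaml
providers:
  - name: openai
    model: gpt-4.1-mini
    api_key: ${OPENAI_API_KEY}  # replaced with the value of OPENAI_API_KEY
```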
The loader expands this at runtime.
## Validation rules & errors

Typical errors:

- Missing file: `Config file not found`
- Invalid YAML: `Invalid YAML: ...`
- Wrong types: Pydantic validation errors wrapped as `ConfigError`

Tips:

- Ensure `providers` is a YAML list (`- name: ...`)
- Ensure `settings` is present and is a mapping
## Reproducibility checklist

- Use a fixed prompt
- Keep `requests`/`concurrency`/`timeout` stable
- Record provider base URLs and model ids
- Run from the same region/network when comparing changes
- Track both central tendency (p50) and tail latency (p95)
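If you post-process raw latencies yourself, p50/p95 can be computed with Python's `statistics` module. This snippet is illustrative and independent of lgb, which reports these statistics on its own:

```python
# Summarize per-request latencies (in seconds) as p50 and p95.
from statistics import median, quantiles


def latency_summary(latencies: list[float]) -> dict[str, float]:
    # quantiles(..., n=100) returns the 99 cut points p1..p99;
    # index 94 is the 95th percentile.
    pts = quantiles(latencies, n=100, method="inclusive")
    return {"p50": median(latencies), "p95": pts[94]}
```

Comparing p50 alone can hide provider tail behavior, which is why the checklist asks for both.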