# Product Context: LLM API Benchmark Tool
## 1. Problem Solved
Comparing the performance characteristics (speed, throughput, responsiveness) of different LLM APIs, or of different configurations of the same API, is difficult and time-consuming. This tool provides a standardized, automated way to run these benchmarks.
## 2. Target Users
- Developers integrating LLM APIs who need to choose the best-performing option.
- Operations teams monitoring LLM API performance and reliability.
- Researchers comparing different LLM models or hosting infrastructures.
## 3. How it Should Work
- The user provides a `config.yaml` file specifying the target API, credentials, benchmark parameters (duration, concurrency, etc.), and desired prompt characteristics.
- The tool runs the benchmark according to the configuration, sending requests and collecting performance data.
- Upon completion (or interruption), the tool calculates and displays summary statistics for the key metrics.
- For streaming requests, the tool should accurately measure time to first token (TTFT) and token throughput.
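
The measurement and summary steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the names `measure_stream` and `summarize`, the returned field names, and the choice to count each non-empty chunk as one token are all assumptions made for the example. A real client would pass in the streaming response iterator of whichever API it targets.

```python
import statistics
import time
from typing import Callable, Iterable, Sequence


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume one streamed response and record its timing.

    Hypothetical helper: `chunks` stands in for a streaming response.
    The clock starts just before iteration begins, so for a lazy
    stream the request latency is included in the TTFT.
    """
    start = clock()
    first_token_at = None
    token_count = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty keep-alive chunks
        if first_token_at is None:
            first_token_at = clock()  # first non-empty chunk marks TTFT
        token_count += 1  # assumption: one non-empty chunk == one token
    end = clock()

    total = end - start
    return {
        "ttft_s": None if first_token_at is None else first_token_at - start,
        "total_s": total,
        "tokens": token_count,
        "tokens_per_s": token_count / total if total > 0 else 0.0,
    }


def summarize(values: Sequence[float]) -> dict:
    """Summary statistics for one metric across requests (non-empty input)."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }
```

Injecting the clock keeps the timing logic testable without real network calls. In a benchmark run, the per-request `ttft_s` values would be collected (skipping requests where it is `None`) and passed to `summarize` when the run completes or is interrupted.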
## 4. User Experience Goals
- **Easy Configuration:** Simple and clear YAML configuration.
- **Clear Output:** Understandable summary report of key metrics.
- **Reliable Measurement:** Accurate and consistent performance data collection, especially for streaming TTFT.
- **Informative Logging:** Useful logs during the benchmark run for diagnostics.