Product Context: LLM API Benchmark Tool
1. Problem Solved
Evaluating and comparing the performance characteristics (speed, throughput, responsiveness) of different LLM APIs or different configurations of the same API can be difficult and time-consuming. This tool aims to provide a standardized and automated way to perform these benchmarks.
2. Target Users
- Developers integrating LLM APIs who need to choose the best-performing option.
- Operations teams monitoring LLM API performance and reliability.
- Researchers comparing different LLM models or hosting infrastructures.
3. How it Should Work
- The user provides a config.yaml file specifying the target API, credentials, benchmark parameters (duration, concurrency, etc.), and desired prompt characteristics.
- The tool runs the benchmark according to the configuration, sending requests and collecting performance data.
- Upon completion (or interruption), the tool calculates and displays summary statistics for the key metrics.
- For streaming requests, the tool should accurately measure time to first token (TTFT) and token throughput.
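The measurement and summary steps above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the names measure_stream and summarize, the injectable clock parameter, and the result keys are all hypothetical, and the sketch assumes each streamed chunk corresponds to one token and that the request is effectively issued when iteration over the stream begins.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, Optional


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Optional[float]]:
    """Time one streaming response (hypothetical helper).

    Assumption: one chunk equals one token. A real client would need to
    count tokens from the API's own usage data or a tokenizer.
    """
    start = clock()          # taken just before the stream is consumed
    first = None             # timestamp of the first chunk, if any
    count = 0
    for _ in chunks:
        now = clock()
        if first is None:
            first = now
        count += 1
    end = clock()

    total = end - start
    return {
        "ttft_s": None if first is None else first - start,
        "tokens": count,
        "total_s": total,
        # Throughput over the whole request; measuring only after the
        # first token is an equally valid choice.
        "tokens_per_s": count / total if total > 0 else 0.0,
    }


def summarize(values: Iterable[float]) -> Dict[str, float]:
    """Summary statistics for one metric across requests (hypothetical)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples to summarize")
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "min": ordered[0],
        "max": ordered[-1],
    }
```

Passing the clock in as a parameter keeps the timing logic testable with a fake clock; in a real run the default time.perf_counter is a reasonable monotonic choice. Per-request results from measure_stream could then be fed metric by metric into summarize to produce the final report.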
4. User Experience Goals
- Easy Configuration: Simple and clear YAML configuration.
- Clear Output: Understandable summary report of key metrics.
- Reliable Measurement: Accurate and consistent performance data collection, especially for streaming TTFT.
- Informative Logging: Useful logs during the benchmark run for diagnostics.