# Progress (As of 2025-04-23 ~08:04 UTC+8)
## 1. What Works
- **Core Benchmark Execution:** Runs concurrent requests for a specified duration; streaming or non-streaming mode is selectable via config.
- **Configuration Loading:** Reads parameters from `config.yaml`.
- **Concurrency Control:** Limits active workers based on the `concurrency` setting.
- **HTTP Client:** `FastHTTPClient` handles both request types:
  - Non-streaming (`Do`) uses `fasthttp`.
  - Streaming (`Stream`) uses `net/http` internally and processes SSE events.
- **Tokenizer Integration:** Generates prompts and counts response tokens.
- **Basic Stats Collection:**
  - Records individual `RequestResult` values (`IsSuccess`, `Latency`, `TimeToFirstToken`, `TotalTokens`) correctly for both streaming and non-streaming requests.
  - Calculates and displays aggregate stats upon completion: Total Requests, Success/Fail counts, Success Rate, Avg QPS, Latency (Avg/Min/Max/Percentiles), TTFT (Avg/Min/Max/Percentiles), Avg Tokens/Second.
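For illustration, the aggregate figures in the last bullet could be derived from recorded results roughly as follows. This is a minimal sketch, not the project's actual code: the field types on `RequestResult`, the `Summary` type, the `Summarize` and `percentile` helpers, the nearest-rank percentile method, and the choice to compute QPS over all requests are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// RequestResult mirrors the field names listed in this document.
// The field types are assumptions for this sketch.
type RequestResult struct {
	IsSuccess        bool
	Latency          time.Duration
	TimeToFirstToken time.Duration
	TotalTokens      int
}

// Summary is a hypothetical container for a subset of the aggregate stats.
type Summary struct {
	Total, Success, Failed int
	SuccessRate            float64 // fraction in [0, 1]
	AvgQPS                 float64 // all requests divided by wall-clock seconds (assumed definition)
	AvgLatency             time.Duration
	MinLatency, MaxLatency time.Duration
	P50, P99               time.Duration
}

// percentile returns the nearest-rank percentile of an ascending-sorted slice.
func percentile(sorted []time.Duration, p float64) time.Duration {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// Summarize aggregates results; latency stats use successful requests only.
func Summarize(results []RequestResult, elapsed time.Duration) Summary {
	s := Summary{Total: len(results)}
	var latencies []time.Duration
	var sum time.Duration
	for _, r := range results {
		if !r.IsSuccess {
			s.Failed++
			continue
		}
		s.Success++
		latencies = append(latencies, r.Latency)
		sum += r.Latency
	}
	if s.Total > 0 {
		s.SuccessRate = float64(s.Success) / float64(s.Total)
	}
	if elapsed > 0 {
		s.AvgQPS = float64(s.Total) / elapsed.Seconds()
	}
	if len(latencies) == 0 {
		return s
	}
	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	s.AvgLatency = sum / time.Duration(len(latencies))
	s.MinLatency, s.MaxLatency = latencies[0], latencies[len(latencies)-1]
	s.P50 = percentile(latencies, 50)
	s.P99 = percentile(latencies, 99)
	return s
}

func main() {
	results := []RequestResult{
		{IsSuccess: true, Latency: 100 * time.Millisecond},
		{IsSuccess: true, Latency: 300 * time.Millisecond},
		{IsSuccess: true, Latency: 200 * time.Millisecond},
		{IsSuccess: false},
	}
	s := Summarize(results, 2*time.Second)
	fmt.Printf("total=%d success=%d failed=%d rate=%.2f qps=%.1f\n",
		s.Total, s.Success, s.Failed, s.SuccessRate, s.AvgQPS)
	fmt.Printf("latency avg=%v min=%v max=%v p50=%v p99=%v\n",
		s.AvgLatency, s.MinLatency, s.MaxLatency, s.P50, s.P99)
}
```

The same pattern would apply to TTFT by collecting `TimeToFirstToken` instead of `Latency`.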
## 2. What's Left / Needs Refinement
- **Error Reporting:** Only a success/failure boolean is tracked at present. Specific error messages need to be captured and, where useful, reported (`RequestResult.Error` field).
- **Token Statistics:** While individual token counts are recorded, aggregate token stats (e.g., average tokens per successful request, total tokens generated) are not yet calculated or displayed in the final report.
- **Non-Streaming TTFT:** TTFT for non-streaming requests is currently approximated as the total latency. This may need refinement or a clearer definition.
- **Streaming Timeout:** The timeout for the entire streaming request is hardcoded in `client.Stream`. Consider making this configurable.
- **Report Generation:** The final report is currently basic console output. HTML report generation is planned (possibly using `go-echarts`).
- **Code Refinement:** `runWorker` in `pkg/concurrency/manager.go` could be refactored for better readability and maintainability.
- **Testing:** Comprehensive integration tests are needed, covering various config scenarios (especially different prompt sizes, rate limits, etc.). Existing unit tests cover the client and potentially other modules.
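As a rough illustration of the error-reporting and token-statistics items above, the sketch below tallies error messages and computes total and average tokens over successful requests. It is only one possible shape: the `error` type chosen for `RequestResult.Error`, the `TokenReport` type, and the `BuildTokenReport` function are hypothetical names introduced for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// RequestResult carries only the fields this sketch needs.
// Using the error type for Error is an assumption; the document
// names the field but not its type.
type RequestResult struct {
	IsSuccess   bool
	TotalTokens int
	Error       error
}

// TokenReport is a hypothetical container for the missing aggregates.
type TokenReport struct {
	TotalTokens         int
	AvgTokensPerSuccess float64
	ErrorCounts         map[string]int // error message -> occurrences
}

// BuildTokenReport sums tokens over successful requests and groups
// failures by error message.
func BuildTokenReport(results []RequestResult) TokenReport {
	rep := TokenReport{ErrorCounts: make(map[string]int)}
	successes := 0
	for _, r := range results {
		if r.IsSuccess {
			successes++
			rep.TotalTokens += r.TotalTokens
			continue
		}
		msg := "unknown error" // fallback when no error was captured
		if r.Error != nil {
			msg = r.Error.Error()
		}
		rep.ErrorCounts[msg]++
	}
	if successes > 0 {
		rep.AvgTokensPerSuccess = float64(rep.TotalTokens) / float64(successes)
	}
	return rep
}

func main() {
	results := []RequestResult{
		{IsSuccess: true, TotalTokens: 120},
		{IsSuccess: true, TotalTokens: 80},
		{IsSuccess: false, Error: errors.New("context deadline exceeded")},
		{IsSuccess: false, Error: errors.New("context deadline exceeded")},
		{IsSuccess: false},
	}
	rep := BuildTokenReport(results)
	fmt.Printf("total tokens: %d, avg per successful request: %.1f\n",
		rep.TotalTokens, rep.AvgTokensPerSuccess)
	for msg, n := range rep.ErrorCounts {
		fmt.Printf("%d x %s\n", n, msg)
	}
}
```

Grouping by message keeps the console report short when many requests fail for the same reason; a real implementation might group by error category instead.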
## 3. Current Status
- The main bug related to recording streaming results has been fixed.
- The client initialization logic has been corrected.
- The tool is functional for basic benchmarking scenarios, especially streaming.
- Ready to proceed with further feature enhancements or refinements.
## 4. Known Issues
- None critical at the moment. Previous lint warnings have been addressed.