Progress (As of 2025-04-23 ~08:04 UTC+8)
1. What Works
- Core Benchmark Execution: Can run concurrent requests (both streaming and non-streaming modes selectable via config) for a specified duration.
- Configuration Loading: Reads parameters from `config.yaml`.
- Concurrency Control: Limits active workers based on the `concurrency` setting.
- HTTP Client: `FastHTTPClient` handles both request types:
  - Non-streaming (`Do`) uses `fasthttp`.
  - Streaming (`Stream`) uses `net/http` internally and processes SSE events.
- Tokenizer Integration: Generates prompts and counts response tokens.
- Basic Stats Collection:
  - Records individual `RequestResult` (IsSuccess, Latency, TimeToFirstToken, TotalTokens) correctly for both streaming and non-streaming requests.
  - Calculates and displays aggregate stats upon completion: Total Requests, Success/Fail counts, Success Rate, Avg QPS, Latency (Avg/Min/Max/Percentiles), TTFT (Avg/Min/Max/Percentiles), Avg Tokens/Second.
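The aggregate statistics listed above could be computed along the following lines. This is a minimal sketch, not the tool's actual code: the `RequestResult` field types, the `Summary` struct, the `Summarize` and `percentile` helpers, the nearest-rank percentile method, and the definitions of QPS (all requests over wall-clock time) and tokens per second (successful tokens over wall-clock time) are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// RequestResult mirrors the fields named in these notes; the types are assumed.
type RequestResult struct {
	IsSuccess        bool
	Latency          time.Duration
	TimeToFirstToken time.Duration
	TotalTokens      int
}

// Summary is a hypothetical container for the final report.
type Summary struct {
	Total, Success, Failed             int
	SuccessRate                        float64 // fraction in [0, 1]
	AvgQPS                             float64
	AvgLatency, MinLatency, MaxLatency time.Duration
	P50, P95                           time.Duration
	AvgTokensPerSecond                 float64
}

// percentile returns the nearest-rank percentile of an ascending-sorted slice.
func percentile(sorted []time.Duration, p float64) time.Duration {
	if len(sorted) == 0 {
		return 0
	}
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// Summarize aggregates results; latency stats use successful requests only.
func Summarize(results []RequestResult, elapsed time.Duration) Summary {
	s := Summary{Total: len(results)}
	var lat []time.Duration
	var sum time.Duration
	tokens := 0
	for _, r := range results {
		if !r.IsSuccess {
			s.Failed++
			continue
		}
		s.Success++
		lat = append(lat, r.Latency)
		sum += r.Latency
		tokens += r.TotalTokens
	}
	if s.Total > 0 {
		s.SuccessRate = float64(s.Success) / float64(s.Total)
	}
	if elapsed > 0 {
		s.AvgQPS = float64(s.Total) / elapsed.Seconds()
		s.AvgTokensPerSecond = float64(tokens) / elapsed.Seconds()
	}
	if len(lat) == 0 {
		return s
	}
	sort.Slice(lat, func(i, j int) bool { return lat[i] < lat[j] })
	s.AvgLatency = sum / time.Duration(len(lat))
	s.MinLatency, s.MaxLatency = lat[0], lat[len(lat)-1]
	s.P50, s.P95 = percentile(lat, 50), percentile(lat, 95)
	return s
}

// demoSummary builds a small made-up sample: three successes and one failure.
func demoSummary() Summary {
	results := []RequestResult{
		{IsSuccess: true, Latency: 100 * time.Millisecond, TimeToFirstToken: 20 * time.Millisecond, TotalTokens: 10},
		{IsSuccess: true, Latency: 200 * time.Millisecond, TimeToFirstToken: 30 * time.Millisecond, TotalTokens: 20},
		{IsSuccess: true, Latency: 300 * time.Millisecond, TimeToFirstToken: 40 * time.Millisecond, TotalTokens: 30},
		{IsSuccess: false},
	}
	return Summarize(results, 2*time.Second)
}

func main() {
	fmt.Printf("%+v\n", demoSummary())
}
```

TTFT statistics could reuse the same `percentile` helper on the `TimeToFirstToken` values. Whether failed requests should count toward QPS is a design choice; this sketch counts them.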
2. What's Left / Needs Refinement
- Error Reporting: Currently only tracks success/failure boolean. Need to capture and potentially report specific error messages (`RequestResult.Error` field).
- Token Statistics: While individual token counts are recorded, aggregate token stats (e.g., average tokens per successful request, total tokens generated) are not yet calculated or displayed in the final report.
- Non-Streaming TTFT: The current TTFT for non-streaming requests is approximated as the total latency. This might need refinement or clearer definition.
- Streaming Timeout: The timeout for the entire streaming request is hardcoded in `client.Stream`. Consider making this configurable.
- Report Generation: Final report is currently basic console output. Planning to implement HTML report generation (possibly using `go-echarts`).
- Code Refinement: `pkg/concurrency/manager.go`'s `runWorker` could be refactored for better readability/maintainability.
- Testing: Need comprehensive integration tests covering various config scenarios (especially different prompt sizes, rate limits, etc.). Existing unit tests cover client and potentially other modules.
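For the streaming timeout item, one possible approach is to take the deadline from configuration and apply it through a request context, so that it bounds the whole exchange including the body read. The sketch below is illustrative only: `StreamConfig`, its `Timeout` field, `defaultStreamTimeout`, `streamEvents`, and `runDemo` are hypothetical names, the request is a plain GET rather than whatever `client.Stream` actually sends, and the SSE handling is reduced to counting `data:` lines.

```go
package main

import (
	"bufio"
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// defaultStreamTimeout is an assumed fallback, not the tool's current value.
const defaultStreamTimeout = 60 * time.Second

// StreamConfig is a hypothetical config section; the real schema may differ.
type StreamConfig struct {
	Timeout time.Duration // would be loaded from config.yaml
}

// streamEvents counts SSE "data:" lines. The context deadline covers
// connecting, receiving headers, and reading the streamed body.
func streamEvents(ctx context.Context, client *http.Client, url string, cfg StreamConfig) (int, error) {
	timeout := cfg.Timeout
	if timeout <= 0 {
		timeout = defaultStreamTimeout
	}
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel() // cancel only on return so the body read stays covered

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Accept", "text/event-stream")

	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	events := 0
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		if strings.HasPrefix(scanner.Text(), "data:") {
			events++
		}
	}
	return events, scanner.Err()
}

// runDemo exercises a fast stream and a stalled one against local test servers.
func runDemo() (events int, fastErr error, timedOut bool) {
	fast := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		fmt.Fprint(w, "data: one\n\ndata: two\n\n")
	}))
	defer fast.Close()

	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select { // stall until the client gives up, capped at two seconds
		case <-r.Context().Done():
		case <-time.After(2 * time.Second):
		}
	}))
	defer slow.Close()

	events, fastErr = streamEvents(context.Background(), fast.Client(), fast.URL, StreamConfig{Timeout: 2 * time.Second})
	_, slowErr := streamEvents(context.Background(), slow.Client(), slow.URL, StreamConfig{Timeout: 50 * time.Millisecond})
	return events, fastErr, errors.Is(slowErr, context.DeadlineExceeded)
}

func main() {
	events, err, timedOut := runDemo()
	fmt.Println("events:", events, "err:", err, "slow request timed out:", timedOut)
}
```

Using a context deadline rather than `http.Client.Timeout` keeps the limit per request, so streaming and non-streaming calls could share one client while using different limits.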
3. Current Status
- The main bug related to recording streaming results has been fixed.
- The client initialization logic has been corrected.
- The tool is functional for basic benchmarking scenarios, especially streaming.
- Ready to proceed with further feature enhancements or refinements.
4. Known Issues
- None critical at the moment. Previous lint warnings have been addressed.