llm-api-benchmark-tool/.windsurfrules

# Workspace Rules for `llm-api-benchmark-tool`
## 1. Project Structure
- **Root Directory**: `llm-api-benchmark-tool`
- **Subdirectories**:
  - `cmd/`: Entry point for the main application.
  - `pkg/`: Core functionality modules.
    - `client/`: HTTP client abstraction layer.
    - `tokenizer/`: Prompt generation using `github.com/tiktoken-go/tokenizer`.
    - `concurrency/`: Concurrency management with goroutine pools and `sync.Pool`.
    - `stats/`: Performance metrics collection with `sync.Map` and `gonum/stat`.
    - `report/`: Report generation with `go-echarts`.
  - `test/`: Test cases for all modules.
  - `config/`: Configuration files (e.g., `config.yaml`).
  - `docs/`: Documentation.
  - `scripts/`: Utility scripts.
## 2. Test-Driven Development (TDD)
- **MANDATORY TDD**: The entire project must follow test-driven development. Write test cases first, lock them, then iteratively develop code until all tests pass.
- **Process**:
1. Write test cases defining expected behavior, including edge cases and error handling.
2. Run tests to ensure they fail (red).
3. Write minimal code to pass tests (green).
4. Refactor code while keeping tests passing.
5. Repeat for each feature.
## 3. Technical Stack
- **Language**: Go.
- **HTTP Client**:
- Interface: `HTTPClient` defined in `pkg/client/interface.go`.
- Implementations:
- `fasthttp` (`github.com/valyala/fasthttp`): High-performance, default choice.
- `net/http` (standard library): Fallback for compatibility.
- Configurable via `client: "fasthttp"` or `client: "net-http"` in `config.yaml`.
- **Tokenizer**: `github.com/tiktoken-go/tokenizer` for prompt generation.
- **Concurrency**: Goroutine pool (e.g., `ants` or custom), `sync.Pool` for object reuse, `context.Context` for control.
- **Data Processing**: `sync.Map` for response storage, `gonum/stat` for percentile calculations.
- **Charts**: `go-echarts` for visualizing metrics.
## 4. HTTP Client Abstraction Layer
- **Interface**: `HTTPClient` with:
- `Do(req *Request) (*Response, error)`: Non-streaming requests.
- `Stream(req *Request, callback func(chunk SSEChunk) error) error`: Streaming SSE responses.
- **Implementations**:
- `pkg/client/fasthttp.go`: Uses `fasthttp` for high throughput.
- `pkg/client/nethttp.go`: Uses `net/http` for standard compatibility.
- **Tests**: `test/client_test.go` verifies `Do` and `Stream`, including TTFT measurement with SSE.
- **Switching**: Config-driven client selection at runtime.
## 5. Prompt Generation
- **Library**: `github.com/tiktoken-go/tokenizer`.
- **Features**: Generate prompts of 50 tokens (±5%) and 1000 tokens (±5%).
- **Tests**: `test/tokenizer_test.go` ensures token counts are within tolerance.
## 6. Concurrency Management
- **Goroutine Pool**: Limit to 600 goroutines max, using `ants` or custom implementation.
- **Object Reuse**: `sync.Pool` for `Request` and `Response` objects.
- **Control**: `context.WithTimeout` for test duration (default 300s).
- **Tests**: `test/concurrency_test.go` validates concurrency levels, intervals, and termination.
## 7. Performance Metrics Collection
- **Storage**: `sync.Map` for thread-safe response data.
- **Calculations**: `gonum/stat` for P90/P95/P99 percentiles.
- **Metrics**:
- Request stats (total, successful, failed, timeout ratio).
- Response time (avg, min/max, percentiles).
- TTFT (min/max, percentiles) via SSE.
- QPS (avg, max).
- Token generation rate (avg, max) for `content` and `reasoning_content`.
- **Tests**: `test/stats_test.go` ensures accurate metrics.
## 8. Report Generation
- **Charts**: Use `go-echarts` for response time, TTFT, QPS, and token rate visualizations.
- **Format**: HTML/PDF reports via templates.
- **Tests**: `test/report_test.go` validates report content and charts.
## 9. Configuration Management
- **File**: `config/config.yaml` for API, prompts, timeouts, etc.
- **Parsing**: Use `viper` or `yaml` package.
- **Tests**: `test/config_test.go` verifies loading and parsing.
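A hypothetical shape for `config/config.yaml`; only the `client` key is fixed by these rules, and the remaining keys and values are illustrative assumptions:

```yaml
# config/config.yaml — illustrative; only `client` is fixed by these rules.
client: "fasthttp"               # or "net-http"
api:
  base_url: "https://api.example.com/v1/chat/completions"
  api_key_env: "LLM_API_KEY"     # read from the environment, never hardcoded
prompt:
  token_targets: [50, 1000]      # ±5% tolerance
concurrency:
  max_goroutines: 600
timeout_seconds: 300
```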
## 10. Logging and Monitoring
- **Logging**: Use `log` package, enable debug logs with `--debug`.
- **Monitoring**: Use `runtime` for CPU, memory, and goroutine tracking.
- **Tests**: `test/log_test.go` and `test/monitor_test.go` ensure proper output.
## 11. Continuous Integration
- **CI/CD**: GitHub Actions for automated testing.
- **Coverage**: Target 80%+ code coverage.
- **Automation**: Run tests on every commit.
## 12. Version Control
- **Tool**: Git.
- **Branches**: `main`, `develop`, `feature/*`, `bugfix/*`.
## 13. Code Standards
- **Formatting**: `gofmt` for consistency.
- **Linting**: `staticcheck` or `golangci-lint` for quality checks (`golint` is deprecated).
- **Naming**: Follow Go conventions.
## 14. Dependency Management
- **Tool**: `go mod` for dependencies.
- **Optional**: `go mod vendor` for offline builds.
## 15. Security
- **API Keys**: Use environment variables or config files, never hardcode.
- **Error Handling**: Catch and log all errors to prevent crashes.
## 16. Performance Optimization
- **Benchmarks**: Use `testing` package for performance tests.
- **Profiling**: Use `pprof` for analysis.
## 17. User Engagement
- **Feedback**: Encourage issues and PRs via GitHub.
- **Guidelines**: Provide `CONTRIBUTING.md`.
## 18. Release Process
- **Versioning**: Semantic versioning (e.g., `v1.0.0`).
- **Changelog**: Maintain `CHANGELOG.md`.
## 19. Community
- **Discussion**: Use GitHub Discussions.
- **License**: MIT.
## 20. TDD Workflow
- **Steps**:
1. Write and lock test cases in `test/`.
2. Implement code in `pkg/` or `cmd/`.
3. Iterate until tests pass.
4. Refactor and retest.
- **Examples**: See `test/client_test.go`, `test/tokenizer_test.go`, etc.
## Summary
These rules ensure a robust, testable, and maintainable `llm-api-benchmark-tool`. The TDD approach—writing tests first, locking them, and coding until they pass—is enforced throughout. Windsurf can leverage this structure to generate code, tests, and documentation systematically.