1.7 KiB

Technical Context

1. Core Technology

  • Language: Go (Golang)

2. Key Libraries & Dependencies

  • HTTP:
    • github.com/valyala/fasthttp: For high-performance non-streaming requests.
    • net/http: Go standard library, used for streaming (SSE) requests within the client implementation.
  • Tokenization:
    • github.com/tiktoken-go/tokenizer: For OpenAI-compatible token counting and prompt generation helper functions.
  • Concurrency:
    • Go standard library (sync, context, goroutines, channels).
  • Statistics:
    • gonum.org/v1/gonum/stat: For percentile calculations.
    • sync/atomic: Used within stats collector for thread-safe counters.
    • sync.Map: For thread-safe storage of request results.
  • Configuration:
    • github.com/spf13/viper: Likely used for loading YAML configuration (verify import).
    • gopkg.in/yaml.v3 (or similar): For YAML parsing.
  • CLI Flags:
    • Go standard library flag.

3. Development Environment

  • OS: macOS (as per user information)
  • Build/Dependency Management: Go Modules (go mod)

4. Technical Constraints & Considerations

  • HTTP Client Choice: Balancing performance (fasthttp) and streaming stability (net/http). The current approach uses a hybrid within FastHTTPClient.
  • Concurrency Limits: Managing goroutine lifecycle and preventing resource exhaustion.
  • Accurate Timing: Measuring latency and TTFT precisely, especially across concurrent operations.
  • Memory Management: Using sync.Pool for fasthttp objects; being mindful of memory usage when handling large responses or many results.
  • TDD: Workspace rules mandate Test-Driven Development.