llm-api-benchmark-tool/docs/configuration.md

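# Configuration

This document describes the configuration options for the LLM API benchmark tool.

## Configuration File Format

The configuration file uses YAML and is named `config.yaml` by default. To use a different file, pass its path with the `-config` command-line flag.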

## Configuration Options

### API Configuration

```yaml
api:
  endpoint: "https://api.example.com/v1/completions"  # API endpoint URL
  api_key: "your_api_key"                             # API key
  model: "gpt-3.5-turbo"                              # Model name
```

### Prompt Template Configuration

```yaml
prompt_templates:
  short:  # Settings for short query prompts
    target_tokens: 50  # Target token count (±5%)
    templates:  # List of prompt templates
      - "什么是{concept}？"
      - "请简要解释{concept}的基本原理。"
      - "简述{topic}的主要观点。"

  long:  # Settings for long document prompts
    target_tokens: 1000  # Target token count (±5%)
    templates:  # List of prompt templates
      - "请撰写一篇关于{country}历史的详细介绍，包括主要历史事件和文化发展。"
      - "请对{topic}进行深入分析，包括其背景、现状、问题和未来发展趋势。"
```

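Prompt templates can use the following placeholders:
- `{country}`: a country name
- `{concept}`: a concept name
- `{topic}`: a topic name
- `{event}`: an event name

The tool substitutes these placeholders automatically and adjusts the prompt length dynamically to match the target token count.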
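The substitution and the ±5% tolerance can be illustrated with a minimal Python sketch. The value lists, `fill_template`, and `within_tolerance` are hypothetical names chosen for illustration; the tool's actual word lists, its length-adjustment strategy, and its tiktoken-go token counting are not shown here.

```python
import random
import re

# Hypothetical substitution values; the tool's real word lists are not documented.
PLACEHOLDER_VALUES = {
    "country": ["France", "Japan"],
    "concept": ["recursion", "entropy"],
    "topic": ["renewable energy"],
    "event": ["the Industrial Revolution"],
}

# Matches placeholders such as {concept} and captures the name.
PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def fill_template(template, values=PLACEHOLDER_VALUES, rng=random):
    """Replace each {name} placeholder with a randomly chosen value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"unknown placeholder: {name}")
        return rng.choice(values[name])

    return PLACEHOLDER_RE.sub(substitute, template)


def within_tolerance(token_count, target_tokens, tolerance=0.05):
    """Return True if token_count is within ±tolerance of target_tokens."""
    return abs(token_count - target_tokens) <= tolerance * target_tokens
```

With `target_tokens: 50`, this check accepts a 52-token prompt and rejects a 53-token one, because the allowed deviation is 2.5 tokens.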

### Request Configuration

```yaml
requests:
  - type: "short"  # Prompt type
    weight: 0.7    # Weight (share of requests)
  - type: "long"
    weight: 0.3
```

The weights should sum to 1.0; each weight is the share of requests that use that prompt type.
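As a rough illustration of how such weights could drive request selection, the Python sketch below validates the sum and picks a prompt type at random in proportion to its weight. `validate_weights` and `pick_prompt_type` are hypothetical helpers, not functions of the tool, and the tool may select prompt types differently.

```python
import math
import random

# Mirrors the `requests` section as plain Python data.
REQUESTS = [
    {"type": "short", "weight": 0.7},
    {"type": "long", "weight": 0.3},
]


def validate_weights(requests, tol=1e-9):
    """Raise ValueError unless the weights sum to 1.0 (within tol)."""
    total = sum(r["weight"] for r in requests)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"weights sum to {total}, expected 1.0")


def pick_prompt_type(requests, rng=random):
    """Choose one prompt type with probability proportional to its weight."""
    types = [r["type"] for r in requests]
    weights = [r["weight"] for r in requests]
    return rng.choices(types, weights=weights, k=1)[0]
```

Over many draws, roughly 70% of the picks should be `short` and 30% `long`.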

### Concurrency Configuration

```yaml
concurrency:
  steps: [50, 200, 500]  # Concurrency level for each step
  duration_per_step: 300  # Duration of each step (seconds)
```

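The tool works through the configured concurrency steps in order, raising the number of concurrent users at each step and holding each level for the configured duration.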
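To make the timing concrete, the Python sketch below expands the step list into a simple schedule. `build_schedule` is a hypothetical helper, and the sketch assumes steps run back to back with no warm-up or cool-down between them, which the source does not specify.

```python
def build_schedule(steps, duration_per_step):
    """Return one (start_second, end_second, concurrent_users) tuple per step."""
    schedule = []
    start = 0
    for users in steps:
        schedule.append((start, start + duration_per_step, users))
        start += duration_per_step
    return schedule


schedule = build_schedule([50, 200, 500], 300)
total_seconds = schedule[-1][1]  # end time of the last step
```

Under that assumption, the example configuration runs 50 users for seconds 0 to 300, 200 users for 300 to 600, and 500 users for 600 to 900, for a total of 900 seconds (15 minutes).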

### Timeout Configuration

```yaml
timeout: 60  # Request timeout (seconds)
```

### Poisson Distribution Parameter

```yaml
poisson_lambda: 1.0  # Poisson distribution parameter for request intervals
```

This parameter controls the randomness of the intervals between requests, which simulates real user behavior.
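One common way to model Poisson arrivals is to draw the gaps between consecutive requests from an exponential distribution with rate λ, so the mean gap is 1/λ. The Python sketch below follows that approach. It assumes `poisson_lambda` is a rate in requests per second; the source states neither the unit nor the sampling method the tool actually uses, and `sample_intervals` is a hypothetical helper.

```python
import random


def sample_intervals(poisson_lambda, count, rng=random):
    """Draw `count` inter-request gaps (seconds) for a Poisson process.

    Gaps in a Poisson process with rate lambda are exponentially
    distributed with mean 1 / lambda.
    """
    if poisson_lambda <= 0:
        raise ValueError("poisson_lambda must be positive")
    return [rng.expovariate(poisson_lambda) for _ in range(count)]
```

Under this interpretation, a larger `poisson_lambda` produces shorter average gaps and therefore a higher request rate.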

### Tokenizer Configuration

```yaml
tokenizer:
  model: "gpt-3.5-turbo"  # Tokenizer model used by tiktoken-go
```

## Complete Configuration Example

```yaml
api:
  endpoint: "https://api.example.com/v1/completions"
  api_key: "your_api_key"
  model: "gpt-3.5-turbo"

prompt_templates:
  short:
    target_tokens: 50
    templates:
      - "什么是{concept}？"
      - "请简要解释{concept}的基本原理。"
      - "简述{topic}的主要观点。"
      - "{country}的首都是哪里？"
      - "如何快速入门{concept}？"
  long:
    target_tokens: 1000
    templates:
      - "请撰写一篇关于{country}历史的详细介绍，包括主要历史事件和文化发展。"
      - "请对{topic}进行深入分析，包括其背景、现状、问题和未来发展趋势。"
      - "请撰写一份关于{event}影响的综合报告，包括社会、经济和政治层面的分析。"
      - "请详细介绍{concept}的技术原理、应用场景和未来发展方向。"
      - "请撰写一篇关于{topic}的学术论文，包括研究背景、方法、结果和讨论。"

requests:
  - type: "short"
    weight: 0.7
  - type: "long"
    weight: 0.3

concurrency:
  steps: [50, 200, 500]
  duration_per_step: 300  # seconds

timeout: 60  # seconds
poisson_lambda: 1.0  # Poisson distribution parameter for request intervals

tokenizer:
  model: "gpt-3.5-turbo"  # Tokenizer model used by tiktoken-go
```