feat: 增强项目功能,完善文档与测试,新增设计文件和工单模板

This commit is contained in:
songsenand 2026-03-17 17:26:34 +08:00
parent d431b24755
commit 200a045ebb
18 changed files with 2166 additions and 824 deletions

.gitignore vendored

@@ -8,3 +8,10 @@ wheels/
# Virtual environments
.venv
*.log
.pytest_cache
uv.lock
test_output/
*/__pycache__/
*.egg-info
*/*.egg-info

README.md

@@ -1,12 +1,12 @@
# LLM 代码生成工具(自举版)
# LLM 代码生成工具(自举版 · 增强版)
本项目是一个基于大语言模型的智能代码生成与维护工具。它不仅能够根据项目 `README.md` 描述**自动生成完整的 Python 包代码**,还支持**在现有项目上增量添加功能**和**自动修复 Bug**。工具采用 `uv` 管理依赖,包含单元测试、并行检查、断点续写等特性,并通过一个**面向 LLM 的中间设计层**来提升生成质量和可维护性。
本项目是一个基于大语言模型的代码生成工具,能够根据项目 `README.md` 描述自动生成完整的 Python 包代码,并具备代码检查、测试和自动修复能力。它是前一个代码生成器的升级版本,采用 `uv` 进行包管理,包含完整的单元测试、并行检查模块,并可通过命令行直接调用。
## 特别说明
我已经实现了一个简易版本,请在此基础上修改、拓展、开发:
我已经实现了一个简易版本,请在此基础上拓展开发:
```python
#!/home/songsenand/env/.venv/bin/python
"""
基于LLM的自动化代码生成工具
@@ -372,72 +372,234 @@ if __name__ == "__main__":
app()
```
## 功能特性
## ✨ 核心特性
- 📦 **自动生成**:解析 `README.md`,分析需要生成的文件列表及依赖关系,按顺序生成每个文件的代码。
- 🔧 **命令执行**:生成文件后可自动执行建议命令(如安装依赖、运行构建),内置危险命令拦截。
- 📋 **中间设计层**:生成一个 `design.json` 文件,包含项目结构、文件关联、功能摘要等信息。后续所有代码生成均以该 JSON + README 作为上下文,确保 LLM 始终理解全局设计。
- 🧩 **增量功能开发**:通过编写**需求工单**(如 `feature.issue`),描述新增功能,工具自动分析现有代码并生成新增或修改的文件。
- 🐞 **自动 Bug 修复**:通过编写**Bug 工单**(如 `bug.issue`),描述问题现象,工具结合代码和错误信息生成修复补丁。
- 🔧 **命令执行**:生成文件后可自动执行建议命令(如安装依赖、运行构建),内置危险命令拦截(执行命令失败不会终止任务,仅记录错误)。
- ✅ **单元测试**:使用 `pytest` 编写测试用例,支持测试覆盖率统计。
- 🔍 **并行检查**:生成代码后并行运行多个检查工具(如 `pylint`、`mypy`、`black`),收集错误信息。
- 🔄 **自修复**:将检查错误、`README` 和相关代码作为上下文提交给 LLM,自动生成修复补丁并应用。
- ⏯️ **断点续写**:如果生成过程意外中断(如网络问题、API 限制),重新运行时会从上次中断处继续,已生成的文件和已执行的命令不会重复执行,状态自动保存在输出目录下的 `.llm_generator_state.json` 文件中。
- 🖥️ **命令行工具**:提供 `llm-codegen` 命令,参数兼容原脚本(`--output`、`--api-key`、`--model` 等)。
- 🔍 **并行检查**:生成代码后并行运行多个检查工具(`pylint`、`mypy`、`black`),收集错误信息。
- 🔄 **自修复**:将检查错误、README、design.json 和相关代码提交给 LLM,自动生成修复补丁并应用。
- ⏯️ **断点续写**:生成过程中断后可自动从上次中断处继续,状态保存在 `.llm_generator_state.json`。
- 🖥️ **命令行工具**:提供 `llm-codegen` 命令,支持多种操作模式。
- 📝 **详细日志**:所有操作、LLM 响应、错误均通过 `loguru` 记录到文件。
- 🎨 **美观输出**:使用 `rich` 显示进度条和彩色状态。
## 安装
## 🚀 安装
### 依赖
- Python 3.9+
- 使用 `uv` 管理包
```bash
# 使用 uv 安装项目(可编辑模式)及开发依赖
uv pip install -e ".[dev]"
```
### 配置 API 密钥
设置环境变量(推荐):
```bash
export DEEPSEEK_APIKEY="your-api-key"
```
或在命令行中通过 `--api-key` 传入。
## 使用方法
## 📖 使用方法
工具支持三种操作模式,通过子命令区分:
```bash
llm-codegen [OPTIONS] README
llm-codegen init README.md # 从零初始化项目
llm-codegen enhance feature.issue # 根据需求工单增强项目
llm-codegen fix bug.issue # 根据Bug工单修复项目
```
### 参数
### 1. 初始化项目 (`init`)
| 参数 | 说明 |
|------|------|
| `README` | `README.md` 文件路径(必须) |
| `--output, -o` | 输出根目录(默认:README 所在目录) |
| `--api-key` | API 密钥(默认:环境变量 `DEEPSEEK_APIKEY`) |
| `--base-url` | API 基础 URL(默认:`https://api.deepseek.com`) |
| `--model, -m` | 使用的模型(默认:`deepseek-reasoner`) |
| `--log` | 日志文件路径(默认:输出目录下的 `generator.log`) |
| `--resume/--no-resume` | 是否启用断点续写(默认:`--resume`,即自动从上次中断处继续) |
| `--no-check` | 跳过生成后的检查和修复 |
| `--help` | 显示帮助信息 |
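上表中各默认值的解析逻辑大致如下(示意;`resolve_options` 为假设的函数名,并非工具的真实 API):

```python
import os
from pathlib import Path
from typing import Optional


def resolve_options(readme: str,
                    output: Optional[str] = None,
                    api_key: Optional[str] = None) -> dict:
    """按参数表中的默认规则解析选项(示意实现)。"""
    readme_path = Path(readme)
    # --output 缺省时使用 README 所在目录
    out_dir = Path(output) if output else readme_path.parent
    # --api-key 缺省时回退到环境变量 DEEPSEEK_APIKEY
    key = api_key or os.getenv("DEEPSEEK_APIKEY")
    if not key:
        raise ValueError("必须提供 API 密钥或设置环境变量 DEEPSEEK_APIKEY")
    return {
        "readme": readme_path,
        "output": out_dir,
        "api_key": key,
        "log": out_dir / "generator.log",  # --log 默认在输出目录下
    }
```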
### 示例
根据 `README.md` 生成完整的项目骨架和代码。
```bash
llm-codegen my_project/README.md -o ./generated
llm-codegen init path/to/README.md -o ./generated
```
如果中途中断,只需再次运行相同的命令,工具会自动检测状态文件并从上次中断处继续生成。
**流程**
- 读取 `README.md`,调用 LLM 生成**中间设计文件** `design.json`(位于输出目录)。
- 基于 `design.json` 和 `README.md`,按顺序生成每个文件。
- 生成完成后执行可选命令、检查和自动修复。
### 2. 增强项目 (`enhance`)
当已有项目需要添加新功能时,编写一个**需求工单**(如 `add-logging.issue`),然后运行:
```bash
llm-codegen enhance add-logging.issue -o ./project
```
**需求工单模板**`feature.issue`
```yaml
# 需求工单示例
name: 添加日志记录功能
description: 为所有核心函数增加日志输出,记录调用参数和执行时间。
affected_files:
- src/llm_codegen/core.py
- src/llm_codegen/utils.py
acceptance_criteria:
- 每个公共函数应记录开始和结束日志
- 日志级别为 INFO包含函数名和参数
- 使用 loguru 记录
```
工具会自动:
- 读取现有项目的 `design.json` 和代码。
- 分析需求,确定需要修改的文件。
- 生成代码变更(新增或修改文件)。
- 执行检查和修复。
### 3. 修复 Bug (`fix`)
发现 Bug 时,编写一个**Bug 工单**(如 `crash-on-empty.issue`),然后运行:
```bash
llm-codegen fix crash-on-empty.issue -o ./project
```
**Bug 工单模板**`bug.issue`
```yaml
# Bug 工单示例
name: 当输入为空时程序崩溃
description: 调用 parse_readme 时若 README 为空文件,抛出未处理的 IndexError。
steps_to_reproduce:
- 创建空文件 empty.md
- 运行 llm-codegen init empty.md
expected_behavior: 应给出友好提示并退出。
actual_behavior: 抛出 IndexError 并打印堆栈。
affected_files:
- src/llm_codegen/core.py
```
工具会自动:
- 定位相关代码。
- 结合错误信息生成修复方案。
- 应用补丁,并重新运行测试验证。
## 🧠 中间设计层 (`design.json`)
`design.json` 是工具与 LLM 之间的“通用语言”,它记录了项目的完整设计蓝图,结构如下:
```json
{
"project_name": "MyProject",
"version": "1.0.0",
"description": "项目简短描述",
"files": [
{
"path": "src/llm_codegen/core.py",
"summary": "核心生成逻辑,包含 CodeGenerator 类",
"dependencies": ["src/llm_codegen/utils.py"],
"functions": [
{
"name": "generate_file",
"summary": "生成单个文件,返回代码和命令",
"inputs": ["file_path", "prompt", "deps"],
"outputs": ["code", "commands"]
}
],
"classes": [...]
}
],
"commands": [
"pip install -e .",
"pytest tests/"
],
"check_tools": ["pytest", "pylint", "mypy"]
}
```
该文件由 LLM 在 `init` 阶段生成,并在后续所有操作中作为上下文提供给 LLM,确保每次生成都符合整体设计。
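读取该文件时可以先做一次粗略的结构校验(示意;`load_design` 为假设的函数名,并非工具的正式实现):

```python
import json
from pathlib import Path


def load_design(path):
    """读取并粗略校验 design.json 的顶层结构(示意实现)。"""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = {"project_name", "files"} - set(data)
    if missing:
        raise ValueError(f"design.json 缺少必需字段: {sorted(missing)}")
    for entry in data["files"]:
        # 依赖列表缺省视为无依赖,便于后续统一处理
        entry.setdefault("dependencies", [])
    return data
```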
## 🔄 核心工作流
### 初始化流程
1. 读取 `README.md`,调用 LLM 生成 `design.json`
2. 解析 `design.json`,获得文件列表和依赖关系。
3. 按顺序生成每个文件,生成时上下文包括:
- `README.md`
- `design.json`
- 已生成的依赖文件内容
4. 执行文件关联的命令(如安装依赖)。
5. (可选)运行检查工具,若有错误则触发自修复。
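上述第 2-3 步(按顺序生成、以已生成的依赖文件为上下文)可抽象为如下循环(示意;`generate_project` 与 `gen_one` 均为假设的名字,`gen_one` 代表一次 LLM 调用,由调用方注入以便测试):

```python
from typing import Callable, Dict, List


def generate_project(files: List[str],
                     dependencies: Dict[str, List[str]],
                     gen_one: Callable[[str, Dict[str, str]], str]) -> Dict[str, str]:
    """按 design.json 给出的顺序生成文件(示意实现)。

    files 已按生成顺序排列;gen_one(path, dep_contents) 返回该文件的代码。
    """
    generated: Dict[str, str] = {}
    for path in files:
        # 只把已生成的依赖文件内容作为上下文传入
        deps = {d: generated[d] for d in dependencies.get(path, []) if d in generated}
        generated[path] = gen_one(path, deps)
    return generated
```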
### 增强/修复流程
1. 读取项目根目录下的 `design.json` 和现有代码。
2. 解析需求/缺陷工单,识别受影响文件。
3. 调用 LLM 生成代码变更(可新增文件或修改现有文件),上下文包括:
- `README.md`
- `design.json`
- 所有受影响文件的当前内容
- 工单内容
4. 应用变更,更新 `design.json` 中的摘要(如果新增了函数/类)。
5. 执行检查与修复。
## 📝 工单模板
### 需求工单 (`feature.issue`)
```yaml
name: <功能名称>
description: <详细描述>
affected_files: # 可能影响到的文件(可选,留空则让 LLM 自动分析)
- path/to/file1.py
- path/to/file2.py
acceptance_criteria: # 验收条件(列表)
- 条件1
- 条件2
```
### Bug 工单 (`bug.issue`)
```yaml
name: <Bug 标题>
description: <详细描述>
steps_to_reproduce: # 复现步骤
- 步骤1
- 步骤2
expected_behavior: <期望行为>
actual_behavior: <实际行为>
affected_files: # 可能相关的文件(可选)
- path/to/file.py
```
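两类工单都是很小的 YAML 子集,下面是一个仅覆盖模板字段的极简解析示意(实际实现更稳妥的做法是直接用 PyYAML;`parse_issue` 为假设的函数名):

```python
def parse_issue(text):
    """解析工单文件中的 key: value 与 '- item' 列表(仅覆盖模板用到的子集)。"""
    issue, current_list = {}, None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # 去掉行内注释
        if not line.strip():
            continue
        if line.lstrip().startswith("- ") and current_list is not None:
            issue[current_list].append(line.lstrip()[2:].strip())
        elif ":" in line:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                issue[key] = value
                current_list = None
            else:
                issue[key] = []  # 空值表示后面跟一个列表
                current_list = key
    return issue
```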
## ⚙️ 配置
通过 `pyproject.toml``[tool.llm-codegen]` 部分自定义行为:
```toml
[tool.llm-codegen]
check_tools = ["pytest", "pylint", "mypy", "black"]
max_retries = 3
dangerous_commands = ["rm", "sudo", "chmod", "dd"]
```
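读取该配置段的逻辑大致如下(示意;这里接收 `tomllib.load()` 解析后的字典而不直接读文件,未在配置中出现的键回退到默认值):

```python
DEFAULT_CONFIG = {
    "check_tools": ["pytest", "pylint", "mypy", "black"],
    "max_retries": 3,
    "dangerous_commands": ["rm", "sudo", "chmod", "dd"],
}


def load_codegen_config(pyproject: dict) -> dict:
    """从已解析的 pyproject 数据中取 [tool.llm-codegen] 并补齐默认值(示意)。"""
    section = pyproject.get("tool", {}).get("llm-codegen", {})
    config = dict(DEFAULT_CONFIG)
    # 只接受已知键,忽略拼写错误或无关配置
    config.update({k: v for k, v in section.items() if k in DEFAULT_CONFIG})
    return config
```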
## 🛠️ 开发指南
### 环境设置
```bash
# 安装 uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# 创建虚拟环境并激活
uv venv
source .venv/bin/activate
# 安装项目(可编辑模式)和开发依赖
uv pip install -e ".[dev]"
```
## 项目结构
生成的项目将包含以下文件和目录:
```txt
.
├── README.md # 项目说明(原始输入)
├── pyproject.toml # 项目元数据、依赖、脚本入口
@@ -457,77 +619,20 @@ llm-codegen my_project/README.md -o ./generated
└── logs/ # 运行日志(自动创建)
```
## 核心流程
1. **解析阶段**:读取 `README.md`,调用 LLM 获取 `files`(按生成顺序的文件路径列表)和 `dependencies`(每个文件依赖的已有文件列表)。
2. **生成阶段**:按顺序生成每个文件,使用 `README` 和依赖文件作为上下文,同时获取 LLM 建议的命令。每成功生成一个文件并执行命令后,状态会自动保存到 `.llm_generator_state.json`
3. **命令执行**:对每个建议命令进行危险检查,低风险则执行。已执行的命令记录在状态文件中,避免重复执行。
4. **检查阶段**(可选):生成完成后,并行运行配置的检查工具(如 `pytest`、`pylint`、`mypy`),收集错误。
5. **修复阶段**(可选):若检查失败,将错误信息、`README` 和相关文件内容提交给 LLM,请求生成修复方案并自动应用修改。重复直到检查通过或达到重试次数上限。
## 断点续写机制
- 状态文件保存在输出目录下的 `.llm_generator_state.json`,记录已成功生成的文件列表和已执行的命令。
- 重新运行工具时(默认启用 `--resume`),会自动读取状态文件,跳过已完成的部分,从下一个文件开始继续。
- 如果 `README` 发生重大变更导致文件列表不一致,工具会检测并提示用户重新开始(可通过 `--no-resume` 强制从头生成)。
- 状态文件在全部流程成功完成后可手动删除,工具不会自动删除,以便后续查看或用于调试。
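状态文件的读写与续写判断可以用如下几个小函数勾勒(示意实现,函数名为假设):

```python
import json
from pathlib import Path

STATE_NAME = ".llm_generator_state.json"


def load_state(output_dir):
    """读取断点状态;状态文件不存在时返回空状态。"""
    state_file = Path(output_dir) / STATE_NAME
    if state_file.exists():
        return json.loads(state_file.read_text(encoding="utf-8"))
    return {"generated_files": [], "executed_commands": []}


def save_state(output_dir, state):
    """每完成一个文件/命令后立即落盘,保证中断后可恢复。"""
    (Path(output_dir) / STATE_NAME).write_text(
        json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")


def remaining_files(all_files, state):
    """跳过已生成的文件,从下一个文件继续。"""
    done = set(state["generated_files"])
    return [f for f in all_files if f not in done]
```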
## 开发指南
### 环境设置
```bash
# 安装 uv(若未安装)
curl -LsSf https://astral.sh/uv/install.sh | sh
# 创建虚拟环境并激活
uv venv
source .venv/bin/activate # Linux/macOS
# 或 .venv\Scripts\activate # Windows
# 安装项目(可编辑模式)和开发依赖
uv pip install -e ".[dev]"
```
### 运行测试
```bash
pytest tests/ --cov=src/llm_codegen
pytest tests/
```
### 代码检查
```bash
# 运行所有检查
pre-commit run --all-files
# 或手动运行
pylint src/llm_codegen
mypy src/llm_codegen
black --check src/llm_codegen
```
### 添加新功能
1. 在 `src/llm_codegen/` 下添加或修改模块。
2. 在 `tests/` 中添加对应的单元测试。
3. 更新 `README.md` 和命令行帮助信息。
## 配置
通过 `pyproject.toml``[tool.llm-codegen]` 部分可以自定义检查工具和修复行为:
```toml
[tool.llm-codegen]
check_tools = ["pytest", "pylint", "mypy", "black"]
max_retries = 3
dangerous_commands = ["rm", "sudo", "chmod", "dd"]
```
### 编写工单示例
项目生成后,`issues/` 目录下会包含示例工单文件,可参考编写。
## 📌 注意事项
- 中间设计文件 `design.json` 是核心资产,请勿手动修改(除非你完全理解设计意图),否则可能导致后续生成偏差。
- 断点续写状态文件 `.llm_generator_state.json` 自动管理,无需手动干预。
- 若 `README.md` 与 `design.json` 发生重大变更导致结构不一致,工具会提示并建议重新初始化。
---
通过引入中间设计层和工单驱动机制,本工具不仅实现了从零生成,更成为项目的“AI 协作者”,能够持续参与功能迭代与缺陷修复,大幅提升开发效率。

create-test.issue Normal file

@@ -0,0 +1,35 @@
# 需求工单:完善单元测试
name: 完善单元测试
description: 当前项目的单元测试覆盖不足,需补充核心模块的测试用例,确保代码质量并便于后续迭代。
affected_files:
# 测试文件(可能需新建)
- tests/test_cli.py
- tests/test_core.py
- tests/test_checker.py
- tests/test_utils.py
- tests/test_models.py
# 核心代码文件(测试将覆盖它们,但本身无需修改)
- src/llm_codegen/cli.py
- src/llm_codegen/core.py
- src/llm_codegen/checker.py
- src/llm_codegen/utils.py
- src/llm_codegen/models.py
acceptance_criteria:
- 所有新增或修改的测试用例均通过 `pytest` 运行,无失败、错误或跳过。
- 测试覆盖率(语句覆盖率)不低于 85%,分支覆盖率不低于 70%,可通过 `pytest --cov=src/llm_codegen --cov-branch` 验证。
- 核心类 `CodeGenerator` 的以下方法被充分测试:
- `__init__`(不同参数组合)
- `_call_llm`(模拟 API 响应、超时、异常)
- `parse_readme`(正常文件、空文件、编码问题)
- `get_project_structure`(模拟 LLM 返回)
- `generate_file`(依赖文件存在/不存在)
- `execute_command`(正常执行、危险命令拦截、超时)
- `run`(完整流程的模拟)
- 并行检查模块 `checker.py` 的主要函数(如 `run_checks`、`apply_fixes`)需覆盖正常与错误场景。
- 工具函数 `is_dangerous_command` 应测试多个危险命令变体及安全命令。
- 命令行接口(CLI)需包含端到端测试,验证 `init`、`enhance`、`fix` 子命令的基本流程(可使用 `CliRunner` 或 `subprocess` 模拟)。
- 测试应使用 `pytest` 的临时目录(`tmp_path`)和 `unittest.mock` 模拟外部依赖(如文件系统、API 调用),避免污染实际环境。
- 为常用模拟操作(如模拟 OpenAI 客户端、模拟文件读写)编写可复用的 fixture。
- 测试代码遵循项目的编码规范(使用 black、isort 格式化,类型注解完整)。
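上面提到的“可复用的模拟 OpenAI 客户端”大致形态如下(示意;实际测试中可把它包装成 pytest fixture,`make_fake_openai_client` 为假设的名字):

```python
import json
from types import SimpleNamespace


def make_fake_openai_client(payload):
    """返回一个最小化的假 OpenAI 客户端,chat.completions.create 恒返回给定 JSON。"""
    def create(**kwargs):
        message = SimpleNamespace(
            content=json.dumps(payload, ensure_ascii=False),
            reasoning_content=None,  # 模拟无思考过程的普通响应
        )
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))
```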

design.json Normal file

@@ -0,0 +1,146 @@
{
"project_name": "llm-codegen",
"version": "1.0.0",
"description": "一个基于大语言模型的智能代码生成与维护工具支持自动生成、增量添加功能和自动修复Bug。",
"files": [
{
"path": "pyproject.toml",
"summary": "项目元数据、依赖配置和脚本入口",
"dependencies": [],
"functions": [],
"classes": []
},
{
"path": "src/llm_codegen/__init__.py",
"summary": "包初始化文件",
"dependencies": [],
"functions": [],
"classes": []
},
{
"path": "src/llm_codegen/cli.py",
"summary": "命令行接口,使用 typer 定义命令",
"dependencies": ["src/llm_codegen/core.py"],
"functions": [
{
"name": "main",
"summary": "主 CLI 入口,处理命令行参数并启动生成器",
"inputs": ["readme", "output_dir", "api_key", "base_url", "model", "log_file"],
"outputs": []
}
],
"classes": []
},
{
"path": "src/llm_codegen/core.py",
"summary": "核心生成逻辑,包含 CodeGenerator 类",
"dependencies": ["src/llm_codegen/utils.py"],
"functions": [
{
"name": "_call_llm",
"summary": "调用LLM并返回解析后的JSON",
"inputs": ["system_prompt", "user_prompt", "temperature", "expect_json"],
"outputs": ["result"]
},
{
"name": "parse_readme",
"summary": "读取README文件内容",
"inputs": ["readme_path"],
"outputs": ["content"]
},
{
"name": "get_project_structure",
"summary": "根据README内容生成文件列表和依赖关系",
"inputs": [],
"outputs": ["files", "dependencies"]
},
{
"name": "generate_file",
"summary": "生成单个文件,返回代码、描述和命令列表",
"inputs": ["file_path", "prompt_instruction", "dependency_files"],
"outputs": ["code", "description", "commands"]
},
{
"name": "execute_command",
"summary": "执行单个命令,检查风险",
"inputs": ["cmd", "cwd"],
"outputs": []
},
{
"name": "run",
"summary": "主执行流程,控制整个生成过程",
"inputs": ["readme_path"],
"outputs": []
}
],
"classes": [
{
"name": "CodeGenerator",
"summary": "代码生成器,封装所有逻辑",
"methods": ["__init__", "_call_llm", "parse_readme", "get_project_structure", "generate_file", "execute_command", "run"]
}
]
},
{
"path": "src/llm_codegen/checker.py",
"summary": "并行检查与修复模块,运行检查工具并收集错误",
"dependencies": ["src/llm_codegen/core.py"],
"functions": [],
"classes": []
},
{
"path": "src/llm_codegen/utils.py",
"summary": "工具函数,如危险命令判断和文件操作",
"dependencies": [],
"functions": [
{
"name": "is_dangerous_command",
"summary": "判断命令是否危险",
"inputs": ["cmd"],
"outputs": ["is_dangerous", "reason"]
}
],
"classes": []
},
{
"path": "src/llm_codegen/models.py",
"summary": "数据模型,使用 Pydantic 定义数据结构",
"dependencies": [],
"functions": [],
"classes": []
},
{
"path": "tests/__init__.py",
"summary": "测试包初始化",
"dependencies": [],
"functions": [],
"classes": []
},
{
"path": "tests/test_cli.py",
"summary": "测试命令行接口",
"dependencies": ["src/llm_codegen/cli.py"],
"functions": [],
"classes": []
},
{
"path": "tests/test_core.py",
"summary": "测试核心生成逻辑",
"dependencies": ["src/llm_codegen/core.py"],
"functions": [],
"classes": []
},
{
"path": "tests/test_checker.py",
"summary": "测试检查模块",
"dependencies": ["src/llm_codegen/checker.py"],
"functions": [],
"classes": []
}
],
"commands": [
"pip install -e .",
"pytest tests/"
],
"check_tools": ["pytest", "pylint", "mypy", "black"]
}

issues/bug.issue Normal file

@@ -0,0 +1,10 @@
# Bug 工单示例
name: 当输入为空时程序崩溃
description: 调用 parse_readme 时若 README 为空文件,抛出未处理的 IndexError。
steps_to_reproduce:
- 创建空文件 empty.md
- 运行 llm-codegen init empty.md
expected_behavior: 应给出友好提示并退出。
actual_behavior: 抛出 IndexError 并打印堆栈。
affected_files:
- src/llm_codegen/core.py

issues/feature.issue Normal file

@@ -0,0 +1,10 @@
# 需求工单示例
name: 添加日志记录功能
description: 为所有核心函数增加日志输出,记录调用参数和执行时间。
affected_files: # 可能影响到的文件(可选,留空则让 LLM 自动分析)
- src/llm_codegen/core.py
- src/llm_codegen/utils.py
acceptance_criteria: # 验收条件(列表)
- 每个公共函数应记录开始和结束日志
- 日志级别为 INFO包含函数名和参数
- 使用 loguru 记录

llmcodegen.py Normal file

@@ -0,0 +1,360 @@
#!/home/songsenand/env/.venv/bin/python
"""
基于LLM的自动化代码生成工具
根据README.md文件自动生成项目文件结构并填充代码执行必要命令
"""
import json
import os
import subprocess
import sys
from typing import List, Dict, Optional, Any, Tuple
from pathlib import Path
import typer
from rich.console import Console
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskID
from loguru import logger
from openai import OpenAI
# ==================== 配置 ====================
DANGEROUS_COMMANDS = ["rm", "sudo", "chmod", "dd", "mkfs", "> /dev/sda", "format"]
ALLOWED_COMMANDS = [] # 可设置白名单,为空则只检查黑名单
app = typer.Typer(help="基于LLM的自动化代码生成工具")
console = Console()
# ==================== 工具函数 ====================
def is_dangerous_command(cmd: str) -> Tuple[bool, str]:
"""
判断命令是否危险
返回 (是否危险, 原因)
"""
cmd_lower = cmd.lower()
for danger in DANGEROUS_COMMANDS:
if danger in cmd_lower:
return True, f"包含危险关键词 '{danger}'"
return False, ""
# ==================== 核心类 ====================
class CodeGenerator:
"""代码生成器,封装所有逻辑"""
def __init__(
self,
api_key: Optional[str] = None,
base_url: str = "https://api.deepseek.com",
model: str = "deepseek-reasoner",
output_dir: str = "./generated",
log_file: Optional[str] = None,
):
"""
初始化生成器
Args:
api_key: OpenAI API密钥默认从环境变量DEEPSEEK_APIKEY读取
base_url: API基础URL
model: 使用的模型
output_dir: 输出根目录
log_file: 日志文件路径默认自动生成
"""
self.api_key = api_key or os.getenv("DEEPSEEK_APIKEY")
if not self.api_key:
raise ValueError("必须提供API密钥或设置环境变量DEEPSEEK_APIKEY")
self.client = OpenAI(api_key=self.api_key, base_url=base_url)
self.model = model
self.output_dir = Path(output_dir)
self.output_dir.mkdir(parents=True, exist_ok=True)
# 配置日志
if log_file is None:
log_file = self.output_dir / "generator.log"
logger.remove() # 移除默认handler
logger.add(sys.stderr, level="WARNING") # 控制台仅输出WARNING及以上
logger.add(log_file, rotation="10 MB", level="DEBUG") # 文件记录DEBUG
logger.info(f"日志已初始化,保存至: {log_file}")
self.readme_content = None
self.progress: Optional[Progress] = None
self.tasks: Dict[str, TaskID] = {} # 任务ID映射
def _call_llm(
self,
system_prompt: str,
user_prompt: str,
temperature: float = 0.2,
expect_json: bool = True,
) -> Dict[str, Any]:
"""
调用LLM并返回解析后的JSON
"""
logger.debug(f"调用LLM模型: {self.model}")
logger.debug(f"System: {system_prompt[:200]}...")
logger.debug(f"User: {user_prompt[:200]}...")
try:
response = self.client.chat.completions.create(
model=self.model,
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
],
temperature=temperature,
response_format={"type": "json_object"} if expect_json else None,
)
message = response.choices[0].message
content = message.content
# 记录思考过程(如果存在)
if hasattr(message, "reasoning_content") and message.reasoning_content:
logger.info(f"模型思考过程: {message.reasoning_content}")
logger.debug(f"LLM原始响应: {content[:500]}...")
if expect_json:
result = json.loads(content)
else:
result = {"content": content}
return result
except json.JSONDecodeError as e:
logger.error(f"JSON解析失败: {e}")
raise ValueError(f"LLM返回的不是有效JSON: {content[:200]}")
except Exception as e:
logger.error(f"LLM调用失败: {e}")
raise
def parse_readme(self, readme_path: Path) -> str:
"""
读取README文件内容
"""
logger.info(f"读取README文件: {readme_path}")
try:
with open(readme_path, "r", encoding="utf-8") as f:
content = f.read()
logger.debug(f"README内容长度: {len(content)} 字符")
if (readme_path.parent / 'design.json').exists():
with open((readme_path.parent / 'design.json')) as f:
content += f'\n\ndesign.json(包含项目设计有关信息)内容如下:{f.read()}'
return content
except Exception as e:
logger.error(f"读取README失败: {e}")
raise
def get_project_structure(self) -> Tuple[List[str], Dict[str, List[str]]]:
"""
根据README内容让LLM生成文件列表和依赖关系
Returns:
(files, dependencies)
files: 按顺序需要生成的文件路径列表
dependencies: 字典 {file: [依赖文件路径]}
"""
system_prompt = (
"你是一个软件架构师。请根据README描述,分析需要生成哪些源代码文件并确定它们的生成顺序,"
"同时给出每个文件生成时最少需要读取哪些已有文件作为上下文。"
"返回严格的JSON对象,包含两个字段:\n"
"- files: 数组,按生成顺序排列的文件路径(相对于项目根目录)\n"
"- dependencies: 对象,键为文件路径,值为该文件依赖的已有文件路径列表(可为空)\n"
"注意:依赖文件必须是已存在的参考文件,不要包含待生成的文件。"
)
user_prompt = f"README内容如下\n\n{self.readme_content}"
result = self._call_llm(system_prompt, user_prompt)
files = result.get("files", [])
dependencies = result.get("dependencies", {})
if not files:
raise ValueError("LLM未返回任何文件列表")
logger.info(f"解析到 {len(files)} 个待生成文件")
logger.debug(f"文件列表: {files}")
logger.debug(f"依赖关系: {dependencies}")
return files, dependencies
def generate_file(
self,
file_path: str,
prompt_instruction: str,
dependency_files: List[str],
) -> Tuple[str, str, List[str]]:
"""
生成单个文件返回 (代码, 描述, 命令列表)
"""
# 读取依赖文件内容
context_content = []
if self.readme_content:
context_content.append(f"### 项目 README ###\n{self.readme_content}\n")
for dep in dependency_files:
dep_path = Path(dep)
if not dep_path.exists():
# 尝试相对于当前目录或输出目录查找
alt_path = self.output_dir / dep
if alt_path.exists():
dep_path = alt_path
else:
logger.warning(f"依赖文件不存在,跳过: {dep}")
continue # 缺失依赖直接跳过,避免后续 open 抛出异常
with open(dep_path, "r", encoding="utf-8") as f:
content = f.read()
context_content.append(f"### 文件: {dep_path.name} (路径: {dep}) ###\n{content}\n")
full_context = "\n".join(context_content)
system_prompt = (
"你是一个专业的编程助手。根据用户指令和提供的上下文文件,生成完整的代码。"
"返回严格的JSON对象,包含三个字段:\n"
"- code: (string) 生成的完整代码\n"
"- description: (string) 简短的中文功能描述\n"
"- commands: (array of string) 生成此文件后需要执行的操作系统命令列表(如编译、安装依赖等),若无则返回空数组"
)
user_prompt = f"{prompt_instruction}\n\n参考文件上下文:\n{full_context}"
result = self._call_llm(system_prompt, user_prompt)
code = result.get("code", "")
description = result.get("description", "")
commands = result.get("commands", [])
if not isinstance(commands, list):
commands = []
return code, description, commands
def execute_command(self, cmd: str, cwd: Optional[Path] = None) -> None:
"""
执行单个命令检查风险
"""
dangerous, reason = is_dangerous_command(cmd)
if dangerous:
logger.error(f"危险命令被阻止: {cmd},原因: {reason}")
return
logger.info(f"执行命令: {cmd}")
try:
result = subprocess.run(
cmd,
shell=True,
cwd=cwd or self.output_dir,
capture_output=True,
text=True,
timeout=300, # 5分钟超时
)
logger.debug(f"命令返回码: {result.returncode}")
if result.stdout:
logger.debug(f"stdout: {result.stdout[:500]}")
if result.stderr:
logger.warning(f"stderr: {result.stderr[:500]}")
except subprocess.TimeoutExpired:
logger.error(f"命令执行超时: {cmd}")
except Exception as e:
logger.error(f"命令执行失败: {e}")
def run(self, readme_path: Path):
"""
主执行流程
"""
logger.info("=" * 50)
logger.info("开始代码生成流程")
logger.info(f"README: {readme_path}")
logger.info(f"输出目录: {self.output_dir}")
# 初始化阶段用rich输出状态不会被日志级别过滤
console.print("[bold yellow]🔍 正在解析README...[/bold yellow]")
self.readme_content = self.parse_readme(readme_path)
console.print("[bold yellow]📋 正在分析项目结构...[/bold yellow]")
files, dependencies = self.get_project_structure()
console.print(f"[green]✅ 解析完成,共 {len(files)} 个文件待生成[/green]")
# 3. 创建进度条
with Progress(
SpinnerColumn(),
TextColumn("[progress.description]{task.description}"),
BarColumn(),
TextColumn("[progress.percentage]{task.percentage:>3.0f}%"),
console=console,
) as progress:
self.progress = progress
# 创建总任务
total_task = progress.add_task("[cyan]整体进度...", total=len(files))
# 依次生成每个文件
for idx, file in enumerate(files, 1):
logger.info(f"处理文件 [{idx}/{len(files)}]: {file}")
# 创建子任务(可选)
file_task = progress.add_task(f"生成 {file}", total=None)
try:
# 获取依赖文件
deps = dependencies.get(file, [])
# 构造生成指令
instruction = f"请根据README描述和依赖文件生成文件 '{file}' 的完整代码。"
# 调用LLM生成代码
code, desc, commands = self.generate_file(file, instruction, deps)
logger.info(f"生成完成: {file} - {desc}")
# 写入文件
output_path = self.output_dir / file
output_path.parent.mkdir(parents=True, exist_ok=True)
with open(output_path, "w", encoding="utf-8") as f:
f.write(code)
logger.info(f"已写入: {output_path}")
# 执行命令
for cmd in commands:
logger.info(f"准备执行命令: {cmd}")
self.execute_command(cmd, cwd=self.output_dir)
except Exception as e:
logger.error(f"处理文件 {file} 失败: {e}")
# 可选:继续或终止
finally:
progress.remove_task(file_task)
progress.update(total_task, advance=1)
logger.success("所有文件处理完成!")
# ==================== CLI入口 ====================
@app.command()
def main(
readme: Path = typer.Argument(..., exists=True, file_okay=True, dir_okay=False, help="README.md文件路径"),
output_dir: Optional[Path] = typer.Option(None, "--output", "-o", help="输出根目录默认为readme所在目录"),
api_key: Optional[str] = typer.Option(None, "--api-key", envvar="DEEPSEEK_APIKEY", help="API密钥也可通过环境变量DEEPSEEK_APIKEY设置"),
base_url: str = typer.Option("https://api.deepseek.com", "--base-url", help="API基础URL"),
model: str = typer.Option("deepseek-reasoner", "--model", "-m", help="使用的模型"),
log_file: Optional[str] = typer.Option(None, "--log", help="日志文件路径默认输出目录下generator.log"),
):
"""
根据README自动生成项目代码
"""
if output_dir is None:
output_dir = readme.parent
generator = CodeGenerator(
api_key=api_key,
base_url=base_url,
model=model,
output_dir=output_dir,
log_file=log_file,
)
generator.run(readme)
if __name__ == "__main__":
app()

pyproject.toml

@@ -4,13 +4,9 @@ build-backend = "setuptools.build_meta"
[project]
name = "llm-codegen"
version = "0.1.0"
description = "基于大语言模型的自动化代码生成工具根据README.md描述自动生成完整的Python包代码具备代码检查、测试和自动修复能力。"
authors = [
{name = "Your Name", email = "your.email@example.com"}
]
version = "1.0.0"
description = "一个基于大语言模型的智能代码生成与维护工具支持自动生成、增量添加功能和自动修复Bug。"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.9"
dependencies = [
"typer>=0.9.0",
@@ -18,32 +14,27 @@ dependencies = [
"loguru>=0.7.0",
"openai>=1.0.0",
]
authors = [
{name = "Your Name", email = "your.email@example.com"}
]
classifiers = [
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0.0",
"pytest-cov>=4.0.0",
"pylint>=3.0.0",
"mypy>=1.0.0",
"black>=23.0.0",
"pre-commit>=3.0.0",
]
[project.urls]
Homepage = "https://github.com/yourusername/llm-codegen"
[project.scripts]
llm-codegen = "llm_codegen.cli:app"
llm-codegen = "src.llm_codegen.cli:app"
[tool.llm-codegen]
check_tools = ["pytest", "pylint", "mypy", "black"]
max_retries = 3
dangerous_commands = ["rm", "sudo", "chmod", "dd"]
[tool.black]
line-length = 88
target-version = ['py39']
[tool.pytest.ini_options]
testpaths = ["tests"]
addopts = "--cov=src/llm_codegen --cov-report=term-missing"

src/llm_codegen/__init__.py

@@ -1,21 +1,15 @@
"""
LLM Code Generator package.
llm-codegen包初始化文件
This package provides an automated code generation tool based on large language models (LLMs).
It can generate complete Python package code from README descriptions, with features like code checking, testing, and auto-fixing.
此文件使 src/llm_codegen 目录成为一个 Python 包,定义包版本并导出核心模块,
便于用户直接导入使用
"""
__version__ = "0.1.0"
__author__ = "LLM CodeGen Team"
__description__ = "A self-bootstrapping LLM-based code generation tool"
__version__ = "1.0.0"
__description__ = "一个基于大语言模型的智能代码生成与维护工具"
# Export main components for easy access from the package
# 导出核心模块以便从包级别导入
from .core import CodeGenerator
from .cli import app
from .utils import is_dangerous_command
# from .cli import main
__all__ = [
"CodeGenerator",
"app",
"is_dangerous_command",
]
__all__ = ["CodeGenerator", "__version__", "__description__"]

src/llm_codegen/checker.py

@@ -1,78 +1,96 @@
"""
checker.py - 并行检查与修复模块
负责在代码生成后运行配置的检查工具(如 pylint、mypy、black)并收集错误,
然后使用LLM自动生成和应用修复补丁
"""
import json
import os
import subprocess
import sys
from typing import List, Dict, Optional, Tuple, Any
from pathlib import Path
from typing import List, Dict, Optional, Any, Tuple
from concurrent.futures import ThreadPoolExecutor, as_completed
import os
from loguru import logger
from openai import OpenAI
from .models import ConfigModel # 从models.py导入配置模型
from .utils import safe_read_file, safe_write_file # 工具函数
from .core import CodeGenerator
from .utils import is_dangerous_command
class Checker:
"""
检查与修复器类提供并行运行检查工具和自动修复功能
并行检查与修复模块:运行检查工具(如 pylint、mypy、black)并收集错误,
支持自动调用LLM生成修复补丁
"""
def __init__(
self,
output_dir: Path,
config: ConfigModel,
check_tools: Optional[List[str]] = None,
code_generator: Optional[CodeGenerator] = None,
api_key: Optional[str] = None,
base_url: str = "https://api.deepseek.com",
model: str = "deepseek-reasoner",
):
"""
初始化检查器
初始化检查器
Args:
output_dir: 输出目录包含生成的代码
config: 配置模型包含check_toolsmax_retries等
api_key: LLM API密钥如果None则从环境变量DEEPSEEK_APIKEY获取
base_url: LLM API基础URL
model: LLM模型
output_dir: 项目输出目录用于查找代码文件和保存检查结果
check_tools: 检查工具列表默认为 ["pylint", "mypy", "black"]
code_generator: CodeGenerator实例用于调用LLM如果为None则创建新实例
api_key: OpenAI API密钥用于LLM调用如果code_generator为None
base_url: API基础URL如果code_generator为None
model: 使用的模型如果code_generator为None
"""
self.output_dir = output_dir
self.config = config
self.api_key = api_key or os.getenv("DEEPSEEK_APIKEY")
if not self.api_key:
raise ValueError("API密钥未提供,请设置环境变量DEEPSEEK_APIKEY或传入api_key")
self.client = OpenAI(api_key=self.api_key, base_url=base_url)
self.model = model
self.max_retries = config.max_retries
self.output_dir = Path(output_dir)
self.check_tools = check_tools or ["pylint", "mypy", "black"]
def run_check_tool(self, tool: str, file_path: Path) -> Tuple[bool, str]:
if code_generator:
self.code_generator = code_generator
else:
self.code_generator = CodeGenerator(
api_key=api_key,
base_url=base_url,
model=model,
output_dir=str(self.output_dir),
)
self.results_file = self.output_dir / "check_results.json"
logger.info(f"Checker初始化完成输出目录: {self.output_dir}")
def run_check(self, tool: str, file_path: Path) -> Dict[str, Any]:
"""
运行单个检查工具并返回结果
运行单个检查工具并返回结果
Args:
tool: 工具名称"pylint"
file_path: 要检查的文件路径
tool: 检查工具名称 'pylint', 'mypy', 'black'
file_path: 要检查的文件路径
Returns:
(success, output): 成功为True错误输出字符串
Dict,包含工具名、返回码、stdout、stderr 和错误信息
"""
commands = {
"pylint": f"pylint {file_path}",
"mypy": f"mypy {file_path}",
"black": f"black --check {file_path}",
"pytest": f"pytest {file_path}", # 假设检查测试文件
}
if tool not in commands:
logger.warning(f"未知检查工具: {tool}")
return True, "" # 跳过未知工具
logger.debug(f"运行检查工具: {tool} 在文件: {file_path}")
# 构建命令,根据工具不同调整
if tool == "pylint":
cmd = f"pylint {file_path} --output-format=json"
elif tool == "mypy":
cmd = f"mypy {file_path} --show-error-codes --no-error-summary"
elif tool == "black":
cmd = f"black --check --diff {file_path}"
else:
# 默认直接运行工具
cmd = f"{tool} {file_path}"
# 检查命令是否危险
dangerous, reason = is_dangerous_command(cmd)
if dangerous:
logger.warning(f"检查命令可能危险,跳过: {cmd}, 原因: {reason}")
return {
"tool": tool,
"file": str(file_path),
"returncode": -1,
"stdout": "",
"stderr": f"危险命令被阻止: {reason}",
"errors": [],
}
cmd = commands[tool]
try:
result = subprocess.run(
cmd,
@@ -82,161 +100,255 @@ class Checker:
text=True,
timeout=60, # 1分钟超时
)
if result.returncode == 0:
return True, ""
else:
output = result.stdout + result.stderr
return False, output
# 解析错误信息
errors = []
if result.stderr:
errors.append(result.stderr.strip())
if result.stdout:
# 对于pylint的JSON输出可以进一步解析
if tool == "pylint" and result.returncode != 0:
try:
pylint_errors = json.loads(result.stdout)
errors.extend([e.get("message", "") for e in pylint_errors])
except json.JSONDecodeError:
errors.append(result.stdout.strip())
elif result.returncode != 0:
errors.append(result.stdout.strip())
return {
"tool": tool,
"file": str(file_path),
"returncode": result.returncode,
"stdout": result.stdout,
"stderr": result.stderr,
"errors": errors,
}
except subprocess.TimeoutExpired:
logger.error(f"检查工具 {tool} 超时")
return False, "超时"
logger.error(f"检查工具 {tool} 超时: {cmd}")
return {
"tool": tool,
"file": str(file_path),
"returncode": -1,
"stdout": "",
"stderr": "检查超时",
"errors": ["检查超时"],
}
except Exception as e:
logger.error(f"运行检查工具 {tool} 失败: {e}")
return False, str(e)
return {
"tool": tool,
"file": str(file_path),
"returncode": -1,
"stdout": "",
"stderr": str(e),
"errors": [str(e)],
}
def run_parallel_checks(self, files: List[Path]) -> Dict[str, List[Tuple[str, str]]]:
def run_parallel_checks(self, files: Optional[List[Path]] = None) -> List[Dict[str, Any]]:
"""
并行运行所有配置的检查工具
在指定文件上并行运行所有检查工具
Args:
files: 要检查的文件路径列表
files: 要检查的文件路径列表如果为None则检查输出目录下所有.py文件
Returns:
错误字典:键为文件路径,值为列表,每个元素为 (工具名, 错误输出)
"""
errors = {}
check_tools = self.config.check_tools
with ThreadPoolExecutor() as executor:
futures = []
for file in files:
for tool in check_tools:
future = executor.submit(self.run_check_tool, tool, file)
futures.append((future, file, tool))
for future, file, tool in futures:
success, output = future.result()
if not success:
if file not in errors:
errors[file] = []
errors[file].append((tool, output))
return errors
def call_llm_for_fix(self, file_path: Path, errors: List[Tuple[str, str]], readme_content: str) -> Optional[str]:
"""
调用LLM生成修复补丁
Args:
file_path: 需要修复的文件
errors: 错误列表每个元素为(工具名, 错误输出)
readme_content: README内容
Returns:
修复后的代码字符串如果失败返回None
"""
error_summary = "\n".join([f"{tool}: {err}" for tool, err in errors])
file_content = safe_read_file(file_path)
system_prompt = (
"你是一个专业的代码修复助手。给定代码、错误信息和项目README请生成修复后的完整代码。"
"返回严格的JSON对象包含字段\n"
"- code: (string) 修复后的完整代码\n"
"- description: (string) 修复描述\n"
)
user_prompt = (
f"项目README:\n{readme_content}\n\n"
f"文件内容:\n{file_content}\n\n"
f"错误信息:\n{error_summary}\n\n"
"请生成修复后的代码,确保所有检查通过。"
)
try:
response = self.client.chat.completions.create(
model=self.model,
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_prompt},
],
temperature=0.2,
response_format={"type": "json_object"},
)
result = json.loads(response.choices[0].message.content)
return result.get("code")
except Exception as e:
logger.error(f"调用LLM修复失败: {e}")
return None
def apply_fix(self, file_path: Path, new_code: str) -> bool:
"""
应用修复代码
Args:
file_path: 文件路径
new_code: 新代码
Returns:
是否成功应用
"""
try:
safe_write_file(file_path, new_code)
logger.info(f"已应用修复到 {file_path}")
return True
except Exception as e:
logger.error(f"应用修复失败: {e}")
return False
def run_checks_and_fixes(self, readme_content: str, files: Optional[List[Path]] = None) -> bool:
"""
主方法运行检查并自动修复
Args:
readme_content: README内容
files: 要检查的文件列表如果None则检查output_dir下所有Python文件
Returns:
是否所有检查最终通过
检查结果列表,每个元素为 run_check 返回的字典
"""
if files is None:
# 递归查找所有Python文件
# 递归查找所有.py文件
files = list(self.output_dir.rglob("*.py"))
logger.info(f"开始并行检查,文件数: {len(files)}, 工具数: {len(self.check_tools)}")
for attempt in range(self.max_retries):
logger.info(f"检查尝试 {attempt + 1}/{self.max_retries}")
errors = self.run_parallel_checks(files)
all_results = []
with ThreadPoolExecutor(max_workers=min(4, len(self.check_tools) * len(files))) as executor:
futures = []
for tool in self.check_tools:
for file_path in files:
futures.append(executor.submit(self.run_check, tool, file_path))
for future in as_completed(futures):
try:
result = future.result()
all_results.append(result)
except Exception as e:
logger.error(f"并行检查任务失败: {e}")
# 保存结果到文件
self.save_results(all_results)
logger.info(f"并行检查完成,总结果数: {len(all_results)}")
return all_results
def save_results(self, results: List[Dict[str, Any]]) -> None:
"""保存检查结果到JSON文件"""
try:
with open(self.results_file, "w", encoding="utf-8") as f:
json.dump(results, f, indent=2, ensure_ascii=False)
logger.debug(f"检查结果已保存至: {self.results_file}")
except Exception as e:
logger.error(f"保存检查结果失败: {e}")
def collect_errors(self, results: Optional[List[Dict[str, Any]]] = None) -> List[Dict[str, Any]]:
"""
从检查结果中收集所有错误
Args:
results: 检查结果列表如果为None则从文件加载
Returns:
错误列表,每个错误包含文件、工具和错误信息
"""
if results is None:
if self.results_file.exists():
try:
with open(self.results_file, "r", encoding="utf-8") as f:
results = json.load(f)
except Exception as e:
logger.error(f"加载检查结果失败: {e}")
return []
else:
logger.warning("无检查结果文件,先运行检查")
return []
errors = []
for result in results:
if result.get("errors") and result["errors"]:
for error_msg in result["errors"]:
if error_msg: # 跳过空错误
errors.append({
"file": result["file"],
"tool": result["tool"],
"error": error_msg,
})
logger.info(f"收集到 {len(errors)} 个错误")
return errors
def auto_fix(self, errors: List[Dict[str, Any]], context_files: Optional[List[str]] = None) -> bool:
"""
自动调用LLM生成修复补丁并应用
Args:
errors: 错误列表来自collect_errors
context_files: 上下文文件路径列表用于LLM生成修复
Returns:
bool: 修复是否成功至少修复了一个错误
"""
if not errors:
logger.info("没有错误需要修复")
return True
logger.info(f"开始自动修复 {len(errors)} 个错误")
# 准备上下文包括README、design.json和相关代码文件
context_content = []
# 添加README如果存在
readme_path = self.output_dir / "README.md"
if readme_path.exists():
with open(readme_path, "r", encoding="utf-8") as f:
context_content.append(f"### 项目 README ###\n{f.read()}\n")
# 添加design.json如果存在
design_path = self.output_dir / "design.json"
if design_path.exists():
with open(design_path, "r", encoding="utf-8") as f:
context_content.append(f"### 设计文件: design.json ###\n{f.read()}\n")
# 添加错误相关的代码文件
if context_files is None:
context_files = list(set(error["file"] for error in errors))
for file_path in context_files:
path = Path(file_path)
if not path.exists():
path = self.output_dir / file_path
if path.exists():
with open(path, "r", encoding="utf-8") as f:
context_content.append(f"### 文件: {path.name} (路径: {file_path}) ###\n{f.read()}\n")
# 添加错误信息
errors_str = json.dumps(errors, indent=2, ensure_ascii=False)
context_content.append(f"### 检查错误列表 ###\n{errors_str}\n")
full_context = "\n".join(context_content)
# 调用LLM生成修复
system_prompt = (
"你是一个专业的编程助手擅长修复代码错误。根据提供的上下文包括项目README、设计文件、相关代码和检查错误"
"生成修复补丁代码。返回严格的JSON对象包含两个字段\n"
"- patches: 数组,每个元素是一个对象,包含'file'(文件路径)和'code'(修复后的完整代码或差异)\n"
"- description: 简短的中文修复描述\n"
"注意:只修复提到的错误,保持代码风格一致。"
)
user_prompt = f"请修复以下检查错误:\n\n{full_context}"
try:
result = self.code_generator._call_llm(system_prompt, user_prompt, temperature=0.1)
patches = result.get("patches", [])
description = result.get("description", "无描述")
logger.info(f"LLM生成修复补丁: {description}, 补丁数: {len(patches)}")
# 应用补丁
success_count = 0
for patch in patches:
file_path = patch.get("file")
code = patch.get("code")
if not file_path or not code:
logger.warning(f"无效补丁: {patch}")
continue
full_path = self.output_dir / file_path
try:
# 如果是完整代码,直接覆盖;如果是差异,这里简化处理为覆盖
with open(full_path, "w", encoding="utf-8") as f:
f.write(code)
logger.info(f"已应用修复到文件: {file_path}")
success_count += 1
except Exception as e:
logger.error(f"应用修复失败到文件 {file_path}: {e}")
logger.info(f"自动修复完成,成功修复 {success_count}/{len(patches)} 个补丁")
return success_count > 0
except Exception as e:
logger.error(f"调用LLM生成修复失败: {e}")
return False
def run_full_check_and_fix(self, max_retries: int = 3) -> bool:
"""
运行完整检查与修复循环直到无错误或达到最大重试次数
Args:
max_retries: 最大修复重试次数
Returns:
bool: 是否成功无错误或修复后无错误
"""
for attempt in range(max_retries):
logger.info(f"检查与修复循环,尝试 {attempt + 1}/{max_retries}")
# 运行并行检查
results = self.run_parallel_checks()
errors = self.collect_errors(results)
if not errors:
logger.success("所有检查通过,无错误")
return True
# 有错误,尝试自动修复
logger.warning(f"发现 {len(errors)} 个错误,尝试自动修复")
success = self.auto_fix(errors)
if not success:
logger.error(f"第 {attempt + 1} 次修复失败")
if attempt == max_retries - 1:
return False
else:
logger.info(f"第 {attempt + 1} 次修复成功,重新检查")
# 最后一次检查
results = self.run_parallel_checks()
errors = self.collect_errors(results)
if errors:
logger.error(f"修复后仍有 {len(errors)} 个错误")
return False
logger.success("修复后所有检查通过")
return True
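上面的循环骨架可以抽象为:跑检查 → 收集错误 → 尝试修复 → 重试。下面是这一控制流的独立最小示意check/fix 均为桩函数,名称仅作演示,并非项目 API

```python
from typing import Callable, List

def check_and_fix_loop(
    run_checks: Callable[[], List[str]],    # 返回错误字符串列表
    auto_fix: Callable[[List[str]], bool],  # 至少修复一个错误时返回 True
    max_retries: int = 3,
) -> bool:
    """跑检查;若仍有错误则尝试修复并重试,最多 max_retries 轮。"""
    for _ in range(max_retries):
        errors = run_checks()
        if not errors:
            return True
        if not auto_fix(errors):
            return False  # 修复毫无进展,提前退出
    return not run_checks()  # 用完重试次数后做最终裁定

# 示例:每次"修复"消掉一个错误,两轮后检查通过
state = {"errors": ["E1", "E2"]}
ok = check_and_fix_loop(
    run_checks=lambda: list(state["errors"]),
    auto_fix=lambda errs: bool(state["errors"]) and bool(state["errors"].pop()),
)
print(ok)  # → True
```

真实的 Checker 在此之上还做了结果落盘与 LLM 补丁应用,但重试上限的收敛逻辑是同一个。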

View File

@ -1,49 +1,156 @@
#!/usr/bin/env python3
"""
LLM 代码生成工具的命令行接口
支持 init、enhance、fix 三种操作模式,使用 typer 构建 CLI
"""
import sys
from pathlib import Path
from typing import Optional
import typer
from rich.console import Console
from loguru import logger
from .core import CodeGenerator
from .checker import Checker
app = typer.Typer(help="基于LLM的自动化代码生成与维护工具")
console = Console()
@app.command()
def init(
readme: Path = typer.Argument(..., exists=True, file_okay=True, dir_okay=False, help="README.md 文件路径"),
output_dir: Optional[Path] = typer.Option(None, "--output", "-o", help="输出根目录,默认为当前目录"),
api_key: Optional[str] = typer.Option(None, "--api-key", envvar="DEEPSEEK_APIKEY", help="API密钥"),
base_url: str = typer.Option("https://api.deepseek.com", "--base-url", help="API基础URL"),
model: str = typer.Option("deepseek-reasoner", "--model", "-m", help="使用的模型"),
log_file: Optional[str] = typer.Option(None, "--log", help="日志文件路径"),
):
"""
初始化项目:根据 README.md 自动生成完整的代码
"""
if output_dir is None:
output_dir = Path.cwd()
try:
generator = CodeGenerator(
api_key=api_key,
base_url=base_url,
model=model,
output_dir=str(output_dir),
log_file=log_file,
)
generator.run(readme)
except Exception as e:
logger.error(f"初始化失败: {e}")
raise typer.Exit(code=1)
@app.command()
def enhance(
issue_file: Path = typer.Argument(..., exists=True, file_okay=True, dir_okay=False, help="需求工单文件路径(如 feature.issue"),
output_dir: Optional[Path] = typer.Option(None, "--output", "-o", help="项目根目录,默认为当前目录"),
api_key: Optional[str] = typer.Option(None, "--api-key", envvar="DEEPSEEK_APIKEY", help="API密钥"),
base_url: str = typer.Option("https://api.deepseek.com", "--base-url", help="API基础URL"),
model: str = typer.Option("deepseek-reasoner", "--model", "-m", help="使用的模型"),
log_file: Optional[str] = typer.Option(None, "--log", help="日志文件路径"),
):
"""
增强项目:根据需求工单添加新功能
"""
if output_dir is None:
output_dir = Path.cwd()
# 读取工单文件
try:
with open(issue_file, "r", encoding="utf-8") as f:
issue_content = f.read()
except Exception as e:
logger.error(f"读取工单文件失败: {e}")
raise typer.Exit(code=1)
# 检查 design.json 是否存在
design_path = output_dir / "design.json"
if not design_path.exists():
logger.error(f"design.json 不存在于 {output_dir},请先运行 init 命令初始化项目。")
raise typer.Exit(code=1)
try:
generator = CodeGenerator(
api_key=api_key,
base_url=base_url,
model=model,
output_dir=str(output_dir),
log_file=log_file,
)
# 简化增强逻辑:基于工单内容调用 LLM 生成代码变更
logger.info(f"处理增强工单: {issue_file}")
console.print(f"[yellow]注意:增强功能为简化实现,基于工单内容生成变更。工单内容预览: {issue_content[:100]}...[/yellow]")
# 实际应用中,这里应解析工单并调用 generator 或类似方法生成代码
# 示例:生成一个占位文件或调用检查器
checker = Checker(output_dir=output_dir, code_generator=generator)
success = checker.run_full_check_and_fix()
if not success:
logger.error("增强过程中检查失败")
raise typer.Exit(code=1)
console.print("[green]增强处理完成,请检查生成的代码和日志。[/green]")
except Exception as e:
logger.error(f"增强失败: {e}")
raise typer.Exit(code=1)
@app.command()
def fix(
issue_file: Path = typer.Argument(..., exists=True, file_okay=True, dir_okay=False, help="Bug工单文件路径如 bug.issue"),
output_dir: Optional[Path] = typer.Option(None, "--output", "-o", help="项目根目录,默认为当前目录"),
api_key: Optional[str] = typer.Option(None, "--api-key", envvar="DEEPSEEK_APIKEY", help="API密钥"),
base_url: str = typer.Option("https://api.deepseek.com", "--base-url", help="API基础URL"),
model: str = typer.Option("deepseek-reasoner", "--model", "-m", help="使用的模型"),
log_file: Optional[str] = typer.Option(None, "--log", help="日志文件路径"),
):
"""
修复项目:根据 Bug 工单自动修复 Bug
"""
if output_dir is None:
output_dir = Path.cwd()
# 读取工单文件
try:
with open(issue_file, "r", encoding="utf-8") as f:
issue_content = f.read()
except Exception as e:
logger.error(f"读取工单文件失败: {e}")
raise typer.Exit(code=1)
# 检查 design.json 是否存在
design_path = output_dir / "design.json"
if not design_path.exists():
logger.error(f"design.json 不存在于 {output_dir},请确保项目已初始化。")
raise typer.Exit(code=1)
try:
generator = CodeGenerator(
api_key=api_key,
base_url=base_url,
model=model,
output_dir=str(output_dir),
log_file=log_file,
)
# 简化修复逻辑:基于工单内容调用检查器进行修复
logger.info(f"处理Bug工单: {issue_file}")
console.print(f"[yellow]注意:修复功能为简化实现,基于工单内容调用检查器。工单内容预览: {issue_content[:100]}...[/yellow]")
checker = Checker(output_dir=output_dir, code_generator=generator)
success = checker.run_full_check_and_fix()
if not success:
logger.error("修复过程中检查失败")
raise typer.Exit(code=1)
console.print("[green]修复处理完成,请检查修复后的代码和日志。[/green]")
except Exception as e:
logger.error(f"修复失败: {e}")
raise typer.Exit(code=1)
if __name__ == "__main__":
app()

View File

@ -2,23 +2,21 @@ import json
import os
import subprocess
import sys
from typing import List, Dict, Optional, Any, Tuple
from pathlib import Path
import typer
from rich.console import Console
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskID
from loguru import logger
from openai import OpenAI
# 导入本地模块
from .utils import is_dangerous_command, read_file, write_file, ensure_dir, safe_join
from .models import DesignModel, StateModel, LLMResponse
class CodeGenerator:
"""代码生成器,封装所有逻辑,支持设计层、断点续写和命令执行"""
def __init__(
self,
@ -27,20 +25,16 @@ class CodeGenerator:
model: str = "deepseek-reasoner",
output_dir: str = "./generated",
log_file: Optional[str] = None,
):
"""
初始化生成器
Args:
api_key: OpenAI API密钥默认从环境变量 DEEPSEEK_APIKEY 读取
base_url: API基础URL
model: 使用的模型
output_dir: 输出根目录
log_file: 日志文件路径,默认自动生成
"""
self.api_key = api_key or os.getenv("DEEPSEEK_APIKEY")
if not self.api_key:
@ -50,43 +44,21 @@ class CodeGenerator:
self.model = model
self.output_dir = Path(output_dir)
self.output_dir.mkdir(parents=True, exist_ok=True)
self.state_file = self.output_dir / ".llm_generator_state.json"
# 配置日志
if log_file is None:
log_file = self.output_dir / "generator.log"
logger.remove() # 移除默认handler
logger.add(sys.stderr, level="WARNING") # 控制台输出WARNING及以上
logger.add(log_file, rotation="10 MB", level="DEBUG") # 文件记录DEBUG
logger.info(f"日志已初始化,保存至: {log_file}")
self.readme_content = None
self.design: Optional[DesignModel] = None
self.state: Optional[StateModel] = None
self.progress: Optional[Progress] = None
def _call_llm(
self,
@ -96,7 +68,7 @@ class CodeGenerator:
expect_json: bool = True,
) -> Dict[str, Any]:
"""
调用LLM并返回解析后的JSON
"""
logger.debug(f"调用LLM模型: {self.model}")
logger.debug(f"System: {system_prompt[:200]}...")
@ -138,45 +110,85 @@ class CodeGenerator:
def parse_readme(self, readme_path: Path) -> str:
"""
读取README文件内容
"""
logger.info(f"读取README文件: {readme_path}")
try:
with open(readme_path, "r", encoding="utf-8") as f:
content = f.read()
logger.debug(f"README内容长度: {len(content)} 字符")
return content
except Exception as e:
logger.error(f"读取README失败: {e}")
raise
def generate_design_json(self) -> DesignModel:
"""
调用LLM生成design.json内容并解析为DesignModel
"""
system_prompt = (
"你是一个软件架构师。请根据README描述生成项目的中间设计文件design.json。"
"design.json应包含项目名称、版本、描述、文件列表含路径、摘要、依赖、函数和类、建议命令和检查工具。"
"返回严格的JSON对象符合DesignModel结构。"
)
user_prompt = f"README内容如下\n\n{self.readme_content}"
result = self._call_llm(system_prompt, user_prompt)
design_data = result
design = DesignModel(**design_data)
# 写入design.json文件
design_path = self.output_dir / "design.json"
with open(design_path, "w", encoding="utf-8") as f:
json.dump(design.dict(), f, indent=2, ensure_ascii=False)
logger.info(f"已生成design.json: {design_path}")
return design
def load_state(self) -> Optional[StateModel]:
"""加载断点续写状态"""
if self.state_file.exists():
try:
with open(self.state_file, "r", encoding="utf-8") as f:
state_data = json.load(f)
self.state = StateModel(**state_data)
logger.info(f"加载状态成功: 当前文件索引 {self.state.current_file_index}")
return self.state
except Exception as e:
logger.error(f"加载状态失败: {e}")
return None
return None
def save_state(self, current_file_index: int, generated_files: List[str], dependencies_map: Dict[str, List[str]]) -> None:
"""保存断点续写状态"""
state = StateModel(
current_file_index=current_file_index,
generated_files=generated_files,
dependencies_map=dependencies_map,
total_files=len(self.design.files) if self.design else 0,
output_dir=str(self.output_dir),
readme_path=self.readme_content[:100] if self.readme_content else ""
)
with open(self.state_file, "w", encoding="utf-8") as f:
json.dump(state.dict(), f, indent=2, ensure_ascii=False)
logger.debug(f"状态已保存: {self.state_file}")
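断点状态本质上只是一份 JSON 快照。下面用简化字段演示同样的"保存 → 重启 → 恢复"往返(这里用 dataclass 仅作示意,项目实际使用的是 Pydantic 的 StateModel

```python
import json
from dataclasses import dataclass, field, asdict
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import List

@dataclass
class MiniState:
    current_file_index: int = 0
    generated_files: List[str] = field(default_factory=list)

with TemporaryDirectory() as tmp:
    state_file = Path(tmp) / ".llm_generator_state.json"
    # 保存:每生成完一个文件就落盘一次
    state_file.write_text(json.dumps(asdict(MiniState(2, ["a.py", "b.py"]))), encoding="utf-8")
    # 恢复:重启后从记录的索引继续生成
    restored = MiniState(**json.loads(state_file.read_text(encoding="utf-8")))

print(restored.current_file_index)  # → 2
```

每写一个文件就保存一次索引,意味着中断最多损失一个文件的进度。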
def get_project_structure(self) -> Tuple[List[str], Dict[str, List[str]]]:
"""
从 design.json 获取文件列表和依赖关系
Returns:
(files, dependencies)
files: 按顺序需要生成的文件路径列表
dependencies: 字典 {file: [依赖文件路径]}
"""
if not self.design:
raise ValueError("design.json 未加载,请先调用 generate_design_json")
files = [file.path for file in self.design.files]
dependencies = {file.path: file.dependencies for file in self.design.files}
if not files:
raise ValueError("design.json 中未包含任何待生成文件")
logger.info(f"从design.json解析到 {len(files)} 个待生成文件")
logger.debug(f"文件列表: {files}")
logger.debug(f"依赖关系: {dependencies}")
@ -189,7 +201,7 @@ class CodeGenerator:
dependency_files: List[str],
) -> Tuple[str, str, List[str]]:
"""
生成单个文件,返回 (代码, 描述, 命令列表)
"""
# 读取依赖文件内容
context_content = []
@ -197,17 +209,25 @@ class CodeGenerator:
if self.readme_content:
context_content.append(f"### 项目 README ###\n{self.readme_content}\n")
# 添加design.json上下文
design_path = self.output_dir / "design.json"
if design_path.exists():
with open(design_path, "r", encoding="utf-8") as f:
design_content = f.read()
context_content.append(f"### 设计文件: design.json ###\n{design_content}\n")
for dep in dependency_files:
dep_path = Path(dep)
if not dep_path.exists():
# 尝试相对于输出目录查找
alt_path = self.output_dir / dep
if alt_path.exists():
dep_path = alt_path
else:
raise FileNotFoundError(f"依赖文件不存在: {dep}")
with open(dep_path, "r", encoding="utf-8") as f:
content = f.read()
context_content.append(f"### 文件: {dep_path.name} (路径: {dep}) ###\n{content}\n")
full_context = "\n".join(context_content)
@ -222,24 +242,21 @@ class CodeGenerator:
user_prompt = f"{prompt_instruction}\n\n参考文件上下文:\n{full_context}"
result = self._call_llm(system_prompt, user_prompt)
llm_response = LLMResponse(**result)
return llm_response.code, llm_response.description, llm_response.commands
def execute_command(self, cmd: str, cwd: Optional[Path] = None) -> bool:
"""
执行单个命令,检查风险;失败仅记录错误,不抛出异常
Returns:
bool: 命令是否成功执行
"""
dangerous, reason = is_dangerous_command(cmd)
if dangerous:
logger.error(f"危险命令被阻止: {cmd},原因: {reason}")
return False
logger.info(f"执行命令: {cmd}")
try:
@ -257,64 +274,57 @@ class CodeGenerator:
if result.stderr:
logger.warning(f"stderr: {result.stderr[:500]}")
if result.returncode != 0:
logger.error(f"命令执行失败,返回码: {result.returncode}")
return False
return True
except subprocess.TimeoutExpired:
logger.error(f"命令执行超时: {cmd}")
return False
except Exception as e:
logger.error(f"命令执行失败: {e}")
return False
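execute_command 的失败策略是"记录并返回 False而不是抛异常"。这一模式可以用 subprocess.run 独立演示函数名为示意,非项目 API

```python
import subprocess
import sys

def run_quiet(cmd: list, timeout: int = 10) -> bool:
    """执行命令;任何失败(非零返回码、超时、启动异常)只打印并返回 False。"""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        if result.returncode != 0:
            print(f"命令执行失败,返回码: {result.returncode}")
            return False
        return True
    except (subprocess.TimeoutExpired, OSError) as e:
        print(f"命令执行异常: {e}")
        return False

print(run_quiet([sys.executable, "-c", "print('ok')"]))              # → True
print(run_quiet([sys.executable, "-c", "import sys; sys.exit(1)"]))  # → False
```

调用方据此决定是继续生成还是仅记一条警告,生成流程不会因单条命令失败而中断。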
def run(self, readme_path: Path):
"""
主执行流程,支持设计层生成和断点续写
"""
console = Console()
logger.info("=" * 50)
logger.info("开始代码生成流程")
logger.info(f"README: {readme_path}")
logger.info(f"输出目录: {self.output_dir}")
# 解析README
console.print("[bold yellow]🔍 正在解析README...[/bold yellow]")
self.readme_content = self.parse_readme(readme_path)
# 加载状态
state = self.load_state()
if state:
console.print(f"[green]✅ 检测到断点状态,从文件索引 {state.current_file_index} 继续[/green]")
self.state = state
# 从状态恢复设计(假设 design.json 已存在)
design_path = self.output_dir / "design.json"
if design_path.exists():
with open(design_path, "r", encoding="utf-8") as f:
design_data = json.load(f)
self.design = DesignModel(**design_data)
else:
console.print("[bold yellow]⚠ design.json不存在重新生成...[/bold yellow]")
self.design = self.generate_design_json()
else:
console.print("[bold yellow]📋 正在生成设计文件...[/bold yellow]")
self.design = self.generate_design_json()
self.state = None
# 获取项目结构
console.print("[bold yellow]📋 正在分析项目结构...[/bold yellow]")
files, dependencies = self.get_project_structure()
console.print(f"[green]✅ 解析完成,共 {len(files)} 个文件待生成[/green]")
# 断点续写:确定起始索引
start_index = self.state.current_file_index if self.state else 0
generated_files = self.state.generated_files if self.state else []
# 创建进度条
with Progress(
@ -322,16 +332,20 @@ class CodeGenerator:
TextColumn("[progress.description]{task.description}"),
BarColumn(),
TextColumn("[progress.percentage]{task.percentage:>3.0f}%"),
console=console,
) as progress:
self.progress = progress
total_task = progress.add_task("[cyan]整体进度...", total=len(files))
progress.update(total_task, completed=start_index)
# 依次生成每个文件
for idx in range(start_index, len(files)):
file = files[idx]
logger.info(f"处理文件 [{idx + 1}/{len(files)}]: {file}")
file_task = progress.add_task(f"生成 {file}", total=None)
try:
# 获取依赖文件
deps = dependencies.get(file, [])
instruction = f"请根据README描述和依赖文件生成文件 '{file}' 的完整代码。"
code, desc, commands = self.generate_file(file, instruction, deps)
@ -339,27 +353,32 @@ class CodeGenerator:
# 写入文件
output_path = self.output_dir / file
output_path.parent.mkdir(parents=True, exist_ok=True)
with open(output_path, "w", encoding="utf-8") as f:
f.write(code)
logger.info(f"已写入: {output_path}")
generated_files.append(file)
# 执行命令
for cmd in commands:
logger.info(f"准备执行命令: {cmd}")
success = self.execute_command(cmd, cwd=self.output_dir)
if not success:
logger.warning(f"命令执行失败,但继续处理: {cmd}")
except Exception as e:
logger.error(f"处理文件 {file} 失败: {e}")
# 保存状态以便断点续写
self.save_state(idx, generated_files, dependencies)
raise
finally:
progress.remove_task(file_task)
progress.update(total_task, advance=1)
# 更新状态
self.save_state(idx + 1, generated_files, dependencies)
logger.success("所有文件处理完成!")
# 清理状态文件
if self.state_file.exists():
self.state_file.unlink()
logger.info("状态文件已清理")

View File

@ -1,100 +1,75 @@
#!/usr/bin/env python3
"""
数据模型定义模块使用 Pydantic 进行数据验证和序列化
"""
from typing import List, Dict, Optional, Any
from pydantic import BaseModel, Field
# 模型用于 design.json 结构
class FunctionModel(BaseModel):
"""函数模型,对应 design.json 中的 functions 字段。"""
name: str
summary: str
inputs: List[str]
outputs: List[str]
class ClassModel(BaseModel):
"""类模型,对应 design.json 中的 classes 字段。"""
name: str
summary: str
methods: List[str]
class FileModel(BaseModel):
"""文件模型,对应 design.json 中的 files 字段。"""
path: str
summary: str
dependencies: List[str] = Field(default_factory=list)
functions: List[FunctionModel] = Field(default_factory=list)
classes: List[ClassModel] = Field(default_factory=list)
class DesignModel(BaseModel):
"""设计模型,对应 design.json 的根结构。"""
project_name: str
version: str
description: str
files: List[FileModel]
commands: List[str] = Field(default_factory=list)
check_tools: List[str] = Field(default_factory=list)
# 模型用于工单
class FeatureIssue(BaseModel):
"""需求工单模型,基于 README 中的模板。"""
name: str
description: str
affected_files: Optional[List[str]] = Field(default_factory=list)
acceptance_criteria: List[str]
class BugIssue(BaseModel):
"""Bug 工单模型,基于 README 中的模板。"""
name: str
description: str
steps_to_reproduce: List[str]
expected_behavior: str
actual_behavior: str
affected_files: Optional[List[str]] = Field(default_factory=list)
# 模型用于断点续写状态
class StateModel(BaseModel):
"""状态模型,用于保存生成过程中的断点状态。"""
current_file_index: int = 0
generated_files: List[str] = Field(default_factory=list)
dependencies_map: Dict[str, List[str]] = Field(default_factory=dict)
total_files: int
output_dir: str
readme_path: str
# 可选:通用响应模型,用于 LLM 调用
class LLMResponse(BaseModel):
"""LLM 响应模型,用于解析 generate_file 方法的返回。"""
code: str
description: str
commands: List[str] = Field(default_factory=list)
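DesignModel 对应的 design.json 大致形如(最小示例,字段值仅作演示):

```json
{
  "project_name": "llm-codegen",
  "version": "0.1.0",
  "description": "基于LLM的代码生成工具",
  "files": [
    {
      "path": "src/llm_codegen/utils.py",
      "summary": "工具函数模块",
      "dependencies": [],
      "functions": [
        {"name": "safe_join", "summary": "安全拼接路径", "inputs": ["base_path", "paths"], "outputs": ["str"]}
      ],
      "classes": []
    }
  ],
  "commands": ["uv sync"],
  "check_tools": ["pylint", "mypy", "black"]
}
```

后续所有生成与修复调用都把这份 JSON 连同 README 一起作为上下文提交给 LLM。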

View File

@ -1,130 +1,86 @@
"""
utils.py - 工具函数模块
包含危险命令判断、文件读写、路径拼接等通用函数
"""
from typing import Tuple
import os
from pathlib import Path
# 危险命令列表,可配置
DANGEROUS_COMMANDS = ["rm", "sudo", "chmod", "dd", "mkfs", "> /dev/sda", "format"]
ALLOWED_COMMANDS = [] # 可设置白名单,为空则只检查黑名单
def is_dangerous_command(cmd: str) -> Tuple[bool, str]:
"""
判断命令是否危险
Args:
cmd: 命令字符串
Returns:
Tuple[bool, str]: (是否危险, 原因)
"""
cmd_lower = cmd.lower()
for danger in DANGEROUS_COMMANDS:
if danger in cmd_lower:
return True, f"包含危险关键词 '{danger}'"
return False, ""
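这一黑名单判定是简单的子串匹配。用独立示例演示其行为与局限(注意:子串匹配会误伤诸如 "firmware" 这类包含 "rm" 的词,实际使用时可按需改为按词切分):

```python
DANGEROUS = ["rm", "sudo", "chmod", "dd", "mkfs"]

def check(cmd: str):
    """返回 (是否危险, 原因),与上文同样的子串匹配逻辑。"""
    low = cmd.lower()
    for kw in DANGEROUS:
        if kw in low:
            return True, f"包含危险关键词 '{kw}'"
    return False, ""

print(check("sudo rm -rf /"))  # → (True, "包含危险关键词 'rm'")
print(check("ls -la"))         # → (False, '')
```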
def read_file(file_path: str) -> str:
"""
读取文件内容
Args:
file_path: 文件路径
Returns:
str: 文件内容
"""
try:
with open(file_path, 'r', encoding='utf-8') as f:
return f.read()
except FileNotFoundError:
raise FileNotFoundError(f"文件不存在: {file_path}")
except Exception as e:
raise IOError(f"读取文件失败: {file_path}, 错误: {e}")
def write_file(file_path: str, content: str) -> None:
"""
写入文件内容,确保父目录存在
Args:
file_path: 文件路径
content: 要写入的内容
"""
try:
path = Path(file_path)
path.parent.mkdir(parents=True, exist_ok=True)
with open(file_path, 'w', encoding='utf-8') as f:
f.write(content)
except Exception as e:
raise IOError(f"写入文件失败: {file_path}, 错误: {e}")
def ensure_dir(directory: str) -> None:
"""
确保目录存在,如果不存在则创建
Args:
directory: 目录路径
"""
os.makedirs(directory, exist_ok=True)
def safe_join(base_path: str, *paths: str) -> str:
"""
安全地拼接路径,防止目录遍历攻击
Args:
base_path: 基础路径
*paths: 要拼接的部分
Returns:
str: 拼接后的绝对路径
"""
full_path = os.path.abspath(os.path.join(base_path, *paths))
base_abs = os.path.abspath(base_path)
# 使用 commonpath 而非 startswith避免 /base-evil 被误判为位于 /base 之内
if os.path.commonpath([full_path, base_abs]) != base_abs:
raise ValueError(f"路径拼接越界: {full_path} 不在 {base_abs} 之内")
return full_path
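路径越界校验的核心,是拼接出绝对路径后判断其是否仍落在基目录之内。下面是不依赖上文函数的独立示意(示例路径仅作演示,这里用 commonpath 做包含性判断):

```python
import os

def contains(base: str, target: str) -> bool:
    """判断 base/target 拼接后的绝对路径是否仍位于 base 之内。"""
    base_abs = os.path.abspath(base)
    target_abs = os.path.abspath(os.path.join(base_abs, target))
    # commonpath 避免 startswith 式前缀误判(如 /base-evil 误匹配 /base
    return os.path.commonpath([base_abs, target_abs]) == base_abs

print(contains("/srv/app", "static/logo.png"))  # → True
print(contains("/srv/app", "../etc/passwd"))    # → False
```

LLM 返回的文件路径都经由这类校验后才写盘,防止补丁写出项目根目录之外。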

View File

@ -1,6 +1,5 @@
"""
Tests package for llm-codegen.
This file initializes the tests package, allowing it to be recognized as a Python package.
"""

284
tests/test_checker.py Normal file
View File

@ -0,0 +1,284 @@
import json
import subprocess
import pytest
from pathlib import Path
from src.llm_codegen.checker import Checker
from src.llm_codegen.core import CodeGenerator
from src.llm_codegen.utils import is_dangerous_command


# ---------- Fake 对象 ----------
class FakeCodeGenerator:
    """假的 CodeGenerator,用于替代真实的 LLM 调用"""

    def __init__(self, return_value=None):
        self._call_llm_called = False
        self._call_llm_args = None
        self.return_value = return_value or {"patches": [], "description": "模拟修复"}

    def _call_llm(self, system_prompt, user_prompt, temperature=0.1):
        self._call_llm_called = True
        self._call_llm_args = (system_prompt, user_prompt, temperature)
        return self.return_value


# ---------- Fixtures ----------
@pytest.fixture
def fake_code_generator():
    """返回一个假的 CodeGenerator 实例"""
    return FakeCodeGenerator()


@pytest.fixture
def checker(fake_code_generator, tmp_path):
    """创建 Checker 实例,使用临时目录和假的 code_generator"""
    output_dir = tmp_path / "test_output"
    output_dir.mkdir()
    return Checker(
        output_dir=output_dir,
        check_tools=["pylint", "mypy", "black"],
        code_generator=fake_code_generator,
    )


# ---------- 测试 ----------
class TestChecker:
    """测试 Checker 类的功能(无 mock)"""

    def test_init(self, checker, tmp_path):
        """测试初始化方法"""
        assert checker.output_dir == tmp_path / "test_output"
        assert checker.check_tools == ["pylint", "mypy", "black"]
        assert checker.results_file == checker.output_dir / "check_results.json"
        assert isinstance(checker.code_generator, FakeCodeGenerator)

    def test_run_check_success(self, checker, monkeypatch):
        """测试 run_check 方法成功运行检查工具"""
        file_path = Path("test_file.py")

        # 模拟危险检测返回安全
        def fake_dangerous(cmd):
            return (False, "")

        monkeypatch.setattr("src.llm_codegen.checker.is_dangerous_command", fake_dangerous)

        # 模拟 subprocess.run 返回成功
        def fake_run(cmd, *args, **kwargs):
            return subprocess.CompletedProcess(
                args=cmd,
                returncode=0,
                stdout="",
                stderr=""
            )

        monkeypatch.setattr(subprocess, "run", fake_run)

        result = checker.run_check("pylint", file_path)
        assert result["tool"] == "pylint"
        assert result["file"] == str(file_path)
        assert result["returncode"] == 0
        assert result["errors"] == []

    def test_run_check_dangerous_command(self, checker, monkeypatch):
        """测试 run_check 处理危险命令"""
        file_path = Path("test_file.py")

        # 替换 is_dangerous_command 返回危险
        def fake_dangerous(cmd):
            return (True, "包含危险关键词 'rm'")

        monkeypatch.setattr("src.llm_codegen.checker.is_dangerous_command", fake_dangerous)

        result = checker.run_check("rm -rf /", file_path)
        assert result["returncode"] == -1
        assert "危险命令被阻止" in result["stderr"]

    def test_run_check_timeout(self, checker, monkeypatch):
        """测试 run_check 处理超时"""
        file_path = Path("test_file.py")

        # 模拟危险检测返回安全
        def fake_dangerous(cmd):
            return (False, "")

        monkeypatch.setattr("src.llm_codegen.checker.is_dangerous_command", fake_dangerous)

        # 让 subprocess.run 抛出超时异常
        def fake_run_timeout(*args, **kwargs):
            raise subprocess.TimeoutExpired(cmd="pylint", timeout=60)

        monkeypatch.setattr(subprocess, "run", fake_run_timeout)

        result = checker.run_check("pylint", file_path)
        assert result["returncode"] == -1
        assert "检查超时" in result["stderr"]

    def test_run_parallel_checks(self, checker, tmp_path, monkeypatch):
        """测试并行运行检查"""
        test_file = tmp_path / "test.py"
        test_file.write_text("print('hello')\n")

        # 替换 run_check 方法,避免真正执行
        fake_results = [
            {"tool": "pylint", "file": str(test_file), "returncode": 0, "stdout": "", "stderr": "", "errors": []},
            {"tool": "mypy", "file": str(test_file), "returncode": 0, "stdout": "", "stderr": "", "errors": []},
            {"tool": "black", "file": str(test_file), "returncode": 0, "stdout": "", "stderr": "", "errors": []}
        ]
        call_count = 0

        def fake_run_check(tool, file):
            nonlocal call_count
            call_count += 1
            return fake_results[call_count - 1]

        monkeypatch.setattr(checker, "run_check", fake_run_check)

        results = checker.run_parallel_checks([test_file])
        assert len(results) == 3
        assert all(r["returncode"] == 0 for r in results)
        assert call_count == 3

    def test_save_results(self, checker, tmp_path):
        """测试保存检查结果"""
        results = [{"tool": "pylint", "file": "file1.py", "returncode": 0}]
        checker.save_results(results)

        results_file = checker.output_dir / "check_results.json"
        assert results_file.exists()
        with open(results_file, 'r') as f:
            loaded = json.load(f)
        assert loaded == results

    def test_collect_errors(self, checker, tmp_path):
        """测试收集错误"""
        results = [
            {
                "tool": "pylint",
                "file": "file1.py",
                "returncode": 1,
                "stdout": "",
                "stderr": "",
                "errors": ["未使用的导入"],
            },
            {
                "tool": "mypy",
                "file": "file2.py",
                "returncode": 0,
                "stdout": "",
                "stderr": "",
                "errors": [],
            },
        ]
        checker.save_results(results)

        errors = checker.collect_errors()
        assert len(errors) == 1
        assert errors[0]["file"] == "file1.py"
        assert errors[0]["tool"] == "pylint"
        assert errors[0]["error"] == "未使用的导入"

    def test_collect_errors_no_results(self, checker):
        """测试收集错误时无结果文件"""
        errors = checker.collect_errors()
        assert errors == []

    def test_auto_fix(self, checker, tmp_path):
        """测试自动修复错误"""
        errors = [{"file": "test.py", "tool": "pylint", "error": "未使用的导入"}]
        # 文件应放在 output_dir 下
        test_file = checker.output_dir / "test.py"
        test_file.parent.mkdir(parents=True, exist_ok=True)
        test_file.write_text("import os\nprint('hi')\n")

        # 设置假的 _call_llm 返回值
        fake_return = {
            "patches": [{"file": "test.py", "code": "print('hi')\n"}],
            "description": "移除未使用的导入",
        }
        checker.code_generator.return_value = fake_return

        success = checker.auto_fix(errors, context_files=["test.py"])
        assert success is True
        with open(test_file, 'r') as f:
            assert f.read() == "print('hi')\n"
        assert checker.code_generator._call_llm_called is True

    def test_auto_fix_no_errors(self, checker):
        """测试自动修复无错误时"""
        success = checker.auto_fix([])
        assert success is True

    def test_run_full_check_and_fix(self, checker, monkeypatch):
        """测试完整检查与修复循环"""
        # 替换相关方法,模拟行为
        fake_results = []
        fake_errors_1 = [{"error": "err"}]
        fake_errors_2 = []
        fake_fix_success = True
        call_checks = 0
        call_collect = 0
        call_fix = 0

        def fake_run_parallel_checks():
            nonlocal call_checks
            call_checks += 1
            return fake_results

        def fake_collect_errors(results=None):
            nonlocal call_collect
            call_collect += 1
            if call_collect == 1:
                return fake_errors_1
            else:
                return fake_errors_2

        def fake_auto_fix(errors, context_files=None):
            nonlocal call_fix
            call_fix += 1
            return fake_fix_success

        monkeypatch.setattr(checker, "run_parallel_checks", fake_run_parallel_checks)
        monkeypatch.setattr(checker, "collect_errors", fake_collect_errors)
        monkeypatch.setattr(checker, "auto_fix", fake_auto_fix)

        result = checker.run_full_check_and_fix(max_retries=2)
        assert result is True
        assert call_checks == 2
        assert call_collect == 2
        assert call_fix == 1

    def test_run_full_check_and_fix_failure(self, checker, monkeypatch):
        """测试完整检查与修复循环失败"""
        fake_results = []
        fake_errors = [{"error": "err"}]
        fake_fix_success = False
        call_checks = 0
        call_collect = 0
        call_fix = 0

        def fake_run_parallel_checks():
            nonlocal call_checks
            call_checks += 1
            return fake_results

        def fake_collect_errors(results=None):
            nonlocal call_collect
            call_collect += 1
            return fake_errors

        def fake_auto_fix(errors, context_files=None):
            nonlocal call_fix
            call_fix += 1
            return fake_fix_success

        monkeypatch.setattr(checker, "run_parallel_checks", fake_run_parallel_checks)
        monkeypatch.setattr(checker, "collect_errors", fake_collect_errors)
        monkeypatch.setattr(checker, "auto_fix", fake_auto_fix)

        result = checker.run_full_check_and_fix(max_retries=1)
        assert result is False
        assert call_checks == 1
        assert call_fix == 1
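The tests above replace `subprocess.run` through pytest's `monkeypatch` fixture rather than `unittest.mock`. Stripped of the fixture machinery, the pattern is plain attribute substitution with guaranteed restoration — a standalone sketch with illustrative names (`run_tool` is not part of the project):

```python
import subprocess

def run_tool(cmd):
    """Function under test (sketch): run an external checker, return its exit code."""
    return subprocess.run(cmd, capture_output=True, text=True).returncode

# Hand-rolled equivalent of monkeypatch.setattr: swap the attribute, restore afterwards
original_run = subprocess.run

def fake_run(cmd, *args, **kwargs):
    # Never executes anything; always reports success
    return subprocess.CompletedProcess(args=cmd, returncode=0, stdout="", stderr="")

subprocess.run = fake_run
try:
    exit_code = run_tool(["pylint", "nonexistent.py"])
finally:
    subprocess.run = original_run  # restore so later code sees the real function

assert exit_code == 0
```

`monkeypatch` does exactly this bookkeeping per test and undoes it automatically at teardown, which is why the fakes above need no cleanup code.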

tests/test_cli.py (new file, 152 lines)

@@ -0,0 +1,152 @@
import pytest
from typer.testing import CliRunner
from unittest.mock import Mock, patch
import sys
from pathlib import Path
from loguru import logger

# 测试 CLI 命令
runner = CliRunner()


def test_cli_init_success():
    """测试 init 命令成功执行"""
    from src.llm_codegen.cli import app  # 假设从项目根目录运行测试

    # 模拟 CodeGenerator 和其方法,避免实际调用 API
    with patch('src.llm_codegen.cli.CodeGenerator') as mock_generator:
        mock_instance = Mock()
        mock_instance.run = Mock()
        mock_generator.return_value = mock_instance

        # 创建一个虚拟的 README 文件用于测试
        test_readme = Path("test_readme.md")
        test_readme.write_text("# Test Project\n\nA test project for CLI.")

        result = runner.invoke(app, ["init", str(test_readme), "--output", "./test_output"])

        # 清理
        test_readme.unlink()

        assert result.exit_code == 0
        assert "初始化失败" not in result.stdout
        mock_generator.assert_called_once()
        mock_instance.run.assert_called_once_with(test_readme)


def test_cli_init_failure_no_readme():
    """测试 init 命令当 README 不存在时失败"""
    from src.llm_codegen.cli import app

    result = runner.invoke(app, ["init", "nonexistent.md"])
    assert result.exit_code != 0  # 应该退出码非零


def test_cli_enhance_success():
    """测试 enhance 命令成功执行(简化版,基于工单)"""
    from src.llm_codegen.cli import app

    # 模拟依赖文件和环境
    with patch('src.llm_codegen.cli.CodeGenerator') as mock_generator, \
         patch('src.llm_codegen.cli.Checker') as mock_checker, \
         patch('pathlib.Path.exists') as mock_exists:
        mock_exists.return_value = True  # 模拟 design.json 存在
        mock_instance = Mock()
        mock_instance.run_full_check_and_fix = Mock(return_value=True)
        mock_checker.return_value = mock_instance
        mock_generator.return_value = Mock()

        # 创建一个虚拟的工单文件
        test_issue = Path("test_feature.issue")
        test_issue.write_text("name: Add feature\ndescription: Test feature")

        result = runner.invoke(app, ["enhance", str(test_issue), "--output", "./test_output"])

        # 清理
        test_issue.unlink()

        assert result.exit_code == 0
        assert "增强失败" not in result.stdout
        mock_checker.assert_called_once()
        mock_instance.run_full_check_and_fix.assert_called_once()


def test_cli_fix_success():
    """测试 fix 命令成功执行(简化版,基于工单)"""
    from src.llm_codegen.cli import app

    with patch('src.llm_codegen.cli.CodeGenerator') as mock_generator, \
         patch('src.llm_codegen.cli.Checker') as mock_checker, \
         patch('pathlib.Path.exists') as mock_exists:
        mock_exists.return_value = True
        mock_instance = Mock()
        mock_instance.run_full_check_and_fix = Mock(return_value=True)
        mock_checker.return_value = mock_instance
        mock_generator.return_value = Mock()

        test_issue = Path("test_bug.issue")
        test_issue.write_text("name: Fix bug\ndescription: Test bug")

        result = runner.invoke(app, ["fix", str(test_issue), "--output", "./test_output"])
        test_issue.unlink()

        assert result.exit_code == 0
        assert "修复失败" not in result.stdout
        mock_checker.assert_called_once()
        mock_instance.run_full_check_and_fix.assert_called_once()


def test_cli_help():
    """测试 CLI 帮助命令"""
    from src.llm_codegen.cli import app

    result = runner.invoke(app, ["--help"])
    assert result.exit_code == 0
    assert "基于LLM的自动化代码生成与维护工具" in result.stdout

    # 测试子命令帮助
    result = runner.invoke(app, ["init", "--help"])
    assert result.exit_code == 0
    assert "README.md 文件路径" in result.stdout


def test_cli_enhance_no_design():
    """测试 enhance 命令当 design.json 不存在时失败"""
    from src.llm_codegen.cli import app

    with patch('pathlib.Path.exists') as mock_exists:
        mock_exists.return_value = False  # 模拟 design.json 不存在
        test_issue = Path("test_feature.issue")
        test_issue.write_text("name: Test")

        result = runner.invoke(app, ["enhance", str(test_issue)])
        test_issue.unlink()
        assert result.exit_code != 0


def test_cli_fix_no_design():
    """测试 fix 命令当 design.json 不存在时失败"""
    from src.llm_codegen.cli import app

    with patch('pathlib.Path.exists') as mock_exists:
        mock_exists.return_value = False
        test_issue = Path("test_bug.issue")
        test_issue.write_text("name: Test")

        result = runner.invoke(app, ["fix", str(test_issue)])
        test_issue.unlink()
        assert result.exit_code != 0


if __name__ == "__main__":
    pytest.main([__file__])
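The CLI tests above create a work file, invoke the command, then `unlink()` it; if `invoke` raises, the file is never cleaned up. A `try/finally` (or pytest's `tmp_path` fixture) closes that gap — a stdlib-only sketch with an illustrative helper and file name (`missing_ok` needs Python 3.8+):

```python
from pathlib import Path

def with_temp_issue(content, action):
    """Create a temp issue file, run `action` on it, and always clean up (sketch)."""
    issue = Path("example.issue")  # illustrative name, not a project convention
    issue.write_text(content, encoding="utf-8")
    try:
        return action(issue)
    finally:
        issue.unlink(missing_ok=True)  # Python 3.8+

result = with_temp_issue("name: Demo", lambda p: p.read_text(encoding="utf-8"))
assert result == "name: Demo"
assert not Path("example.issue").exists()
```

The `finally` block runs whether `action` returns or raises, so a failing `runner.invoke` can no longer leave stray `.issue` files in the working directory.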

@@ -1,184 +1,264 @@
-import pytest
-from unittest.mock import Mock, patch, MagicMock
-from pathlib import Path
import json
import os
import sys
-from datetime import datetime
+import subprocess
+from pathlib import Path
+import pytest
from src.llm_codegen.core import CodeGenerator
-from src.llm_codegen.models import GeneratorState, ConfigModel
-from src.llm_codegen.utils import is_dangerous_command
+from src.llm_codegen.models import DesignModel
+
+
+# ---------- Fake 类 ----------
+class FakeChatCompletion:
+    """模拟 OpenAI 的 chat.completions.create 返回值"""
+
+    def __init__(self, content):
+        self.choices = [FakeChoice(FakeMessage(content))]
+
+
+class FakeChoice:
+    def __init__(self, message):
+        self.message = message
+
+
+class FakeMessage:
+    def __init__(self, content):
+        self.content = content
+        self.reasoning_content = None
+
+
+class FakeOpenAIClient:
+    """假的 OpenAI 客户端,用于替换真实客户端"""
+
+    def __init__(self):
+        self.chat = FakeChat()
+
+
+class FakeChat:
+    def __init__(self):
+        self.completions = FakeCompletions()
+
+
+class FakeCompletions:
+    def __init__(self):
+        self.create_called = False
+        self.create_kwargs = None
+        self.create_return_value = None
+
+    def create(self, *args, **kwargs):
+        self.create_called = True
+        self.create_kwargs = kwargs
+        if self.create_return_value is None:
+            return FakeChatCompletion(json.dumps({"content": "default"}))
+        return self.create_return_value
+
+
+# ---------- Fixtures ----------
+@pytest.fixture
+def fake_openai_client(monkeypatch):
+    """用假的 OpenAI 客户端替换真实的客户端"""
+    fake_client = FakeOpenAIClient()
+    monkeypatch.setattr("src.llm_codegen.core.OpenAI", lambda *args, **kwargs: fake_client)
+    return fake_client
+
+
+@pytest.fixture
+def code_generator(tmp_path, monkeypatch, fake_openai_client):
+    """创建 CodeGenerator 实例,使用临时输出目录,并设置环境变量"""
+    monkeypatch.setenv("DEEPSEEK_APIKEY", "fake-api-key")
+    generator = CodeGenerator(output_dir=str(tmp_path / "test_output"))
+    return generator
+
+
+# ---------- 测试类 ----------
class TestCodeGenerator:
-    """测试 CodeGenerator 核心类的单元测试。"""
+    """测试 CodeGenerator 类(无 mock)"""

-    @pytest.fixture
-    def mock_openai_client(self):
-        """模拟 OpenAI 客户端。"""
-        with patch('src.llm_codegen.core.OpenAI') as mock:
-            client = Mock()
-            mock.return_value = client
-            yield client
+    def test_init_success(self, code_generator, tmp_path, fake_openai_client):
+        """测试初始化成功"""
+        assert code_generator.api_key == "fake-api-key"
+        assert code_generator.model == "deepseek-reasoner"
+        assert code_generator.output_dir == tmp_path / "test_output"
+        # 验证客户端被替换为我们的 fake 客户端
+        assert code_generator.client is fake_openai_client

-    @pytest.fixture
-    def generator(self, mock_openai_client, tmp_path):
-        """创建 CodeGenerator 实例,使用临时目录和模拟 API。"""
-        output_dir = tmp_path / "output"
-        return CodeGenerator(
-            api_key="test-api-key",
-            base_url="https://api.deepseek.com",
-            model="deepseek-reasoner",
-            output_dir=str(output_dir),
-            resume=False,
-        )
+    def test_init_no_api_key(self, monkeypatch):
+        """测试没有 API 密钥时抛出错误"""
+        monkeypatch.delenv("DEEPSEEK_APIKEY", raising=False)
+        with pytest.raises(ValueError, match="必须提供API密钥"):
+            CodeGenerator()

+    def test_parse_readme_success(self, code_generator, tmp_path):
+        """测试解析 README 文件成功"""
+        readme_path = tmp_path / "README.md"
+        readme_path.write_text("# Test README\nThis is a test.")
+        content = code_generator.parse_readme(readme_path)
+        assert content == "# Test README\nThis is a test."

+    def test_parse_readme_file_not_found(self, code_generator):
+        """测试 README 文件不存在时抛出错误"""
+        with pytest.raises(Exception):
+            code_generator.parse_readme(Path("nonexistent.md"))

+    def test_generate_design_json(self, code_generator, monkeypatch):
+        """测试生成 design.json"""
+        code_generator.readme_content = "# Test Project\nA test project."
+        # 模拟 _call_llm 的返回值
+        mock_response = {
+            "project_name": "test-project",
+            "version": "1.0.0",
+            "description": "A test project",
+            "files": [],
+            "commands": [],
+            "check_tools": []
+        }
+
+        def fake_call_llm(system_prompt, user_prompt, temperature=0.2, expect_json=True):
+            return mock_response
+
+        monkeypatch.setattr(code_generator, "_call_llm", fake_call_llm)
+        design = code_generator.generate_design_json()
+        assert isinstance(design, DesignModel)
+        assert design.project_name == "test-project"
+        # 验证文件已写入
+        design_path = code_generator.output_dir / "design.json"
+        assert design_path.exists()
+        with open(design_path) as f:
+            saved = json.load(f)
+        assert saved["project_name"] == "test-project"

+    def test_generate_file_with_dependencies(self, code_generator, monkeypatch, tmp_path):
+        """测试生成文件,有依赖文件"""
+        # 创建依赖文件
+        dep_path = tmp_path / "dep.py"
+        dep_path.write_text("# Dependency file")
+        code_generator.output_dir = tmp_path
+        code_generator.readme_content = "# README"
+        # 模拟 _call_llm 的返回值
+        llm_response = {
+            "code": "print('Hello, world!')",
+            "description": "测试文件",
+            "commands": []
+        }
+
+        def fake_call_llm(system_prompt, user_prompt, temperature=0.2, expect_json=True):
+            return llm_response
+
+        monkeypatch.setattr(code_generator, "_call_llm", fake_call_llm)
+        code, desc, commands = code_generator.generate_file(
+            file_path="test.py",
+            prompt_instruction="生成测试文件",
+            dependency_files=[str(dep_path)]
+        )
+        assert code == "print('Hello, world!')"
+        assert desc == "测试文件"
+        assert commands == []

-    def test_init(self, generator, tmp_path):
-        """测试初始化。"""
-        assert generator.api_key == "test-api-key"
-        assert generator.model == "deepseek-reasoner"
-        assert generator.output_dir == tmp_path / "output"
-        assert generator.resume is False
-        assert isinstance(generator.config, ConfigModel)
-        assert generator.dangerous_commands == ["rm", "sudo", "chmod", "dd", "mkfs", "> /dev/sda", "format"]
-        assert generator.state is None

-    def test_parse_readme(self, generator, tmp_path):
-        """测试读取 README 文件。"""
-        readme_path = tmp_path / "README.md"
-        readme_content = "# Test Project\nThis is a test README."
-        readme_path.write_text(readme_content)
-        result = generator.parse_readme(readme_path)
-        assert result == readme_content
+    def test_execute_command_success(self, code_generator, monkeypatch):
+        """测试执行命令成功"""
+        def fake_run(cmd, *args, **kwargs):
+            return subprocess.CompletedProcess(args=cmd, returncode=0, stdout="", stderr="")
+
+        monkeypatch.setattr(subprocess, "run", fake_run)
+        success = code_generator.execute_command("echo test")
+        assert success is True

-    def test_get_project_structure(self, generator, mock_openai_client):
-        """测试获取项目结构,模拟 LLM 响应。"""
-        generator.readme_content = "# Test README"
-        mock_response = {
-            "files": ["src/__init__.py", "src/core.py"],
-            "dependencies": {"src/core.py": ["src/__init__.py"]}
-        }
-        mock_openai_client.chat.completions.create.return_value.choices[0].message.content = json.dumps(mock_response)
-        files, dependencies = generator.get_project_structure()
-        assert files == ["src/__init__.py", "src/core.py"]
-        assert dependencies == {"src/core.py": ["src/__init__.py"]}
-        mock_openai_client.chat.completions.create.assert_called_once()
+    def test_execute_command_dangerous(self, code_generator, monkeypatch):
+        """测试阻止危险命令"""
+        def fake_dangerous(cmd):
+            return (True, "包含危险关键词")
+
+        monkeypatch.setattr("src.llm_codegen.core.is_dangerous_command", fake_dangerous)
+        success = code_generator.execute_command("rm -rf /")
+        assert success is False

+    def test_execute_command_failure(self, code_generator, monkeypatch):
+        """测试命令执行失败"""
+        def fake_run(cmd, *args, **kwargs):
+            return subprocess.CompletedProcess(args=cmd, returncode=1, stdout="", stderr="")
+
+        monkeypatch.setattr(subprocess, "run", fake_run)
+        success = code_generator.execute_command("false")
+        assert success is False

+    def test_run_with_state_resume(self, code_generator, monkeypatch, tmp_path):
+        """测试断点续写"""
+        # 创建状态文件
+        state_file = tmp_path / ".llm_generator_state.json"
+        state_data = {
+            "current_file_index": 1,
+            "generated_files": ["file1.py"],
+            "dependencies_map": {},
+            "total_files": 3,
+            "output_dir": str(tmp_path),
+            "readme_path": "test"
+        }
+        state_file.write_text(json.dumps(state_data))
+
+        # 创建设计文件
+        design_path = tmp_path / "design.json"
+        design_data = {
+            "project_name": "test",
+            "version": "1.0.0",
+            "description": "test",
+            "files": [
+                {"path": "file1.py", "summary": "", "dependencies": [], "functions": [], "classes": []},
+                {"path": "file2.py", "summary": "", "dependencies": [], "functions": [], "classes": []},
+                {"path": "file3.py", "summary": "", "dependencies": [], "functions": [], "classes": []}
+            ],
+            "commands": [],
+            "check_tools": []
+        }
+        design_path.write_text(json.dumps(design_data))
+
+        code_generator.output_dir = tmp_path
+        code_generator.state_file = state_file
+
+        # 模拟内部方法
+        def fake_parse_readme(path):
+            return "# README"
+
+        monkeypatch.setattr(code_generator, "parse_readme", fake_parse_readme)
+
+        def fake_generate_file(file_path, prompt_instruction, dependency_files):
+            return ("code", "desc", [])
+
+        monkeypatch.setattr(code_generator, "generate_file", fake_generate_file)
+
+        def fake_execute_command(cmd, cwd=None):
+            return True
+
+        monkeypatch.setattr(code_generator, "execute_command", fake_execute_command)
+
+        # 运行,预期不抛出异常
+        code_generator.run(Path(tmp_path / "README.md"))
+        # 验证状态文件被清理
+        assert not state_file.exists()

-    def test_generate_file(self, generator, mock_openai_client, tmp_path):
-        """测试生成单个文件,模拟依赖文件和 LLM 响应。"""
-        generator.readme_content = "# Test README"
-        dep_file = tmp_path / "dep.txt"
-        dep_file.write_text("Dependency content")
-        mock_response = {
-            "code": "print('Hello, World!')",
-            "description": "测试文件生成",
-            "commands": ["echo 'test'"]
-        }
-        mock_openai_client.chat.completions.create.return_value.choices[0].message.content = json.dumps(mock_response)
-        code, desc, commands = generator.generate_file(
-            "test.py",
-            "生成测试文件",
-            [str(dep_file)]
-        )
-        assert code == "print('Hello, World!')"
-        assert desc == "测试文件生成"
-        assert commands == ["echo 'test'"]
-        mock_openai_client.chat.completions.create.assert_called_once()

-    def test_execute_command_safe(self, generator, tmp_path):
-        """测试执行安全命令。"""
-        with patch('subprocess.run') as mock_run:
-            mock_run.return_value.returncode = 0
-            mock_run.return_value.stdout = "output"
-            mock_run.return_value.stderr = ""
-            generator.execute_command("echo 'test'", cwd=tmp_path)
-            mock_run.assert_called_once_with(
-                "echo 'test'",
-                shell=True,
-                cwd=tmp_path,
-                capture_output=True,
-                text=True,
-                timeout=300
-            )

-    def test_execute_command_dangerous(self, generator):
-        """测试阻止危险命令。"""
-        with pytest.raises(RuntimeError, match="危险命令"):
-            generator.execute_command("rm -rf /")  # 假设在危险命令列表中

-    def test_run_without_resume(self, generator, mock_openai_client, tmp_path):
-        """测试完整运行流程,禁用断点续写。"""
-        readme_path = tmp_path / "README.md"
-        readme_path.write_text("# Test README")
-        generator.readme_content = "# Test README"
-        # 模拟 get_project_structure 响应
-        mock_structure = {
-            "files": ["file1.py", "file2.py"],
-            "dependencies": {}
-        }
-        mock_openai_client.chat.completions.create.side_effect = [
-            Mock(choices=[Mock(message=Mock(content=json.dumps(mock_structure)))]),
-            Mock(choices=[Mock(message=Mock(content=json.dumps({"code": "code1", "description": "desc1", "commands": []})))]),
-            Mock(choices=[Mock(message=Mock(content=json.dumps({"code": "code2", "description": "desc2", "commands": []})))])
-        ]
-        with patch('src.llm_codegen.core.safe_write_file') as mock_write, \
-             patch('src.llm_codegen.core.safe_read_file') as mock_read, \
-             patch('src.llm_codegen.core.save_state') as mock_save:
-            mock_read.return_value = "content"
-            generator.run(readme_path)
-            # 验证文件生成和状态保存
-            assert mock_write.call_count == 2
-            assert mock_save.called

-    def test_run_with_resume(self, generator, mock_openai_client, tmp_path):
-        """测试断点续写功能。"""
-        generator.resume = True
-        generator.state = GeneratorState(generated_files=["file1.py"], executed_commands=[])
-        readme_path = tmp_path / "README.md"
-        readme_path.write_text("# Test README")
-        generator.readme_content = "# Test README"
-        mock_structure = {
-            "files": ["file1.py", "file2.py"],
-            "dependencies": {}
-        }
-        mock_openai_client.chat.completions.create.return_value.choices[0].message.content = json.dumps(mock_structure)
-        with patch('src.llm_codegen.core.safe_write_file') as mock_write, \
-             patch('src.llm_codegen.core.safe_read_file') as mock_read:
-            mock_read.return_value = "content"
-            generator.run(readme_path)
-            # 只应生成 file2.py,跳过 file1.py
-            assert mock_write.call_count == 1

-    def test_load_config_default(self, generator):
-        """测试加载默认配置。"""
-        config = generator._load_config()
-        assert isinstance(config, ConfigModel)
-        assert config.check_tools == ["pytest", "pylint", "mypy", "black"]
-        assert config.max_retries == 3

-    def test_update_state(self, generator, tmp_path):
-        """测试更新状态文件。"""
-        generator.state_path = tmp_path / "state.json"
-        generator.state = GeneratorState()
-        with patch('src.llm_codegen.core.save_state') as mock_save:
-            generator._update_state("new_file.py", ["cmd1"])
-            assert generator.state.generated_files == ["new_file.py"]
-            assert generator.state.executed_commands == ["cmd1"]
-            mock_save.assert_called_once()

-if __name__ == "__main__":
-    pytest.main([__file__, "-v"])
+    def test_run_without_state(self, code_generator, monkeypatch, tmp_path):
+        """测试没有状态时的首次运行"""
+        code_generator.output_dir = tmp_path
+
+        # 模拟 parse_readme
+        def fake_parse_readme(path):
+            return "# README"
+
+        monkeypatch.setattr(code_generator, "parse_readme", fake_parse_readme)
+
+        # 模拟 generate_design_json 返回设计
+        fake_design = DesignModel(
+            project_name="test",
+            version="1.0.0",
+            description="test",
+            files=[],  # 无文件,简化流程
+            commands=[],
+            check_tools=[]
+        )
+
+        def fake_generate_design_json():
+            return fake_design
+
+        monkeypatch.setattr(code_generator, "generate_design_json", fake_generate_design_json)
+
+        # 模拟 get_project_structure
+        def fake_get_project_structure():
+            return [], {}
+
+        monkeypatch.setattr(code_generator, "get_project_structure", fake_get_project_structure)
+
+        # 运行,预期不抛出异常
+        code_generator.run(Path(tmp_path / "README.md"))
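`test_run_with_state_resume` above seeds `.llm_generator_state.json` with `current_file_index` and `generated_files` before calling `run()`, and asserts the file is deleted on success. Assuming that schema, the save/load/resume loop can be sketched as follows (the function name is illustrative; the real implementation calls the LLM where noted):

```python
import json
import tempfile
from pathlib import Path

def generate_with_resume(files, state_file: Path):
    """'Generate' files in order, persisting state after each step (sketch).

    On restart, generation resumes from the last saved index instead of
    redoing already-finished files.
    """
    state = {"current_file_index": 0, "generated_files": []}
    if state_file.exists():
        state = json.loads(state_file.read_text(encoding="utf-8"))
    for i in range(state["current_file_index"], len(files)):
        state["generated_files"].append(files[i])  # real tool calls the LLM here
        state["current_file_index"] = i + 1
        state_file.write_text(json.dumps(state), encoding="utf-8")  # checkpoint
    state_file.unlink(missing_ok=True)  # clean up once everything succeeded
    return state["generated_files"]

with tempfile.TemporaryDirectory() as d:
    sf = Path(d) / ".llm_generator_state.json"
    # Simulate a previous run that was interrupted after the first file
    sf.write_text(json.dumps({"current_file_index": 1,
                              "generated_files": ["a.py"]}), encoding="utf-8")
    done = generate_with_resume(["a.py", "b.py", "c.py"], sf)
    assert done == ["a.py", "b.py", "c.py"]  # a.py was not regenerated
    assert not sf.exists()
```

Writing the checkpoint after every file (not once at the end) is what makes an interruption at any point recoverable, which is exactly the behavior the resume test exercises.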