PyTestPilot

The Phase 4 report, SWE_520_Project-Phase-4.pdf, is in the root directory.

PyTestPilot is an empirical evaluation framework that generates Python unit tests using open-weight Large Language Models (LLMs) and compares them against traditional property-based and random testing baselines.

Setup Instructions

This project is intended to be run within an isolated Python virtual environment.

1. Create and Activate a Virtual Environment

# Create a virtual environment named .venv
python3 -m venv .venv

# Activate the virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
# .venv\Scripts\activate

2. Install Dependencies

Ensure you are in the project root (where this README is located) and your virtual environment is activated, then run:

pip install -r requirements.txt

3. Configure API Keys

Copy the example environment file to .env:

cp .env.example .env

By default, the .env file references $OPENROUTER_API_KEY, so the key is read from your system's environment variables if you have already exported it.

Alternatively, you can edit the .env file and paste your API key directly:

OPENROUTER_API_KEY=sk-or-v1-...
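
The $OPENROUTER_API_KEY placeholder only resolves if the variable is actually exported in the shell that launches the pipeline. The following is a minimal pre-flight sketch of that idea, assuming a "$NAME" value is expanded from the process environment and anything else is used literally. resolve_api_key is a hypothetical helper written for illustration; it is not part of PyTestPilot and may not match how the framework loads .env.

```python
import os


def resolve_api_key(raw_value, environ=None):
    """Expand a '$NAME' placeholder from the environment; return literal values unchanged.

    Hypothetical helper for illustration only; not part of PyTestPilot.
    """
    environ = os.environ if environ is None else environ
    if raw_value.startswith("$"):
        # Accept both $NAME and ${NAME} spellings.
        name = raw_value[1:].strip("{}")
        value = environ.get(name, "")
        if not value:
            raise RuntimeError(f"{name} is not exported in this shell")
        return value
    return raw_value


if __name__ == "__main__":
    try:
        key = resolve_api_key("$OPENROUTER_API_KEY")
        print("API key found, length:", len(key))
    except RuntimeError as err:
        print("Setup problem:", err)
```

If the check reports a setup problem, either export the variable or paste the key into .env as shown above.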

Running the Pipeline

PyTestPilot uses an evaluate command that runs the full end-to-end pipeline: AST-based context extraction, test generation, isolated sandbox execution, and metrics aggregation.
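
To make the first stage concrete, here is a rough sketch of what AST-based context extraction can look like using only Python's standard ast module. It illustrates the general technique and is not PyTestPilot's actual implementation; extract_function_context and the SAMPLE source are hypothetical stand-ins.

```python
import ast


def extract_function_context(source, function_name):
    """Return name, positional argument names, docstring, and source text of a function.

    Hypothetical helper for illustration; returns None if the function is not found.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_function and node.name == function_name:
            return {
                "name": node.name,
                "args": [arg.arg for arg in node.args.args],
                "docstring": ast.get_docstring(node),
                "source": ast.get_source_segment(source, node),
            }
    return None


# Stand-in source text; a real run would read a file under repos/.
SAMPLE = '''
def first(seq):
    """Return the first element of a sequence."""
    return next(iter(seq))
'''

context = extract_function_context(SAMPLE, "first")
print(context["name"], context["args"])
```

A dictionary like this is the kind of context a prompt-assembly stage could consume; the real data model lives in src/data_structures/ and may differ.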

1. Run Full Evaluation on a Repository

To run the full pipeline on one of the included repositories:

python src/cli.py evaluate repos/toolz/ --methods llm-deepseek,hypothesis,random

2. View Results

After the pipeline completes, you can find the aggregated metrics and a human-readable report in the evaluation_results/ directory:

evaluation_results/report.md: A summary of pass rates and coverage.
evaluation_results/project_evaluation.json: Detailed raw metrics for all functions.
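
The schema of project_evaluation.json is not documented here, so a cautious first step is to inspect its top-level shape before writing any analysis code. The sketch below assumes only that the file is valid JSON; summarize_json is a hypothetical helper, not part of the framework.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level shape without assuming a schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {"type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    return {"type": type(data).__name__}


if __name__ == "__main__":
    results = Path("evaluation_results/project_evaluation.json")
    if results.exists():
        print(summarize_json(results))
    else:
        print("No results yet; run the evaluate command first.")
```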

3. Quick Test (Single Function)

To quickly verify the setup on a single function:

python src/cli.py evaluate repos/toolz/toolz/itertoolz.py --function-id first --methods llm-deepseek

Architecture & Code Layout

Below is a tree view of the _DISTRO code layout and the role of each directory:

_DISTRO/
├── src/                 # Core framework source code
│   ├── adaptive_refiner/  # Pipeline Stage 5: Test refinement
│   ├── ast_parser/        # Pipeline Stage 1: AST context extraction
│   ├── data_structures/   # Core data models
│   ├── llm_client/        # Pipeline Stage 3: LLM interaction
│   ├── metrics_aggregator/# Pipeline Stage 6: Metrics calculation
│   ├── prompt_engineer/   # Pipeline Stage 2: Prompt assembly
│   ├── report_generator/  # Pipeline Stage 7: Reporting
│   ├── test_executor/     # Pipeline Stage 4: Test execution & baselines
│   ├── utils/             # Utilities like logging
│   └── cli.py             # Command Line Interface entry point
├── repos/               # Target repositories for empirical evaluation
├── scripts/             # Auxiliary helper scripts
├── tests/               # Framework unit tests
├── evaluation_results/  # Output directory for pipeline reports/metrics
├── analyze.py           # Script to analyze raw test output
└── generate_charts.py   # Script to generate visual evaluation charts

Directory Descriptions

src/: Contains the core logic for the PyTestPilot framework, including all seven stages of the automated LLM-based test generation pipeline.
repos/: Houses the target repositories (toolz and python_patterns) used as fixtures during empirical evaluations.
scripts/: Auxiliary scripts for parsing output, computing test coverage, aggregating metrics, and verifying AST contexts.
tests/: Unit tests verifying the functionality of PyTestPilot components (e.g., adaptive refiner, metrics aggregator).
evaluation_results/: Output directory storing execution results, including detailed raw JSON metrics and aggregated human-readable Markdown reports.
analyze.py & generate_charts.py: Scripts deployed alongside the pipeline to parse test execution output and render performance charts.
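
The layout tree lists the src/ packages alphabetically, which hides the order in which the stages run. As a small reading aid, the snippet below reorders them by the stage numbers given in the tree's comments. PIPELINE_STAGES is an illustrative mapping written for this README, not a structure defined in the codebase.

```python
# Stage numbers copied from the comments in the layout tree above.
PIPELINE_STAGES = {
    "ast_parser": 1,
    "prompt_engineer": 2,
    "llm_client": 3,
    "test_executor": 4,
    "adaptive_refiner": 5,
    "metrics_aggregator": 6,
    "report_generator": 7,
}

# Sort the packages by stage number to get the execution order.
ordered = sorted(PIPELINE_STAGES.items(), key=lambda item: item[1])

for directory, stage in ordered:
    print(f"Stage {stage}: src/{directory}/")
```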
