The robot who refactors: /[^_^]\
Robofactor is a DSPy-powered tool to analyze, plan, and refactor Python code. It leverages a modern stack to programmatically assess and improve code quality through a structured, multi-step process.
The core technologies driving Robofactor include:
- DSPy (`dspy-ai`): The project is built on the DSPy framework, which provides a structured way to program with language models. It is used to generate refactoring plans and implement code changes.
- Railway-Oriented Pipelines (`returns`): The evaluation process is constructed as a robust pipeline using the `returns` library. This allows for a series of checks (syntax, quality, functional correctness) where any failure gracefully halts the process and returns a descriptive error.
- Code Quality Analysis (`flake8`): Code quality is programmatically measured using `flake8`, providing objective metrics to evaluate the effectiveness of the refactoring.
- Rich CLI (`rich`): All terminal output, from the refactoring process to the final evaluation results, is formatted for clarity and readability using the `rich` library.
- AI-Powered Refactoring: Leverages a `CodeRefactor` module built with DSPy (`dspy_modules.py`) to intelligently analyze and generate refactoring suggestions for Python code snippets.
- Comprehensive Evaluation Pipeline: Ensures the quality and correctness of refactored code through a multi-stage process (`evaluation.py`). This pipeline includes syntax validation (`check_syntax`), quality scoring using `flake8` and AST analysis (`check_code_quality`), and functional correctness checks against provided test cases (`check_functional_correctness`).
- Advanced Code Analysis: Performs deep static analysis of Python code by parsing it into an Abstract Syntax Tree (AST). The `function_extraction.py` module is dedicated to extracting detailed information about functions, decorators, and parameters directly from the source code structure.
- DSPy Model Optimization: Features the ability to compile and optimize the underlying DSPy program for improved performance and accuracy. This can be triggered using the `--optimize` flag in the CLI (`main.py`).
- Interactive CLI: Provides a user-friendly command-line interface built with `typer`. It uses `rich` to deliver clear, well-formatted, and colorized output for refactoring plans and evaluation results (`main.py`, `ui.py`).
- MLflow Integration: Comes with built-in support for experiment tracing using MLflow. Users can configure the MLflow tracking URI and experiment name via CLI arguments (`--mlflow-uri`, `--mlflow-experiment`) to log and monitor refactoring runs (`main.py`).
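The AST-based analysis mentioned in the feature list can be illustrated with a minimal sketch that uses only the standard-library `ast` module. The `FunctionInfo` record and the `extract_functions` helper are hypothetical names chosen for this example; the actual data model in `function_extraction.py` may look quite different.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionInfo:
    """Hypothetical record type; not the project's real data model."""
    name: str
    parameters: tuple[str, ...]
    decorators: tuple[str, ...]
    docstring: str | None


def extract_functions(source: str) -> list[FunctionInfo]:
    """Collect metadata for every (async) function definition in the source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        # Positional-only and regular parameters, then keyword-only ones.
        params = [a.arg for a in (*args.posonlyargs, *args.args, *args.kwonlyargs)]
        found.append(
            FunctionInfo(
                name=node.name,
                parameters=tuple(params),
                # ast.unparse turns each decorator expression back into source text.
                decorators=tuple(ast.unparse(d) for d in node.decorator_list),
                docstring=ast.get_docstring(node),
            )
        )
    return found


SAMPLE = '''
import functools

@functools.lru_cache(maxsize=None)
def fib(n, *, verbose=False):
    """Return the n-th Fibonacci number."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
'''

for info in extract_functions(SAMPLE):
    print(info.name, info.parameters, info.decorators, info.docstring)
```

Working on the parsed tree rather than on raw text means decorators, parameter kinds, and docstrings are read from the code's structure instead of being guessed with regular expressions.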
Before you begin, ensure you have Python 3.10 or newer installed on your system. This project uses `uv` for fast and efficient dependency management.
To install Robofactor for regular use, clone the repository and run the following command from the project root:
make install
This command uses `uv` to install the package and its required dependencies.
If you plan to contribute to the project, you will need to install the development dependencies, which include tools for testing, linting, and type-checking. Use the following command:
make install-dev
This will install all dependencies, including the development-specific ones listed in `pyproject.toml`.
Robofactor is a command-line tool designed to analyze and refactor a single Python file.
To refactor a Python file, run the tool with the path to your script. By default, it performs a dry run, printing the proposed changes to the console without modifying the original file.
robofactor path/to/your/file.py
- Analyze the Code (Dry Run): Run Robofactor on a script to see the proposed refactoring. The tool will display the original code, the refactoring plan, the refactored code, and an evaluation of the changes.
robofactor src/my_app/utils.py
- Apply the Changes: If you are satisfied with the proposed changes, you can write them back to the original file using the `--write` flag.
robofactor --write src/my_app/utils.py
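The difference between a dry run and a write can be sketched in a few lines of standard-library Python. This is only an illustration of the idea, not Robofactor's code: the `apply_refactoring` helper is hypothetical, and previewing the change as a unified diff is an assumption made for this example (the tool itself prints the original and refactored code).

```python
import difflib
import tempfile
from pathlib import Path


def apply_refactoring(path: Path, refactored: str, write: bool = False) -> str:
    """Return a diff preview; touch the file only when write=True (hypothetical helper)."""
    original = path.read_text(encoding="utf-8")
    diff = "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            refactored.splitlines(keepends=True),
            fromfile=str(path),
            tofile=f"{path} (refactored)",
        )
    )
    if write and diff:
        path.write_text(refactored, encoding="utf-8")
    return diff


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "utils.py"
    target.write_text("x=1\n", encoding="utf-8")

    preview = apply_refactoring(target, "x = 1\n")     # dry run: file is left alone
    unchanged = target.read_text(encoding="utf-8")

    apply_refactoring(target, "x = 1\n", write=True)   # explicit opt-in to overwrite
    updated = target.read_text(encoding="utf-8")

print(preview)
```

Making the destructive path opt-in keeps the default invocation safe to run on any file.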
Here are some of the key arguments and options available. The descriptions are based on the output of `robofactor --help`.
| Argument / Option | Description |
|---|---|
| `PATH` | The path to the Python file you want to refactor. |
| `--write` | Write the refactored code back to the original file. |
| `--optimize` | Force re-optimization of the underlying DSPy model. |
| `--dog-food` | A special mode to make Robofactor refactor its own source code. |
| `--task-llm <MODEL>` | Specify the language model for the main refactoring task. |
| `--tracing` / `--no-tracing` | Enable or disable MLflow tracing for experiment tracking. |
| `--mlflow-uri <URI>` | Set the MLflow tracking server URI (default: `http://127.0.0.1:5000`). |
| `--mlflow-experiment <NAME>` | Set the MLflow experiment name (default: `robofactor`). |
For a complete list of all available options, run:
robofactor --help
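As a rough illustration of how the documented options combine, the sketch below mirrors the table with the standard-library `argparse` module. Robofactor's real CLI is built with `typer`, so this is a stand-in, not the project's code. The two MLflow defaults come from the table; the default for `--tracing`, the absence of a `--task-llm` default, and `PATH` being required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """argparse stand-in for the documented options (the real CLI uses typer)."""
    parser = argparse.ArgumentParser(prog="robofactor")
    parser.add_argument("path", metavar="PATH", help="Python file to refactor")
    parser.add_argument("--write", action="store_true", help="write changes back")
    parser.add_argument("--optimize", action="store_true", help="force re-optimization")
    parser.add_argument("--dog-food", action="store_true", help="refactor Robofactor itself")
    parser.add_argument("--task-llm", metavar="MODEL", default=None)  # default is an assumption
    # BooleanOptionalAction generates the paired --tracing / --no-tracing flags.
    parser.add_argument(
        "--tracing", action=argparse.BooleanOptionalAction, default=True  # assumed default
    )
    parser.add_argument("--mlflow-uri", metavar="URI", default="http://127.0.0.1:5000")
    parser.add_argument("--mlflow-experiment", metavar="NAME", default="robofactor")
    return parser


# Equivalent to: robofactor --write --no-tracing src/my_app/utils.py
args = build_parser().parse_args(["--write", "--no-tracing", "src/my_app/utils.py"])
print(args)
```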
Robofactor follows a structured, multi-stage process to analyze, refactor, and evaluate Python code. The architecture is designed to be robust and transparent, leveraging modern tools for each step.
- Code Parsing & Extraction: The process begins by parsing the target Python file. Using Python's built-in `ast` (Abstract Syntax Tree) module, the tool traverses the code's structure. As detailed in `src/robofactor/function_extraction.py`, it identifies every function and extracts comprehensive metadata, including its name, parameters, decorators, and docstring. This creates a structured representation of the code to be refactored.
- LLM-Powered Refactoring with DSPy: The extracted function code is then passed to a `dspy.Module`, specifically the `CodeRefactor` class found in `src/robofactor/dspy_modules.py`. This module contains a sophisticated prompt that instructs a Large Language Model (LLM) to analyze the provided code snippet, identify areas for improvement, and generate a refactored version. The LLM's goal is to enhance code quality, readability, and performance while preserving its original functionality.
- Programmatic Evaluation Pipeline: Once the LLM returns the refactored code, it undergoes a rigorous, automated evaluation pipeline defined in `src/robofactor/evaluation.py`. This pipeline, built using the `returns` library for robust error handling (railway-oriented programming), consists of several checks:
  - Syntax Check: Verifies that the generated code is valid Python.
  - Quality Check: Uses `flake8` to score the code against PEP 8 standards and other common issues.
  - Functional Correctness: Executes the refactored code against a set of predefined test cases to ensure it still produces the correct output.
  If any step fails, the pipeline short-circuits and reports the failure.
- Rich Terminal Display: Finally, the results of the refactoring and evaluation are presented to the user in the terminal. The `src/robofactor/ui.py` module uses the `rich` library to create clear, well-formatted tables and panels that display the original code, the refactored code, the LLM's reasoning, and the detailed evaluation scores.
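The short-circuiting behaviour of the evaluation stage can be sketched without any third-party packages. The example below is a simplified illustration, not the contents of `evaluation.py`: it uses hand-rolled `Success` and `Failure` classes in place of the `returns` library's containers, replaces the `flake8` score with a plain line-length check, and the function signatures and the `evaluate` helper are assumptions made for this sketch.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Union


@dataclass(frozen=True)
class Success:
    value: str  # the code that passed the stage


@dataclass(frozen=True)
class Failure:
    error: str  # description of the first failing check


Result = Union[Success, Failure]
Step = Callable[[str], Result]


def check_syntax(code: str) -> Result:
    """Stage 1: the code must parse as valid Python."""
    try:
        ast.parse(code)
    except SyntaxError as exc:
        return Failure(f"syntax: {exc.msg} (line {exc.lineno})")
    return Success(code)


def check_line_length(code: str, limit: int = 79) -> Result:
    """Stage 2: stand-in for the flake8 score; flags overly long lines."""
    for number, line in enumerate(code.splitlines(), start=1):
        if len(line) > limit:
            return Failure(f"quality: line {number} exceeds {limit} characters")
    return Success(code)


def check_functional_correctness(func_name: str, cases: Sequence[tuple[tuple, Any]]) -> Step:
    """Stage 3: build a step that runs the function on (args, expected) pairs."""
    def step(code: str) -> Result:
        namespace: dict[str, Any] = {}
        try:
            exec(code, namespace)  # trusted input only; a real tool should sandbox this
            for args, expected in cases:
                actual = namespace[func_name](*args)
                if actual != expected:
                    return Failure(
                        f"tests: {func_name}{args!r} returned {actual!r}, expected {expected!r}"
                    )
        except Exception as exc:
            return Failure(f"tests: {type(exc).__name__}: {exc}")
        return Success(code)
    return step


def evaluate(code: str, steps: Sequence[Step]) -> Result:
    """Chain the steps; the first Failure skips everything after it."""
    result: Result = Success(code)
    for step in steps:
        if isinstance(result, Failure):
            break
        result = step(result.value)
    return result


cases = [((1, 2), 3), ((0, 0), 0)]
steps = [check_syntax, check_line_length, check_functional_correctness("add", cases)]

good = evaluate("def add(a, b):\n    return a + b\n", steps)
broken = evaluate("def add(a, b)\n    return a + b\n", steps)   # missing colon
wrong = evaluate("def add(a, b):\n    return a - b\n", steps)   # valid but incorrect
print(good, broken, wrong, sep="\n")
```

Because every stage returns either a success carrying the code or a failure carrying a message, the caller never needs nested `try` blocks: the first failure is simply carried through to the end and reported.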
To contribute to Robofactor, you'll need to set up a local development environment. This project uses `uv` for fast dependency management and a `Makefile` to provide convenient shortcuts for common tasks.
First, clone the repository:
git clone https://github.com/ethan-wickstrom/robofactor.git
cd robofactor
To install all dependencies, including development tools like `ruff`, `mypy`, and `pytest`, run the following command. This will create a virtual environment and install all required packages.
make install-dev
The `Makefile` includes several targets to streamline the development workflow:
- Run all checks: To ensure code quality before committing, run all linters, type-checkers, and tests at once.
make check
- Run tests: Execute the test suite using pytest.
make test
- Linting: Check for code style issues and automatically apply fixes using Ruff.
make lint
- Formatting: Format the code using Ruff Formatter and isort.
make format
- Type-checking: Perform static type analysis with mypy.
make type-check
Contributions are welcome! If you find a bug, have a feature request, or want to contribute to the code, please open an issue on our GitHub repository.
Please check the existing issues to see if your suggestion has already been discussed.