Query-based All-in-one Research Kit for painless experiment configuration, setup, reproduction, and summary

Query-based All-in-one Research Kit (QUARK)

Python 3.9+ License: MIT

The Query-based All-in-one Research Kit (QUARK) is a toolkit for researchers, data scientists, and developers who need to manage experimental workflows. It focuses on PyTorch-based research projects and provides a single platform that covers experiment configuration, environment setup, and result summarization, with reproducibility as a goal throughout the research lifecycle.

🌟 Key Features

📊 Experiment Management

  • Query-based Configuration: Intuitive interfaces for setting up experiments
  • Automated Environment Setup: Seamless configuration of dependencies and resources
  • Experiment Versioning: Track and manage different versions of your experiments
  • Result Summarization: Generate comprehensive reports of experimental outcomes

🔧 Technical Capabilities

  • Environment Isolation: Built-in support for virtual environments and Docker containers
  • Plugin System: Extensible architecture for custom functionality
  • Task Automation: Powered by Invoke for efficient task management
  • Native Code Integration: CMake support for C++/native code components
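
To make the environment isolation idea concrete, the standard-library `venv` module can create an isolated environment programmatically. This is a minimal sketch, not QUARK's own bootstrap logic; the function name `create_isolated_env` and the target directory are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_env(target: Path) -> Path:
    """Create a bare virtual environment at `target` and return its config file."""
    # with_pip=False keeps the sketch fast and offline; a real setup would
    # most likely want pip available inside the environment.
    builder = venv.EnvBuilder(with_pip=False, clear=True)
    builder.create(target)
    return target / "pyvenv.cfg"


if __name__ == "__main__":
    # Hypothetical location: a throwaway directory that is removed afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_isolated_env(Path(tmp) / "quark-env")
        print(cfg.read_text())
```

The `pyvenv.cfg` file written by `venv` records which base interpreter the environment was created from, which helps when checking what an experiment actually ran on.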

🔍 Research Tools

  • Data Processing: Efficient data handling with numpy and msgpack
  • Validation: Robust data validation using Pydantic
  • Documentation: Integrated MkDocs with Material theme for beautiful documentation
  • Testing: Comprehensive testing infrastructure with PyTest

🚀 Getting Started

Prerequisites

  1. Python Environment:

    # Ensure Python 3.9 or higher is installed
    python --version
  2. System Requirements:

    • Python 3.9+
    • invoke (for task automation)
    • venv (for environment management)
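
The prerequisites above can also be checked from Python before running any setup. The sketch below is only an illustration: the helper name `check_prerequisites` is hypothetical, and it assumes `invoke` is considered available when its command-line entry point is on `PATH`.

```python
import importlib.util
import shutil
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the prerequisites


def check_prerequisites(version=None, which=shutil.which):
    """Return a dict mapping each prerequisite to True (found) or False."""
    version = tuple(version or sys.version_info[:2])
    return {
        "python>=3.9": version >= MIN_PYTHON,
        # Invoke installs a command-line entry point named `invoke`.
        "invoke": which("invoke") is not None,
        # venv ships with CPython, but some distributions package it separately.
        "venv": importlib.util.find_spec("venv") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:12} {'ok' if ok else 'MISSING'}")
```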

Installation

  1. Using Poetry (Recommended):

    # Install dependencies and set up the project
    poetry install
  2. Verify Installation:

    quark --help

📖 Project Structure

quark/
├── quark/              # Core package
│   ├── coordinator/    # Coordination logic
│   └── tasks.py        # Task definitions
├── plugins/            # Plugin system
├── utility/            # Utility functions
├── tools/              # Development tools
├── experiments/        # Experiment configurations
├── environments/       # Environment settings
├── tests/              # Test suite
├── data/               # Data storage
└── runtime/            # Runtime configurations
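
A quick way to confirm that a checkout matches this layout is to test for the top-level directories listed in the tree. The helper name `missing_dirs` is hypothetical. The sketch also assumes every listed directory is committed to the repository; some, such as data/ or runtime/, may only appear after the first run.

```python
from pathlib import Path

# Top-level directories taken from the project tree.
EXPECTED_DIRS = [
    "quark", "plugins", "utility", "tools", "experiments",
    "environments", "tests", "data", "runtime",
]


def missing_dirs(repo_root: Path) -> list:
    """Return the expected directories that do not exist under `repo_root`."""
    return [name for name in EXPECTED_DIRS if not (repo_root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(Path.cwd())
    print("layout ok" if not missing else f"missing: {', '.join(missing)}")
```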

🛠️ Usage

Core Commands

# View all available commands
quark --help

# Bootstrap the environment (sets up both PyTorch and TensorFlow)
quark bootstrap

# Install dependencies
quark install

# Build the project
quark build

# Run tests
quark test                    # Run all tests
quark unittest               # Run unit tests for all frameworks
quark unittest-torch         # Run PyTorch specific tests
quark unittest-tf           # Run TensorFlow specific tests

# Format code (Python with black/isort, C++ with clang-format)
quark format

# Clean up virtual environments
quark clean

Benchmark and Experiments

# Run benchmarks
quark bench <task-name> --task-dir=<directory>

# Run the QUARK engine tests
quark quark-engine-test
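
When several benchmarks need to be launched from a script, the `quark bench` invocation can be assembled as an argument list. This is a sketch under assumptions: only the `bench` subcommand and the `--task-dir` option come from the commands above, while the helper name `bench_command`, the task names, and the directory are hypothetical.

```python
import shlex


def bench_command(task_name: str, task_dir: str) -> list:
    """Build the argv for `quark bench <task-name> --task-dir=<directory>`."""
    return ["quark", "bench", task_name, f"--task-dir={task_dir}"]


if __name__ == "__main__":
    # Hypothetical task names and directory, purely for illustration.
    for task in ["matmul", "conv2d"]:
        # shlex.join renders the argv as a copy-pasteable shell line.
        print(shlex.join(bench_command(task, "benchmarks/my tasks")))
```

Keeping the command as a list means it can be passed straight to `subprocess.run` without shell quoting problems, even when the directory name contains spaces.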

Plugin Management

# Get available plugins
quark get-plugins

# Pull plugins from repositories
quark pull-plugins

# Build plugins
quark build-plugins

# Run plugin tests
quark catz-smoke-test            # Test Catzilla plugins
quark serialisation-smoke-test   # Test serialization plugins

Development Workflow

  1. Initial Setup:

    # Bootstrap the environment
    quark bootstrap
    
    # Install dependencies
    quark install
  2. Development Cycle:

    # Format your code
    quark format
    
    # Run tests
    quark test
    
    # Build the project
    quark build
  3. Plugin Development:

    # Build and test plugins
    quark build-plugins
    quark catz-smoke-test

All of these commands are exposed through the quark CLI, so there is normally no need to run pytest, poetry, or other tools by hand. The CLI drives the underlying tools for you.
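
The development cycle (format, test, build) can be scripted so that it stops at the first failing step. The sketch below is one possible wrapper, not part of QUARK: the name `run_cycle` is hypothetical, and it assumes each `quark` subcommand signals failure through a non-zero exit code.

```python
import subprocess

# Development-cycle commands in the order given in the workflow.
DEV_CYCLE = [
    ["quark", "format"],
    ["quark", "test"],
    ["quark", "build"],
]


def run_cycle(steps=DEV_CYCLE, runner=subprocess.run):
    """Run steps in order; return the first failing argv, or None if all pass."""
    for argv in steps:
        # Assumption: a non-zero return code means the step failed.
        if runner(argv).returncode != 0:
            return argv
    return None
```

Calling `run_cycle()` with the defaults requires the quark CLI on `PATH`. The `runner` parameter exists so the logic can be exercised with a stand-in during tests.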

🤝 Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📚 Documentation

Comprehensive documentation is available through MkDocs:

# Build documentation
mkdocs build

# Serve documentation locally
mkdocs serve

🔧 Development

Setting Up Development Environment

# Install development dependencies
poetry install --with dev

# Run tests
pytest tests/

# Build documentation
mkdocs build

Code Style

  • Follow PEP 8 guidelines
  • Use type hints
  • Write comprehensive docstrings
  • Include unit tests for new features
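
As an illustration of these guidelines, here is a small function with type hints, a docstring, and a PyTest-style unit test. The function `mean_accuracy` is a hypothetical example and does not come from the QUARK codebase.

```python
from typing import Sequence


def mean_accuracy(scores: Sequence[float]) -> float:
    """Return the arithmetic mean of per-run accuracy scores.

    Args:
        scores: Accuracy values in the range [0, 1], one per run.

    Raises:
        ValueError: If `scores` is empty or contains an out-of-range value.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    return sum(scores) / len(scores)


def test_mean_accuracy() -> None:
    """PyTest collects functions whose names start with test_."""
    assert mean_accuracy([0.5, 1.0]) == 0.75
    try:
        mean_accuracy([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")
```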

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Contact

Albert Shi - [email protected]

Project Link: https://github.com/codes1gn/quark


Made with ❤️ by the QUARK team

🔄 Comparison with Other Tools

QUARK differentiates itself from other experiment tracking and MLOps tools in several key ways:

📊 Compared to Traditional Experiment Trackers

| Feature | QUARK | MLflow | Weights & Biases | DVC |
| --- | --- | --- | --- | --- |
| Query-based Configuration | ✅ | ❌ | ❌ | ❌ |
| Automated Environment Management | ✅ | Limited | Limited | Limited |
| Integrated Data Version Control | ✅ | ❌ | Limited | ✅ |
| Framework Agnostic | ✅ | ✅ | ✅ | ✅ |
| Built-in Reproducibility | ✅ | Limited | Limited | ✅ |

🌟 QUARK's Unique Advantages

  1. Query-First Approach

    • Intuitive query-based interface for experiment configuration
    • Natural language-like syntax for defining experiments
    • Reduced cognitive load compared to traditional configuration files
  2. Unified Research Environment

    • Seamless integration of experiment tracking, data versioning, and model management
    • Consistent workflow across different ML frameworks
    • Built-in support for both PyTorch and TensorFlow ecosystems
  3. Advanced Reproducibility

    • Automatic environment snapshots
    • Complete experiment lineage tracking
    • Deterministic experiment reproduction
  4. Scalable Architecture

    • Designed for large-scale research projects
    • Efficient handling of distributed training
    • Cloud-native architecture with Kubernetes support
  5. Research-Oriented Features

    • First-class support for academic research workflows
    • Built-in citation and paper tracking
    • Easy experiment sharing and collaboration
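
To give a sense of what environment snapshots and deterministic reproduction involve in general, the sketch below records the interpreter, platform, and installed package versions, and shows that a seeded random generator repeats its draws. It uses only the standard library and is not QUARK's mechanism. All function names (`snapshot_environment`, `fingerprint`, `seeded_run`) are hypothetical.

```python
import hashlib
import json
import platform
import random
import sys
from importlib import metadata


def snapshot_environment() -> dict:
    """Collect interpreter, platform, and installed-package versions."""
    packages = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            packages.append(f"{name}=={version}")
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": sorted(packages),
    }


def fingerprint(snapshot: dict) -> str:
    """Stable SHA-256 hash of a snapshot, for comparing two environments."""
    payload = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def seeded_run(seed: int, n: int = 3) -> list:
    """Stand-in for an experiment: the same seed yields the same draws."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    snap = snapshot_environment()
    print(snap["python"], fingerprint(snap)[:12])
    print(seeded_run(42) == seeded_run(42))
```

Storing the fingerprint next to each result makes it easy to see whether two runs used the same environment. Real experiments also need framework-specific seeding and determinism settings, which this sketch does not cover.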

🔍 When to Choose QUARK

QUARK is particularly well-suited for:

  • Research Teams who need robust experiment tracking with academic workflow support
  • Production ML Teams requiring seamless transition from research to deployment
  • Organizations looking for a unified platform that scales with their ML initiatives
  • Projects that require extensive experimentation and rigorous reproducibility
  • Teams working across multiple ML frameworks and environments
