| 1 | +# BirdXplorer Development Guide |
| 2 | + |
| 3 | +This document provides comprehensive guidance for developers who want to contribute to the BirdXplorer project. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +BirdXplorer is a software tool that helps users explore community notes data on X (formerly known as Twitter). The project consists of several components: |
| 8 | + |
| 9 | +- **API**: A FastAPI-based web service that provides endpoints for querying community notes data |
| 10 | +- **ETL**: Extract, Transform, Load processes for community notes data |
| 11 | +- **Common**: Shared code and utilities used across the project |
| 12 | + |
| 13 | +## Prerequisites |
| 14 | + |
| 15 | +Before you begin development, ensure you have the following installed: |
| 16 | + |
| 17 | +- [Python](https://www.python.org/) (v3.10.12) |
| 18 | +- [PostgreSQL](https://www.postgresql.org/) (v15.4) |
| 19 | +- [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/) (for local development) |
| 20 | +- [Git](https://git-scm.com/) |
| 21 | + |
| 22 | +## Repository Structure |
| 23 | + |
| 24 | +``` |
| 25 | +BirdXplorer/ |
| 26 | +├── api/ # FastAPI web service |
| 27 | +│ ├── birdxplorer_api/ # API source code |
| 28 | +│ ├── tests/ # API tests |
| 29 | +│ ├── Dockerfile # Production Docker configuration |
| 30 | +│ ├── Dockerfile.dev # Development Docker configuration |
| 31 | +│ └── pyproject.toml # API package configuration |
| 32 | +├── common/ # Shared code and utilities |
| 33 | +│ ├── birdxplorer_common/ # Common source code |
| 34 | +│ ├── tests/ # Common tests |
| 35 | +│ └── pyproject.toml # Common package configuration |
| 36 | +├── etl/ # Extract, Transform, Load processes |
| 37 | +│ ├── src/ # ETL source code |
| 38 | +│ ├── tests/ # ETL tests |
| 39 | +│ └── pyproject.toml # ETL package configuration |
| 40 | +├── migrate/ # Database migration scripts |
| 41 | +├── docs/ # Documentation |
| 42 | +├── scripts/ # Utility scripts |
| 43 | +└── compose.yml # Docker Compose configuration |
| 44 | +``` |
| 45 | + |
| 46 | +## Getting Started |
| 47 | + |
| 48 | +### 1. Clone the Repository |
| 49 | + |
| 50 | +```bash |
| 51 | +git clone https://github.com/codeforjapan/BirdXplorer.git |
| 52 | +cd BirdXplorer |
| 53 | +``` |
| 54 | + |
| 55 | +### 2. Set Up Environment Variables |
| 56 | + |
| 57 | +```bash |
| 58 | +cp .env.example .env |
| 59 | +``` |
| 60 | + |
| 61 | +Edit the `.env` file to set the required environment variables: |
| 62 | + |
| 63 | +``` |
| 64 | +BX_STORAGE_SETTINGS__PASSWORD=birdxplorer |
| 65 | +``` |
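The variable name above suggests a prefix (`BX_`) plus a double-underscore separator between a settings group and a field. The sketch below illustrates that naming convention with the standard library only. It is an assumption about how such names are usually interpreted, not the project's actual settings loader, and `nested_settings` is a hypothetical helper.

```python
import os
from typing import Any, Dict, Mapping, Optional


def nested_settings(
    environ: Optional[Mapping[str, str]] = None,
    prefix: str = "BX_",      # assumed prefix, taken from the variable name above
    delimiter: str = "__",    # assumed separator between group and field
) -> Dict[str, Any]:
    """Group prefixed environment variables into a nested dict."""
    source = os.environ if environ is None else environ
    result: Dict[str, Any] = {}
    for name, value in source.items():
        if not name.startswith(prefix):
            continue
        # "BX_STORAGE_SETTINGS__PASSWORD" -> ["storage_settings", "password"]
        keys = name[len(prefix):].lower().split(delimiter)
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


if __name__ == "__main__":
    print(nested_settings({"BX_STORAGE_SETTINGS__PASSWORD": "birdxplorer"}))
```

Under this reading, `BX_STORAGE_SETTINGS__PASSWORD` would set the `password` field of a `storage_settings` group.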
| 66 | + |
| 67 | +ETL processes may need additional environment variables. Copy the `.env.example` file in the `etl` directory and adjust the values as needed: |
| 68 | + |
| 69 | +```bash |
| 70 | +cp etl/.env.example etl/.env |
| 71 | +``` |
| 72 | + |
| 73 | +### 3. Development Environment Setup |
| 74 | + |
| 75 | +#### Option 1: Using Docker Compose (Recommended) |
| 76 | + |
| 77 | +The easiest way to get started is to use Docker Compose, which sets up all the required services: |
| 78 | + |
| 79 | +```bash |
| 80 | +docker compose up -d |
| 81 | +``` |
| 82 | + |
| 83 | +This will start: |
| 84 | +- PostgreSQL database |
| 85 | +- API service |
| 86 | +- Migration service |
| 87 | + |
| 88 | +The API will be available at http://localhost:8000. |
| 89 | + |
| 90 | +#### Option 2: Local Development Setup |
| 91 | + |
| 92 | +If you prefer to develop without Docker, you can set up each component individually: |
| 93 | + |
| 94 | +1. Set up a virtual environment: |
| 95 | + |
| 96 | +```bash |
| 97 | +python -m venv venv |
| 98 | +source venv/bin/activate # On Windows: venv\Scripts\activate |
| 99 | +``` |
| 100 | + |
| 101 | +2. Install the project in development mode: |
| 102 | + |
| 103 | +```bash |
| 104 | +pip install -e ".[dev]" |
| 105 | +``` |
| 106 | + |
| 107 | +3. Install each component: |
| 108 | + |
| 109 | +```bash |
| 110 | +# Install common package |
| 111 | +cd common |
| 112 | +pip install -e ".[dev]" |
| 113 | +cd .. |
| 114 | + |
| 115 | +# Install API package |
| 116 | +cd api |
| 117 | +pip install -e ".[dev]" |
| 118 | +cd .. |
| 119 | + |
| 120 | +# Install ETL package |
| 121 | +cd etl |
| 122 | +pip install -e ".[dev]" |
| 123 | +cd .. |
| 124 | +``` |
| 125 | + |
| 126 | +4. Run the API server: |
| 127 | + |
| 128 | +```bash |
| 129 | +cd api |
| 130 | +uvicorn birdxplorer_api.main:app --reload |
| 131 | +``` |
| 132 | + |
| 133 | +### 4. Database Migrations |
| 134 | + |
| 135 | +Database migrations are managed using Alembic. Run them with one of the following: |
| 136 | + |
| 137 | +```bash |
| 138 | +# Using Docker |
| 139 | +docker compose up migrate |
| 140 | + |
| 141 | +# Manually |
| 142 | +cd migrate |
| 143 | +alembic upgrade head |
| 144 | +``` |
| 145 | + |
| 146 | +## Development Workflow |
| 147 | + |
| 148 | +### Code Style and Linting |
| 149 | + |
| 150 | +The project uses the following tools for code quality: |
| 151 | + |
| 152 | +- [Black](https://black.readthedocs.io/) for code formatting |
| 153 | +- [isort](https://pycqa.github.io/isort/) for import sorting |
| 154 | +- [Flake8](https://flake8.pycqa.github.io/) for linting |
| 155 | +- [MyPy](https://mypy.readthedocs.io/) for type checking |
| 156 | + |
| 157 | +You can run all these checks using tox: |
| 158 | + |
| 159 | +```bash |
| 160 | +tox |
| 161 | +``` |
| 162 | + |
| 163 | +Or run them individually: |
| 164 | + |
| 165 | +```bash |
| 166 | +black . |
| 167 | +isort . |
| 168 | +flake8 |
| 169 | +mypy |
| 170 | +``` |
| 171 | + |
| 172 | +### Testing |
| 173 | + |
| 174 | +The project uses pytest for testing. To run tests: |
| 175 | + |
| 176 | +```bash |
| 177 | +# Run all tests |
| 178 | +tox |
| 179 | + |
| 180 | +# Run tests for a specific component |
| 181 | +cd api |
| 182 | +pytest |
| 183 | + |
| 184 | +cd ../common |
| 185 | +pytest |
| 186 | + |
| 187 | +cd ../etl |
| 188 | +pytest |
| 189 | +``` |
| 190 | + |
| 191 | +Data model tests need community notes data that has already been downloaded. Point `BX_DATA_DIR` at the directory containing it when you run tox: |
| 192 | + |
| 193 | +```bash |
| 194 | +BX_DATA_DIR=data/20230924 tox |
| 195 | +``` |
| 196 | + |
| 197 | +### API Documentation |
| 198 | + |
| 199 | +The API documentation is available at: |
| 200 | + |
| 201 | +- Swagger UI: http://localhost:8000/docs |
| 202 | +- ReDoc: http://localhost:8000/redoc |
| 203 | +- OpenAPI JSON: http://localhost:8000/openapi.json |
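As a quick way to see which endpoints a running instance exposes, the OpenAPI JSON can be read programmatically. This is a minimal sketch using only the standard library. `fetch_spec` and `list_operations` are hypothetical helper names, and the default base URL assumes the local development server described above.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

# HTTP methods that OpenAPI allows as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_spec(base_url: str = "http://localhost:8000") -> Dict[str, Any]:
    """Download the OpenAPI document from a running API instance."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


def list_operations(spec: Dict[str, Any]) -> List[str]:
    """Return 'METHOD /path' strings for every operation in the spec."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method in sorted(item):
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS:
                operations.append(f"{method.upper()} {path}")
    return operations
```

With the API running, `list_operations(fetch_spec())` returns one entry per endpoint and method.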
| 204 | + |
| 205 | +### ETL Processes |
| 206 | + |
| 207 | +The ETL processes use Prefect for workflow management. To run ETL processes: |
| 208 | + |
| 209 | +```bash |
| 210 | +cd etl |
| 211 | +python -m birdxplorer_etl.main |
| 212 | +``` |
| 213 | + |
| 214 | +## Contributing |
| 215 | + |
| 216 | +### Pull Request Process |
| 217 | + |
| 218 | +1. Fork the repository |
| 219 | +2. Create a feature branch |
| 220 | +3. Make your changes |
| 221 | +4. Run tests and linting |
| 222 | +5. Submit a pull request |
| 223 | + |
| 224 | +### Commit Message Guidelines |
| 225 | + |
| 226 | +Follow the [Conventional Commits](https://www.conventionalcommits.org/) specification: |
| 227 | + |
| 228 | +``` |
| 229 | +<type>(<scope>): <description> |
| 230 | + |
| 231 | +[optional body] |
| 232 | + |
| 233 | +[optional footer] |
| 234 | +``` |
| 235 | + |
| 236 | +Types: |
| 237 | +- feat: A new feature |
| 238 | +- fix: A bug fix |
| 239 | +- docs: Documentation changes |
| 240 | +- style: Code style changes (formatting, etc.) |
| 241 | +- refactor: Code changes that neither fix bugs nor add features |
| 242 | +- perf: Performance improvements |
| 243 | +- test: Adding or fixing tests |
| 244 | +- chore: Changes to the build process or auxiliary tools |
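The header format and type list above can be checked mechanically, for example in a local Git hook. The following is a minimal sketch, not a tool the project ships. `is_valid_header` is a hypothetical helper, and it treats the scope as optional, as the Conventional Commits specification does.

```python
import re

# Types listed in the guidelines above.
ALLOWED_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# <type>(<scope>): <description>, with the scope treated as optional.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_valid_header(message: str) -> bool:
    """Check only the first line of a commit message."""
    lines = message.splitlines()
    header = lines[0] if lines else ""
    return HEADER_PATTERN.match(header) is not None
```

For instance, `is_valid_header("fix(api): handle empty query")` is accepted, while `"feature: add thing"` is rejected because `feature` is not one of the listed types.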
| 245 | + |
| 246 | +## Troubleshooting |
| 247 | + |
| 248 | +### Common Issues |
| 249 | + |
| 250 | +#### Database Connection Issues |
| 251 | + |
| 252 | +If you encounter database connection issues, check that: |
| 253 | + |
| 254 | +1. PostgreSQL is running |
| 255 | +2. The connection string in your `.env` file is correct |
| 256 | +3. The database user has the necessary permissions |
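The first check in the list above can be scripted. This sketch only tests whether something accepts TCP connections on the given host and port. It assumes PostgreSQL's default port 5432 on `localhost`, which may differ from your setup, and it does not verify credentials or permissions. `can_connect` is a hypothetical helper.

```python
import socket


def can_connect(host: str = "localhost", port: int = 5432, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If `can_connect()` returns `False`, the database is probably not running or is listening on a different host or port.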
| 257 | + |
| 258 | +#### Missing Dependencies |
| 259 | + |
| 260 | +If you encounter missing dependencies, ensure you've installed the project with the dev dependencies: |
| 261 | + |
| 262 | +```bash |
| 263 | +pip install -e ".[dev]" |
| 264 | +``` |
| 265 | + |
| 266 | +#### Data Import Issues |
| 267 | + |
| 268 | +For ETL processes, ensure you have the correct data directory set: |
| 269 | + |
| 270 | +```bash |
| 271 | +export BX_DATA_DIR=path/to/data |
| 272 | +``` |
| 273 | + |
| 274 | +## Additional Resources |
| 275 | + |
| 276 | +- [Example Use Cases](./example.md) |
| 277 | +- [API Documentation](https://birdxplorer.onrender.com/docs) |
| 278 | +- [OpenAPI Specification](https://birdxplorer.onrender.com/openapi.json) |