Commit 942b75d

docs: add comprehensive development guide for new developers
Co-Authored-By: Hal Seki <[email protected]>
1 parent 8ee036a commit 942b75d

1 file changed, 278 additions and 0 deletions

docs/DEVELOPMENT.md

@@ -0,0 +1,278 @@
# BirdXplorer Development Guide

This document provides comprehensive guidance for developers who want to contribute to the BirdXplorer project.

## Project Overview

BirdXplorer is a software tool that helps users explore community notes data on X (formerly known as Twitter). The project consists of several components:

- **API**: A FastAPI-based web service that provides endpoints for querying community notes data
- **ETL**: Extract, Transform, Load processes for community notes data
- **Common**: Shared code and utilities used across the project

## Prerequisites

Before you begin development, ensure you have the following installed:

- [Python](https://www.python.org/) (v3.10.12)
- [PostgreSQL](https://www.postgresql.org/) (v15.4)
- [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/) (for local development)
- [Git](https://git-scm.com/)

## Repository Structure

```
BirdXplorer/
├── api/                      # FastAPI web service
│   ├── birdxplorer_api/      # API source code
│   ├── tests/                # API tests
│   ├── Dockerfile            # Production Docker configuration
│   ├── Dockerfile.dev        # Development Docker configuration
│   └── pyproject.toml        # API package configuration
├── common/                   # Shared code and utilities
│   ├── birdxplorer_common/   # Common source code
│   ├── tests/                # Common tests
│   └── pyproject.toml        # Common package configuration
├── etl/                      # Extract, Transform, Load processes
│   ├── src/                  # ETL source code
│   ├── tests/                # ETL tests
│   └── pyproject.toml        # ETL package configuration
├── migrate/                  # Database migration scripts
├── docs/                     # Documentation
├── scripts/                  # Utility scripts
└── compose.yml               # Docker Compose configuration
```

## Getting Started

### 1. Clone the Repository

```bash
git clone https://github.com/codeforjapan/BirdXplorer.git
cd BirdXplorer
```

### 2. Set Up Environment Variables

```bash
cp .env.example .env
```

Edit the `.env` file to set the required environment variables:

```
BX_STORAGE_SETTINGS__PASSWORD=birdxplorer
```

For ETL processes, you may need additional environment variables. Check the `.env.example` file in the ETL directory:

```bash
cp etl/.env.example etl/.env
```
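The double underscore in `BX_STORAGE_SETTINGS__PASSWORD` looks like the prefix-plus-nested-delimiter naming scheme that settings libraries such as pydantic-settings support. The sketch below only illustrates how such a name could map onto nested settings. It assumes a `BX_` prefix and `__` as the nesting delimiter, and the helper `nested_settings` is hypothetical, not part of BirdXplorer.

```python
from typing import Any, Dict, Mapping


def nested_settings(
    environ: Mapping[str, str], prefix: str = "BX_", delimiter: str = "__"
) -> Dict[str, Any]:
    """Group prefixed environment variables into a nested dict (illustration only)."""
    settings: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # ignore variables that belong to other programs
        # "BX_STORAGE_SETTINGS__PASSWORD" -> ["storage_settings", "password"]
        keys = name[len(prefix):].lower().split(delimiter)
        node = settings
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return settings


if __name__ == "__main__":
    example = {"BX_STORAGE_SETTINGS__PASSWORD": "birdxplorer", "HOME": "/home/dev"}
    print(nested_settings(example))
```

Under these assumptions, the variable from the `.env` example ends up as the `password` key inside a `storage_settings` group. Check the project's own settings code for the names it actually reads.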

### 3. Development Environment Setup

#### Option 1: Using Docker Compose (Recommended)

The easiest way to get started is to use Docker Compose, which sets up all the required services:

```bash
docker compose up -d
```

This will start:
- PostgreSQL database
- API service
- Migration service

The API will be available at http://localhost:8000.
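Because `docker compose up -d` returns before the containers have finished starting, the API may not answer straight away. The following is a minimal sketch of a readiness check, assuming the API serves its Swagger UI at `http://localhost:8000/docs`. The helper name `wait_for_http` and the timeout values are illustrative choices, not part of the project.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service is not accepting connections yet; retry below
        time.sleep(interval)
    return False


if __name__ == "__main__":
    ready = wait_for_http("http://localhost:8000/docs")
    print("API is up" if ready else "API did not respond in time")
```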

#### Option 2: Local Development Setup

If you prefer to develop without Docker, you can set up each component individually:

1. Set up a virtual environment:

```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

2. Install the project in development mode:

```bash
pip install -e ".[dev]"
```

3. Install each component:

```bash
# Install common package
cd common
pip install -e ".[dev]"
cd ..

# Install API package
cd api
pip install -e ".[dev]"
cd ..

# Install ETL package
cd etl
pip install -e ".[dev]"
cd ..
```

4. Run the API server:

```bash
cd api
uvicorn birdxplorer_api.main:app --reload
```

### 4. Database Migrations

Database migrations are managed using Alembic. To run migrations:

```bash
# Using Docker
docker compose up migrate

# Manually
cd migrate
alembic upgrade head
```

## Development Workflow

### Code Style and Linting

The project uses the following tools for code quality:

- [Black](https://black.readthedocs.io/) for code formatting
- [isort](https://pycqa.github.io/isort/) for import sorting
- [Flake8](https://flake8.pycqa.github.io/) for linting
- [MyPy](https://mypy.readthedocs.io/) for type checking

You can run all these checks using tox:

```bash
tox
```

Or run them individually:

```bash
black .
isort .
flake8
mypy
```

### Testing

The project uses pytest for testing. To run tests:

```bash
# Run all tests
tox

# Run tests for a specific component
cd api
pytest

cd ../common
pytest

cd ../etl
pytest
```

Data model tests need a local copy of the community notes data. Download it first, then point `BX_DATA_DIR` at that directory when running tox:

```bash
BX_DATA_DIR=data/20230924 tox
```

### API Documentation

With the API running locally, its documentation is available at:

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- OpenAPI JSON: http://localhost:8000/openapi.json
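The OpenAPI JSON document is also handy for scripting. As a rough sketch, the snippet below lists every operation the running API exposes. It assumes the local URL given above, and the helper `list_operations` is a hypothetical name introduced for this example.

```python
import json
import urllib.request
from typing import Any, Dict, List


def list_operations(spec: Dict[str, Any]) -> List[str]:
    """Return "METHOD /path" strings for every operation in an OpenAPI document."""
    http_methods = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method in sorted(item):
            if method in http_methods:  # skip non-operation keys such as "parameters"
                operations.append(f"{method.upper()} {path}")
    return operations


if __name__ == "__main__":
    # Requires the API to be running locally.
    with urllib.request.urlopen("http://localhost:8000/openapi.json") as response:
        spec = json.load(response)
    for line in list_operations(spec):
        print(line)
```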

### ETL Processes

The ETL processes use Prefect for workflow management. To run ETL processes:

```bash
cd etl
python -m birdxplorer_etl.main
```

## Contributing

### Pull Request Process

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests and linting
5. Submit a pull request

### Commit Message Guidelines

Follow the conventional commits specification:

```
<type>(<scope>): <description>

[optional body]

[optional footer]
```

Types:
- feat: A new feature
- fix: A bug fix
- docs: Documentation changes
- style: Code style changes (formatting, etc.)
- refactor: Code changes that neither fix bugs nor add features
- perf: Performance improvements
- test: Adding or fixing tests
- chore: Changes to the build process or auxiliary tools
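To make the header format concrete, here is a small sketch of a checker for the first line of a commit message. It treats the scope as optional, as the Conventional Commits specification does, and accepts only the types listed above. The names `COMMIT_TYPES`, `HEADER_PATTERN` and `is_valid_header` are illustrative and not part of any project tooling.

```python
import re

# Types listed in the guidelines above.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# <type>(<scope>): <description>, with the scope optional.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_valid_header(header: str) -> bool:
    """Check the first line of a commit message against the format above."""
    return HEADER_PATTERN.match(header) is not None


if __name__ == "__main__":
    # The second message is a made-up example with a hypothetical scope.
    for message in (
        "docs: add comprehensive development guide for new developers",
        "feat(api): add search endpoint",
        "update readme",
    ):
        print(f"{message!r}: {is_valid_header(message)}")
```

A check like this could run from a local `commit-msg` hook, although whether the project enforces the format automatically is not stated here.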

## Troubleshooting

### Common Issues

#### Database Connection Issues

If you encounter database connection issues, check:

1. PostgreSQL is running
2. The connection settings in your `.env` file (for example `BX_STORAGE_SETTINGS__PASSWORD`) are correct
3. The database user has the necessary permissions
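The first check in the list above can be scripted. This is a minimal sketch that only tests whether something accepts TCP connections on the database port; it does not verify credentials or permissions. It assumes PostgreSQL listens on `localhost` at its default port 5432, which may differ in your setup, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 5432 is PostgreSQL's default port; adjust if your setup maps a different one.
    print("PostgreSQL reachable:", port_is_open("localhost", 5432))
```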

#### Missing Dependencies

If you encounter missing dependencies, ensure you've installed the project with the dev dependencies:

```bash
pip install -e ".[dev]"
```

#### Data Import Issues

For ETL processes, ensure you have the correct data directory set:

```bash
export BX_DATA_DIR=path/to/data
```
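A quick pre-flight check can catch a missing or mistyped data directory before a long run starts. The sketch below is an assumption about how one might validate `BX_DATA_DIR` by hand. It is not how BirdXplorer itself reads the variable, and `resolve_data_dir` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_data_dir(environ: Mapping[str, str] = os.environ) -> Path:
    """Return the directory named by BX_DATA_DIR, or raise a descriptive error."""
    raw = environ.get("BX_DATA_DIR")
    if not raw:
        raise RuntimeError("BX_DATA_DIR is not set")
    data_dir = Path(raw).expanduser().resolve()
    if not data_dir.is_dir():
        raise RuntimeError(f"BX_DATA_DIR points to a missing directory: {data_dir}")
    return data_dir


if __name__ == "__main__":
    print("Using data directory:", resolve_data_dir())
```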

## Additional Resources

- [Example Use Cases](./example.md)
- [API Documentation](https://birdxplorer.onrender.com/docs)
- [OpenAPI Specification](https://birdxplorer.onrender.com/openapi.json)
