27 Jan 05:22

MissCrispenCakes

afaf199

Release v2.0.0 - REST API + Documentation Restructure

Release Date: January 26, 2026
Zenodo DOI: 10.5281/zenodo.18318098

Major release adding a production-ready REST API (Phase 4) and a complete documentation reorganization.

🎉 What's New

REST API (Phase 4 Complete)

Production-ready REST API with 14 endpoints for programmatic data access:

Documents API - List, filter, paginate, sort documents with 9 filter options
Scorecard API - Country indicators, summary statistics, regional filtering
Tags API - Frequency analysis, version management, multi-dimensional filtering
Timeline API - Temporal analysis (tags over time, year × tag matrices)
Export API - CSV downloads with SPDX license headers
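
As a rough illustration of what a year × tag matrix and a CSV export with an SPDX header could look like, the sketch below builds both from in-memory records. The record fields (`year`, `tags`), the function names, and the exact header line are assumptions made for illustration; they are not the API's documented schema or output format. `CC-BY-4.0` is used because the v1.0.0 notes list CC BY 4.0 as the data license.

```python
import csv
import io
from collections import Counter

# Hypothetical records; the field names "year" and "tags" are assumptions.
documents = [
    {"year": 2023, "tags": ["privacy", "child_rights"]},
    {"year": 2023, "tags": ["privacy"]},
    {"year": 2024, "tags": ["child_rights", "ai_policy"]},
]


def year_tag_matrix(docs):
    """Count documents per (year, tag) and return (years, tags, rows)."""
    counts = Counter()
    for doc in docs:
        for tag in set(doc["tags"]):  # count each tag once per document
            counts[(doc["year"], tag)] += 1
    years = sorted({year for year, _ in counts})
    tags = sorted({tag for _, tag in counts})
    # Counter returns 0 for (year, tag) pairs that never occurred.
    rows = [[counts[(year, tag)] for tag in tags] for year in years]
    return years, tags, rows


def to_csv_with_license(years, tags, rows, license_id="CC-BY-4.0"):
    """Serialize the matrix as CSV, prefixed by an SPDX comment line (assumed format)."""
    buf = io.StringIO()
    buf.write(f"# SPDX-License-Identifier: {license_id}\n")
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["year", *tags])
    for year, row in zip(years, rows):
        writer.writerow([year, *row])
    return buf.getvalue()


years, tags, rows = year_tag_matrix(documents)
csv_text = to_csv_with_license(years, tags, rows)
```

Consumers that parse such a file with a CSV reader would need to skip the leading `#` line; pandas can do this with `comment="#"`.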

Features:

Optional API key authentication with dynamic rate limiting (100-2000 req/hr)
Redis caching with TTLs from 15 minutes to 1 hour
Complete Docker deployment (docker-compose, Nginx, Redis)
104 integration tests (100% pass rate, 100% endpoint coverage)
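
To make the tiered rate limiting concrete, here is a minimal in-memory sketch of a fixed-window limiter. It is not the project's implementation: only the 100 and 2000 requests-per-hour bounds come from these notes, while the tier names, the class, and the fixed-window strategy are assumptions. A production deployment would more plausibly keep the counters in Redis.

```python
import time

# Assumed tiers: only the 100 and 2000 req/hr bounds appear in the release notes.
LIMITS_PER_HOUR = {"anonymous": 100, "api_key": 2000}


class FixedWindowLimiter:
    """Count requests per client within fixed one-hour windows (in memory)."""

    def __init__(self, window_seconds=3600, clock=time.time):
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._counts = {}  # (client_id, window_index) -> requests seen so far

    def allow(self, client_id, tier):
        """Return True and record the request if the client is under its limit."""
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        used = self._counts.get(key, 0)
        if used >= LIMITS_PER_HOUR[tier]:
            return False
        self._counts[key] = used + 1
        return True
```

Old windows are never pruned in this sketch, so a long-running process would need to evict stale keys or rely on key expiry in an external store.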

📖 API Documentation: https://grimdata.org/api/

Documentation Restructure

Complete reorganization with dedicated landing pages:

API Landing Page - Overview, features, quick start examples
Scorecard Landing Page - Design, methodology, data access guides
Projects Landing Page - LittleRainbowRights and SGBV-UPR overviews
Clean professional navigation throughout

Technical Improvements

Test Coverage: Expanded from 124 to 274 tests (170 pipeline + 104 API)
Codebase: Grew from 15,000+ to 21,000+ lines of Python
Dependencies: Updated Flask ecosystem to latest stable versions
CITATION.cff: Updated with REST API keywords, SGBV journal article DOI

📦 Installation

# Clone and install
git clone https://github.com/MissCrispenCakes/DigitalChild.git
cd DigitalChild
pip install -r requirements.txt

# Install API dependencies and run the server
pip install -r api_requirements.txt
python run_api.py

# Access at http://localhost:5000
curl http://localhost:5000/api/health
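
The same health check can be scripted from Python using only the standard library. This is a minimal sketch: the `/api/health` path and the default port come from the commands above, but the response body format is not described in these notes, so the helper checks only the HTTP status code. The function names are illustrative.

```python
import urllib.request


def health_url(base_url):
    """Build the health endpoint URL from a base URL, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(base_url="http://localhost:5000", timeout=5.0):
    """Return True if the health endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so this also covers
        # refused connections, timeouts, and non-2xx responses.
        return False
```

Calling `is_healthy()` after starting `python run_api.py` should return `True` once the server is accepting requests.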

🔧 Changed

Test coverage: 124 → 274 tests
Codebase: 15k → 21k+ lines
Documentation structure reorganized
API docs moved to docs/website/api/
Scorecard docs moved to docs/website/scorecard/
Navigation structure improved

🐛 Fixed

Navigation link consistency
Duplicate content removed
TOC integration in sidebar
404 errors on Projects pages
Scorecard v1.0.0 release notes corrected

📊 Metrics

14 API endpoints operational
194 countries tracked
10 indicators per country
2,543 source URLs validated
274 tests passing (100% success rate)
21,000+ lines of code
75+ documentation files

🔗 Links

Website: https://grimdata.org
API Docs: https://grimdata.org/api/
Scorecard: https://grimdata.org/scorecard/
Deployment Guide: docs/guides/PRODUCTION_DEPLOYMENT.md
Zenodo Archive: https://doi.org/10.5281/zenodo.18318098

📖 Citation

@software{digitalchild2026,
  title = {DigitalChild: Human Rights Data Pipeline for Child and LGBTQ+ Digital Protection},
  author = {Vollmer, S.C. and Vollmer, D.T.},
  year = {2026},
  version = {2.0.0},
  url = {https://github.com/MissCrispenCakes/DigitalChild},
  doi = {10.5281/zenodo.18318098},
  note = {Available at: https://grimdata.org. ORCID: 0000-0002-3359-2810 (S.C. Vollmer)}
}

🙏 Acknowledgments

Python 3.12, Flask, BeautifulSoup4, Selenium, pandas, pypdf, pytest
Redis, Docker, Nginx for production infrastructure
GitHub Actions for CI/CD
MkDocs Material for documentation

Full Changelog: https://github.com/MissCrispenCakes/DigitalChild/blob/basecamp/CHANGELOG.md

20 Jan 19:08

MissCrispenCakes

v1.0.1

af77fe3

DigitalChild v1.0.1 - Zenodo Archive Release

DigitalChild v1.0.0 - Initial Public Release

First stable release of the DigitalChild data pipeline for analyzing human rights documents with focus on child and LGBTQ+ digital protection.

🎯 What's Included

Core Pipeline

7 ingestion sources - 6 automated scrapers (AU Policy, OHCHR, UPR, UNICEF, ACERWC, ACHPR) plus manual upload
Multi-format processing - PDF, DOCX, HTML document conversion
Versioned tagging system - 4 tag versions (v1, v2, v3, digital) with 20+ rights themes
Recommendations extraction - Regex-based extraction with versioning and history tracking
Timeline analysis - Global, by-country, and by-region temporal analysis
Comparison analytics - Side-by-side version comparison for tags and recommendations
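
For readers unfamiliar with regex-based extraction, the following sketch shows the general idea: split text into sentences, then keep those containing a recommendation verb. The trigger words, the sentence splitter, and the function name are all assumptions for illustration; the pipeline's actual patterns and its versioning logic are not described in these notes.

```python
import re

# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")

# Hypothetical trigger verbs; the real pipeline's patterns may differ.
TRIGGER = re.compile(r"\b(recommends?|urges?|calls upon)\b", re.IGNORECASE)


def extract_recommendations(text):
    """Return the sentences in `text` that contain a recommendation trigger."""
    sentences = SENTENCE_SPLIT.split(text.strip())
    return [s.strip() for s in sentences if TRIGGER.search(s)]


sample = (
    "The State party ratified the convention in 2019. "
    "The Committee recommends that the State party adopt a data protection law. "
    "It also urges the government to fund digital skills programmes."
)
recommendations = extract_recommendations(sample)
```

A splitter this simple will break on abbreviations such as "No." or "Art.", which is one reason rule-based extraction tends to need versioned, iteratively refined patterns.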

Scorecard System

194 countries tracked with 10 human rights indicators per country
2,543 authoritative source URLs validated and monitored
Automated validation - URL checking, change detection, link rot monitoring
CSV exports - Summary tables, by-indicator breakdowns, regional analysis
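
One common way to implement change detection and link rot monitoring is to fingerprint each fetched page and compare it with the fingerprint stored from the previous run. The sketch below assumes that approach; the function names and the three status labels are hypothetical, and the project's validator may work differently.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a stable SHA-256 hex digest of the fetched page body."""
    return hashlib.sha256(body).hexdigest()


def classify_check(status_code, body, previous_fingerprint):
    """Classify one URL check as 'broken', 'changed', or 'unchanged'.

    status_code is None when the request failed before any response arrived.
    """
    if status_code is None or status_code >= 400:
        return "broken"
    if (
        previous_fingerprint is not None
        and content_fingerprint(body) != previous_fingerprint
    ):
        return "changed"
    return "unchanged"
```

Hashing the raw body flags any byte-level difference, including rotating ads or timestamps, so a real monitor would likely normalize or extract the main content before fingerprinting.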

Quality & Testing

124 tests - Comprehensive test coverage (scrapers, processors, validators, scorecard)
68 validator tests - Input validation, path traversal protection, URL validation, file size limits
CI/CD pipeline - Automated testing with GitHub Actions
Pre-commit hooks - Code formatting (black, isort, flake8), markdown linting
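
Path traversal protection usually comes down to resolving a user-supplied path against a trusted base directory and rejecting anything that lands outside it. The helper below is a minimal sketch of that check, not the project's validator; the function name is hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir, user_path):
    """Resolve user_path under base_dir, raising ValueError if it escapes."""
    base = Path(base_dir).resolve()
    # Joining with an absolute user_path discards base, so the check below
    # also rejects inputs such as "/etc/passwd".
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Because both paths are resolved first, `..` segments and symlinks are collapsed before the containment check runs.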

Documentation

25+ markdown files - Installation guides, API docs, standards, architecture
CLAUDE.md - Comprehensive development guide for AI assistants
Website deployment - Material for MkDocs with GitHub Pages integration

📊 Dataset Highlights

10 indicators tracked per country:
- AI Policy Status
- Data Protection Law
- LGBTQ Legal Status
- Child Online Protection
- Biometric SIM Registration
- Digital Services Taxation
- Internet Penetration
- Mobile Coverage
- Digital Skills Investment
- Online Content Regulation

Data sources: UNESCO, UNCTAD, ILGA, UNICEF, national governments, treaty bodies

🔧 Technical Specifications

Language: Python 3.12
Key libraries: BeautifulSoup4, Selenium, pandas, PyPDF2, pytest
Lines of code: 15,000+ (Python, config, tests)
Export formats: CSV, JSON
License: MIT (code), CC BY 4.0 (data)

📝 Citation

@software{digitalchild2026,
  author = {Vollmer, S.C.},
  title = {DigitalChild: Human Rights Data Pipeline for Child and LGBTQ+ Digital Protection},
  year = {2026},
  version = {1.0.0},
  url = {https://github.com/MissCrispenCakes/DigitalChild},
  doi = {10.5281/zenodo.XXXXXXX}
}

🌍 Related Projects

GRIMdata.org - Main platform website
LittleRainbowRights.com - Child & LGBTQ+ digital rights project
SGBV-UPR Research - Precursor research on SGBV and Universal Periodic Review

⚠️ Known Limitations

Scorecard sources require periodic manual validation (some URLs change)
PDF extraction may have OCR limitations for scanned documents
Regional coverage currently strongest in Africa (global expansion planned)

🚀 What's Next (Phase 4)

Research dashboard with interactive visualizations
REST API for data access
NLP-based recommendations extraction
Global expansion (Europe, Asia, Americas)

📖 Documentation

Full docs: https://grimdata.org
Installation: See README.md
API docs: See docs/

Note: This is the first public release suitable for research, citation, and replication. Future versions will include dashboard features and expanded geographic coverage.

20 Jan 19:00

MissCrispenCakes

v1.0.0

af77fe3

DigitalChild v1.0.0 - Initial Public Release


Releases: MissCrispenCakes/DigitalChild