Releases: MissCrispenCakes/DigitalChild

Release v2.0.0 - REST API + Documentation Restructure

27 Jan 05:22

Release Date: January 26, 2026
Zenodo DOI: 10.5281/zenodo.18318098

Major release adding a production-ready REST API (Phase 4) and a complete documentation reorganization.


🎉 What's New

REST API (Phase 4 Complete)

Production-ready REST API with 14 endpoints for programmatic data access:

  • Documents API - List, filter, paginate, sort documents with 9 filter options
  • Scorecard API - Country indicators, summary statistics, regional filtering
  • Tags API - Frequency analysis, version management, multi-dimensional filtering
  • Timeline API - Temporal analysis (tags over time, year × tag matrices)
  • Export API - CSV downloads with SPDX license headers
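To make the Timeline API's "year × tag matrix" concrete, here is a minimal sketch of how such a matrix could be tallied from tagged documents. The record layout (`year`, `tags`), the tag names, and the function name `year_tag_matrix` are assumptions for illustration only; they are not taken from the project's schema or code.

```python
from collections import Counter, defaultdict


def year_tag_matrix(documents):
    """Count how often each tag appears per year.

    `documents` is an iterable of dicts with hypothetical keys
    "year" (int) and "tags" (list of str).
    """
    matrix = defaultdict(Counter)
    for doc in documents:
        matrix[doc["year"]].update(doc["tags"])
    # Convert to plain dicts so the result is easy to serialize as JSON.
    return {year: dict(counts) for year, counts in sorted(matrix.items())}


# Made-up sample records, not real project data.
sample = [
    {"year": 2023, "tags": ["privacy", "child_rights"]},
    {"year": 2023, "tags": ["privacy"]},
    {"year": 2024, "tags": ["ai_policy"]},
]
matrix = year_tag_matrix(sample)
```

Each outer key is a year and each inner mapping gives tag counts for that year, which is one plausible shape for a "tags over time" response.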

Features:

  • Optional API key authentication with dynamic rate limiting (100-2000 req/hr)
  • Redis caching with 15min-1hr TTLs
  • Complete Docker deployment (docker-compose, Nginx, Redis)
  • 104 integration tests (100% pass rate, 100% endpoint coverage)
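The TTL behaviour behind the caching feature can be illustrated with a small in-memory stand-in. This is only a sketch of time-based expiry, assuming a 15-minute TTL from the range quoted above; it does not use Redis, and the class name `TTLCache` and the cache key are hypothetical.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be shown without sleeping
        self._store = {}

    def set(self, key, value):
        # Store the value together with its absolute expiry time.
        self._store[key] = (value, self._clock() + self.ttl_seconds)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return default
        return value


# Fake clock for demonstration: a mutable "current time" in seconds.
now = [0.0]
cache = TTLCache(ttl_seconds=15 * 60, clock=lambda: now[0])
cache.set("scorecard:summary", {"countries": 194})
fresh = cache.get("scorecard:summary")  # hit: still within the 15-minute TTL
now[0] = 15 * 60 + 1                     # advance the fake clock past the TTL
stale = cache.get("scorecard:summary")  # miss: the entry has expired
```

A shorter TTL keeps responses fresher at the cost of more recomputation, which is presumably why the release quotes a range rather than a single value.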

📖 API Documentation: https://grimdata.org/api/

Documentation Restructure

Complete reorganization with dedicated landing pages:

  • API Landing Page - Overview, features, quick start examples
  • Scorecard Landing Page - Design, methodology, data access guides
  • Projects Landing Page - LittleRainbowRights and SGBV-UPR overviews
  • Clean professional navigation throughout

Technical Improvements

  • Test Coverage: Expanded from 124 → 274 tests (170 pipeline + 104 API)
  • Codebase: Grew from 15,000+ → 21,000+ lines of Python
  • Dependencies: Updated Flask ecosystem to latest stable versions
  • CITATION.cff: Updated with REST API keywords, SGBV journal article DOI

📦 Installation

# Clone and install
git clone https://github.com/MissCrispenCakes/DigitalChild.git
cd DigitalChild
pip install -r requirements.txt

# Run API server
pip install -r api_requirements.txt
python run_api.py

# Access at http://localhost:5000
curl http://localhost:5000/api/health
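
The same health check can be scripted from Python with only the standard library. This is a hedged sketch: the `/api/health` path and port 5000 come from the commands above, while the helper names and the assumption that the endpoint replies with JSON are illustrative and not confirmed by these notes.

```python
import json
import urllib.error
import urllib.request


def health_url(base_url):
    """Build the health-check URL from a base URL such as http://localhost:5000."""
    return base_url.rstrip("/") + "/api/health"


def check_health(base_url="http://localhost:5000", timeout=5.0):
    """Return the parsed JSON body on HTTP 200, or None on any failure.

    Assumes the endpoint responds with JSON (not confirmed by the release notes).
    """
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            if resp.status != 200:
                return None
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Covers connection errors, HTTP error statuses, and non-JSON bodies.
        return None
```

With the server started via `python run_api.py`, calling `check_health()` should return a non-`None` value if the API is reachable and replies with JSON.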

🔧 Changed

  • Test coverage: 124 → 274 tests
  • Codebase: 15k → 21k+ lines
  • Documentation structure reorganized
  • API docs moved to docs/website/api/
  • Scorecard docs moved to docs/website/scorecard/
  • Navigation structure improved

🐛 Fixed

  • Navigation links consistency
  • Duplicate content removed
  • TOC integration in sidebar
  • 404 errors on Projects pages
  • Scorecard v1.0.0 release notes corrected

📊 Metrics

  • 14 API endpoints operational
  • 194 countries tracked
  • 10 indicators per country
  • 2,543 source URLs validated
  • 274 tests passing (100% success rate)
  • 21,000+ lines of code
  • 75+ documentation files

📖 Citation

@software{digitalchild2026,
  title = {DigitalChild: Human Rights Data Pipeline for Child and LGBTQ+ Digital Protection},
  author = {Vollmer, S.C. and Vollmer, D.T.},
  year = {2026},
  version = {2.0.0},
  url = {https://github.com/MissCrispenCakes/DigitalChild},
  doi = {10.5281/zenodo.18318098},
  note = {Available at: https://grimdata.org. ORCID: 0000-0002-3359-2810 (S.C. Vollmer)}
}

🙏 Acknowledgments

  • Python 3.12, Flask, BeautifulSoup4, Selenium, pandas, pypdf, pytest
  • Redis, Docker, Nginx for production infrastructure
  • GitHub Actions for CI/CD
  • MkDocs Material for documentation

Full Changelog: https://github.com/MissCrispenCakes/DigitalChild/blob/basecamp/CHANGELOG.md

DigitalChild v1.0.1 - Zenodo Archive Release

20 Jan 19:08

DigitalChild v1.0.0 - Initial Public Release

First stable release of the DigitalChild data pipeline for analyzing human rights documents with focus on child and LGBTQ+ digital protection.

🎯 What's Included

Core Pipeline

  • 7 automated scrapers - AU Policy, OHCHR, UPR, UNICEF, ACERWC, ACHPR, manual upload
  • Multi-format processing - PDF, DOCX, HTML document conversion
  • Versioned tagging system - 4 tag versions (v1, v2, v3, digital) with 20+ rights themes
  • Recommendations extraction - Regex-based extraction with versioning and history tracking
  • Timeline analysis - Global, by-country, and by-region temporal analysis
  • Comparison analytics - Side-by-side version comparison for tags and recommendations
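
As an illustration of the regex-based recommendations extraction listed above, the following sketch pulls recommendation-like clauses out of plain text. The pattern, the verb list, and the function name are assumptions chosen for the example; the project's actual patterns are not shown in these notes.

```python
import re

# Hypothetical pattern: text from a recommendation verb up to the next period.
RECOMMENDATION_RE = re.compile(
    r"\b(?:Recommends?|Urges?|Calls? (?:up)?on)\b[^.]*\.",
    re.IGNORECASE,
)


def extract_recommendations(text):
    """Return recommendation-like clauses found in `text`, whitespace-normalized."""
    return [" ".join(m.group(0).split()) for m in RECOMMENDATION_RE.finditer(text)]


# Made-up sample text, not taken from a real treaty-body document.
sample_text = (
    "The Committee notes recent progress. "
    "The Committee recommends that the State adopt a data protection law. "
    "It also urges the State to fund digital skills programmes."
)
recommendations = extract_recommendations(sample_text)
```

A pattern this simple will miss recommendations phrased without a trigger verb and will cut off at abbreviations containing periods, which is the kind of limitation that motivates the NLP-based extraction mentioned in the roadmap.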

Scorecard System

  • 194 countries tracked with 10 human rights indicators per country
  • 2,543 authoritative source URLs validated and monitored
  • Automated validation - URL checking, change detection, link rot monitoring
  • CSV exports - Summary tables, by-indicator breakdowns, regional analysis
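
One common way to implement change detection for monitored source URLs is to compare content fingerprints between runs. The sketch below assumes that approach; the function names, snapshot layout, and example URLs are hypothetical and not taken from the project.

```python
import hashlib


def content_fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest of the fetched page content."""
    return hashlib.sha256(content).hexdigest()


def detect_changes(previous, current):
    """Compare two {url: fingerprint} snapshots.

    Returns URLs whose content changed and URLs that disappeared
    from the current snapshot (a possible sign of link rot).
    """
    changed = sorted(
        url for url, fp in current.items()
        if url in previous and previous[url] != fp
    )
    missing = sorted(url for url in previous if url not in current)
    return {"changed": changed, "missing": missing}


# Made-up URLs and content for illustration only.
before = {
    "https://example.org/law": content_fingerprint(b"version 1"),
    "https://example.org/policy": content_fingerprint(b"stable"),
    "https://example.org/old": content_fingerprint(b"gone soon"),
}
after = {
    "https://example.org/law": content_fingerprint(b"version 2"),
    "https://example.org/policy": content_fingerprint(b"stable"),
}
report = detect_changes(before, after)
```

Hashing raw bytes flags any change, including cosmetic ones such as rotating banners, so a real monitor would likely normalize the content before fingerprinting.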

Quality & Testing

  • 124 tests - Comprehensive test coverage (scrapers, processors, validators, scorecard)
  • 68 validator tests - Input validation, path traversal protection, URL validation, file size limits
  • CI/CD pipeline - Automated testing with GitHub Actions
  • Pre-commit hooks - Code formatting (black, isort, flake8), markdown linting
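
Path traversal protection, one of the validator concerns listed above, usually comes down to resolving a user-supplied path and checking that it stays inside an allowed directory. The following is a minimal sketch of that check; `is_safe_path` and the example paths are hypothetical, not the project's validator API.

```python
from pathlib import Path


def is_safe_path(base_dir, user_path):
    """Return True if `user_path` resolves to a location inside `base_dir`."""
    base = Path(base_dir).resolve()
    # Joining then resolving collapses ".." segments and follows symlinks.
    # An absolute `user_path` replaces `base` entirely, so it is rejected below.
    candidate = (base / user_path).resolve()
    return candidate.is_relative_to(base)  # Path.is_relative_to needs Python 3.9+


ok = is_safe_path("data/raw", "reports/upr_2024.pdf")
escaped = is_safe_path("data/raw", "../../etc/passwd")
absolute = is_safe_path("data/raw", "/etc/passwd")
```

Resolving before comparing matters: a plain string prefix check would accept inputs such as `data/raw/../../etc/passwd`.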

Documentation

  • 25+ markdown files - Installation guides, API docs, standards, architecture
  • CLAUDE.md - Comprehensive development guide for AI assistants
  • Website deployment - Material for MkDocs with GitHub Pages integration

📊 Dataset Highlights

  • 10 indicators tracked per country:

    • AI Policy Status
    • Data Protection Law
    • LGBTQ Legal Status
    • Child Online Protection
    • Biometric SIM Registration
    • Digital Services Taxation
    • Internet Penetration
    • Mobile Coverage
    • Digital Skills Investment
    • Online Content Regulation
  • Data sources: UNESCO, UNCTAD, ILGA, UNICEF, national governments, treaty bodies

🔧 Technical Specifications

  • Language: Python 3.12
  • Key libraries: BeautifulSoup4, Selenium, pandas, PyPDF2, pytest
  • Lines of code: 15,000+ (Python, config, tests)
  • Export formats: CSV, JSON
  • License: MIT (code), CC BY 4.0 (data)

📝 Citation

@software{digitalchild2026,
  author = {Vollmer, S.C.},
  title = {DigitalChild: Human Rights Data Pipeline for Child and LGBTQ+ Digital Protection},
  year = {2026},
  version = {1.0.0},
  url = {https://github.com/MissCrispenCakes/DigitalChild},
  doi = {10.5281/zenodo.XXXXXXX}
}

🌍 Related Projects

  • GRIMdata.org - Main platform website
  • LittleRainbowRights.com - Child & LGBTQ+ digital rights project
  • SGBV-UPR Research - Precursor research on SGBV and Universal Periodic Review

⚠️ Known Limitations

  • Scorecard sources require periodic manual validation (some URLs change)
  • PDF extraction may have OCR limitations for scanned documents
  • Regional coverage currently strongest in Africa (global expansion planned)

🚀 What's Next (Phase 4)

  • Research dashboard with interactive visualizations
  • REST API for data access
  • NLP-based recommendations extraction
  • Global expansion (Europe, Asia, Americas)

Note: This is the first public release suitable for research, citation, and replication. Future versions will include dashboard features and expanded geographic coverage.

DigitalChild v1.0.0 - Initial Public Release

20 Jan 19:00

Release notes are identical to those published with v1.0.1 above.