Security Policy

Supported Versions

We release patches for security vulnerabilities for the following versions:

| Version | Supported |
| --- | --- |
| latest (basecamp branch) | ✅ |
| < 1.0 | ❌ |

Reporting a Vulnerability

The DigitalChild project takes security seriously. We appreciate your efforts to responsibly disclose your findings.

🔒 Please Do NOT

  • ❌ Open a public GitHub issue for security vulnerabilities
  • ❌ Post about the vulnerability publicly before we've had a chance to address it
  • ❌ Exploit the vulnerability beyond what is necessary to demonstrate it

✅ Please DO

1. Report Privately via GitHub Security Advisories

Use GitHub's built-in private vulnerability reporting:

  1. Go to the Security tab of this repository
  2. Click "Report a vulnerability"
  3. Fill out the vulnerability details form
  4. Submit privately - only repository maintainers will see it

Alternative: If you cannot use GitHub Security Advisories, open a discussion in the Security category.

2. Include in Your Report:

  • Description: Clear description of the vulnerability
  • Impact: What could an attacker do? What data is at risk?
  • Steps to Reproduce: Detailed steps to reproduce the issue
  • Proof of Concept: Code or commands demonstrating the vulnerability (if possible)
  • Suggested Fix: If you have ideas for how to fix it
  • Your Contact Info: So we can follow up with questions

Example Report:

Subject: [SECURITY] Path Traversal in File Upload

Description: The file upload functionality in processor X doesn't validate
file paths, allowing directory traversal attacks.

Impact: An attacker could read arbitrary files from the server, potentially
accessing sensitive data like API keys or scraped documents.

Steps to Reproduce:
1. Call processor.upload('../../../etc/passwd')
2. File is written outside intended directory
3. Contents can be read

Proof of Concept:
[Code snippet]

Suggested Fix: Validate and sanitize file paths using os.path.normpath()
and ensure they stay within the intended directory.
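
The suggested fix in the example report can be sketched roughly as follows. This is a minimal illustration, not the project's actual validator: `resolve_within` is a hypothetical helper name, and it uses `pathlib` resolution, which also follows symlinks, something a purely lexical normalization such as `os.path.normpath()` does not do.

```python
from pathlib import Path


def resolve_within(base_dir: str, user_path: str) -> Path:
    """Return the resolved path if it stays inside base_dir, else raise ValueError.

    Hypothetical helper for illustration; not part of the project's API.
    """
    base = Path(base_dir).resolve()
    # Joining and then resolving collapses ".." segments and follows symlinks.
    # An absolute user_path replaces base entirely and is rejected below.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes {base}: {user_path!r}") from None
    return candidate
```

With a check like this, a call such as `resolve_within("uploads", "../../../etc/passwd")` would raise instead of returning a path outside the upload directory.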

📧 Response Timeline

  • Within 48 hours: We'll acknowledge receipt of your report
  • Within 7 days: We'll provide an initial assessment and expected timeline
  • Within 30 days: We'll aim to release a fix (complex issues may take longer)

We'll keep you informed throughout the process.

🎉 After the Fix

Once the vulnerability is fixed:

  • We'll credit you in the fix announcement (unless you prefer to remain anonymous)
  • We'll publish a security advisory on GitHub
  • We'll update affected documentation

🛡️ Security Best Practices

If you're deploying or using this project, follow these security practices:

For Users

  1. Keep Updated: Always use the latest version from the basecamp branch
  2. Review Dependencies: Regularly update dependencies (pip install --upgrade -r requirements.txt)
  3. Validate Input: Don't trust user-provided URLs, file paths, or data
  4. Check Logs: Monitor logs for suspicious activity
  5. Secure Credentials: Never commit API keys or credentials to the repository

For Developers

  1. Input Validation: Always validate and sanitize user input

  2. Use Validators: Use the processors/validators.py module for:

    • URL validation (blocks malicious patterns)
    • Path validation (prevents traversal attacks)
    • File validation (checks size, extension)
  3. Avoid Eval: Never use eval() or exec() on untrusted input

  4. SQL Injection: Use parameterized queries (the project does not currently use SQL, but this applies if a database is ever added)

  5. Dependencies: Regularly check for vulnerable dependencies:

    pip install safety
    safety check
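
As a rough illustration of items 3 and 4 in the list above, the sketch below parses untrusted text with `ast.literal_eval()` instead of `eval()`, and binds a query value with a `?` placeholder instead of string formatting. The function names and the `documents` table are hypothetical and exist only for this example. Note that `ast.literal_eval()` is safer than `eval()`, but the Python documentation still cautions against feeding it very large or deeply nested input.

```python
import ast
import sqlite3


def parse_literal(text: str):
    """Parse a Python literal (number, string, list, dict, ...) without executing code."""
    # ast.literal_eval raises ValueError or SyntaxError for anything that is
    # not a plain literal, such as function calls or attribute access.
    return ast.literal_eval(text)


def find_titles(conn: sqlite3.Connection, country: str) -> list:
    """Look up rows with a parameterized query; table and columns are hypothetical."""
    # The "?" placeholder lets the driver bind the value, so the input is
    # never spliced into the SQL string.
    cursor = conn.execute(
        "SELECT title FROM documents WHERE country = ?", (country,)
    )
    return [row[0] for row in cursor.fetchall()]
```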

For Scrapers

When adding new scrapers:

  • ✅ Validate all URLs before requests
  • ✅ Set timeouts on all HTTP requests
  • ✅ Limit file sizes when downloading
  • ✅ Validate file types after download
  • ✅ Handle errors gracefully (don't expose stack traces)
  • ✅ Respect robots.txt and rate limits
  • ❌ Don't scrape without permission
  • ❌ Don't follow untrusted redirects blindly
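
Several items in this checklist (URL validation, timeouts, download size limits) could look something like the sketch below. It uses only the standard library; `is_allowed_url`, `read_limited`, `fetch`, and the 30-second timeout are assumptions made for illustration and are not taken from the project's `validators.py`.

```python
import ipaddress
from typing import BinaryIO
from urllib.parse import urlparse
from urllib.request import urlopen

MAX_BYTES = 100 * 1024 * 1024  # assumed cap, matching the 100 MB default in this policy
TIMEOUT_SECONDS = 30           # assumed value; tune per source


def is_allowed_url(url: str) -> bool:
    """Accept only http(s) URLs whose host is not a private or loopback IP literal."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        address = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        return True  # a DNS name, not an IP literal; DNS resolution is not checked here
    return not (address.is_private or address.is_loopback or address.is_link_local)


def read_limited(stream: BinaryIO, max_bytes: int = MAX_BYTES, chunk_size: int = 65536) -> bytes:
    """Read a stream in chunks and raise once more than max_bytes has arrived."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise ValueError(f"download exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(url: str) -> bytes:
    """Validate the URL, then download it with a timeout and a size cap."""
    if not is_allowed_url(url):
        raise ValueError(f"URL rejected: {url!r}")
    with urlopen(url, timeout=TIMEOUT_SECONDS) as response:
        return read_limited(response)
```

This sketch does not revalidate redirect targets: `urlopen()` follows redirects by default, so a production scraper would likely need its own redirect handling to satisfy the last item in the list.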

Data Security

This project handles sensitive human rights data:

  1. Access Control: Limit who can access scraped documents
  2. Transmission: Use HTTPS for all data transfers
  3. Storage: Be mindful of where data is stored (especially cloud services)
  4. Deletion: Follow data retention policies
  5. Privacy: See docs/DATA_GOVERNANCE.md for detailed policies

🔍 Known Security Considerations

Current Security Features

✅ Input Validation:

  • URL validation with malicious pattern blocking (validators.py)
  • Path traversal protection (validators.py)
  • File size limits (configurable, default 100 MB)
  • Extension whitelisting
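
To make the size and extension checks concrete, here is a minimal sketch of what such a file check might look like. It is not the code in `validators.py`: the function name `validate_file` and the extension list are assumptions, and only the 100 MB default comes from this document.

```python
from pathlib import Path

DEFAULT_MAX_BYTES = 100 * 1024 * 1024  # the 100 MB default stated in this policy
ALLOWED_EXTENSIONS = frozenset({".pdf", ".html", ".json"})  # assumed list, illustration only


def validate_file(path: str, max_bytes: int = DEFAULT_MAX_BYTES,
                  allowed_extensions: frozenset = ALLOWED_EXTENSIONS) -> Path:
    """Return the path if it is a regular file with an allowed extension and size."""
    file_path = Path(path)
    if not file_path.is_file():
        raise ValueError(f"not a regular file: {path!r}")
    # Compare extensions case-insensitively so "REPORT.PDF" is treated like "report.pdf".
    if file_path.suffix.lower() not in allowed_extensions:
        raise ValueError(f"extension not allowed: {file_path.suffix!r}")
    size = file_path.stat().st_size
    if size > max_bytes:
        raise ValueError(f"file is {size} bytes, limit is {max_bytes}")
    return file_path
```

An extension check only looks at the file name; verifying the actual content type after download would need a separate step.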

✅ Dependency Management:

  • Requirements pinned in requirements.txt
  • Pre-commit hooks for code quality

✅ Code Quality:

  • Automated testing (124 tests)
  • Linting with flake8
  • Type checking encouraged

Areas for Improvement

  • ⚠️ Authentication: No authentication system (not needed for the current use case)
  • ⚠️ Rate Limiting: Basic timeout, but no sophisticated rate limiting
  • ⚠️ Secrets Management: Use environment variables for any API keys
  • ⚠️ Logging: Ensure no sensitive data appears in logs
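
The secrets and logging points could be approached along the lines of the sketch below. Everything named here is hypothetical: the `DIGITALCHILD_API_KEY` variable, the `get_api_key` helper, and the `RedactSecrets` filter are illustrations, and the redaction pattern only catches simple `key=value` forms.

```python
import logging
import os
import re


def get_api_key(name: str = "DIGITALCHILD_API_KEY") -> str:
    """Read a secret from the environment; the variable name is a made-up example."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


class RedactSecrets(logging.Filter):
    """Mask values that look like 'api_key=...', 'token=...' or 'password=...'."""

    _pattern = re.compile(r"(?i)\b(api[_-]?key|token|password)=\S+")

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as log arguments are covered too.
        record.msg = self._pattern.sub(r"\1=[REDACTED]", record.getMessage())
        record.args = None
        return True  # keep the record, now with masked values
```

Attaching the filter to a handler (`handler.addFilter(RedactSecrets())`) rather than to a single logger means records propagated from child loggers pass through it as well.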

🚨 Security Incidents

If we discover a security incident:

  1. We'll notify affected users immediately
  2. We'll publish a post-mortem
  3. We'll implement measures to prevent recurrence

📚 Security Resources

🙏 Thank You

We appreciate security researchers and users who help keep this project secure. Responsible disclosure benefits everyone in the human rights research community.


Last updated: January 2026