Feature: Content Moderation System Backend #128

@jfrank-summit

Description

Summary

Implement backend API services for a content moderation system that allows users to flag inappropriate content and prevents caching and serving of blocked CIDs in the Auto-Files Gateway. User interface components will be implemented separately in the Auto-Drive Gateway and the Auto-Drive frontend. See auto-drive issue 435 for details on the frontend.

Problem Description

Currently, the Auto-Files Gateway has no mechanism for handling inappropriate or illegal content. Once content is cached by the file-retriever service, it remains accessible until the cache expires, with no capability for immediate removal. This presents potential legal and operational risks.

Proposed Solution

Add a content flagging system with manual review workflow:

  1. Flagging API: POST /files/:cid/flag endpoint to accept user reports
  2. Review Queue: Store flagged content for admin review
  3. Admin Review Interface: Endpoints for reviewing and approving/rejecting flags
  4. Blocklist Management: Maintain a list of admin-approved blocked CIDs
  5. Cache Integration: Check blocklist before serving/caching files

Technical Approach (Manual Review MVP)

Scope: Backend API Services Only

This implementation focuses on API endpoints only. The user interface for flagging content will be implemented separately in the Auto-Drive Gateway UI.

  • Review Queue Storage: JSON file or database storing pending flagged content with metadata
  • Blocklist Storage: Separate JSON file or database storing admin-approved blocked CIDs
  • Integration: Check blocklist in fileComposer.get() and file controller
  • Error Response: HTTP 451 "Unavailable For Legal Reasons" for blocked content
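A minimal sketch of the blocklist check and the HTTP 451 error response described above. The `Blocklist` class and `BlockedContentError` are hypothetical names; per the issue, the real integration points are `fileComposer.get()` and the file controller.

```typescript
// In-memory blocklist consulted before serving or caching a file.
// (Persistent storage would be the JSON file or database described above.)

class BlockedContentError extends Error {
  readonly status = 451; // HTTP 451 Unavailable For Legal Reasons
  constructor(cid: string) {
    super(`Content ${cid} is unavailable for legal reasons`);
  }
}

class Blocklist {
  private cids = new Set<string>();

  add(cid: string): void {
    this.cids.add(cid);
  }

  remove(cid: string): void {
    this.cids.delete(cid);
  }

  has(cid: string): boolean {
    return this.cids.has(cid);
  }

  // Guard to call at the top of fileComposer.get() and the file controller:
  // throws with status 451 when the CID has been blocked by an admin.
  assertNotBlocked(cid: string): void {
    if (this.cids.has(cid)) throw new BlockedContentError(cid);
  }
}
```

A caller that catches `BlockedContentError` can map `error.status` directly onto the HTTP response code.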

New API Endpoints

// User flagging
POST /files/:cid/flag          // Flag content for review (authenticated?)

// Admin review queue
GET /admin/flags/pending       // List pending flagged content (admin)
POST /admin/flags/:reportId/approve   // Approve flag → add to blocklist (admin)
POST /admin/flags/:reportId/reject    // Reject flag → keep content active (admin)

// Admin blocklist management
GET /admin/blocklist           // List blocked content (admin)
DELETE /admin/blocklist/:cid   // Remove from blocklist (admin)
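The request and report shapes behind these endpoints could look like the following sketch. All field names here are illustrative assumptions, not a committed API contract.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical payload for POST /files/:cid/flag.
type FlagReason = "illegal" | "abusive" | "copyright" | "other";

interface FlagRequest {
  reason: FlagReason;
  description?: string;
}

// Hypothetical record stored in the review queue and listed by
// GET /admin/flags/pending.
interface FlagReport {
  reportId: string;
  cid: string;
  reason: FlagReason;
  description?: string;
  reporterIp: string;
  status: "pending" | "approved" | "rejected";
  createdAt: string; // ISO 8601 timestamp
}

// Build a pending report from an incoming flag request.
function createReport(cid: string, req: FlagRequest, ip: string): FlagReport {
  return {
    reportId: randomUUID(),
    cid,
    reason: req.reason,
    description: req.description,
    reporterIp: ip,
    status: "pending",
    createdAt: new Date().toISOString(),
  };
}
```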

Configuration

ENABLE_CONTENT_FLAGGING=true
BLOCKLIST_PATH=./.cache/blocklist.json
FLAGGING_QUEUE_PATH=./.cache/flagging-queue.json
ADMIN_API_KEY=admin_secret_key
MAX_REPORTS_PER_IP=10
MAX_REVIEW_QUEUE_SIZE=1000
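Loading this configuration from the environment might look like the sketch below; the variable names match the example above, and the fallback defaults are assumptions.

```typescript
interface ModerationConfig {
  enableContentFlagging: boolean;
  blocklistPath: string;
  flaggingQueuePath: string;
  adminApiKey: string;
  maxReportsPerIp: number;
  maxReviewQueueSize: number;
}

// Parse the moderation settings, falling back to the defaults shown in the
// configuration example above.
function loadConfig(env: Record<string, string | undefined> = process.env): ModerationConfig {
  return {
    enableContentFlagging: env.ENABLE_CONTENT_FLAGGING === "true",
    blocklistPath: env.BLOCKLIST_PATH ?? "./.cache/blocklist.json",
    flaggingQueuePath: env.FLAGGING_QUEUE_PATH ?? "./.cache/flagging-queue.json",
    adminApiKey: env.ADMIN_API_KEY ?? "",
    maxReportsPerIp: Number(env.MAX_REPORTS_PER_IP ?? 10),
    maxReviewQueueSize: Number(env.MAX_REVIEW_QUEUE_SIZE ?? 1000),
  };
}
```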

Integration Points

  1. Cache Layer: Modify fileComposer.get() to check blocklist first
  2. File Controller: Add blocklist check before serving files
  3. Cache Eviction: Remove flagged content from existing cache
  4. Authentication: Reuse existing auth middleware + new admin auth

UI Integration with Auto-Drive

While this repository provides the backend API services only, the user-facing flagging interface will be implemented in the Auto-Drive Gateway, which will provide:

  • File Preview Interface: View files with integrated flagging controls
  • Flagging Form: User-friendly form for reporting inappropriate content
  • Admin Dashboard: Web interface for reviewing flagged content and managing blocklist
  • Integration: JavaScript client to call the flagging APIs from this service

Manual Review Workflow

User Flags Content

  1. User calls POST /files/:cid/flag with reason and description
  2. System validates CID exists and user hasn't exceeded rate limits
  3. Flag added to review queue (not immediately blocked)
  4. User receives confirmation with report ID and estimated review time
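Step 2 above can be sketched as a per-IP rate limiter driven by `MAX_REPORTS_PER_IP`; the fixed-window approach and the 24-hour window are assumptions.

```typescript
// Fixed-window per-IP rate limiter for the flagging endpoint.
class IpRateLimiter {
  private counts = new Map<string, { count: number; windowStart: number }>();

  constructor(
    private readonly maxReports: number, // MAX_REPORTS_PER_IP
    private readonly windowMs: number = 24 * 60 * 60 * 1000, // assumed 24h window
  ) {}

  // Returns true if the report is allowed, and records it; returns false
  // once the IP has exhausted its quota for the current window.
  allow(ip: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(ip);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(ip, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.maxReports) return false;
    entry.count += 1;
    return true;
  }
}
```

A request rejected here would return an HTTP 429 before any report is queued.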

Admin Reviews Content

  1. Admin calls GET /admin/flags/pending to see flagged content
  2. Admin examines content, reason, and any supporting information
  3. Admin makes decision: approve (block content) or reject (keep active)
  4. System logs decision with admin ID and reasoning

Content Blocked/Unblocked

  1. If approved: CID added to blocklist, removed from cache if present
  2. If rejected: Flag marked as resolved, content remains accessible
  3. Audit trail maintained for all decisions
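The approve/reject outcomes above can be sketched as a single review step. The queue, blocklist, and cache are modeled here as plain in-memory collections; real storage would be the JSON files or database described earlier, and all type names are illustrative.

```typescript
type Decision = "approved" | "rejected";

interface PendingFlag {
  reportId: string;
  cid: string;
  status: "pending" | Decision;
}

interface AuditEntry {
  reportId: string;
  adminId: string;
  decision: Decision;
  reasoning: string;
  at: string;
}

// Apply an admin decision: approval blocks the CID and evicts any cached
// copy; rejection only resolves the flag. Both outcomes are audited.
function reviewFlag(
  flag: PendingFlag,
  decision: Decision,
  adminId: string,
  reasoning: string,
  blocklist: Set<string>,
  cache: Map<string, Uint8Array>,
  audit: AuditEntry[],
): void {
  flag.status = decision;
  if (decision === "approved") {
    blocklist.add(flag.cid); // block the CID going forward
    cache.delete(flag.cid);  // remove it from the existing cache immediately
  }
  audit.push({
    reportId: flag.reportId,
    adminId,
    decision,
    reasoning,
    at: new Date().toISOString(),
  });
}
```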

Security Considerations

  • Rate limiting on flagging endpoint (prevent spam flagging)
  • Admin-only review and blocklist management
  • IP-based abuse prevention
  • Audit logging for all moderation actions
  • Separate admin authentication key
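For the separate admin key, one reasonable sketch is a constant-time comparison against `ADMIN_API_KEY` so that string comparison timing cannot leak the key; the function name is illustrative.

```typescript
import { timingSafeEqual } from "node:crypto";

// Compare a presented admin key against the configured ADMIN_API_KEY in
// constant time. timingSafeEqual requires equal-length inputs, so a length
// mismatch is an immediate failure.
function isValidAdminKey(presented: string, expected: string): boolean {
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```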

Alternative Approaches Considered

  1. Immediate Blocking: Faster response but high abuse risk (rejected)
  2. Threshold-Based: Multiple flags required before action
  3. External Service: Better for multi-instance deployments but adds infrastructure

The manual review approach was selected to balance content safety with protection against abuse while maintaining simplicity for the MVP.

Metadata

Labels: enhancement (New feature or request)