This repository automatically monitors the LSR (LilyPond Snippet Repository) download page and downloads any new or updated files on a schedule (every 6 hours with the current setting).
The system tracks the following types of files from the LSR download page:
- Database dumps (lsr-*.mysqldump.gz) - Full database snapshots
- Snippet collections (lsr-snippets-*.tar.gz) - All snippets as individual files
- Documentation (lsr-snippets-docs-*.tar.gz) - Documentation for snippets
- Source code (lsr-*-src.tar.gz) - GPL source code of LSR
- Frequent monitoring: GitHub Actions runs the monitor script every 6 hours
- Change detection: Compares file sizes, modification dates, and MD5 hashes to detect changes
- Smart downloading: Only downloads files that are new or have changed
- SSL handling: Certificate verification is currently bypassed because the LSR site's certificate has expired; it can be re-enabled through the VERIFY_SSL setting
- Automatic commits: Commits any new or updated files to the repository
- Artifact storage: Files are also stored as GitHub Actions artifacts
- Metadata tracking: Maintains a downloads/metadata.json file with download history
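As a rough illustration of the change-detection step, the sketch below compares a remote file's listed size and modification date against a stored record. The record layout and the names `RemoteFile`, `load_metadata`, and `needs_download` are assumptions made for this example; the actual structure of downloads/metadata.json in lsr_monitor.py may differ.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RemoteFile:
    """Hypothetical description of one entry on the download page."""
    name: str
    size: int       # size in bytes as listed by the server
    modified: str   # modification date string as listed by the server


def load_metadata(path: Path) -> dict:
    """Return the stored metadata, or an empty dict on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def needs_download(remote: RemoteFile, metadata: dict) -> bool:
    """A file is fetched if it is unknown or its size or date changed."""
    known = metadata.get(remote.name)
    if known is None:
        return True
    return known.get("size") != remote.size or known.get("modified") != remote.modified
```

A file that passes this check would be downloaded and its record in the metadata file overwritten with the new size and date.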
.
├── .github/workflows/lsr-monitor.yml # GitHub Actions workflow
├── lsr_monitor.py # Python monitoring script
├── requirements.txt # Python dependencies
├── downloads/ # Downloaded files directory
│ ├── metadata.json # Tracking metadata
│ ├── lsr-YYYY-MM-DD.mysqldump.gz # Database dumps
│ ├── lsr-snippets-all-*.tar.gz # Snippet collections
│ └── ... # Other LSR files
└── README.md # This file
- Create a new GitHub repository for this project
- Add the files to your repository:
  - Copy the .github/workflows/lsr-monitor.yml file
  - Copy the lsr_monitor.py file
  - Copy the requirements.txt file
  - Copy this README.md file
- Enable GitHub Actions (usually enabled by default)
- Manual trigger (optional): You can manually trigger the workflow from the Actions tab
Edit the cron expression in .github/workflows/lsr-monitor.yml:
schedule:
  - cron: '0 */6 * * *' # Every 6 hours (current setting)

Common schedules:
- '0 6 * * *' - Daily at 6 AM UTC
- '0 12 * * *' - Daily at noon UTC
- '0 6 * * 1' - Weekly on Mondays at 6 AM UTC
The target website currently has SSL certificate issues. You can control SSL verification:
VERIFY_SSL = os.getenv('VERIFY_SSL', 'false').lower() == 'true'

Or set the environment variable in the GitHub Actions workflow:

env:
  VERIFY_SSL: 'false' # Disable SSL verification

Modify the parse_file_list() function in lsr_monitor.py to filter files by name patterns.
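One way to filter by name pattern is the standard library's fnmatch module. This is only a sketch: the real signature of parse_file_list() is not shown in this README, so `filter_files` and `WANTED_PATTERNS` are hypothetical names, and the two patterns are taken from the file types listed at the top.

```python
from fnmatch import fnmatch

# Glob patterns for the files to keep (hypothetical name).
# Add or remove entries to change which files are downloaded.
WANTED_PATTERNS = (
    "lsr-*.mysqldump.gz",
    "lsr-snippets-*.tar.gz",
)


def filter_files(names, patterns=WANTED_PATTERNS):
    """Keep only the names that match at least one glob pattern."""
    return [n for n in names if any(fnmatch(n, p) for p in patterns)]
```

Note that lsr-snippets-*.tar.gz also matches the lsr-snippets-docs-*.tar.gz archives, so those would need to be excluded explicitly if only the snippet collections are wanted.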
To change where downloaded files are stored, update the DOWNLOADS_DIR variable in lsr_monitor.py.
- Check the Actions tab in your GitHub repository to see workflow runs
- Each run shows detailed logs of what files were checked and downloaded
- Failed runs will show error messages for troubleshooting
You can test the script locally:
# Install dependencies
pip install -r requirements.txt
# Run the monitor script
python lsr_monitor.py
# Or with SSL verification enabled
VERIFY_SSL=true python lsr_monitor.py

The LSR files can be quite large:
- Database dumps: ~7MB each
- Snippet collections: ~2-3MB each
- These accumulate over time as new versions are released
Consider the GitHub repository size limits and clean up old files periodically if needed.
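Periodic cleanup could be scripted along these lines. This is a sketch under assumptions: `prune_old_files` and its `keep` parameter are hypothetical, ordering is by file name on the assumption that names embed a sortable date (as in lsr-YYYY-MM-DD.mysqldump.gz), and downloads/metadata.json is not updated.

```python
from pathlib import Path


def prune_old_files(directory: Path, pattern: str, keep: int = 3) -> list:
    """Delete all but the `keep` newest files matching `pattern`.

    "Newest" is decided by file name, assuming names contain an
    ISO-style date so that lexicographic order equals date order.
    Returns the list of removed paths.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    matches = sorted(directory.glob(pattern), key=lambda p: p.name)
    stale = matches[:-keep]  # everything except the last `keep` entries
    for path in stale:
        path.unlink()
    return stale
```

For example, `prune_old_files(Path("downloads"), "lsr-*.mysqldump.gz", keep=3)` would leave the three most recent database dumps. Deleting files from the working tree does not remove them from Git history, so this limits the size of a checkout rather than the total repository size.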
Workflow not running?
- Check that Actions are enabled in repository settings
- Verify the YAML syntax in the workflow file
Downloads failing?
- Check if the LSR website is accessible
- Review error logs in the Actions tab
- The script includes retry logic for temporary network issues
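The retry logic itself is not shown in this README. A common pattern for temporary network issues is retrying with exponential backoff, sketched below; the helper name `with_retries` and its parameters are hypothetical and not taken from lsr_monitor.py.

```python
import time


def with_retries(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `action()` until it succeeds or `attempts` are used up.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries
    and re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except OSError:
            # Connection and timeout errors are OSError subclasses.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use it can be left at its default.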
Files not being detected as changed?
- The script relies on modification dates and file sizes
- If the LSR site doesn't update these properly, files might be missed
- Consider adding hash-based comparison for more reliable detection
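A hash-based check could be sketched with hashlib as follows. The function name `file_digest` is hypothetical, and SHA-256 is an assumption chosen for the example; passing "md5" selects MD5 instead.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file.

    The file is read in chunks so that large archives are never
    loaded into memory all at once.
    """
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing hashes requires downloading the file first, so this approach trades bandwidth for reliability compared with checking only sizes and dates.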
Feel free to improve the monitoring script or add features like:
- Email notifications on changes
- Slack/Discord webhooks
- File format validation
- Automatic extraction and indexing of snippet content