This repository provides a Bash script that automates extracting programming assignments downloaded from Moodle and analyzing them with MOSS (Measure Of Software Similarity).
Before running the script, ensure you have the following installed:
zip
7z
convmv
unrar
mktemp
find
basename
mv
mkdir
rmdir
perl
Additionally, you need to manually download the moss.pl script by registering with MOSS at https://theory.stanford.edu/~aiken/moss/.
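A quick way to confirm the tools listed above are available before running the script is sketched below. This is only an illustration: the helper name missing_tools is hypothetical and is not taken from moodle2moss.sh, which performs its own validation.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of moodle2moss.sh): print every command
# from the argument list that cannot be found on PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only when the name resolves to something runnable
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names are copied from the requirements list above.
missing=$(missing_tools zip 7z convmv unrar mktemp find basename mv mkdir rmdir perl)
if [ -n "$missing" ]; then
    printf 'Missing required tools:\n%s\n' "$missing" >&2
fi
```

If nothing is printed, all listed tools were found. The moss.pl script still has to be downloaded separately as described above.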
- Moodle assignment submissions ZIP file (from the “Download all submissions” option).
- Ensure “Download submissions in folders” is unchecked.
- Create exclusion lists:
  - exclude_c.txt: list of common files to exclude for C submissions.
  - exclude_java.txt: list of common files to exclude for Java submissions.
The folder structure should be something like:
moodle2moss
├── Group1.zip
├── Group2.zip
├── Group3.zip
├── exclude_c.txt
├── exclude_java.txt
├── moodle2moss.sh
└── moss.pl
Each Group<id>.zip file contains submissions for the same assignment. In this example, there are three files because submission links were provided separately by three different teachers. However, a single ZIP file is sufficient if all submissions are collected together.
The script performs the following steps:
- Validates required tools and files.
- Accepts a language argument (C or Java).
- Extracts all student submissions (including nested archives).
- Fixes filename encodings from Windows-1252 to UTF-8.
- Extracts relevant source files (applying exclusions) into _SRC_ folders.
- Compresses all _SRC_ folders into a single archive, src_submissions.zip.
- Invokes the moss.pl script to analyze similarities.
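The source-collection step can be pictured with the minimal sketch below. It is not the script's actual implementation: the function name collect_sources is hypothetical, and it assumes the exclusion list holds one exact file name per line and that source files are identified by extension alone. Unlike the real script, this sketch does not resolve name collisions, so a later file with the same name overwrites an earlier one.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: copy every *.EXT file under DIR into DIR/_SRC_,
# skipping file names listed (one per line, exact match) in EXCLUDE_FILE.
# Usage: collect_sources DIR EXT EXCLUDE_FILE
collect_sources() {
    local dir=$1 ext=$2 exclude=$3 src name
    mkdir -p "$dir/_SRC_"
    # Prune _SRC_ itself so files already collected are not picked up again.
    find "$dir" -path "$dir/_SRC_" -prune -o -type f -name "*.$ext" -print0 |
    while IFS= read -r -d '' src; do
        name=$(basename "$src")
        # Skip files whose name appears in the exclusion list.
        if [ -f "$exclude" ] && grep -Fxq -- "$name" "$exclude"; then
            continue
        fi
        cp -- "$src" "$dir/_SRC_/$name"
    done
}
```

For example, collect_sources submissions/some_student java exclude_java.txt would gather that student's Java files, where the directory name is a placeholder.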
%> bash moodle2moss.sh <C|Java>
Example:
%> bash moodle2moss.sh Java
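How the language argument might select the matching exclusion list is sketched below. The function name exclude_file_for is hypothetical, and the case-sensitive matching is an assumption; only the pairing of C with exclude_c.txt and Java with exclude_java.txt comes from this README.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: map the language argument to its exclusion list.
# Prints the file name on success; prints usage to stderr and returns 1
# for any other value (matching is assumed to be case-sensitive).
exclude_file_for() {
    case "$1" in
        C)    echo "exclude_c.txt" ;;
        Java) echo "exclude_java.txt" ;;
        *)    echo "Usage: bash moodle2moss.sh <C|Java>" >&2
              return 1 ;;
    esac
}
```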
- submissions/: directory with all extracted and processed student submissions.
- src_submissions.zip: archive containing all collected source files.
- A URL to view MOSS results will be printed at the end.
The script removes temporary files automatically. You may delete the submissions/ directory manually if desired.
- Ensure moss.pl is placed in the same directory as the script.
- Avoid spaces in folder and file names when possible.
- The script automatically handles name collisions and deeply nested archives.
This script is released under the MIT License.