Dataset and Jupyter notebooks to reproduce the work "Maximizing Efficiency of Dataset Compression for Machine Learning Potentials With Information Theory" by B. Yu et al. The raw dataset is hosted on Zenodo under DOI 10.5281/zenodo.17536234. This repository contains the Jupyter notebooks used to analyze and plot the data and to generate the figures in the main publication.
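
As a convenience, the sketch below shows one possible way to list the files in the Zenodo record before downloading them. It is not part of the original notebooks. The helper names (`zenodo_record_id`, `zenodo_api_url`, `list_files`) are hypothetical, and the code assumes that the Zenodo records API returns a JSON object with a `files` list whose entries carry a `key` field holding the file name.

```python
import json
import urllib.request

DOI = "10.5281/zenodo.17536234"  # DOI quoted in this README


def zenodo_record_id(doi):
    """Return the numeric record ID from a DOI of the form 10.5281/zenodo.<id>."""
    prefix = "10.5281/zenodo."
    if not doi.startswith(prefix):
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    return doi[len(prefix):]


def zenodo_api_url(doi):
    """Build the records API URL for a Zenodo DOI."""
    return f"https://zenodo.org/api/records/{zenodo_record_id(doi)}"


def list_files(doi):
    """Fetch the record metadata (needs network access) and return file names."""
    with urllib.request.urlopen(zenodo_api_url(doi)) as resp:
        record = json.load(resp)
    # Assumption: the response has a "files" list whose entries
    # store the file name under "key".
    return [entry["key"] for entry in record.get("files", [])]


if __name__ == "__main__":
    for name in list_files(DOI):
        print(name)
```

The file names and layout of the dataset are not stated here, so the sketch only prints whatever the record reports; the DOI page itself remains the authoritative place to download the data.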