This is the repository for the KDD'25 paper *CompressGNN: Accelerating Graph Neural Network Training via Hierarchical Compression*.
The code base is organized in the following directory structure:

```
.
├── benchmark
├── dataset
├── experiment
├── genData
├── install.sh
├── layer
├── LICENSE
├── loader
├── model
├── README.md
├── requirements.txt
├── src
├── test
└── third_party
```
To use this repository, first make sure the following prerequisites are available:

- Python 3.8.13
- pip 23.0.1
- CUDA 11.6
Then follow these steps:

1. Clone the repository.
2. Navigate to the project directory.
3. Install the required dependencies using pip:

```shell
pip install -r requirements.txt
```
This command will install all the necessary libraries and packages, including:
- numpy
- PyTorch
- Pytorch Geometric (PyG)
- Deep Graph Library (DGL)
- torch_scatter
- torch_sparse
- pybind
- ...
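To verify the installation, a small sanity check with the standard library's `importlib.metadata` can list which of these packages pip can see. The distribution names below are assumptions (for example, PyG is distributed as `torch-geometric`); adjust them to your environment:

```python
from importlib.metadata import version, PackageNotFoundError

def check_installed(packages):
    """Print the version of each installed package; return those not found."""
    missing = []
    for pkg in packages:
        try:
            print(f"{pkg}: {version(pkg)}")
        except PackageNotFoundError:
            missing.append(pkg)
    return missing

# Distribution names are assumptions, not taken from requirements.txt.
missing = check_installed(["numpy", "torch", "torch-geometric", "dgl",
                           "torch-scatter", "torch-sparse"])
if missing:
    print("missing:", missing)
```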
Finally, install CompressGNN:

```shell
bash install.sh
```
We have uploaded two small datasets, Cora and cnr-2000.
Users can generate datasets in our required format from WebGraph and PyTorch Geometric sources. Each dataset folder follows this layout:

```
.
├── csr_elist.npy
├── csr_vlist.npy
├── edge.npy
├── features.npy
├── labels.npy
├── test_mask.npy
├── train_mask.npy
└── val_mask.npy
```
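As a minimal sketch of this layout (the file names follow the tree above; the toy graph and array dtypes are assumptions, not the repository's own generator), the snippet below writes a tiny dataset in this format and checks the CSR invariants on load:

```python
import os
import tempfile
import numpy as np

def write_toy_dataset(folder):
    # Toy graph: 3 nodes with edges 0->1, 0->2, 1->2.
    csr_vlist = np.array([0, 2, 3, 3], dtype=np.int64)       # row pointers, len = num_nodes + 1
    csr_elist = np.array([1, 2, 2], dtype=np.int64)          # column indices, len = num_edges
    edge = np.array([[0, 0, 1], [1, 2, 2]], dtype=np.int64)  # the same edges in COO form
    features = np.random.rand(3, 4).astype(np.float32)       # one feature row per node
    labels = np.array([0, 1, 0], dtype=np.int64)
    train_mask = np.array([1, 0, 0], dtype=bool)
    val_mask = np.array([0, 1, 0], dtype=bool)
    test_mask = np.array([0, 0, 1], dtype=bool)
    for name, arr in [("csr_vlist", csr_vlist), ("csr_elist", csr_elist),
                      ("edge", edge), ("features", features), ("labels", labels),
                      ("train_mask", train_mask), ("val_mask", val_mask),
                      ("test_mask", test_mask)]:
        np.save(os.path.join(folder, f"{name}.npy"), arr)

def validate(folder):
    vlist = np.load(os.path.join(folder, "csr_vlist.npy"))
    elist = np.load(os.path.join(folder, "csr_elist.npy"))
    feats = np.load(os.path.join(folder, "features.npy"))
    num_nodes = len(vlist) - 1
    assert vlist[0] == 0 and vlist[-1] == len(elist)  # row pointers bracket the edge list
    assert np.all(np.diff(vlist) >= 0)                # row pointers are non-decreasing
    assert feats.shape[0] == num_nodes                # one feature row per node
    return num_nodes, len(elist)

with tempfile.TemporaryDirectory() as d:
    write_toy_dataset(d)
    print(validate(d))  # (3, 3)
```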
Create a PyTorch dataset (the last argument selects the COO or CSR edge layout):

```shell
cd genData
python createTorchDataset.py <input data folder> <output data folder> coo/csr
```
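As a hedged illustration of what the `coo`/`csr` choice means (this is not the repository's own conversion code), a COO edge list can be turned into CSR row pointers with NumPy like so:

```python
import numpy as np

def coo_to_csr(src, dst, num_nodes):
    """Convert a COO edge list (src[i] -> dst[i]) to CSR (vlist, elist)."""
    order = np.argsort(src, kind="stable")            # group edges by source node
    elist = np.asarray(dst)[order]                    # column indices per row
    counts = np.bincount(src, minlength=num_nodes)    # out-degree of each node
    vlist = np.concatenate(([0], np.cumsum(counts)))  # row pointers
    return vlist, elist

# Edges 0->1, 0->2, 1->2 on a 3-node graph:
vlist, elist = coo_to_csr(np.array([0, 0, 1]), np.array([1, 2, 2]), 3)
print(vlist.tolist(), elist.tolist())  # [0, 2, 3, 3] [1, 2, 2]
```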
Create a CompressGNN compressed dataset:

```shell
cd genData
python createCompressDataset.py <input data folder> <output data folder> coo/csr
```
Scale a dataset's feature dimension:

```shell
cd genData
python datagen_feature.py --data=xxx.pt --scale_factor=length --output=xxx.pt
```
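The exact semantics of `--scale_factor` belong to `datagen_feature.py`; as one possible sketch of feature-dimension scaling (an assumption for illustration, not the script's actual logic), tiling a feature matrix out to a target length could look like this:

```python
import numpy as np

def scale_features(feats, target_dim):
    """Tile (and truncate) node features along the feature axis to target_dim."""
    reps = -(-target_dim // feats.shape[1])       # ceiling division
    return np.tile(feats, (1, reps))[:, :target_dim]

x = np.arange(6, dtype=np.float32).reshape(2, 3)
print(scale_features(x, 5).shape)  # (2, 5)
```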
Or run the batch preprocessing scripts:

```shell
cd genData
bash preprocess.sh
bash preprocess_compress.sh
bash datagen_feature.sh
```
End-to-end training:

```shell
cd benchmark/end2end
bash run.sh
```
- Speedup:

  ```shell
  cd benchmark/propagate/propagate
  bash speedup.sh
  ```

- Performance with different feature dimensions:

  ```shell
  cd benchmark/propagate/propagate
  bash feature_scale.sh
  ```

- Peak memory:

  ```shell
  cd benchmark/propagate/peak_memory
  bash run.sh
  ```

- Time and accuracy:

  ```shell
  cd benchmark/transform/time_accu
  bash run.sh
  ```

- Time breakdown:

  ```shell
  cd benchmark/transform/time_breakdown
  bash run.sh
  ```