Official code for NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise, accepted at NeurIPS 2024. NoisyGL is a comprehensive benchmark for Graph Neural Networks under label noise (GLN). GLN is a family of robust Graph Neural Network (GNN) models designed to maintain performance in the presence of label noise.
NoisyGL provides a fair and comprehensive platform to evaluate existing LLN (learning with label noise) and GLN works and to facilitate future GLN research.
NoisyGL offers the following features:
- A unified data loader module for diverse datasets. You can customize the configuration file of the dataset (located in config/_dataset) to modify data splitting and preprocessing strategies.
- Generic noise injection schemes. These schemes (utils.labelnoise), widely used in previous studies, can comprehensively evaluate the robustness of each method.
- Generic Base_predictor class. NoisyGL provides a generic implementation template and API for different GLN predictors (predictors.Base_predictor). You can develop your methods by overriding specific methods.
- Integrated hyperparameter optimization tool. NoisyGL integrates Neural Network Intelligence (NNI) provided by Microsoft (hyperparam_opt.py). You can easily optimize and update hyperparameters for each method based on the instructions in the README.
These features give you both convenience and freedom when using the library. You can modify the implementation details of specific methods, or add new modules to implement your own methods within the framework.
Note: NoisyGL depends on PyTorch, PyTorch Geometric, PyTorch Sparse and PyTorch Cluster. To streamline the installation, NoisyGL does NOT install these libraries for you. Please install them yourself before running NoisyGL.
- Python 3.11+
- torch>=2.1.0
- pyg>=2.5.0
- torch_sparse>=0.6.18
- torch_cluster>=1.6.2
- pandas
- scipy
- scikit-learn
- ruamel
- ruamel.yaml
- nni
- matplotlib
- numpy
- xlsxwriter
python total_exp.py --runs 10 --methods gcn gin --datasets cora citeseer pubmed --noise_type clean uniform pair --noise_rate 0.1 0.2 --device cuda:0 --seed 3000
Running the command above tests two methods, 'gcn' and 'gin', on three datasets, 'cora', 'citeseer', and 'pubmed', under different types and rates of label noise. Each experiment runs 10 times, and the overall results are saved under ./log, named by the current timestamp. You can customize the combination of methods, datasets, noise types, and noise rates by changing the corresponding arguments.
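To see how many individual runs such a sweep expands to, the grid can be enumerated with `itertools.product`. This is only a sketch built from the `single_exp.py` flags shown in this README; `build_commands` is a hypothetical helper, and the handling of `clean` (one run with rate 0 rather than one per noise rate) is an assumption, since how `total_exp.py` treats it is not stated here.

```python
from itertools import product


def build_commands(methods, datasets, noise_types, noise_rates,
                   device="cuda:0", seed=3000):
    """Expand a sweep into one single_exp.py command per configuration."""
    commands = []
    for method, data, noise_type in product(methods, datasets, noise_types):
        # Assumption: a clean run has no meaningful noise rate,
        # so it is emitted once with rate 0.0.
        rates = [0.0] if noise_type == "clean" else noise_rates
        for rate in rates:
            commands.append(
                f"python single_exp.py --method {method} --data {data} "
                f"--noise_type {noise_type} --noise_rate {rate} "
                f"--device {device} --seed {seed}"
            )
    return commands


cmds = build_commands(
    methods=["gcn", "gin"],
    datasets=["cora", "citeseer", "pubmed"],
    noise_types=["clean", "uniform", "pair"],
    noise_rates=[0.1, 0.2],
)
print(len(cmds))  # 30 configurations under the assumption above
print(cmds[0])
```

Each configuration would then be repeated `--runs` times, so the sweep above amounts to 300 training runs under this assumption.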
python single_exp.py --method gcn --data cora --noise_type uniform --noise_rate 0.1 --device cuda:0 --seed 3000
This command runs a single experiment in debug mode. Detailed experiment information is printed to the terminal, which helps you locate problems.
When designing your customized predictor, you can add code blocks that only execute in debug mode in the following way:
if self.conf.training['debug']:
    print("break point")
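As a slightly fuller illustration of the same pattern, here is a self-contained sketch that uses a stand-in config object. `ToyPredictor`, its `train` method, and the `debug_log` attribute are hypothetical names for illustration only; they are not the `Base_predictor` API. The only part taken from NoisyGL is the `self.conf.training['debug']` check.

```python
from types import SimpleNamespace


class ToyPredictor:
    """Hypothetical predictor that only records diagnostics in debug mode."""

    def __init__(self, conf):
        self.conf = conf
        self.debug_log = []

    def train(self, losses):
        """Pretend training loop: track the best loss over the epochs."""
        best = float("inf")
        for epoch, loss in enumerate(losses):
            best = min(best, loss)
            # Extra bookkeeping runs only when debug mode is on.
            if self.conf.training['debug']:
                self.debug_log.append(f"epoch {epoch}: loss={loss:.4f}")
        return best


# Stand-in for the config object loaded from a YAML file.
conf = SimpleNamespace(training={'debug': True})
predictor = ToyPredictor(conf)
best = predictor.train([0.9, 0.5, 0.7])
print(best, predictor.debug_log)
```

Keeping diagnostics behind the flag means batch sweeps stay quiet while single debug runs remain verbose.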
python hyperparam_opt.py --method gcn --data cora --noise_type uniform --noise_rate 0.1 --device cuda:0 --max_trial_number 20 --trial_concurrency 4 --port 8081 --update_config True
Running the command above starts an NNI manager on http://localhost:8081, which then automatically runs 20 HPO trials. Each trial calls 'single_exp.py' with different hyperparameters. After all HPO trials have finished, a new config file with the optimized hyperparameters overwrites the original one at "./config/gcn/gcn_cora.yaml". You can optimize hyperparameters for different methods on various datasets and noise types by changing the corresponding arguments.
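NNI describes the hyperparameters to explore as a JSON search space, where each entry has a `_type` (such as `choice`, `uniform`, or `loguniform`) and a `_value`. The sketch below shows what such a space might look like for a GCN and draws 20 random configurations from it as a stand-in for the tuner. The hyperparameter names and ranges are assumptions for illustration, not the values shipped with NoisyGL, and the random sampler is not NNI's tuning algorithm.

```python
import json
import math
import random

# Hypothetical search space in NNI's JSON format (names and ranges assumed).
search_space = {
    "lr": {"_type": "loguniform", "_value": [1e-4, 1e-1]},
    "weight_decay": {"_type": "loguniform", "_value": [1e-5, 1e-2]},
    "dropout": {"_type": "uniform", "_value": [0.0, 0.8]},
    "n_hidden": {"_type": "choice", "_value": [16, 32, 64, 128]},
}


def sample(space, rng):
    """Draw one configuration; a plain random-search stand-in for a tuner."""
    params = {}
    for name, spec in space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            params[name] = rng.choice(value)
        elif kind == "uniform":
            params[name] = rng.uniform(value[0], value[1])
        elif kind == "loguniform":
            # Uniform in log space, then mapped back.
            low, high = math.log(value[0]), math.log(value[1])
            params[name] = math.exp(rng.uniform(low, high))
        else:
            raise ValueError(f"unsupported type: {kind}")
    return params


rng = random.Random(0)
trials = [sample(search_space, rng) for _ in range(20)]  # cf. --max_trial_number 20
print(json.dumps(search_space, indent=2))
print(trials[0])
```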
Methods available: gcn, smodel, forward, backward, coteaching, sce, jocor, apl, dgnn, cp, nrgnn, unionnet, rtgnn, clnode, cgnn, pignn, rncgln, crgnn, lcat
Datasets available: cora, citeseer, pubmed, amazoncom, amazonpho, dblp, blogcatalog, flickr, amazon-ratings, roman-empire
Dataset | # Nodes | # Edges | # Feat. | # Classes | Homophily | Avg. degree |
---|---|---|---|---|---|---|
Cora | 2,708 | 5,278 | 1,433 | 7 | 0.81 | 3.90 |
Citeseer | 3,327 | 4,552 | 3,703 | 6 | 0.74 | 2.74 |
Pubmed | 19,717 | 44,324 | 500 | 3 | 0.80 | 4.50 |
Amazon-Computers | 13,752 | 491,722 | 767 | 10 | 0.78 | 35.8 |
Amazon-Photos | 7,650 | 238,162 | 745 | 8 | 0.83 | 31.1 |
DBLP | 17,716 | 105,734 | 1,639 | 4 | 0.83 | 5.97 |
BlogCatalog | 5,196 | 343,486 | 8,189 | 6 | 0.40 | 66.1 |
Flickr | 7,575 | 239,738 | 12,047 | 9 | 0.24 | 63.3 |
Amazon-ratings | 24,492 | 93,050 | 300 | 5 | 0.38 | 7.60 |
Roman-empire | 22,662 | 32,927 | 300 | 18 | 0.05 | 2.90 |
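The two derived columns can be reproduced from a graph's edges and labels. The sketch below assumes the homophily column is the edge homophily ratio (the fraction of edges whose endpoints share a label) and that each undirected edge is listed once, so the average degree is 2E/N. `graph_stats` is a hypothetical helper, not part of NoisyGL. For Cora this gives 2 × 5,278 / 2,708 ≈ 3.90, matching the table. Some rows instead match E/N (for example, Amazon-Computers: 491,722 / 13,752 ≈ 35.8), which suggests those edge counts already include both directions.

```python
def graph_stats(num_nodes, edges, labels):
    """Return (average degree, edge homophily) for an undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    labels: labels[i] is the class of node i.
    """
    # Every undirected edge contributes to the degree of two nodes.
    avg_degree = 2 * len(edges) / num_nodes
    same_label = sum(1 for u, v in edges if labels[u] == labels[v])
    homophily = same_label / len(edges)
    return avg_degree, homophily


# Toy graph: 4 nodes, 5 edges, two classes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
labels = [0, 0, 1, 1]
avg_degree, homophily = graph_stats(4, edges, labels)
print(avg_degree, homophily)  # 2.5 0.4

# Average degree of Cora from the node and edge counts in the table.
cora_avg_degree = 2 * 5278 / 2708
print(round(cora_avg_degree, 2))  # 3.9
```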
Noise types: clean, pair, uniform, random (new)
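For readers new to these noise models, the sketch below shows one common formulation of uniform and pair noise. It is a simplified illustration under stated assumptions, not the code in `utils.labelnoise`: the function names are hypothetical, uniform noise flips a label to any other class with equal probability, and pair noise flips it to the next class index. The pairing rule in particular is an assumption.

```python
import random


def inject_uniform_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label with probability noise_rate to a different class,
    chosen uniformly among the other num_classes - 1 classes."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # An offset in [1, num_classes - 1] guarantees a different class.
            offset = rng.randrange(1, num_classes)
            noisy.append((y + offset) % num_classes)
        else:
            noisy.append(y)
    return noisy


def inject_pair_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label with probability noise_rate to its paired class
    (assumed here to be the next class index, wrapping around)."""
    rng = random.Random(seed)
    return [
        (y + 1) % num_classes if rng.random() < noise_rate else y
        for y in labels
    ]


labels = [i % 7 for i in range(7000)]
uniform_noisy = inject_uniform_noise(labels, 7, 0.3, seed=1)
pair_noisy = inject_pair_noise(labels, 7, 0.3, seed=1)

flipped = sum(a != b for a, b in zip(labels, uniform_noisy)) / len(labels)
print(f"empirical uniform flip rate: {flipped:.3f}")
```

Pair noise concentrates all errors on a single wrong class per label, which typically makes it harder to correct than uniform noise at the same rate.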
Test accuracy of LLN and GLN methods on DBLP dataset under 30% pair and uniform noise, respectively (10 Runs).
If our work could help your research, please cite: NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise
@article{wang2024noisygl,
title={NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise},
author={Zhonghao Wang and Danyu Sun and Sheng Zhou and Haobo Wang and Jiapei Fan and Longtao Huang and Jiajun Bu},
year={2024},
eprint={2406.04299},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2406.04299},
}
ID | Paper | Method | Conference/Journal |
---|---|---|---|
1 | Training deep neural-networks using a noise adaptation layer | S-model | ICLR 2017 |
2 | Making deep neural networks robust to label noise: A loss correction approach | Forward | CVPR 2017 |
3 | Making deep neural networks robust to label noise: A loss correction approach | Backward | CVPR 2017 |
4 | Co-teaching: Robust training of deep neural networks with extremely noisy labels | Co-teaching | NeurIPS 2018 |
5 | Symmetric Cross Entropy for Robust Learning With Noisy Labels | SCE | ICCV 2019 |
6 | Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization | JoCoR | CVPR 2020 |
7 | Normalized Loss Functions for Deep Learning with Noisy Labels | APL | ICML 2020 |
ID | Paper | Method | Conference/Journal |
---|---|---|---|
1 | Learning Graph Neural Networks with Noisy Labels | D-GNN | ICLR 2019 |
2 | Adversarial label-flipping attack and defense for graph neural networks | LafAK/CP | ICDM 2020 |
3 | NRGNN: Learning a Label Noise Resistant Graph Neural Network on Sparsely and Noisily Labeled Graphs | NRGNN | KDD 2021 |
4 | Unified Robust Training for Graph Neural Networks Against Label Noise | Union-Net | PAKDD 2021 |
5 | Robust training of graph neural networks via noise governance | RTGNN | WSDM 2023 |
6 | CLNode: Curriculum Learning for Node Classification | CLNode | WSDM 2023 |
7 | Learning on Graphs under Label Noise | CGNN | ICASSP 2023 |
8 | Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions | PIGNN | TMLR 2023 |
9 | Robust Node Classification on Graph Data with Graph and Label Noise | RNCGLN | AAAI 2024 |
10 | Contrastive learning of graphs under label noise | CRGNN | Neural Netw. 2024 |