NoisyGL

Official code for "NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise", accepted by NeurIPS 2024. NoisyGL is a comprehensive benchmark for Graph Neural Networks under Label Noise (GLN), a family of robust Graph Neural Network (GNN) methods designed to perform well when the training labels are noisy.

Overview of the Benchmark

NoisyGL provides a fair and comprehensive platform for evaluating existing learning with label noise (LLN) and GLN methods and for facilitating future GLN research.

Why NoisyGL?

NoisyGL offers the following features:

A unified data loader module for diverse datasets. You can customize the configuration file of the dataset (located in config/_dataset) to modify data splitting and preprocessing strategies.
Generic noise injection schemes. These schemes (utils.labelnoise), widely used in previous studies, can comprehensively evaluate the robustness of each method.
Generic Base_predictor class. NoisyGL provides a generic implementation template and API for different GLN predictors (predictors.Base_predictor). You can develop your methods by overriding specific methods.
Integrated hyperparameter optimization tool. NoisyGL integrates Neural Network Intelligence (NNI) provided by Microsoft (hyperparam_opt.py). You can optimize and update hyperparameters for each method by following the Quick Start instructions in this README.

These features give you both convenience and flexibility when using the library. You can modify the implementation details of specific methods, or add new modules to implement your own methods within the framework.
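The override pattern described above can be sketched as follows. This is a minimal, self-contained illustration: `SketchBasePredictor`, `SketchRobustPredictor`, and their `loss` and `mean_loss` methods are hypothetical names chosen for this example, not the actual interface of predictors.Base_predictor, which may differ.

```python
import math


class SketchBasePredictor:
    """Hypothetical stand-in for a base predictor (not NoisyGL's real class)."""

    def loss(self, probs, label):
        # Default objective: cross-entropy on the (possibly noisy) label.
        return -math.log(probs[label])

    def mean_loss(self, samples):
        # Shared logic: average the per-sample loss over all samples.
        # Subclasses inherit this and only swap the pieces they override.
        return sum(self.loss(p, y) for p, y in samples) / len(samples)


class SketchRobustPredictor(SketchBasePredictor):
    """Overrides only the loss; everything else is inherited."""

    def loss(self, probs, label):
        # Bounded alternative (1 - p_y), less sensitive to confident mistakes.
        return 1.0 - probs[label]


samples = [([0.9, 0.1], 0), ([0.2, 0.8], 0)]  # (class probabilities, label)
print(SketchBasePredictor().mean_loss(samples))
print(SketchRobustPredictor().mean_loss(samples))
```

The point of the design is that a new method only has to redefine the step that differs, while data handling and evaluation stay shared across methods.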

Installation

Note: NoisyGL depends on PyTorch, PyTorch Geometric, PyTorch Sparse and PyTorch Cluster. To streamline the installation, NoisyGL does NOT install these libraries for you. Please install them by following each project's official instructions before running NoisyGL.

Required Dependencies:

Python 3.11+
torch>=2.1.0
pyg>=2.5.0
torch_sparse>=0.6.18
torch_cluster>=1.6.2
pandas
scipy
scikit-learn
ruamel
ruamel.yaml
nni
matplotlib
numpy
xlsxwriter

Quick Start

Run comprehensive benchmark.

python total_exp.py --runs 10 --methods gcn gin --datasets cora citeseer pubmed --noise_type clean uniform pair --noise_rate 0.1 0.2 --device cuda:0 --seed 3000

Running the command above tests two methods, 'gcn' and 'gin', on three datasets, 'cora', 'citeseer', and 'pubmed', under the specified types and rates of label noise. Each experiment runs 10 times, and the overall results are saved in ./log under a name based on the current timestamp. You can customize the combination of methods, datasets, noise types, and noise rates by changing the corresponding arguments.
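To get a feel for how many settings such a command covers, the sketch below enumerates the combinations of the argument values shown above. It assumes that 'clean' is run once per method and dataset regardless of --noise_rate; this is an assumption for illustration only, and total_exp.py may combine the arguments differently. The helper name `build_grid` is hypothetical.

```python
from itertools import product


def build_grid(methods, datasets, noise_types, noise_rates):
    """Enumerate (method, dataset, noise_type, noise_rate) settings."""
    grid = []
    for method, dataset, noise_type in product(methods, datasets, noise_types):
        # Assumption: clean labels have no meaningful noise rate, so run once.
        rates = [0.0] if noise_type == "clean" else noise_rates
        for rate in rates:
            grid.append((method, dataset, noise_type, rate))
    return grid


grid = build_grid(
    methods=["gcn", "gin"],
    datasets=["cora", "citeseer", "pubmed"],
    noise_types=["clean", "uniform", "pair"],
    noise_rates=[0.1, 0.2],
)
print(len(grid))
```

Under this assumption the example command expands to 30 settings, each of which would be repeated --runs times.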

Run single experiment.

python single_exp.py --method gcn --data cora --noise_type uniform --noise_rate 0.1 --device cuda:0 --seed 3000

This command runs a single experiment in debug mode. Detailed experiment information is printed to the terminal, which helps you locate problems.

When designing your customized predictor, you can add code blocks that only execute in debug mode in the following way:

if self.conf.training['debug']:
    print("break point")
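A self-contained sketch of this pattern is shown below. The `conf.training['debug']` lookup follows the snippet above; the `ToyPredictor` class, its `train_step` method, and the use of `SimpleNamespace` as a stand-in config object are assumptions made for illustration.

```python
from types import SimpleNamespace


class ToyPredictor:
    """Hypothetical predictor that only logs extra details in debug mode."""

    def __init__(self, conf):
        self.conf = conf

    def train_step(self, epoch, loss):
        # Extra diagnostics are emitted only when the debug flag is set.
        if self.conf.training['debug']:
            print(f"[debug] epoch={epoch} loss={loss:.4f}")
        return loss


# Stand-in config object exposing the same attribute path as the snippet above.
debug_conf = SimpleNamespace(training={'debug': True})
quiet_conf = SimpleNamespace(training={'debug': False})

ToyPredictor(debug_conf).train_step(epoch=1, loss=0.4213)  # prints diagnostics
ToyPredictor(quiet_conf).train_step(epoch=1, loss=0.4213)  # stays silent
```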

Hyperparameter optimization.

python hyperparam_opt.py --method gcn --data cora --noise_type uniform --noise_rate 0.1 --device cuda:0 --max_trial_number 20 --trial_concurrency 4 --port 8081 --update_config True

Running the command above starts an NNI manager at http://localhost:8081, which then automatically runs 20 HPO trials, each calling 'single_exp.py' with different hyperparameters. After all HPO trials finish, a new config file with the optimized hyperparameters overwrites the original one at "./config/gcn/gcn_cora.yaml". You can optimize hyperparameters for different methods on various datasets and noise types by changing the corresponding arguments.

Available methods: gcn, smodel, forward, backward, coteaching, sce, jocor, apl, dgnn, cp, nrgnn, unionnet, rtgnn, clnode, cgnn, pignn, rncgln, crgnn, lcat

Available datasets: cora, citeseer, pubmed, amazoncom, amazonpho, dblp, blogcatalog, flickr, amazon-ratings, roman-empire

Dataset	# Nodes	# Edges	# Feat.	# Classes	Homophily	Avg. degree
Cora	2,708	5,278	1,433	7	0.81	3.90
Citeseer	3,327	4,552	3,703	6	0.74	2.74
Pubmed	19,717	44,324	500	3	0.80	4.50
Amazon-Computers	13,752	491,722	767	10	0.78	35.8
Amazon-Photos	7,650	238,162	745	8	0.83	31.1
DBLP	17,716	105,734	1,639	4	0.83	5.97
BlogCatalog	5,196	343,486	8,189	6	0.40	66.1
Flickr	7,575	239,738	12,047	9	0.24	63.3
Amazon-ratings	24,492	93,050	300	5	0.38	7.60
Roman-empire	22,662	32,927	300	18	0.05	2.90
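The last two columns of the table can be illustrated with a short sketch. It assumes that homophily means edge homophily (the fraction of edges joining same-label nodes) and that average degree is 2E/N for undirected edge counts; the README does not state which definitions were used, and the function names are hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share the same label."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def average_degree(num_nodes, num_undirected_edges):
    """Each undirected edge contributes to the degree of two nodes."""
    return 2 * num_undirected_edges / num_nodes


# Toy 4-node cycle: two of the four edges join same-label nodes.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
labels = [0, 0, 1, 1]
print(edge_homophily(edges, labels))

# Cora row of the table: 2,708 nodes and 5,278 edges.
print(round(average_degree(2708, 5278), 2))
```

With the Cora counts this gives about 3.90, matching the table. Some rows (for example Amazon-Computers, where 491,722 / 13,752 is about 35.8) instead match edges divided by nodes, so the edge-counting convention appears to vary between datasets.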

Available noise types: clean, pair, uniform, random (new)
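Label noise of this kind is commonly described by a transition matrix whose entry (i, j) is the probability that true class i is observed as class j. The sketch below assumes the usual definitions: under uniform noise a label flips to each of the other classes with equal probability, and under pair noise it flips only to one designated partner class (here, the next class index). It does not cover the 'random' type, and the function names are hypothetical rather than the API of utils.labelnoise.

```python
import random


def uniform_transition(num_classes, rate):
    """Keep a label with prob. 1 - rate; spread rate evenly over other classes."""
    off = rate / (num_classes - 1)
    return [[1.0 - rate if i == j else off for j in range(num_classes)]
            for i in range(num_classes)]


def pair_transition(num_classes, rate):
    """Keep a label with prob. 1 - rate; otherwise flip to the next class."""
    matrix = [[0.0] * num_classes for _ in range(num_classes)]
    for c in range(num_classes):
        matrix[c][c] = 1.0 - rate
        matrix[c][(c + 1) % num_classes] = rate
    return matrix


def inject_noise(labels, matrix, seed=0):
    """Sample one noisy label per node from the row of its true class."""
    rng = random.Random(seed)
    classes = list(range(len(matrix)))
    return [rng.choices(classes, weights=matrix[y], k=1)[0] for y in labels]


clean = [0, 1, 2, 0, 1, 2]
print(inject_noise(clean, uniform_transition(3, 0.3), seed=3000))
print(inject_noise(clean, pair_transition(3, 0.3), seed=3000))
```

Pair noise is usually the harder setting at the same rate, because all corrupted labels of a class point to a single wrong class instead of being spread out.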

Performance overview

Test accuracy of LLN and GLN methods on the DBLP dataset under 30% pair noise and 30% uniform noise, respectively (10 runs).

Citation

If our work helps your research, please cite: NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise

@article{wang2024noisygl,
      title={NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise}, 
      author={Zhonghao Wang and Danyu Sun and Sheng Zhou and Haobo Wang and Jiapei Fan and Longtao Huang and Jiajun Bu},
      year={2024},
      eprint={2406.04299},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.04299}, 
}

Reference

LLN:

ID	Paper	Method	Conference/Journal
1	Training deep neural-networks using a noise adaptation layer	S-model	ICLR 2017
2	Making deep neural networks robust to label noise: A loss correction approach	Forward	CVPR 2017
3	Making deep neural networks robust to label noise: A loss correction approach	Backward	CVPR 2017
4	Co-teaching: Robust training of deep neural networks with extremely noisy labels	Co-teaching	NeurIPS 2018
5	Symmetric Cross Entropy for Robust Learning With Noisy Labels	SCE	ICCV 2019
6	Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization	JoCoR	CVPR 2020
7	Normalized Loss Functions for Deep Learning with Noisy Labels	APL	ICML 2020

GLN:

ID	Paper	Method	Conference/Journal
1	Learning Graph Neural Networks with Noisy Labels	D-GNN	ICLR 2019
2	Adversarial label-flipping attack and defense for graph neural networks	LafAK/CP	ICDM 2020
3	NRGNN: Learning a Label Noise Resistant Graph Neural Network on Sparsely and Noisily Labeled Graphs	NRGNN	KDD 2021
4	Unified Robust Training for Graph Neural Networks Against Label Noise	Union-Net	PAKDD 2021
5	Robust training of graph neural networks via noise governance	RTGNN	WSDM 2023
6	CLNode: Curriculum Learning for Node Classification	CLNode	WSDM 2023
7	Learning on Graphs under Label Noise	CGNN	ICASSP 2023
8	Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions	PIGNN	TMLR 2023
9	Robust Node Classification on Graph Data with Graph and Label Noise	RNCGLN	AAAI 2024
10	Contrastive learning of graphs under label noise	CRGNN	Neural Netw. 2024
