This repository provides the code for Context-based Statement-Level Vulnerability Localization.
To download the testing dataset used for evaluation in our experiments, run the following command:
gdown https://drive.google.com/uc?id=1ZGIdzKdlzyjX7wSJbP0AfMf5BFovRv1g
To download the training and validation datasets used in our experiments, run the following commands:
gdown https://drive.google.com/uc?id=1dvvZeynTCNdLSBdX7H3wEnRKIZWyILlv
gdown https://drive.google.com/uc?id=11pyuNbkop_5uk10uAoNr4__Tpww65HXb
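For scripted setups, the downloads above can also be driven from Python. The following is a minimal sketch, assuming the `gdown` command-line tool is installed and on the PATH; the helper names (`drive_url`, `gdown_command`, `download_all`) are hypothetical and not part of this repository. Passing the URL as a separate argument avoids shell quoting issues with the `?` character.

```python
import subprocess

# File IDs copied from the gdown commands in this README.
TEST_FILE_IDS = ["1ZGIdzKdlzyjX7wSJbP0AfMf5BFovRv1g"]
TRAIN_VAL_FILE_IDS = [
    "1dvvZeynTCNdLSBdX7H3wEnRKIZWyILlv",
    "11pyuNbkop_5uk10uAoNr4__Tpww65HXb",
]


def drive_url(file_id):
    """Build the Google Drive URL form used by the commands above."""
    return "https://drive.google.com/uc?id=" + file_id


def gdown_command(file_id):
    """Return the argument list for one gdown call (hypothetical helper)."""
    return ["gdown", drive_url(file_id)]


def download_all(file_ids):
    """Run gdown once per file ID; raises CalledProcessError if a call fails."""
    for file_id in file_ids:
        subprocess.run(gdown_command(file_id), check=True)
```

Calling `download_all(TEST_FILE_IDS + TRAIN_VAL_FILE_IDS)` would fetch all three files into the current working directory.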
For more information about our dataset, please refer to LineVul and Big-Vul.
We provide Python source code for training and testing the vulnerability localization models. The source files can be found here. We recommend using Google Colab to run the Jupyter notebook COSTA.ipynb.
Please modify hyper-parameters such as batch_size, epoch, vector_length, etc. to fit your own experiments.
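One way to keep these settings in a single place is a small configuration object. This is only a sketch: the field names follow the hyper-parameters named above, but the `HyperParams` class and all default values are illustrative assumptions, not the settings used in the notebook.

```python
from dataclasses import dataclass, asdict


@dataclass
class HyperParams:
    # Names follow this README; the defaults are placeholders,
    # not the authors' settings.
    batch_size: int = 32
    epoch: int = 10
    vector_length: int = 128

    def __post_init__(self):
        # Reject non-positive values before a long training run starts.
        for name, value in asdict(self).items():
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


# Override only the values that differ for a given experiment.
params = HyperParams(batch_size=16, epoch=5)
print(asdict(params))
```

Validating in `__post_init__` means a mistyped value fails immediately rather than partway through training.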
We use Joern to analyze source code. The Python script for reading CPG nodes and edges can be found here.
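To illustrate the general idea of loading a code property graph (CPG), here is a minimal sketch. It assumes the nodes and edges have already been written to CSV tables with hypothetical columns (`id`, `label`, `code` for nodes and `src`, `dst`, `type` for edges). This layout is an assumption for illustration only; it is neither Joern's export format nor the format used by the script in this repository.

```python
import csv
import io
from collections import defaultdict


def read_cpg(nodes_file, edges_file):
    """Read node and edge tables into a node dict and an adjacency map."""
    nodes = {row["id"]: row for row in csv.DictReader(nodes_file)}
    successors = defaultdict(list)
    for row in csv.DictReader(edges_file):
        # Skip edges that reference nodes missing from the node table.
        if row["src"] in nodes and row["dst"] in nodes:
            successors[row["src"]].append((row["dst"], row["type"]))
    return nodes, dict(successors)


# Tiny in-memory tables standing in for real files (hypothetical layout).
nodes_csv = io.StringIO("id,label,code\n1,METHOD,main\n2,CALL,free(p)\n")
edges_csv = io.StringIO("src,dst,type\n1,2,AST\n1,3,CFG\n")

nodes, successors = read_cpg(nodes_csv, edges_csv)
print(successors)
```

The function accepts any open text file objects, so the same code works with `open("nodes.csv", newline="")` in place of the in-memory tables.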