Welcome to the repo for ConspirED: A Dataset for Cognitive Traits of Conspiracy Theories and Large Language Model Safety.
ConspirED is a dataset for identifying cognitive traits of conspiracy theories in text. It contains annotated conspiracy snippets labeled with the CONSPIR traits of conspiratorial ideation.
The repository is organized as follows:
```
arxiv2025-conspired/
├── data/                         # Dataset files
│   ├── context_training.xlsx     # Training split
│   ├── context_testing.xlsx      # Test split
│   ├── val_splits/               # Validation splits
│   └── LICENSE-CC-BY-4.0.txt     # CC-BY-4.0 license for datasets
├── .github/                      # GitHub Actions and workflows
├── static/                       # Project page assets
├── conspir_tils.py               # Utility functions for prompting experiments
├── main.py                       # Main script for LLM prompting experiments
├── train_clf.py                  # Script for fine-tuning LaGoNN classifiers
├── lagonn.py                     # LaGoNN model implementation
├── setup_utils.py                # Setup and evaluation utilities
├── finetuning_environment.yml    # Conda environment for LaGoNN
├── prompting_environment.yml     # Conda environment for prompting
├── README.md                     # This file
├── LICENSE                       # Apache 2.0 license for code
├── NOTICE.txt                    # Copyright notices
├── .gitignore                    # Git ignore rules
├── .nojekyll                     # GitHub Pages configuration
└── index.html                    # Project landing page
```
The `data/` directory contains all processed dataset files. Each file includes the following columns:
- `doc_id`: Unique identifier for the source document
- `snippet`: The annotated text snippet exhibiting conspiratorial thinking
- `context500`: Surrounding context (500 tokens) around the snippet
- `context1000`: Surrounding context (1000 tokens) around the snippet
- `labels`: Multi-hot encoded list of conspiracy traits (0/1 for each of 6 traits)
- `consolidated_trait`: Human-readable list of trait names present in the snippet
- `dominant_consol_trait`: The most salient/dominant trait in the snippet
- `single_dominant_one_hot_dm`: One-hot encoded dominant trait vector
- `OverallTrait`: Original annotation of overall conspiracy traits
- `DominantTrait`: Original annotation of the dominant trait
- `Justification`: Annotator's justification for trait assignment
- `Confidence`: Annotator confidence score
- `ConspiracyTheoryorMainstream`: Classification of source as conspiracy theory or mainstream
- `annotated_text`: Original annotated text with markup
- `linkingpassage`: Context linking the snippet to broader narrative
- `begin`/`end`: Character offsets of the snippet in the source document
- `name`: Source annotator where relevant
- `id`: Snippet identifier
- `Label`: Additional label information
- `remove_row`: Flag for data quality filtering
The six conspiracy traits are (in order): Contradictory, Overriding suspicion, Nefarious intent, Persecuted victim, Immune to evidence, and Re-interpreting randomness.
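Since the `labels` column is a multi-hot vector ordered by these six traits, decoding it back to trait names is a one-liner. The helper below is an illustrative sketch, not part of the repo; depending on how you load the spreadsheet, `labels` may arrive as a Python list or as its string form, so both are handled:

```python
import ast

# The six CONSPIR traits, in the order used by the multi-hot `labels` column.
TRAITS = [
    "Contradictory",
    "Overriding suspicion",
    "Nefarious intent",
    "Persecuted victim",
    "Immune to evidence",
    "Re-interpreting randomness",
]

def decode_labels(labels):
    """Map a multi-hot vector (or its string form, e.g. '[0, 1, 0, 0, 1, 0]')
    to the names of the traits present in the snippet."""
    if isinstance(labels, str):
        labels = ast.literal_eval(labels)
    return [trait for trait, flag in zip(TRAITS, labels) if flag == 1]
```

For example, `decode_labels([0, 1, 0, 0, 1, 0])` yields `["Overriding suspicion", "Immune to evidence"]`.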
To set up the environments, use the provided Conda YAML files:

- `finetuning_environment.yml` for training and evaluating LaGoNN.
- `prompting_environment.yml` for prompting experiments with LLaMA or GPT models.

Use the following commands:

```bash
conda env create -f finetuning_environment.yml
conda activate lagonn-env

conda env create -f prompting_environment.yml
conda activate prompting-env
```

If you plan to use OpenAI models (GPT-4, etc.), you need to set your API key as an environment variable:

```bash
export OPENAI_API_KEY='your-api-key-here'
```

For permanent configuration, add this line to your `~/.bashrc` or `~/.zshrc` file.
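A forgotten API key usually surfaces as an opaque authentication error mid-run. A small, hypothetical helper (not part of the repo) can fail fast with a clearer message before any experiment starts:

```python
import os

def require_openai_key() -> str:
    """Return the OpenAI API key from the environment, failing fast
    with a clear message if it is missing."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running main.py"
        )
    return key
```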
Use the `train_clf.py` script to fine-tune LaGoNN classifiers on the ConspirED dataset.
| Argument | Description |
|---|---|
| `--model` | HuggingFace model to fine-tune (e.g., `paraphrase-mpnet-base-v2`). |
| `--model_seed` | Random seed for model initialization. |
| `--num_iter` | Number of LaGoNN message-passing iterations. |
| `--epochs` | Number of training epochs. |
| `--multilab` | Set to `True` for multi-label classification; `False` for single-label. |
| `--lagonn_config` | Graph configuration (e.g., `LABEL`, `TEXT`, etc.). |
| `--lagonn_mode` | Name of the experimental setup (e.g., `LAGONN_EXP`). |
| `--NUM_NEIGHBORS` | Number of neighbors per node in the graph. |
| `--DISTANCE_PRECISION` | Optional: precision mode for node distance (default: `None`). |
| `--context` | Whether to include surrounding context in the input. |
| `--window` | Token window size used when `--context` is enabled. |
```bash
python train_clf.py \
    --model paraphrase-mpnet-base-v2 \
    --model_seed 4 \
    --num_iter 17 \
    --epochs 3 \
    --multilab True \
    --lagonn_config LABEL \
    --lagonn_mode LAGONN_EXP \
    --NUM_NEIGHBORS 1 \
    --setfit False \
    --context True \
    --window 1000
```

The `main.py` script runs prompting experiments to identify conspiratorial traits using LLaMA, GPT, or other LLMs.
| Argument | Description |
|---|---|
| `--strategy` | Prompting strategy used (e.g., `what_to_look_for`). |
| `--icl` | In-context learning setup: `zero_shot`, `few_shot_similar`, `few_shot_dissimilar`, or `few_shot_both`. |
| `--k` | Number of examples used in few-shot prompting. |
| `--cot` | Whether to use chain-of-thought prompting (`True` or `False`). |
| `--dev` | Run on the development set instead of the test set (`True` or `False`). |
| `--model` | Path to the local LLaMA model (if not using OpenAI). |
| `--context` | Whether to include surrounding context in the prompt (`True` or `False`). |
| `--window` | Token window size for included context (ignored if `--context=False`). |
| `--openai` | Whether to use OpenAI API models (`True`) or local models (`False`). |
| `--openai_model` | OpenAI model name (e.g., `gpt-4o`). |
```bash
python main.py \
    --strategy what_to_look_for \
    --icl few_shot_both \
    --k 20 \
    --cot True \
    --dev False \
    --model path/to/llama-2-7b-chat \
    --context True \
    --window 1000 \
    --openai False
```

After running experiments, results are automatically saved to disk in JSON format with the following structure:
Results are saved to: `llm_trainclf_jsons/{seed}/{model}/{strategy}/{icl}/{k}/{cot}/{dev}/{context}/{window}/{openai}/{openai_model}/`
Each experiment produces JSON files containing:
- Classification report: Per-class precision, recall, and F1-scores for each conspiracy trait
- Aggregated metrics: Macro, micro, samples, and weighted averages for:
- F1-score
- Precision
- Recall
Relaxed evaluation files (`*_relaxed_results.json`) assess whether the model correctly identifies the dominant trait when multiple traits are present. This evaluation considers a prediction correct if the model assigns a probability ≥ 0.5 to the dominant trait, providing a less strict measure of model performance focused on identifying the most salient conspiracy trait in each instance.
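The relaxed criterion can be sketched in a few lines. This is an illustrative rendering of the rule described above, not the repository's actual evaluation code (which presumably lives in `setup_utils.py`):

```python
def relaxed_correct(trait_probs, dominant_trait, threshold=0.5):
    """Relaxed criterion: a prediction counts as correct if the model
    assigns probability >= threshold to the instance's dominant trait,
    regardless of what it predicts for the other traits."""
    return trait_probs.get(dominant_trait, 0.0) >= threshold
```

For instance, a prediction of `{"Nefarious intent": 0.7, "Contradictory": 0.2}` is relaxed-correct for a snippet whose dominant trait is "Nefarious intent".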
Both standard and relaxed evaluation results are printed to the console during execution and saved as JSON files for further analysis.
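To compare runs across the nested output directories, you can walk the tree and pull one summary metric per file. The sketch below assumes each JSON follows the dict layout of sklearn's `classification_report` (a guess consistent with the per-class and macro/micro metrics listed above); adjust the key names if the actual files differ:

```python
import json
from pathlib import Path

def collect_macro_f1(results_dir):
    """Gather macro-averaged F1 scores from every result JSON under
    results_dir, keyed by file name. Files without a 'macro avg'
    entry (e.g. differently structured relaxed results) are skipped."""
    scores = {}
    for path in sorted(Path(results_dir).rglob("*.json")):
        report = json.loads(path.read_text())
        if "macro avg" in report:
            scores[path.name] = report["macro avg"].get("f1-score")
    return scores
```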
This repository uses dual licensing:

- Code: Licensed under the Apache License 2.0 (see `LICENSE`)
- Data: Licensed under Creative Commons Attribution 4.0 International (CC-BY-4.0) (see `data/LICENSE-CC-BY-4.0.txt`)
If our work was helpful for yours, please be so kind as to cite us:
```bibtex
@article{bates2025conspired,
  title={ConspirED: A Dataset for Cognitive Traits of Conspiracy Theories and Large Language Model Safety},
  author={Bates, Luke and Glockner, Max and Nakov, Preslav and Gurevych, Iryna},
  journal={arXiv preprint arXiv:2508.20468},
  year={2025},
  url={https://arxiv.org/abs/2508.20468}
}
```

- Maintainer: Luke Bates ([email protected])
- UKP Lab: https://www.ukp.tu-darmstadt.de
- TU Darmstadt: https://www.tu-darmstadt.de
Don't hesitate to send us an email or report an issue if something is broken or if you have further questions.
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.