This project explores how arguments are formed and expressed in everyday communication. By building tools that automatically detect claims, premises, and their relationships, we aim to make debates, online discussions, and political discourse easier to follow and analyze.
To make these structures easier to understand, we provide a graph visualization that highlights how claims are supported or opposed by premises, and whether they take a pro or con stance.
We also tested various open-source language models and investigated whether fine-tuning them with argumentative data collected from the internet could improve results. Details of this work are documented here.
Our open-source implementation is designed for practical use across different domains, from legal texts and policy discussions to social media analysis. The outcome is a working prototype that can process real-world data and present argument structures in a clear, visual format.
Our Argument Mining pipeline links a web-based frontend, a backend API, and machine learning models to turn unstructured argument texts into structured argumentative graphs.
- Client Access: Users open the web application in their browser, which runs as a Single Page Application (SPA).
- Web Server: The SPA (built with Vue/TypeScript) is served as static files (HTML, CSS, JS) via NGINX.
- Frontend (SPA): The SPA handles user input (free text or PDFs) and communicates with the backend API. It then displays an interactive graph of the argumentative discourse units (ADUs) and their stance relations (pro or con).
- API Service: A Python/FastAPI backend exposes HTTP endpoints for processing. It orchestrates inference, manages requests, and returns structured JSON data (a minimal sketch follows this list).
- Machine Learning Models: The API loads pre-trained or fine-tuned NLP models (e.g., BERT-based) to detect claims, premises, and stance relationships. The output is graph data that can be visualized by the frontend.
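To illustrate the shape of the structured JSON that the frontend turns into a graph, here is a minimal, hypothetical FastAPI sketch. The route name, model names, fields, and example spans are illustrative assumptions for this README, not the actual argument-mining-api code.

```python
# Hypothetical sketch only: route, model names, and fields are assumptions,
# not the actual argument-mining-api implementation.
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class MiningRequest(BaseModel):
    text: str  # raw argumentative text submitted by the frontend

class ADU(BaseModel):
    id: int
    type: str  # "claim" or "premise"
    span: str  # text span identified as an argumentative discourse unit

class StanceRelation(BaseModel):
    source: int  # id of the ADU taking the stance (typically a premise)
    target: int  # id of the ADU it supports or attacks (typically a claim)
    stance: str  # "pro" or "con"

class MiningResponse(BaseModel):
    adus: List[ADU]
    relations: List[StanceRelation]

@app.post("/mine", response_model=MiningResponse)
def mine_arguments(request: MiningRequest) -> MiningResponse:
    # In the real service, the loaded NLP models segment the text into ADUs,
    # classify them as claims or premises, and predict stance relations.
    # Here a fixed toy graph illustrates the response shape only.
    adus = [
        ADU(id=0, type="claim", span="We should expand public transport."),
        ADU(id=1, type="premise", span="It reduces traffic congestion."),
    ]
    relations = [StanceRelation(source=1, target=0, stance="pro")]
    return MiningResponse(adus=adus, relations=relations)
```

In a response of this shape, the frontend can render each ADU as a graph node and each stance relation as a pro or con edge.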
Our project involved the creation of eight repositories. The following two are crucial for reproducing our pipeline and are therefore explained in more detail both in this README (Pipeline Setup Guide) and in their repository wikis:
- armin-app: This repository houses the web frontend, built using Vue.
- argument-mining-api: Boots up the argument mining models and provides API endpoints to feed text into them. It returns ADUs (claims and premises) and the stance relationships between them.
In addition, we developed further repositories for development and testing purposes:
- training-zoo: Dedicated to training our decoder and encoder models.
- synapse-sparks: Here, we tested and engineered prompts for Large Language Models.
- benchmark: Contains the benchmark data used to compare the results of our models.
- prototype-pipeline: Holds our initial pipeline implementation, which was replaced by a more effective solution.
- argument-mining-db: This repository hosts our MariaDB database, which was used to store our training data.
- prototype-graph-visualization: Contains our experiments with graph visualization.
To guide you through running our Argument Mining pipeline, we created the following step-by-step manual.
To begin, you'll want to set up the argument-mining-api repository. This repository contains the core logic, models, and API endpoints.
Once the API server is set up and running, you can set up the frontend using the armin-app repository. It connects to the endpoints of the argument-mining-api and provides a simple interface for using the pre-trained models or different versions of OpenAI’s GPT models.
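As a quick smoke test once the API server is running, the endpoints can also be called directly without the frontend. The sketch below reuses the hypothetical route, port, and field names from the earlier example; they are assumptions, not the documented argument-mining-api contract.

```python
# Hypothetical smoke test: URL, route, and field names are assumptions
# matching the sketch above, not the documented argument-mining-api contract.
import requests

payload = {"text": "We should expand public transport. It reduces traffic congestion."}
response = requests.post("http://localhost:8000/mine", json=payload, timeout=60)
response.raise_for_status()

graph = response.json()
for adu in graph["adus"]:
    print(f"{adu['type']}: {adu['span']}")
for relation in graph["relations"]:
    print(f"ADU {relation['source']} -> ADU {relation['target']} ({relation['stance']})")
```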