
Machine Learning Models

Tim Fischer edited this page Dec 8, 2025 · 15 revisions

The Discourse Analysis Tool Suite relies heavily on various machine learning models, especially during document preprocessing. Here, we list all models that are currently in use.

Document Preprocessing

OCR & PDF Processing

Docling runs on GPU, served by Docling-serve

Language Detection

GlotLID runs on CPU, served by Ray
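
GlotLID is a fastText-based classifier whose predictions are labels of the form `__label__eng_Latn`, i.e. an ISO 639-3 language code plus an ISO 15924 script code. As a hedged illustration (the helper below is not part of the tool suite's API), turning such a label into its two components might look like:

```python
# Sketch: parsing a GlotLID prediction label into language and script.
# GlotLID (a fastText classifier) returns labels such as "__label__eng_Latn",
# i.e. an ISO 639-3 language code plus an ISO 15924 script code.
# This helper is illustrative, not part of the tool suite's API.

def parse_glotlid_label(label: str) -> tuple[str, str]:
    """Split a fastText-style label into (language, script)."""
    code = label.removeprefix("__label__")
    language, _, script = code.partition("_")
    return language, script

print(parse_glotlid_label("__label__deu_Latn"))  # -> ('deu', 'Latn')
```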

Named Entity Recognition

spaCy runs on CPU, served by Ray

Object Detection

DETR runs on GPU, served by Ray

Automatic Video & Audio Transcriptions

Whisper runs on GPU, served by Ray

Large Language Models

Gemma 3 27B is used for:

  • Image Captioning during document preprocessing
  • LLM Assistant (metadata extraction, document tagging, sentence annotation, span annotation)
  • Automatic memo generation
  • RAG chat in Perspectives extension
  • Perspectives Document Rewriting

Gemma 3 runs on GPU, served by vLLM
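
vLLM exposes an OpenAI-compatible chat-completions API, so tasks like document tagging reduce to assembling a chat request. The sketch below builds such a payload; the model identifier, prompt wording, and parameters are illustrative assumptions, not the tool suite's actual configuration.

```python
# Sketch: building a chat-completions payload for a vLLM server.
# vLLM exposes an OpenAI-compatible API; the model name, prompt wording,
# and parameters below are illustrative assumptions, not the tool
# suite's actual configuration.
import json

def build_tagging_request(document: str, tags: list[str]) -> dict:
    """Assemble the JSON body for a document-tagging chat request."""
    return {
        "model": "google/gemma-3-27b-it",  # assumed model identifier
        "messages": [
            {"role": "system",
             "content": "You assign tags to documents. "
                        f"Choose only from: {', '.join(tags)}."},
            {"role": "user", "content": document},
        ],
        "temperature": 0.0,  # deterministic output for tagging
    }

body = build_tagging_request("The parliament debated the new climate bill.",
                             ["politics", "sports", "economy"])
print(json.dumps(body, indent=2))
```

The resulting body would be POSTed to the server's `/v1/chat/completions` endpoint.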

Embedding Models

Similarity Search

CLIP runs on GPU, served by Ray
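
CLIP maps images and texts into a shared vector space, so similarity search reduces to ranking stored vectors by cosine similarity to a query vector. A minimal sketch with placeholder vectors (not real CLIP outputs, and not the suite's retrieval code):

```python
# Sketch: similarity search over embedding vectors via cosine similarity.
# CLIP embeds images and texts into a shared vector space; retrieval then
# ranks stored vectors by cosine similarity to the query. The vectors
# below are placeholders, not real CLIP outputs.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

corpus = {"doc1": [0.9, 0.1, 0.0], "doc2": [0.1, 0.9, 0.1], "doc3": [0.7, 0.3, 0.1]}
query = [1.0, 0.0, 0.0]

ranked = sorted(corpus, key=lambda k: cosine(query, corpus[k]), reverse=True)
print(ranked)  # -> ['doc1', 'doc3', 'doc2']
```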

Document Embeddings

Arctic Embed runs on GPU, served by vLLM

Instruction-tuned Embedding Models

Context size: ??

Instruction-tuned embedding models run on GPU, served & trained by GPU workers on demand
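
Instruction-tuned embedding models (e.g. the E5-instruct family) typically prepend a task description to the query before embedding. Since this wiki does not name the exact models, whether the tool suite uses this particular template is an assumption; the common convention looks like:

```python
# Sketch: the prompt format commonly used by instruction-tuned embedding
# models (e.g. the E5-instruct family), where a task description is
# prepended to the query before embedding. Whether the tool suite uses
# exactly this template is an assumption.

def format_query(task: str, query: str) -> str:
    return f"Instruct: {task}\nQuery: {query}"

print(format_query("Retrieve passages about climate policy",
                   "emission trading schemes"))
```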

Concept over Time Analysis

Context size: The default context size of the model is used. Since only sentences are processed, input texts should not be truncated.

The COTA embedding model runs on GPU, served & trained by GPU workers on demand
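
Because COTA embeds individual sentences, documents must first be segmented into sentences. The naive regex splitter below is only illustrative; a production pipeline would use a proper sentence segmenter (e.g. spaCy), and the suite's actual segmentation strategy is not specified here.

```python
# Sketch: naive sentence splitting before sentence-level embedding.
# COTA processes individual sentences, so documents are segmented first.
# A real system would use a proper segmenter (e.g. spaCy); this regex
# split is only illustrative.
import re

def split_sentences(text: str) -> list[str]:
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

sents = split_sentences("Prices rose. Analysts disagreed! What comes next?")
print(sents)  # each sentence stays well below the model's context size
```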

Classification Models

Classification models can be selected and fine-tuned by the user for specific tasks. Currently, we only support the text modality. We offer a selection of the following models:

Context size: The default context size of the chosen model is used. Input texts that are too large are chunked during training dataset creation.
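
Chunking over-long inputs during training-dataset creation can be sketched as a sliding word window with overlap. Window size, overlap, and the word-level unit below are illustrative assumptions; real pipelines usually chunk by tokens of the chosen model's tokenizer.

```python
# Sketch: chunking over-long texts during training-dataset creation.
# Texts exceeding the model's context size are split into overlapping
# word-window chunks. Window/overlap sizes and the word-level unit are
# illustrative assumptions; real systems usually chunk by tokens.

def chunk_words(text: str, max_words: int = 128, overlap: int = 16) -> list[str]:
    words = text.split()
    step = max_words - overlap  # advance by window minus overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # last window already covers the tail
    return chunks

long_text = " ".join(f"w{i}" for i in range(300))
chunks = chunk_words(long_text)
print(len(chunks), [len(c.split()) for c in chunks])  # -> 3 [128, 128, 76]
```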

Base Transformer models

Transformer models are fine-tuned on demand to create document tagging and span classification models.

Embedding models

Embedding models are fine-tuned on demand to create sentence classification models.

All classification models run on GPU, served & trained by GPU workers on demand

Analysis Models

Analysis models can be run on demand to enrich documents further.

Coreference Resolution (DE)

Quotation Detection (DE)

All analysis models run on GPU, served by GPU workers on demand.
