SemDH2024-GreekNewTestamentNames

This repository contains the notebooks used to source the data for "A Corpus of Biblical Names in the Greek New Testament to Study the Additions, Omissions, and Variations across Different Manuscripts", a paper submitted to SemDH 2024: First International Workshop of Semantic Digital Humanities.

Structure of this Repository

.devcontainer/                    Directory containing devcontainer Dockerfile and config file
data/                             General directory for downloaded and generated data
  |-- publish/                    Directory of cleaned-up lists (generated by 05_pub_prep.ipynb)
  |-- tables/                     Directory containing manually curated lists
  |   `-- names.csv               List of manually curated names
  |-- parsed/                     Parsing data of TEI transcription files
  `-- tmp/                        Intermediate and temporary files
notebooks/                        Directory of Jupyter notebooks
nt-manuscripts/                   Directory of Python scripts used for downloading manuscript metadata from NTVMR
nt-transcripts/                   Directory of Python scripts used for downloading transcription files from IGNTP and NTVMR
na28-crawler/                     Directory containing a crawler to get all NA28 verses (no annotations)
ecm-crawler/                      Directory containing a crawler to get all ECM verses (no annotations)
.python-version                   Python version indicator
README                            This README
requirements.txt                  Requirements for Python environment
run_notebooks.sh                  Script to run notebooks by selecting tasks

Install and Use

The recommended Python version for this repo is 3.12.1 (see .python-version). Docker images with Python preinstalled can be found on Docker Hub. Alternatively, you can set up and run a virtual Python environment. We also provide a devcontainer in this repository.
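Before installing the requirements, it can help to confirm that the active interpreter matches the pinned version. The following is a minimal sketch, not part of this repository: the helper names parse_version, matches_pinned, and check_interpreter are hypothetical, and it assumes that .python-version holds a plain dotted version number such as 3.12.1.

```python
import sys
from pathlib import Path


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '3.12.1' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def matches_pinned(pinned: str, running: tuple[int, ...], depth: int = 2) -> bool:
    """Compare the first `depth` components (major.minor by default)."""
    return parse_version(pinned)[:depth] == tuple(running[:depth])


def check_interpreter(version_file: Path = Path(".python-version")) -> bool:
    """Report whether the running interpreter matches the pinned version.

    Assumption: the file contains only a dotted numeric version.
    """
    pinned = version_file.read_text(encoding="utf-8")
    return matches_pinned(pinned, tuple(sys.version_info[:3]))
```

Calling check_interpreter() from the project's root directory returns True when the major and minor versions agree. Pass depth=3 to matches_pinned to require an exact patch-level match.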

In your Python environment, run pip install -r requirements.txt from the project's root directory to install Jupyter. This will enable you to run the notebooks. When using the devcontainer, this step is not needed.

For ease of use, run run_notebooks.sh from the project's root directory. During the initial run, you must select all steps (one to eight). A full run takes multiple hours.

SPARQL Queries

We used a SPARQL query to retrieve an initial list of biblical names in the New Testament.

Endpoint: https://database.factgrid.de/query

SELECT ?Person ?PersonLabel ?noted ?notedLabel ?GenderLabel ?link ?book
WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  
  ?Person wdt:P2 wd:Q8811.
  ?Person wdt:P143 ?noted.
  ?noted wdt:P8 ?book.

  FILTER (?book IN (wd:Q74942, wd:Q74943, wd:Q74944, wd:Q74945, wd:Q74946, wd:Q74947, wd:Q74948, wd:Q74949, wd:Q74950, wd:Q74951, wd:Q74952, wd:Q74953, wd:Q74954, wd:Q74955, wd:Q74956, wd:Q74957, wd:Q74958, wd:Q74959, wd:Q74960,  wd:Q74961, wd:Q74962, wd:Q74963, wd:Q74964, wd:Q74965, wd:Q74966, wd:Q74967, wd:Q74968)) 
  
  OPTIONAL { ?Person wdt:P154 ?Gender. }
  OPTIONAL { ?link schema:about ?Person ; schema:isPartOf <https://www.wikidata.org/> . }
}
ORDER BY (?PersonLabel)
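The FILTER clause in the query lists 27 consecutive item identifiers, from wd:Q74942 to wd:Q74968. As an illustration only, the sketch below regenerates that clause and URL-encodes a query for an HTTP GET request using the standard query parameter of the SPARQL protocol. The function names are hypothetical and not part of this repository, no request is sent, and the endpoint argument is left to the caller because the exact URL for programmatic access is not stated here.

```python
from urllib.parse import urlencode

FIRST_BOOK_ID = 74942  # first identifier in the FILTER list of the query
LAST_BOOK_ID = 74968   # last identifier in the FILTER list of the query


def book_items(first: int = FIRST_BOOK_ID, last: int = LAST_BOOK_ID) -> list[str]:
    """Return the prefixed item identifiers wd:Q<first> .. wd:Q<last>."""
    return [f"wd:Q{number}" for number in range(first, last + 1)]


def book_filter(items: list[str]) -> str:
    """Build the FILTER clause that restricts ?book to the given items."""
    return f"FILTER (?book IN ({', '.join(items)}))"


def query_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as a GET URL (the request itself is not sent)."""
    params = urlencode({"query": query})
    return f"{endpoint}?{params}"
```

For example, book_filter(book_items()) reproduces the FILTER line of the query above, apart from whitespace.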

Updates and Refinements

This repository has been updated since its initial release, and further updates may follow. Please see the release tags for previous versions.

How to Cite

If you use this code or data in your research, please cite:

@inproceedings{Werner2024,
  title = {A Corpus of Biblical Names in the Greek New Testament to Study the Additions, Omissions, and Variations across Different Manuscripts},
  author = {Christoph Werner and Zacharias Shoukry and Soham Al-Suadi and Frank Krüger},
  url = {https://ceur-ws.org/Vol-3724/paper6.pdf},
  crossref = {SemDH2024},
  year = {2024},
  abstract = {The analysis of textual variants of verses in the New Testament across different manuscripts has mainly been done by close reading with manual effort. With the increasing number of transcriptions of the different manuscripts, quantitative analyses (so-called distant reading) can be used to search for patterns of omission, addition, or other variations, to formulate novel hypotheses to be investigated by close reading. In this work, we present a corpus of biblical names including spelling variation and inflections and their mentions in the transcriptions of the New Testament. By integrating and semantically enriching the data collected from different sources, we established a corpus that can be used for the quantitative study of omission, addition, and variation of such biblical names. To illustrate the corpus, we implement some use cases and show that well-known cases can be quantitatively reproduced. The corpus and all code are published under open licenses to enable reproduction, update, and maintenance.},
  keywords = {New Testament,Biblical Names,Textual Variation Units},
}

@proceedings{SemDH2024,
  booktitle = {Semantic Digital Humanities 2024},
  year = {2024},
  editor = {Oleksandra Bruns and Andrea Poltronieri and Lise Stork and Tabea Tietz},
  series = {CEUR Workshop Proceedings},
  address = {Aachen},
  issn = {1613-0073},
  url = {https://ceur-ws.org/Vol-3724/},
  venue = {Hersonissos, Greece},
  eventdate = {2024-05-27},
  title = {Proceedings of the First International Workshop of Semantic Digital Humanities (SemDH 2024)}
}

Versions of Generated Data on Zenodo

Version v1 from Mar 15, 2024
Version v2 from May 17, 2024
Version v3 from Jul 10, 2024
Version v4 from Jul 02, 2025
