
centre-for-humanities-computing/chr-book-ads


Reading Beyond the Center

This repository contains code for embeddings, plots and results of our paper:

"Reading Beyond the Center. Modeling Book Encounters in the Danish Periphery (1800-1850)" which will be presented at CHR2025.

Useful directions 📌

Some useful directions:

  • /src/ contains scripts to create embeddings
  • /figures/ contains the figures generated by the notebooks
  • /notebooks/ contains the notebooks used for the analysis

Data & paper

The dataset used in this paper is available on Hugging Face. It is an earlier version, and a subset, of this dataset.

The trained embeddings are also available on Hugging Face.

Please cite our [paper](link coming soon) if you use the code, dataset or embeddings.

Project Organization 🏗️

├── LICENSE                    <- Open-source license if one is chosen.
│
├── README.md                  <- The top-level README for developers using this project.
│
├── src/
│   │
│   ├── process_articles.py    <- Code to get embeddings from newspaper article chunks.
│   ├── mean_pooling.py        <- Code to get average embeddings from newspaper articles.
│   └── merge_text_embs.py     <- Merge texts and embeddings.
│
│
├── data/                      <- Data used for the analysis in notebooks.
│
├── prompt_optimization/       <- Data related to the prompt optimization task with GPT.
│
│
├── notebooks/                 <- Jupyter notebooks.
│   │
│   ├── classify_articles.ipynb                <- Notebook to classify article types.
│   ├── explore_and_find_book_ads.ipynb        <- Notebook to get descriptive statistics and create a subset of book advertisements.
│   ├── create_gold_book_announcements.ipynb   <- Notebook to create gold standard book announcements.
│   ├── classify_book_announcements.ipynb      <- Notebook to classify book announcements.
│   ├── api_gpt.ipynb                          <- Notebook to annotate book titles with GPT.
│   └── analyse_titles.ipynb                   <- Notebook to analyse book titles and do statistical tests.
│
└── figures/                   <- Generated graphics and figures used in the paper.
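
Mean pooling, in brief

The tree above lists a script that averages chunk embeddings into article-level embeddings. As a rough illustration of that idea only (this is not the code in `src/mean_pooling.py`; the function name `mean_pool`, the `(article_id, vector)` input format and the toy data are all assumptions made for this sketch), mean pooling could look like this in plain Python:

```python
from collections import defaultdict


def mean_pool(chunk_embeddings):
    """Average chunk-level vectors into one vector per article.

    chunk_embeddings: iterable of (article_id, vector) pairs, where each
    vector is a list of floats. This input format is an assumption for
    the sketch, not the repository's actual data layout.
    """
    sums = {}
    counts = defaultdict(int)
    for article_id, vector in chunk_embeddings:
        if article_id not in sums:
            # First chunk seen for this article: start a running sum.
            sums[article_id] = [0.0] * len(vector)
        elif len(vector) != len(sums[article_id]):
            raise ValueError(f"dimension mismatch for article {article_id!r}")
        for i, value in enumerate(vector):
            sums[article_id][i] += value
        counts[article_id] += 1
    # Divide each running sum by the number of chunks for that article.
    return {
        article_id: [total / counts[article_id] for total in totals]
        for article_id, totals in sums.items()
    }


# Hypothetical toy data: two chunks for article "a1", one for "a2".
chunks = [
    ("a1", [1.0, 2.0]),
    ("a1", [3.0, 4.0]),
    ("a2", [0.5, 0.5]),
]
article_embeddings = mean_pool(chunks)
```

Here `article_embeddings["a1"]` is the element-wise mean of the two `a1` chunks. With real embeddings, a vectorised library would likely be a better fit than nested Python loops.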
