cLCTM

Latent concept topic model (LCTM), but with contextualized word embeddings, implemented in Python. Token embeddings are learned with Transformers, and the Gibbs sampler is accelerated with numba (a pure-Python Gibbs sampler is also included, but it is slow). Faiss is used to speed up inference and initialization.
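The sketch below illustrates two of the steps described above: extracting contextualized token vectors with a Transformer, and using a Faiss index for fast nearest-concept lookups. It is not this package's API; the model name, the number of concepts, and the k-means initialization are illustrative assumptions only.

```python
# Minimal sketch, not cLCTM code: contextual embeddings + Faiss concept lookup.
import numpy as np
import torch
import faiss
from transformers import AutoTokenizer, AutoModel

# Assumed model; any encoder producing per-token hidden states would do.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

docs = [
    "topic models group words into latent concepts",
    "contextual embeddings give each token occurrence its own vector",
]

# 1. Contextualized token embeddings: one vector per token occurrence.
with torch.no_grad():
    batch = tokenizer(docs, padding=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state            # (batch, seq_len, dim)
mask = batch["attention_mask"].bool()
token_vecs = hidden[mask].numpy().astype("float32")      # (n_tokens, dim)

# 2. Initialize concept vectors (here: k-means centroids, an assumption)
#    and index them with Faiss for fast nearest-neighbour queries.
n_concepts = 4                                           # illustrative value
kmeans = faiss.Kmeans(token_vecs.shape[1], n_concepts, niter=20, seed=0)
kmeans.train(token_vecs)

index = faiss.IndexFlatL2(token_vecs.shape[1])
index.add(kmeans.centroids)

# 3. Nearest-concept lookup: the kind of query Faiss accelerates during
#    initialization and inference.
_, concept_ids = index.search(token_vecs, 1)
print(concept_ids.ravel())                               # one concept id per token
```

In the actual model, the Gibbs sampler would then resample token-to-concept and concept-to-topic assignments; the Faiss lookup above only shows how candidate concepts can be retrieved quickly from the embedding space.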

This is a very bare-bones implementation.

To do:

  • More details on how it works
  • Functions to retrieve the top tokens per topic and the most similar words/concepts
  • pyLDAvis integration
  • ...
