Implement dynamic topic model instead of regular LDA #7

Closed
juanrloaiza opened this issue May 18, 2022 · 2 comments

Comments

@juanrloaiza
Owner

No description provided.

@juanrloaiza
Owner Author

Gensim has its own implementation of DTM (LdaSeqModel), but it is extremely slow. This has been reported as an issue, and no fix has been found yet. Even changing LdaSeqModel to use LdaMulticore does not help.

There is a pull request that improves this implementation, but it hasn't been merged yet.

For now it is therefore better to keep using the old DTM wrapper from Gensim 3.8.3, which calls the pre-compiled binary from Blei-lab. This requires two files:

  • dtm-linux64 or dtm-darwin64: Blei-lab's C implementation of DTM, pre-compiled. Which one you need depends on the system (it can also be compiled from source).
  • dtmmodel.py: the DTM wrapper from Gensim 3.8.3 (the wrapper was removed in Gensim 4.0 and later).

The Gensim 3.8.3 wrapper is included in notebooks/utils, but the binary must be downloaded for each OS.
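
Since the binary differs per OS, the `dtm_path` passed to the wrapper could be chosen at runtime instead of hard-coded. A minimal sketch, assuming the binaries keep the names listed above and sit in a `utils` directory; `dtm_binary_path` is a hypothetical helper, not something in the repo:

```python
import platform
from pathlib import Path


def dtm_binary_path(utils_dir="utils", system=None):
    """Return the path to the DTM binary for the given (or current) OS.

    Hypothetical helper: assumes the binaries are named dtm-linux64 and
    dtm-darwin64 and live in `utils_dir`.
    """
    # platform.system() returns "Linux" on Linux and "Darwin" on macOS.
    system = system or platform.system()
    names = {"Linux": "dtm-linux64", "Darwin": "dtm-darwin64"}
    if system not in names:
        raise RuntimeError(f"No pre-compiled DTM binary assumed for {system!r}")
    return Path(utils_dir) / names[system]
```

The result would then be passed as `dtm_path=str(dtm_binary_path())` when building the DtmModel.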

Finally, I commented out the LdaSeqModel code so it can be restored if the PR above gets merged soon.

"""
self.ldaseq = LdaSeqModel(
    corpus=corpus_bows,
    time_slice=self.corpus_obj.get_time_slices(time_window=time_window),
    num_topics=self.num_topics,
    id2word=self.id2word,
    passes=15,
    initialize="ldamodel",
    lda_model=self.lda,
    random_state=seed,
)"""
self.ldaseq = DtmModel(
    dtm_path="utils/dtm-linux64",
    corpus=corpus_bows,
    time_slices=self.corpus_obj.get_time_slices(time_window=time_window),
    num_topics=self.num_topics,
    id2word=self.id2word,
    rng_seed=0,
)
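
Both models take the time slices as a list of document counts per period, with the corpus sorted chronologically so the counts line up with the documents. The code for `get_time_slices` is not shown here, so the following is only a sketch of what such a computation might look like, assuming each document has a publication year; `time_slices_from_years` is a hypothetical name, and dropping empty windows is my own assumption:

```python
from collections import Counter


def time_slices_from_years(years, time_window=1):
    """Count documents per consecutive window of `time_window` years.

    `years` must be sorted ascending and in the same order as the corpus.
    Windows that contain no documents are dropped (an assumption of this
    sketch, not necessarily what the project does).
    """
    years = list(years)
    if not years:
        return []
    if years != sorted(years):
        raise ValueError("documents must be in chronological order")
    start = years[0]
    # Map each year to the index of the window it falls into.
    counts = Counter((year - start) // time_window for year in years)
    return [counts[index] for index in sorted(counts)]


slices = time_slices_from_years([2000, 2000, 2001, 2003, 2004], time_window=2)
# The slices must add up to the number of documents in the corpus.
assert sum(slices) == 5
```

Checking that the slices sum to the corpus size before training is cheap and catches ordering or windowing mistakes early.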

@juanrloaiza
Owner Author

We used Blei's binary. Closing.
