Open
Description
Looking into ldaseqmodel.py, I see that the chunksize passed to LdaSeqModel is not forwarded to the LdaModel it trains for initialization:
"if corpus is not None and time_slice is not None:
    self.max_doc_len = max(len(line) for line in corpus)
    if initialize == 'gensim':
        lda_model = ldamodel.LdaModel(
            corpus, id2word=self.id2word, num_topics=self.num_topics,
            passes=passes, alpha=self.alphas, random_state=random_state,
            dtype=np.float64
        )"
This may produce suboptimal topics, because LdaModel then falls back to its default chunksize of 2000, which can be too small for corpora with many documents.
Could this be fixed in the next release?
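In the meantime, a possible workaround is to train the initial LdaModel separately with the desired chunksize and hand it to LdaSeqModel through initialize='ldamodel' and lda_model. This is only a sketch: the helper name fit_ldaseq_with_chunksize and its arguments are my own hypothetical names, and it assumes that the 'ldamodel' initialization path takes its sufficient statistics from the supplied model, as it appears to in ldaseqmodel.py.

```python
def fit_ldaseq_with_chunksize(corpus, id2word, time_slice, num_topics,
                              chunksize, passes=10, alphas=0.01,
                              random_state=None):
    """Hypothetical helper (not part of gensim): pre-train LdaModel with an
    explicit chunksize, then use it to initialize LdaSeqModel."""
    # Imported inside the function so the helper can be defined
    # without importing gensim up front.
    import numpy as np
    from gensim.models import LdaModel, LdaSeqModel

    # Train the initial LDA model ourselves so that chunksize is honoured.
    # The other arguments mirror the call quoted from ldaseqmodel.py.
    lda = LdaModel(
        corpus, id2word=id2word, num_topics=num_topics,
        passes=passes, alpha=np.full(num_topics, alphas),
        random_state=random_state, chunksize=chunksize,
        dtype=np.float64,
    )

    # Assumption: with initialize='ldamodel', LdaSeqModel skips its own
    # LdaModel training and starts from the model passed as lda_model.
    return LdaSeqModel(
        corpus=corpus, time_slice=time_slice, id2word=id2word,
        alphas=alphas, num_topics=num_topics,
        initialize='ldamodel', lda_model=lda,
        passes=passes, random_state=random_state, chunksize=chunksize,
    )
```

Calling fit_ldaseq_with_chunksize(corpus, dictionary, time_slice, num_topics=10, chunksize=20000) would then use the larger chunksize for the initial LDA fit, where corpus, dictionary and time_slice are whatever is already being passed to LdaSeqModel.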
Great package, thanks so much for sharing it and all of the work that has gone into it.