Commit c166deb
Updated readme
1 parent ad4ecb0 commit c166deb

1 file changed: +7 -19 lines changed

README.md

Lines changed: 7 additions & 19 deletions
@@ -40,35 +40,23 @@ model.print_topics()
 
 | Topic ID | Highest Ranking |
 | - | - |
-| 0 | atheists, atheism, atheist, belief, beliefs, theists, faith, gods, christians, abortion |
-| 1 | alt atheism, usenet alt atheism resources, usenet alt atheism introduction, alt atheism faq, newsgroup alt atheism, atheism faq resource txt, alt atheism groups, atheism, atheism faq intro txt, atheist resources |
-| 2 | religion, christianity, faith, beliefs, religions, christian, belief, science, cult, justification |
+| | ... |
 | 3 | fanaticism, theism, fanatism, all fanatism, theists, strong theism, strong atheism, fanatics, precisely some theists, all theism |
 | 4 | religion foundation darwin fish bumper stickers, darwin fish, atheism, 3d plastic fish, fish symbol, atheist books, atheist organizations, negative atheism, positive atheism, atheism index |
 | | ... |
 
-Turftopic now also comes with a Chinese vectorizer for easier use.
+Turftopic now also comes with a **Chinese vectorizer** for easier use, as well as a generalist **multilingual vectorizer**.
 
 ```python
-from turftopic import KeyNMF
 from turftopic.vectorizers.chinese import default_chinese_vectorizer
+from turftopic.vectorizers.spacy import TokenCountVectorizer
 
-model = KeyNMF(
-    n_components=10,
-    vectorizer=default_chinese_vectorizer(),
-    encoder="BAAI/bge-small-zh-v1.5"
-)
-model.fit(corpus)
+chinese_vectorizer = default_chinese_vectorizer()
+arabic_vectorizer = TokenCountVectorizer("ar", remove_stopwords=True)
+danish_vectorizer = TokenCountVectorizer("da", remove_stopwords=True)
+...
 
-model.print_topics()
 ```
-| Topic ID | Highest Ranking |
-| - | - |
-| 0 | 消息, 时间, 科技, 媒体报道, 美国, 据, 国外, 讯, 宣布, 称 |
-| 1 | 体育讯, 新浪, 球员, 球队, 赛季, 火箭, nba, 已经, 主场, 时间 |
-| 2 | 记者, 本报讯, 昨日, 获悉, 新华网, 基金, 通讯员, 采访, 男子, 昨天 |
-| 3 | 股, 下跌, 上涨, 震荡, 板块, 大盘, 股指, 涨幅, 沪, 反弹 |
-| | ... |
 
 
 ## Basics [(Documentation)](https://x-tabdeveloping.github.io/turftopic/)