Commit 1629496

Updated readme
1 parent 9037ba0 commit 1629496

1 file changed: +20 / -49 lines

README.md

Lines changed: 20 additions & 49 deletions
@@ -5,74 +5,45 @@
 
 
 ## Features
-- Novel transformer-based topic models:
+- Implementations of transformer-based topic models:
 - Semantic Signal Separation - S³ 🧭
 - KeyNMF 🔑
-- GMM :gem: (paper soon)
-- Implementations of other transformer-based topic models
+- GMM :gem:
 - Clustering Topic Models: BERTopic and Top2Vec
 - Autoencoding Topic Models: CombinedTM and ZeroShotTM
 - FASTopic
+- Dynamic, Online and Hierarchical Topic Modeling
 - Streamlined scikit-learn compatible API 🛠️
 - Easy topic interpretation 🔍
-- Dynamic Topic Modeling 📈 (GMM, ClusteringTopicModel and KeyNMF)
+- Automated topic naming with LLMs
 - Visualization with [topicwizard](https://github.com/x-tabdeveloping/topicwizard) 🖌️
 
 > This package is still work in progress and scientific papers on some of the novel methods are currently undergoing peer-review. If you use this package and you encounter any problem, let us know by opening relevant issues.
 
-### New in version 0.7.0
+### New in version 0.8.0
 
-#### Component re-estimation, refitting and topic merging
+#### Automated Topic Naming
 
-Some models can now easily be modified after being trained in an efficient manner,
-without having to recompute all attributes from scratch.
-This is especially significant for clustering models and $S^3$.
+Turftopic now allows you to automatically assign human readable names to topics using LLMs or n-gram retrieval!
 
 ```python
-from turftopic import SemanticSignalSeparation, ClusteringTopicModel
-
-s3_model = SemanticSignalSeparation(5, feature_importance="combined").fit(corpus)
-# Re-estimating term importances
-s3_model.estimate_components(feature_importance="angular")
-# Refitting S^3 with a different number of topics (very fast)
-s3_model.refit(n_components=10, random_seed=42)
-
-clustering_model = ClusteringTopicModel().fit(corpus)
-# Reduces number of topics automatically with a given method
-clustering_model.reduce_topics(n_reduce_to=20, reduction_method="smallest")
-# Merge topics manually
-clustering_model.join_topics([0,3,4,5])
-# Resets original topics
-clustering_model.reset_topics()
-# Re-estimates term importances based on a different method
-clustering_model.estimate_components(feature_importance="centroid")
-```
-
-#### Manual topic naming
-
-You can now manually label topics in all models in Turftopic.
-
-```python
-# you can specify a dict mapping IDs to names
-model.rename_topics({0: "New name for topic 0", 5: "New name for topic 5"})
-# or a list of topic names
-model.rename_topics([f"Topic {i}" for i in range(10)])
-```
-
-#### Saving, loading and publishing to HF Hub
-
-You can now load, save and publish models with dedicated functionality.
-
-```python
-from turftopic import load_model
+from turftopic import KeyNMF
+from turftopic.namers import OpenAITopicNamer
 
-model.to_disk("out_folder/")
-model = load_model("out_folder/")
+model = KeyNMF(10).fit(corpus)
 
-model.push_to_hub("your_user/model_name")
-model = load_model("your_user/model_name")
+namer = OpenAITopicNamer("gpt-4o-mini")
+model.rename_topics(namer)
+model.print_topics()
 ```
 
+| Topic ID | Topic Name | Highest Ranking |
+| - | - | - |
+| 0 | Operating Systems and Software | windows, dos, os, ms, microsoft, unix, nt, memory, program, apps |
+| 1 | Atheism and Belief Systems | atheism, atheist, atheists, belief, religion, religious, theists, beliefs, believe, faith |
+| 2 | Computer Architecture and Performance | motherboard, ram, memory, cpu, bios, isa, speed, 486, bus, performance |
+| 3 | Storage Technologies | disk, drive, scsi, drives, disks, floppy, ide, dos, controller, boot |
+| | ... |
 
 ## Basics [(Documentation)](https://x-tabdeveloping.github.io/turftopic/)
 [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/x-tabdeveloping/turftopic/blob/main/examples/basic_example_20newsgroups.ipynb)
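
The added README text mentions naming topics by "n-gram retrieval" but shows only the LLM route. As a rough illustration of the general idea, the sketch below picks, for each topic, the corpus n-gram that best covers the topic's highest-ranking words. This is not Turftopic's implementation: the functions `extract_ngrams` and `name_topic`, the rank-based weighting, and the tie-breaking rule are all hypothetical and chosen only to keep the example dependency-free.

```python
import re
from collections import Counter


def extract_ngrams(corpus, n_range=(2, 3)):
    """Count all word n-grams (hypothetical helper, not part of Turftopic)."""
    counts = Counter()
    for doc in corpus:
        tokens = re.findall(r"[a-z0-9]+", doc.lower())
        for n in range(n_range[0], n_range[1] + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i : i + n])] += 1
    return counts


def name_topic(top_words, ngram_counts):
    """Return the n-gram that best covers a topic's top words.

    Assumption: earlier words in `top_words` are more important, so each
    word gets a weight that decreases with its rank.
    """
    if not ngram_counts:
        # Fall back to the top words themselves when no n-grams exist.
        return ", ".join(top_words[:3])
    weights = {word: len(top_words) - rank for rank, word in enumerate(top_words)}

    def sort_key(ngram):
        words = ngram.split()
        coverage = sum(weights.get(word, 0) for word in set(words))
        # Ties are broken by corpus frequency, then by preferring shorter n-grams.
        return (coverage, ngram_counts[ngram], -len(words))

    return max(ngram_counts, key=sort_key)


if __name__ == "__main__":
    corpus = [
        "The scsi disk drive failed to boot",
        "Atheists discuss religious belief",
    ]
    counts = extract_ngrams(corpus)
    print(name_topic(["disk", "drive", "scsi"], counts))
    print(name_topic(["belief", "religious", "atheists"], counts))
```

A retrieval-based namer in practice would more likely compare embeddings of candidate n-grams against a topic representation; the word-overlap score here is only a stand-in for that similarity.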
