Commit f2063fc

Add plot example in README

Parent: 2fc69d0

File tree: 3 files changed (+22, -6 lines)

NEWS.md (2 additions, 1 deletion)

@@ -1,6 +1,7 @@
 ### CHANGES IN BTM VERSION 0.3.6

-- Remove LazyData from DESCRIPTION
+- Remove LazyData from DESCRIPTION as there is no data to be lazy about
+- Add plot example in README

 ### CHANGES IN BTM VERSION 0.3.5

README.md (20 additions, 5 deletions)

@@ -23,6 +23,9 @@ More detail can be referred to the following paper:
 > https://github.com/xiaohuiyan/xiaohuiyan.github.io/blob/master/paper/BTM-WWW13.pdf


+![](tools/biterm-topic-model-example.png)
+
+
 ### Example

 ```
@@ -93,11 +96,23 @@ scores <- predict(model, newdata = x)
 # The first topic is set to a background topic that equals to the empirical word distribution.
 # This can be used to filter out common words.
 set.seed(321)
-model <- BTM(x, k = 5, beta = 0.01, background = TRUE, iter = 1000, trace = 100)
+model <- BTM(x, k = 5, beta = 0.01, background = TRUE, iter = 1000, trace = 100)
 topicterms <- terms(model, top_n = 5)
 topicterms
 ```

+### Visualisation of your model
+
+- Can be done using the textplot package (https://github.com/bnosac/textplot), which can be found at CRAN as well (https://cran.r-project.org/package=textplot)
+- An example visualisation built on a model of all R packages from the Natural Language Processing and Machine Learning task views is shown above (see also https://www.bnosac.be/index.php/blog/98-biterm-topic-modelling-for-short-texts)
+
+```
+library(textplot)
+library(ggraph)
+library(concaveman)
+plot(model)
+```
+
 ### Provide your own set of biterms

 An interesting use case of this package is to
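
A note on the visualisation snippet added in the hunk above: once textplot is loaded, plot() on a BTM model dispatches to its BTM plot method. A minimal end-to-end sketch of that workflow, assuming the packages install cleanly; the toy tokens below are invented for illustration and are not from the repository:

```
## Minimal sketch of the plotting workflow added in this commit.
## The toy doc_id/lemma tokens below are invented; any small
## tokenised data frame with these two columns works.
library(BTM)
library(textplot)
library(ggraph)
library(concaveman)

x <- data.frame(doc_id = rep(c("doc_1", "doc_2", "doc_3"), each = 4),
                lemma  = c("topic", "model", "biterm", "text",
                           "word", "topic", "cluster", "model",
                           "plot", "graph", "topic", "word"),
                stringsAsFactors = FALSE)

set.seed(321)
model <- BTM(x, k = 2, beta = 0.01, iter = 100)
plot(model)   # dispatches to textplot's plot method for BTM models
```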
@@ -127,8 +142,8 @@ biterms <- biterms[, cooccurrence(x = lemma,

 ## Build the model
 set.seed(123456)
-x <- subset(anno, upos %in% c("NOUN", "PROPN", "ADJ"))
-x <- x[, c("doc_id", "lemma")]
+x <- subset(anno, upos %in% c("NOUN", "PROPN", "ADJ"))
+x <- x[, c("doc_id", "lemma")]
 model <- BTM(x, k = 5, beta = 0.01, iter = 2000, background = TRUE,
              biterms = biterms, trace = 100)
 topicterms <- terms(model, top_n = 5)
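
For context on the hunk above: the biterms object handed to BTM() is, in the README's udpipe-based example, the output of cooccurrence() grouped by document, i.e. a data frame with columns doc_id, term1, term2 and cooc. A hand-built sketch of that shape (the rows are invented for illustration):

```
## Hand-built biterms in the shape produced by udpipe::cooccurrence()
## when grouped by doc_id: one row per within-document word pair.
## The rows below are invented for illustration only.
biterms <- data.frame(doc_id = c("doc_1", "doc_1", "doc_2"),
                      term1  = c("topic", "topic", "graph"),
                      term2  = c("model", "biterm", "plot"),
                      cooc   = c(2L, 1L, 1L),
                      stringsAsFactors = FALSE)
```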
@@ -166,8 +181,8 @@ biterms <- subset(biterms, !term1 %in% exclude & !term2 %in% exclude)

 ## Put in x only terms which were used in the biterms object such that frequency stats of terms can be computed in BTM
 anno <- anno[, keep := relevant | (token_id %in% head_token_id[relevant == TRUE]), by = list(doc_id, paragraph_id, sentence_id)]
-x <- subset(anno, keep == TRUE, select = c("doc_id", "lemma"))
-x <- subset(x, !lemma %in% exclude)
+x <- subset(anno, keep == TRUE, select = c("doc_id", "lemma"))
+x <- subset(x, !lemma %in% exclude)

 ## Build the topic model
 model <- BTM(data = x,
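
The comment in this last hunk is worth unpacking: BTM computes term frequency statistics from x, so x should be restricted to the vocabulary that actually occurs in the supplied biterms. A compact sketch of that consistency check, assuming x and biterms are shaped as in the README examples above:

```
## Sanity check: every term used in the biterms should also be present
## in x, otherwise BTM's frequency statistics and the biterms disagree.
## x and biterms are assumed shaped as in the README examples.
vocab_x       <- unique(x$lemma)
vocab_biterms <- unique(c(biterms$term1, biterms$term2))
stopifnot(all(vocab_biterms %in% vocab_x))
```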
tools/biterm-topic-model-example.png (85.7 KB, binary image added)
