Commit 60999a6

Merge pull request #271 from tidymodels/more-snapshots
More snapshots
2 parents c7f2a31 + 2ee21c2 commit 60999a6

49 files changed (+322, -126 lines changed)

DESCRIPTION

Lines changed: 3 additions & 1 deletion
@@ -20,7 +20,7 @@ URL: https://github.com/tidymodels/textrecipes,
 BugReports: https://github.com/tidymodels/textrecipes/issues
 Depends:
     R (>= 3.6),
-    recipes (>= 1.0.7)
+    recipes (>= 1.1.0.9000)
 Imports:
     lifecycle,
     dplyr,
@@ -53,6 +53,8 @@ Suggests:
     tokenizers.bpe,
     udpipe,
     wordpiece
+Remotes:
+    tidymodels/recipes
 LinkingTo:
     cpp11
 VignetteBuilder:
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# Errors if vocabulary size is set to low.
+
+    Code
+      recipe(~text1, data = test_data) %>% step_tokenize_bpe(text1, vocabulary_size = 10) %>%
+        prep()
+    Condition
+      Error in `step_tokenize_bpe()`:
+      Caused by error in `prep()`:
+      ! `vocabulary_size` of 10 is too small for column `text1` which has a unique character count of 23
+
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# Errors if vocabulary size is set to low.
+
+    Code
+      recipe(~text, data = tibble(text = "hello")) %>% step_tokenize(text, engine = "tokenizers.bpe",
+        training_options = list(vocab_size = 2)) %>% prep()
+    Condition
+      Error in `step_tokenize()`:
+      Caused by error in `prep()`:
+      ! `vocabulary_size` of 2 is too small for column `text` which has a unique character count of 4
+

tests/testthat/_snaps/clean_levels.md

Lines changed: 8 additions & 0 deletions
@@ -1,3 +1,11 @@
+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = smith_tr[, -1])
+    Condition
+      Error in `step_clean_levels()`:
+      ! The following required column is missing from `new_data`: name.
+
 # empty printing

     Code

tests/testthat/_snaps/clean_names.md

Lines changed: 8 additions & 0 deletions
@@ -1,3 +1,11 @@
+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = mtcars[, -3])
+    Condition
+      Error in `step_clean_names()`:
+      ! The following required column is missing from `new_data`: disp.
+
 # empty printing

     Code

tests/testthat/_snaps/dummy_hash.md

Lines changed: 8 additions & 0 deletions
@@ -8,6 +8,14 @@
       ! Name collision occurred. The following variable names already exist:
       * `dummyhash_text_01`

+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = test_data[, -2])
+    Condition
+      Error in `step_dummy_hash()`:
+      ! The following required column is missing from `new_data`: sponsor_code.
+
 # empty printing

     Code

tests/testthat/_snaps/lda.md

Lines changed: 8 additions & 0 deletions
@@ -8,6 +8,14 @@
       ! Name collision occurred. The following variable names already exist:
       * `lda_text_1`

+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = tokenized_test_data[, -1])
+    Condition
+      Error in `step_lda()`:
+      ! The following required column is missing from `new_data`: medium.
+
 # empty printing

     Code

tests/testthat/_snaps/lemma.md

Lines changed: 8 additions & 0 deletions
@@ -7,6 +7,14 @@
       Caused by error in `bake()`:
       ! `text` doesn't have a lemma attribute. Make sure the tokenization step includes lemmatization.

+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = tokenized_test_data[, -1])
+    Condition
+      Error in `step_lemma()`:
+      ! The following required column is missing from `new_data`: text.
+
 # empty printing

     Code

tests/testthat/_snaps/ngram.md

Lines changed: 8 additions & 0 deletions
@@ -14,6 +14,14 @@
       Error:
       ! n must be a positive integer.

+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = tokenized_test_data[, -1])
+    Condition
+      Error in `step_ngram()`:
+      ! The following required column is missing from `new_data`: text.
+
 # empty printing

     Code

tests/testthat/_snaps/pos_filter.md

Lines changed: 8 additions & 0 deletions
@@ -7,6 +7,14 @@
       Caused by error in `bake()`:
       ! `text` doesn't have a pos attribute. Make sure the tokenization step includes part of speech tagging.

+# bake method errors when needed non-standard role columns are missing
+
+    Code
+      bake(trained, new_data = tokenized_test_data[, -1])
+    Condition
+      Error in `step_pos_filter()`:
+      ! The following required column is missing from `new_data`: text.
+
 # empty printing

     Code
