3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -14,7 +14,8 @@ Depends: R (>= 2.10)
Imports:
assertthat,
purrr,
stringr
stringr,
dplyr
Suggests:
charlatan,
testthat (>= 2.0.0),
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -1,6 +1,8 @@
# Generated by roxygen2: do not edit by hand

S3method(ensalt,data.frame)
export(available_shakers)
export(ensalt)
export(inspect_shaker)
export(replacement_shaker)
export(salt_capitalization)
@@ -18,3 +20,5 @@ export(salt_substitute)
export(salt_swap)
export(salt_whitespace)
export(shaker)
importFrom(dplyr,mutate_at)
importFrom(dplyr,vars)
25 changes: 25 additions & 0 deletions R/df.R
@@ -0,0 +1,25 @@
#' Put salt in a data.frame
#'
#' @param x The data.frame to ensalt
#' @param ... The columns to ensalt. Can be combined with tidyselect helpers.
#' @param salt The salt function to apply to each selected column
#'
#' @return A data.frame with some salt in it
#' @export
#' @rdname ensalt
#' @importFrom dplyr mutate_at vars
#'
#' @examples
#' ensalt(iris, Sepal.Length, Sepal.Width, salt = salt_na)
#' ensalt(iris, contains("Sepal"), salt = salt_na)


ensalt <- function(x, ..., salt = salt_na) {
  UseMethod("ensalt")
}

#' @export
#' @rdname ensalt
ensalt.data.frame <- function(x, ..., salt = salt_na) {
  mutate_at(x, .vars = vars(...), .funs = salt)
}
22 changes: 22 additions & 0 deletions README.Rmd
@@ -86,6 +86,28 @@ salt_empty(sample_names, p = 0.5)
salt_na(sample_names, p = 0.5)
```

## Modify a data.frame

`ensalt()` adds some randomness to the columns of a data.frame.

This function takes:

+ A data.frame
+ The columns to salt, passed in `...`
+ A salt function in the `salt` argument. The default is `salt_na`.

```{r}
small_iris <- head(iris, 10)
ensalt(small_iris, Sepal.Length, Sepal.Width, salt = salt_na)
```

It supports tidyselect semantics, so you can select columns with helpers:

```{r}
library(dplyr)
ensalt(small_iris, contains("Sepal"), salt = salt_na)
```


## Advanced usage

For more fine-grained control over the salting process, and for access to a wider range of salting types, you can use the underlying functions provided for: inserting, substituting, replacing.
115 changes: 83 additions & 32 deletions README.md
@@ -118,6 +118,55 @@ salt_na(sample_names, p = 0.5)
#> [10] NA
```

## Modify a data.frame

`ensalt()` adds some randomness to the columns of a data.frame.

This function takes:

  - A data.frame
  - The columns to salt, passed in `...`
  - A salt function in the `salt` argument. The default is `salt_na`.

``` r
small_iris <- head(iris, 10)
ensalt(small_iris, Sepal.Length, Sepal.Width, salt = salt_na)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 NA 3.5 1.4 0.2 setosa
#> 2 4.9 NA 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5.0 3.4 1.5 0.2 setosa
#> 9 NA NA 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
```

It supports tidyselect semantics, so you can select columns with helpers:

``` r
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
ensalt(small_iris, contains("Sepal"), salt = salt_na)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 NA 3.5 1.4 0.2 setosa
#> 2 NA NA 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 NA 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5.0 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
```
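
The data.frame method in this change forwards to `dplyr::mutate_at()` with `vars(...)`. The sketch below illustrates that mechanism in isolation, without requiring the package itself. `toy_salt` is a hypothetical stand-in for a salt function such as `salt_na`; it is not part of the package, and its `p` handling is an assumption made for illustration only.

```r
library(dplyr)

# Hypothetical stand-in for a salt function such as salt_na():
# replaces a proportion p of the values in x with NA.
toy_salt <- function(x, p = 0.2) {
  n_salted <- round(length(x) * p)
  x[sample(length(x), size = n_salted)] <- NA
  x
}

small_iris <- head(iris, 10)

# Same call shape as the body of ensalt.data.frame():
# vars() captures the column selection, .funs is applied to each selected column.
salted <- mutate_at(small_iris, .vars = vars(contains("Sepal")), .funs = toy_salt)

# Count the NA values per column; only the selected columns can contain any.
colSums(is.na(salted))
```

Because the salt function is applied to each selected column independently, the NA positions generally differ from one column to the next, which matches the rendered output above.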

## Advanced usage

For more fine-grained control over the salting process, and for access
@@ -144,17 +193,17 @@ ones, while `salt_substitute` overwrites those characters.
``` r
# Use p to specify the percent of values that you would like to salt
salt_insert(sample_names, shaker$punctuation, p = 0.5)
#> [1] "Ed\"win Kassulke" "B^arron Fadel"
#> [3] "Dorla Morissette" "Manuela Mante MD"
#> [5] "Ferris Kautzer" "Djuana Hyatt"
#> [7] "Dr. Leighton Ryan" "Ms.( Migdalia Smitham"
#> [9] "Ottil.ia Hermann" "Benj$iman Dach"
#> [1] "Edwin Kassu&lke" "Barron) Fadel"
#> [3] "Dorla M;orissette" "Manuela Mante MD"
#> [5] "Fer&ris Kautzer" "Djuana Hyatt"
#> [7] "Dr. Leighton Ryan" "Ms. Migd*alia Smitham"
#> [9] "Ottilia Hermann" "Benjiman Dach"

# Use n to specify how many new insertions/substitutions you want to make to selected values
salt_substitute(sample_names, shaker$punctuation, p = 0.5, n = 3)
#> [1] "Edwin Kassulke" "Barron Fadel" "D/rla Mo.issette."
#> [4] "Manuela Mante MD" "Ferris %a^t*er" "Dju,na^Hyatt'"
#> [7] "Dr. Leighto\" *(an" "Ms. Migdalia Smitham" "O%tili^ Hermann@"
#> [1] "Edwin Kassulke" "Barron Fadel" "D'r/a Morisse*te"
#> [4] "Manuela Mante MD" "Ferris Kautzer" "Djuan@ Hyatt\"("
#> [7] "Dr' Le%^hton Ryan" "Ms. M,gda^ia Smitham" "Ottilia H!r^an*"
#> [10] "Benjiman Dach"
```

@@ -164,22 +213,23 @@ like.

``` r
salt_insert(sample_names, shaker$mixed_letters, p = 0.5)
#> [1] "Edwin Kassulke" "Barron FLadel" "Dorla Morissette"
#> [4] "Manuela MantIe MD" "Ferris Kautzer" "Djuana Hyatt"
#> [7] "DrU. Leighton Ryan" "Ms. Migdalia Smitham" "Ottilia Hermannn"
#> [10] "Benjiman DachM"
#> [1] "Edwin Kassulke" "Barron Fadel"
#> [3] "Dorla MTorissette" "Manuela Mante MD"
#> [5] "Ferris Kautfzer" "Djuana Hyatt"
#> [7] "gDr. Leighton Ryan" "Ms. Migdalia SSmitham"
#> [9] "Ottilia HerBmann" "Benjiman Dach"

salt_insert(sample_numbers, shaker$digits, p = 0.5)
#> [1] "1.328059745613008" "0.667415054241444" "1.69175496457426"
#> [4] "0.001261408793618831" "-0.7424613118147763" "0.6096844205304159"
#> [7] "-20.989606379077806" "-0.0348483349098612" "0.847159905848433"
#> [10] "1.52549800647527"
#> [1] "1.28059745613008" "0.6674150254241444" "1.691775496457426"
#> [4] "0.00126140879361831" "-0.742461311814763" "0.609684420504159"
#> [7] "-0.9896406379077806" "-0.03484833490988612" "0.847159905848433"
#> [10] "1.525498006472527"

salt_insert(sample_names, c("foo", "bar", "baz"), p = 0.5)
#> [1] "Edwin Kassulke" "Barron Fadel" "Dorla Morissette"
#> [4] "Manuela Mantebaz MD" "Ferrfoois Kautzer" "Djuanabar Hyatt"
#> [7] "Dr. Leighton Ryan" "Ms. Migdalia Smitham" "Ottbazilia Hermann"
#> [10] "Benjiman Dacbarh"
#> [1] "Edwin Kassulke" "Barron Fafoodel" "Dorla Morissette"
#> [4] "Manuela Mante MD" "Ferris Kbazautzer" "Djuanabaz Hyatt"
#> [7] "bazDr. Leighton Ryan" "Ms. Migdalia Smitham" "Ottilia fooHermann"
#> [10] "Benjiman Dach"
```

`salt_replace` is a bit more targeted: it works with pairs of patterns
@@ -197,27 +247,28 @@ salt_replace(sample_names, replacement_shaker$ocr_errors, p = 1, rep_p = 1)
#> [9] "Ottilia Hermann" "Benjiman Daclh"

salt_replace(sample_names, replacement_shaker$capitalization, p = 0.5, rep_p = 0.2)
#> [1] "Edwin KassUlKe" "bARRon FaDeL" "Dorla Morissette"
#> [4] "MAnuelA MAnTe MD" "fErris KautZer" "Djuana Hyatt"
#> [7] "Dr. Leighton Ryan" "Ms. Migdalia Smitham" "Ottilia Hermann"
#> [10] "Benjiman Dach"
#> [1] "Edwin Kassulke" "BaRrOn fadel" "Dorla Morissette"
#> [4] "Manuela Mante MD" "Ferris Kautzer" "DjuanA hYAtt"
#> [7] "dr. leIghton Ryan" "Ms. Migdalia Smitham" "OttIlia Hermann"
#> [10] "BenJiMaN dach"

salt_replace(sample_numbers, replacement_shaker$decimal_commas, p = 0.5, rep_p = 1)
#> [1] "1,28059745613008" "0.667415054241444" "1.69175496457426"
#> [4] "0.00126140879361831" "-0,742461311814763" "0,609684420504159"
#> [7] "-0,989606379077806" "-0.0348483349098612" "0.847159905848433"
#> [10] "1,52549800647527"
#> [1] "1,28059745613008" "0,667415054241444" "1,69175496457426"
#> [4] "0.00126140879361831" "-0.742461311814763" "0,609684420504159"
#> [7] "-0.989606379077806" "-0.0348483349098612" "0,847159905848433"
#> [10] "1.52549800647527"
```

You may also specify your own arbitrary character vector of possible
insertions.

``` r
salt_insert(sample_names, insertions = c("X", "Z"))
#> [1] "Edwin Kassulke" "Barron FadZel" "Dorla Morissette"
#> [4] "Manuela Mante MD" "Ferris Kautzer" "Djuana HyatXt"
#> [7] "Dr. Leighton Ryan" "Ms. Migdalia Smitham" "Ottilia Hermann"
#> [10] "Benjiman Dach"
#> [1] "Edwin Kassulke" "Barron Fadel"
#> [3] "Dorla MorissettZe" "Manuela Mante MD"
#> [5] "Ferris Kautzer" "Djuana Hyatt"
#> [7] "Dr. Leighton Ryan" "Ms. Migdalia SmithamZ"
#> [9] "Ottilia Hermann" "Benjiman Dach"
```

## Possible future work
28 changes: 28 additions & 0 deletions man/ensalt.Rd
