rep_slice_sample on groups with multiple n values #527

Hello package maintainers!
I am building confidence intervals for groups with bootstrapped values and I'm having trouble creating multiple re-sampled datasets from which to build my confidence intervals.

Using the palmerpenguins library as an example:

```
library(tidyverse)
library(infer)
library(palmerpenguins)
```
There are 344 total observations and each species has a different number of observations:

```
nrow(penguins)
# [1] 344

penguins %>% group_by(species) %>% count()

# A tibble: 3 × 2
# Groups:   species [3]
#   species       n
#   <fct>     <int>
# 1 Adelie      152
# 2 Chinstrap    68
# 3 Gentoo      124
```
I want to group by species and, for each species, draw multiple bootstrap samples, each with the same number of observations as that species has in the original data.
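To make the target concrete, here is a minimal sketch of the result I'm after, written with plain dplyr and purrr instead of infer. The helper `bootstrap_within_species()` is a hypothetical name used only for illustration, and this is just one way to get per-group resampling, not a claim about how `rep_slice_sample()` is meant to be used.

```
library(dplyr)
library(purrr)
library(palmerpenguins)

set.seed(100)

# Hypothetical helper (not part of infer): draw one bootstrap replicate
# in which each species is resampled to its own original size.
bootstrap_within_species <- function(replicate_id) {
  penguins %>%
    group_by(species) %>%
    # slice_sample() samples within each group of a grouped data frame
    slice_sample(prop = 1, replace = TRUE) %>%
    ungroup() %>%
    mutate(replicate = replicate_id)
}

# Stack 10 replicates into one data frame
desired <- map(1:10, bootstrap_within_species) %>%
  bind_rows()

# Each (species, replicate) pair keeps the original group size
desired %>% count(species, replicate)
```

In this sketch every Adelie replicate has 152 rows, every Chinstrap replicate 68, and every Gentoo replicate 124, which is the shape I expected from the grouped `rep_slice_sample()` call.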

```
set.seed(100)

slices <- penguins %>% 
    group_by(species) %>% 
    rep_slice_sample(prop = 1, replace = TRUE, reps = 10)
```
That should give me 344 * 10 = 3440 rows in the full new data set, and it does. But when you look at the data, the number of observations per species varies from replicate to replicate. In every replicate, n should be 152 for Adelie, 68 for Chinstrap, and 124 for Gentoo. Instead we find this:

```
slices %>% group_by(species, replicate) %>% count()

# A tibble: 30 × 3
# Groups:   species, replicate [30]
#   species replicate     n
#   <fct>       <int> <int>
# 1 Adelie          1   148
# 2 Adelie          2   147
# 3 Adelie          3   148
# 4 Adelie          4   151
# 5 Adelie          5   138
# 6 Adelie          6   157
# 7 Adelie          7   161
# 8 Adelie          8   157
# 9 Adelie          9   151
#10 Adelie         10   138
# ℹ 20 more rows
# ℹ Use `print(n = ...)` to see more rows
```
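A possible diagnostic, sketched under the assumption that the data frame is `penguins` from palmerpenguins: count the rows per replicate while ignoring species. If every replicate has the same total while the per-species counts drift, that would suggest the sampling is done across the whole table and not within each species. I have not confirmed that this is what infer does internally.

```
library(dplyr)
library(infer)
library(palmerpenguins)

set.seed(100)

# Same grouped call as above
slices <- penguins %>%
  group_by(species) %>%
  rep_slice_sample(prop = 1, replace = TRUE, reps = 10)

# Rows per replicate, ignoring species
slices %>%
  ungroup() %>%
  count(replicate)
```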
What am I missing?
Thanks for your insight.

