Skip to content

Commit

Permalink
Merge pull request #492 from spsanderson/development
Browse files Browse the repository at this point in the history
Fixes #482
  • Loading branch information
spsanderson authored May 10, 2024
2 parents b589b9c + f82bbcb commit 9b4f72d
Show file tree
Hide file tree
Showing 154 changed files with 1,182 additions and 10 deletions.
3 changes: 3 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,9 @@ export(util_uniform_stats_tbl)
export(util_weibull_aic)
export(util_weibull_param_estimate)
export(util_weibull_stats_tbl)
export(util_zero_truncated_geometric_aic)
export(util_zero_truncated_geometric_param_estimate)
export(util_zero_truncated_geometric_stats_tbl)
export(util_zero_truncated_negative_binomial_aic)
export(util_zero_truncated_negative_binomial_param_estimate)
export(util_zero_truncated_negative_binomial_stats_tbl)
Expand Down
3 changes: 3 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,9 @@ the parameters of the zero-truncated Poisson distribution.
Add function `util_zero_truncated_poisson_aic()` to calculate the AIC for the zero-truncated Poisson distribution.
Add function `util_zero_truncated_poisson_stats_tbl()` to create a summary table of the zero-truncated Poisson distribution.
4. #472 - Add function `util_f_param_estimate()` and `util_f_aic()` to estimate the parameters and calculate the AIC for the F distribution.
5. #482 - Add function `util_zero_truncated_geometric_param_estimate()` to estimate the parameters of the zero-truncated geometric distribution.
Add function `util_zero_truncated_geometric_aic()` to calculate the AIC for the zero-truncated geometric distribution.
Add function `util_zero_truncated_geometric_stats_tbl()` to create a summary table of the zero-truncated geometric distribution.

## Minor Improvements and Fixes
1. Fix #468 - Update `util_negative_binomial_param_estimate()` to add the use of
Expand Down
121 changes: 121 additions & 0 deletions R/est-param-zt-geometric.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
#' Estimate Zero-Truncated Geometric Parameters
#'
#' @family Parameter Estimation
#' @family Zero-Truncated Geometric
#'
#' @details This function will attempt to estimate the `prob` parameter of the
#' Zero-Truncated Geometric distribution using given vector `.x` as input data.
#' If the parameter `.auto_gen_empirical` is set to `TRUE`, the empirical data
#' in `.x` will be run through the `tidy_empirical()` function and combined with
#' the estimated zero-truncated geometric data.
#'
#' @description This function will estimate the `prob` parameter for a
#' Zero-Truncated Geometric distribution from a given vector `.x`. The function
#' returns a list with a parameter table, and if `.auto_gen_empirical` is set
#' to `TRUE`, the empirical data is combined with the estimated distribution
#' data.
#'
#' @param .x The vector of data to be passed to the function. Must contain
#' non-negative integers and should have no zeros.
#' @param .auto_gen_empirical Boolean value (default `TRUE`) that, when set to
#' `TRUE`, will generate `tidy_empirical()` output for `.x` and combine it with
#' the estimated distribution data.
#'
#' @examples
#' library(actuar)
#' library(dplyr)
#' library(ggplot2)
#' library(actuar)
#'
#' set.seed(123)
#' ztg <- rztgeom(100, prob = 0.2)
#' output <- util_zero_truncated_geometric_param_estimate(ztg)
#'
#' output$parameter_tbl
#'
#' output$combined_data_tbl |>
#' tidy_combined_autoplot()
#'
#' @return
#' A tibble/list
#'
#' @name util_zero_truncated_geometric_param_estimate
NULL

#' @export
#' @rdname util_zero_truncated_geometric_param_estimate

util_zero_truncated_geometric_param_estimate <- function(.x, .auto_gen_empirical = TRUE) {

# Tidyeval ----
x_term <- as.numeric(.x)
n <- length(x_term)
minx <- min(as.numeric(x_term))
maxx <- max(as.numeric(x_term))
m <- mean(as.numeric(x_term))
s <- var(x_term)
sum_x <- sum(x_term)

# Checks ----
if (!is.vector(x_term, mode = "numeric")) {
rlang::abort(
message = "The '.x' term must be a numeric vector.",
use_cli_format = TRUE
)
}

if (!all(x_term == trunc(x_term)) || any(x_term <= 0)) {
rlang::abort(
message = "All values of 'x' must be positive non-zero integers.",
use_cli_format = TRUE
)
}

if (n < 2) {
rlang::abort(
message = "You must supply at least two data points for this function.",
use_cli_format = TRUE
)
}

# Estimate the prob parameter for the Zero-Truncated Geometric distribution
es_ztgeom_prob <- 1 / (1 + m - 1)

# Return Tibble ----
if (.auto_gen_empirical) {
te <- tidy_empirical(.x = x_term)
td <- tidy_zero_truncated_geometric(.n = n, .prob = round(es_ztgeom_prob, 3))
combined_tbl <- tidy_combine_distributions(te, td)
}

ret <- dplyr::tibble(
dist_type = "Zero-Truncated Geometric",
samp_size = n,
min = minx,
max = maxx,
mean = m,
variance = s,
sum_x = sum_x,
method = "Moment Estimate",
prob = es_ztgeom_prob
)

# Return ----
attr(ret, "tibble_type") <- "parameter_estimation"
attr(ret, "family") <- "zero_truncated_geometric"
attr(ret, "x_term") <- .x
attr(ret, "n") <- n

if (.auto_gen_empirical) {
output <- list(
combined_data_tbl = combined_tbl,
parameter_tbl = ret
)
} else {
output <- list(
parameter_tbl = ret
)
}

return(output)
}
86 changes: 86 additions & 0 deletions R/stats-zt-geometric-tbl.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
#' Distribution Statistics for Zero-Truncated Geometric
#'
#' @family Zero-Truncated Geometric
#' @family Distribution Statistics
#'
#' @details This function takes in a tibble generated by a `tidy_ztgeom`
#' distribution function and returns the relevant statistics for a Zero-Truncated
#' Geometric distribution. It requires data to be passed from a `tidy_ztgeom`
#' distribution function.
#'
#' @description Returns distribution statistics for Zero-Truncated Geometric
#' distribution in a tibble.
#'
#' @param .data The data being passed from a `tidy_ztgeom` distribution function.
#'
#' @examples
#' library(dplyr)
#'
#' set.seed(123)
#' tidy_zero_truncated_geometric(.prob = 0.1) |>
#' util_zero_truncated_geometric_stats_tbl() |>
#' glimpse()
#'
#' @return
#' A tibble
#'
#' @name util_zero_truncated_geometric_stats_tbl
NULL

#' @export
#' @rdname util_zero_truncated_geometric_stats_tbl

util_zero_truncated_geometric_stats_tbl <- function(.data) {

# Immediate check for tidy_ distribution function
if (!"tibble_type" %in% names(attributes(.data))) {
rlang::abort(
message = "You must pass data from a 'tidy_dist' function.",
use_cli_format = TRUE
)
}

if (attributes(.data)$tibble_type != "tidy_zero_truncated_geometric") {
rlang::abort(
message = "You must use 'tidy_zero_truncated_geometric()'",
use_cli_format = TRUE
)
}

# Extract attributes
data_tbl <- dplyr::as_tibble(.data)
atb <- attributes(data_tbl)
p <- atb$.prob

# Zero-Truncated Geometric Statistics Calculation
stat_mean <- (1 - p) / p + 1
stat_mode <- 1
stat_sd <- sqrt((1 - p) / (p * p))
stat_skewness <- (2 + p) / sqrt(1 - p)
stat_kurtosis <- 6 + (p * (p - 2)) / (1 - p)
stat_coef_var <- stat_sd / stat_mean

# Generate data table with calculated statistics
ret <- dplyr::tibble(
tidy_function = atb$tibble_type,
function_call = atb$dist_with_params,
distribution = dist_type_extractor(atb$tibble_type),
distribution_type = atb$distribution_family_type,
points = atb$.n,
simulations = atb$.num_sims,
mean = stat_mean,
mode = stat_mode,
range = "1 to Inf",
std_dv = stat_sd,
coeff_var = stat_coef_var,
skewness = stat_skewness,
kurtosis = stat_kurtosis,
computed_std_skew = tidy_skewness_vec(data_tbl$y),
computed_std_kurt = tidy_kurtosis_vec(data_tbl$y),
ci_lo = ci_lo(data_tbl$y),
ci_hi = ci_hi(data_tbl$y)
)

# Return the data table
return(ret)
}
75 changes: 75 additions & 0 deletions R/utils-aic-zt-geometric.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
#' Calculate Akaike Information Criterion (AIC) for Zero-Truncated Geometric Distribution
#'
#' This function calculates the Akaike Information Criterion (AIC) for a
#' Zero-Truncated Geometric distribution fitted to the provided data.
#'
#' @family Utility
#' @author Steven P. Sanderson II, MPH
#'
#' @description
#' This function estimates the probability parameter of a Zero-Truncated Geometric
#' distribution from the provided data and calculates the AIC value based on the
#' fitted distribution.
#'
#' @param .x A numeric vector containing the data to be fitted to a Zero-Truncated
#' Geometric distribution.
#'
#' @details
#' This function fits a Zero-Truncated Geometric distribution to the provided
#' data. It estimates the probability parameter using the method of moments and
#' calculates the AIC value.
#'
#' Optimization method: Since the parameter is directly calculated from the data,
#' no optimization is needed.
#'
#' Goodness-of-fit: While AIC is a useful metric for model comparison, it's
#' recommended to also assess the goodness-of-fit of the chosen model using
#' visualization and other statistical tests.
#'
#' @examples
#' library(actuar)
#'
#' # Example: Calculate AIC for a sample dataset
#' set.seed(123)
#' x <- rztgeom(100, prob = 0.2)
#' util_zero_truncated_geometric_aic(x)
#'
#' @return
#' The AIC value calculated based on the fitted Zero-Truncated Geometric
#' distribution to the provided data.
#'
#' @name util_zero_truncated_geometric_aic
NULL

#' @export
#' @rdname util_zero_truncated_geometric_aic
util_zero_truncated_geometric_aic <- function(.x) {
# Tidyeval
x <- as.numeric(.x)

# Check input validity
if (any(x <= 0)) {
rlang::abort(
message = "All values of '.x' must be positive non-zero integers.",
use_cli_format = TRUE
)
}

# Estimate the probability parameter for Zero-Truncated Geometric distribution
prob <- TidyDensity::util_zero_truncated_geometric_param_estimate(x)$parameter_tbl |>
dplyr::pull(prob)

# Calculate log-likelihood for Zero-Truncated Geometric distribution
k_ztgeom <- 1 # Number of parameters for the Zero-Truncated Geometric distribution (prob)
logLik_ztgeom <- sum(actuar::dztgeom(x, prob = prob, log = TRUE))

# Adjust for zero truncation (exclusion of zero)
logLik_adjustment <- -length(x) * log(1 - actuar::dztgeom(0, prob = prob))
logLik_ztgeom <- logLik_ztgeom + logLik_adjustment

# Calculate AIC
AIC_ztgeom <- 2 * k_ztgeom - 2 * logLik_ztgeom

# Return AIC
return(AIC_ztgeom)
}
1 change: 1 addition & 0 deletions docs/news/index.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ pkgdown: 2.0.9
pkgdown_sha: ~
articles:
getting-started: getting-started.html
last_built: 2024-05-09T15:29Z
last_built: 2024-05-10T19:05Z
urls:
reference: https://www.spsanderson.com/TidyDensity/reference
article: https://www.spsanderson.com/TidyDensity/articles
Expand Down
1 change: 1 addition & 0 deletions docs/reference/check_duplicate_rows.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/reference/convert_to_ts.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 15 additions & 0 deletions docs/reference/index.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/reference/quantile_normalize.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Binary file modified docs/reference/tidy_four_autoplot-1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions docs/reference/tidy_mcmc_sampling.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/reference/util_bernoulli_param_estimate.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit 9b4f72d

Please sign in to comment.