Commit

auditor 0.2.0 as on CRAN
agosiewska committed May 11, 2018
1 parent 98b91a6 commit 4f7800b
Showing 123 changed files with 1,021 additions and 168 deletions.
9 changes: 4 additions & 5 deletions DESCRIPTION
@@ -1,14 +1,13 @@
Package: auditor
-Title: Model audit - verification, validation, and error analysis
+Title: Model Audit - Verification, Validation, and Error Analysis
Version: 0.2.0
Authors@R: c(
person("Alicja", "Gosiewska", , "[email protected]", role = c("aut", "cre")),
person("Przemyslaw", "Biecek", role = c("aut", "ths"))
)
-Description: The 'auditor' package provides an easy to use unified interface for creating validation
-plots for any model. This visualizations allow to asses and compare the goodness of fit, performance,
-and similarity of models. The auditor help statisticians, data scientists, and researchers can avoid
-repetitive work consisting of writing code needed to create residuals plots.
+Description: Provides an easy to use unified interface for creating validation plots for any model.
+The 'auditor' helps to avoid repetitive work consisting of writing code needed to create residual plots.
+This visualizations allow to asses and compare the goodness of fit, performance, and similarity of models.
Depends: R (>= 3.0.0)
License: GPL
Encoding: UTF-8
3 changes: 2 additions & 1 deletion NEWS.md
@@ -1,4 +1,4 @@
-# version 0.2.0
+# version 0.2.0 - released on CRAN
## 07/05.2019

- new plot functions: `plotLift()`, `plotCumulativeGain()`, `plotTwoSidedECDF()`, `plotModelCorrelation()`, `plotResidualDensity()`, `plotModelPCA()`, plotPrediction()`, plotModelRanking()`
@@ -10,6 +10,7 @@
- `variable = NULL` parameter in `scoreDW()`, `scoreRuns()`, plotAutocorrelation()`, `plotACF()` causes the residuals to be not sorted by any variable
- densities in `plotResidualDensity()` may be now separated by variable values
- for function `score()` parameter `score` is renamed into `type`
+- new examples

# version 0.1.1.0000
## 09/03/2018
2 changes: 1 addition & 1 deletion R/audit.R
@@ -18,7 +18,7 @@
#' \item \code{model} the audited model,
#' \item \code{fitted.values} fitted values from model,
#' \item \code{data} data used for fitting the model,
-#' \item \code{y} vecor with values of predicted variable used for fittng the model,
+#' \item \code{y} vector with values of predicted variable used for fitting the model,
#' \item \code{predict.function} function that were used for model predictions,
#' \item \code{residual.function} function that were used for calculating model residuals,
#' \item \code{residuals}
13 changes: 13 additions & 0 deletions R/plotACF.R
@@ -8,6 +8,19 @@
#' @param variable Name of model variable to order residuals. If value is NULL data order is taken. If value is "Predicted response" or "Fitted values" then data is ordered by fitted values. If value is "Observed response" the data is ordered by a vector of actual response (\code{y} parameter passed to the \code{\link{audit}} function).
#' @param alpha Confidence level of the interval.
#'
+#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotACF(lm_au)
+#'
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotACF(lm_au, rf_au)
+#'
+#'
#' @import ggplot2
#' @importFrom stats qnorm acf
#'
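The `variable` and `alpha` parameters documented in this hunk can be illustrated with base R alone. The sketch below is not the auditor implementation: it assumes the built-in `mtcars` data and a hypothetical model, orders the residuals by fitted values, and uses the usual large-sample bound `qnorm((1 + alpha) / 2) / sqrt(n)` for the interval.

```r
# Illustrative sketch only (base R, built-in mtcars); not the auditor implementation.
model <- lm(mpg ~ wt + hp, data = mtcars)

# Order residuals by fitted values (analogous to variable = "Fitted values").
ord <- order(fitted(model))
res <- residuals(model)[ord]

# Autocorrelation of the ordered residuals, without drawing the base plot.
res_acf <- stats::acf(res, plot = FALSE)

# Approximate confidence bound for a white-noise series at level alpha.
alpha <- 0.95
bound <- stats::qnorm((1 + alpha) / 2) / sqrt(length(res))

# Lags whose autocorrelation falls outside the bound (lag 0 excluded).
lags <- as.vector(res_acf$lag)[-1]
acf_values <- as.vector(res_acf$acf)[-1]
outside <- lags[abs(acf_values) > bound]
```

Changing `ord` (for example to `order(mtcars$wt)`) shows how the choice of ordering variable changes which lags, if any, look autocorrelated.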
6 changes: 6 additions & 0 deletions R/plotAutocorrelation.R
@@ -7,6 +7,12 @@
#' @param variable Name of model variable to order residuals. If value is NULL data order is taken. If value is "Predicted response" or "Fitted values" then data is ordered by fitted values. If value is "Observed response" the data is ordered by a vector of actual response (\code{y} parameter passed to the \code{\link{audit}} function).
#' @param score Logical, if TRUE values of \link{scoreDW} and \link{scoreRuns} will be added to plot.
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotAutocorrelation(lm_au)
+#'
#' @import ggplot2
#'
#' @export
8 changes: 8 additions & 0 deletions R/plotCooksDistance.R
@@ -16,6 +16,14 @@
#'
#' For model classes other than lm and glm the distances are computed directly from the definition.
#'
+#'
+#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotCooksDistance(lm_au)
+#'
#' @import ggplot2
#'
#' @export
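The documentation in this hunk says that for classes other than lm and glm the distances are computed directly from the definition. As a rough illustration of what that can mean, the sketch below refits a model without each observation and applies D_i = sum_j (yhat_j - yhat_j(i))^2 / (p * s^2). It is an assumption-laden example on the built-in `mtcars` data, not the package's code; an `lm` is used only so the result can be checked against `stats::cooks.distance()`.

```r
# Illustrative sketch only (base R, built-in mtcars); not the auditor implementation.
model <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)
p <- length(coef(model))                 # number of estimated coefficients
s2 <- sum(residuals(model)^2) / (n - p)  # residual variance estimate
full_fit <- fitted(model)

# D_i = sum_j (yhat_j - yhat_j(i))^2 / (p * s^2),
# where yhat_j(i) comes from a model refitted without observation i.
cooks_by_definition <- sapply(seq_len(n), function(i) {
  refit <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
  loo_fit <- predict(refit, newdata = mtcars)
  sum((full_fit - loo_fit)^2) / (p * s2)
})

# For lm this should agree with the closed form in stats::cooks.distance().
agrees <- all.equal(unname(cooks.distance(model)), cooks_by_definition)
```

The leave-one-out loop costs one refit per observation, which is why a closed form is preferable whenever the model class provides one.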
11 changes: 10 additions & 1 deletion R/plotCumulativeGain.R
@@ -1,6 +1,6 @@
#' @title Cumulative Gain Chart
#'
-#' @description Cumulative Gain Chartis is a plot of the rate of positive prediction against true positive rate for the different thresholds.
+#' @description Cumulative Gain Chart is is a plot of the rate of positive prediction against true positive rate for the different thresholds.
#' It is useful for measuring and comparing the accuracy of the classificators.
#' @param object An object of class ModelAudit.
#' @param ... Other modelAudit objects to be plotted together.
@@ -9,6 +9,15 @@
#'
#' @seealso \code{\link{plot.modelAudit}}
#'
+#' @examples
+#' library(mlbench)
+#' data("PimaIndiansDiabetes")
+#' Pima <- PimaIndiansDiabetes
+#' Pima$diabetes <- ifelse(Pima$diabetes == "pos", 1, 0)
+#' glm_model <- glm(diabetes~., family=binomial, data=Pima)
+#' glm_au <- audit(glm_model, data = Pima, y = Pima$diabetes)
+#' plotCumulativeGain(glm_au)
+#'
#' @import ggplot2
#' @importFrom ROCR performance prediction
#'
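The description above defines the chart as the rate of positive prediction plotted against the true positive rate across thresholds. A minimal base-R sketch of those two quantities follows; the labels and scores are simulated and hypothetical, and the package itself imports ROCR for this, so the code below is only an illustration of the definition.

```r
# Illustrative sketch only (base R, simulated data); not the auditor implementation.
set.seed(1)
y <- rbinom(200, 1, 0.3)                   # hypothetical 0/1 labels
score <- plogis(2 * y - 1 + rnorm(200))    # hypothetical predicted probabilities

# Sort observations from highest to lowest score; each prefix is one threshold.
ord <- order(score, decreasing = TRUE)
rpp <- seq_along(y) / length(y)            # rate of positive predictions
tpr <- cumsum(y[ord]) / sum(y)             # true positive rate

plot(rpp, tpr, type = "l",
     xlab = "Rate of positive prediction", ylab = "True positive rate")
abline(0, 1, lty = 2)                      # baseline of a random classifier
```

A curve that rises well above the dashed diagonal means most positives are captured among the highest-scored observations.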
2 changes: 1 addition & 1 deletion R/plotHalfNormal.R
@@ -8,7 +8,7 @@
#'
#' @param object ModelAudit object, fitted model object or numeric vector.
#' @param score If TRUE score based on probability density function is displayed on the plot.
-#' @param quant.scale if TRUE values on avis are on quantile scale.
+#' @param quant.scale if TRUE values on axis are on quantile scale.
#' @param main Title of plot.
#' @param xlab The text for the x axis.
#' @param ylab The text for the y axis.
9 changes: 9 additions & 0 deletions R/plotLift.R
@@ -11,6 +11,15 @@
#'
#' @return ggplot object
#'
+#' @examples
+#' library(mlbench)
+#' data("PimaIndiansDiabetes")
+#' Pima <- PimaIndiansDiabetes
+#' Pima$diabetes <- ifelse(Pima$diabetes == "pos", 1, 0)
+#' glm_model <- glm(diabetes~., family=binomial, data=Pima)
+#' glm_au <- audit(glm_model, data = Pima, y = Pima$diabetes)
+#' plotLIFT(glm_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
9 changes: 9 additions & 0 deletions R/plotModelCorrelation.R
@@ -8,6 +8,15 @@
#'
#' @return ggplot object
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotModelCorrelation(lm_au, rf_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
11 changes: 10 additions & 1 deletion R/plotModelPCA.R
@@ -5,11 +5,20 @@
#'
#' @param object An object of class ModelAudit,
#' @param ... Other modelAudit objects to be plotted together.
-#' @param scale A logical value indicating whether the models residuals should be scaled bfore the analysis.
+#' @param scale A logical value indicating whether the models residuals should be scaled before the analysis.
#' @param invisible A text specifying the elements to be hidden on the plot. Default value is "none". Allowed values are "model", "observ".
#'
#' @return ggplot object
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotModelPCA(lm_au, rf_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
9 changes: 9 additions & 0 deletions R/plotModelRanking.R
@@ -9,6 +9,15 @@
#'
#' @return ggplot object
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotModelRanking(lm_au, rf_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
11 changes: 11 additions & 0 deletions R/plotPrediction.R
@@ -7,6 +7,17 @@
#' @param ... Other modelAudit objects to be plotted together.
#' @param variable Name of model variable to order residuals. If value is NULL data order is taken. If value is "Observed response" the data is ordered by a vector of actual response (\code{y} parameter passed to the \code{\link{audit}} function).
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotPrediction(lm_au)
+#'
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotPrediction(lm_au, rf_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
16 changes: 7 additions & 9 deletions R/plotREC.R
@@ -19,17 +19,15 @@
#' @seealso \code{\link{plot.modelAudit}, \link{plotROC}, \link{plotRROC}}
#'
#' @examples
-#' library(auditor)
-#' library(randomForest)
#' library(car)
-#' model_lm <- lm(prestige ~ education + women + income, data = Prestige)
-#' audit_lm <- audit(model_lm)
-#'
-#' plotREC(audit_lm)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotREC(lm_au)
#'
-#' model_rf <- randomForest(prestige ~ education + women + income, data = Prestige)
-#' audit_rf <- audit(model_rf)
-#' plotREC(audit_lm, audit_rf)
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotREC(lm_au, rf_au)
#'
#'
#' @export
14 changes: 5 additions & 9 deletions R/plotROC.R
@@ -14,17 +14,13 @@
#' @import plotROC
#'
#' @examples
-#' library(auditor)
#' library(mlbench)
#' data("PimaIndiansDiabetes")
-#'
-#' model.glm <- glm(diabetes~., family=binomial, data=PimaIndiansDiabetes)
-#' au.glm <- audit(model.glm, label="class glm")
-#' plotROC(au.glm)
-#'
-#' model.glm.press <- glm(diabetes~pressure, family=binomial, data=PimaIndiansDiabetes)
-#' au.glm.press <- audit(model.glm.press)
-#' plotROC(au.glm, au.glm.press)
+#' Pima <- PimaIndiansDiabetes
+#' Pima$diabetes <- ifelse(Pima$diabetes == "pos", 1, 0)
+#' glm_model <- glm(diabetes~., family=binomial, data=Pima)
+#' glm_au <- audit(glm_model, data = Pima, y = Pima$diabetes)
+#' plotROC(glm_au)
#'
#' @export

20 changes: 9 additions & 11 deletions R/plotRROC.R
@@ -9,11 +9,11 @@
#'
#' @return ggplot object
#'
-#' @details For RROC curves we use a shift, which is an equvalent to the threshold for ROC curves.
+#' @details For RROC curves we use a shift, which is an equivalent to the threshold for ROC curves.
#' For each observation we calculate new prediction: \eqn{\hat{y}'=\hat{y}+s} where s is the shift.
#' Therefore, there are different error values for each shift: \eqn{e_i = \hat{y_i}' - y_i}
#'
-#' Over-estimation is caluclates as: \eqn{OVER= \sum(e_i|e_i>0)}.
+#' Over-estimation is calculated as: \eqn{OVER= \sum(e_i|e_i>0)}.
#'
#' Under-estimation is calculated as: \eqn{UNDER = \sum(e_i|e_i<0)}.
#'
@@ -27,17 +27,15 @@
#'
#'
#' @examples
-#' library(auditor)
-#' library(randomForest)
#' library(car)
-#' model_lm <- lm(prestige ~ education + women + income, data = Prestige)
-#' audit_lm <- audit(model_lm)
-#'
-#' plotRROC(audit_lm)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotRROC(lm_au)
#'
-#' model_rf <- randomForest(prestige ~ education + women + income, data = Prestige)
-#' audit_rf <- audit(model_rf)
-#' plotRROC(audit_lm, audit_rf)
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotRROC(lm_au, rf_au)
#'
#' @import ggplot2
#'
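The `@details` text for RROC defines shifted predictions yhat' = yhat + s, errors e_i = yhat_i' - y_i, and the sums OVER and UNDER of the positive and negative errors. The sketch below evaluates those formulas over a grid of shifts. It assumes the built-in `mtcars` data and a hypothetical model, and the grid size of 101 is an arbitrary choice; it is meant to show the arithmetic, not to reproduce `plotRROC()`.

```r
# Illustrative sketch only (base R, built-in mtcars); not the auditor implementation.
model <- lm(mpg ~ wt + hp, data = mtcars)
y <- mtcars$mpg
y_hat <- fitted(model)

# A grid of shifts wide enough to cover every residual in both directions.
max_err <- max(abs(y_hat - y))
shifts <- seq(-max_err, max_err, length.out = 101)

rroc <- t(sapply(shifts, function(s) {
  e <- (y_hat + s) - y              # error after shifting every prediction by s
  c(shift = s,
    over = sum(e[e > 0]),           # OVER: total over-estimation
    under = sum(e[e < 0]))          # UNDER: total under-estimation
}))
rroc <- as.data.frame(rroc)

plot(rroc$over, rroc$under, type = "l",
     xlab = "Over-estimation", ylab = "Under-estimation")
```

At the most negative shift every prediction is too low, so OVER is (numerically) zero; at the most positive shift UNDER is zero. The curve traces the trade-off between the two.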
11 changes: 11 additions & 0 deletions R/plotResidual.R
@@ -6,6 +6,17 @@
#' @param variable Name of model variable to order residuals. If value is NULL data order is taken. If value is "Predicted response" or "Fitted values" then data is ordered by fitted values. If value is "Observed response" the data is ordered by a vector of actual response (\code{y} parameter passed to the \code{\link{audit}} function).
#' @param ... Other modelAudit objects to be plotted together.
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotResidual(lm_au)
+#'
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotResidual(lm_au, rf_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
11 changes: 11 additions & 0 deletions R/plotResidualDensity.R
@@ -8,6 +8,17 @@
#'
#' @return ggplot object
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotResidualDensity(lm_au)
+#'
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotResidualDensity(lm_au, rf_au)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2
7 changes: 7 additions & 0 deletions R/plotScaleLocation.R
@@ -8,6 +8,13 @@
#' @param variable Name of model variable to order residuals. If value is NULL data order is taken. If value is "Predicted response" or "Fitted values" then data is ordered by fitted values. If value is "Observed response" the data is ordered by a vector of actual response (\code{y} parameter passed to the \code{\link{audit}} function).
#' @param score A logical value. If TRUE value of \link{scoreGQ} will be added.
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotScaleLocation(lm_au)
+#'
+#'
#' @import ggplot2
#' @importFrom stats median
#'
11 changes: 11 additions & 0 deletions R/plotTwoSidedECDF.R
@@ -11,6 +11,17 @@
#'
#' @return ggplot object
#'
+#' @examples
+#' library(car)
+#' lm_model <- lm(prestige~education + women + income, data = Prestige)
+#' lm_au <- audit(lm_model, data = Prestige, y = Prestige$prestige)
+#' plotTwoSidedECDF(lm_au)
+#'
+#' library(randomForest)
+#' rf_model <- randomForest(prestige~education + women + income, data = Prestige)
+#' rf_au <- audit(rf_model, data = Prestige, y = Prestige$prestige)
+#' plotTwoSidedECDF(lm_au, rf_au, y.reversed = TRUE)
+#'
#' @seealso \code{\link{plot.modelAudit}}
#'
#' @import ggplot2

0 comments on commit 4f7800b