---
output: github_document
bibliography: vignettes/bibliography.bib
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(ggplot2)
library(outlierensembles)
```
# outlierensembles
<img src='man/figures/logo.png' align="right" height="138" />
<!-- badges: start -->
[![R-CMD-check](https://github.com/sevvandi/outlierensembles/workflows/R-CMD-check/badge.svg)](https://github.com/sevvandi/outlierensembles/actions)
<!-- badges: end -->
**outlierensembles** provides a collection of outlier/anomaly detection ensembles. Given the anomaly scores of different anomaly detection methods, the following ensemble techniques can be used to construct an ensemble score:
1. Item Response Theory (IRT) based ensemble discussed in @kandanaarachchiirtensemble
2. Greedy ensemble discussed in @Schubert2012
3. Inverse Cluster Weighted Averaging (ICWA) ensemble discussed in @Chiang2017
4. Maximum score ensemble discussed in @Aggarwal2015
5. Threshold sum ensemble discussed in @Aggarwal2015
6. Average ensemble, which uses the mean of the scores as the ensemble score
## Installation
You can install the released version of outlierensembles from [CRAN](https://CRAN.R-project.org) with:
``` r
install.packages("outlierensembles")
```
You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("sevvandi/outlierensembles")
```
## Example
We use 7 anomaly detection methods from the DDoutlier R package as our base methods. You can build the ensemble from any anomaly detection methods you like. First, we construct the IRT ensemble. The colors show the ensemble scores.
```{r example1}
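# Standardize both variables so that distance-based methods treat them equally
faithfulu <- scale(faithful)
# Anomaly scores from the DDoutlier base methods
y1 <- DDoutlier::KNN_AGG(faithfulu)
y2 <- DDoutlier::LOF(faithfulu)
y3 <- DDoutlier::COF(faithfulu)
y4 <- DDoutlier::INFLO(faithfulu)
y5 <- DDoutlier::KDEOS(faithfulu)
y6 <- DDoutlier::LDF(faithfulu)
y7 <- DDoutlier::LDOF(faithfulu)
# Combine the base scores column-wise into one data frame
Y <- cbind.data.frame(y1, y2, y3, y4, y5, y6, y7)
# Build the IRT ensemble and attach its scores to the original data
ens1 <- irt_ensemble(Y)
df <- cbind.data.frame(faithful, ens1$scores)
colnames(df)[3] <- "IRT"
# Plot the data, colored by ensemble score
ggplot(df, aes(eruptions, waiting)) +
  geom_point(aes(color = IRT)) +
  scale_color_gradient(low = "yellow", high = "red")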
```
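Besides plotting, it is often useful to list the observations with the highest ensemble scores. The following is a minimal base R sketch, assuming that higher scores indicate more anomalous observations. The helper `top_k()` and the score values are hypothetical and are not part of the package.

``` r
# Hypothetical ensemble scores for six observations (made-up values).
scores <- c(0.12, 0.95, 0.30, 0.08, 0.71, 0.22)

# Return the indices of the k highest-scoring observations,
# ordered from most to least anomalous.
top_k <- function(scores, k = 3) {
  k <- min(k, length(scores))
  order(scores, decreasing = TRUE)[seq_len(k)]
}

top_idx <- top_k(scores, k = 2)
top_idx
```

If `ens1$scores` is a numeric vector, as the chunk above suggests, something like `faithful[top_k(ens1$scores, 5), ]` should list the five highest-scoring eruptions.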
Next, we construct the greedy ensemble.
```{r example2}
ens2 <- greedy_ensemble(Y)
df <- cbind.data.frame(faithful, ens2$scores)
colnames(df)[3] <- "Greedy"
ggplot(df, aes(eruptions, waiting)) + geom_point(aes(color=Greedy)) + scale_color_gradient(low="yellow", high="red")
```
Then we construct the ICWA ensemble.
```{r example3}
ens3 <- icwa_ensemble(Y)
df <- cbind.data.frame(faithful, ens3)
colnames(df)[3] <- "ICWA"
ggplot(df, aes(eruptions, waiting)) + geom_point(aes(color=ICWA)) + scale_color_gradient(low="yellow", high="red")
```
Next, we use the maximum scores to build the ensemble.
```{r example4}
ens4 <- max_ensemble(Y)
df <- cbind.data.frame(faithful, ens4)
colnames(df)[3] <- "Max"
ggplot(df, aes(eruptions, waiting)) + geom_point(aes(color=Max)) + scale_color_gradient(low="yellow", high="red")
```
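The idea behind a maximum score ensemble can be sketched in a few lines of base R. This is an illustration only and not necessarily how `max_ensemble()` is implemented. It assumes that each method's scores are first standardized to z-scores, and the toy matrix `Y_toy` is made up.

``` r
# Toy score matrix: 5 observations (rows) by 3 hypothetical detectors (columns).
Y_toy <- cbind(
  m1 = c(1, 2, 3, 4, 10),
  m2 = c(0.1, 0.2, 0.9, 0.2, 0.1),
  m3 = c(5, 5, 6, 5, 9)
)

# Put every detector on a comparable scale (z-scores; an assumption here).
Z <- scale(Y_toy)

# Ensemble score: the largest standardized score for each observation.
max_score <- apply(Z, 1, max)
max_score
```

Taking the maximum lets a single confident detector decide the ensemble score, so an observation flagged strongly by only one method can still rank highly.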
Then, we use a threshold sum to construct the ensemble.
```{r example5}
ens5 <- threshold_ensemble(Y)
df <- cbind.data.frame(faithful, ens5)
colnames(df)[3] <- "Threshold"
ggplot(df, aes(eruptions, waiting)) + geom_point(aes(color=Threshold)) + scale_color_gradient(low="yellow", high="red")
```
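A threshold sum keeps only the scores that exceed a cutoff and adds them up. The sketch below is a base R illustration under stated assumptions: scores are standardized to z-scores, the cutoff is 0, and scores below the cutoff are set to zero. The function `threshold_sum()` and the toy data are hypothetical, and `threshold_ensemble()` may differ in its details.

``` r
# Toy score matrix: 5 observations (rows) by 3 hypothetical detectors (columns).
Y_toy <- cbind(
  m1 = c(1, 2, 3, 4, 10),
  m2 = c(0.1, 0.2, 0.9, 0.2, 0.1),
  m3 = c(5, 5, 6, 5, 9)
)

# Sum the standardized scores, ignoring those below the threshold.
threshold_sum <- function(Y, threshold = 0) {
  Z <- scale(as.matrix(Y))   # z-score each detector (an assumption)
  Z[Z < threshold] <- 0      # discard weak evidence of anomalousness
  rowSums(Z)
}

thresh_score <- threshold_sum(Y_toy)
thresh_score
```

Compared with the maximum, this rewards observations that several detectors flag at once, while low scores from the remaining detectors cannot pull the total down.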
Finally, we use the mean values as the ensemble score.
```{r example6}
ens6 <- average_ensemble(Y)
df <- cbind.data.frame(faithful, ens6)
colnames(df)[3] <- "Average"
ggplot(df, aes(eruptions, waiting)) + geom_point(aes(color=Average)) + scale_color_gradient(low="yellow", high="red")
```
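For comparison, averaging can be sketched in the same way. This is again an illustration rather than the package's implementation of `average_ensemble()`. It assumes z-score standardization before averaging, and `Y_toy` is made-up data.

``` r
# Toy score matrix: 5 observations (rows) by 3 hypothetical detectors (columns).
Y_toy <- cbind(
  m1 = c(1, 2, 3, 4, 10),
  m2 = c(0.1, 0.2, 0.9, 0.2, 0.1),
  m3 = c(5, 5, 6, 5, 9)
)

# Ensemble score: the mean of the standardized scores for each observation.
mean_score <- rowMeans(scale(Y_toy))
mean_score
```

Averaging gives every detector equal weight, so an observation needs support from several detectors to obtain a high score.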
## References