-
Notifications
You must be signed in to change notification settings - Fork 18
/
README.Rmd
185 lines (145 loc) · 8.39 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
---
title: "README"
author: "Francisco Bischoff"
date: "`r format(Sys.Date(), '- %d %b %Y')`"
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-"
)
library(tsmp)
```
<img src="man/figures/logo.png" align="right" style="float:right;" />
# Time Series with Matrix Profile
<!-- badges: start -->
[![Packagist](https://img.shields.io/badge/License-Apache--2.0-brightgreen.svg)](https://choosealicense.com/licenses/apache-2.0/)
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![CRAN version](http://www.r-pkg.org/badges/version/tsmp)](https://cran.r-project.org/package=tsmp)
[![CRAN Downloads](https://cranlogs.r-pkg.org/badges/tsmp)](https://cran.r-project.org/package=tsmp)
[![CircleCI build status](https://circleci.com/gh/matrix-profile-foundation/tsmp.svg?style=svg)](https://app.circleci.com/pipelines/github/matrix-profile-foundation/tsmp)
<!-- badges: end -->
| | Build | Dev |
|--------------|-------|-----|
| Windows | [![AppVeyor build status](https://ci.appveyor.com/api/projects/status/byfyqncr60ten98g/branch/master?svg=true)](https://ci.appveyor.com/project/franzbischoff/tsmp/branch/master) | [![AppVeyor build status](https://ci.appveyor.com/api/projects/status/byfyqncr60ten98g/branch/develop?svg=true)](https://ci.appveyor.com/project/franzbischoff/tsmp/branch/develop) |
| Coverage | [![codecov](https://codecov.io/gh/matrix-profile-foundation/tsmp/branch/master/graph/badge.svg)](https://app.codecov.io/gh/matrix-profile-foundation/tsmp) | [![codecov](https://codecov.io/gh/matrix-profile-foundation/tsmp/branch/develop/graph/badge.svg)](https://app.codecov.io/gh/matrix-profile-foundation/tsmp) |
## Notice
This version is being maintained to keep up with CRAN standards.
As soon as possible a new version (with possible breaking changes) with less dependencies will be released later in 2022 or
beginning of 2023.
## Overview
R Functions implementing UCR Matrix Profile Algorithm (http://www.cs.ucr.edu/~eamonn/MatrixProfile.html).
This package allows you to use the Matrix Profile concept as a toolkit.
This package provides:
* Algorithms to build a Matrix Profile: STAMP, STOMP, SCRIMP++, SIMPLE, MSTOMP and VALMOD.
* Algorithms for MOTIF search for Unidimensional and Multidimensional Matrix Profiles.
* Algorithm for Chains search for Unidimensional Matrix Profile.
* Algorithms for Semantic Segmentation (FLUSS) and Weakly Labeled data (SDTS).
* Algorithm for Salient Subsections detection allowing MDS plotting.
* Basic plotting for all outputs generated here.
* Sequencial workflow, see below.
```{r basic, examples, eval = FALSE}
# Basic workflow:
matrix <- tsmp(data, window_size = 30) %>%
find_motif(n_motifs = 3) %T>%
plot()
# SDTS still have a unique way to work:
model <- sdts_train(data, labels, windows)
result <- sdts_predict(model, data, round(mean(windows)))
```
Please refer to the [User Manual](https://matrixprofile.org/tsmp/reference/) for more details.
Please be welcome to suggest improvements.
### Performance on an Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz using a random walk dataset
```{r bench_data, eval=FALSE, include=TRUE}
set.seed(2018)
data <- cumsum(sample(c(-1, 1), 40000, TRUE))
```
#### Current version benchmark
WIP in this version
```{r benchmark, eval = FALSE, cache=TRUE, include=FALSE}
w <- 200
workers <- 6
data_size <- 5000
set.seed(2018)
times <- 5L
mbm_stomp <- round(median(microbenchmark::microbenchmark(tsmp(data, window_size = w, mode = "stomp", verbose = 0), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
mbm_stomp_par <- round(median(microbenchmark::microbenchmark(tsmp(data, window_size = w, mode = "stomp", n_workers = workers, verbose = 0), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
mbm_scrimp <- round(median(microbenchmark::microbenchmark(tsmp(data, window_size = w, mode = "scrimp", verbose = 0), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
times <- 10L
mbm_mpx <- round(median(microbenchmark::microbenchmark(mpx(data, window_size = w), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
mbm_mpx_par <- round(median(microbenchmark::microbenchmark(mpx(data, window_size = w, n_workers = workers), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
times <- 3L
mbm_stamp <- round(median(microbenchmark::microbenchmark(tsmp(data, window_size = w, mode = "stamp", verbose = 0), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
mbm_stamp_par <- round(median(microbenchmark::microbenchmark(tsmp(data, window_size = w, mode = "stamp", n_workers = workers, verbose = 0), times = times, setup = (data <- cumsum(sample(c(-1, 1), data_size, TRUE))))$time) / 10^9, 2)
```
```{r bench_dataset, eval = FALSE, echo = FALSE, message=FALSE, warnings=FALSE}
bench_data <- data.frame("Elapsed Time(s)" = c(mbm_stamp, mbm_stamp_par, mbm_stomp, mbm_stomp_par, mbm_scrimp, mbm_mpx, mbm_mpx_par), "Data Size" = data_size, "Window Size" = w, Threads = c(1, workers, 1, workers, 1, 1, workers), Lang = c("R", "R", "R", "R", "R", "Rcpp", "Rcpp"), row.names = c("`stamp`", "`stamp_par`", "`stomp`", "`stomp_par`", "`scrimp`", "`mpx`", "`mpx_par`"), check.names = FALSE)
knitr::kable(bench_data[order(bench_data$`Elapsed Time(s)`), ])
```
## Installation
```{r installation, gh-installation, eval = FALSE}
# Install the released version from CRAN
install.packages("tsmp")
# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("matrix-profile-foundation/tsmp")
```
## Currently available Features
* STAMP (single and multi-thread versions)
* STOMP (single and multi-thread versions)
* STOMPi (On-line version)
* SCRIMP (single-thread, not for AB-joins yet)
* Time Series Chains
* Multivariate STOMP (mSTOMP)
* Multivariate MOTIF Search (from mSTOMP)
* Salient Subsequences search for Multidimensional Space
* Scalable Dictionary learning for Time Series (SDTS) prediction
* FLUSS (Fast Low-cost Unipotent Semantic Segmentation)
* FLOSS (Fast Low-cost On-line Unipotent Semantic Segmentation)
* SiMPle-Fast (Fast Similarity Matrix Profile for Music Analysis and Exploration)
* Annotation vectors (e.g., Stop-word MOTIF bias, Actionability bias)
* FLUSS Arc Plot and SiMPle Arc Plot
* Exact Detection of Variable Length Motifs (VALMOD)
* MPdist: Matrix Profile Distance
* Time Series Snippets
* Subsetting Matrix Profiles (`head()`, `tail()`, `[`, etc.)
* Misc:
* MASS v2.0
* MASS v3.0
* MASS extensions: ADP (Approximate Distance Profile, with PAA)
* MASS extensions: WQ (Weighted Query)
* MASS extensions: QwG (Query with Gap)
* Fast moving average
* Fast moving SD
## Roadmap
* Profile-Based Shapelet Discovery
* GPU-STOMP
## Other projects with Matrix Profile
* Python: https://github.com/target/matrixprofile-ts
* Python: https://github.com/ZiyaoWei/pyMatrixProfile
* Python: https://github.com/juanbeleno/owlpy
* Python: https://github.com/javidlakha/matrix-profile
* Python: https://github.com/shapelets/khiva-python
* R: https://github.com/shapelets/khiva-r
* Matlab: https://github.com/shapelets/khiva-matlab
* Java: https://github.com/shapelets/khiva-java
* Java: https://github.com/ensozos/Matrix-Profile
* Kotlin: https://github.com/shapelets/khiva-kotlin
* C++ (CUDA and OPENCL): https://github.com/shapelets/khiva
* CUDA: https://github.com/zpzim/STOMPSelfJoin
* CUDA: https://github.com/zpzim/SCAMP
## Matrix Profile Foundation
Our next step unifying the Matrix Profile implementation in several programming languages.
Visit: [Matrix Profile Foundation](https://matrixprofile.org)
## Package dependencies
```{r dependency_plot, echo = FALSE, fig.width = 10, fig.height = 5, message = FALSE, warning = FALSE, fig.path = "man/figures/"}
source("https://gist.githubusercontent.com/franzbischoff/3b83243dfdfa73e459935112f3f783e3/raw/239548243f984843b3a87d8cc82f99395a8ed26e/plot_dependencies.R")
plot_dependencies()
```
## Code of Conduct
Please note that the 'tsmp' project is released with a
[Contributor Code of Conduct](https://github.com/matrix-profile-foundation/tsmp/blob/master/.github/CODE_OF_CONDUCT.md).
By contributing to this project, you agree to abide by its terms.