Computing on distributed matrices
========================================================
author: Andrie de Vries & Michele Usuelli
date: 2015-07-01, UseR!2015
width: 1680
height: 1050
css: css/custom.css
Linear regression
=============
$$Y = X \beta + e$$
To solve for $\beta$, define a loss function, differentiate it, and solve for the minimum:
$$\beta = (X'X)^{-1} X'Y$$
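The step from loss function to estimator can be sketched as follows, assuming the usual sum-of-squared-errors loss (the slide does not name the loss explicitly):

$$L(\beta) = (Y - X\beta)'(Y - X\beta)$$

$$\frac{\partial L}{\partial \beta} = -2 X'(Y - X\beta)$$

Setting this derivative to zero and solving for $\beta$ gives the expression above, provided $X'X$ is invertible.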
Re-arrange:
$$(X'X) \beta = X'Y$$
The R call `b <- solve(A, v)` returns the solution `b` of $A b = v$. With $A = X'X$ and $v = X'Y$:
```{r, eval=FALSE}
# t(x) %*% x is X'X and t(x) %*% y is X'Y
b <- solve(t(x) %*% x, t(x) %*% y)
```
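As a minimal sketch of the normal equations in practice, the following compares the `solve()` result with `lm()` on simulated data. The data, the seed, and the true coefficients are illustrative assumptions, not taken from the original demo.

```r
set.seed(42)                            # illustrative seed, not from the demo
n <- 100
x <- cbind(1, rnorm(n), rnorm(n))       # design matrix with an intercept column
beta_true <- c(2, -1, 0.5)              # hypothetical true coefficients
y <- drop(x %*% beta_true) + rnorm(n, sd = 0.1)

# Solve the normal equations (X'X) b = X'y
b <- solve(crossprod(x), crossprod(x, y))

# Reference fit; "- 1" drops lm's own intercept because x already has one
b_lm <- coef(lm(y ~ x - 1))

# TRUE when both estimates agree up to numerical tolerance
all.equal(as.numeric(b), as.numeric(b_lm))
```

`crossprod(x)` computes the same matrix as `t(x) %*% x`, and `crossprod(x, y)` the same as `t(x) %*% y`.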
The math of distributed computing
=================================
* Commutative property
$$a + b == b + a$$
* Associative property
$$(a + b) + c == a + (b + c)$$
* Distributive property
$$a * (b + c) == (a * b) + (a * c)$$
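These properties are what allow a sum to be split across workers. A small sketch, assuming four equally sized chunks (the chunk count and the values are arbitrary choices for illustration):

```r
values <- as.numeric(1:100)

# Split the data into 4 chunks, as if each lived on a different worker
chunks <- split(values, rep(1:4, each = 25))

# Each worker computes a partial sum
partial <- vapply(chunks, sum, numeric(1))

# Associativity: combining the partial sums gives the full sum
total_chunked <- sum(partial)

# Commutativity: the order in which the partial sums arrive does not matter
total_reversed <- sum(rev(partial))

# Distributivity: scaling each chunk first equals scaling the total
scaled_chunked <- sum(vapply(chunks, function(v) sum(3 * v), numeric(1)))

total_chunked == sum(values)
total_reversed == sum(values)
scaled_chunked == 3 * sum(values)
```

Whole numbers are used here so the comparisons are exact. With general floating-point data the results agree only up to rounding, so `all.equal()` is the safer check.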
Computing X'X is a distributive operation
=============
![](images/SSCP-matrix.png)
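The same idea can be checked numerically: computing $X'X$ on blocks of rows and adding the results gives the same matrix as computing it on the full data. This is a minimal single-machine sketch. The matrix size, the four-way row split, and the variable names are assumptions for illustration, not the code from the demo script.

```r
set.seed(1)                                   # illustrative seed
x <- matrix(rnorm(1000 * 3), ncol = 3)        # 1000 rows, 3 columns

# Assign rows to 4 blocks, standing in for 4 distributed partitions
row_groups <- split(seq_len(nrow(x)), rep(1:4, each = 250))

# Each partition computes its own small 3 x 3 cross-product
partial_xtx <- lapply(row_groups, function(rows) {
  crossprod(x[rows, , drop = FALSE])
})

# Adding the partial results reconstructs the full X'X
xtx_chunked <- Reduce(`+`, partial_xtx)

# TRUE when the chunked and full results agree up to numerical tolerance
all.equal(xtx_chunked, crossprod(x))
```

Each partial result is only as large as the number of columns squared, so only small matrices need to be combined regardless of the number of rows. The same pattern applies to $X'Y$.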
Regression
==========
```{r taxi-2-rmr-local, cache=FALSE, include=FALSE}
read_chunk("demo/09-linear-regression.R")
```
```{r load-packages}
```
Regression
==========
```{r generate-data}
```
Regression
==========
```{r sum-function}
```
Regression
==========
```{r XtX}
```
Regression
==========
```{r XtY}
```
Regression
==========
```{r solve}
```