Commit e5ffa50

Jdb/depth-3D (#282)

* remove nrounds from cache => m.info
* fix rmse metric
* MVP cpu subsample / split-set
* benchmarks
* benchmarks
* up
* up
* MVP hybrid split set
* MVP hybrid split set
* GPU adaptations
* benchmarks - cpu perf improvements
* up recipe
* WIP depth improvements
* WIP 3D
* WIP
* WIP
* test on Static Vectors
* test on Static Vectors
* up
* up
* WIP gpu integration
* test
* adapt num&cat gains on-demand zeroisation of gains/hist
* fix oblivious
* benchmarks fix oblivious
* threaded hist launch
* benchmarks
* update threads heuristics
* update threads heuristics
* up
* Jdb/float32 (#284)
* gains refactor
* up
* up
* up
* benchmarks laptop
* up
* test - full float32
* tests
* benchmarks
* update README
* up
1 parent 7fab107 commit e5ffa50

42 files changed, with 750 additions and 410 deletions.

Project.toml

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ MLJModelInterface = "e80e1ace-859a-464e-9ed9-23947d8ae3ea"
 NetworkLayout = "46757867-2c16-5918-afeb-47bfcb05e46a"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 RecipesBase = "3cdcf5f2-1ef4-517c-9805-6587b60abb01"
+StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
 Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
 Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

README.md

Lines changed: 25 additions & 25 deletions
@@ -49,41 +49,41 @@ Code to reproduce is available in [`benchmarks/regressor.jl`](https://github.com
 - Julia: v1.10.8
 - Algorithms
 - XGBoost: v2.5.1 (Using the `hist` algorithm)
-- EvoTrees: v0.17.0
+- EvoTrees: v0.17.1
 
 ### CPU:
 
 | **nobs** | **nfeats** | **max\_depth** | **train\_evo** | **train\_xgb** | **infer\_evo** | **infer\_xgb** |
 |:--------:|:----------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|
-| 100k | 10 | 6 | 0.37 | 0.69 | 0.03 | 0.03 |
-| 100k | 10 | 11 | 2.65 | 1.08 | 0.07 | 0.06 |
-| 100k | 100 | 6 | 0.98 | 1.34 | 0.08 | 0.13 |
-| 100k | 100 | 11 | 14.02 | 3.39 | 0.12 | 0.17 |
-| 1M | 10 | 6 | 2.53 | 7.20 | 0.26 | 0.37 |
-| 1M | 10 | 11 | 6.95 | 8.85 | 0.81 | 0.74 |
-| 1M | 100 | 6 | 6.15 | 13.57 | 0.73 | 1.43 |
-| 1M | 100 | 11 | 28.22 | 18.77 | 1.19 | 1.64 |
-| 10M | 10 | 6 | 27.09 | 86.61 | 3.09 | 3.15 |
-| 10M | 10 | 11 | 59.62 | 112.72 | 7.07 | 6.66 |
-| 10M | 100 | 6 | 73.04 | 151.22 | 5.78 | 14.12 |
-| 10M | 100 | 11 | 196.00 | 193.70 | 11.29 | 17.63 |
+| 100k | 10 | 6 | 0.36 | 0.68 | 0.05 | 0.03 |
+| 100k | 10 | 11 | 1.19 | 1.06 | 0.08 | 0.06 |
+| 100k | 100 | 6 | 0.83 | 1.31 | 0.07 | 0.15 |
+| 100k | 100 | 11 | 3.43 | 3.33 | 0.10 | 0.17 |
+| 1M | 10 | 6 | 2.20 | 6.02 | 0.28 | 0.29 |
+| 1M | 10 | 11 | 4.75 | 7.89 | 0.81 | 0.62 |
+| 1M | 100 | 6 | 5.50 | 13.57 | 0.68 | 1.30 |
+| 1M | 100 | 11 | 15.72 | 18.79 | 1.15 | 1.86 |
+| 10M | 10 | 6 | 25.31 | 80.07 | 2.74 | 2.73 |
+| 10M | 10 | 11 | 50.23 | 109.67 | 6.07 | 6.19 |
+| 10M | 100 | 6 | 79.99 | 147.04 | 6.35 | 14.05 |
+| 10M | 100 | 11 | 185.47 | 189.04 | 11.34 | 17.09 |
 
 ### GPU:
 
 | **nobs** | **nfeats** | **max\_depth** | **train\_evo** | **train\_xgb** | **infer\_evo** | **infer\_xgb** |
 |:--------:|:----------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|
-| 100k | 10 | 6 | 1.32 | 0.29 | 0.01 | 0.02 |
-| 100k | 10 | 11 | 16.74 | 1.32 | 0.01 | 0.02 |
-| 100k | 100 | 6 | 1.97 | 0.63 | 0.03 | 0.14 |
-| 100k | 100 | 11 | 31.66 | 3.31 | 0.05 | 0.16 |
-| 1M | 10 | 6 | 2.39 | 0.96 | 0.05 | 0.14 |
-| 1M | 10 | 11 | 26.17 | 2.72 | 0.06 | 0.19 |
-| 1M | 100 | 6 | 3.83 | 2.92 | 0.35 | 1.35 |
-| 1M | 100 | 11 | 45.11 | 8.11 | 0.34 | 1.62 |
-| 10M | 10 | 6 | 9.63 | 7.79 | 0.54 | 1.67 |
-| 10M | 10 | 11 | 49.88 | 13.12 | 0.58 | 1.74 |
-| 10M | 100 | 6 | 23.27 | 28.22 | 3.17 | 14.57 |
-| 10M | 100 | 11 | 81.26 | 52.97 | 3.35 | 17.95 |
+| 100k | 10 | 6 | 1.26 | 0.31 | 0.01 | 0.01 |
+| 100k | 10 | 11 | 14.62 | 1.47 | 0.01 | 0.02 |
+| 100k | 100 | 6 | 1.64 | 0.66 | 0.04 | 0.12 |
+| 100k | 100 | 11 | 17.98 | 3.77 | 0.04 | 0.16 |
+| 1M | 10 | 6 | 2.07 | 1.04 | 0.05 | 0.11 |
+| 1M | 10 | 11 | 22.73 | 3.08 | 0.06 | 0.14 |
+| 1M | 100 | 6 | 3.53 | 3.19 | 0.30 | 1.36 |
+| 1M | 100 | 11 | 30.11 | 8.58 | 0.37 | 1.61 |
+| 10M | 10 | 6 | 8.44 | 7.77 | 0.48 | 1.62 |
+| 10M | 10 | 11 | 46.43 | 13.50 | 0.54 | 1.82 |
+| 10M | 100 | 6 | 21.56 | 27.49 | 3.28 | 14.64 |
+| 10M | 100 | 11 | 65.96 | 49.85 | 3.49 | 17.14 |
 
 ## MLJ Integration

benchmarks/regressor-df.jl

Lines changed: 4 additions & 4 deletions
@@ -8,8 +8,8 @@ using BenchmarkTools
 using Random: seed!
 import CUDA
 
-nobs = Int(1e6)
-num_feat = Int(100)
+nobs = Int(1e5)
+num_feat = Int(10)
 nrounds = 200
 T = Float64
 nthread = Base.Threads.nthreads()
@@ -73,7 +73,7 @@ params_evo = EvoTreeRegressor(;
 colsample=0.5,
 nbins=64,
 rng=123,
-device
+device=:gpu
 )
 @info "EvoTrees CPU"
 params_evo.device = :cpu
@@ -86,7 +86,7 @@ params_evo.device = :cpu
 # @time m_evo_df = fit_evotree(params_evo, dtrain; target_name, device, verbosity, print_every_n=100);
 
 @info "train - eval"
-@time m_evo = fit_evotree(params_evo, dtrain; target_name, deval=dtrain, verbosity, print_every_n=100);
+@time m_evo = EvoTrees.fit_evotree(params_evo, dtrain; target_name, deval=dtrain, verbosity, print_every_n=100);
 @time m_evo = fit_evotree(params_evo, dtrain; target_name, deval=dtrain, verbosity, print_every_n=100);
 # @time m_evo = fit_evotree(params_evo, dtrain; target_name, device);
 # @btime fit_evotree($params_evo, $dtrain; target_name, deval=dtrain, metric=metric_evo, device, verbosity, print_every_n=100);

benchmarks/regressor.jl

Lines changed: 18 additions & 13 deletions
@@ -18,61 +18,66 @@ T = Float32
 nthreads = Base.Threads.nthreads()
 
 device_list = [:cpu, :gpu]
-# device_list = [:cpu]
+# device_list = [:gpu]
 
 nobs_list = Int.([1e5, 1e6, 1e7])
-# nobs_list = Int.([1e5])
+# nobs_list = Int.([1e6])
 
 nfeats_list = [10, 100]
 # nfeats_list = [10]
 
 max_depth_list = [6, 11]
-# max_depth_list = [6]
+# max_depth_list = [11]
 
-for device in device_list
+# nobs = first(nobs_list)
+# nfeats = first(nfeats_list)
+# max_depth = first(max_depth_list)
+# _device = first(device_list)
+
+for _device in device_list
 df = DataFrame()
 for nobs in nobs_list
 for nfeats in nfeats_list
 for max_depth in max_depth_list
 
 _df = DataFrame(
-:device => device,
+:device => _device,
 :nobs => nobs,
 :nfeats => nfeats,
 :max_depth => max_depth)
 
-@info "device: $device | nobs: $nobs | nfeats: $nfeats | max_depth : $max_depth | nthreads: $nthreads"
+@info "device: $_device | nobs: $nobs | nfeats: $nfeats | max_depth : $max_depth | nthreads: $nthreads"
 seed!(123)
 x_train = rand(T, nobs, nfeats)
 y_train = rand(T, size(x_train, 1))
 
 if run_evo
 @info "EvoTrees"
+loss == :mse ? metric = :mae : metric = loss
 
 params_evo = EvoTreeRegressor(;
 loss,
+metric,
 nrounds,
 max_depth,
-lambda=0.0,
-gamma=0.0,
 eta=0.05,
 min_weight=1.0,
 rowsample=0.5,
 colsample=0.5,
 nbins=64,
 tree_type,
 rng=123,
-device
+device=_device
 )
 
 if nobs == first(nobs_list) && nfeats == first(nfeats_list) && max_depth == first(max_depth_list)
 @info "warmup"
 _m_evo = EvoTrees.fit(params_evo; x_train, y_train, x_eval=x_train, y_eval=y_train, print_every_n=100)
-_m_evo(x_train; device)
+_m_evo(x_train; device=_device)
 end
 t_train_evo = @elapsed m_evo = EvoTrees.fit(params_evo; x_train, y_train, x_eval=x_train, y_eval=y_train, print_every_n=100)
 @info "train" t_train_evo
-t_infer_evo = @elapsed pred_evo = m_evo(x_train; device)
+t_infer_evo = @elapsed pred_evo = m_evo(x_train; device=_device)
 @info "predict" t_infer_evo
 
 _df = hcat(_df, DataFrame(
@@ -90,7 +95,7 @@ for device in device_list
 loss_xgb = "reg:logistic"
 metric_xgb = "logloss"
 end
-tree_method = device == :gpu ? "gpu_hist" : "hist"
+tree_method = _device == :gpu ? "gpu_hist" : "hist"
 
 params_xgb = Dict(
 :num_round => nrounds,
@@ -127,6 +132,6 @@ for device in device_list
 end
 end
 select!(df, Cols(:device, :nobs, :nfeats, :max_depth, r"train_", r"infer_"))
-path = joinpath(@__DIR__, "results", "regressor-$device.csv")
+path = joinpath(@__DIR__, "results", "regressor-$_device.csv")
 CSV.write(path, df)
 end
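
The benchmark script runs an untimed warmup fit for the first configuration before timing with `@elapsed`, so that Julia's first-call compilation is not counted in `t_train_evo`. The pattern can be sketched in isolation as follows. Here `work` and `timed` are hypothetical stand-ins (the script itself times `EvoTrees.fit` and the model call), so this illustrates the idea rather than reproducing the script.

```julia
# Hypothetical stand-in workload; the benchmark script times EvoTrees.fit instead.
work(n) = sum(sqrt, 1:n)

# Run `f` once untimed (to trigger compilation), then time a second call.
function timed(f, args...; warmup=true)
    warmup && f(args...)              # first call pays the JIT compilation cost
    t = @elapsed result = f(args...)  # same `@elapsed x = ...` form as in the script
    return t, result
end

t, res = timed(work, 10^6)
println("elapsed: ", t, " s")
```

Without the warmup call, the first measured configuration would also include compilation time, which would inflate the smallest (100k observations) rows the most.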
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+device,nobs,nfeats,max_depth,train_evo,train_xgb,infer_evo,infer_xgb
+cpu,100000,10,6,0.3743021,1.176697,0.0550724,0.0429094
+cpu,100000,10,11,1.1436303,1.8938544,0.1259657,0.1382608
+cpu,100000,100,6,0.9814852,3.1355962,0.1055696,0.0928166
+cpu,100000,100,11,4.616001,6.7854621,0.2106412,0.2003472
+cpu,1000000,10,6,3.2015263,11.4682413,0.6143176,0.4112723
+cpu,1000000,10,11,6.9050752,14.5805848,1.3154719,1.0993092
+cpu,1000000,100,6,9.4400736,28.9168518,1.1881067,1.2226019
+cpu,1000000,100,11,25.9319549,39.802335,2.3564482,2.008837
Lines changed: 12 additions & 12 deletions
@@ -1,13 +1,13 @@
 device,nobs,nfeats,max_depth,train_evo,train_xgb,infer_evo,infer_xgb
-cpu,100000,10,6,0.345232161,0.65847186,0.045316401,0.040189505
-cpu,100000,10,11,2.580894208,1.089510234,0.084650005,0.084697627
-cpu,100000,100,6,0.959834889,1.524195573,0.072256826,0.146073543
-cpu,100000,100,11,13.686787359,3.785870581,0.124084269,0.170606949
-cpu,1000000,10,6,2.717498812,6.742247024,0.367360243,0.353636316
-cpu,1000000,10,11,6.88788943,8.579951345,0.666775884,0.77485982
-cpu,1000000,100,6,6.165680372,14.594660709,0.749947886,1.45115231
-cpu,1000000,100,11,28.688951439,19.8922985,1.217750442,1.682898495
-cpu,10000000,10,6,28.678338705,84.613924096,3.386359654,3.165746575
-cpu,10000000,10,11,61.208988825,113.656486078,7.67380462,6.600254493
-cpu,10000000,100,6,77.53141553,155.981138148,6.336081082,14.886237411
-cpu,10000000,100,11,201.734550127,197.814503303,12.010449067,19.124665015
+cpu,100000,10,6,0.361355379,0.678134269,0.045930033,0.027864463
+cpu,100000,10,11,1.189094729,1.062972237,0.084741834,0.060884632
+cpu,100000,100,6,0.825356156,1.313781493,0.067967321,0.150138476
+cpu,100000,100,11,3.42695732,3.331481386,0.095095408,0.166030696
+cpu,1000000,10,6,2.201062642,6.023023676,0.275750074,0.28599348
+cpu,1000000,10,11,4.751317168,7.887055184,0.806453339,0.620726646
+cpu,1000000,100,6,5.504610729,13.570157299,0.680102165,1.300623584
+cpu,1000000,100,11,15.716454123,18.785287015,1.153798679,1.859046418
+cpu,10000000,10,6,25.310867718,80.071412658,2.738922497,2.725513612
+cpu,10000000,10,11,50.234171619,109.67043611,6.072321306,6.193963369
+cpu,10000000,100,6,79.989164881,147.038049295,6.349441546,14.05273939
+cpu,10000000,100,11,185.473381031,189.043610669,11.337630305,17.09201453
Lines changed: 12 additions & 12 deletions
@@ -1,13 +1,13 @@
 device,nobs,nfeats,max_depth,train_evo,train_xgb,infer_evo,infer_xgb
-gpu,100000,10,6,1.393475185,0.29801486,0.012653619,0.016872025
-gpu,100000,10,11,18.045771247,1.327634994,0.016253794,0.021481793
-gpu,100000,100,6,2.0343422,0.64073468,0.038305586,0.141896393
-gpu,100000,100,11,32.958506717,3.31747766,0.042851475,0.168755927
-gpu,1000000,10,6,2.499420877,0.987308459,0.054601898,0.151581453
-gpu,1000000,10,11,27.620747373,2.807839901,0.062031674,0.190938948
-gpu,1000000,100,6,3.953714002,2.972838848,0.346283083,1.386766398
-gpu,1000000,100,11,47.886146049,8.117343845,0.359410687,1.629208393
-gpu,10000000,10,6,9.934777531,7.80047099,0.528591876,1.719790443
-gpu,10000000,10,11,51.197501921,13.824716873,0.596972681,1.79145336
-gpu,10000000,100,6,24.710822744,29.915560417,3.287248791,14.589865683
-gpu,10000000,100,11,84.898558817,55.18430062,3.467578551,17.583748497
+gpu,100000,10,6,1.256613468,0.313196943,0.010231688,0.012269633
+gpu,100000,10,11,14.623176106,1.467511279,0.01166807,0.017105634
+gpu,100000,100,6,1.644037379,0.663502672,0.037336099,0.120381483
+gpu,100000,100,11,17.9836062,3.773791697,0.037074338,0.163107176
+gpu,1000000,10,6,2.074666689,1.038114183,0.0536376,0.111471447
+gpu,1000000,10,11,22.726148062,3.076851506,0.062363774,0.136844871
+gpu,1000000,100,6,3.529607328,3.187346238,0.303591273,1.358293227
+gpu,1000000,100,11,30.113652481,8.584528505,0.366404765,1.613612471
+gpu,10000000,10,6,8.43700029,7.766574309,0.482068343,1.623854427
+gpu,10000000,10,11,46.430825165,13.497415747,0.536403507,1.823497389
+gpu,10000000,100,6,21.557374609,27.492228137,3.278849596,14.64168117
+gpu,10000000,100,11,65.958620621,49.845463765,3.485583919,17.135490542