Forest serialization and deserialization #52

@jemus42

It needs to be possible to safely store an rpf object for practical applications where ad-hoc refitting is not feasible, i.e. the following needs to run without error:

library(randomPlantedForest)

# Normal use
rpfit <- rpf(mpg ~ wt + cyl, data = mtcars)
rpfit
#> -- Regression Random Planted Forest --
#> 
#> Formula: mpg ~ wt + cyl 
#> Fit using 2 predictors and main effects only.
#> Forest is _not_ purified!
#> 
#> Called with parameters:
#> 
#>             loss: L2
#>           ntrees: 50
#>  max_interaction: 1
#>           splits: 30
#>        split_try: 10
#>            t_try: 0.4
#>            delta: 0
#>          epsilon: 0.1
#>    deterministic: FALSE
#>         nthreads: 1
#>           purify: FALSE
#>               cv: FALSE
predict(rpfit, mtcars)
#> # A tibble: 32 × 1
#>    .pred
#>    <dbl>
#>  1  20.4
#>  2  20.3
#>  3  26.1
#>  4  20.8
#>  5  16.5
#>  6  18.6
#>  7  15.1
#>  8  23.8
#>  9  23.1
#> 10  19.0
#> # ℹ 22 more rows

# Serialize and cleanup
temp_loc <- tempfile()
saveRDS(rpfit, file = temp_loc)
rm(rpfit)

# Attempt to restore: Not working
rpfit <- readRDS(temp_loc)
rpfit
#> -- Regression Random Planted Forest --
#> 
#> Formula: mpg ~ wt + cyl 
#> Fit using 2 predictors and main effects only.
#> Error in .External(structure(list(name = "CppMethod__invoke_notvoid", : NULL value passed as symbol address
predict(rpfit, mtcars)
#> Error in .External(structure(list(name = "CppMethod__invoke_notvoid", : NULL value passed as symbol address

Created on 2023-11-21 with reprex v2.0.2

As of now I don't know how the internals work here, whether it's even possible to hook into saveRDS, or whether we need to write a separate serialization mechanism.

It's not a high priority now, but in the long run this needs to be possible.
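
To illustrate what a separate serialization mechanism could look like, here is a minimal base-R sketch of the general pattern: save only plain R state, then rebuild the live object on load. Everything in it is hypothetical and not part of randomPlantedForest: `new_live_model()`, `save_model()`, `load_model()` and the `toy_model` class are made-up names, and the environment used as `handle` is only a stand-in for the C++ object. The error above suggests that the native object behind the fit does not survive saveRDS/readRDS, so the real package would presumably need to export its tree structure to plain R objects and reconstruct the C++ side from them.

```r
# Hypothetical stand-in for a model whose fitted state partly lives outside
# plain R objects. `new_live_model()` plays the role of the native constructor.
new_live_model <- function(state) {
  handle <- new.env(parent = emptyenv())  # stand-in for the native handle
  handle$coef <- state$coef
  structure(list(state = state, handle = handle), class = "toy_model")
}

# Prediction only ever talks to the live handle, as a native model would.
predict.toy_model <- function(object, newdata, ...) {
  as.vector(as.matrix(newdata) %*% object$handle$coef)
}

# Save only the plain-R state, never the handle itself.
save_model <- function(object, file) {
  saveRDS(object$state, file = file)
  invisible(file)
}

# Read the plain state back and rebuild a fresh live object from it.
load_model <- function(file) {
  new_live_model(readRDS(file))
}

fit <- new_live_model(list(coef = c(2, -1)))
x <- data.frame(a = 1:3, b = c(0, 1, 2))
before <- predict(fit, x)

tmp <- tempfile(fileext = ".rds")
save_model(fit, tmp)
rm(fit)

restored <- load_model(tmp)
after <- predict(restored, x)
stopifnot(identical(before, after))
```

The design choice here is that the serialized file never contains the handle, so a restored object is always rebuilt through the same constructor as a freshly fitted one.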

Labels: C++, help wanted
