Release 2.2.0 · splicemachine/pysplice

What's New?

Stronger AWS SageMaker deployment support using k8s ServiceAccounts
Model metadata tracking for in-db deployed models using the MODEL_METADATA table and the LIVE_MODEL_STATUS view
Support for in-db deployment for Keras linear models (LSTMs/RNNs/CNNs not yet supported).
Support for in-db deployment of XGBoost models using the H2O/SKLearn implementations
SKLearn bug fix with fastnumbers
Better SKLearn support for non-double return types
Upgraded from pickle to cloudpickle for SKLearn model serialization, adding support for both external and lambda functions inside SKLearn Pipelines
Merged in-db deployment from a 2-table design into a 1-table design. All features and model prediction(s) are now stored in a single table
Support for deploying models to an existing table
Support for selecting which columns from a table are used in the model prediction. This allows you to deploy models to a "subset" of a table.
Better support for in-db deployment of SKLearn Pipelines that have predict parameters
deploy_db API cleanup: removed the model parameter and made run_id required. The model is pulled behind the scenes. The DF parameter is optional and not required when deploying a model to an existing table.
General code cleanup
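
The pickle to cloudpickle change listed above matters because the standard pickle module serializes functions by reference (module plus name), so a lambda cannot be pickled. The sketch below illustrates that difference in isolation. It is not pysplice's serialization code; the names `double` and `stdlib_pickle_works` are made up for the example, and the cloudpickle part runs only if that third-party package is installed.

```python
import pickle

# A lambda of the kind that might appear inside a pipeline step (hypothetical).
double = lambda x: x * 2


def stdlib_pickle_works(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # pickle stores functions by name, and a lambda has no importable name.
        return False


lambda_ok_with_pickle = stdlib_pickle_works(double)

try:
    import cloudpickle  # third-party; assumed to be installed
except ImportError:
    cloudpickle = None

if cloudpickle is not None:
    # cloudpickle serializes the lambda by value, and the resulting bytes
    # can be loaded back with the standard pickle.loads.
    restored = pickle.loads(cloudpickle.dumps(double))
    round_trip = restored(21)
```

Because cloudpickle output is loaded with plain `pickle.loads`, the loading side needs cloudpickle importable but does not otherwise change how it reads the bytes.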

deploy_db no longer works with the old parameters. The new parameter set and order are required.
createTable on the PySpliceContext now takes its parameters in the order dataframe, schema_table_name (previously the reverse), matching all other APIs in the module.
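
One way to guard existing calls against the createTable reordering is to pass arguments by keyword. The sketch below uses a hypothetical stand-in class, `StubContext`, that mirrors only the two parameter names and their order as stated above; it is not the real PySpliceContext, and whether the real method accepts keyword arguments or takes additional parameters is an assumption to check against the installed version.

```python
class StubContext:
    """Hypothetical stand-in for PySpliceContext; it only records its arguments."""

    def createTable(self, dataframe, schema_table_name):
        # Parameter names and order follow the 2.2.0 release note.
        return {"dataframe": dataframe, "schema_table_name": schema_table_name}


splice = StubContext()
df = [("a", 1)]             # placeholder for a Spark DataFrame
table = "MYSCHEMA.MYTABLE"  # hypothetical table name

# Positional call in the new order: dataframe first, then the table name.
positional = splice.createTable(df, table)

# Keyword call: the result does not depend on the order the arguments are written in.
by_keyword = splice.createTable(schema_table_name=table, dataframe=df)
```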

This release was made in tandem with the ml-workflow release. Upgrade scripts are attached to that release.