This repository was archived by the owner on Apr 15, 2022. It is now read-only.

Release 2.2.0

@Ben-Epstein Ben-Epstein released this 22 Jun 22:12
· 405 commits to master since this release
504f8f2

What's New?

  • Stronger AWS SageMaker deployment support using k8s ServiceAccounts
  • Model metadata tracking for in-db deployed models using the MODEL_METADATA table and the LIVE_MODEL_STATUS view
  • Support for in-db deployment for Keras linear models (LSTMs/RNNs/CNNs not yet supported).
  • Support for in-db deployment of XGBoost models using the H2O/SKLearn implementations
  • SKLearn bug fix with fastnumbers
  • Better SKLearn support for non-double return types
  • Switched from pickle to cloudpickle for SKLearn model serialization, adding support for both external and lambda functions inside SKLearn Pipelines
  • Merged in-db deployment from a 2-table design into a 1-table design. All features and model prediction(s) are stored in a single table
  • Support for deploying models to an existing table
  • Support for selecting which columns from a table are used in the model prediction. This allows you to deploy models to a "subset" of a table.
  • Better support for in-db deployment for sklearn Pipelines that have predict parameters
  • deploy_db API cleanup: removed the model parameter and made run_id required. The model is pulled behind the scenes. The DF parameter is optional and not required when deploying a model to an existing table.
  • General code cleanup
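
To illustrate why the move from pickle to cloudpickle matters for Pipelines that contain lambdas, here is a minimal sketch. It is not taken from this release's code: the `scale` lambda and the `stdlib_pickle_fails` helper are hypothetical names used only for the demonstration, and the cloudpickle part runs only if that package is installed.

```python
import pickle

# Hypothetical lambda of the kind a user might wrap in a Pipeline step.
scale = lambda x: x * 2


def stdlib_pickle_fails(obj) -> bool:
    """Return True if the standard pickle module cannot serialize obj."""
    try:
        pickle.dumps(obj)
        return False
    except (pickle.PicklingError, AttributeError, TypeError):
        # pickle stores functions by reference (module + name), and a
        # lambda has no importable name to refer back to.
        return True


print(stdlib_pickle_fails(scale))

try:
    import cloudpickle  # third-party; assumed to be installed

    # cloudpickle serializes the function body itself, so the payload
    # can be restored with the standard pickle loader.
    restored = pickle.loads(cloudpickle.dumps(scale))
    print(restored(21))
except ImportError:
    pass
```

The same reasoning applies to functions defined outside an importable module: they cannot be pickled by reference, so they need to be serialized by value.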

BREAKING CHANGES

  • deploy_db no longer works with the old parameters. The new parameter set and order are required.
  • createTable on the PySpliceContext now takes its parameters in the order dataframe, schema_table_name (previously the reverse), to match all other APIs in the module.
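
One way to make existing createTable calls robust against the reordering is to pass both arguments by keyword. The sketch below uses a stand-in class, `StubContext`, which is hypothetical and is not the real PySpliceContext. Only the two parameter names come from the note above; the return value, the `rows` placeholder, and the table name are assumptions for illustration.

```python
class StubContext:
    """Hypothetical stand-in for PySpliceContext (not the real class)."""

    def createTable(self, dataframe, schema_table_name):
        # The real method creates a table; this stub only reports
        # which argument landed in which parameter.
        return f"create {schema_table_name} from {type(dataframe).__name__}"


splice = StubContext()
rows = [("a", 1)]  # placeholder for a Spark DataFrame

# Keyword arguments bind by name, so the call does not depend on
# positional order.
msg = splice.createTable(dataframe=rows, schema_table_name="MYSCHEMA.MYTABLE")
print(msg)
```

Calls that pass the two arguments positionally in the old order need to be swapped when upgrading.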

This release is in tandem with the ml-workflow release. Upgrade scripts are attached to that release.