Releases: snowflakedb/snowflake-ml-python

1.7.4

28 Jan 19:49
ae87f58

Bug Fixes

  • Registry: Fixed an issue where the Hugging Face pipeline was loaded using an incorrect dtype.
  • Registry: Fixed an issue where only 1 row was used when inferring the model signature of a modeling model.

Behavior Changes

  • Registry: ModelVersion.run on a service requires redeploying the service once the account opts into nested functions.

New Features

  • Add new snowflake.ml.jobs preview API for running headless workloads on SPCS using
    Container Runtime for ML
  • Added guardrails option to Cortex complete function, enabling
    Cortex Guard support

1.7.3

09 Jan 20:29
9abffca

  • Added lowercase versions of Cortex functions and added a deprecation warning to the capitalized versions.
  • Bumped the requirements of fsspec and s3fs to >=2024.6.1,<2026
  • Bumped the requirement of mlflow to >=2.16.0, <3
  • Registry: Support 500+ features for model registry
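
The lowercase/capitalized pairing noted above follows a common deprecation-alias pattern. The sketch below only illustrates that pattern: complete and Complete are hypothetical stand-ins that echo their input, not the real Cortex functions, and the warning text is an assumption.

```python
import warnings


def complete(prompt: str) -> str:
    """Hypothetical lowercase function; it only echoes and does not call Cortex."""
    return f"echo: {prompt}"


def Complete(prompt: str) -> str:
    """Deprecated capitalized alias that forwards to the lowercase function."""
    warnings.warn(
        "Complete is deprecated; use complete instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return complete(prompt)


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = Complete("hi")
    print(result, [str(w.message) for w in caught])
```

Keeping the old name as a thin forwarder lets existing callers keep working while the warning points them at the new spelling.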

Bug Fixes

  • Registry: Fixed a bug when providing a pandas DataFrame with a non-range index as the input to ModelVersion.run.
  • Registry: Improved random model version name generation to prevent collisions.
  • Registry: Fix an issue when inferring a signature or running inference with Snowpark data that has a column whose type
    is ARRAY and that contains NULL values.
  • Registry: ModelVersion.run now accepts fully qualified service name.
  • Monitoring: Fix issue in SDK with creating monitors using fully qualified names.
  • Registry: Fix an error in log_model for sklearn models that contain only data pre-processing, including
    pre-processing-only pipeline models, caused by explainability being enabled by default.

Behavior Changes

New Features

  • Added user_files argument to Registry.log_model for including images or any extra file with the model.
  • Registry: Added support for handling Hugging Face model configurations with auto-mapping functionality
  • DataConnector: Add new DataConnector.from_sql() constructor

1.7.2

21 Nov 19:10
7bc5f40

Bug Fixes

  • Model Explainability: Fix an issue where explain was enabled for scikit-learn pipelines
    whose task is UNKNOWN and failed later when invoked.

Behavior Changes

New Features

  • Registry: Support asynchronous model inference service creation with the block option
    in ModelVersion.create_service() set to True by default.
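
As a rough illustration of what a block flag usually means, the sketch below waits for the result when block=True and returns a handle otherwise. Everything in it is hypothetical: this create_service, the _deploy helper and the use of concurrent.futures are assumptions for illustration, not the library's implementation or return types.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Union

_executor = ThreadPoolExecutor(max_workers=1)


def _deploy(name: str) -> str:
    """Pretend to do slow deployment work."""
    time.sleep(0.05)
    return f"{name}: READY"


def create_service(name: str, block: bool = True) -> Union[str, "Future[str]"]:
    """Hypothetical helper: wait for the result when block=True, else return a handle."""
    future = _executor.submit(_deploy, name)
    if block:
        return future.result()  # wait until the deployment finishes
    return future  # the caller decides when to wait


if __name__ == "__main__":
    print(create_service("svc_a"))               # blocking call
    handle = create_service("svc_b", block=False)
    print(handle.result(timeout=5))              # wait later, on demand
```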

1.7.1

05 Nov 19:21
38d2497

Bug Fixes

  • Registry: Null values are now allowed in the dataframe used in model signature inference. Null values are ignored
    and the remaining values are used to infer the signature.
  • Registry: pandas extension dtypes (pandas.StringDtype(), pandas.BooleanDtype(), etc.) are now supported in model
    signature inference.
  • Registry: Null values are now allowed in the dataframe used to predict.
  • Data: Fix missing snowflake.ml.data.* module exports in wheel.
  • Dataset: Fix missing snowflake.ml.dataset.* module exports in wheel.
  • Registry: Fix an issue where tf_keras.Model was not recognized as a Keras model when logging.
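
The null-handling rule described above (ignore nulls, infer from the remaining values) can be sketched in plain Python. This is a simplified illustration under stated assumptions: infer_column_type and infer_signature are hypothetical helpers operating on lists of dicts, not the library's signature inference.

```python
from typing import Any, Dict, List, Optional


def infer_column_type(values: List[Any]) -> Optional[str]:
    """Return the type name shared by the non-null values, or None if all are null."""
    seen = {type(v).__name__ for v in values if v is not None}
    if not seen:
        return None
    if len(seen) > 1:
        raise ValueError(f"mixed types: {sorted(seen)}")
    return seen.pop()


def infer_signature(rows: List[Dict[str, Any]]) -> Dict[str, Optional[str]]:
    """Infer one type per column, looking at every row rather than only the first."""
    columns = list(dict.fromkeys(key for row in rows for key in row))
    return {col: infer_column_type([row.get(col) for row in rows]) for col in columns}


if __name__ == "__main__":
    sample = [{"a": 1, "b": None}, {"a": None, "b": "x"}, {"a": 3, "b": "y"}]
    print(infer_signature(sample))
```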

Behavior Changes

New Features

  • Registry: Add the enable_monitoring option, set to False by default. It gates access to preview features of Model Monitoring.
  • Model Monitoring: show_model_monitors Registry method. This feature is still in Private Preview.
  • Registry: Support pd.Series in input and output data.
  • Model Monitoring: add_monitor Registry method. This feature is still in Private Preview.
  • Model Monitoring: resume and suspend ModelMonitor. This feature is still in Private Preview.
  • Model Monitoring: get_monitor Registry method. This feature is still in Private Preview.
  • Model Monitoring: delete_monitor Registry method. This feature is still in Private Preview.

1.7.0

22 Oct 19:23
f737798

Behavior Changes

  • Generic: Require python >= 3.9.
  • Data Connector: Update to_torch_dataset and to_torch_datapipe to add a dimension for scalar data.
    This allows for more seamless integration with PyTorch DataLoader, which creates batches by stacking inputs of each batch.

    Examples:

    ds = connector.to_torch_dataset(shuffle=False, batch_size=3)

    • Input: "col1": [10, 11, 12]
      • Previous batch: array([10., 11., 12.]) with shape (3,)
      • New batch: array([[10.], [11.], [12.]]) with shape (3, 1)
    • Input: "col2": [[0, 100], [1, 110], [2, 200]]
      • Previous batch: array([[ 0, 100], [ 1, 110], [ 2, 200]]) with shape (3, 2)
      • New batch: No change
  • Model Registry: External access integrations are optional when creating a model inference service in
    Snowflake >= 8.40.0.

  • Model Registry: Deprecate build_external_access_integration in favor of build_external_access_integrations in
    ModelVersion.create_service().
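
The scalar-dimension change for to_torch_dataset described above can be mimicked with nested Python lists. In this sketch, plain lists stand in for arrays, and shape and add_scalar_dimension are hypothetical helpers written for illustration; they are not part of the library.

```python
from typing import Any, List, Tuple


def shape(batch: Any) -> Tuple[int, ...]:
    """Shape of a rectangular nested list, e.g. [[1], [2]] has shape (2, 1)."""
    dims = []
    while isinstance(batch, list):
        dims.append(len(batch))
        batch = batch[0] if batch else None
    return tuple(dims)


def add_scalar_dimension(batch: List[Any]) -> List[Any]:
    """Wrap each scalar in a list; leave rows that are already lists untouched."""
    return [row if isinstance(row, list) else [row] for row in batch]


if __name__ == "__main__":
    col1 = [10, 11, 12]
    col2 = [[0, 100], [1, 110], [2, 200]]
    print(shape(col1), shape(add_scalar_dimension(col1)))
    print(shape(col2), shape(add_scalar_dimension(col2)))
```

Scalar columns gain a trailing dimension of size 1, while columns that already hold per-row lists keep their shape, matching the "No change" case in the example.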

Bug Fixes

  • Registry: Updated log_model API to accept both signature and sample_input_data parameters.
  • Feature Store: ExampleHelper now uses a fully qualified path for the table name. Changed the weather features aggregation from 1d to 1h.
  • Data Connector: Return a numpy array with the appropriate object type instead of a list for multi-dimensional
    data from to_torch_dataset and to_torch_datapipe.
  • Model explainability: Incompatibility between SHAP 0.42.1 and XGB 2.1.1 resolved by using latest SHAP 0.46.0.

New Features

  • Registry: Support passing variable-length keyword arguments to the ModelContext class. Example usage:
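    import json

    import pandas as pd
    from snowflake.ml.model import custom_model

    # model1 is assumed to be an already-trained model object exposing predict().
    mc = custom_model.ModelContext(
        config='local_model_dir/config.json',
        m1=model1,
    )

    class ExamplePipelineModel(custom_model.CustomModel):
        def __init__(self, context: custom_model.ModelContext) -> None:
            super().__init__(context)
            # Read the bias from the JSON config file stored in the context.
            with open(self.context['config']) as f:
                self.bias = json.loads(f.read())['bias']

        @custom_model.inference_api
        def predict(self, input: pd.DataFrame) -> pd.DataFrame:
            # Delegate to the wrapped model, then apply the bias.
            model_output = self.context['m1'].predict(input)
            return pd.DataFrame({'output': model_output + self.bias})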
  • Model Development: Upgrade scikit-learn in UDTF backend for log_loss metric. As a result, the eps argument is now ignored.
  • Data Connector: Add the option of passing a batch size of None to to_torch_dataset for better
    interoperability with PyTorch DataLoader.
  • Model Registry: Support pandas.CategoricalDtype.
  • Registry: It is now possible to pass signatures and sample_input_data at the same time to capture background
    data for explainability and data lineage.
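
For readers unfamiliar with the keyword-argument style used by the ModelContext item above, the sketch below shows the general idea of a context object that stores arbitrary keyword arguments and exposes them by key. SimpleContext is a hypothetical stand-in written for illustration; it is not the library's ModelContext, and the bias entry is an invented example value.

```python
from typing import Any


class SimpleContext:
    """Hypothetical stand-in for a context object; not the library's ModelContext."""

    def __init__(self, **kwargs: Any) -> None:
        # Every keyword argument becomes a named entry.
        self._entries = dict(kwargs)

    def __getitem__(self, key: str) -> Any:
        try:
            return self._entries[key]
        except KeyError:
            raise KeyError(f"no context entry named {key!r}") from None


ctx = SimpleContext(config="local_model_dir/config.json", bias=0.5)

if __name__ == "__main__":
    print(ctx["config"], ctx["bias"])
```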

1.6.4

17 Oct 18:54
f54ab9f

Bug Fixes

  • Registry: Fix an issue that led to an incident when using ModelVersion.run with a service.

1.6.3

07 Oct 18:37
6186ce6

  • Model Registry (PrPr) has been removed.

Bug Fixes

  • Registry: Fix a bug where an unexpected normalization occurred when a package whose name does not follow PEP-508
    was provided when logging the model.
  • Registry: Fix the "not a valid remote uri" error when logging MLflow models.
  • Registry: Fix a bug that occurred when ModelVersion.run is called in a nested way.
  • Registry: Fix an issue that led to log_model failure when a local package version contains parts other than
    the base version.
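
Two of the fixes above concern package names and versions. As background, the sketch below shows the standard PEP 503 name normalization and a simplified extraction of the base version (epoch plus release segment) from a PEP 440 style string. It is an illustration only: the helper names are hypothetical, the regular expression is a simplification, and nothing here is claimed to be what log_model does internally.

```python
import re


def normalize_name(name: str) -> str:
    """PEP 503 normalization: collapse runs of '-', '_' and '.' into '-' and lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Optional epoch ("1!") followed by the dotted release segment ("2.1.0").
_BASE_VERSION = re.compile(r"^(?:(\d+)!)?(\d+(?:\.\d+)*)")


def base_version(version: str) -> str:
    """Keep only the epoch and release segment, dropping pre/post/dev/local parts."""
    match = _BASE_VERSION.match(version)
    if match is None:
        raise ValueError(f"not a recognizable version: {version!r}")
    epoch, release = match.groups()
    return f"{epoch}!{release}" if epoch else release


if __name__ == "__main__":
    print(normalize_name("My_Package.Name"))
    print(base_version("2.1.0+cpu"), base_version("1.6.3.dev0"))
```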

New Features

  • Data: Improve DataConnector.to_pandas() performance when loading from Snowpark DataFrames.
  • Model Registry: Allow users to set a model task while using log_model.
  • Feature Store: FeatureView supports ON_CREATE or ON_SCHEDULE initialize mode.

1.6.2

12 Sep 18:17
f50d041

1.6.2 (TBD)

Bug Fixes

  • Modeling: Support XGBoost versions greater than 2.
  • Data: Fix multiple epoch iteration over DataConnector.to_torch_datapipe() DataPipes.
  • Generic: Fix a bug where an invalid name provided to an argument that expects a fully qualified name was
    parsed incorrectly. It now raises an exception.
  • Model Explainability: Handle explanations for multiclass XGBoost classification models.
  • Model Explainability: Workarounds and better error handling for XGB>2.1.0 not working with SHAP==0.42.1.
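
The fully qualified name fix above amounts to rejecting malformed names instead of mis-parsing them. A minimal sketch of that behavior follows. It assumes a three-part database.schema.object form and a simplified identifier grammar (unquoted, or double-quoted with "" as an escaped quote); parse_fully_qualified_name is a hypothetical helper, and real Snowflake identifier rules are more nuanced.

```python
import re
from typing import Tuple

# One identifier: unquoted (letters, digits, _, $) or double-quoted with "" as an escaped quote.
_IDENT = r'(?:[A-Za-z_][A-Za-z0-9_$]*|"(?:[^"]|"")+")'
_FQN = re.compile(rf"({_IDENT})\.({_IDENT})\.({_IDENT})")


def parse_fully_qualified_name(name: str) -> Tuple[str, str, str]:
    """Split database.schema.object, raising ValueError instead of guessing."""
    match = _FQN.fullmatch(name)
    if match is None:
        raise ValueError(f"invalid fully qualified name: {name!r}")
    return match.group(1), match.group(2), match.group(3)


if __name__ == "__main__":
    print(parse_fully_qualified_name("MY_DB.MY_SCHEMA.MY_MODEL"))
    print(parse_fully_qualified_name('"my.db".PUBLIC.M1'))
```

Matching the whole string up front means a stray dot or a missing part fails loudly rather than producing a wrong split.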

New Features

  • Data: Add top-level exports for DataConnector and DataSource to snowflake.ml.data.
  • Data: Add native batching support via batch_size and drop_last_batch arguments to DataConnector.to_torch_dataset()
  • Feature Store: update_feature_view() supports taking feature view object as argument.
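
To make the batch_size and drop_last_batch arguments above concrete, here is a minimal batching sketch in plain Python. The iter_batches helper is hypothetical and only shares the argument names with the release note; it is not DataConnector code, and it deliberately leaves drop_last_batch without a default so that no default is implied.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int, drop_last_batch: bool) -> Iterator[List[T]]:
    """Yield lists of batch_size rows; optionally drop a final, smaller batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch and not drop_last_batch:
        yield batch  # leftover rows that did not fill a whole batch


if __name__ == "__main__":
    print(list(iter_batches(range(7), 3, drop_last_batch=True)))
    print(list(iter_batches(range(7), 3, drop_last_batch=False)))
```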

Behavior Changes

1.6.1

13 Aug 02:53
2b044fc

1.6.1 (2024-08-12)

Bug Fixes

  • Feature Store: Support large metadata blob when generating dataset
  • Feature Store: Added a hidden knob in FeatureView, passed as kwargs, for setting a customized
    refresh_mode.
  • Registry: Fix the error message in ModelVersion.run when function_name is not specified and the model has multiple
    target methods.
  • Cortex inference: snowflake.cortex.Complete now only uses the REST API for streaming, and the use_rest_api_experimental
    option is no longer needed.
  • Feature Store: Add a new API, FeatureView.list_columns(), which lists all column information.
  • Data: Fix DataFrame ingestion with ArrowIngestor.

New Features

  • Enable set_params to set the parameters of the underlying sklearn estimator, if the snowflake-ml model has been fit.
  • Data: Add top-level exports for DataConnector and DataSource to snowflake.ml.data.
  • Data: Add snowflake.ml.data.ingestor_utils module with utility functions helpful for DataIngestor implementations.
  • Data: Add new to_torch_dataset() connector to DataConnector to replace deprecated DataPipe.
  • Registry: The enable_explainability option is set to True by default for XGBoost, LightGBM and CatBoost as a PuPr feature.
  • Registry: Option to enable_explainability when registering SHAP supported sklearn models.

Behavior Changes

1.6.0

29 Jul 21:11
123693a

Bug Fixes

  • Modeling: SimpleImputer can impute integer columns with integer values.
  • Registry: Fix an issue when providing a pandas DataFrame whose index does not start from 0 as the input to
    ModelVersion.run.

New Features

  • Feature Store: Add overloads so that APIs accept both an object and a name/version. Impacted APIs include read_feature_view(),
    refresh_feature_view(), get_refresh_history(), resume_feature_view(), suspend_feature_view(), delete_feature_view().
  • Feature Store: Add docstring inline examples for all public APIs.
  • Feature Store: Add new utility class ExampleHelper to help load source data and simplify public notebooks.
  • Registry: Option to enable_explainability when registering XGBoost models as a pre-PuPr feature.
  • Feature Store: add new API update_entity().
  • Registry: Option to enable_explainability when registering Catboost models as a pre-PuPr feature.
  • Feature Store: Add new argument warehouse to FeatureView constructor to overwrite the default warehouse. Also add
    a new column 'warehouse' to the output of list_feature_views().
  • Registry: Add support for logging model from a model version.
  • Modeling: Distributed Hyperparameter Optimization now has a GA refresh version. The latest memory-efficient version
    no longer has the 10GB training limitation for datasets. To turn it off, run:

    from snowflake.ml.modeling._internal.snowpark_implementations import (
        distributed_hpo_trainer,
    )
    distributed_hpo_trainer.ENABLE_EFFICIENT_MEMORY_USAGE = False
  • Registry: Option to enable_explainability when registering LightGBM models as a pre-PuPr feature.
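
The "object or name/version" overloads mentioned above for the Feature Store APIs can be expressed with typing.overload. The sketch below is illustrative only: FeatureViewRef, get_status and the in-memory _STORE are hypothetical names invented here, not Feature Store classes or methods.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union, overload


@dataclass(frozen=True)
class FeatureViewRef:
    """Hypothetical minimal feature view handle (name plus version)."""
    name: str
    version: str


_STORE: Dict[Tuple[str, str], str] = {("weather", "v1"): "ACTIVE"}


@overload
def get_status(feature_view: FeatureViewRef) -> str: ...
@overload
def get_status(feature_view: str, version: str) -> str: ...


def get_status(feature_view: Union[FeatureViewRef, str], version: Optional[str] = None) -> str:
    """Accept either an object, or a name together with a version."""
    if isinstance(feature_view, FeatureViewRef):
        if version is not None:
            raise ValueError("version must be omitted when an object is passed")
        key = (feature_view.name, feature_view.version)
    else:
        if version is None:
            raise ValueError("version is required when a name is passed")
        key = (feature_view, version)
    return _STORE[key]


if __name__ == "__main__":
    print(get_status(FeatureViewRef("weather", "v1")), get_status("weather", "v1"))
```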

Behavior Changes

  • Feature Store: Change some positional parameters to keyword arguments in the following APIs:
    • Entity(): desc.
    • FeatureView(): timestamp_col, refresh_freq, desc.
    • FeatureStore(): creation_mode.
    • update_entity(): desc.
    • register_feature_view(): block, overwrite.
    • list_feature_views(): entity_name, feature_view_name.
    • get_refresh_history(): verbose.
    • retrieve_feature_values(): spine_timestamp_col, exclude_columns, include_feature_view_timestamp_col.
    • generate_training_set(): save_as, spine_timestamp_col, spine_label_cols, exclude_columns,
      include_feature_view_timestamp_col.
    • generate_dataset(): version, spine_timestamp_col, spine_label_cols, exclude_columns,
      include_feature_view_timestamp_col, desc, output_type.
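
In Python, a change from positional parameters to keyword arguments like the one listed above is typically expressed with a bare * in the signature. The sketch below shows the effect on callers. make_entity is a hypothetical function that only borrows the desc parameter name from the list; it is not the library's Entity constructor.

```python
from typing import Any, Dict, List


def make_entity(name: str, join_keys: List[str], *, desc: str = "") -> Dict[str, Any]:
    """Hypothetical constructor: parameters after * can only be passed by keyword."""
    return {"name": name, "join_keys": join_keys, "desc": desc}


if __name__ == "__main__":
    print(make_entity("user", ["user_id"], desc="One row per user"))
    try:
        make_entity("user", ["user_id"], "One row per user")  # positional desc
    except TypeError as exc:
        print("rejected:", exc)
```

Callers that previously passed such arguments positionally need to switch to the keyword form, for example desc="...".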