Releases: snowflakedb/snowflake-ml-python

1.5.1

22 May 18:28
2932445

Bug Fixes

  • Dataset: Fix snowflake.connector.errors.DataError: Query Result did not match expected number of rows when accessing
    DatasetVersion properties while the case-insensitive SHOW VERSIONS IN DATASET check matches multiple version names.
  • Dataset: Fix bug in SnowFS bulk file read when used with DuckDB
  • Registry: Fixed a bug when loading old models.
  • Lineage: Fix Dataset source lineage propagation through snowpark.DataFrame transformations
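
The Dataset fix above concerns a case-insensitive lookup that can match more than one version name. The following is a minimal sketch of that failure mode and one way to resolve it, not the library's actual implementation; find_version_row and the row layout are hypothetical names chosen for illustration.

```python
def find_version_row(rows, version):
    """Return the single row whose name equals `version` exactly.

    `rows` stands in for the result of a case-insensitive listing, which
    may contain several names that differ only in case (e.g. "v1", "V1").
    """
    # Case-insensitive candidates: this is where multiple rows can appear.
    candidates = [r for r in rows if r["name"].lower() == version.lower()]
    # Narrow down to the exact, case-sensitive match.
    exact = [r for r in candidates if r["name"] == version]
    if len(exact) != 1:
        raise LookupError(
            f"expected exactly one version named {version!r}, found {len(exact)}"
        )
    return exact[0]


# Hypothetical listing result with two names that differ only in case.
rows = [
    {"name": "v1", "comment": "lower"},
    {"name": "V1", "comment": "upper"},
]
```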

Behavior Changes

  • Feature Store: Convert clear() into a private function. It now deletes feature views and entities only.
  • Feature Store: Use NULL as default value for timestamp tag value.

New Features

  • Feature Store: Added new snowflake.ml.feature_store.setup_feature_store() API to assist Feature Store RBAC setup.
  • Feature Store: Add output_type argument to FeatureStore.generate_dataset() to allow generating data snapshots
    as Datasets or Tables.
  • Registry: log_model, get_model, and delete_model now support fully qualified names.
  • Modeling: Support anonymous stored procedures during fit calls, so that modeling does not require
    permissions to operate on the schema. To enable this, run
    import snowflake.ml.modeling.parameters.enable_anonymous_sproc # noqa: F401

1.5.0

01 May 20:03
c530f5c

Bug Fixes

  • Registry: Fix invalid parameter 'SHOW_MODEL_DETAILS_IN_SHOW_VERSIONS_IN_MODEL' error.

Behavior Changes

  • Model Development: The behavior of fit_transform has changed for all estimators.
    First, it now covers every estimator that has this function.
    Second, the output is the union of pandas DataFrame and Snowpark DataFrame.

Model Registry (PrPr)

snowflake.ml.registry.artifact and related snowflake.ml.model_registry.ModelRegistry APIs have been removed.

  • Removed snowflake.ml.registry.artifact module.
  • Removed ModelRegistry.log_artifact(), ModelRegistry.list_artifacts(), ModelRegistry.get_artifact()
  • Removed artifacts argument from ModelRegistry.log_model()

Dataset (PrPr)

snowflake.ml.dataset.Dataset has been redesigned to be backed by Snowflake Dataset entities.

  • New Datasets can be created with Dataset.create() and existing Datasets may be loaded
    with Dataset.load().
  • Datasets now maintain an immutable selected_version state. The Dataset.create_version() and
    Dataset.load_version() APIs return new Dataset objects with the requested selected_version state.
  • Added dataset.create_from_dataframe() and dataset.load_dataset() convenience APIs as a shortcut
    to creating and loading Datasets with a pre-selected version.
  • Dataset.materialized_table and Dataset.snapshot_table no longer exist; Dataset.fully_qualified_name
    is the closest equivalent.
  • Dataset.df no longer exists. Instead, use DatasetReader.read.to_snowpark_dataframe().
  • Dataset.owner has been moved to Dataset.selected_version.owner
  • Dataset.desc has been moved to DatasetVersion.selected_version.comment
  • Dataset.timestamp_col, Dataset.label_cols, Dataset.feature_store_metadata, and
    Dataset.schema_version have been removed.
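
The immutable selected_version pattern described above can be illustrated with a frozen dataclass. This is a minimal sketch under stated assumptions, not the real snowflake.ml.dataset.Dataset class; DatasetHandle and select_version are hypothetical stand-ins.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class DatasetHandle:
    """Hypothetical stand-in for a Dataset object with an immutable selected version."""

    name: str
    selected_version: Optional[str] = None

    def select_version(self, version: str) -> "DatasetHandle":
        # Return a new object rather than mutating this one.
        return replace(self, selected_version=version)


base = DatasetHandle("my_db.my_schema.my_dataset")
v1 = base.select_version("v1")  # `base` is left unchanged
```

Because the object is frozen, assigning to v1.selected_version raises an error, so switching versions always goes through a call that returns a new object.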

Feature Store (PrPr)

FeatureStore.generate_dataset argument list has been changed to match the new
snowflake.ml.dataset.Dataset definition

  • materialized_table has been removed and replaced with name and version.
  • name moved to first positional argument
  • save_mode has been removed as merge behavior is no longer supported. The new behavior is always errorifexists.
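
The errorifexists behavior mentioned above can be sketched with an in-memory store. The sketch only illustrates the semantics (creation fails rather than merging when the name and version already exist); create_version, VersionExistsError, and the dict-based store are hypothetical and unrelated to the actual FeatureStore implementation.

```python
class VersionExistsError(Exception):
    """Raised when a (name, version) pair has already been created."""


def create_version(store: dict, name: str, version: str, data) -> None:
    """Create a new snapshot, refusing to overwrite or merge into an existing one."""
    key = (name, version)
    if key in store:
        raise VersionExistsError(f"{name} version {version} already exists")
    store[key] = data
```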

New Features

  • Registry: Add export method to ModelVersion instance to export model files.
  • Registry: Add load method to ModelVersion instance to load the underlying object from the model.
  • Registry: Add Model.rename method to Model instance to rename or move a model.

Dataset (PrPr)

  • Added Snowpark DataFrame integration using Dataset.read.to_snowpark_dataframe()
  • Added Pandas DataFrame integration using Dataset.read.to_pandas()
  • Added PyTorch and TensorFlow integrations using Dataset.read.to_torch_datapipe()
    and Dataset.read.to_tf_dataset() respectively.
  • Added fsspec style file integration using Dataset.read.files() and Dataset.read.filesystem()

1.4.1 (2024-04-18)

New Features

  • Registry: Add support for catboost model (catboost.CatBoostClassifier, catboost.CatBoostRegressor).
  • Registry: Add support for lightgbm model (lightgbm.Booster, lightgbm.LGBMClassifier, lightgbm.LGBMRegressor).

Bug Fixes

  • Registry: Fix a bug that caused the relax_version option to not work.

1.4.0

08 Apr 20:25
b1cfe76

Bug Fixes

  • Registry: Fix a bug where, when multiple models are called from the same query, models other than the first one
    return incorrect results. This fix only applies to newly logged models.
  • Modeling: When registering a model, only the method(s) mentioned in save_model are added to the model signature
    in SnowML models.
  • Modeling: Fix a bug where, when n_jobs is not 1, the model cannot execute methods such as
    predict, predict_log_proba, and other batch inference methods. n_jobs is now automatically
    set to 1 because vectorized UDFs currently do not support the joblib parallel backend.
  • Modeling: Fix a bug where batch inference methods cannot infer the data type when the first row of data contains NULL.
  • Modeling: Match Distributed HPO output column names with the Snowflake identifier.
  • Modeling: Relax package versions for all Distributed HPO methods if the installed version
    is not available in the Snowflake conda channel
  • Modeling: Add sklearn as required dependency for LightGBM package.
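
The NULL-in-first-row fix above is about type inference that should not depend on the first row alone. The following is a minimal pure-Python sketch of that idea, not the library's actual code; infer_column_types and the row format are hypothetical.

```python
def infer_column_types(rows):
    """Infer a Python type per column from the first non-NULL value seen.

    Columns whose values are all None are left out of the result.
    """
    types = {}
    for row in rows:
        for column, value in row.items():
            # Skip NULLs and columns that already have an inferred type.
            if value is not None and column not in types:
                types[column] = type(value)
    return types


# The first row has a NULL in "age"; the type comes from the second row.
rows = [
    {"age": None, "name": "a"},
    {"age": 31, "name": "b"},
]
```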

Behavior Changes

  • Registry: The apply method is no longer logged by default when logging an XGBoost model. If it is required, it can
    be specified manually when logging the model with log_model(..., options={"target_methods": ["apply", ...]}).
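
As a sketch of the option above, a small helper can pass target_methods explicitly. log_xgboost_with_apply is a hypothetical wrapper, and the choice of "predict" alongside "apply" is an assumption; only log_model and the options={"target_methods": [...]} form come from the note itself. The registry argument is expected to be an object exposing log_model, such as a snowflake.ml.registry.Registry instance.

```python
def log_xgboost_with_apply(registry, model, model_name, sample_input_data):
    """Log a model and explicitly request both `predict` and `apply`.

    `registry` is any object with a `log_model` method; the method list
    here is an illustrative assumption, adjust it to the methods you need.
    """
    return registry.log_model(
        model,
        model_name=model_name,
        sample_input_data=sample_input_data,
        options={"target_methods": ["predict", "apply"]},
    )
```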

New Features

  • Registry: Add support for sentence-transformers model (sentence_transformers.SentenceTransformer).
  • Registry: A version name is no longer required when logging a model. If not provided, a random human-readable ID
    will be generated.
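
To give a sense of what a random human-readable ID might look like, here is a minimal sketch. The word lists and the adjective_noun_number format are assumptions made for illustration; the note above does not state the scheme the library actually uses.

```python
import random

# Hypothetical word lists; the library's real vocabulary is not specified.
ADJECTIVES = ["brave", "calm", "eager", "quiet"]
NOUNS = ["falcon", "otter", "maple", "comet"]


def random_version_name(rng=None):
    """Return an ID such as 'calm_otter_7' built from random words and a number."""
    if rng is None:
        rng = random.Random()
    return f"{rng.choice(ADJECTIVES)}_{rng.choice(NOUNS)}_{rng.randint(1, 99)}"
```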

1.3.1

21 Mar 21:00
fbebee7

New Features

  • FileSet: snowflake.ml.fileset.sfcfs.SFFileSystem can now be used in UDFs and stored procedures.

1.3.0

12 Mar 22:46
27431b2

Bug Fixes

  • Registry: Fix a bug where a module in code_paths passed to log_model could not be imported correctly.
  • Registry: Fix incorrect error message when validating input Snowpark DataFrame with array feature.
  • Model Registry: Fix an issue where some files did not have proper permissions when deploying a model to SPCS.
  • Model Development: Relax package versions for all inference methods if the installed version
    is not available in the Snowflake conda channel

Behavior Changes

  • Registry: When running a model method, the value-range-based input validation that guards against input overflow
    is now optional rather than enforced. This should improve performance and should not cause problems for most
    kinds of models. To enable the check as before, specify strict_input_validation=True when
    calling run.
  • Registry: relax_version=True is now the default when logging a model, instead of using the specific local dependency
    versions. This improves dependency versioning by using versions available in Snowflake. To switch back to the previous
    behavior and use specific local dependency versions, specify relax_version=False when calling log_model.
  • Model Development: The behavior of fit_predict has changed for all estimators.
    First, it now covers every estimator that has this function.
    Second, the output is the union of pandas DataFrame and Snowpark DataFrame.
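
The optional value-range check described in the first Registry item above can be sketched as follows. This is an illustration of the idea only, assuming an 8-bit signed integer feature; validate_int8 is a hypothetical helper, and only the strict_input_validation flag name comes from the note.

```python
INT8_MIN, INT8_MAX = -128, 127


def validate_int8(values, strict_input_validation=False):
    """Return the values as a list, optionally checking they fit in int8.

    With strict_input_validation=False (the default here, mirroring the
    new behavior), no range check is performed.
    """
    if strict_input_validation:
        for v in values:
            if not INT8_MIN <= v <= INT8_MAX:
                raise ValueError(f"value {v} is outside the int8 range")
    return list(values)
```

Skipping the loop by default avoids a full pass over the input, which is where the performance gain in this sketch comes from.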

New Features

  • FileSet: snowflake.ml.fileset.sfcfs.SFFileSystem can now be serialized with pickle.

1.2.3

26 Feb 22:45
de45707

Bug Fixes

  • Registry: Providing a Decimal type column to a DOUBLE or FLOAT feature no longer raises an error; the column is
    automatically cast, with a warning.
  • Registry: Improve the error message when specifying currently unsupported pip_requirements argument.
  • Model Development: Fix precision_recall_fscore_support incorrect results when average="samples".
  • Model Registry: Fix an issue where description, metrics, or tags were not correctly returned in a newly created
    Model Registry (PrPr) due to Snowflake BCR 2024_01
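
The Decimal auto-cast behavior in the first item above can be sketched with the standard library. This is not the Registry's implementation; cast_decimal_column is a hypothetical helper that shows casting with a warning instead of raising.

```python
import warnings
from decimal import Decimal


def cast_decimal_column(values):
    """Cast Decimal entries to float, warning once if any cast happens.

    Non-Decimal entries (including None) are passed through unchanged.
    """
    if any(isinstance(v, Decimal) for v in values):
        warnings.warn(
            "Casting Decimal values to float; precision may be lost.",
            UserWarning,
            stacklevel=2,
        )
    return [float(v) if isinstance(v, Decimal) else v for v in values]
```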

Behavior Changes

  • Feature Store: FeatureStore.suspend_feature_view and FeatureStore.resume_feature_view no longer mutate the input
    feature view argument. The updated status is only reflected in the returned feature view object.

New Features

  • Model Development: support score_samples method for all the classes, including Pipeline,
    GridSearchCV, RandomizedSearchCV, PCA, IsolationForest, ...
  • Registry: Support deleting a version of a model.

1.2.2

13 Feb 20:43
ef56e3f

Bug Fixes

Behavior Changes

New Features

  • Model Registry: Support providing external access integrations when deploying a model to SPCS. Because SPCS denies
    all network connections by default, these integrations are required for the deployment process to work. The
    following endpoints must be allowed to make deployment work: docker.com:80, docker.com:443, anaconda.com:80,
    anaconda.com:443, anaconda.org:80, anaconda.org:443, pypi.org:80, pypi.org:443. If you are using a
    snowflake.ml.model.models.huggingface_pipeline.HuggingFacePipelineModel object, the following endpoints must
    be allowed: huggingface.com:80, huggingface.com:443, huggingface.co:80, huggingface.co:443.

1.2.1

26 Jan 02:12
2a6cb27

New Features

  • Model Development: Infers output column data type for transformers when possible.
  • Registry: relax_version option is available in the options argument when logging the model.

1.2.0

11 Jan 23:37
8571a05

Bug Fixes

  • Model Registry: Fix "XGBoost version not compiled with GPU support" error when running CPU inference against open-source
    XGBoost models deployed to SPCS.
  • Model Registry: Fix model deployment to SPCS on Windows machines.

Behavior Changes

New Features

  • Model Development: Introduced XGBoost external memory training feature. This feature enables training XGBoost models
    on large datasets that don't fit into memory.
  • Registry: New Registry class named snowflake.ml.registry.Registry, providing APIs similar to the old one but working
    with the new MODEL object in Snowflake SQL. Also, we are providing snowflake.ml.model.Model and
    snowflake.ml.model.ModelVersion to represent a model and a specific version of a model.
  • Model Development: Add support for fit_predict method in AgglomerativeClustering, DBSCAN, and OPTICS classes.
  • Model Development: Add support for fit_transform method in MDS, SpectralEmbedding, and TSNE classes.

Additional Notes

  • Model Registry: snowflake.ml.registry.model_registry.ModelRegistry has been deprecated starting from version
    1.2.0. It will stay in the Private Preview phase. For future implementations, use
    snowflake.ml.registry.Registry unless the old registry is specifically required. The old model registry will be
    removed once all its primary functionalities are fully integrated into the new registry.

1.1.2

18 Dec 19:43
35d2b4f

Bug Fixes

  • Generic: Fix an issue where the stack trace was unexpectedly hidden by telemetry.
  • Model Development: Execute model signature inference without materializing full dataframe in memory.
  • Model Registry: Fix occasional 'snowflake-ml-python library does not exist' error when deploying to SPCS.

Behavior Changes

  • Model Registry: When calling predict with a Snowpark DataFrame, both inferred and normalized column names are accepted.
  • Model Registry: When logging a Snowpark ML Modeling Model, sample input data or manually provided signature will be
    ignored since they are not necessary.

New Features

  • Model Development: SQL implementation of binary precision_score metric.
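
For reference, binary precision is TP / (TP + FP). The sketch below computes it in plain Python to show the quantity involved; it is not the SQL implementation mentioned above, and binary_precision is a hypothetical helper. Returning 0.0 when nothing is predicted positive is an assumption of this sketch.

```python
def binary_precision(y_true, y_pred, pos_label=1):
    """Compute precision = TP / (TP + FP) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if p == pos_label and t == pos_label)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if p == pos_label and t != pos_label)
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0
```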