Project import generated by Copybara. (#116)
GitOrigin-RevId: 6fc3ce416ce5843fc01936fc61bf5480ae9f791f

Co-authored-by: Snowflake Authors <[email protected]>
sfc-gh-anavalos and Snowflake Authors authored Sep 12, 2024
1 parent 0bdaf0b commit f50d041
Showing 142 changed files with 4,356 additions and 1,493 deletions.
2 changes: 2 additions & 0 deletions BUILD.bazel
@@ -3,6 +3,8 @@ load("//:packages.bzl", "PACKAGES")
load("//bazel:py_rules.bzl", "py_wheel")
load("//bazel/requirements:rules.bzl", "generate_pyproject_file")

package(default_visibility = ["//visibility:public"])

exports_files([
"CHANGELOG.md",
"README.md",
23 changes: 21 additions & 2 deletions CHANGELOG.md
@@ -1,6 +1,26 @@
# Release History

-## 1.6.1 (TBD)
+## 1.6.2 (TBD)

### Bug Fixes

- Modeling: Support XGBoost versions greater than 2.

- Data: Fix multiple epoch iteration over `DataConnector.to_torch_datapipe()` DataPipes.
- Generic: Fix a bug where an invalid name passed to an argument expecting a fully qualified name was parsed
  incorrectly. It now raises an exception as expected.
- Model Explainability: Handle explanations for multiclass XGBoost classification models.
- Model Explainability: Add workarounds and better error handling for XGBoost > 2.1.0, which does not work with SHAP 0.42.1.

### New Features

- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.
- Data: Add native batching support via `batch_size` and `drop_last_batch` arguments to `DataConnector.to_torch_dataset()`.
- Feature Store: `update_feature_view()` now supports taking a feature view object as an argument.
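The semantics of the new `batch_size` and `drop_last_batch` arguments can be pictured with a small standalone sketch. This is a hypothetical reimplementation of the described batching behavior for illustration only, not the library's code:

```python
# Hypothetical illustration of batch_size / drop_last_batch semantics,
# mirroring what DataConnector.to_torch_dataset() is described as doing.
def batched(rows, batch_size, drop_last_batch=False):
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # A final partial batch is kept unless drop_last_batch is set.
    if batch and not drop_last_batch:
        yield batch

print(list(batched(range(5), 2)))                        # [[0, 1], [2, 3], [4]]
print(list(batched(range(5), 2, drop_last_batch=True)))  # [[0, 1], [2, 3]]
```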

### Behavior Changes

## 1.6.1 (2024-08-12)

### Bug Fixes

@@ -17,7 +37,6 @@
### New Features

- Enable `set_params` to set the parameters of the underlying sklearn estimator, if the snowflake-ml model has been fit.
- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.
- Data: Add `snowflake.ml.data.ingestor_utils` module with utility functions helpful for `DataIngestor` implementations.
- Data: Add new `to_torch_dataset()` connector to `DataConnector` to replace deprecated DataPipe.
- Registry: Option to `enable_explainability` set to True by default for XGBoost, LightGBM and CatBoost as PuPr feature.
6 changes: 5 additions & 1 deletion CONTRIBUTING.md
@@ -304,7 +304,7 @@ Example:
## Unit Testing
-Write `pytest` or Python `unittest` style unit tests.
+Write Python `unittest` style unit tests. Pytest is allowed, but not recommended.

### `unittest`

@@ -320,6 +320,10 @@ from absl.testing import absltest
# instead of
# from unittest import TestCase, main
from absl.testing.absltest import TestCase, main
# Call main.
if __name__ == '__main__':
absltest.main()
```
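For context, a complete minimal test module in this style looks like the sketch below. It uses the stdlib `unittest` so it is self-contained; in this repository you would import from `absl.testing.absltest` and call `absltest.main()` as shown above, with an identical test body. The test case itself is a hypothetical example:

```python
import unittest  # in-repo: from absl.testing import absltest


class StringUpperTest(unittest.TestCase):  # hypothetical example test case
    def test_upper(self):
        self.assertEqual("abc".upper(), "ABC")


if __name__ == "__main__":
    # in-repo: absltest.main(); argv/exit pinned here so the sketch
    # runs cleanly outside a test runner.
    unittest.main(argv=["prog"], exit=False)
```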

`absltest` provides better `bazel` integration which produces a more detailed XML
5 changes: 4 additions & 1 deletion bazel/environments/conda-env-snowflake.yml
@@ -28,6 +28,7 @@ dependencies:
- lightgbm==3.3.5
- mlflow==2.3.1
- moto==4.0.11
- mypy==1.10.0
- networkx==2.8.4
- numpy==1.23.5
- packaging==23.0
@@ -54,14 +55,16 @@ dependencies:
- snowflake-snowpark-python==1.17.0
- sphinx==5.0.2
- sqlparse==0.4.4
- starlette==0.27.0
- tensorflow==2.12.0
- tokenizers==0.13.2
- toml==0.10.2
- torchdata==0.6.1
- transformers==4.32.1
- types-PyYAML==6.0.12.12
- types-protobuf==4.23.0.1
- types-requests==2.30.0.0
- types-toml==0.10.8.6
- typing-extensions==4.5.0
- typing-extensions==4.6.3
- werkzeug==2.2.2
- xgboost==1.7.3
12 changes: 6 additions & 6 deletions bazel/environments/conda-env.yml
@@ -14,11 +14,6 @@ dependencies:
- cachetools==4.2.2
- catboost==1.2.0
- cloudpickle==2.2.1
- conda-forge::accelerate==0.22.0
- conda-forge::mypy==1.5.1
- conda-forge::starlette==0.27.0
- conda-forge::types-PyYAML==6.0.12
- conda-forge::types-cachetools==4.2.2
- conda-libmamba-solver==23.7.0
- coverage==6.3.2
- cryptography==39.0.1
@@ -33,6 +28,7 @@ dependencies:
- lightgbm==3.3.5
- mlflow==2.3.1
- moto==4.0.11
- mypy==1.10.0
- networkx==2.8.4
- numpy==1.23.5
- packaging==23.0
@@ -59,18 +55,22 @@ dependencies:
- snowflake-snowpark-python==1.17.0
- sphinx==5.0.2
- sqlparse==0.4.4
- starlette==0.27.0
- tensorflow==2.12.0
- tokenizers==0.13.2
- toml==0.10.2
- torchdata==0.6.1
- transformers==4.32.1
- types-PyYAML==6.0.12.12
- types-protobuf==4.23.0.1
- types-requests==2.30.0.0
- types-toml==0.10.8.6
- typing-extensions==4.5.0
- typing-extensions==4.6.3
- werkzeug==2.2.2
- xgboost==1.7.3
- pip
- pip:
- --extra-index-url https://pypi.org/simple
- accelerate==0.22.0
- types-cachetools==4.2.2
- peft==0.5.0
12 changes: 6 additions & 6 deletions bazel/environments/conda-gpu-env.yml
@@ -14,11 +14,6 @@ dependencies:
- cachetools==4.2.2
- catboost==1.2.0
- cloudpickle==2.2.1
- conda-forge::accelerate==0.22.0
- conda-forge::mypy==1.5.1
- conda-forge::starlette==0.27.0
- conda-forge::types-PyYAML==6.0.12
- conda-forge::types-cachetools==4.2.2
- conda-libmamba-solver==23.7.0
- coverage==6.3.2
- cryptography==39.0.1
@@ -33,6 +28,7 @@ dependencies:
- lightgbm==3.3.5
- mlflow==2.3.1
- moto==4.0.11
- mypy==1.10.0
- networkx==2.8.4
- numpy==1.23.5
- nvidia::cuda==11.7.*
@@ -61,19 +57,23 @@ dependencies:
- snowflake-snowpark-python==1.17.0
- sphinx==5.0.2
- sqlparse==0.4.4
- starlette==0.27.0
- tensorflow==2.12.0
- tokenizers==0.13.2
- toml==0.10.2
- torchdata==0.6.1
- transformers==4.32.1
- types-PyYAML==6.0.12.12
- types-protobuf==4.23.0.1
- types-requests==2.30.0.0
- types-toml==0.10.8.6
- typing-extensions==4.5.0
- typing-extensions==4.6.3
- werkzeug==2.2.2
- xgboost==1.7.3
- pip
- pip:
- --extra-index-url https://pypi.org/simple
- accelerate==0.22.0
- types-cachetools==4.2.2
- peft==0.5.0
- vllm==0.2.1.post1
5 changes: 0 additions & 5 deletions bazel/requirements/requirements.schema.json
@@ -59,11 +59,6 @@
"pattern": "^$|^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*))*((a|b|rc|alpha|beta)(0|[1-9][0-9]*))?(\\.post(0|[1-9][0-9]*))?(\\.dev(0|[1-9][0-9]*))?$",
"type": "string"
},
"from_channel": {
"default": "https://repo.anaconda.com/pkgs/snowflake",
"description": "The channel where the package come from, set if not from Snowflake Anaconda Channel.",
"type": "string"
},
"gpu_only": {
"default": false,
"description": "The package is required when running in an environment where GPU is available.",
4 changes: 2 additions & 2 deletions ci/conda_recipe/meta.yaml
@@ -17,7 +17,7 @@ build:
noarch: python
package:
name: snowflake-ml-python
-version: 1.6.1
+version: 1.6.2
requirements:
build:
- python
@@ -45,7 +45,7 @@ requirements:
- snowflake-snowpark-python>=1.17.0,<2
- sqlparse>=0.4,<1
- typing-extensions>=4.1.0,<5
- xgboost>=1.7.3,<2
- xgboost>=1.7.3,<2.1
- python>=3.8,<3.12
run_constrained:
- catboost>=1.2.0, <2
2 changes: 0 additions & 2 deletions codegen/codegen_rules.bzl
@@ -90,7 +90,6 @@ def autogen_estimators(module, estimator_info_list):
"//snowflake/ml/_internal/exceptions:exceptions",
"//snowflake/ml/_internal/utils:temp_file_utils",
"//snowflake/ml/_internal/utils:query_result_checker",
"//snowflake/ml/_internal/utils:pkg_version_utils",
"//snowflake/ml/_internal/utils:identifier",
"//snowflake/ml/model:model_signature",
"//snowflake/ml/model/_signatures:utils",
@@ -181,7 +180,6 @@ def autogen_snowpark_pandas_tests(module, module_root_dir, snowpark_pandas_estim
"//snowflake/ml/_internal/snowpark_pandas:snowpark_pandas_lib",
"//snowflake/ml/utils:connection_params",
],
compatible_with_snowpark = False,
timeout = "long",
legacy_create_init = 0,
shard_count = 5,
9 changes: 6 additions & 3 deletions codegen/sklearn_wrapper_generator.py
@@ -1153,15 +1153,18 @@ def generate(self) -> "XGBoostWrapperGenerator":
super().generate()

# Populate XGBoost specific values
-self.estimator_imports_list.append("import xgboost")
+self.estimator_imports_list.extend(["import sklearn", "import xgboost"])
self.test_estimator_input_args_list.extend(
["random_state=0", "subsample=1.0", "colsample_bynode=1.0", "n_jobs=1"]
)
-self.score_sproc_imports = ["xgboost"]
+self.score_sproc_imports = ["xgboost", "sklearn"]
# TODO(snandamuri): Replace cloudpickle with joblib after latest version of joblib is added to snowflake conda.
self.supported_export_method = "to_xgboost"
self.unsupported_export_methods = ["to_sklearn", "to_lightgbm"]
-self.deps = "f'numpy=={np.__version__}', f'xgboost=={xgboost.__version__}', f'cloudpickle=={cp.__version__}'"
+self.deps = (
+    "f'numpy=={np.__version__}', f'scikit-learn=={sklearn.__version__}', "
+    + "f'xgboost=={xgboost.__version__}', f'cloudpickle=={cp.__version__}'"
+)
self._construct_string_from_lists()
return self
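The dependency string assembled in `self.deps` holds f-string fragments that the generated code interpolates later; its effect can be sketched standalone. The version numbers and the `_Mod` stand-in class below are hypothetical, for illustration only:

```python
# Stand-ins for the modules the generated code interpolates (np, sklearn,
# xgboost, cp); only the __version__ attribute matters for this sketch.
class _Mod:
    def __init__(self, version):
        self.__version__ = version

np = _Mod("1.23.5")
sklearn = _Mod("1.3.0")
xgboost = _Mod("2.0.3")
cp = _Mod("2.2.1")

# Mirrors the f-string fragments joined in self.deps above, after evaluation.
deps = (
    f"numpy=={np.__version__}",
    f"scikit-learn=={sklearn.__version__}",
    f"xgboost=={xgboost.__version__}",
    f"cloudpickle=={cp.__version__}",
)
print(deps)
```

The net effect of this change is that generated estimators now pin `scikit-learn` alongside `numpy`, `xgboost`, and `cloudpickle` in their stored-procedure dependencies.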

