
Commit ea885c8

niall-turbitt, arpitjasa-db, vladimirk-db authored
Models uc update (#106)
* bump DBR 12.2 -> 13.3
* add input_include_models_in_unity_catalog option
* add schema name
* add enums in template schema
* update train notebook for models in UC
* bump mlflow
* update with models in uc functionality
* model validation updated for models in UC
* update validation workflow
* update deploy for models in UC
* remove line
* batch inference updated for models in UC
* update train with recipes for models in UC
* update condition for model_name
* update test config
* update ml artifacts with UC models
* update model variable for template
* add unity_catalog_read_user_group prompt
* update tests for additional prompts
* update example project configs
* add input_
* revert MLflow Recipes changes for models in UC
* remove url
* fix docstring
* add additional params
* update tests
* update feature store code
* fix schema name
* revert fs
* update readme
* update feature table names for UC
* add links to bundles files
* update UC read user group
* UC as default
* Update databricks_template_schema.json
  Co-authored-by: Arpit Jasapara <[email protected]>
* rearrange args
* Update template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/validation/notebooks/ModelValidation.py.tmpl
  Co-authored-by: Arpit Jasapara <[email protected]>
* clarify schema name
* validation test on schema name
* Update template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/training/notebooks/Train.py.tmpl
  Co-authored-by: Vladimir Kolovski <[email protected]>
* Update template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/utils.py.tmpl
  Co-authored-by: Vladimir Kolovski <[email protected]>
* update alias for be consistent with UI
* Update README.md
  Co-authored-by: Vladimir Kolovski <[email protected]>
* update read user group description
* add validation for Feature Store and Unity Catalog
* Fix tests
* Fix additional tests
* Fix docs
* Fix more issues
* remove import
* update environment to target
* update batch inference input table name for FS & UC
* remove unused import
* fix FS train notebook imports
* missing comma
* update display
* update display
* missing comma
* revert aws test config
* add installs to feature store notebook
* update FS inference readme

---------

Co-authored-by: Arpit Jasapara <[email protected]>
Co-authored-by: Vladimir Kolovski <[email protected]>
Co-authored-by: Arpit Jasapara <[email protected]>
1 parent 6f59874 · commit ea885c8

32 files changed · +478 −163 lines changed

.gitignore

−3
@@ -5,9 +5,6 @@
 # local bundle files
 **/.databricks/*

-mlops-stacks-using-bundle.iml
-mlops-stacks-using-bundle.ipr
-mlops-stacks-using-bundle.iws
 *.hcl
 .idea/
 .vscode/

README.md

+7 −5
@@ -19,11 +19,11 @@ Your organization can use the default stack as is or customize it as needed, e.g
 adapt individual components to fit your organization's best practices. See the
 [stack customization guide](stack-customization.md) for more details.

-Using Databricks MLOps stacks, data scientists can quickly get started iterating on ML code for new projects while ops engineers set up CI/CD and ML service state
-management, with an easy transition to production. You can also use MLOps stacks as a building block
+Using Databricks MLOps stack, data scientists can quickly get started iterating on ML code for new projects while ops engineers set up CI/CD and ML service state
+management, with an easy transition to production. You can also use MLOps stack as a building block
 in automation for creating new data science projects with production-grade CI/CD pre-configured.

-![MLOps Stacks diagram](doc-images/mlops-stack.png)
+![MLOps Stack diagram](doc-images/mlops-stack.png)

 See the [FAQ](#FAQ) for questions on common use cases.


@@ -68,9 +68,11 @@ ready to productionize a model. We recommend specifying any known parameters upfront.
 * ``input_release_branch``: Name of the release branch. The production jobs (model training, batch inference) defined in this
 repo pull ML code from this branch.
 * ``input_read_user_group``: User group name to give READ permissions to for project resources (ML jobs, integration test job runs, and machine learning resources). A group with this name must exist in both the staging and prod workspaces. Defaults to "users", which grants read permission to all users in the staging/prod workspaces. You can specify a custom group name e.g. to restrict read permissions to members of the team working on the current ML project.
+* ``input_include_models_in_unity_catalog``: If selected, models will be registered to [Unity Catalog](https://docs.databricks.com/en/mlflow/models-in-uc.html#models-in-unity-catalog). Models will be registered under a three-level namespace of `<catalog>.<schema_name>.<model_name>`, according to the target environment in which the model registration code is executed. Thus, if model registration code runs in the `prod` environment, the model will be registered to the `prod` catalog under the namespace `prod.<schema_name>.<model_name>`. This assumes that the respective catalogs exist in Unity Catalog (e.g. `dev`, `staging` and `prod` catalogs). Target environment names, and the catalogs to be used, are defined in the Databricks bundles files and can be updated as needed.
+* ``input_schema_name``: If using [Models in Unity Catalog](https://docs.databricks.com/en/mlflow/models-in-uc.html#models-in-unity-catalog), specify the name of the schema under which the models should be registered. Defaults to "schema_name", but we recommend changing this during project initialization (e.g. a schema may map to a specific ML use case, such as `fraud_detection`). We default to using the same `schema_name` across catalogs, so this schema must exist in each catalog used. For example, the training pipeline executed in the staging environment will register the model to `staging.<schema_name>.<model_name>`, whereas the same pipeline executed in the prod environment will register the model to `prod.<schema_name>.<model_name>`.
+* ``input_unity_catalog_read_user_group``: If using [Models in Unity Catalog](https://docs.databricks.com/en/mlflow/models-in-uc.html#models-in-unity-catalog), define the name of the user group to grant `EXECUTE` (read & use model) privileges for the registered model. Defaults to "account users".
 * ``input_include_feature_store``: If selected, will provide [Databricks Feature Store](https://docs.databricks.com/machine-learning/feature-store/index.html) stack components including: project structure and sample feature Python modules, feature engineering notebooks, ML resource configs to provision and manage Feature Store jobs, and automated integration tests covering feature engineering and training.
 * ``input_include_mlflow_recipes``: If selected, will provide [MLflow Recipes](https://mlflow.org/docs/latest/recipes.html) stack components, dividing the training pipeline into configurable steps and profiles.
-

 See the generated ``README.md`` for next steps!


@@ -111,7 +113,7 @@ for details on how to do this.

 ### Does the MLOps stack cover data (ETL) pipelines?

-Since MLOps Stacks is based on [databricks CLI bundles](https://docs.databricks.com/dev-tools/cli/bundle-commands.html),
+Since MLOps Stack is based on [databricks CLI bundles](https://docs.databricks.com/dev-tools/cli/bundle-commands.html),
 it's not limited only to ML workflows and assets - it works for assets across the Databricks Lakehouse. For instance, while the existing ML
 code samples contain feature engineering, training, model validation, deployment and batch inference workflows,
 you can use it for Delta Live Tables pipelines as well.
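
For illustration, the registration flow described by the new `input_include_models_in_unity_catalog` bullet above might look like the following sketch. This is not code from this commit: the `dev` catalog, the `my_schema` schema, and the model name are placeholder values, and Unity Catalog requires a model signature at registration time.

```python
import mlflow
from mlflow.models import infer_signature
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Route model registry calls to Unity Catalog instead of the
# workspace model registry.
mlflow.set_registry_uri("databricks-uc")

# Placeholder three-level name: in this stack the catalog tracks the
# bundle target (dev/staging/prod) and the schema comes from input_schema_name.
model_name = "dev.my_schema.my_mlops_model"

X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(n_estimators=10).fit(X, y)

with mlflow.start_run():
    # Unity Catalog rejects model versions logged without a signature.
    signature = infer_signature(X, model.predict(X))
    mlflow.sklearn.log_model(
        model,
        "model",
        signature=signature,
        registered_model_name=model_name,  # creates a version under dev.my_schema
    )
```

Running the same code against a `staging` or `prod` catalog yields the per-environment registration the README describes.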

databricks_template_schema.json

+31 −13
@@ -4,66 +4,84 @@
     "order": 1,
     "type": "string",
     "default": "my-mlops-project",
-    "description": "Project Name"
+    "description": "Welcome to MLOps Stack. For detailed information on project generation, see the README at https://github.com/databricks/mlops-stack/blob/main/README.md. \n\nProject Name"
   },
   "input_root_dir": {
     "order": 2,
     "type": "string",
     "default": "my-mlops-project",
-    "description": "Root directory name. Use a name different from the project name if you intend to use monorepo"
+    "description": "\nRoot directory name. Use a name different from the project name if you intend to use monorepo"
   },
   "input_cloud": {
     "order": 3,
     "type": "string",
-    "description": "Select cloud. \nChoose from azure, aws",
+    "description": "\nSelect cloud. \nChoose from azure, aws",
     "default": "azure"
   },
   "input_cicd_platform": {
     "order": 4,
     "type": "string",
-    "description": "Select CICD platform. \nChoose from github_actions, github_actions_for_github_enterprise_servers, azure_devops",
+    "description": "\nSelect CICD platform. \nChoose from github_actions, github_actions_for_github_enterprise_servers, azure_devops",
     "default": "github_actions"
   },
   "input_databricks_staging_workspace_host": {
     "order": 5,
     "type": "string",
     "default": "",
-    "description": "URL of staging Databricks workspace, used to run CI tests on PRs and preview config changes before they're deployed to production. Default: \nAzure - https://adb-xxxx.xx.azuredatabricks.net\nAWS - https://your-staging-workspace.cloud.databricks.com\n"
+    "description": "\nURL of staging Databricks workspace, used to run CI tests on PRs and preview config changes before they're deployed to production. Default: \nAzure - https://adb-xxxx.xx.azuredatabricks.net\nAWS - https://your-staging-workspace.cloud.databricks.com\n"
   },
   "input_databricks_prod_workspace_host": {
     "order": 6,
     "type": "string",
     "default": "",
-    "description": "URL of production Databricks workspace. Default: \nAzure - https://adb-xxxx.xx.azuredatabricks.net\nAWS - https://your-prod-workspace.cloud.databricks.com\n"
+    "description": "\nURL of production Databricks workspace. Default: \nAzure - https://adb-xxxx.xx.azuredatabricks.net\nAWS - https://your-prod-workspace.cloud.databricks.com\n"
   },
   "input_default_branch": {
     "order": 7,
     "type": "string",
     "default": "main",
-    "description": "Name of the default branch, where the prod and staging ML resources are deployed from and the latest ML code is staged. Default:"
+    "description": "\nName of the default branch, where the prod and staging ML resources are deployed from and the latest ML code is staged. Default"
   },
   "input_release_branch": {
     "order": 8,
     "type": "string",
     "default": "release",
-    "description": "Name of the release branch. The production jobs (model training, batch inference) defined in this stack pull ML code from this branch. Default:"
+    "description": "\nName of the release branch. The production jobs (model training, batch inference) defined in this stack pull ML code from this branch. Default"
   },
   "input_read_user_group": {
     "order": 9,
     "type": "string",
     "default": "users",
-    "description": "User group name to give READ permissions to for project resources (ML jobs, integration test job runs, and machine learning resources). A group with this name must exist in both the staging and prod workspaces. Default:"
+    "description": "\nUser group name to give READ permissions to for project resources (ML jobs, integration test job runs, and machine learning resources). A group with this name must exist in both the staging and prod workspaces. Default"
   },
-  "input_include_feature_store": {
+  "input_include_models_in_unity_catalog": {
     "order": 10,
     "type": "string",
-    "description": "Whether to include feature store. \nChoose from no, yes",
+    "description": "\nWhether to use the Model Registry with Unity Catalog. \nChoose from no, yes",
+    "default": "yes"
+  },
+  "input_schema_name": {
+    "order": 11,
+    "type": "string",
+    "description": "\nName of schema to use when registering a model in Unity Catalog. \nNote that this schema must already exist. Default",
+    "default": "schema_name"
+  },
+  "input_unity_catalog_read_user_group": {
+    "order": 12,
+    "type": "string",
+    "default": "account users",
+    "description": "\nUser group name to give EXECUTE privileges to models in Unity Catalog. A group with this name must exist in the Unity Catalog that the staging and prod workspaces can access. Default"
+  },
+  "input_include_feature_store": {
+    "order": 13,
+    "type": "string",
+    "description": "\nWhether to include Feature Store. \nChoose from no, yes",
     "default": "no"
   },
   "input_include_mlflow_recipes": {
-    "order": 11,
+    "order": 14,
     "type": "string",
-    "description": "Whether to include mlflow recipes. \nChoose from no, yes",
+    "description": "\nWhether to include MLflow Recipes. \nChoose from no, yes",
     "default": "no"
   }
 }

library/input_validation.tmpl

+13
@@ -38,8 +38,21 @@
 {{ fail `Azure DevOps is not supported as a cicd_platform option with cloud=aws. If cloud=aws the currently supported cicd_platform is GitHub Actions.` }}
 {{- end -}}

+- Validate schema_name for invalid characters
+{{- if eq .input_include_models_in_unity_catalog `yes` -}}
+{{- if ((regexp `[ ./\\]+`).MatchString .input_schema_name) -}}
+{{ fail `schema_name contained invalid characters. Valid schema names cannot contain any of the following characters: " ", ".", "\", "/"` }}
+{{- end -}}
+{{- end -}}
+
 - Validate feature store and recipes
 {{- if and (eq .input_include_feature_store `yes`) (eq .input_include_mlflow_recipes `yes`) -}}
 {{ fail `Feature Store cannot be used with MLflow recipes. Please only use one of the two or neither.` }}
 {{- end -}}
+
+- Validate models in Unity Catalog and recipes
+{{- if and (eq .input_include_models_in_unity_catalog `yes`) (eq .input_include_mlflow_recipes `yes`) -}}
+{{ fail `The Model Registry in Unity Catalog cannot be used with MLflow recipes. Please only use one of the two or neither.` }}
+{{- end -}}
+
 {{- end -}}
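
The new schema_name check rejects names containing spaces, dots, or slashes. As a standalone sketch of the same rule outside the Go template (a hypothetical helper, not part of this commit):

```python
import re

# Same character class as the template's regexp: schema names may not
# contain spaces, ".", "/", or "\".
INVALID_SCHEMA_CHARS = re.compile(r"[ ./\\]+")

def validate_schema_name(schema_name: str) -> None:
    if INVALID_SCHEMA_CHARS.search(schema_name):
        raise ValueError(
            "schema_name contained invalid characters. Valid schema names "
            'cannot contain any of the following characters: " ", ".", "\\", "/"'
        )

validate_schema_name("fraud_detection")  # passes silently
# validate_schema_name("my.schema")      # would raise ValueError
```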

library/template_variables.tmpl

+26
@@ -113,6 +113,32 @@
 {{- end -}}
 {{- end }}

+{{ define `include_models_in_unity_catalog` -}}
+{{- if (eq .input_include_models_in_unity_catalog `no`) -}}
+no
+{{- else if (eq .input_include_models_in_unity_catalog `yes`) -}}
+yes
+{{- else -}}
+{{ fail `Invalid selection of include_models_in_unity_catalog. Please choose from [no, yes]` }}
+{{- end -}}
+{{- end }}
+
+{{ define `schema_name` -}}
+{{- if (eq .input_include_models_in_unity_catalog `yes`) -}}
+{{ .input_schema_name }}
+{{- else -}}
+{{ "" }}
+{{- end -}}
+{{- end }}
+
+{{ define `unity_catalog_read_user_group` -}}
+{{- if (eq .input_include_models_in_unity_catalog `yes`) -}}
+{{ .input_unity_catalog_read_user_group }}
+{{- else -}}
+{{ "account users" }}
+{{- end -}}
+{{- end }}
+
 {{ define `cloud_specific_node_type_id` -}}
 {{- if (eq .input_cloud `aws`) -}}
 i3.xlarge

template/{{.input_root_dir}}/_params_testing_only.txt.tmpl

+6
@@ -9,6 +9,9 @@ input_release_branch={{.input_release_branch}}
 input_read_user_group={{.input_read_user_group}}
 input_include_feature_store={{.input_include_feature_store}}
 input_include_mlflow_recipes={{.input_include_mlflow_recipes}}
+input_include_models_in_unity_catalog={{.input_include_models_in_unity_catalog}}
+input_schema_name={{.input_schema_name}}
+input_unity_catalog_read_user_group={{.input_unity_catalog_read_user_group}}

 root_dir={{ template `root_dir` . }}
 project_name={{ template `project_name` . }}

@@ -23,6 +26,9 @@ release_branch={{ template `release_branch` . }}
 read_user_group={{ template `read_user_group` . }}
 include_feature_store={{ template `include_feature_store` . }}
 include_mlflow_recipes={{ template `include_mlflow_recipes` . }}
+include_models_in_unity_catalog={{ template `include_models_in_unity_catalog` . }}
+schema_name={{ template `schema_name` . }}
+unity_catalog_read_user_group={{ template `unity_catalog_read_user_group` . }}
 cloud_specific_node_type_id={{ template `cloud_specific_node_type_id` . }}
 framework={{ template `framework` . }}
 model_name={{ template `model_name` . }}

template/{{.input_root_dir}}/docs/project-overview.md.tmpl

+2 −2
@@ -6,10 +6,10 @@
 This project defines an ML pipeline for automated retraining and batch inference of an ML model
 on tabular data.

-See the full pipeline structure below. The [stacks README](https://github.com/databricks/mlops-stack/blob/main/Pipeline.md)
+See the full pipeline structure below. The [stack README](https://github.com/databricks/mlops-stack/blob/main/Pipeline.md)
 contains additional details on how ML pipelines are tested and deployed across each of the dev, staging, prod environments below.

-![MLOps Stacks diagram](images/mlops-stack-summary.png)
+![MLOps Stack diagram](images/mlops-stack-summary.png)


 ## Code structure

template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/databricks.yml.tmpl

+2 −3
@@ -2,15 +2,14 @@
 bundle:
   name: {{template `project_name` .}}

-
 variables:
   experiment_name:
     description: Experiment name for the model training.
     default: /Users/${workspace.current_user.userName}/${bundle.target}-{{template `experiment_base_name` .}}
   model_name:
     description: Model name for the model training.
-    default: ${bundle.target}-{{template `model_name` .}}
-
+    {{ if (eq .input_include_models_in_unity_catalog `no`) }}default: ${bundle.target}-{{template `model_name` .}}
+    {{- else -}}default: {{template `model_name` .}}{{end}}

 include:
   # Resources folder contains ML artifact resources for the ml project that defines model and experiment

template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/deployment/batch_inference/README.md.tmpl

+2 −1
@@ -47,7 +47,8 @@ df = spark.table(
 ).drop("fare_amount")

 df.write.mode("overwrite").saveAsTable(
-  name="hive_metastore.default.taxi_scoring_sample"
+  {{ if (eq .input_include_models_in_unity_catalog `no`) }}name="hive_metastore.default.taxi_scoring_sample"
+  {{- else -}}name="<catalog>.{{template `schema_name` .}}.feature_store_inference_input"{{ end }}
 )
 ```
 {{ end }}

template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/deployment/batch_inference/notebooks/BatchInference.py.tmpl

+27 −6
@@ -26,12 +26,21 @@ dbutils.widgets.dropdown("env", "dev", ["dev", "staging", "prod"], "Environment
 dbutils.widgets.text("input_table_name", "", label="Input Table Name")
 # Delta table to store the output predictions.
 dbutils.widgets.text("output_table_name", "", label="Output Table Name")
+{{- if (eq .input_include_models_in_unity_catalog "no") }}
 # Batch inference model name
-dbutils.widgets.text("model_name", "", label="Model Name")
+dbutils.widgets.text(
+    "model_name", "dev-{{template `model_name` .}}", label="Model Name"
+)
+{{else}}
+# Unity Catalog registered model name to use for the trained model.
+dbutils.widgets.text(
+    "model_name", "dev.{{template `schema_name` .}}.{{template `model_name` .}}", label="Model Name"
+){{end}}

 # COMMAND ----------

 import os
+
 notebook_path = '/Workspace/' + os.path.dirname(dbutils.notebook.entry_point.getDbutils().notebook().getContext().notebookPath().get())
 %cd $notebook_path

@@ -55,7 +64,9 @@ sys.path.append("../..")
 # COMMAND ----------

 # DBTITLE 1,Define input and output variables
-from utils import get_deployed_model_stage_for_env
+{{- if (eq .input_include_models_in_unity_catalog "no") }}
+from utils import get_deployed_model_stage_for_env{{else}}
+from utils import get_deployed_model_alias_for_env{{end}}

 env = dbutils.widgets.get("env")
 input_table_name = dbutils.widgets.get("input_table_name")
@@ -64,18 +75,28 @@ model_name = dbutils.widgets.get("model_name")
 assert input_table_name != "", "input_table_name notebook parameter must be specified"
 assert output_table_name != "", "output_table_name notebook parameter must be specified"
 assert model_name != "", "model_name notebook parameter must be specified"
+{{- if (eq .input_include_models_in_unity_catalog "no") }}
 stage = get_deployed_model_stage_for_env(env)
-model_uri = f"models:/{model_name}/{stage}"
+model_uri = f"models:/{model_name}/{stage}"{{else}}
+alias = get_deployed_model_alias_for_env(env)
+model_uri = f"models:/{model_name}@{alias}"{{end}}

-# Get model version from stage
-from mlflow import MlflowClient
+# COMMAND ----------

+from mlflow import MlflowClient
+{{ if (eq .input_include_models_in_unity_catalog "no") }}
+# Get model version from stage
 model_version_infos = MlflowClient().search_model_versions("name = '%s'" % model_name)
 model_version = max(
     int(version.version)
     for version in model_version_infos
     if version.current_stage == stage
-)
+){{else}}
+# Get model version from alias
+client = MlflowClient(registry_uri="databricks-uc")
+model_version = client.get_model_version_by_alias(model_name, alias).version{{end}}
+
+# COMMAND ----------

 # Get datetime
 from datetime import datetime
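
The hunk above switches batch inference from stage-based URIs (`models:/<name>/<stage>`) to alias-based URIs (`models:/<name>@<alias>`) when Unity Catalog is enabled. A minimal standalone sketch of the alias path follows; the model name and the `champion` alias are placeholder values, and `get_deployed_model_alias_for_env` lives in the generated `utils.py`, which this page does not show.

```python
import mlflow
from mlflow import MlflowClient

# Resolve models:/ URIs against Unity Catalog.
mlflow.set_registry_uri("databricks-uc")

# Placeholders: the notebook takes these from its widgets and utils helper.
model_name = "prod.my_schema.my_mlops_model"
alias = "champion"  # assumed alias for the prod deployment

# Look up the concrete version currently behind the alias...
client = MlflowClient(registry_uri="databricks-uc")
model_version = client.get_model_version_by_alias(model_name, alias).version

# ...and load through the alias so scoring always follows whichever
# version the alias is reassigned to.
model = mlflow.pyfunc.load_model(f"models:/{model_name}@{alias}")
```

Because the alias is reassigned at deployment time, promoting a new model version becomes a registry-side operation that needs no notebook changes.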

template/{{.input_root_dir}}/{{template `project_name_alphanumeric_underscore` .}}/deployment/batch_inference/predict.py.tmpl

+2 −1
@@ -9,6 +9,7 @@ def predict_batch(
 Apply the model at the specified URI for batch inference on the table with name input_table_name,
 writing results to the table with name output_table_name
 """
+{{ if (eq .input_include_models_in_unity_catalog "yes") }}mlflow.set_registry_uri("databricks-uc"){{ end }}
 table = spark_session.table(input_table_name)
 {{ if (eq .input_include_feature_store `yes`) }}
 from databricks.feature_store import FeatureStoreClient

@@ -26,7 +27,7 @@ def predict_batch(
 )
 {{ else }}
 predict = mlflow.pyfunc.spark_udf(
-    spark_session, model_uri, result_type="string", env_manager="conda"
+    spark_session, model_uri, result_type="string", env_manager="virtualenv"
 )
 output_df = (
     table.withColumn("prediction", predict(struct(*table.columns)))
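
The `env_manager` change above tells MLflow to recreate the model's logged Python environment with virtualenv rather than conda when building the scoring UDF. A standalone sketch of the same pattern (table and model names are placeholders, not values from this repo):

```python
import mlflow
from pyspark.sql import SparkSession
from pyspark.sql.functions import struct

spark = SparkSession.builder.getOrCreate()

# Placeholders: in the generated project these arrive as function arguments.
model_uri = "models:/prod.my_schema.my_mlops_model@champion"
input_table_name = "prod.my_schema.feature_store_inference_input"

mlflow.set_registry_uri("databricks-uc")  # resolve models:/ URIs against UC

table = spark.table(input_table_name)

# Rebuild the model's environment with virtualenv (no conda needed on
# the cluster) and wrap it as a Spark UDF for distributed scoring.
predict = mlflow.pyfunc.spark_udf(
    spark, model_uri, result_type="string", env_manager="virtualenv"
)

output_df = table.withColumn("prediction", predict(struct(*table.columns)))
```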
