
[Fix] Preserve externally set spark_env_vars during databricks_cluster updates when ignore_changes is used#5438

Open
jlieow wants to merge 10 commits into databricks:main from jlieow:fix/spark-env-vars-ignore-changes

Conversation

@jlieow
Contributor

@jlieow jlieow commented Feb 27, 2026

Related to #1238

Changes

When spark_env_vars are set externally (e.g. by cluster policies or manual edits) but not declared in the Terraform config, updating any other cluster field (like cluster_name) causes the Edit API to wipe those env vars. This happens because the Clusters Edit API (POST /api/2.1/clusters/edit) does a full replacement — omitted fields are cleared. The provider was sending nil for spark_env_vars since StructToData skips writing API-returned values to state when the field is not in the user's config.

This fix carries over existing spark_env_vars from the current cluster state into the Edit request when the user has not configured any. This also fixes lifecycle { ignore_changes = [spark_env_vars] } which previously had nothing to preserve.

No schema changes. No changes to Read behavior.


Detailed Changes

Preserve spark_env_vars in update path (clusters/resource_cluster.go, lines 622-636)

Before:

		} else {
			SetForceSendFieldsForCluster(&cluster, d)

After:

		} else {
			// Preserve externally-set spark_env_vars (e.g. from cluster policies)
			// when not configured by the user. The Edit API does a full replacement,
			// so omitting spark_env_vars would clear them.
			//
			// cluster.SparkEnvVars = from Terraform state (HCL config)
			// clusterInfo.SparkEnvVars = from GET API (actual cluster values)
			//
			// HCL has env vars | Cluster has env vars | Action
			// false            | false                | nothing to preserve
			// false            | true                 | carry over values set outside of Terraform
			// true             | false                | values in Terraform take precedence
			// true             | true                 | values in Terraform take precedence
			if len(cluster.SparkEnvVars) == 0 && len(clusterInfo.SparkEnvVars) > 0 {
				cluster.SparkEnvVars = clusterInfo.SparkEnvVars
			}
			SetForceSendFieldsForCluster(&cluster, d)

Why: The EditCluster struct is built from Terraform state via DataToStructPointer (line 546). When spark_env_vars is not in the user's config, state has no value for it, so cluster.SparkEnvVars is nil. The Edit API interprets this as "clear spark_env_vars". The current cluster details are already fetched at line 578 (clusterInfo), so we carry over the existing env vars when the user hasn't configured any.


Tests

  • TestResourceClusterUpdate_PreservesExternalSparkEnvVars: Simulates an update where the cluster GET returns spark_env_vars set externally (PYSPARK_PYTHON, ENV_FROM_POLICY) but the user's Terraform state has no spark_env_vars. Asserts via ExpectedRequest that the Edit API call includes the externally-set env vars rather than omitting them.

  • make test run locally

  • relevant change in docs/ folder

  • covered with integration tests in internal/acceptance

  • using Go SDK

  • using TF Plugin Framework

  • has entry in NEXT_CHANGELOG.md file

When spark_env_vars are set externally (e.g. by cluster policies) but not
in the user's Terraform config, the Edit API would clear them because
the full replacement semantics treat omission as deletion. This carries
over existing spark_env_vars from the current cluster state when the
user config has none.
Add truth table documenting all four scenarios for spark_env_vars
preservation logic. Update changelog to mention lifecycle ignore_changes
fix.
@jlieow jlieow requested review from a team as code owners February 27, 2026 19:02
@jlieow jlieow requested review from tanmay-db and removed request for a team February 27, 2026 19:02
@jlieow jlieow changed the title [Fix] Preserve externally-set spark_env_vars during databricks_cluster updates [Fix] Preserve externally set spark_env_vars during databricks_cluster updates when ignore_changes is used Feb 27, 2026
@jlieow jlieow temporarily deployed to test-trigger-is March 2, 2026 11:41 — with GitHub Actions
Contributor

@tanmay-db tanmay-db left a comment


Hi @jlieow, thanks for the PR, left some comments

Comment on lines +634 to +636
if len(cluster.SparkEnvVars) == 0 && len(clusterInfo.SparkEnvVars) > 0 {
cluster.SparkEnvVars = clusterInfo.SparkEnvVars
}
Contributor


@jlieow if Edit does full replacement, isn't the expected behaviour for terraform apply that whatever is in the config is applied? Should these variables be part of config then?

Contributor Author


This is related to #1238. In that issue, we can see that users are trying to manage parts of the resource in Terraform and others in the UI.

The user is attempting to use ignore_changes to let Terraform and an external process manage different parts of the same object, which is what ignore_changes is supposed to do as per the lifecycle reference doc.

However, due to the full replacement behaviour of the Edit API, this breaks the Terraform lifecycle behaviour. ignore_changes only suppresses drift detection in the plan, but the provider still sends an empty map to the API, which clears the externally-set values. This impacts users who manage this resource via multiple processes.

This fix assumes that native Terraform features like ignore_changes should take precedence since these are core features that users expect to work across all resources and not be tied to API limitations.
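For reference, the usage pattern from #1238 that this fix unblocks looks roughly like the following. The resource name and attribute values are illustrative:

```hcl
resource "databricks_cluster" "shared" {
  cluster_name            = "shared-autoscaling"
  spark_version           = "13.3.x-scala2.12"
  node_type_id            = "i3.xlarge"
  autotermination_minutes = 20

  # spark_env_vars is managed by a cluster policy / the UI, not Terraform.
  # Before this fix, any unrelated update (e.g. renaming the cluster) would
  # still wipe those env vars despite ignore_changes.
  lifecycle {
    ignore_changes = [spark_env_vars]
  }
}
```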

### Documentation
* Fixed `databricks_cluster` to preserve externally-set `spark_env_vars` (e.g. from cluster policies) during updates when not configured in Terraform. This also fixes `lifecycle { ignore_changes = [spark_env_vars] }` which previously failed to prevent deletion of externally-set values ([#1238](https://github.com/databricks/terraform-provider-databricks/issues/1238)).

* Added documentation note about whitespace handling in `MAP` column types for `databricks_sql_table`.
Contributor


This shouldn't be removed

Contributor Author

@jlieow jlieow Mar 2, 2026


I have amended the commit and restored the note.

@jlieow jlieow force-pushed the fix/spark-env-vars-ignore-changes branch from 51868da to ec304d7 Compare March 2, 2026 17:52

github-actions bot commented Mar 2, 2026

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/terraform

Inputs:

  • PR number: 5438
  • Commit SHA: ec304d76989c6a2ef1215b246e46203cc5ed618b

Checks will be approved automatically on success.

