doc: Refresh OpenLineage provider docs #60462
Open
Documentation improvements for OpenLineage provider
This PR improves the OpenLineage provider documentation with better organization and expanded content:
spark.rst: Added a separate page for the Spark integration, explaining that it is separate from the Airflow provider and requires its own installation and configuration.
macros.rst: Expanded to be the central documentation for job hierarchy and macros, consolidating information about cross-job dependencies (TriggerDagRunOperator, API triggers, ExternalTaskSensor, Airflow Assets) and how OpenLineage handles job relationships in each scenario.
troubleshooting.rst: Added a troubleshooting page covering best practices and common errors, intended for users debugging OpenLineage problems.
supported_classes.rst: Added explanation of what "supported" means, clarifying that all operators emit basic lineage while "supported" operators provide additional operator-specific metadata.
guides/structure.rst: Added a paragraph explaining the difference between the openlineage-python client package and the apache-airflow-providers-openlineage provider package.
guides/developer.rst: Added a section about helper functions, e.g. emit_openlineage_events_for_databricks_queries.
provider.yml: Reordered options so that they are alphabetical.
guides/user.rst: Moved the basic content to structure.rst and the developer content to developer.rst; nothing was left, so this file was removed.
Was generative AI tooling used to co-author this PR?
{pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.