[Python][Docs] Revise Python Documentation #46601

Open
@AlenkaF

Describe the enhancement requested

This is the umbrella ticket for ongoing efforts to improve and revise the PyArrow Python User Guide and API reference documentation.

Many sections of the guide need refreshing, particularly those covering what users frequently search for and the kinds of issues commonly reported. We’ve already started prioritizing topics using Matomo web analytics and GitHub issues. Insights from Kapa AI will be incorporated later. Comments on the priorities are welcome!

We’ll open sub-issues for specific tasks as we go. Everyone is welcome to contribute. Whether you're experienced or just getting started, your help in making the PyArrow docs better is appreciated! ❤️

Suggested Focus Areas (in order of priority):

  1. Parquet module
  2. Dataset module
  3. Table, RecordBatch, Schema and data types
  4. Getting Started
  5. Pandas integration
  6. IPC

Note: The User Guide is the main focus of this revision effort, but improving the API reference documentation is also important. Analytics show it receives a significant amount of traffic, so we'll consider enhancements there as well.

Existing Documentation Issues

Connected

Component(s)

Documentation, Python
