Description
Kedro currently does not support .docx files as a native dataset type. Whenever I need to work with .docx documents in a pipeline, I have to handle them manually, outside of Kedro’s datasets. This feature request proposes adding a DocxDataSet that reads from and writes to Word documents using python-docx.
Context
This change would be useful because many workflows in enterprise and research environments rely on .docx files for documentation, reports, and structured data exchange. Integrating a DocxDataSet into Kedro would streamline pipelines that process .docx files, reduce the amount of custom I/O code, and improve reproducibility. It would also benefit other users who need to read or write Word documents as part of their data pipelines without breaking the Kedro design pattern.