Support streaming bytes / in-memory buffers as input for `read*` without full materialization

Currently, DuckDB can read directly from file paths and URLs, but it does not accept in-memory buffers or async readers as input. This makes it hard to build efficient streaming pipelines or to integrate with custom or remote object stores without writing temporary files or loading the full payload into memory.
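
For illustration, here is a minimal sketch of the temporary-file workaround mentioned above, written in Python with only the standard library. The helper name `spooled_to_path` and its parameters are hypothetical, and the sketch deliberately does not call DuckDB; it only shows the extra copy to disk that a path-based reader forces on a stream.

```python
import io
import os
import shutil
import tempfile
from contextlib import contextmanager
from typing import BinaryIO, Iterator


@contextmanager
def spooled_to_path(stream: BinaryIO, suffix: str = ".csv",
                    chunk_size: int = 1 << 20) -> Iterator[str]:
    """Copy a binary stream to a temporary file and yield its path.

    Only one chunk is held in memory at a time, but the whole payload
    still lands on disk, which is the cost this request wants to avoid.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:
            # Chunked copy: memory use is bounded by chunk_size.
            shutil.copyfileobj(stream, tmp, chunk_size)
        yield tmp.name
    finally:
        # Remove the temporary file once the caller is done with it.
        os.unlink(tmp.name)


# Demo: an in-memory buffer stands in for a remote object.
payload = io.BytesIO(b"id,name\n1,a\n2,b\n")
with spooled_to_path(payload) as path:
    # A path-based reader would be pointed at `path` here, for example
    # with a query of the form: SELECT * FROM read_csv('<path>')
    with open(path, "rb") as f:
        spooled = f.read()
```

The copy is bounded in memory but not on disk, so for large inputs the pipeline still pays for a full write and a second read of the data.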

**Feature Request**

Enable DuckDB to accept `BytesIO`, `Vec<u8>`, or `AsyncRead`-like streams as inputs for `read_parquet`, `read_csv`, etc., consuming them incrementally instead of requiring the entire buffer to be loaded into memory first.

This is especially useful in contexts like:

- Building data ingestion pipelines from custom sources (e.g., [opendal](https://github.com/datafuselabs/opendal))
- Processing large remote files via chunked downloading / range reads
- Memory-constrained environments or real-time ETL systems
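
To make the range-read case concrete, here is a rough sketch of the kind of adapter such a pipeline might hand to a reader: a seekable, read-only file object that fetches only the byte ranges actually requested. `RangeReader` and `fetch_range` are hypothetical names invented for this example, and whether DuckDB could consume an object like this is exactly what this request asks for.

```python
import io
from typing import Callable, List, Tuple


class RangeReader(io.RawIOBase):
    """Seekable, read-only file object backed by on-demand range fetches."""

    def __init__(self, fetch_range: Callable[[int, int], bytes], size: int):
        super().__init__()
        self._fetch = fetch_range  # hypothetical: (offset, length) -> bytes
        self._size = size          # total object size, assumed known upfront
        self._pos = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        base = {io.SEEK_SET: 0,
                io.SEEK_CUR: self._pos,
                io.SEEK_END: self._size}[whence]
        self._pos = max(0, base + offset)
        return self._pos

    def readinto(self, buf) -> int:
        # Fetch at most what the caller asked for and what remains.
        want = min(len(buf), max(0, self._size - self._pos))
        if want == 0:
            return 0  # end of object
        data = self._fetch(self._pos, want)
        buf[:len(data)] = data
        self._pos += len(data)
        return len(data)


# Demo: a local bytes object stands in for a remote store.
blob = bytes(range(256)) * 4
calls: List[Tuple[int, int]] = []


def fetch_range(offset: int, length: int) -> bytes:
    calls.append((offset, length))
    return blob[offset:offset + length]


reader = RangeReader(fetch_range, len(blob))
reader.seek(-8, io.SEEK_END)  # jump straight to the last 8 bytes
tail = reader.read(8)
```

In this demo only the final 8 bytes are fetched, which is the access pattern a format with trailing metadata (such as the Parquet footer) needs. Wrapping the object in `io.BufferedReader` would add read-ahead on top of the same interface.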

Allowing streaming ingestion directly from buffers would unlock a range of integration and performance improvements for DuckDB in both cloud-native and high-throughput use cases.

Would love to hear thoughts on the feasibility and roadmap for this!
Thanks for building such a powerful engine!

Support streaming Bytes/ in-memory buffers as input for `read*` without full materialization #485