[FEA] Add IO task manager to assist in efficient use of KvikIO #17639

Open
@GregoryKimball

Description

Is your feature request related to a problem? Please describe.
Some Parquet writers produce a page size distribution in which ~30% of the pages are between 64 B and 32 KiB, in contrast to the typical distribution, where the median is 1-4 MiB. Although we can decompress and decode the smaller pages efficiently, we observe low IO throughput. We speculate that coalescing the <100 KiB read requests into larger requests would improve IO throughput.

Describe the solution you'd like
Add an IO manager layer to the datasources that use KvikIO. If there are several small byte ranges in sequence, we could coalesce them into a single KvikIO request. I suspect we would need to split them up again before downstream processing begins. Perhaps we should restrict coalescing to byte ranges that are contiguous, so that we can split the buffers again without copying the data.
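To make the idea concrete, here is a rough sketch of the coalescing step, restricted to contiguous byte ranges. All names (`byte_range`, `coalesced_read`, `coalesce_contiguous`, `small_limit`, `max_request`) are hypothetical and are not part of libcudf or KvikIO. The sketch assumes the input ranges are already sorted by offset, and it only plans the requests; it does not issue any IO.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only (not libcudf or KvikIO API).
struct byte_range {
  std::size_t offset;  // byte offset in the file
  std::size_t size;    // length in bytes
};

struct coalesced_read {
  byte_range span;                   // the single range to request
  std::vector<std::size_t> members;  // indices of the input ranges covered by span
};

// Merge runs of small, exactly contiguous ranges into one request each.
// Ranges of at least small_limit bytes are always emitted on their own, and a
// merged request never grows beyond max_request bytes.
std::vector<coalesced_read> coalesce_contiguous(std::vector<byte_range> const& ranges,
                                                std::size_t small_limit,
                                                std::size_t max_request)
{
  std::vector<coalesced_read> out;
  bool open = false;  // true while the last request may still accept small ranges
  for (std::size_t i = 0; i < ranges.size(); ++i) {
    auto const& r = ranges[i];
    assert(i == 0 || ranges[i - 1].offset <= r.offset);  // assumption: sorted input
    bool const is_small = r.size < small_limit;
    if (open && is_small) {
      auto& g               = out.back();
      bool const contiguous = g.span.offset + g.span.size == r.offset;
      if (contiguous && g.span.size + r.size <= max_request) {
        g.span.size += r.size;
        g.members.push_back(i);
        continue;
      }
    }
    out.push_back(coalesced_read{r, {i}});  // start a new request
    open = is_small;
  }
  return out;
}

// Offset of input range r inside the buffer filled by request g. Because only
// contiguous ranges are merged, this is enough to locate r without copying.
inline std::size_t offset_in_request(coalesced_read const& g, byte_range const& r)
{
  return r.offset - g.span.offset;
}
```

With a 100 KiB `small_limit`, three back-to-back pages of 100 B, 200 B and 50 B would become a single 350 B request, while a 1 MiB page that follows a small page would still be read on its own.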

Additional context and alternatives
Previously we disabled KvikIO for small copies, but IO throughput was especially poor when using UVM because of the large number of prefetches (also see #17260). We might also be able to do the task coalescing within KvikIO itself.

Ideally, the approach for managing the coalesce and split will not trigger allocations, because we would prefer to avoid a "prefetch-on-alloc" for every read of a few hundred bytes or a few KiB.
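One way the split could avoid per-range data allocations is to hand out non-owning views into the coalesced buffer. The sketch below is only an illustration under that assumption: `buffer_view` and `split_contiguous` are hypothetical names, the example uses a host pointer for simplicity, and in a real reader the base pointer would presumably refer to the destination buffer of the coalesced read.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical non-owning view into a larger buffer (not libcudf or KvikIO API).
struct buffer_view {
  std::uint8_t const* data;
  std::size_t size;
};

// Split one coalesced buffer of `total` bytes into back-to-back views, one per
// entry in `sizes`. No data is copied and no data buffer is allocated; only the
// small list of views itself is created.
std::vector<buffer_view> split_contiguous(std::uint8_t const* base,
                                          std::size_t total,
                                          std::vector<std::size_t> const& sizes)
{
  std::vector<buffer_view> views;
  views.reserve(sizes.size());
  std::size_t cursor = 0;
  for (auto s : sizes) {
    if (s > total - cursor) { break; }  // stop rather than run past the buffer end
    views.push_back(buffer_view{base + cursor, s});
    cursor += s;
  }
  return views;
}
```

Each view points directly into the coalesced buffer, so the coalesced buffer must outlive every view derived from it.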

Labels: cuIO, feature request, libcudf