Is your feature request related to a problem? Please describe.
Some parquet writers produce a page size distribution where ~30% of the pages are 64 B to 32 KiB, in contrast to the typical distribution where the median is 1-4 MiB. Although we can decompress and decode the smaller pages efficiently, we observe low IO throughput. We speculate that coalescing the <100 KiB read requests into larger requests would mitigate this performance degradation.
Describe the solution you'd like
Add an IO manager layer to the datasources that use KvikIO. If there are several small byte ranges in sequence, we could coalesce them into a single KvikIO request. I suspect we would need to break them up again before beginning downstream processing. Perhaps we should restrict the coalescing only to byte ranges that are contiguous, so that we can split the buffers again without copying the data.
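A minimal sketch of what the coalescing step could look like, assuming requests arrive sorted by file offset; the `byte_range` struct and function name are hypothetical, not existing KvikIO or libcudf APIs:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical request descriptor: a slice of the source file.
struct byte_range {
  std::size_t offset;  // offset into the source file
  std::size_t size;    // number of bytes requested
};

// Merge ranges that are exactly contiguous (next.offset == prev.offset +
// prev.size), assuming the input is sorted by offset. Each merged range can
// then be issued as a single KvikIO read.
std::vector<byte_range> coalesce_contiguous(std::vector<byte_range> const& ranges)
{
  std::vector<byte_range> merged;
  for (auto const& r : ranges) {
    if (!merged.empty() && merged.back().offset + merged.back().size == r.offset) {
      merged.back().size += r.size;  // extend the previous range in place
    } else {
      merged.push_back(r);
    }
  }
  return merged;
}
```

A variant could also merge ranges separated by small gaps (reading and discarding the gap bytes) when the gap is below some threshold, trading a little wasted IO for fewer requests.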
Additional context and alternatives
Previously we disabled KvikIO for small copies, but IO throughput was especially poor when using UVM due to a large number of prefetches (see also #17260). We might also be able to do task coalescing within KvikIO.
The approach for managing the coalescing and splitting should ideally not trigger an allocation, because we would prefer to avoid a "prefetch-on-alloc" for every few hundred bytes or few KiB.
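To keep the split allocation-free, the per-page buffers could be handed out as non-owning views into the single coalesced buffer. A sketch, assuming the same hypothetical `byte_range` as in the sketch above and a stand-in `device_span` view type (illustrative only, not the real libcudf type):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct byte_range {
  std::size_t offset;  // offset into the source file
  std::size_t size;    // number of bytes requested
};

// Non-owning view into device memory (stand-in for the real view type).
template <typename T>
struct device_span {
  T* data;
  std::size_t size;
};

// Given the coalesced buffer and the file offset it was read from, return one
// view per original range. No allocation or copy of the payload occurs; each
// view simply points into the coalesced buffer.
std::vector<device_span<std::uint8_t>> split_views(std::uint8_t* coalesced_data,
                                                   std::size_t coalesced_offset,
                                                   std::vector<byte_range> const& original_ranges)
{
  std::vector<device_span<std::uint8_t>> views;
  views.reserve(original_ranges.size());
  for (auto const& r : original_ranges) {
    views.push_back({coalesced_data + (r.offset - coalesced_offset), r.size});
  }
  return views;
}
```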