[Parquet] Writing in 57.0.0 seems 10% slower than 56.0.0 #8783

@alamb

Describe the bug
While testing the tpchgen-rs upgrade to arrow 57, @clflushopt, @kevinjqliu and I found that arrow-rs 57 seems to write data around 10% slower than arrow 56: clflushopt/tpchgen-rs#200 (review)

Specifically, running this command is around 10% slower:

tpchgen-cli --scale-factor=100 --tables=lineitem --parts=10 --format=parquet

56.0.0 takes 0m27.122s
57.0.0 takes 0m28.776s
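
For reference, the relative slowdown implied by these two timings can be computed directly. The sketch below is only illustrative: the helper names `parse_time` and `slowdown_percent` are made up for this example, and it assumes each figure is a single wall-clock reading in the `XmY.YYYs` format printed by `time`. On these two numbers alone it comes out nearer 6% than 10%, so repeated runs would help pin down the real size of the regression.

```python
def parse_time(s: str) -> float:
    """Convert a `time`-style string such as '0m27.122s' to seconds."""
    minutes, seconds = s.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def slowdown_percent(baseline: str, candidate: str) -> float:
    """Percentage by which `candidate` is slower than `baseline`."""
    base = parse_time(baseline)
    cand = parse_time(candidate)
    return (cand - base) / base * 100.0


# Timings quoted above (assumed to be one run each).
pct = slowdown_percent("0m27.122s", "0m28.776s")
print(f"57.0.0 is {pct:.1f}% slower than 56.0.0")
```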

To Reproduce

rm -rf lineitem && cargo build --release && time ./target/release/tpchgen-cli --scale-factor=100 --tables=lineitem --parts=10 --format=parquet

Expected behavior

57.0.0 should perform the same as or better than 56.0.0.

Additional context
I am doing some git bisecting to see if I can find more data.
