pyarrow now supports groupby operations (in a not-yet-released version): https://github.com/apache/arrow/commit/999d97add8e540021b7f42ffec91a6b26ddf2691 So a pyarrow benchmark could be implemented now. Renaming "Arrow" to "dplyr-arrow" would also make sense, since I assumed "Arrow" meant pyarrow: https://github.com/h2oai/db-benchmark/issues/229