gh-140135: use PyBytesWriter in io.RawIOBase.readall; 3.98x faster #140139
Conversation
The PR title was revised during review: an earlier version reported a 2.34x speedup ("PyBytesWriter in io.RawIOBase.readall, 2.34x faster") and was later updated to the final 3.98x figure.
Please add a NEWS entry; IMO this speedup is significant enough to be documented!
Impressive speedup!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking good
The CI failed. I'm not exactly sure why. The same jobs succeeded before:
No meaningful changes since the last run (b8a7f89 is just a style fix, and f0ae824 just adds a NEWS entry). From what I see, some flakiness was reported; I'm rerunning by merging the base branch.
LGTM. Nice optimization!
I think this looks good; the only other thing is that I would verify this doesn't regress performance for sizes < 1 MB (e.g. 1 KB and 4 KB). I expect the performance differences to be much smaller there, if present at all, but that's OK; we just want to make sure they aren't regressing.
Great point. The benchmark (the same script, just with `SIZES_KB` tweaked):

import io

import pyperf

CHUNK_SIZE = 1024
SIZES_KB = [1, 4, 8]


class ChunkedRaw(io.RawIOBase):
    """Raw stream that serves the given data in fixed-size chunks."""

    def __init__(self, data, chunk_size):
        self._buf = memoryview(data)
        self._pos = 0
        self._chunk_size = chunk_size

    def readable(self):
        return True

    def read(self, n: int = -1):
        if self._pos >= len(self._buf):
            return b""
        # Serve at most one chunk per call, like a slow raw device.
        to_read = (
            self._chunk_size
            if (n is None or n < 0)
            else min(n, self._chunk_size)
        )
        end = min(self._pos + to_read, len(self._buf))
        out = bytes(self._buf[self._pos : end])
        self._pos = end
        return out


def generate_bytes(total):
    block = b"abcdefghijklmnopqrstuvwxyz0123456789" * 128
    return (block * (total // len(block) + 1))[:total]


def _bench_readall(data, chunk_size):
    r = ChunkedRaw(data, chunk_size)
    out = r.readall()
    if len(out) != len(data):
        raise RuntimeError("what is going on...???")


def main():
    runner = pyperf.Runner()
    for size_kib in SIZES_KB:
        total_bytes = size_kib * 1024
        data = generate_bytes(total_bytes)
        name = f"rawiobase_readall_{size_kib}KB_chunk{CHUNK_SIZE}"
        runner.bench_func(name, _bench_readall, data, CHUNK_SIZE)


if __name__ == "__main__":
    main()

The result:
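For comparing the two builds, one option is pyperf's JSON output: run the script with `-o baseline.json` on the base interpreter and `-o patched.json` on the patched one, then diff them with `./python -m pyperf compare_to baseline.json patched.json` (the file names here are just placeholders).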
Great, thank you!
The issue #140135 provides more details.
Benchmark

The script: the same as the snippet in the review thread above, except for the `SIZES_KB` values.

The results (with `--rigorous`):

The environment: `sudo ./python -m pyperf system tune` was ensured.

Linked issue: PyBytesWriter API in io.RawIOBase.readall #140135
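For readers unfamiliar with the new API, below is a rough, hedged sketch of the accumulation pattern this PR adopts. It is not the actual patch (see the PR diff and gh-140135 for that): read_chunk() and main() are purely illustrative stand-ins for the stream's read() call, it requires a CPython build that ships PyBytesWriter (in-development 3.15 at the time of this PR), and the exact PyBytesWriter_* signatures should be checked against the current C API docs (PEP 782).

/* Hedged sketch of appending unknown-length data with PyBytesWriter
 * instead of repeatedly resizing a bytes object.  NOT the CPython
 * patch; the chunk source below is a hypothetical stand-in for the
 * stream's read() method. */
#include <Python.h>
#include <string.h>

/* Illustrative chunked data source standing in for stream.read(). */
static Py_ssize_t
read_chunk(const char *src, Py_ssize_t src_len, Py_ssize_t *pos,
           char *dst, Py_ssize_t chunk_size)
{
    Py_ssize_t remaining = src_len - *pos;
    Py_ssize_t n = remaining < chunk_size ? remaining : chunk_size;
    if (n > 0) {
        memcpy(dst, src + *pos, (size_t)n);
        *pos += n;
    }
    return n;  /* 0 means EOF */
}

static PyObject *
readall_sketch(const char *src, Py_ssize_t src_len)
{
    /* Start with an empty writer; it owns the growing buffer. */
    PyBytesWriter *writer = PyBytesWriter_Create(0);
    if (writer == NULL) {
        return NULL;
    }

    char chunk[64];
    Py_ssize_t pos = 0;
    for (;;) {
        Py_ssize_t n = read_chunk(src, src_len, &pos,
                                  chunk, (Py_ssize_t)sizeof(chunk));
        if (n == 0) {
            break;  /* EOF */
        }
        /* Append the chunk; the writer grows its buffer as needed. */
        if (PyBytesWriter_WriteBytes(writer, chunk, n) < 0) {
            PyBytesWriter_Discard(writer);
            return NULL;
        }
    }
    /* Turn the accumulated buffer into an exact-size bytes object. */
    return PyBytesWriter_Finish(writer);
}

int
main(void)
{
    Py_Initialize();
    const char data[] = "example payload read in fixed-size chunks";
    PyObject *result = readall_sketch(data, (Py_ssize_t)strlen(data));
    if (result != NULL) {
        PyObject_Print(result, stdout, 0);
        Py_DECREF(result);
    }
    else {
        PyErr_Print();
    }
    Py_Finalize();
    return 0;
}

Compared with concatenating chunks or resizing a bytes object in place, the writer keeps a single growing buffer and produces the final bytes object once at the end, which is where the speedup reported in this PR is expected to come from.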