gh-139871: Add `bytearray.take_bytes([n])` to efficiently extract bytes #140128
base: main
Conversation
This sets up so the bytes can be "taken" as a bytes object without requiring a copy.

I ran pyperformance (results below) and don't see any major speedups or slowdowns with this; all seems to be in the noise of my machine.

pyperformance compare main.json bytearray_bytes.json -O table

main.json
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-14 00:55:52.519236
End date: 2025-10-14 02:23:01.308400

bytearray_bytes.json
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-13 23:22:29.928152
End date: 2025-10-14 00:49:34.467284

| Benchmark | main.json | bytearray_bytes.json | Change | Significance |
|---|---|---|---|---|
| 2to3 | 137 ms | 136 ms | 1.00x faster | Not significant |
| async_generators | 193 ms | 195 ms | 1.01x slower | Not significant |
| async_tree_cpu_io_mixed | 285 ms | 286 ms | 1.01x slower | Not significant |
| async_tree_cpu_io_mixed_tg | 289 ms | 290 ms | 1.00x slower | Not significant |
| async_tree_eager | 50.4 ms | 51.5 ms | 1.02x slower | Significant (t=-10.40) |
| async_tree_eager_cpu_io_mixed | 223 ms | 225 ms | 1.01x slower | Not significant |
| async_tree_eager_cpu_io_mixed_tg | 263 ms | 264 ms | 1.00x slower | Not significant |
| async_tree_eager_io | 370 ms | 372 ms | 1.01x slower | Not significant |
| async_tree_eager_io_tg | 380 ms | 384 ms | 1.01x slower | Not significant |
| async_tree_eager_memoization | 125 ms | 126 ms | 1.01x slower | Not significant |
| async_tree_eager_memoization_tg | 161 ms | 162 ms | 1.00x slower | Not significant |
| async_tree_eager_tg | 125 ms | 125 ms | 1.00x slower | Not significant |
| async_tree_io | 366 ms | 360 ms | 1.02x faster | Not significant |
| async_tree_io_tg | 359 ms | 361 ms | 1.00x slower | Not significant |
| async_tree_memoization | 177 ms | 181 ms | 1.02x slower | Significant (t=-9.20) |
| async_tree_memoization_tg | 188 ms | 189 ms | 1.01x slower | Not significant |
| async_tree_none | 151 ms | 151 ms | 1.01x slower | Not significant |
| async_tree_none_tg | 150 ms | 151 ms | 1.01x slower | Not significant |
| asyncio_tcp | 182 ms | 161 ms | 1.13x faster | Significant (t=32.85) |
| asyncio_tcp_ssl | 548 ms | 553 ms | 1.01x slower | Not significant |
| asyncio_websockets | 342 ms | 339 ms | 1.01x faster | Not significant |
| bench_mp_pool | 7.12 ms | 7.08 ms | 1.01x faster | Not significant |
| bench_thread_pool | 818 us | 819 us | 1.00x slower | Not significant |
| bpe_tokeniser | 2.10 sec | 2.09 sec | 1.00x faster | Not significant |
| chaos | 27.9 ms | 28.0 ms | 1.00x slower | Not significant |
| comprehensions | 7.45 us | 7.24 us | 1.03x faster | Significant (t=3.27) |
| connected_components | 308 ms | 309 ms | 1.00x slower | Not significant |
| coroutines | 11.1 ms | 11.2 ms | 1.00x slower | Not significant |
| coverage | 33.6 ms | 34.1 ms | 1.02x slower | Not significant |
| create_gc_cycles | 1.16 ms | 1.16 ms | 1.01x slower | Not significant |
| crypto_pyaes | 37.1 ms | 35.6 ms | 1.04x faster | Significant (t=10.63) |
| dask | 347 ms | 351 ms | 1.01x slower | Not significant |
| deepcopy | 118 us | 117 us | 1.00x faster | Not significant |
| deepcopy_memo | 12.8 us | 12.7 us | 1.00x faster | Not significant |
| deepcopy_reduce | 1.32 us | 1.34 us | 1.02x slower | Not significant |
| deltablue | 1.65 ms | 1.64 ms | 1.01x faster | Not significant |
| django_template | 17.9 ms | 17.8 ms | 1.00x faster | Not significant |
| docutils | 1.19 sec | 1.20 sec | 1.01x slower | Not significant |
| dulwich_log | 19.5 ms | 19.7 ms | 1.01x slower | Not significant |
| fannkuch | 184 ms | 181 ms | 1.02x faster | Not significant |
| float | 37.1 ms | 36.7 ms | 1.01x faster | Not significant |
| gc_traversal | 3.04 ms | 2.84 ms | 1.07x faster | Significant (t=19.48) |
| generators | 15.9 ms | 15.3 ms | 1.04x faster | Significant (t=7.03) |
| genshi_text | 11.3 ms | 11.2 ms | 1.01x faster | Not significant |
| genshi_xml | 25.5 ms | 25.5 ms | 1.00x faster | Not significant |
| go | 57.6 ms | 56.7 ms | 1.02x faster | Not significant |
| hexiom | 2.92 ms | 2.88 ms | 1.02x faster | Not significant |
| html5lib | 26.0 ms | 26.5 ms | 1.02x slower | Significant (t=-9.20) |
| json_dumps | 4.48 ms | 4.44 ms | 1.01x faster | Not significant |
| json_loads | 11.7 us | 11.7 us | 1.01x slower | Not significant |
| k_core | 1.41 sec | 1.42 sec | 1.01x slower | Not significant |
| logging_format | 3.27 us | 3.30 us | 1.01x slower | Not significant |
| logging_silent | 45.5 ns | 45.8 ns | 1.01x slower | Not significant |
| logging_simple | 3.02 us | 3.01 us | 1.00x faster | Not significant |
| mako | 6.02 ms | 6.03 ms | 1.00x slower | Not significant |
| many_optionals | 473 us | 478 us | 1.01x slower | Not significant |
| mdp | 587 ms | 578 ms | 1.02x faster | Not significant |
| meteor_contest | 50.2 ms | 50.5 ms | 1.01x slower | Not significant |
| nbody | 54.6 ms | 52.4 ms | 1.04x faster | Significant (t=10.72) |
| nqueens | 41.7 ms | 40.4 ms | 1.03x faster | Significant (t=6.79) |
| pathlib | 9.77 ms | 9.73 ms | 1.00x faster | Not significant |
| pickle | 5.99 us | 6.01 us | 1.00x slower | Not significant |
| pickle_dict | 12.5 us | 12.8 us | 1.02x slower | Not significant |
| pickle_list | 1.98 us | 1.96 us | 1.01x faster | Not significant |
| pickle_pure_python | 149 us | 150 us | 1.00x slower | Not significant |
| pidigits | 111 ms | 115 ms | 1.03x slower | Significant (t=-18.53) |
| pprint_pformat | 737 ms | 748 ms | 1.02x slower | Not significant |
| pprint_safe_repr | 362 ms | 369 ms | 1.02x slower | Not significant |
| pyflate | 211 ms | 205 ms | 1.03x faster | Significant (t=7.43) |
| python_startup | 7.88 ms | 7.88 ms | 1.00x faster | Not significant |
| python_startup_no_site | 4.72 ms | 4.76 ms | 1.01x slower | Not significant |
| raytrace | 130 ms | 128 ms | 1.02x faster | Not significant |
| regex_compile | 50.0 ms | 50.2 ms | 1.00x slower | Not significant |
| regex_dna | 101 ms | 103 ms | 1.02x slower | Not significant |
| regex_effbot | 1.72 ms | 1.77 ms | 1.03x slower | Significant (t=-26.42) |
| regex_v8 | 12.5 ms | 12.3 ms | 1.02x faster | Not significant |
| richards | 20.4 ms | 20.0 ms | 1.02x faster | Not significant |
| richards_super | 23.4 ms | 22.8 ms | 1.03x faster | Significant (t=11.36) |
| scimark_fft | 154 ms | 153 ms | 1.00x faster | Not significant |
| scimark_lu | 55.4 ms | 57.0 ms | 1.03x slower | Significant (t=-5.67) |
| scimark_monte_carlo | 32.8 ms | 32.8 ms | 1.00x slower | Not significant |
| scimark_sor | 57.8 ms | 56.9 ms | 1.02x faster | Not significant |
| scimark_sparse_mat_mult | 2.75 ms | 2.76 ms | 1.00x slower | Not significant |
| shortest_path | 316 ms | 318 ms | 1.01x slower | Not significant |
| spectral_norm | 47.7 ms | 51.6 ms | 1.08x slower | Significant (t=-2.01) |
| sphinx | 465 ms | 467 ms | 1.00x slower | Not significant |
| sqlglot_v2_normalize | 50.3 ms | 50.2 ms | 1.00x faster | Not significant |
| sqlglot_v2_optimize | 24.2 ms | 24.4 ms | 1.01x slower | Not significant |
| sqlglot_v2_parse | 576 us | 572 us | 1.01x faster | Not significant |
| sqlglot_v2_transpile | 724 us | 722 us | 1.00x faster | Not significant |
| sqlite_synth | 1.14 us | 1.15 us | 1.01x slower | Not significant |
| subparsers | 20.6 ms | 20.7 ms | 1.00x slower | Not significant |
| sympy_expand | 181 ms | 184 ms | 1.01x slower | Not significant |
| sympy_integrate | 8.54 ms | 8.55 ms | 1.00x slower | Not significant |
| sympy_str | 103 ms | 105 ms | 1.02x slower | Not significant |
| sympy_sum | 55.9 ms | 56.0 ms | 1.00x slower | Not significant |
| telco | 3.39 ms | 3.34 ms | 1.01x faster | Not significant |
| tomli_loads | 971 ms | 982 ms | 1.01x slower | Not significant |
| typing_runtime_protocols | 73.2 us | 73.6 us | 1.01x slower | Not significant |
| unpack_sequence | 25.2 ns | 23.0 ns | 1.10x faster | Significant (t=7.03) |
| unpickle | 6.99 us | 7.05 us | 1.01x slower | Not significant |
| unpickle_list | 2.07 us | 2.10 us | 1.01x slower | Not significant |
| unpickle_pure_python | 105 us | 104 us | 1.01x faster | Not significant |
| xml_etree_generate | 40.5 ms | 40.7 ms | 1.00x slower | Not significant |
| xml_etree_iterparse | 49.7 ms | 50.4 ms | 1.01x slower | Not significant |
| xml_etree_parse | 77.2 ms | 79.1 ms | 1.02x slower | Significant (t=-16.14) |
| xml_etree_process | 29.5 ms | 29.8 ms | 1.01x slower | Not significant |
.. impl-detail::

   CPython implements this as a zero-copy operation making it a very
   efficient way to make a :class:`bytes` from a :class:`bytearray`.
Only when n is None, no?
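For readers following this thread, here is a minimal sketch of the two call shapes being discussed. The `take_bytes` helper below is hypothetical: it calls the method from this PR when the running interpreter has it, and otherwise falls back to a copying emulation that only mimics the visible result (the fallback is not zero-copy, and its error handling for out-of-range `n` is an assumption).

```python
from typing import Optional


def take_bytes(buf: bytearray, n: Optional[int] = None) -> bytes:
    """Hypothetical stand-in for bytearray.take_bytes([n])."""
    native = getattr(buf, "take_bytes", None)
    if native is not None:
        # The interpreter already provides the method added by this PR.
        return native() if n is None else native(n)
    # Fallback emulation (assumption): same visible result, but it copies.
    if n is None:
        n = len(buf)
    if not 0 <= n <= len(buf):
        raise IndexError("n is out of range")  # error type is an assumption
    taken = bytes(buf[:n])  # copy of the first n bytes
    del buf[:n]             # the remainder stays in the bytearray
    return taken


buf = bytearray(b"headerpayload")

# With n given: the first n bytes leave, the rest remain in buf.
head = take_bytes(buf, 6)

# With n omitted: everything is handed over and buf ends up empty.
rest = take_bytes(buf)
```

After the first call `buf` still holds `payload`; after the second it is empty.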
remaining = PyBytes_FromStringAndSize(self->ob_start + to_take,
remaining_length + 1);
remaining = PyBytes_FromStringAndSize(self->ob_start + to_take,
remaining_length + 1);
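The hunk above is the path where only the first `to_take` bytes leave and the remainder stays behind in the bytearray. As a rough illustration of where that shape might be useful from Python, here is a sketch that peels length-prefixed frames off the front of a receive buffer. The 4-byte big-endian header and the names `HEADER`, `take_prefix`, and `pop_frames` are assumptions made for this example, and `take_prefix` falls back to a plain copy on interpreters without the new method.

```python
import struct
from typing import List

# Illustrative wire format (assumption): 4-byte big-endian length, then payload.
HEADER = struct.Struct("!I")


def take_prefix(buf: bytearray, n: int) -> bytes:
    """Hypothetical helper: remove and return the first n bytes of buf."""
    if hasattr(buf, "take_bytes"):
        return buf.take_bytes(n)  # method added by this PR, when available
    taken = bytes(buf[:n])        # fallback: copy, then drop the prefix
    del buf[:n]
    return taken


def pop_frames(buf: bytearray) -> List[bytes]:
    """Remove every complete frame from the front of buf."""
    frames = []
    while len(buf) >= HEADER.size:
        (length,) = HEADER.unpack_from(buf, 0)
        if len(buf) < HEADER.size + length:
            break                 # incomplete frame: wait for more data
        del buf[:HEADER.size]     # discard the length prefix
        frames.append(take_prefix(buf, length))
    return frames


buf = bytearray()
buf += HEADER.pack(5) + b"hello"
buf += HEADER.pack(3) + b"ab"     # second frame is one byte short
first = pop_frames(buf)

buf += b"c"                       # the missing byte arrives
second = pop_frames(buf)
```

The partial second frame stays in `buf` until its last byte arrives, so the same bytearray can keep accumulating data between calls.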
buffer = bytearray(1024)
...
data = buffer.take_bytes()
assert len(buffer) == 0
I suggest removing these assertions. We can trust take_bytes() API, no? :-)
Update `bytearray` to contain a `bytes` and provide a zero-copy path to "extract" the `bytes`. This allows making several code paths more efficient.

This does not move any codepaths to make use of this new API. The documentation changes include common code patterns which can be made more efficient with this API.
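One pattern of the kind the description alludes to is accumulating data in a `bytearray` and returning it as `bytes` at the end. The sketch below is illustrative only: `read_all` and `take_all` are hypothetical names, and `take_all` falls back to the older `bytes(buf)` copy when the interpreter does not have `take_bytes()`.

```python
import io


def take_all(buf: bytearray) -> bytes:
    """Hypothetical helper: hand over the whole buffer as bytes."""
    if hasattr(buf, "take_bytes"):
        return buf.take_bytes()  # method added by this PR, when available
    data = bytes(buf)            # older pattern: copies the whole buffer
    buf.clear()
    return data


def read_all(stream, chunk_size: int = 4096) -> bytes:
    """Read a binary stream to exhaustion and return its contents."""
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
    # Previously this would end with `return bytes(buf)`.
    return take_all(buf)


data = read_all(io.BytesIO(b"x" * 10000), chunk_size=1024)
```

The caller sees the same `bytes` result either way; the only intended difference is whether the final conversion has to copy.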
When just changing `bytearray` to contain `bytes` I ran pyperformance on a `--with-lto --enable-optimizations --with-static-libpython` build (results below) and don't see any major speedups or slowdowns with this; all seems to be in the noise of my machine (generally changes under 5% or benchmarks that don't touch bytes/bytearray).

`pyperformance compare main.json bytearray_bytes.json`
main.json
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-14 00:55:52.519236
End date: 2025-10-14 02:23:01.308400
bytearray_bytes.json
Performance version: 1.11.0
Report on Linux-6.17.1-arch1-1-x86_64-with-glibc2.42
Number of logical CPUs: 32
Start date: 2025-10-13 23:22:29.928152
End date: 2025-10-14 00:49:34.467284
`.take_bytes([n])` a zero-copy path from `bytearray` to `bytes` #139871