
Benchmarks for common python I/O patterns #399

Open
@cmaloney


As work happens on I/O pieces, I've been building specialized micro-benchmarks (ex. for gh-120754, "Speed up open().read() pattern by reducing the number of system calls"), and others have built their own (ex. for gh-117151, "IO performance improvement, increase io.DEFAULT_BUFFER_SIZE to 128k"). It would be nice to have more general benchmarks to validate I/O performance for common cases.

From talking a little with people at PyCon US, there was some interest in the tests, and a general desire for the I/O tests not to be enabled by default but to be a group that can be run manually.

General I/O shapes I'm hoping to add benchmarks for:

  • read/write all of the bytes of a file in a single call (including pathlib.Path.read_text, pathlib.Path.write_text)
  • read/write many small files (ex. .pyc files, maybe just compileall?)
  • streaming bytes read/write (ex. to a pipe / console such as stdin/stdout/stderr, non-seekable devices)
  • read/write a zipfile, tarfile (read + seek, write + seek, in particular buffering behavior)
  • use zipimport
  • create a zipapp
  • Multi-threaded write to stdout, stderr (ex. logging in a large application/codebase)
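
To make the first two shapes concrete, here is a rough sketch of what such benchmarks could look like. It is only an illustration, not a proposed implementation: the helper names (`bench`, `run_whole_file`, `run_many_small_files`), the file sizes, and the file counts are all made up for this example, and it times with `time.perf_counter` directly rather than using any particular benchmark harness.

```python
import tempfile
import time
from pathlib import Path


def bench(func, repeat=5):
    """Return the best wall-clock time (seconds) over `repeat` calls to func."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def run_whole_file(size=1 << 20):
    """Time writing, then reading, `size` bytes in a single call each."""
    payload = b"x" * size  # hypothetical payload; real data may behave differently
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "data.bin"
        write_time = bench(lambda: path.write_bytes(payload))
        read_time = bench(lambda: path.read_bytes())
        # Sanity check that the benchmark actually round-trips the data.
        assert path.read_bytes() == payload
    return write_time, read_time


def run_many_small_files(count=200, size=4096):
    """Time writing, then reading, `count` small files of `size` bytes each."""
    payload = b"y" * size
    with tempfile.TemporaryDirectory() as tmp:
        paths = [Path(tmp) / f"f{i}.bin" for i in range(count)]

        def write_all():
            for p in paths:
                with open(p, "wb") as f:
                    f.write(payload)

        def read_all():
            total = 0
            for p in paths:
                with open(p, "rb") as f:
                    total += len(f.read())
            return total

        write_time = bench(write_all)
        read_time = bench(read_all)
        assert read_all() == count * size
    return write_time, read_time


if __name__ == "__main__":
    print("whole file (write, read):", run_whole_file())
    print("many small files (write, read):", run_many_small_files())
```

Note that reads here will mostly hit the OS page cache, so this measures Python-side overhead and system call count more than disk speed, which is probably what matters for these cases anyway.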

Note: These aim to stay at the binary / bytes I/O layer as much as possible (not touching text I/O for now).
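
As an illustration of staying at the bytes layer for the streaming case, a sketch along these lines pushes data through an OS pipe (a non-seekable file descriptor) using unbuffered binary file objects. The function name `stream_through_pipe`, the total size, and the chunk size are assumptions chosen for the example, not tuned values.

```python
import os
import threading
import time


def stream_through_pipe(total=8 << 20, chunk=64 * 1024):
    """Stream `total` bytes through an OS pipe; return (bytes_read, seconds)."""
    read_fd, write_fd = os.pipe()
    payload = b"z" * chunk

    def writer():
        # buffering=0 yields a raw io.FileIO, so no buffered layer is involved.
        with open(write_fd, "wb", buffering=0) as w:
            remaining = total
            while remaining > 0:
                # A raw write may be partial, so track what was actually written.
                written = w.write(payload[:min(chunk, remaining)])
                remaining -= written
        # Leaving the with-block closes the write end, signalling EOF to the reader.

    thread = threading.Thread(target=writer)
    start = time.perf_counter()
    thread.start()

    received = 0
    with open(read_fd, "rb", buffering=0) as r:
        while True:
            data = r.read(chunk)
            if not data:  # b"" means the writer closed its end
                break
            received += len(data)

    thread.join()
    return received, time.perf_counter() - start


if __name__ == "__main__":
    nbytes, seconds = stream_through_pipe()
    print(f"streamed {nbytes} bytes in {seconds:.4f}s")
```

Swapping `buffering=0` for the default buffering would let the same shape compare raw and buffered I/O on a non-seekable stream.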
