
Terminal Performance

Eric Rothstein edited this page Jan 16, 2024 · 18 revisions

Terminal output on Microsoft Windows is notoriously slow. Applications that write a lot of text to the terminal, e.g. via puts() or printf(), can therefore easily be bottlenecked by terminal output! A typical symptom is that the application makes very slow progress while using very little CPU time. The application is not limited by the CPU; it spends most of its time blocked in functions like puts() or printf(), which can take a very long time to return when the destination of the write operation is a terminal window.
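The symptom described above (slow progress with little CPU usage) can be checked by comparing wall-clock time against CPU time for the output loop. The following is only a minimal sketch: the helper names measure_output_cost() and blocked_fraction() are made up for this illustration, and the result is a rough indicator rather than a precise profile.

```python
import time


def measure_output_cost(write_line, count=10000):
    """Emit `count` lines via `write_line`; return (wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time consumed by this process only
    for i in range(count):
        write_line(str(i))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def blocked_fraction(wall, cpu):
    """Share of wall time not covered by CPU time, clamped to [0, 1]."""
    if wall <= 0:
        return 0.0
    return min(1.0, max(0.0, (wall - cpu) / wall))
```

Calling measure_output_cost(print) in a terminal window and passing the result to blocked_fraction() gives a value close to 1.0 when the program is mostly waiting for the terminal, and a much lower value when it is genuinely CPU-bound.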

It is not known exactly why terminal output on Windows is slow, but it is probably related to how Microsoft implemented the inter-process communication between the console application and the terminal window. In any case, using tee as an intermediary buffer between the console application and the terminal window was found to greatly improve performance!

[Figure: terminal-performance]

Example

For this specific purpose, we can invoke the tee program with NUL as the destination file. The NUL parameter instructs tee to simply forward the input data from the stdin stream to the stdout stream, without also copying the data into a file.

gizmo.exe [...] | tee.exe NUL
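To illustrate what such a pass-through stage does, here is a minimal sketch in Python. It is not the actual tee implementation; the function name forward() and the 64 KiB chunk size are assumptions chosen for this example, and whether a Python-based intermediary achieves a comparable speed-up has not been measured here.

```python
import sys


def forward(src, dst, chunk_size=64 * 1024):
    """Copy bytes from `src` to `dst` in chunks; return the number of bytes copied."""
    total = 0
    while True:
        # read1() returns as soon as some data is available (up to chunk_size),
        # so output is not held back until a full chunk has accumulated.
        chunk = src.read1(chunk_size)
        if not chunk:          # empty result means end of input
            break
        dst.write(chunk)
        total += len(chunk)
    dst.flush()
    return total


if __name__ == "__main__":
    # Binary streams avoid re-encoding the data on the way through.
    forward(sys.stdin.buffer, sys.stdout.buffer)
```

The point of the sketch is that the producing application writes into a pipe instead of directly into the terminal window, and only the intermediary talks to the terminal, in fewer and larger writes.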

Benchmark Results

Here is a simple test program that prints one million pseudo-random numbers to the standard output:

import time
from random import random, seed

seed(42)

time_enter = time.monotonic()

for _ in range(1000000):
    print(random())

time_leave = time.monotonic()

print("----")
print("Execution time: {:.2f} sec".format(time_leave - time_enter))

Execution time when running directly in the Windows terminal:

C:\dev>python rand.py
…
Execution time: 202.56 sec

Execution time when running the same program and using tee as an intermediary buffer:

C:\dev>python rand.py | tee-x64.exe NUL
…
Execution time: 42.19 sec

Simply passing the output through tee results in a speed-up of ~4.8× 😎
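The speed-up figure follows directly from the two measured execution times. As a small worked check (the variable names are chosen for this example only), the same numbers also give the approximate output rate in lines per second:

```python
LINES = 1_000_000           # iterations of the print() loop in the test program

time_direct = 202.56        # seconds, writing directly to the terminal
time_with_tee = 42.19       # seconds, piping through tee

speedup = time_direct / time_with_tee
rate_direct = LINES / time_direct        # lines per second, direct
rate_with_tee = LINES / time_with_tee    # lines per second, via tee

print("Speed-up: {:.1f}x".format(speedup))
print("Direct:   {:.0f} lines/sec".format(rate_direct))
print("Via tee:  {:.0f} lines/sec".format(rate_with_tee))
```

These are figures from a single run on one machine, so they are best read as an order-of-magnitude indication rather than a general result.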

Note

Tested on the following machine:

[Image: specifications of the test machine]
