Koichi Murase edited this page Sep 1, 2024 · 4 revisions

ble.sh Internals

I have written an explanation of how ble.sh works in a wiki page of Oil: "How Interactive Shells Work · oilshell/oil Wiki". I created this page because I think it is a good opportunity to summarize the internal implementation of ble.sh. Please note that the details of the internal implementation may change in the future.

1. What kind of Bash features does ble.sh use to make an interactive interface?

I originally wrote this section in "How Interactive Shells Work · oilshell/oil Wiki" to explain what kinds of APIs are needed to build an interactive interface on top of them.

Processing user inputs

ble.sh uses bind -x, which binds a user-provided command to a user input sequence. ble.sh steals all the user inputs from GNU Readline by binding a shell function to all possible byte values 0-255. The essential idea can be illustrated by the following code (although the actual ble.sh contains many workarounds for old Bash bugs; see lib/init-bind.sh).

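# Bind every possible byte value to a shell function (illustrative).
declare i keyseq
for i in {0..255}; do
  # untranslate-keyseq stands for a helper that converts a byte value
  # into the key-sequence notation accepted by bind.
  keyseq=$(untranslate-keyseq "$i")
  bind -x "\"$keyseq\": process-byte $i"
done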

There is no explicit main loop in ble.sh. ble.sh processes received bytes asynchronously one by one. In other words, it borrows the main loop of GNU Readline, in which Readline calls the shell functions bound by bind -x. The input byte stream is decoded into a character stream according to the specified input encoding (default: UTF-8). The character stream is translated into a key stream by processing the special escape sequences that represent cursor keys, function keys, key modifiers, etc. Finally, key sequences are constructed from the keys in the key stream based on the current keymaps and are dispatched to various operations. All of this input processing is implemented in Bash script (see src/decode.sh).
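The first stage of this pipeline (bytes to characters) can be sketched as a small state machine that is fed one byte per call, which is the shape that per-byte bind -x callbacks impose. This is only a minimal illustration and not the code in src/decode.sh: the names decode_byte, utf8_remaining, utf8_code, and decoded_chars are made up for this example, and invalid sequences are handled in a simplified way.

```shell
#!/usr/bin/env bash
# Sketch of the byte -> character stage. All names here are hypothetical.
utf8_remaining=0   # continuation bytes still expected
utf8_code=0        # code point being assembled
decoded_chars=()   # completed code points (decimal)

# Feed one byte value (0-255) to the decoder.
decode_byte() {
  local byte=$1
  if (( utf8_remaining > 0 && (byte & 0xC0) == 0x80 )); then
    # Continuation byte (10xxxxxx): append its low six bits.
    utf8_code=$(( (utf8_code << 6) | (byte & 0x3F) ))
    utf8_remaining=$(( utf8_remaining - 1 ))
    if (( utf8_remaining == 0 )); then decoded_chars+=("$utf8_code"); fi
    return 0
  fi
  utf8_remaining=0
  if (( (byte & 0xE0) == 0xC0 )); then
    utf8_code=$(( byte & 0x1F )) utf8_remaining=1   # 2-byte sequence
  elif (( (byte & 0xF0) == 0xE0 )); then
    utf8_code=$(( byte & 0x0F )) utf8_remaining=2   # 3-byte sequence
  elif (( (byte & 0xF8) == 0xF0 )); then
    utf8_code=$(( byte & 0x07 )) utf8_remaining=3   # 4-byte sequence
  else
    # ASCII, or an invalid byte passed through unchanged (a simplification).
    decoded_chars+=("$byte")
  fi
}

# Usage: feed the UTF-8 bytes of U+3042 (E3 81 82) followed by "A".
for b in 0xE3 0x81 0x82 0x41; do decode_byte "$((b))"; done
printf 'U+%04X\n' "${decoded_chars[@]}"
```

The decoder keeps its state in global variables because each byte arrives in a separate function call, so nothing can be kept on the call stack between bytes.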

Another important Bash feature that ble.sh utilizes is read -t 0, which tests whether the next byte in the standard input is already available. ble.sh uses read -t 0 for polling. For example, ble.sh implements costly operations (e.g. history load, autosuggestions, filtering of menu items, history search) as a kind of coroutine/fiber and performs them in the background while there are no user inputs. When ble.sh detects user inputs by read -t 0, it suspends the fiber and resumes it after it finishes processing the user inputs. ble.sh also uses read -t 0 to detect pasting from the clipboard (assuming that many inputs arriving in a short time are a paste), etc. (The fiber system is implemented by the functions ble/util/idle.* in src/util.sh.)
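The suspend/resume idea can be sketched as follows. This is a simplified illustration under my own assumptions and not the ble/util/idle implementation: the task keeps its progress in global variables (task_index and task_sum are hypothetical names) so that it can return as soon as read -t 0 reports pending input and continue from the same point on the next call.

```shell
#!/usr/bin/env bash
# Hypothetical resumable task: sum the integers 0..999 in small steps.
task_index=0
task_sum=0

# Run steps until the task finishes or input is pending on stdin.
# Returns 0 when the task has finished and 1 when it was suspended.
task_resume() {
  while (( task_index < 1000 )); do
    task_sum=$(( task_sum + task_index ))
    task_index=$(( task_index + 1 ))
    # read -t 0 reads nothing; it only reports whether input is available.
    if read -t 0; then return 1; fi
  done
  return 0
}
```

A caller would invoke task_resume whenever the line editor is idle and call it again later if it returned 1. The granularity of a step determines how quickly the editor reacts to a new key press.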

API Requirements: To summarize, ble.sh only requires primitive I/O operations, receive byte (bind -x) and poll (read -t 0) for its essential part. In other words, Bash/Readline doesn't provide any satisfactory high-level APIs for user-input processing (Bash/Readline provides bind for key bindings but it has tight limitations). If a shell provides some high-level support, a customizable key-binding system and a coroutine system would help users to develop interactive interfaces.

Layout and rendering of command line

ble.sh directly constructs the terminal control sequences (escape sequences) by itself. First, it determines the graphic attributes (highlighting color, etc.) of each character in the command line (this is another long story, so I'll skip the details). Next, it calculates the width of each Unicode character (it currently doesn't support combining characters) and determines the display position of each character. Then it constructs the control sequences to update the changed part (the characters that have colors or positions different from those in the previous rendering). Finally, it outputs the constructed sequences to stderr (see src/canvas.sh for the primitive layout/rendering functions, and ble/textarea#* in src/edit.sh for command line rendering).
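The idea of updating only the changed part can be illustrated with a much reduced sketch. It assumes a single line without a prompt, characters that are all one column wide, and no graphic attributes, so it is far simpler than src/canvas.sh. The function name render_update is hypothetical. The control sequences used are the standard CR, CUF (ESC [ n C), and EL (ESC [ K).

```shell
#!/usr/bin/env bash
# Hypothetical single-line renderer; assumes every character is one column wide.
render_update() {
  local prev=$1 next=$2 i=0
  # Find the length of the common prefix, which needs no redraw.
  while (( i < ${#prev} && i < ${#next} )) && [[ ${prev:i:1} == "${next:i:1}" ]]; do
    i=$(( i + 1 ))
  done
  local seq=$'\r'                                # CR: go to column 0
  if (( i > 0 )); then seq+=$'\e['"${i}C"; fi    # CUF: move right i columns
  seq+=${next:i}                                 # rewrite only the changed tail
  seq+=$'\e[K'                                   # EL: erase leftover characters
  printf '%s' "$seq" >&2                         # the line editor draws on stderr
}

# Usage: only "p" and an erase sequence are sent after the cursor movement.
render_update "echo hello" "echo help"
```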

When ble.sh calculates the layout, it uses the terminal size, which is available through the special Bash variables LINES and COLUMNS (of course, shopt -s checkwinsize is turned on by ble.sh). ble.sh also traps SIGWINCH to update the layout and redraw the command line when the terminal size changes. It should also be noted that the prompt is calculated by ble.sh itself by analyzing PS1 so that ble.sh knows the size and the cursor movement of the prompt (see ble-edit/prompt/* in src/edit.sh). When constructing the control sequences, ble.sh also refers to terminfo/termcap through the tput command if available (see lib/init-term.sh).
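The combination of checkwinsize, LINES/COLUMNS, and the SIGWINCH trap can be sketched as below. The handler and variable names (on_winch, layout_dirty, get_term_size) are hypothetical, and the fallback to 80x24 is my own assumption for the case where no terminal is attached.

```shell
#!/usr/bin/env bash
shopt -s checkwinsize   # let Bash keep LINES and COLUMNS up to date

layout_dirty=0
# Hypothetical handler: only mark the layout as stale so that the next
# rendering pass recalculates it.
on_winch() { layout_dirty=1; }
trap on_winch WINCH

# Read the current size, falling back to 80x24 when the variables are unset.
get_term_size() {
  term_cols=${COLUMNS:-80}
  term_lines=${LINES:-24}
}
```

Keeping the handler minimal and deferring the actual redraw is a common design choice for signal handlers, because the trap can fire in the middle of other work.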

Also, when ble.sh is activated, all the output from Bash/Readline is suppressed. To achieve this, ble.sh redirects the file descriptors of the Bash process using exec >... <....
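The general technique of saving a file descriptor, silencing the original one, and restoring it later can be sketched as follows. This only shows the mechanism for stdout. The actual redirections in ble.sh are more involved, and the names suppress_stdout, restore_stdout, and saved_stdout are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they require Bash 4.1+ for the {var}> syntax.
suppress_stdout() {
  # Duplicate stdout to a newly allocated descriptor, then silence fd 1.
  exec {saved_stdout}>&1 >/dev/null
}
restore_stdout() {
  # Point fd 1 back at the saved descriptor and close the copy.
  exec >&"$saved_stdout" {saved_stdout}>&-
}
```

While stdout is suppressed, the script can still write to the terminal on purpose through the saved descriptor, e.g. printf '%s' "$text" >&"$saved_stdout".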

API Requirements: ble.sh requires the primitive I/O operation of outputting a string (printf). In addition, a means to get the current terminal size (LINES and COLUMNS) is needed. The same information can be obtained by external commands such as tput lines and tput cols (ncurses) or resize (xterm utility), yet it is useful to provide it as a builtin feature (as these commands might not be available in the system). If a shell provides high-level support for this, layout and rendering can be performed by the shell rather than by shell scripts so that the shell scripts only have to specify the characters and their graphic attributes. If the shell provides the prompt calculation, it should also provide the cursor position after the prompt is printed. A means to suppress/control the I/O of the original shell is also needed.

Command execution

ble.sh uses eval. The commands must be executed in the top-level context (i.e., not in the function scope), so ble.sh uses a form of bind -x slightly modified from that described in the above section (Processing user inputs):

bind -x "\"$keyseq\": process-byte $i; eval -- \"\$_toplevel_commands\""

Here the shell variable _toplevel_commands is usually empty but contains commands only when some commands should be executed in the top-level context.
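The reason the top-level context matters can be seen with declare, which creates a local variable when it runs inside a function. The following sketch contrasts an eval inside a function with the deferred top-level eval. The functions process_byte_demo and eval_in_function and the variable names are hypothetical stand-ins for this illustration.

```shell
#!/usr/bin/env bash
_toplevel_commands=

# Stand-in for the bound handler: it queues the command instead of
# running it, because "declare" here would be confined to this function.
process_byte_demo() {
  _toplevel_commands='declare demo_var=top'
}

# Counter-example: eval inside a function makes the variable local,
# so it disappears when the function returns.
eval_in_function() { eval -- "$1"; }
eval_in_function 'declare lost_var=local'

process_byte_demo
eval -- "$_toplevel_commands"   # runs outside any function scope
_toplevel_commands=

echo "demo_var=${demo_var-unset} lost_var=${lost_var-unset}"
```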

ble.sh also needs to adjust the state of the terminal and the TTY handler using special terminal sequences and the external command stty before and after the command execution. Those adjustments are also included in _toplevel_commands.

API Requirements: ble.sh requires a means to execute commands in the top-level context (direct eval in bind -x). ble.sh also uses the external command stty to adjust the TTY handler state, which might be better provided as a builtin feature of the shell.

Summary

ble.sh relies on Bash for primitive I/O operations such as read (bind -x), write (printf), select/poll (read -t 0), and file descriptor manipulation (exec redirections). It also uses bind -x and eval to execute commands in the top-level context. To properly lay out and render the command line contents, it needs a means to get the current terminal size ($LINES and $COLUMNS) and to detect terminal-size changes (SIGWINCH trap).

2. Concurrency in ble.sh

There are many heavy operations in the interactive interface of a shell, as described below. In ble.sh, these operations are performed in the background with some mechanism of concurrency.

  • Delayed Load: One example is the initialization of ble.sh. The entire codebase of ble.sh has more than 40k lines, and it takes some time to source all the scripts and perform even a minimal initialization. To reduce the start-up time of the interactive Bash session for a better user experience, the main code of ble.sh only contains the basic line editor and the command execution feature (though it is still about 21k lines). The other modules such as the syntax analysis (lib/core-syntax.sh ~ 7k lines), the completion engine (lib/core-complete.sh ~ 6k lines), the vim editing mode (keymap/vi.sh ~ 8k lines), and other initialization scripts are loaded in the background after the ble.sh session has started.
  • History Initialization: Another example of a heavy operation is the loading of the history. ble.sh refers to the Bash command history in line editing to visit and search old commands. To implement this feature, the history needs to be loaded into arrays, which takes some time when there are many history entries. ble.sh initializes the history arrays in the background during idle time, when there are no user inputs.
  • History Search: Another example is the history search, which also takes some time. To enable the user to cancel the search and to show progress bars for the search, ble.sh also needs to perform some operations concurrently.
  • Completion: Completion can be another heavy operation when hundreds or thousands of possible completions are generated. In particular, as ble.sh processes all the possible completions in Bash script, it can take a longer time than the normal Bash interface.

In Bash, one may create a background subshell by command & for concurrency. But the problem of this method is that it is complicated to synchronize the data between the main shell process and the subshell in real time. Another complication is that the standard output needs to be synchronized between the main process and background subshells, or the background subshells should not output anything to the standard output. Also, launching a new process by fork needs some computational costs. To avoid these problems, ble.sh runs in a single process/thread (mostly) but uses some mechanisms of concurrency similar to coroutines or fibers.

There are two major frameworks of concurrency in ble.sh. One is ble/util/idle and the other is

3. Internal and external states

The shell settings and the terminal settings for the line editor and for the command execution are in general different. For example, the "echo" of the user input is desired for the command execution, while we don't want the "echo" of the user input when the line editor is in the foreground. In ble.sh, the setup for the line editor is called the internal state, and the setup for the command execution is called the external state. ble.sh switches many settings when it transitions from the line editor mode to the command execution mode and vice versa.

ble.sh adjusts the necessary part of the settings in its internal state, while it tries to preserve the external state for the command execution as much as possible. In principle, ble.sh tries to save the external settings when it switches to the internal state, and restores the external settings when it switches to the external state. However, for various reasons, some settings cannot be preserved or are intentionally changed.

The states of TTY handler

ble.sh uses the POSIX utility stty to adjust the state of the TTY handler. However, the available options depend on the system, and the detailed interface of stty also depends on the system. For this reason, it is impossible to fully specify the TTY options for the internal state. Assuming that the external TTY state is not too strange, ble.sh copies the external TTY state and changes the parts that typically need the adjustments. If the external TTY state is totally broken, ble.sh may not work as expected.

ble.sh by default does not save and restore the external TTY state. There are two reasons. One is that the external TTY state can be broken by a crash of the executed command, where the necessary cleanup is missed. If the external TTY state were fully restored, it would affect the behavior of the succeeding commands. We assume that a broken external state is not an intended one by default. Another reason is that saving and restoring the TTY state requires the additional overhead of fork/exec, which is normally negligibly small but can be noticeable in systems like Cygwin, WSL, and Termux or when the system has a high load average. To make ble.sh fully save and restore the external TTY state, please use the option bleopt term_tty_restore.
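For reference, a full save and restore of the TTY state with the POSIX utility looks like the sketch below, which also shows where the extra fork/exec cost comes from (each stty call is an external process). This is only an illustration of the stty -g mechanism and not the code behind term_tty_restore. The function and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they act on the terminal attached to stdin.
save_tty_state() {
  [[ -t 0 ]] || return 1              # no terminal: nothing to save
  # stty -g prints the current state in a form that stty can read back.
  saved_tty=$(stty -g) || return 1
}
restore_tty_state() {
  [[ -n ${saved_tty-} ]] && stty "$saved_tty"
}
```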

Shell settings

Since ble.sh itself is written in Bash script, some strange shell settings can break the line editor.

For example, if there are any shell functions or aliases that change the behavior of the builtin commands, those shell functions and aliases will be removed in the internal state. ble.sh still tries to restore the shell functions and aliases for the external state, but some commands such as builtin will not be restored.

Some builtin commands (trap, readonly, bind, history, read, and exit) are replaced by ble.sh's wrapper functions.

To distinguish ESC and the Meta modifier from each other, ble.sh sets the Readline setting keymap-timeout.

Other shell settings that are changed in the internal state include shell options (set -ekuvxBT), Bash options (extdebug, nocasematch, expand_aliases), a Readline setting (convert-meta), the locale variables (LANG, LC_ALL, LC_COLLATE, and others affected by LC_ALL), and the variables IFS, POSIXLY_CORRECT, IGNOREEOF, FUNCNEST, PS1, PROMPT_COMMAND, and TIME_FORMAT. ble.sh also saves and restores BASH_REMATCH. Those settings are supposed to be restored in the external state.
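The save/adjust/restore cycle can be sketched for two of these settings, IFS and nocasematch. This is a minimal illustration with hypothetical function names, not the actual implementation. In particular, it does not distinguish an unset IFS from an empty one, which a real implementation would have to care about.

```shell
#!/usr/bin/env bash
# Switch to the settings that the line editor code expects.
enter_internal_state() {
  saved_ifs=$IFS
  if shopt -q nocasematch; then saved_nocasematch=-s; else saved_nocasematch=-u; fi
  IFS=$' \t\n'
  shopt -u nocasematch
}

# Give the user's settings back before running a user command.
leave_internal_state() {
  IFS=$saved_ifs
  shopt "$saved_nocasematch" nocasematch
}
```

A user command that sets IFS=: or shopt -s nocasematch then sees its own settings on every subsequent command, while the editor code in between always runs with the default values.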

Terminal settings

ble.sh also adjusts the terminal colors specified by SGR. If no SGR is specified in PS1 in plain Bash (without ble.sh), the graphic settings specified by a command using the control function SGR can affect all the rendering and the output of the subsequent commands. However, ble.sh also needs to change the setting for its syntax highlighting and other UIs. We do not restore the SGR state for the external state because there is no general way to get the current state so we cannot reliably save the setting in the first place. Some terminals have a mechanism to request the current SGR state, but this is not supported by many terminals and it has a delay because it requires a roundtrip communication. Also, some terminals support pushing/popping the SGR states, but this is supported by only a small number of terminals. Another reason not to restore it is that this again can be broken by a crash of an executed command, and thus perfectly restoring the external state is not useful in general.

Similarly, ble.sh changes the advanced keyboard protocols such as modifyOtherKeys and kitty's protocol, but it is difficult to obtain the current setting, so it is impossible to perfectly restore it for the external state. ble.sh also changes the bracketed paste mode and the cursor styles. They are all set to the typical (sane) state in the external state regardless of the previous external state. If the user sets them to some insane states for the command execution, those states will not be preserved.
