Add internal code mode module for programmatic notebook control by manzt · Pull Request #8670 · marimo-team/marimo

manzt · 2026-03-12T19:07:59Z

This introduces `marimo._code_mode`, an internal agent-only API that gives programmatic access to a running marimo notebook. The motivating use case is letting agents (e.g. from a scratchpad) insert, delete, replace, and reorder cells without going through the frontend UI. For example:

import marimo._code_mode as cm

async with cm.get_context() as ctx:
    # Install packages (queued, installed before cell ops)
    ctx.install_packages("pandas", "polars>=0.20")

    # Cell ops (appends at end by default)
    cid = ctx.create_cell("import pandas as pd")
    ctx.create_cell("df = pd.DataFrame()", after=cid)
    ctx.create_cell("setup()", before=cid, hide_code=True, disabled=True)
    
    ctx.update_cell("my_cell", code="x = 42")
    ctx.update_cell("other", hide_code=False, disabled=True)
    ctx.delete_cell("old_cell")
    ctx.move_cell("my_cell", after="other_cell")

    # Set UI element values (batched)
    ctx.set_ui_value(slider, 10)

# Dry-run compile check is on by default; disable with:
async with cm.get_context(check=False) as ctx:
    ...

vercel · 2026-03-12T19:08:05Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	Mar 16, 2026 1:40am

Copilot

Pull request overview

Adds an internal marimo._code_mode module to let agents programmatically edit a running notebook (insert/delete/replace/reorder cells and set UI values) by reducing edits into a plan and applying them to the kernel, with accompanying tests.

Changes:

Introduces marimo._code_mode with edit descriptors (_edits.py) and an async runtime context (_context.py) to apply edits and broadcast notifications.
Adds tests for plan building and apply_edit behavior.
Makes the kernel control loop more resilient by catching/logging exceptions during control request handling.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| `marimo/_code_mode/_edits.py` | Defines immutable edit descriptors (`NotebookCellData`, `NotebookEdit`) used to describe notebook edits. |
| `marimo/_code_mode/_context.py` | Implements `AsyncCodeModeContext` to apply edits against a live kernel/graph and send notifications/execute cells. |
| `marimo/_code_mode/__init__.py` | Exposes the internal API surface and provides usage docs/examples. |
| `marimo/_runtime/runtime.py` | Wraps `kernel.handle_message()` in the control loop with exception logging. |
| `tests/_code_mode/test_plan_building.py` | Adds unit tests for reducing edits into a plan (`_build_plan`). |
| `tests/_code_mode/test_apply_edit.py` | Adds integration-style tests for applying edits to a `Kernel` and observing graph/notifications. |
| `tests/_code_mode/__init__.py` | Initializes the new test package. |


Copilot · 2026-03-12T20:00:51Z

marimo/_runtime/runtime.py

+                try:
+                    await kernel.handle_message(request)
+                except Exception:
+                    LOGGER.exception(
+                        "Failed to handle control request: %s",
+                        type(request).__name__,
+                    )


This new try/except logs and continues on any kernel.handle_message() exception. Two concerns: (1) exceptions from the merged UpdateUIElementCommand/ModelCommand branch (above) are still unhandled and can still crash the control loop, so the exception handling is inconsistent; (2) swallowing exceptions here may leave the client without a response/notification. Consider either applying the same guard to the merged branch and/or emitting an explicit failure notification/teardown so the system fails in a controlled way.
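The "fail in a controlled way" option from this comment can be sketched as follows. This is a minimal illustration, not marimo's control loop: `handle_message`, `control_loop`, and the string notifications are hypothetical stand-ins. The point is that every handler call sits behind the same guard, and a failure produces an explicit notification instead of silence.

```python
import asyncio
import logging

LOGGER = logging.getLogger("control_loop_sketch")


async def handle_message(request: str) -> str:
    # Hypothetical handler: one specific request fails.
    if request == "bad":
        raise ValueError("cannot handle request")
    return f"handled {request}"


async def control_loop(requests: list[str]) -> list[str]:
    notifications: list[str] = []
    for request in requests:
        try:
            notifications.append(await handle_message(request))
        except Exception:
            # Log the failure, then notify the client explicitly so it is
            # not left waiting for a response that never arrives.
            LOGGER.exception("Failed to handle control request: %s", request)
            notifications.append(f"failed {request}")
    return notifications


results = asyncio.run(control_loop(["a", "bad", "b"]))
```

The loop keeps processing after the failing request, and the client still sees one notification per request.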

Copilot · 2026-03-12T20:00:52Z

marimo/_code_mode/_context.py

+                if entry.config is not None:
+                    cell.configure(entry.config.asdict())
+                    self._kernel.cell_metadata[entry.cell_id] = CellMetadata(
+                        config=entry.config


Test coverage gap: there’s no test that updates code for an existing cell without providing config and asserts the cell’s previous config (from kernel.cell_metadata) is preserved. Adding a regression test around this branch would catch config-loss issues when recompiling/re-registering a cell.

Suggested change

-                if entry.config is not None:
-                    cell.configure(entry.config.asdict())
-                    self._kernel.cell_metadata[entry.cell_id] = CellMetadata(
-                        config=entry.config
+                # Preserve existing config if no new config is provided
+                cfg = entry.config
+                if cfg is None:
+                    existing_metadata = self._kernel.cell_metadata.get(
+                        entry.cell_id
+                    )
+                    if existing_metadata is not None:
+                        cfg = existing_metadata.config
+                if cfg is not None:
+                    cell.configure(cfg.asdict())
+                    self._kernel.cell_metadata[entry.cell_id] = CellMetadata(
+                        config=cfg
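The regression test this comment asks for might look roughly like the sketch below. It does not use marimo's `Kernel` or `CellMetadata`; `CellConfig` here is a hypothetical frozen dataclass and `resolve_config` is an illustrative helper that mirrors the fallback logic in the suggestion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CellConfig:
    # Hypothetical stand-in for a cell's configuration.
    hide_code: bool = False
    disabled: bool = False


def resolve_config(
    new_config: Optional[CellConfig],
    cell_id: str,
    metadata: dict[str, CellConfig],
) -> Optional[CellConfig]:
    # An explicitly provided config wins; otherwise fall back to stored metadata.
    if new_config is not None:
        return new_config
    return metadata.get(cell_id)


metadata = {"a": CellConfig(hide_code=True)}
kept = resolve_config(None, "a", metadata)  # code-only update
replaced = resolve_config(CellConfig(disabled=True), "a", metadata)
missing = resolve_config(None, "unknown", metadata)  # no stored config
```

The `kept` case is the one the comment flags as untested: a code-only update should carry the stored config forward.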

The code mode API lacked a way to install packages programmatically. This adds `install_packages(*packages)` which reads the user's configured package manager and passes pip-style specifiers directly through to it. Specifiers like `polars>=0.20` are passed as-is rather than being parsed into name/version pairs, since pip and uv handle the full specifier natively.

```python
async with cm.get_context() as nb:
    await nb.install_packages("pandas", "polars>=0.20")
```

Tests mock the underlying `package_manager.install()` to verify each specifier string reaches the package manager unchanged.
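A test in the spirit described above could be sketched with `unittest.mock`. The `install_packages` function below is a simplified, synchronous stand-in written for this example (the real method lives on the context object), and the `MagicMock` plays the role of the package manager.

```python
from unittest.mock import MagicMock


def install_packages(package_manager, *packages: str) -> None:
    # Illustrative stand-in: forward each pip-style specifier unchanged,
    # without splitting it into name/version parts.
    for spec in packages:
        package_manager.install(spec)


package_manager = MagicMock()
install_packages(package_manager, "pandas", "polars>=0.20")

# Collect the first positional argument of every install() call.
received = [call.args[0] for call in package_manager.install.call_args_list]
```

Comparing `received` to the input strings confirms that no specifier was rewritten on the way to the package manager.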

The context module was getting large with op types, plan building, and validation mixed in alongside the public AsyncCodeModeContext class. This moves all internal machinery (op dataclasses, `_build_plan`, `_validate_ops`, `_PlanEntry`) into a dedicated `_plan.py` module so `_context.py` stays focused on the context manager API. Also renames `test_apply_edit.py` to `test_context.py` to match the module it tests, and switches notification assertions to use `msgspec.to_builtins()` for typed snapshot comparisons instead of manual dict-building helpers.

The code mode context's `install_packages` was async and executed immediately, which didn't fit the batched mutation pattern used by `add_cell`, `update_cell`, etc. It's now synchronous and queues packages, which are installed one-by-one in `__aexit__` before cell ops are applied. This ensures newly added cells can import the just-installed packages.

```py
async with cm.get_context() as ctx:
    ctx.install_packages("pandas", "numpy>=2.0")
    ctx.add_cell("import pandas as pd")
```

When code is executed via the `/api/kernel/execute` scratchpad endpoint, `MissingPackageAlertNotification` is now suppressed from reaching the frontend via a new Room-level suppression mechanism on `session.scoped()`. The listener still captures the notification through the event bus and surfaces a helpful error suggesting `ctx.install_packages()` instead of the raw `ModuleNotFoundError` traceback.

The convention in docstrings is also updated from `nb` to `ctx` throughout.
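The queue-then-flush ordering can be illustrated with a small sketch. `BatchContext` is a hypothetical class for this example only; it records what it would do in a log instead of installing anything, and skipping the flush when the block raises is an assumption of the sketch rather than documented behavior.

```python
import asyncio


class BatchContext:
    """Hypothetical batched context: queue synchronously, flush on exit."""

    def __init__(self) -> None:
        self._packages: list[str] = []
        self._cell_ops: list[str] = []
        self.log: list[str] = []

    async def __aenter__(self) -> "BatchContext":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        if exc_type is not None:
            return False  # assumption: nothing is flushed if the block raised
        # Packages first, so that new cells can import them.
        for spec in self._packages:
            self.log.append(f"install {spec}")
        for code in self._cell_ops:
            self.log.append(f"add_cell {code}")
        return False

    def install_packages(self, *packages: str) -> None:
        self._packages.extend(packages)

    def add_cell(self, code: str) -> None:
        self._cell_ops.append(code)


async def main() -> list[str]:
    async with BatchContext() as ctx:
        # Queued in the "wrong" order on purpose.
        ctx.add_cell("import pandas as pd")
        ctx.install_packages("pandas", "numpy>=2.0")
    return ctx.log


log = asyncio.run(main())
```

Even though the cell was queued before the packages, the log shows the installs first, which is the property the commit relies on.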

Calling `ctx.add_cell(...)` without `async with cm.get_context()` silently queues operations that never flush, making it look like the call succeeded when nothing actually happened. This was discovered during agent-driven notebook editing where the missing `async with` caused cells to never appear in the UI. All mutating methods (`add_cell`, `update_cell`, `delete_cell`, `move_cell`, `install_packages`) now check an `_entered` flag set by `__aenter__` and raise immediately with a clear message showing the correct pattern.
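A minimal sketch of such a guard, assuming a flag set in `__aenter__` and checked by every mutating method. `GuardedContext`, `_require_entered`, and the exact error text are hypothetical; only the `_entered` flag name comes from the description above.

```python
import asyncio
from typing import Optional


class GuardedContext:
    """Hypothetical context whose mutating methods require 'async with'."""

    def __init__(self) -> None:
        self._entered = False
        self._ops: list[str] = []

    async def __aenter__(self) -> "GuardedContext":
        self._entered = True
        return self

    async def __aexit__(self, *exc_info) -> bool:
        self._entered = False
        return False

    def _require_entered(self) -> None:
        if not self._entered:
            raise RuntimeError(
                "Operations must run inside 'async with get_context() as ctx:'"
            )

    def add_cell(self, code: str) -> None:
        self._require_entered()
        self._ops.append(code)


def misuse() -> Optional[str]:
    # Calling a mutating method without entering the context raises at once.
    try:
        GuardedContext().add_cell("x = 1")
    except RuntimeError as err:
        return str(err)
    return None


async def correct_use() -> list[str]:
    async with GuardedContext() as ctx:
        ctx.add_cell("x = 1")
        return list(ctx._ops)


error_message = misuse()
queued = asyncio.run(correct_use())
```

The failure now happens at the call site rather than showing up later as cells that never appear.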

Extracts `is_headless_request()` into `_messaging/context.py` (replaces the private `_accepts_html()` in `tracebacks.py`). The missing-packages hook now checks this before broadcasting `MissingPackageAlertNotification`, so non-browser clients like the `/execute` SSE endpoint no longer receive the alert — the `ModuleNotFoundError` already surfaces as a cell error with an `ctx.install_packages(...)` hint. This also removes the Room-level `_suppressed_types` mechanism and the `suppress` parameter on `session.scoped()`, which were added to solve the same problem with more machinery.

The code mode context manager now validates all queued cell ops before touching the graph. On `__aexit__`, `_dry_run_compile` compiles each cell with `compile_cell`, temporarily registers it in the graph, and checks for newly introduced multiply-defined names or cycles. If anything fails the graph is restored and no mutations occur. Existing graph problems are snapshotted beforehand so only new issues are flagged. `get_context()` accepts `check=True` (default) to control this. The runtime errors include a `check=False` hint for callers who want to bypass validation intentionally.

`set_ui_value` is now sync and queue-based like the other mutation methods. Updates are collected during the block and flushed as a single batched `UpdateUIElementCommand` on exit, after cell ops are applied.
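The "only flag new problems" idea can be sketched without marimo's graph. The example below uses the standard `ast` module to collect top-level assigned names and validates against a copy of the cell mapping instead of registering and restoring; that substitution is a simplification for this sketch, and all function names are hypothetical. Cycle detection is not covered.

```python
import ast
from collections import Counter


def defined_names(code: str) -> set[str]:
    # ast.parse raises SyntaxError for code that does not compile.
    names: set[str] = set()
    for node in ast.parse(code).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def multiply_defined(cells: dict[str, str]) -> set[str]:
    counts = Counter(
        name for code in cells.values() for name in defined_names(code)
    )
    return {name for name, count in counts.items() if count > 1}


def dry_run(cells: dict[str, str], new_cells: dict[str, str]) -> set[str]:
    baseline = multiply_defined(cells)  # snapshot of existing problems
    trial = {**cells, **new_cells}  # a copy; the live mapping is untouched
    return multiply_defined(trial) - baseline


# "x" is already defined twice, so it belongs to the baseline.
graph = {"a": "x = 1", "b": "x = 2", "c": "y = 3"}
no_new_issues = dry_run(graph, {"d": "z = 4"})
new_issue = dry_run(graph, {"e": "y = 5"})
```

The pre-existing conflict on `x` is ignored, while the new conflict on `y` is reported.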

When `update_cell` changed a cell's code without explicitly passing config kwargs, the cell's existing configuration (e.g. `hide_code`, `disabled`) was silently lost. The recompiled cell would get a bare default config, and the frontend notification would hardcode `hide_code=True` regardless of prior state. Now the `code_changed` path reads back the cell's existing `CellConfig` from kernel metadata and carries it forward. The notification fallback also uses stored metadata instead of a hardcoded default. New cells created via `create_cell` still default to `hide_code=True`.

These methods are the primary interface for AI agents editing notebooks programmatically. The previous docstrings were terse one-liners that didn't explain arguments or show usage patterns. The new docstrings follow the marimo convention (Examples with code blocks, then Args) so that agents working with this API can understand the semantics of each parameter — particularly the "None means keep existing" behavior of `update_cell` and the `draft` flag.

The code mode context was calling `graph.delete_cell()` directly when removing or updating cells, bypassing the kernel's cleanup path. This left stale variables in `kernel.globals`, leaked UI elements (no `RemoveUIElementsNotification`), and skipped lifecycle hook disposal. Now deletions go through `Kernel._delete_cell` and updates through `Kernel._deactivate_cell`, which properly invalidate globals, dispose UI elements, and fire lifecycle hooks. We use these internal primitives directly rather than `DeleteCellCommand` because we need synchronous graph manipulation within a single atomic batch — the command path would trigger `_run_cells` for descendants, which we already handle via `ExecuteCellsCommand` at the end. Also merges the previously duplicated `is_new` / `code_changed` branches in `_apply_ops`, preserves existing cell config on code-only updates, simplifies the notification config lookup, and fixes a mypy error where `_cell_manager` should have been the public `cell_manager` property.

The previous `_apply_ops` manually compiled cells, registered them in the graph, and called private kernel methods (`_delete_cell`, `_deactivate_cell`) to clean up state. This reimplemented half of what `mutate_graph` already does and was fragile — any changes to the kernel's cleanup path would need to be mirrored here. Now `_apply_ops` builds `ExecuteCellCommand` / `DeleteCellCommand` lists and passes them to `mutate_graph`, which handles compilation, registration, deletion, globals cleanup, UI element disposal, and lifecycle hooks through its established code path. Configs are resolved before the call (since `mutate_graph` may delete metadata for replaced cells) and applied after registration. Draft cells are excluded from the run set returned by `mutate_graph`.

`mutate_graph` calls `_deactivate_cell` (which removes a cell from the dict) then `_try_registering_cell` (which re-adds it at the end). This means updated cells lose their position in the ordering. Since the code mode plan tracks the intended cell order, we need to reorder the graph's internal dict after mutation to match. Adds `Topology.reorder_nodes` to rearrange the cells dict in-place, and calls it in `_apply_ops` right after `mutate_graph` returns.
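Reordering a dict in place, as described above, can be sketched in plain Python. `reorder_in_place` is a hypothetical helper, not the `Topology.reorder_nodes` implementation; it relies only on the fact that Python dicts preserve insertion order.

```python
def reorder_in_place(cells: dict, order: list) -> None:
    # The new order must be a permutation of the existing keys.
    if len(order) != len(cells) or set(order) != set(cells):
        raise ValueError("order must be a permutation of the existing keys")
    snapshot = {key: cells[key] for key in order}
    # Mutate the same dict object so that other references stay valid.
    cells.clear()
    cells.update(snapshot)


cells = {"a": "code a", "b": "code b", "c": "code c"}
alias = cells  # another holder of the same dict

# Simulate an update: the cell is removed, then re-added at the end.
cells["a"] = cells.pop("a")
after_update = list(cells)

reorder_in_place(cells, ["a", "b", "c"])
restored = list(alias)
```

Because the dict is cleared and refilled rather than replaced, `alias` sees the restored order too.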

The dry-run compile check in `_dry_run_compile` evicts and re-registers cells to validate updates, but `register_cell` appends to the end of the dict. This corrupts the cell ordering that `_apply_ops` reads on line 620 via `list(self.graph.cells.keys())`, so the plan built from that ordering preserves the wrong position. The fix snapshots the cell order before any mutations and restores it after cleanup.

Separately, `run_scratchpad` was not flushing `state_updates` after execution. When a widget `.observe()` callback calls a `mo.state` setter from the scratchpad, the update gets queued but never processed because `run_scratchpad` returns without calling `_run_cells(set())`. Other code paths like `set_ui_element_value` and `handle_receive_model_message` already do this flush. Adding the same pattern to `run_scratchpad` fixes downstream cell reactivity for programmatic widget state changes.

The dry-run compile check in `_dry_run_compile` did not simulate `_DeleteOp`, so deleting a cell and creating a replacement that defines the same names would falsely raise "Multiply-defined names". Now delete ops evict cells from the graph during the dry run, matching the behavior already in place for `_UpdateOp`. Also renames `update_cell` to `edit_cell` for a clearer API.
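The false positive and its fix can be reproduced in a toy model where each cell is reduced to the set of names it defines. Everything here (`conflicts`, `dry_run`, the `simulate_deletes` switch) is hypothetical and exists only to contrast the two behaviors.

```python
def conflicts(cells: dict[str, set[str]]) -> set[str]:
    # Names defined by more than one cell.
    seen: set[str] = set()
    dupes: set[str] = set()
    for names in cells.values():
        dupes |= names & seen
        seen |= names
    return dupes


def dry_run(
    cells: dict[str, set[str]],
    deletes: list[str],
    creates: dict[str, set[str]],
    simulate_deletes: bool,
) -> set[str]:
    trial = dict(cells)  # validate against a copy
    if simulate_deletes:
        for cell_id in deletes:
            trial.pop(cell_id, None)
    trial.update(creates)
    return conflicts(trial)


graph = {"old": {"df"}, "other": {"y"}}
# Delete "old" and create a replacement that also defines "df".
before_fix = dry_run(graph, ["old"], {"new": {"df"}}, simulate_deletes=False)
after_fix = dry_run(graph, ["old"], {"new": {"df"}}, simulate_deletes=True)
```

Without simulated deletes the replacement collides with the cell it is meant to replace; with them, the batch validates cleanly.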

When cell operations are applied via `async with cm.get_context()`, `__aexit__` now prints a line per operation to stdout. This gives agents confirmation that ops took effect without needing to re-query the graph. Previously, success was completely silent, making it impossible to distinguish "operations applied" from "nothing happened" when executing remotely via the scratchpad. Output looks like:

    ✓ created cell 'data_loader'
    ✓ edited code of cell 'a1b2c3d4'
    ✓ deleted cell 'scratch'

Cell names are used when available, otherwise the first 8 characters of the cell ID.
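The labeling rule (name if available, otherwise the first 8 characters of the ID) is small enough to sketch directly. `label` and `report` are hypothetical helpers, and the operation tuples are made up for the example.

```python
from typing import Optional


def label(cell_id: str, name: Optional[str]) -> str:
    # Prefer the cell name; fall back to a short prefix of the ID.
    return name if name else cell_id[:8]


def report(ops: list[tuple[str, str, Optional[str]]]) -> list[str]:
    return [
        f"✓ {action} cell '{label(cell_id, name)}'"
        for action, cell_id, name in ops
    ]


lines = report(
    [
        ("created", "0f9d8c7b6a5e", "data_loader"),
        ("edited code of", "a1b2c3d4e5f6", None),
        ("deleted", "77aa66bb55cc", "scratch"),
    ]
)
for line in lines:
    print(line)
```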

When using /api/execute in headless mode, the notebook may not have been instantiated yet, meaning the kernel's globals are empty and scratchpad code can't reference notebook variables. This adds a session.instantiate() call before enqueuing the scratchpad command. The kernel already no-ops if the notebook is already instantiated, so this is safe on every call. Queue ordering guarantees instantiation completes before the scratchpad runs.
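Why it is safe to enqueue instantiation on every call can be shown with a FIFO queue and an idempotent handler. `KernelSketch` and `execute_endpoint` are hypothetical names for this sketch; the real session and kernel are considerably more involved.

```python
from collections import deque


class KernelSketch:
    """Hypothetical kernel that ignores repeated instantiate commands."""

    def __init__(self) -> None:
        self.instantiated = False
        self.log: list[str] = []

    def handle(self, command: str) -> None:
        if command == "instantiate":
            if self.instantiated:
                return  # already instantiated: no-op
            self.instantiated = True
        self.log.append(command)


def execute_endpoint(queue: deque) -> None:
    # Always enqueue instantiate first; FIFO order guarantees it is
    # processed before the scratchpad command that follows it.
    queue.append("instantiate")
    queue.append("scratchpad")


queue: deque = deque()
kernel = KernelSketch()
execute_endpoint(queue)
execute_endpoint(queue)
while queue:
    kernel.handle(queue.popleft())
handled = kernel.log
```

Two calls to the endpoint result in a single instantiation followed by two scratchpad runs.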

The code_mode API lacked support for cell names and couldn't convert a regular cell into a setup cell. `create_cell` and `edit_cell` now accept a `name` parameter. Passing `name="setup"` uses the well-known setup cell ID so the frontend recognizes it as a setup cell; the name itself is cleared since setup identity is purely a cell_id concern.

`edit_cell` handles the tricky case: when an existing cell is converted to setup, it migrates the cell_id via a `new_cell_id` field on `_UpdateOp`. The plan builder swaps the ID in place (preserving position), and `_apply_ops` sees the old ID as deleted and the new one as added, which is exactly what `mutate_graph` needs.

Cell IDs now use `CellIdGenerator` (producing short 4-letter IDs like `Hbol`) instead of UUIDs. The generator is seeded with existing graph cell IDs to avoid collisions. This was necessary because `cell_manager` is unreachable from the kernel process, since it lives on the server side.

```python
async with cm.get_context() as ctx:
    ctx.create_cell("import marimo as mo", name="setup")
    ctx.edit_cell("Hbol", name="setup")  # migrates cell_id
```
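Seeding an ID generator with the IDs already in use can be sketched as below. `ShortIdGenerator` is a hypothetical class written for this example and is not marimo's `CellIdGenerator`; the fixed random seed is only there to make the sketch reproducible.

```python
import random
import string


class ShortIdGenerator:
    """Hypothetical generator of 4-letter IDs that avoids known IDs."""

    def __init__(self, existing: set[str], seed: int = 0) -> None:
        self._seen = set(existing)  # seed with IDs already in the graph
        self._rng = random.Random(seed)

    def next_id(self) -> str:
        while True:
            candidate = "".join(self._rng.choices(string.ascii_letters, k=4))
            if candidate not in self._seen:
                self._seen.add(candidate)
                return candidate


existing_ids = {"Hbol", "MJUe"}
generator = ShortIdGenerator(existing_ids)
new_ids = [generator.next_id() for _ in range(50)]
```

Each returned ID is recorded immediately, so IDs generated within one batch cannot collide with each other either.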

When batching multiple operations in a single `async with` block, `before`/`after` targets could only reference cells by their live graph name or by a pending add's cell ID. This meant that `create_cell(..., name="foo")` followed by `create_cell(..., after="foo")` would fail, and so would referencing a cell by a name assigned via `edit_cell` in the same batch. `_resolve_target` now also searches pending adds by name and queued `_UpdateOp` renames, so all of these work within a single batch:

    ctx.create_cell("x = 1", name="first")
    ctx.create_cell("y = x + 1", after="first")
    ctx.edit_cell("old_cell", name="renamed")
    ctx.create_cell("z = 1", after="renamed")
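The extended lookup can be sketched as a function over three plain dicts. `resolve_target` and its parameters are hypothetical, and the lookup order shown (live names, pending IDs, pending names, queued renames) is an assumption of this sketch rather than the order `_resolve_target` necessarily uses.

```python
from typing import Optional


def resolve_target(
    target: str,
    live_names: dict[str, str],  # cell name -> cell ID in the live graph
    pending_adds: dict[str, Optional[str]],  # new cell ID -> optional name
    pending_renames: dict[str, str],  # existing cell ID -> queued new name
) -> Optional[str]:
    if target in live_names:
        return live_names[target]
    if target in pending_adds:  # referenced by a pending add's cell ID
        return target
    for cell_id, name in pending_adds.items():
        if name == target:  # referenced by a pending add's name
            return cell_id
    for cell_id, new_name in pending_renames.items():
        if new_name == target:  # referenced by a name assigned in this batch
            return cell_id
    return None


live = {"old_cell": "id_old"}
adds = {"id_first": "first", "id_anon": None}
renames = {"id_old": "renamed"}

by_pending_name = resolve_target("first", live, adds, renames)
by_rename = resolve_target("renamed", live, adds, renames)
by_pending_id = resolve_target("id_anon", live, adds, renames)
unknown = resolve_target("nope", live, adds, renames)
```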

github-actions · 2026-03-16T04:00:55Z

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.20.5-dev70

vercel bot deployed to Preview March 12, 2026 19:08 View deployment

manzt added the enhancement New feature or request label Mar 12, 2026

vercel bot deployed to Preview March 12, 2026 19:10 View deployment

manzt force-pushed the push-qtlwwmptyrvr branch from fe050d3 to 467bf90 Compare March 12, 2026 19:26

manzt requested a review from dmadisetti as a code owner March 12, 2026 19:26

vercel bot deployed to Preview March 12, 2026 19:27 View deployment

vercel bot deployed to Preview March 12, 2026 19:29 View deployment

manzt force-pushed the push-qtlwwmptyrvr branch from ea7bcfe to 1f5911e Compare March 12, 2026 19:38

vercel bot deployed to Preview March 12, 2026 19:39 View deployment

vercel bot deployed to Preview March 12, 2026 19:42 View deployment

mscolnick requested review from Copilot and removed request for dmadisetti March 12, 2026 19:54

Copilot started reviewing on behalf of mscolnick March 12, 2026 19:54 View session

mscolnick reviewed Mar 12, 2026

marimo/_code_mode/_edits.py (outdated)

mscolnick reviewed Mar 12, 2026

marimo/_code_mode/_edits.py (outdated)

Copilot AI reviewed Mar 12, 2026

manzt force-pushed the push-qtlwwmptyrvr branch from 2d61e07 to ae4e010 Compare March 12, 2026 20:42

manzt force-pushed the push-ukxtvoylltuk branch from 5ef4725 to 89284ff Compare March 12, 2026 20:42

vercel bot deployed to Preview March 12, 2026 20:43 View deployment

mscolnick reviewed Mar 12, 2026

marimo/_code_mode/_context.py (outdated)

Base automatically changed from push-ukxtvoylltuk to manzt/agent-cli March 12, 2026 21:12

manzt force-pushed the manzt/agent-cli branch from 101f362 to d6592fb Compare March 12, 2026 21:18

manzt force-pushed the push-qtlwwmptyrvr branch from ae4e010 to b1e266e Compare March 12, 2026 21:18

vercel bot deployed to Preview March 12, 2026 21:20 View deployment

vercel bot deployed to Preview March 12, 2026 21:30 View deployment

vercel bot deployed to Preview March 12, 2026 21:31 View deployment

Base automatically changed from manzt/agent-cli to main March 12, 2026 21:40

manzt force-pushed the push-qtlwwmptyrvr branch from 543988e to dd19df2 Compare March 12, 2026 21:41

vercel bot deployed to Preview March 12, 2026 21:42 View deployment

manzt and others added 18 commits March 15, 2026 21:12

Rename add_cell to create_cell

ff3ea2c

fix: expose .buffer on ThreadSafeStdout/Stderr for package installation

c9ca784

manzt force-pushed the push-qtlwwmptyrvr branch from c633134 to 57882da Compare March 16, 2026 01:12

vercel bot deployed to Preview March 16, 2026 01:13 View deployment

vercel bot deployed to Preview March 16, 2026 01:32 View deployment

manzt force-pushed the push-qtlwwmptyrvr branch from 9e24f0d to 4f9e1e7 Compare March 16, 2026 01:39

vercel bot deployed to Preview March 16, 2026 01:40 View deployment

mscolnick approved these changes Mar 16, 2026


manzt enabled auto-merge (squash) March 16, 2026 03:55

manzt disabled auto-merge March 16, 2026 03:55

manzt merged commit 1e5a7bf into main Mar 16, 2026
35 of 43 checks passed

manzt deleted the push-qtlwwmptyrvr branch March 16, 2026 03:55

manzt merged 27 commits into main from push-qtlwwmptyrvr


4 participants
