|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code when working with code in this repository. |
| 4 | + |
| 5 | +## About This Project |
| 6 | + |
| 7 | +lobstr is a package for R developers that prints data structures and objects in a tree-like fashion. It provides specialized `base::str()`-like functions that help visualize objects during development: |
| 8 | + |
| 9 | +- `ast()`: draws the abstract syntax tree of R expressions |
| 10 | +- `ref()`: shows how objects can be shared across data structures by digging into the underlying references |
| 11 | +- `obj_size()`: computes the size of an object taking these shared references into account |
| 12 | +- `cst()`: shows how frames on the call stack are connected |
| 13 | + |
| 14 | + |
| 15 | +## Key development commands |
| 16 | + |
| 17 | +General advice: |
| 18 | +- When running R from the console, always run it with `--quiet` |
| 19 | +- Always run `air format .` after generating code |
| 20 | + |
| 21 | +### Testing |
| 22 | + |
| 23 | +- Use `devtools::test()` to run all tests |
| 24 | +- Use `devtools::test_file("tests/testthat/test-filename.R")` to run tests in a specific file |
| 25 | +- DO NOT USE `devtools::test_active_file()` |
| 26 | +- All testing functions automatically load code; you don't need to load it yourself. |
| 27 | + |
| 28 | +- All new code should have an accompanying test. |
| 29 | +- Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R`. |
| 30 | +- If there are existing tests, place new tests next to similar existing tests. |
| 31 | + |
| 32 | +### Documentation |
| 33 | + |
| 34 | +- Run `devtools::document()` after changing any roxygen2 docs. |
| 35 | +- Every user-facing function should be exported and have roxygen2 documentation. |
| 36 | +- Whenever you add a new documentation file, make sure to also add the topic name to `_pkgdown.yml`. |
| 37 | +- Run `pkgdown::check_pkgdown()` to check that all topics are included in the reference index. |
| 38 | +- Use sentence case for all headings |
| 39 | + |
| 40 | +## Core Architecture |
| 41 | + |
| 42 | +### Main Components |
| 43 | + |
| 44 | +1. **Abstract Syntax Trees** (`R/ast.R`): |
| 45 | + - `ast()` - Visualizes R expression structure as a tree |
| 46 | + - Recursively processes calls, symbols, and literals |
| 47 | + - Uses rlang for quosure handling and expression manipulation |
| 48 | + - Output formatting handled by tree display utilities |
| 49 | + |
| 50 | +2. **Reference Tracking** (`R/ref.R`, `src/address.cpp`): |
| 51 | + - `ref()` - Shows memory addresses and shared references |
| 52 | + - `obj_addr()`, `obj_addrs()` - Get memory locations of objects |
| 53 | + - Tracks how objects are shared across data structures |
| 54 | + - Handles lists, environments, and optionally character vectors (global string pool) |
| 55 | + - Uses depth-first search with seen tracking to avoid infinite loops |
| 56 | + |
| 57 | +3. **Object Size Calculation** (`R/size.R`, `src/size.cpp`): |
| 58 | + - `obj_size()` - Computes memory size accounting for shared references |
| 59 | + - `obj_sizes()` - Shows individual contributions of multiple objects |
| 60 | + - Handles ALTREP objects correctly (R 3.5+) |
| 61 | + - Smart environment handling: stops at global, base, empty, and namespace environments |
| 62 | + - C++ implementation traverses object tree with deduplication |
| 63 | + |
| 64 | +4. **Call Stack Trees** (`R/cst.R`): |
| 65 | + - `cst()` - Displays call stack relationships |
| 66 | + - Wrapper around `rlang::trace_back()` with simplified output |
| 67 | + - Shows how frames are connected through parent relationships |
| 68 | + |
| 69 | +5. **Low-Level Inspection** (`R/sxp.R`, `src/inspect.cpp`): |
| 70 | + - `sxp()` - Deep inspection of C-level SEXP structures |
| 71 | + - Recursive descent into R's internal data structures |
| 72 | + - Optional expansion of: character pool, ALTREP, environments, calls, bytecode |
| 73 | + - Returns structured list with metadata (type, length, address, named status, etc.) |
| 74 | + |
| 75 | +6. **Memory Utilities** (`R/mem.R`): |
| 76 | + - `mem_used()` - Current R memory usage via `gc()` |
| 77 | + - Platform-aware node size calculation (32-bit vs 64-bit) |
| 78 | + |
| 79 | +7. **Generic Tree Printing** (`R/tree.R`): |
| 80 | + - `tree()` - General-purpose tree printer for nested lists |
| 81 | + - Highly customizable (depth, length limits, value/class printers) |
| 82 | + - Handles environments with cycle detection |
| 83 | + - Attribute display support |
| 84 | + |
| 85 | +### Key Design Patterns |
| 86 | + |
| 87 | +- **Tree Visualization**: Consistent tree-based output across all functions using shared utilities (`R/tree.R`, `R/utils.R`) |
| 88 | +- **C++ Integration**: Performance-critical operations (memory addresses, size calculation, SEXP inspection) implemented in C++ via cpp11 |
| 89 | +- **Reference Tracking**: Inspection functions use `seen` sets to handle cycles and shared references |
| 90 | +- **Lazy Evaluation**: `ast()` and `obj_addr()` capture their arguments unevaluated via rlang, so `ast()` can display the expression itself and `obj_addr()` avoids taking an unnecessary reference to the object |
| 91 | +- **Testing Stability**: Address normalization in tests (sequential IDs instead of actual pointers) |
| 92 | + |
| 93 | +### File Organization |
| 94 | + |
| 95 | +- `R/` - R source code organized by main user-facing functions |
| 96 | + - `ast.R`, `ref.R`, `size.R`, `cst.R`, `sxp.R` - Main visualization functions |
| 97 | + - `mem.R` - Memory utilities |
| 98 | + - `address.R` - Address helper functions |
| 99 | + - `tree.R` - Generic tree printing infrastructure |
| 100 | + - `utils.R` - Shared utilities (string, display, box characters) |
| 101 | +- `src/` - C++ source code using cpp11 |
| 102 | + - `address.cpp` - Memory address extraction |
| 103 | + - `size.cpp` - Object size calculation with tree traversal |
| 104 | + - `inspect.cpp` - Deep SEXP inspection |
| 105 | + - `utils.h` - Shared C++ utilities |
| 106 | +- `tests/testthat/` - Comprehensive test suite |
| 107 | + |
| 108 | +### C++ Implementation Details |
| 109 | + |
| 110 | +All C++ code uses the cpp11 interface for R integration: |
| 111 | +- Uses `std::set<SEXP>` for tracking seen objects during traversal |
| 112 | +- Implements custom vector size calculation matching R's memory allocation strategy |
| 113 | +- Handles ALTREP objects (R 3.5+) with conditional compilation |
| 114 | +- Recursive tree traversal with depth limits to prevent stack overflow |
| 115 | +- Namespace and special environment detection to avoid infinite recursion |
| 116 | + |
| 117 | +This codebase prioritizes accurate visualization of R's internal structures while maintaining performance through C++ implementation of core algorithms. |
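
The "seen set" deduplication described under C++ Implementation Details can be illustrated with a minimal, self-contained sketch. This is not lobstr's actual code: `Node`, `total_size()`, and `example_shared_total()` are hypothetical names, and a plain struct stands in for R's `SEXP` so the example compiles without R headers. It only shows the general technique of counting each reachable object once while traversing a structure with shared children.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <vector>

// Hypothetical stand-in for an R object; the real code walks SEXPs.
struct Node {
  std::size_t bytes;                            // size of this node alone
  std::vector<std::shared_ptr<Node>> children;  // children may be shared
};

// Sum the sizes of all nodes reachable from `node`, counting each node once.
std::size_t total_size(const Node* node, std::set<const Node*>& seen) {
  if (node == nullptr) return 0;
  // insert() returns {iterator, bool}; `false` means the node was already
  // counted, which also stops the recursion if the structure has a cycle.
  if (!seen.insert(node).second) return 0;

  std::size_t total = node->bytes;
  for (const auto& child : node->children) {
    total += total_size(child.get(), seen);
  }
  return total;
}

// Convenience overload that owns the seen set for one traversal.
std::size_t total_size(const Node& root) {
  std::set<const Node*> seen;
  return total_size(&root, seen);
}

// Example: a root that holds the same child twice.
std::size_t example_shared_total() {
  auto shared = std::make_shared<Node>(Node{100, {}});
  Node root{8, {shared, shared}};
  return total_size(root);  // the shared child contributes only once
}
```

In this sketch the shared child is counted a single time, which mirrors how `obj_size()` is described as accounting for shared references. A traversal without the seen set would count it twice.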