You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Adds a WARP.md file to provide guidance for developers working with the doubletake library, covering project overview, architecture, package structure, development commands, key files, PII patterns, CI/CD pipeline, and common development patterns.
Improves PII replacement in data walker
Refactors the string replacement logic in the data walker to handle known and extra patterns separately, allowing for more flexible PII masking.
Adds comprehensive unit tests to validate the string replacement functionality, covering various scenarios including email, phone, SSN, credit card, extra patterns, allowed patterns, and mixed-case PII.
This file provides guidance to WARP (warp.dev) when working with code in this repository.
4
+
5
+
## Project Overview
6
+
7
+
**doubletake** is a Python library for intelligent PII (Personally Identifiable Information) detection and replacement. It provides high-performance processing of complex nested data structures with multiple replacement strategies.
8
+
9
+
### Core Architecture
10
+
11
+
The library uses a **dual-strategy architecture** for optimal performance vs. flexibility:
12
+
13
+
1.**JSONGrepper**: High-performance JSON serialization + regex replacement for simple use cases
14
+
2.**DataWalker**: Recursive tree traversal with full context for advanced features (callbacks, fake data, path targeting)
15
+
16
+
**Strategy Selection Logic**: The main `DoubleTake` class automatically chooses the appropriate processor:
17
+
- Uses `JSONGrepper` when only basic pattern replacement is needed (default settings)
18
+
- Switches to `DataWalker` when advanced features are enabled (`use_faker=True`, custom `callback`, etc.)
19
+
20
+
### Package Structure
21
+
22
+
```
23
+
doubletake/
24
+
├── __init__.py # Main DoubleTake class with auto-strategy selection
25
+
├── searcher/
26
+
│ ├── json_grepper.py # Fast JSON-based PII replacement
27
+
│ └── data_walker.py # Flexible recursive data traversal
0 commit comments