🎉 doubletake v1.0.0 - First Official Release!

We're incredibly excited to announce the first official release of doubletake! 🚀

After months of development, rigorous testing, and careful optimization, we're proud to bring you a powerful, flexible, and production-ready library for intelligent PII detection and replacement in Python.

🌟 What is doubletake?

doubletake is a library that automatically detects and replaces Personally Identifiable Information (PII) in your data structures. Whether you're anonymizing datasets for testing, protecting sensitive information in logs, or working toward GDPR compliance, doubletake handles the detection and masking for you.

✨ Key Features in v1.0.0

🚀 Dual-Strategy Architecture

JSONGrepper: Lightning-fast JSON serialization + regex replacement for simple use cases
DataWalker: Flexible recursive tree traversal with full context for advanced scenarios
Automatic Strategy Selection: The library intelligently chooses the optimal approach based on your configuration
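
To make the two strategies concrete, here is a minimal standard-library sketch of each idea. It is not doubletake's implementation: the function names `grep_mask` and `walk_mask` and the single simplified email pattern are assumptions chosen for illustration.

```python
import json
import re

# Hypothetical, simplified email pattern; the library's built-in patterns may differ.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def star(match):
    """Replace every character of the match with an asterisk."""
    return "*" * len(match.group(0))


def grep_mask(data):
    """Serialize-and-substitute: one regex pass over the JSON text."""
    return json.loads(EMAIL.sub(star, json.dumps(data)))


def walk_mask(node):
    """Recursive traversal: visit each value and build a masked copy."""
    if isinstance(node, dict):
        return {key: walk_mask(value) for key, value in node.items()}
    if isinstance(node, list):
        return [walk_mask(item) for item in node]
    if isinstance(node, str):
        return EMAIL.sub(star, node)
    return node


record = {"id": 7, "contacts": [{"email": "jane@example.com"}]}
grep_result = grep_mask(record)
walk_result = walk_mask(record)
```

Both calls produce the same masked structure here. The serialized approach does a single pass over one string but has no notion of which key it is inside, while the traversal knows where each value lives and can make per-field decisions.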

🎯 Smart PII Detection

Built-in patterns for the most common PII types:

📧 Email addresses (user@example.com)
📱 Phone numbers (555-123-4567, (555) 123-4567)
🆔 Social Security Numbers (123-45-6789)
💳 Credit card numbers (4532-1234-5678-9012)
🌐 IP addresses (192.168.1.1)
🔗 URLs (https://example.com/path)
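
As a rough illustration of what such patterns can look like, the sketch below defines deliberately simplified regular expressions for the six types and classifies sample values in the style of the list above. The `PATTERNS` table and the `detect` helper are assumptions for illustration, not doubletake's built-in patterns, and production-grade PII matching needs stricter rules (for example, range checks on IP octets or Luhn checks on card numbers).

```python
import re

# Simplified, illustrative patterns; NOT the ones shipped with the library.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"https?://[^\s\"']+"),
}


def detect(text):
    """Return the names of all patterns that match the entire string."""
    return [name for name, rx in PATTERNS.items() if rx.fullmatch(text)]


samples = [
    "user@example.com",
    "555-123-4567",
    "(555) 123-4567",
    "123-45-6789",
    "4532-1234-5678-9012",
    "192.168.1.1",
    "https://example.com/path",
]
labels = {sample: detect(sample) for sample in samples}
```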

🔧 Highly Configurable

Custom Patterns: Add your own regex patterns for domain-specific PII
Allowed Lists: Exclude certain pattern types from replacement
Path Targeting: Precisely target specific data paths using dot notation
Flexible Replacement: Choose between asterisks, custom characters, or realistic fake data
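
The sketch below shows how custom patterns, an allowed list, and a custom mask character can interact. It is a standalone illustration: `mask_text` and its parameters `custom_patterns`, `allowed`, and `mask_char` are hypothetical names, not doubletake's constructor options, so check the project documentation for the actual configuration keys.

```python
import re

# Two simplified built-in patterns, for illustration only.
BUILTIN = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_text(text, custom_patterns=None, allowed=(), mask_char="*"):
    """Mask every match of the active patterns, keeping separators readable."""
    # Pattern types named in `allowed` are left untouched.
    active = {name: rx for name, rx in BUILTIN.items() if name not in allowed}
    # Domain-specific regexes are added alongside the built-ins.
    for index, raw in enumerate(custom_patterns or []):
        active[f"custom_{index}"] = re.compile(raw)

    def hide(match):
        # Keep punctuation such as "@", "." and "-" so the shape stays recognizable.
        return re.sub(r"[A-Za-z0-9]", mask_char, match.group(0))

    for rx in active.values():
        text = rx.sub(hide, text)
    return text


line = "order ORD-4821 for a@b.io, ssn 123-45-6789"
default = mask_text(line)
tuned = mask_text(line, custom_patterns=[r"ORD-\d{4}"], allowed=("email",), mask_char="#")
```

With the defaults, the email and SSN are masked and the order number is kept. With the tuned settings, the order number and SSN are masked with `#` and the email passes through unchanged.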

📊 Realistic Fake Data Generation

# Instead of: email: "****@******.***"
# Get: email: "jane.smith@example.org"

🌳 Deep Structure Support

Handle complex nested dictionaries and lists automatically
Preserve data structure and non-PII content perfectly
Breadcrumb navigation for context-aware processing
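
Breadcrumbs are simply the path of keys and list indexes leading to a value. A minimal sketch of the idea, assuming a hypothetical generator named `walk` (the library's internal traversal may be structured differently):

```python
def walk(node, breadcrumbs=()):
    """Yield (breadcrumbs, value) for every leaf in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, breadcrumbs + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from walk(item, breadcrumbs + (index,))
    else:
        # A leaf: report the path taken to reach it alongside the value.
        yield breadcrumbs, node


payload = {"customer": {"name": "Ann", "contacts": [{"phone": "555-123-4567"}]}}
leaves = list(walk(payload))
```

Each leaf arrives with its full path, for example `("customer", "contacts", 0, "phone")`, which is what makes context-aware decisions possible.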

🛡️ Type Safe & Robust

Full type hints for excellent IDE support
Comprehensive input validation
100% test coverage with rigorous edge case testing

🚀 Quick Start

pip install doubletake

from doubletake import DoubleTake

# Initialize with default settings
db = DoubleTake()

# Your data with PII
data = [
    {
        "user_id": 12345,
        "name": "John Doe",
        "email": "john.doe@example.com",
        "phone": "555-123-4567",
        "ssn": "123-45-6789"
    }
]

# Replace PII automatically
masked_data = db.mask_data(data)
# Result: email becomes "****@******.***", phone becomes "***-***-****"

🎛️ Advanced Configuration Examples

Generate Realistic Fake Data

db = DoubleTake(use_faker=True)
# Emails become generated addresses such as jane.smith@example.org
# Phones become: +1-555-234-5678

Custom Replacement Logic

def custom_replacer(item, key, pattern_type, breadcrumbs):
    if pattern_type == 'email':
        return "***REDACTED_EMAIL***"
    elif pattern_type == 'ssn':
        return "XXX-XX-XXXX"
    return "***CLASSIFIED***"

db = DoubleTake(callback=custom_replacer)
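
To show how a callback with this four-argument signature might be driven, here is a standalone sketch. The walker `redact_with_callback` is hypothetical and only detects SSNs, and the meaning assigned to each argument (`item` as the containing dict or list, `breadcrumbs` as the path to that container) is an assumption rather than documented doubletake behavior.

```python
import re

# Simplified SSN pattern used only for this illustration.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_with_callback(node, callback, breadcrumbs=()):
    """Walk dicts and lists in place; let the callback pick each replacement."""
    pairs = node.items() if isinstance(node, dict) else enumerate(node)
    for key, value in list(pairs):
        if isinstance(value, (dict, list)):
            redact_with_callback(value, callback, breadcrumbs + (key,))
        elif isinstance(value, str) and SSN.search(value):
            # Assumed argument meaning: container, key, pattern name, path so far.
            node[key] = callback(node, key, "ssn", breadcrumbs)
    return node


def describe_location(item, key, pattern_type, breadcrumbs):
    """Example callback: report what was removed and where."""
    path = ".".join(str(part) for part in (*breadcrumbs, key))
    return f"[{pattern_type} removed at {path}]"


data = [{"billing": {"ssn": "123-45-6789", "plan": "pro"}}]
redacted = redact_with_callback(data, describe_location)
```

Because the callback receives the pattern type and the location, it can apply different policies per field, such as keeping a marker for auditing in one place and a fixed placeholder in another.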

Precise Path Targeting

# Only replace PII at specific locations
db = DoubleTake(known_paths=[
    'customer.email',
    'billing.ssn',
    'contacts.emergency.phone'
])
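
Dot-notation targeting boils down to splitting each path and descending one key at a time. The sketch below is a simplified stand-in, not the library's code: `mask_paths` is a hypothetical helper that masks whatever string sits at each path and silently skips paths that do not exist in the record.

```python
import re
from copy import deepcopy


def mask_paths(record, known_paths, mask_char="*"):
    """Return a copy of record with the string values at the given dot paths masked."""
    result = deepcopy(record)
    for path in known_paths:
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            # Stop descending as soon as a path segment is missing.
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict) and isinstance(node.get(leaf), str):
            node[leaf] = re.sub(r"[A-Za-z0-9]", mask_char, node[leaf])
    return result


record = {
    "customer": {"email": "a@b.io", "note": "call 555-123-4567"},
    "billing": {"ssn": "123-45-6789"},
}
masked = mask_paths(
    record,
    ["customer.email", "billing.ssn", "contacts.emergency.phone"],
)
```

Only the two targeted values change. The phone number inside `customer.note` is left alone because that path was not listed.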

🏗️ Architecture Highlights

Intelligent Strategy Selection

# Fast path: Uses JSONGrepper for simple replacements
db = DoubleTake()

# Advanced path: Uses DataWalker for complex scenarios
db = DoubleTake(use_faker=True)
db = DoubleTake(callback=custom_replacer)  # any callable with the callback signature shown earlier
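
The selection rule described here can be pictured as a small decision function. This is a sketch under stated assumptions: the `Settings` dataclass and `choose_strategy` are hypothetical names, and only the two triggers mentioned in these notes (`use_faker` and `callback`) are modeled.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Settings:
    """Hypothetical container for the two options that affect strategy choice."""
    use_faker: bool = False
    callback: Optional[Callable] = None


def choose_strategy(settings):
    """Pick the traversal-based strategy only when per-value context is needed."""
    needs_context = settings.use_faker or settings.callback is not None
    return "DataWalker" if needs_context else "JSONGrepper"


default_choice = choose_strategy(Settings())
faker_choice = choose_strategy(Settings(use_faker=True))
callback_choice = choose_strategy(Settings(callback=print))
```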

Performance Optimized

JSONGrepper: ~0.1s for 10,000 records (simple patterns)
DataWalker: ~0.3s for 10,000 records (with fake data generation)
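
Timings like these depend on hardware and data shape, so it is worth measuring on your own records. A minimal timing harness might look like the following; `mask_record` is a stand-in masking function so the sketch runs without doubletake installed, and `time_masking` is a hypothetical helper. To benchmark the library itself, pass `db.mask_data` as the `mask` argument instead.

```python
import re
import time

# Simplified email pattern for the stand-in masker.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_record(record):
    """Stand-in masker: replace emails in string values with a fixed token."""
    return {
        key: EMAIL.sub("***", value) if isinstance(value, str) else value
        for key, value in record.items()
    }


def time_masking(mask, records, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        mask(records)
        best = min(best, time.perf_counter() - start)
    return best


records = [{"id": i, "email": f"user{i}@example.com"} for i in range(10_000)]
elapsed = time_masking(lambda batch: [mask_record(r) for r in batch], records)
print(f"{elapsed:.3f}s for {len(records):,} records")
```

Taking the best of several runs reduces noise from other processes on the machine.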

🧪 Real-World Use Cases

API Response Sanitization

Perfect for sanitizing API responses before logging:

api_response = {
    "status": "success",
    "data": {
        "users": [
            {"id": 1, "email": "admin@example.com", "role": "admin"}
        ]
    }
}

db = DoubleTake()
safe_response = db.mask_data([api_response])[0]
# Safe to log without exposing PII

Database Export Anonymization

Anonymize database exports for development environments:

db_records = [
    {"patient_id": "PT001", "ssn": "123-45-6789", "email": "patient@example.com"}
]

db = DoubleTake(use_faker=True)
anonymized_records = db.mask_data(db_records)
# Safe for development with realistic data

🔬 Quality & Testing

100% Test Coverage: Comprehensive test suite with edge case coverage
Type Safety: Full type hints and mypy compatibility
Input Validation: Robust configuration validation with clear error messages
Cross-Platform: Tested on Python 3.9+ across major platforms
Performance Tested: Benchmarked with large datasets

🤝 Contributing & Community

We're thrilled to have built something that we hope will be valuable to the Python community! This is just the beginning, and we're excited to see how you use doubletake in your projects.

Get Involved

🐛 Found a bug? Open an issue
💡 Have a feature idea? Start a discussion
🤝 Want to contribute? Check out our contributing guidelines

Development Setup

git clone https://github.com/dual/doubletake.git
cd doubletake
pipenv install --dev
pipenv run test

📋 What's Next?

We have exciting plans for future releases:

Additional PII pattern types (driver's licenses, passport numbers, etc.)
Performance optimizations for extremely large datasets
Plugin architecture for custom PII detectors
Integration with popular data processing frameworks
Enhanced documentation and tutorials

🙏 Acknowledgments

Special thanks to our early adopters, beta testers, and everyone who provided feedback during development. Your input was invaluable in making doubletake robust and user-friendly.

📄 License & Links

License: MIT License
PyPI: https://pypi.org/project/doubletake/
Documentation: https://github.com/dual/doubletake/wiki (coming soon)
Issues: https://github.com/dual/doubletake/issues
Security: See SECURITY.md

🎯 Installation

pip install doubletake

Minimum Requirements: Python 3.9+

Dependencies:

faker (for realistic fake data generation)
msgspec (for high-performance JSON processing)
typing_extensions (for enhanced type support)

Thank you for your interest in doubletake! We can't wait to see what you build with it.

Made with ❤️ for data privacy and security.

— The doubletake Team

Full Changelog: https://github.com/dual/doubletake/commits/1.0.0
