Support client-side schema validation using Pydantic #304
Conversation
This is EPIC! Looks great, too. Love the validation test coverage. I had a couple of comments, and probably the only thing I felt strongly about was using an actual JSONPath library for traversing objects.
Looks great! 👍
This PR implements a layered architecture for managing and validating searchable data in Redis, with a clear separation of concerns between schema definition, data validation, and storage operations.

## Key Components

### 1. Schema Definition Layer

- `IndexSchema` provides the blueprint for data structure and constraints
- Defines fields with specific types (TEXT, TAG, NUMERIC, GEO, VECTOR)
- Supports different storage types (HASH, JSON) with appropriate configuration

### 2. Validation Layer

- `SchemaModelGenerator` dynamically creates Pydantic models from schema definitions
- Implements a caching mechanism to avoid redundant model generation
- Maps Redis field types to appropriate Python/Pydantic types
- Provides type-specific validators:
  - VECTOR: validates dimensions and value ranges (e.g., INT8 range checks)
  - GEO: validates geographic coordinate format
  - NUMERIC: rejects boolean values

### 3. Storage Layer

- `BaseStorage` is the abstract class that provides the foundation for Redis operations
- Specialized implementations (`HashStorage`, `JsonStorage`) for different Redis data types
- Enforces schema validation during write operations when validation is enabled
- Implements optimized batch operations using Redis pipelines
- Supports both synchronous and asynchronous interfaces
- Handles key generation, preprocessing, and error handling

### 4. Index Layer

The `SearchIndex` exposes the `validate_on_load` setting, which defaults to `False`.

## Data Flow

### Write Flow

1. Objects are preprocessed and validated against the schema
2. Objects are prepared with appropriate keys
3. Batch writing occurs using Redis pipelines for efficiency
4. TTL (expiration) can be applied if specified

### Read Flow

1. Keys are fetched in batches using pipelines
2. Data is converted from Redis format to Python objects
3. Bytes are automatically converted to appropriate types
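The validation layer described above can be sketched roughly as follows. This is not the PR's actual code: the type map, cache shape, and helper names (`model_for_schema`, `check_int8_vector`, `check_geo`) are illustrative assumptions, using only Pydantic's documented `create_model`:

```python
# Illustrative sketch of a SchemaModelGenerator-style mechanism (hypothetical
# names, not the PR's implementation): build a Pydantic model per schema and
# cache it so repeated loads reuse the same model class.
import re
from typing import Any, Dict, List, Tuple

from pydantic import create_model

# Map Redis-style field types to Python types (example mapping).
TYPE_MAP = {
    "text": str,
    "tag": str,
    "numeric": float,
    "geo": str,
    "vector": List[float],
}

# Cache keyed on the (sorted) field definitions, so equivalent schemas
# resolve to the same generated model class.
_model_cache: Dict[Tuple[Tuple[str, str], ...], Any] = {}


def model_for_schema(fields: Dict[str, str]):
    """Create, or fetch from cache, a Pydantic model for a field->type mapping."""
    key = tuple(sorted(fields.items()))
    if key not in _model_cache:
        defs = {name: (TYPE_MAP[ftype], ...) for name, ftype in fields.items()}
        _model_cache[key] = create_model("SchemaModel", **defs)
    return _model_cache[key]


def check_int8_vector(vec: List[float], dims: int) -> None:
    """Example VECTOR validator: dimension count plus INT8 range check."""
    if len(vec) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(vec)}")
    if any(not -128 <= v <= 127 for v in vec):
        raise ValueError("INT8 vector values must be in [-128, 127]")


GEO_RE = re.compile(r"^-?\d+(\.\d+)?,-?\d+(\.\d+)?$")  # "lon,lat"


def check_geo(value: str) -> None:
    """Example GEO validator: enforce a 'lon,lat' coordinate format."""
    if not GEO_RE.match(value):
        raise ValueError(f"invalid geo format: {value!r}")
```

Because the cache key is built from sorted field definitions, two schemas with the same fields in different order share one generated model class.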
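The storage layer's split between an abstract base and per-data-type implementations might look like the following sketch. All class and method names here are assumptions, not the PR's definitions, and the pipeline is an in-memory stand-in so the example runs without a Redis server:

```python
# Hypothetical sketch of the BaseStorage / HashStorage / JsonStorage split:
# the base class owns key generation and batched writes; subclasses decide
# how one object is written (hash fields vs. a JSON string).
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class FakePipeline:
    """In-memory stand-in for a Redis pipeline: queue commands, apply on execute()."""

    def __init__(self, store: Dict[str, Any]):
        self.store, self.ops = store, []

    def hset(self, key: str, mapping: Dict[str, Any]) -> None:
        self.ops.append(("hset", key, mapping))

    def set(self, key: str, value: str) -> None:
        self.ops.append(("set", key, value))

    def execute(self) -> None:
        for op, key, val in self.ops:
            self.store[key] = dict(val) if op == "hset" else val
        self.ops.clear()


class BaseStorage(ABC):
    """Foundation for write operations: key generation plus batched writes."""

    def __init__(self, prefix: str):
        self.prefix = prefix

    def key_for(self, obj: Dict[str, Any]) -> str:
        return f"{self.prefix}:{obj['id']}"

    @abstractmethod
    def write_one(self, pipe, key: str, obj: Dict[str, Any]) -> None: ...

    def write(self, pipe, objects: List[Dict[str, Any]]) -> None:
        # Queue every object first, then flush in a single round trip.
        for obj in objects:
            self.write_one(pipe, self.key_for(obj), obj)
        pipe.execute()


class HashStorage(BaseStorage):
    def write_one(self, pipe, key, obj):
        pipe.hset(key, mapping=obj)


class JsonStorage(BaseStorage):
    def write_one(self, pipe, key, obj):
        pipe.set(key, json.dumps(obj))
```

With a real `redis-py` client, `FakePipeline` would be replaced by `client.pipeline()`; the batching shape stays the same.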
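The write-flow steps above (preprocess, optionally validate, key, batch, TTL) can be condensed into one sketch. The `prepare_batch` helper, the `doc:` prefix, and the validator callable are all hypothetical illustrations of the flow, not the PR's API:

```python
# Hypothetical sketch of the write flow: preprocess each object, validate it
# only when a validate_on_load-style flag is enabled, assign a key, and carry
# an optional TTL alongside each entry for the batched write.
from typing import Any, Callable, Dict, List, Optional, Tuple


def prepare_batch(
    objects: List[Dict[str, Any]],
    validator: Optional[Callable[[Dict[str, Any]], None]] = None,
    validate_on_load: bool = False,
    ttl: Optional[int] = None,
) -> List[Tuple[str, Dict[str, Any], Optional[int]]]:
    """Return (key, object, ttl) tuples ready for a pipelined write."""
    batch = []
    for obj in objects:
        # Preprocess: drop empty fields before validation and storage.
        clean = {k: v for k, v in obj.items() if v is not None}
        if validate_on_load and validator is not None:
            validator(clean)  # raises on schema violations
        key = f"doc:{clean['id']}"  # illustrative key scheme
        batch.append((key, clean, ttl))
    return batch
```

A pipelined writer would then issue one write (plus an `EXPIRE` when the TTL is set) per tuple before a single `execute()`.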
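The read-flow conversion from raw Redis replies back to Python objects might look like this sketch; the helper name and the numeric-coercion policy are assumptions for illustration:

```python
# Illustrative read-flow helper: a raw Redis hash reply arrives as bytes keys
# and values; decode them and coerce numeric-looking strings to int/float.
from typing import Any, Dict


def from_redis(raw: Dict[Any, Any]) -> Dict[str, Any]:
    """Convert a raw Redis hash reply into a plain Python dict."""
    out: Dict[str, Any] = {}
    for k, v in raw.items():
        key = k.decode() if isinstance(k, bytes) else k
        if isinstance(v, bytes):
            v = v.decode()
        if isinstance(v, str):
            # Try int first, then float; fall back to the string itself.
            try:
                v = int(v)
            except ValueError:
                try:
                    v = float(v)
                except ValueError:
                    pass
        out[key] = v
    return out
```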