-
Notifications
You must be signed in to change notification settings - Fork 308
Open
Labels
Description
Epic Overview
This epic tracks the comprehensive improvements to address existing limitations in the search functionality of Apicurio Registry's SQL storage implementation.
Problem Statement
The current search implementation uses basic SQL LIKE queries which have significant limitations:
Performance Limitations
- Full table scans:
LIKE '%term%'patterns cannot use B-tree indexes - No index optimization: Current indexes on name/description are B-tree, not optimized for substring search
- Correlated subqueries: Label and content searches use EXISTS with correlated subqueries
- N+1 query pattern: Labels are fetched separately after main query results
Feature Limitations
| Feature | Current State |
|---|---|
| Full-text search | Not supported |
| Relevance ranking | Not implemented |
| Multi-field search | Not supported |
| Fuzzy matching | Not supported |
| Phrase search | Not supported |
| Boolean operators | Limited (only AND) |
| Content search | Hash-based only |
| Faceted search | Not supported |
| Date range queries | Not supported |
| Auto-complete/suggestions | Not implemented |
Proposed Solution
A phased approach to address limitations while maintaining backward compatibility:
Phase 1: Query Optimization (Foundation)
- Add trigram indexes for PostgreSQL
- Implement unified count with window functions
- Optimize label search queries
- Add database-specific query optimizations
Phase 2: Full-Text Search (Core Enhancement)
- Implement PostgreSQL full-text search with tsvector/tsquery
- Implement MySQL FULLTEXT support
- Implement SQL Server Full-Text Search
- Graceful fallback for H2
- Add relevance ranking to search results
Phase 3: Advanced Search Features
- Add new search filter types (owner, date ranges, hasLabels, hasReferences)
- Implement query expression parser for advanced queries
- Add faceted search support
- Implement search suggestions/auto-complete
Phase 4: Performance & Scalability
- Implement search result caching
- Add cursor-based pagination
- Implement async search for large datasets
- Performance testing and tuning
Related Files
AbstractSqlRegistryStorage.java- Main search implementationSearchFilterType.java- Filter type enumerationSearchFilter.java- Filter model classSqlStatements.java- SQL statement interfaceSearchResourceImpl.java- REST API implementation- Database DDL files (postgresql.ddl, mysql.ddl, mssql.ddl, h2.ddl)
Design Document
Full design specification available at: docs/sql-search-improvements-design.md
Sub-Issues
- [Search Improvements] Phase 1: Query Optimization #7010 - Phase 1: Query Optimization
- [Search Improvements] Phase 2: Full-Text Search Implementation #7011 - Phase 2: Full-Text Search Implementation
- [Search Improvements] Phase 3: Advanced Search Features #7012 - Phase 3: Advanced Search Features
- [Search Improvements] Phase 4: Performance & Scalability #7013 - Phase 4: Performance & Scalability
Acceptance Criteria
- All search queries optimized for large datasets (10k+ artifacts)
- Full-text search available for PostgreSQL, MySQL, and SQL Server
- Graceful degradation for unsupported databases (H2)
- Backward compatible REST API with new optional parameters
- Comprehensive test coverage for all search scenarios
- Documentation updated with new search capabilities
- Migration scripts for database schema changes
Labels: enhancement, storage, search
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
No status