Skip to content

[EPIC] SQL Search Functionality Improvements #7009

@carlesarnal

Description

@carlesarnal

Epic Overview

This epic tracks the comprehensive improvements to address existing limitations in the search functionality of Apicurio Registry's SQL storage implementation.

Problem Statement

The current search implementation uses basic SQL LIKE queries which have significant limitations:

Performance Limitations

  • Full table scans: LIKE '%term%' patterns cannot use B-tree indexes
  • No index optimization: Current indexes on name/description are B-tree, not optimized for substring search
  • Correlated subqueries: Label and content searches use EXISTS with correlated subqueries
  • N+1 query pattern: Labels are fetched separately after main query results

Feature Limitations

Feature Current State
Full-text search Not supported
Relevance ranking Not implemented
Multi-field search Not supported
Fuzzy matching Not supported
Phrase search Not supported
Boolean operators Limited (only AND)
Content search Hash-based only
Faceted search Not supported
Date range queries Not supported
Auto-complete/suggestions Not implemented

Proposed Solution

A phased approach to address limitations while maintaining backward compatibility:

Phase 1: Query Optimization (Foundation)

  • Add trigram indexes for PostgreSQL
  • Implement unified count with window functions
  • Optimize label search queries
  • Add database-specific query optimizations

Phase 2: Full-Text Search (Core Enhancement)

  • Implement PostgreSQL full-text search with tsvector/tsquery
  • Implement MySQL FULLTEXT support
  • Implement SQL Server Full-Text Search
  • Graceful fallback for H2
  • Add relevance ranking to search results

Phase 3: Advanced Search Features

  • Add new search filter types (owner, date ranges, hasLabels, hasReferences)
  • Implement query expression parser for advanced queries
  • Add faceted search support
  • Implement search suggestions/auto-complete

Phase 4: Performance & Scalability

  • Implement search result caching
  • Add cursor-based pagination
  • Implement async search for large datasets
  • Performance testing and tuning

Related Files

  • AbstractSqlRegistryStorage.java - Main search implementation
  • SearchFilterType.java - Filter type enumeration
  • SearchFilter.java - Filter model class
  • SqlStatements.java - SQL statement interface
  • SearchResourceImpl.java - REST API implementation
  • Database DDL files (postgresql.ddl, mysql.ddl, mssql.ddl, h2.ddl)

Design Document

Full design specification available at: docs/sql-search-improvements-design.md

Sub-Issues

Acceptance Criteria

  • All search queries optimized for large datasets (10k+ artifacts)
  • Full-text search available for PostgreSQL, MySQL, and SQL Server
  • Graceful degradation for unsupported databases (H2)
  • Backward compatible REST API with new optional parameters
  • Comprehensive test coverage for all search scenarios
  • Documentation updated with new search capabilities
  • Migration scripts for database schema changes

Labels: enhancement, storage, search

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions