Skip to content

[Search Improvements] Phase 2: Full-Text Search Implementation #7011

@carlesarnal

Description

@carlesarnal

Parent Epic

Part of #7009 - SQL Search Functionality Improvements

Overview

Implement native full-text search capabilities using database-specific features for PostgreSQL, MySQL, and SQL Server, with graceful fallback for H2.

Goals

  1. Create search capabilities abstraction layer
  2. Implement PostgreSQL full-text search with tsvector/tsquery
  3. Implement MySQL FULLTEXT support
  4. Implement SQL Server Full-Text Search
  5. Add relevance ranking to search results
  6. Graceful fallback for H2

Tasks

1. Search Capabilities Abstraction

  • Create SearchCapabilities interface:
    public interface SearchCapabilities {
        boolean supportsFullTextSearch();
        boolean supportsFuzzyMatch();
        boolean supportsRelevanceRanking();
        boolean supportsTrigram();
        FullTextSearchEngine getSearchEngine();
    }
  • Implement per-database capabilities detection
  • Add FullTextSearchEngine enum (NONE, POSTGRESQL_FTS, MYSQL_FTS, MSSQL_FTS)

2. PostgreSQL Full-Text Search

  • Add search vector columns to schema:
    ALTER TABLE artifacts ADD COLUMN search_vector tsvector;
    ALTER TABLE versions ADD COLUMN search_vector tsvector;
    ALTER TABLE groups ADD COLUMN search_vector tsvector;
  • Create GIN indexes for search vectors
  • Create triggers to auto-update search vectors on INSERT/UPDATE
  • Implement migration script to populate existing data
  • Add weighted search with field priorities (name: A, description: B, artifactId: A)
  • Implement plainto_tsquery for user queries
  • Add ts_rank for relevance scoring

3. MySQL Full-Text Search

  • Add FULLTEXT indexes:
    ALTER TABLE artifacts ADD FULLTEXT INDEX FT_artifacts_search (name, description);
    ALTER TABLE versions ADD FULLTEXT INDEX FT_versions_search (name, description);
  • Implement natural language mode queries
  • Add relevance scoring with MATCH...AGAINST
  • Handle minimum word length limitations

4. SQL Server Full-Text Search

  • Create Full-Text Catalog:
    CREATE FULLTEXT CATALOG RegistrySearchCatalog AS DEFAULT;
    CREATE FULLTEXT INDEX ON artifacts(name, description) KEY INDEX PK_artifacts;
  • Implement CONTAINS/FREETEXT queries
  • Add relevance ranking

5. H2 Fallback

  • Document limitations for in-memory database
  • Ensure optimized LIKE queries are used
  • Apply case-insensitive comparison consistently

6. New Search Filter Type

  • Add fullText to SearchFilterType enum
  • Implement SearchFilter.ofFullText(String query) factory method
  • Update AbstractSqlRegistryStorage.searchArtifacts() to handle fullText filter

7. REST API Updates

  • Add q query parameter to search endpoints for full-text queries
  • Update OpenAPI specification
  • Add relevanceScore field to search result DTOs
  • Document new search capabilities in API docs

8. Testing

  • Unit tests for each database's full-text implementation
  • Integration tests for full-text search on PostgreSQL
  • Integration tests for relevance ranking
  • Test search with special characters and edge cases
  • Performance comparison tests

Files to Modify

  • app/src/main/java/io/apicurio/registry/storage/dto/SearchFilterType.java
  • app/src/main/java/io/apicurio/registry/storage/dto/SearchFilter.java
  • app/src/main/java/io/apicurio/registry/storage/dto/SearchedArtifactDto.java
  • app/src/main/java/io/apicurio/registry/storage/impl/sql/AbstractSqlRegistryStorage.java
  • app/src/main/java/io/apicurio/registry/storage/impl/sql/SqlStatements.java
  • app/src/main/java/io/apicurio/registry/storage/impl/sql/PostgreSQLSqlStatements.java
  • app/src/main/java/io/apicurio/registry/storage/impl/sql/MySQLSqlStatements.java
  • app/src/main/java/io/apicurio/registry/storage/impl/sql/SQLServerSqlStatements.java
  • app/src/main/java/io/apicurio/registry/rest/v3/impl/SearchResourceImpl.java
  • Database upgrade DDL files

New Files

  • app/src/main/java/io/apicurio/registry/storage/impl/sql/search/SearchCapabilities.java
  • app/src/main/java/io/apicurio/registry/storage/impl/sql/search/FullTextSearchEngine.java
  • app/src/main/resources/io/apicurio/registry/storage/impl/sql/upgrades/103/postgresql.upgrade.ddl
  • app/src/main/resources/io/apicurio/registry/storage/impl/sql/upgrades/103/mysql.upgrade.ddl
  • app/src/main/resources/io/apicurio/registry/storage/impl/sql/upgrades/103/mssql.upgrade.ddl

Acceptance Criteria

  • Full-text search works on PostgreSQL with relevance ranking
  • Full-text search works on MySQL with relevance ranking
  • Full-text search works on SQL Server with relevance ranking
  • H2 gracefully falls back to LIKE-based search
  • New q parameter available on search REST endpoints
  • Search results include relevance score when applicable
  • Existing search functionality unchanged
  • All tests pass on all supported databases

Configuration

apicurio.search.fulltext.enabled=true
apicurio.search.fulltext.language=english
apicurio.search.fulltext.min-word-length=2

API Changes

# New query parameter
- name: q
  in: query
  description: Full-text search query across name, description, and artifactId
  schema:
    type: string

# New response field
SearchedArtifact:
  properties:
    relevanceScore:
      type: number
      description: Search relevance score (0.0-1.0), only present for full-text queries

Labels: enhancement, storage, search, full-text

Metadata

Metadata

Assignees

No one assigned

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions