Parent Epic
Part of #7009 - SQL Search Functionality Improvements
Overview
Improve search performance and scalability through caching, optimized pagination, and async search capabilities for large datasets.
Goals
- Implement search result caching
- Add cursor-based pagination for efficient deep pagination
- Implement async search for large result sets
- Performance testing and tuning
- Documentation and monitoring
Tasks
1. Search Result Caching
- Add caching infrastructure for search queries:
    @ConfigProperty(name = "apicurio.search.cache.enabled", defaultValue = "false")
    boolean searchCacheEnabled;

    @ConfigProperty(name = "apicurio.search.cache.ttl-seconds", defaultValue = "60")
    int searchCacheTtlSeconds;

    @ConfigProperty(name = "apicurio.search.cache.max-entries", defaultValue = "1000")
    int searchCacheMaxEntries;
- Implement cache key generation from search parameters
- Add cache invalidation on data changes (artifact create/update/delete)
- Use existing Quarkus caching infrastructure (Caffeine)
- Add cache hit/miss metrics
- Implement cache warm-up for common queries (optional)
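The key-generation, TTL, invalidation, and hit/miss items above could look roughly like the following. This is a minimal, dependency-free sketch, not the proposed implementation: the issue calls for the existing Quarkus/Caffeine infrastructure, and the class name `SearchCacheSketch`, the `cacheKey` parameters, and the injectable clock are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Caffeine-backed search cache. */
class SearchCacheSketch<V> {

    /** Builds a canonical key so equivalent searches share one entry (filter values assumed non-null). */
    static String cacheKey(Map<String, String> filters, String orderBy, int offset, int limit) {
        StringBuilder key = new StringBuilder();
        // TreeMap sorts filter names, so the order in which filters were supplied does not matter.
        for (Map.Entry<String, String> e : new TreeMap<>(filters).entrySet()) {
            // Length prefixes keep keys unambiguous when values contain '=' or ';'.
            key.append(e.getKey().length()).append(':').append(e.getKey()).append('=')
               .append(e.getValue().length()).append(':').append(e.getValue()).append(';');
        }
        return key.append("|orderBy=").append(orderBy)
                  .append("|offset=").append(offset)
                  .append("|limit=").append(limit).toString();
    }

    private static final class CachedValue<T> {
        final T value;
        final long expiresAtMillis;

        CachedValue(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested without sleeping
    private final Map<String, CachedValue<V>> entries;
    private long hits;
    private long misses;

    SearchCacheSketch(int maxEntries, long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        // An access-ordered LinkedHashMap evicts the least recently used entry once full.
        this.entries = new LinkedHashMap<String, CachedValue<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, CachedValue<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, or runs the loader and caches its result on a miss or expiry. */
    synchronized V get(String key, Supplier<V> loader) {
        long now = clock.getAsLong();
        CachedValue<V> cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis > now) {
            hits++;
            return cached.value;
        }
        misses++;
        V value = loader.get();
        entries.put(key, new CachedValue<>(value, now + ttlMillis));
        return value;
    }

    /** Coarse invalidation, intended to be called on artifact create/update/delete. */
    synchronized void invalidateAll() {
        entries.clear();
    }

    synchronized long hits() { return hits; }

    synchronized long misses() { return misses; }
}
```

Clearing the whole cache on every write is the simplest correct invalidation strategy; finer-grained invalidation would need to know which cached queries a changed artifact can match.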
2. Cursor-Based Pagination
- Create `SearchCursor` class:
    public class SearchCursor {
        private String lastGroupId;
        private String lastArtifactId;
        private Object lastSortValue;

        public String encode() { /* Base64 encode */ }
        public static SearchCursor decode(String cursor) { /* Decode */ }
    }
- Implement keyset/seek pagination:
    -- Instead of OFFSET/LIMIT
    SELECT * FROM artifacts a
    WHERE (a.name, a.groupId, a.artifactId) > (?, ?, ?)
    ORDER BY a.name, a.groupId, a.artifactId
    LIMIT ?
- Add `cursor` and `nextCursor` to search results:
    public class ArtifactSearchResultsDto {
        // Existing
        private List<SearchedArtifactDto> artifacts;
        private Integer count;

        // New
        private String nextCursor;
        private String prevCursor;
    }
- Support both offset-based and cursor-based pagination (backward compatible)
- Add REST API parameter for cursor
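One way the cursor's `encode`/`decode` pair might work is sketched below. The class name `SearchCursorSketch`, the choice to carry the sort value as a `String`, and the dot-delimited payload are assumptions for illustration; the example cursor elsewhere in this issue suggests a Base64-encoded JSON payload, which would need a JSON library and is left out here to keep the sketch self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical sketch of an opaque, URL-safe pagination cursor. */
final class SearchCursorSketch {

    private static final Base64.Encoder ENCODER = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DECODER = Base64.getUrlDecoder();

    final String lastSortValue;
    final String lastGroupId;
    final String lastArtifactId;

    SearchCursorSketch(String lastSortValue, String lastGroupId, String lastArtifactId) {
        this.lastSortValue = lastSortValue;
        this.lastGroupId = lastGroupId;
        this.lastArtifactId = lastArtifactId;
    }

    /** Encodes the three keyset values into a single token safe to use as a query parameter. */
    String encode() {
        // Each field is Base64-encoded on its own, so '.' (not in the URL-safe alphabet)
        // can act as an unambiguous delimiter.
        String payload = b64(lastSortValue) + "." + b64(lastGroupId) + "." + b64(lastArtifactId);
        return b64(payload);
    }

    /** Decodes a token produced by encode(); rejects anything malformed. */
    static SearchCursorSketch decode(String cursor) {
        try {
            String[] parts = unb64(cursor).split("\\.", -1);
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected 3 cursor fields, got " + parts.length);
            }
            return new SearchCursorSketch(unb64(parts[0]), unb64(parts[1]), unb64(parts[2]));
        } catch (IllegalArgumentException e) {
            // Base64 decoding errors also surface as IllegalArgumentException.
            throw new IllegalArgumentException("Invalid search cursor", e);
        }
    }

    private static String b64(String value) {
        return ENCODER.encodeToString(value.getBytes(StandardCharsets.UTF_8));
    }

    private static String unb64(String value) {
        return new String(DECODER.decode(value), StandardCharsets.UTF_8);
    }
}
```

The decoded values would be bound, in sort order, to the placeholders of the keyset query. Row-value comparisons such as `(a, b, c) > (?, ?, ?)` are not accepted by every database (SQL Server, for example), so some backends may need the expanded `a > ? OR (a = ? AND b > ?) OR ...` form instead.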
3. Async Search for Large Datasets
- Create async search infrastructure:
    public class SearchJob {
        private String jobId;
        private SearchJobStatus status;
        private Instant createdOn;
        private Instant completedOn;
        private Integer totalResults;
    }

    public enum SearchJobStatus {
        PENDING, RUNNING, COMPLETED, FAILED, EXPIRED
    }
- Implement async search endpoints:
    @POST
    @Path("/search/artifacts/async")
    public SearchJob startAsyncSearch(SearchRequest request);

    // Returns the full job (status, totalResults, completedOn), matching the example response under API Changes
    @GET
    @Path("/search/jobs/{jobId}")
    public SearchJob getSearchStatus(@PathParam("jobId") String jobId);

    @GET
    @Path("/search/jobs/{jobId}/results")
    public ArtifactSearchResults getSearchResults(
        @PathParam("jobId") String jobId,
        @QueryParam("offset") int offset,
        @QueryParam("limit") int limit
    );
- Store async search results temporarily (configurable TTL)
- Implement job cleanup for expired/completed jobs
- Add progress tracking for long-running searches
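The job lifecycle described above (start, status transitions, temporary result storage, expiry) can be sketched with plain `java.util.concurrent` types. Everything here is hypothetical: `SearchJobManagerSketch`, its nested `Job` class, and the use of `List<String>` in place of real search results are illustrative assumptions, and a real implementation would need persistent or clustered storage rather than an in-memory map.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

enum SearchJobStatus { PENDING, RUNNING, COMPLETED, FAILED, EXPIRED }

/** Hypothetical in-memory job manager; results are plain strings for brevity. */
final class SearchJobManagerSketch {

    static final class Job {
        final String jobId = UUID.randomUUID().toString();
        final Instant createdOn;
        volatile SearchJobStatus status = SearchJobStatus.PENDING;
        volatile Instant completedOn;
        volatile List<String> results = List.of();

        Job(Instant createdOn) {
            this.createdOn = createdOn;
        }
    }

    private final Map<String, Job> jobs = new ConcurrentHashMap<>();
    private final Executor executor;
    private final Duration resultTtl;

    SearchJobManagerSketch(Executor executor, Duration resultTtl) {
        this.executor = executor;
        this.resultTtl = resultTtl;
    }

    /** Registers a job and runs the search on the executor; returns immediately with the job handle. */
    Job start(Supplier<List<String>> search) {
        Job job = new Job(Instant.now());
        jobs.put(job.jobId, job);
        executor.execute(() -> {
            job.status = SearchJobStatus.RUNNING;
            SearchJobStatus outcome;
            try {
                job.results = search.get();
                outcome = SearchJobStatus.COMPLETED;
            } catch (RuntimeException e) {
                outcome = SearchJobStatus.FAILED;
            }
            // Set completedOn before the final status so readers never see COMPLETED without a timestamp.
            job.completedOn = Instant.now();
            job.status = outcome;
        });
        return job;
    }

    Job get(String jobId) {
        return jobs.get(jobId);
    }

    /** Returns one page of stored results; only valid once the job has completed. */
    List<String> page(String jobId, int offset, int limit) {
        Job job = jobs.get(jobId);
        if (job == null || job.status != SearchJobStatus.COMPLETED) {
            throw new IllegalStateException("Results not available for job " + jobId);
        }
        int size = job.results.size();
        int from = Math.min(Math.max(offset, 0), size);
        int to = (int) Math.min((long) from + Math.max(limit, 0), size);
        return job.results.subList(from, to);
    }

    /** Marks finished jobs older than the TTL as EXPIRED and drops their results; returns how many expired. */
    int expire(Instant now) {
        int expired = 0;
        for (Job job : jobs.values()) {
            Instant done = job.completedOn;
            if (done != null && job.status != SearchJobStatus.EXPIRED && done.plus(resultTtl).isBefore(now)) {
                job.results = List.of();
                job.status = SearchJobStatus.EXPIRED;
                expired++;
            }
        }
        return expired;
    }
}
```

Taking an `Executor` in the constructor keeps the threading policy out of the manager: production code could pass a bounded thread pool, while tests can pass `Runnable::run` to execute jobs synchronously.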
4. Query Optimization
- Implement query plan analysis and logging:
    @ConfigProperty(name = "apicurio.search.explain.enabled", defaultValue = "false")
    boolean explainEnabled;
- Add slow query detection and logging
- Implement query complexity limits to prevent resource exhaustion
- Add database connection pool tuning recommendations
- Optimize N+1 queries for label fetching
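Slow query detection could be as simple as a timing wrapper around each search call. The sketch below is an assumption about how that might look, not existing registry code: `SlowQueryDetectorSketch` and `timed` are hypothetical names, the threshold corresponds to the proposed `apicurio.search.slow-query-threshold-ms` property, and the counter stands in for the `apicurio_search_slow_queries_total` metric.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;
import java.util.logging.Logger;

/** Hypothetical wrapper that times a query and flags it when it exceeds a threshold. */
final class SlowQueryDetectorSketch {

    private static final Logger LOG = Logger.getLogger(SlowQueryDetectorSketch.class.getName());

    private final long thresholdMs;
    private final LongSupplier nanoClock; // System::nanoTime in production, a fake clock in tests
    private final AtomicLong slowQueries = new AtomicLong();

    SlowQueryDetectorSketch(long thresholdMs, LongSupplier nanoClock) {
        this.thresholdMs = thresholdMs;
        this.nanoClock = nanoClock;
    }

    /** Runs the query, measuring elapsed time even when the query throws. */
    <T> T timed(String description, Supplier<T> query) {
        long start = nanoClock.getAsLong();
        try {
            return query.get();
        } finally {
            long elapsedMs = (nanoClock.getAsLong() - start) / 1_000_000L;
            if (elapsedMs > thresholdMs) {
                slowQueries.incrementAndGet();
                LOG.warning(() -> "Slow search query (" + elapsedMs + " ms > "
                        + thresholdMs + " ms): " + description);
            }
        }
    }

    long slowQueryCount() {
        return slowQueries.get();
    }
}
```

A caller would wrap the storage call, for example `detector.timed("searchArtifacts name=foo", () -> storage.search(...))`, where `storage.search` is a placeholder for whatever method actually executes the SQL.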
5. Performance Testing
- Create performance test suite:
- 10,000 artifacts search benchmark
- 100,000 artifacts search benchmark
- Concurrent search load testing
- Deep pagination performance tests
- Full-text search performance comparison
- Establish performance baselines
- Document performance characteristics per database
- Create performance regression tests for CI
6. Monitoring & Observability
- Add search-specific metrics:
    apicurio_search_requests_total
    apicurio_search_duration_seconds
    apicurio_search_cache_hits_total
    apicurio_search_cache_misses_total
    apicurio_search_results_count
    apicurio_search_slow_queries_total
- Add search query logging (configurable)
- Integrate with existing Micrometer metrics
- Add Grafana dashboard template for search metrics
7. Documentation
- Document search performance best practices
- Add capacity planning guidelines
- Document database-specific tuning recommendations
- Update configuration reference
- Add troubleshooting guide for slow searches
Files to Modify
- app/src/main/java/io/apicurio/registry/storage/impl/sql/AbstractSqlRegistryStorage.java
- app/src/main/java/io/apicurio/registry/storage/dto/ArtifactSearchResultsDto.java
- app/src/main/java/io/apicurio/registry/rest/v3/impl/SearchResourceImpl.java
- app/src/main/java/io/apicurio/registry/rest/v3/SearchResource.java
- app/src/main/resources/application.properties
- common/src/main/resources/META-INF/openapi.json
New Files
- app/src/main/java/io/apicurio/registry/storage/search/SearchCache.java
- app/src/main/java/io/apicurio/registry/storage/search/SearchCursor.java
- app/src/main/java/io/apicurio/registry/storage/search/SearchJob.java
- app/src/main/java/io/apicurio/registry/storage/search/SearchJobManager.java
- app/src/main/java/io/apicurio/registry/metrics/SearchMetrics.java
- integration-tests/src/test/java/io/apicurio/tests/performance/SearchPerformanceIT.java
Acceptance Criteria
- Search caching reduces database load for repeated queries
- Cursor-based pagination performs consistently regardless of offset
- Async search available for queries that may take >30 seconds
- Performance benchmarks documented
- Slow query detection and logging operational
- Search metrics available in Prometheus format
- Documentation complete
Configuration
# Caching
apicurio.search.cache.enabled=false
apicurio.search.cache.ttl-seconds=60
apicurio.search.cache.max-entries=1000
# Pagination
apicurio.search.max-results=1000
apicurio.search.default-limit=20
apicurio.search.cursor.enabled=true
# Async search
apicurio.search.async.enabled=true
apicurio.search.async.threshold-ms=5000
apicurio.search.async.result-ttl-minutes=30
# Performance
apicurio.search.slow-query-threshold-ms=1000
apicurio.search.explain.enabled=false
apicurio.search.max-complexity=100
Performance Targets
| Scenario | Target |
|---|---|
| Simple search (10k artifacts) | < 100ms |
| Full-text search (10k artifacts) | < 200ms |
| Faceted search (10k artifacts) | < 300ms |
| Deep pagination (page 1000) | < 200ms with cursor |
| Concurrent searches (100 req/s) | < 500ms p99 |
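To check a target such as "< 500ms p99" in a benchmark, the recorded latencies need to be reduced to a percentile. The helper below is a small sketch using the nearest-rank definition; the class and method names are hypothetical, and a real test suite might rely on the histogram support of its metrics library instead.

```java
import java.util.Arrays;

/** Hypothetical helper for checking latency targets in a performance test. */
final class LatencyPercentileSketch {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] latenciesMs, double p) {
        if (latenciesMs.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("Need at least one sample and 0 < p <= 100");
        }
        long[] sorted = latenciesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    /** True when the given percentile is strictly below the target, as in "< 500ms p99". */
    static boolean meetsTarget(long[] latenciesMs, double p, long targetMs) {
        return percentile(latenciesMs, p) < targetMs;
    }
}
```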
API Changes
Cursor Pagination
GET /search/artifacts?cursor=eyJsYXN0R3JvdXBJZCI6...&limit=20
Response:
{
"artifacts": [...],
"count": 5000,
"nextCursor": "eyJsYXN0R3JvdXBJZCI6...",
"prevCursor": "eyJsYXN0R3JvdXBJZCI6..."
}
Async Search
POST /search/artifacts/async
{
"filters": {...},
"orderBy": "name",
"limit": 10000
}
Response:
{
"jobId": "abc123",
"status": "PENDING",
"createdOn": "2024-01-15T10:30:00Z"
}
GET /search/jobs/abc123
Response:
{
"jobId": "abc123",
"status": "COMPLETED",
"totalResults": 8542,
"completedOn": "2024-01-15T10:30:05Z"
}
GET /search/jobs/abc123/results?offset=0&limit=100
Labels: enhancement, storage, search, performance, scalability