Separate Index and Document Definitions #638

@elicore

The index definition is currently part of the document (or hash) definition in the annotations.
To support advanced indexing strategies, it would be very beneficial to separate the two.

For example, the following:

@Document(value = "ticket", indexName = "ticket_idx")
public class Ticket {
    // Fields
}

uses the document definition to set the index name and define the key prefix at the same time. @IndexingOptions allows finer tuning of the index, but it is still applied as part of the document definition.

Ideally, we would have the following independent from each other:

  1. The index definition itself, mainly the index name and the prefixes it tracks.
  2. The document definition which describes how the fields are serialized and indexed.
  3. The document prefix. Keeping it separate allows one document definition to be reused consistently across prefixes, without needing multiple classes for the same document.
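
As a rough sketch of that separation, the plain-Java model below keeps the three pieces apart and only combines them when building a RediSearch `FT.CREATE` command. The types `FieldSpec` and `IndexSketch` are hypothetical and are not part of Redis OM Spring; the `$.team` path and `TAG` type are assumptions based on the `team` field used later in this issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; none of these exist in Redis OM Spring.

// (2) Document definition: how one field is exposed to the index.
final class FieldSpec {
    final String jsonPath;
    final String alias;
    final String type;

    FieldSpec(String jsonPath, String alias, String type) {
        this.jsonPath = jsonPath;
        this.alias = alias;
        this.type = type;
    }
}

final class IndexSketch {
    // (1) The index name and (3) the key prefixes are passed in separately from
    // the schema, so one schema can back several indexes or prefixes.
    static List<String> createCommand(String indexName, List<String> prefixes, List<FieldSpec> schema) {
        List<String> cmd = new ArrayList<>();
        cmd.add("FT.CREATE");
        cmd.add(indexName);
        cmd.add("ON");
        cmd.add("JSON");
        cmd.add("PREFIX");
        cmd.add(String.valueOf(prefixes.size()));
        cmd.addAll(prefixes);
        cmd.add("SCHEMA");
        for (FieldSpec field : schema) {
            cmd.add(field.jsonPath);
            cmd.add("AS");
            cmd.add(field.alias);
            cmd.add(field.type);
        }
        return cmd;
    }
}
```

With this shape, the same `List<FieldSpec>` could be passed once with the prefix `ticket:` and again with a different prefix, without a second document class.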

There are a few use cases where this would be helpful, for example with the following strategies:

Index Aliasing

In this strategy, described in Support Index Aliasing (for Blue/Green datasets, CQRS), we can define an alias for the index. The alias is the point of reference for client code, while the underlying index can be swapped by updating the alias, for example to point at a refreshed dataset. The keys in this case will be under different prefixes (for example, denoted by the refresh date), but their schema and index definitions will be the same.
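
A minimal sketch of the refresh step, expressed as the raw RediSearch commands it would issue: create an index over the new prefix, then repoint the alias with `FT.ALIASUPDATE`. The class `AliasSwapSketch`, the date-suffixed index name and prefix, and the single `team` field are all assumptions made for illustration; this is not an existing Redis OM Spring API.

```java
import java.util.List;

// Hypothetical helper for illustration; it only builds command argument lists.
final class AliasSwapSketch {
    // Returns the commands for one blue/green refresh, in execution order.
    static List<List<String>> refresh(String alias, String newIndex, String newPrefix) {
        return List.of(
            // 1. Index the refreshed dataset under its own prefix, same schema as before.
            List.of("FT.CREATE", newIndex, "ON", "JSON",
                    "PREFIX", "1", newPrefix,
                    "SCHEMA", "$.team", "AS", "team", "TAG"),
            // 2. Point the alias at the new index; FT.ALIASUPDATE detaches it
            //    from the previous index if it was already assigned.
            List.of("FT.ALIASUPDATE", alias, newIndex));
    }
}
```

Client code would keep querying the alias (for example `ticket`) throughout, which is why the alias, the index, and the prefix need to be configurable independently of the document class.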

Multiple Indexes on a Subset of the Data

In this strategy, we can define a single document (with a single prefix), but create multiple indices on top of the document space by filtering based on an attribute.

For example, in this definition:

@Document("ticket")
@IndexingOptions(
    indexName = "ticket_idx",
    filter = { "@team == \"TeamA\" " }
)
public class Ticket {
    // Fields
    @Indexed
    String team;
    ...
}

the @IndexingOptions annotation would be better suited to a Repository (or EntityStream). That would allow multiple different indexes over the same dataset, with an easy way to move a document between them: change the team value.
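
To illustrate, the sketch below builds one `FT.CREATE` command per team over the same `ticket:` prefix, differing only in index name and `FILTER` expression. `FilteredIndexSketch` and the per-team index names are hypothetical, and the filter string simply mirrors the one in the annotation above; how a repository-level annotation would actually expose this is left open.

```java
import java.util.List;

// Hypothetical helper for illustration; not a Redis OM Spring API.
final class FilteredIndexSketch {
    // One index per team, all tracking the same key prefix and schema.
    static List<String> createCommand(String indexName, String prefix, String team) {
        String filter = "@team==\"" + team + "\"";
        return List.of("FT.CREATE", indexName, "ON", "JSON",
                       "PREFIX", "1", prefix,
                       "FILTER", filter,
                       "SCHEMA", "$.team", "AS", "team", "TAG");
    }
}
```

Calling this once for `TeamA` and once for `TeamB` yields two index definitions over a single document space, which a single class-level annotation cannot express today.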
