Open
Description
We often want to inline values of complex type to aid indexing/matching and presentation without traversing the graph to fetch those entities.
e.g.
- Address into `address`
- Identification into
- Sanction (program) potentially for programCode
- Categorisation potentially, e.g. the Czech statistical categorisation
A concern with this is that there are continually new identifier types that we'd like to support. With the current approach, each of these results in a schema change.
It would be nice to consider options to support inlining identifiers in a way that scales. This could potentially also provide a strategy we apply when we inline other kinds of things.
Some things to consider:
- It's nice to be able to apply stronger validation to specific types with well-defined formats as we currently do with specific identifier props
- It might be good to have a way to version these, to support migrating from a scheme if we need to
- Some types are only applicable to some schemata, e.g. `Organization:giiNumber` but `Company:bikCode`. That's easy to express as properties
- We have varying degrees of certainty about the type of a value, e.g. it might be a bank account number, and we might know that it's Russian. A source might express it as a generic identification value with no further information, or even indicate its type incorrectly.
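To make the considerations above concrete, here is a minimal sketch of a type registry that keeps per-type format validation and per-schema applicability without adding a property per identifier. Everything in it is hypothetical: `IdentifierType`, `REGISTRY` and `validate` are illustrative names, the 9-digit pattern for `ru_bik` is an assumed format check, and schema inheritance is deliberately ignored.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, Optional, Pattern


@dataclass(frozen=True)
class IdentifierType:
    """Hypothetical description of one identifier type."""

    name: str
    # Schemata the type applies to; an empty set means "any schema".
    schemata: FrozenSet[str] = frozenset()
    # Optional format check; None means the value is accepted as-is.
    pattern: Optional[Pattern[str]] = None


# Illustrative registry. The entries and the pattern are assumptions,
# not a statement of what the real schema supports.
REGISTRY = {
    "ru_bik": IdentifierType("ru_bik", frozenset({"Company"}), re.compile(r"\d{9}")),
    "generic": IdentifierType("generic"),
}


def validate(type_name: str, schema: str, value: str) -> bool:
    """Return True if the value is acceptable for this type on this schema."""
    id_type = REGISTRY.get(type_name)
    if id_type is None:
        return False  # unknown type
    if id_type.schemata and schema not in id_type.schemata:
        return False  # type does not apply to this schema
    if id_type.pattern is not None and id_type.pattern.fullmatch(value) is None:
        return False  # value fails the type's format check
    return True
```

Adding a new identifier type would then be a registry entry rather than a schema change, and a weakly typed source value can fall back to `generic` when we don't trust the type it claims.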
A couple of ideas we've floated:
1. Simply index nested versions
2. Add some kind of scheme or type information to inlined values
   - e.g. `1:ru_bik:123456789`, where `1` is the version, `ru_bik` is the type and `123456789` is the value
   - To aid matching, perhaps more generic types can be defined for related but different types, with a library to fill down to more and more generic types, e.g. `ru_bik` can also be placed under `bik` and `bank_account`, either before publication or just before indexing
   - Types might indicate how strong and how specific they are, which might be used in scoring matches
   - Maybe categorisation could be expressed as a particularly weak form of an identifier
- e.g.
If we go in the direction of (2), we might want to consider applying a similar approach to inlining other kinds of fields, e.g. programCode on the entity might be scoped from the start
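As a rough illustration of idea (2), the sketch below parses and re-encodes the `version:type:value` form and fills a value down to more generic types. The names `InlinedValue`, `parse`, `encode`, `fill_down` and the `PARENTS` mapping are all hypothetical, and the mapping only contains the `ru_bik` → `bik` → `bank_account` chain mentioned above.

```python
from typing import List, NamedTuple


class InlinedValue(NamedTuple):
    version: int
    type_name: str
    value: str


def encode(item: InlinedValue) -> str:
    """Serialise to the 'version:type:value' form."""
    return f"{item.version}:{item.type_name}:{item.value}"


def parse(text: str) -> InlinedValue:
    """Parse 'version:type:value'; the value itself may contain colons."""
    version, type_name, value = text.split(":", 2)
    return InlinedValue(int(version), type_name, value)


# Hypothetical mapping from a specific type to its next more generic type.
PARENTS = {"ru_bik": "bik", "bik": "bank_account"}


def fill_down(item: InlinedValue) -> List[InlinedValue]:
    """Return the item followed by copies under each more generic type."""
    results = [item]
    seen = {item.type_name}
    current = item.type_name
    while current in PARENTS and PARENTS[current] not in seen:
        current = PARENTS[current]
        seen.add(current)  # guards against an accidental cycle in PARENTS
        results.append(item._replace(type_name=current))
    return results


if __name__ == "__main__":
    for filled in fill_down(parse("1:ru_bik:123456789")):
        print(encode(filled))
```

Whether the fill-down happens before publication or just before indexing only changes where `fill_down` gets called; a strength or specificity score per type could live next to `PARENTS` if we want to use it in match scoring.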