Commit 4e397ad

romulets, kibanamachine, and hop-dev authored and committed
[Entity Store] Add Upsert Entity API (#234454)
## Summary

Add the Upsert Entity API, which reflects changes made via the API directly in the final entities index.

#### What is implemented

- Update documents
  - Allowed fields:
    - `entity.attributes.*`
    - `entity.lifecycle.*`
    - `entity.behavior.*`
- Force update documents

#### Added ES Assets:

- Component Template `security_${type}_default-updates@platform`
- Index Template `entities_v1_updates_security_${type}_default_index_template`
- Index `.entities.v1.updates.security_${type}_default`

#### What is not implemented

- Create
- ILM Policy to delete update documents

#### How to test

Ingest entities and run in the dev console:

```
PUT kbn:/api/entity_store/entities/generic
{
  "entity": {
    "id": "<ID>",
    "attributes": {
      "StorageClass": "hot"
    }
  }
}
```

### How it works

Before explaining the API itself, a refresher on the entity store:

<details>
<summary> Entity Store Diagram </summary>

```mermaid
flowchart TB
    subgraph Main Flow
    A[(.logs*)] ~~~~ B[Transform]
    B ---> |Fetches raw entity data| A
    B ---> | Sends Aggregated Data | G{Ingest Pipeline}
    G --> | Combines new and old data and stores it| C[(.entity.v1.latest*)]
    end

    G -.-> | Fetches data older than transform retention policy| D[(.enrich-index-entities)]

    subgraph Retention Policy Flow
    direction LR
    E((Kibana Task)) -->|trigger every hour| F[Enrich Policy Entities]
    F -.->| Fetches most up-to-date entities| C
    F --->| Stores data | D
    end
```

</details>

The entity store works based on a Transform which has a look-back period of X hours (currently 3h). That means data older than the look-back period won't be retained. To solve that, an Enrich Policy is in place that takes hourly snapshots of the current state of the entity store and makes them available, via the ingest pipeline, to enrich entity updates and make sure that data older than the look-back period is still present.

Awesome. This adds complexity to this feature. The goal is to add an API that, once called, reflects data changes immediately in the latest index.

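To make "an API that, once called" concrete, here is a minimal sketch of how the request from "How to test" could be assembled in TypeScript. The helper `buildUpsertRequest` and its types are hypothetical and not part of Kibana; only the path, the entity types, and the `force` query parameter come from this commit. It builds the request without sending it, so transport and authentication are left out.

```typescript
// Entity types accepted by the endpoint, per the commit description.
type EntityType = "user" | "host" | "service" | "generic";

// Minimal body shape for illustration; only `entity.id` is required.
interface UpsertEntityBody {
  entity: {
    id: string;
    attributes?: Record<string, unknown>;
    lifecycle?: Record<string, unknown>;
    behavior?: Record<string, unknown>;
  };
}

interface UpsertRequest {
  method: "PUT";
  path: string;
  body: string;
}

// Hypothetical helper: assembles the PUT request for one entity.
function buildUpsertRequest(
  entityType: EntityType,
  body: UpsertEntityBody,
  force: boolean = false
): UpsertRequest {
  // `force` is only appended when set, since it defaults to false.
  const query = force ? "?force=true" : "";
  return {
    method: "PUT",
    path: `/api/entity_store/entities/${entityType}${query}`,
    body: JSON.stringify(body),
  };
}

// Same payload as the dev console example above.
const request = buildUpsertRequest("generic", {
  entity: { id: "<ID>", attributes: { StorageClass: "hot" } },
});
console.log(request.method, request.path);
```

Passing `true` as the third argument appends `?force=true` to the path, which corresponds to the "Force update documents" item above.
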
A few things were considered:

- ❌ Add a new document to an update index to be picked up by the transform.
  - That doesn't satisfy the requirement, because changes only become available after a transform finishes its run.
- ❌ Perform an update by query in the latest index.
  - That works great if the entity in the latest index doesn't get any other update via the transform, which of course we can't guarantee.

So the solution we arrived at was to both perform an update by query in the latest index and publish an update document to be picked up by the transform; this way we get the best of both worlds.

- First, update by query on `.entities.v1.latest.security_$TYPE_default` (update made via Painless).
- Then index a new document in `.entities.v1.updates.security_$TYPE_default` to be picked up by the transform.

```mermaid
flowchart LR
    A[User] -->|PUT /api/entity_store/entities/$TYPE| B[Kibana]
    B --> |update by query| C[(.entities.v1.latest.security_$TYPE_default)]
    B --> |create new doc| D[(.entities.v1.updates.security_$TYPE_default)]
```

We considered adding a priority mechanism to the update index to make sure that documents published to it would be picked up. First, we found out that we don't need to make sure a document is seen by the transform: by definition, a transform processes every document and has no mechanism to drop documents when processing takes too long. Second, we can't do it, because the aggregations we run already sort to find the latest values, and sorting on multiple fields is not possible.

### Fields and Schema

Prior to this PR, non-generic entities (`user`, `host`, and `service`) had no exposure to the concepts defined in the proposed `entity.*` ECS schema. We had to address this to be able to make changes to the `entity.attributes`, `entity.lifecycle` and `entity.behavior` fields.

[The current direction](elastic/ecs#2513) is that `entity.*` fields will be nested under `user`, `host`, `service` and `generic` for data input, while the latest index, which holds the final entities, will have a root `entity.*` field set. In other words, there is a difference between the entity data input location and the entity data output location.

The document

```json
{
  "user": {
    "entity": {
      "id": "romulo",
      "type": "aws-user"
    }
  }
}
```

will be represented in the latest index as

```json
{
  "entity": {
    "id": "romulo",
    "type": "aws-user"
  }
}
```

Because of the current direction of the discussion, we decided to move that way already. Therefore this PR contains changes to the entity definitions themselves, adding entity fields that use `{TYPE}.entity.*` as the data source and `entity.*` as the destination (`x-pack/solutions/security/plugins/security_solution/server/lib/entity_analytics/entity_store/entity_definitions/entity_descriptions/common.ts`).

That also posed another question: what will the input look like? Will it accept the entity "input" or the entity "output" format? I decided to stay close to the "output" format, so the API accepts `entity.*` JSON fields and applies them to the entity store. The reason behind it is simplicity of the API. I believe that having an inconsistent placement for `entity` in the API isn't a great experience, therefore always accepting

```json
{
  "entity": {
    "id": "romulo",
    "type": "aws-user"
  }
}
```

is better imo. **That contradicts the input via logs, however**. Curious to hear people's opinions.

There is another problem that further deviates the API from any ECS definition (input or output). For fields under `entity.attributes`, `entity.lifecycle` and `entity.behavior` we decided to define them on ECS. And because they are "custom fields", product would like them to have a `Capital_snake_case` format, which is not a traditional format, and developing with TS in such a case is not really allowed at the moment.

To curb that, the API exposes those fields as `snake_case` and converts them to `Capital_snake_case` before storing. That was the best way I found while still having the field definitions in the OpenAPI spec.

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Mark Hopkin <[email protected]>
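
The `snake_case` to `Capital_snake_case` conversion described at the end of the message can be sketched as follows. This is a rough illustration, not the code from this commit: the function names are hypothetical, and it assumes that `Capital_snake_case` means only the first character is upper-cased (so `first_seen` becomes `First_seen`) and that only keys directly under `attributes`, `lifecycle` and `behavior` are converted.

```typescript
// Sub-objects whose keys are treated as custom fields (assumption based on the text above).
const CUSTOM_FIELD_GROUPS = ["attributes", "lifecycle", "behavior"];

interface EntityPayload {
  id: string;
  [key: string]: unknown;
}

// Assumption: Capital_snake_case upper-cases only the first character.
function toCapitalSnakeCase(key: string): string {
  return key.charAt(0).toUpperCase() + key.slice(1);
}

// Returns a copy of the entity with the keys of each custom field group converted.
// Fields outside those groups (id, name, type, ...) are copied unchanged.
function convertCustomFieldKeys(entity: EntityPayload): EntityPayload {
  const result: EntityPayload = { ...entity };
  for (const group of CUSTOM_FIELD_GROUPS) {
    const value = entity[group];
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      const source = value as Record<string, unknown>;
      const converted: Record<string, unknown> = {};
      for (const key of Object.keys(source)) {
        converted[toCapitalSnakeCase(key)] = source[key];
      }
      result[group] = converted;
    }
  }
  return result;
}

const stored = convertCustomFieldKeys({
  id: "romulo",
  attributes: { privileged: true },
  lifecycle: { first_seen: "2025-01-01T00:00:00Z" },
});
console.log(JSON.stringify(stored));
```

Keeping the conversion in one place means the OpenAPI spec and the TypeScript types can stay in `snake_case`, which is the trade-off the message describes.
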
1 parent 0cadb16 commit 4e397ad

69 files changed, +3340 -636 lines changed

oas_docs/output/kibana.serverless.yaml

Lines changed: 104 additions & 63 deletions
@@ -12223,6 +12223,51 @@ paths:
       x-metaTags:
         - content: Kibana, Elastic Cloud Serverless
           name: product_name
+  /api/entity_store/entities/{entityType}:
+    put:
+      description: |
+        Update or create an entity in Entity Store.
+        If the specified entity already exists, it is updated with the provided values. If the entity does not exist, a new one is created. By default, only the following fields can be updated: * `entity.attributes.*` * `entity.lifecycle.*` * `entity.behavior.*` To update other fields, set the `force` query parameter to `true`. > info > Some fields always retain the first observed value. Updates to these fields will not appear in the final index.
+        > Due to technical limitations, not all updates are guaranteed to appear in the final list of observed values.
+      operationId: UpsertEntity
+      parameters:
+        - in: path
+          name: entityType
+          required: true
+          schema:
+            $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityType'
+        - in: query
+          name: force
+          required: false
+          schema:
+            default: false
+            type: boolean
+      requestBody:
+        content:
+          application/json:
+            schema:
+              $ref: '#/components/schemas/Security_Entity_Analytics_API_Entity'
+        description: Schema for the updating a single entity
+        required: true
+      responses:
+        '200':
+          content:
+            application/json:
+              schema:
+                $ref: '#/components/schemas/Security_Entity_Analytics_API_Entity'
+          description: Entity updated or created
+        '403':
+          description: Operation on a restricted field
+        '404':
+          description: Entity not found
+        '503':
+          description: Operation on an uninitialized Engine or in a cluster without CRUD API Enabled
+      summary: Upsert an entity in Entity Store
+      tags:
+        - Security Entity Analytics API
+      x-metaTags:
+        - content: Kibana, Elastic Cloud Serverless
+          name: product_name
   /api/entity_store/entities/list:
     get:
       description: List entities records, paging, sorting and filtering as needed.
@@ -71577,6 +71622,7 @@ components:
         - entity_engine
         - entity_definition
         - index
+        - data_stream
         - component_template
         - index_template
         - ingest_pipeline
@@ -71682,6 +71728,7 @@ components:
         - status
         - fieldHistoryLength
     Security_Entity_Analytics_API_EngineMetadata:
+      additionalProperties: false
       type: object
       properties:
         Type:
@@ -71736,6 +71783,40 @@ components:
       required:
         - has_all_required
         - privileges
+    Security_Entity_Analytics_API_EntityField:
+      additionalProperties: false
+      type: object
+      properties:
+        attributes:
+          additionalProperties: false
+          type: object
+          properties:
+            privileged:
+              type: boolean
+        behavior:
+          additionalProperties: false
+          type: object
+        EngineMetadata:
+          $ref: '#/components/schemas/Security_Entity_Analytics_API_EngineMetadata'
+        id:
+          type: string
+        lifecycle:
+          additionalProperties: false
+          type: object
+          properties:
+            first_seen:
+              format: date-time
+              type: string
+        name:
+          type: string
+        source:
+          type: string
+        sub_type:
+          type: string
+        type:
+          type: string
+      required:
+        - id
     Security_Entity_Analytics_API_EntityRiskLevels:
       enum:
         - Unknown
@@ -71819,74 +71900,50 @@ components:
         - generic
       type: string
     Security_Entity_Analytics_API_GenericEntity:
+      additionalProperties: false
       type: object
       properties:
         '@timestamp':
           format: date-time
           type: string
         asset:
+          additionalProperties: false
           type: object
           properties:
             criticality:
               $ref: '#/components/schemas/Security_Entity_Analytics_API_AssetCriticalityLevel'
           required:
             - criticality
         entity:
-          type: object
-          properties:
-            category:
-              type: string
-            EngineMetadata:
-              $ref: '#/components/schemas/Security_Entity_Analytics_API_EngineMetadata'
-            id:
-              type: string
-            name:
-              type: string
-            source:
-              type: string
-            type:
-              type: string
-          required:
-            - id
-            - name
-            - type
+          $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityField'
       required:
         - entity
     Security_Entity_Analytics_API_HostEntity:
+      additionalProperties: false
       type: object
       properties:
         '@timestamp':
           format: date-time
           type: string
         asset:
+          additionalProperties: false
           type: object
           properties:
             criticality:
               $ref: '#/components/schemas/Security_Entity_Analytics_API_AssetCriticalityLevel'
           required:
             - criticality
         entity:
-          type: object
-          properties:
-            EngineMetadata:
-              $ref: '#/components/schemas/Security_Entity_Analytics_API_EngineMetadata'
-            name:
-              type: string
-            source:
-              type: string
-            type:
-              type: string
-          required:
-            - name
-            - source
-            - type
+          $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityField'
         event:
+          additionalProperties: false
           type: object
           properties:
             ingested:
               format: date-time
               type: string
         host:
+          additionalProperties: false
           type: object
           properties:
             architecture:
@@ -71897,6 +71954,8 @@ components:
               items:
                 type: string
               type: array
+            entity:
+              $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityField'
             hostname:
               items:
                 type: string
@@ -71924,7 +71983,6 @@ components:
           required:
             - name
       required:
-        - host
         - entity
     Security_Entity_Analytics_API_IdField:
       enum:
@@ -72118,50 +72176,42 @@ components:
         - description
         - category
     Security_Entity_Analytics_API_ServiceEntity:
+      additionalProperties: false
       type: object
       properties:
         '@timestamp':
           format: date-time
           type: string
         asset:
+          additionalProperties: false
           type: object
           properties:
             criticality:
               $ref: '#/components/schemas/Security_Entity_Analytics_API_AssetCriticalityLevel'
           required:
             - criticality
         entity:
-          type: object
-          properties:
-            EngineMetadata:
-              $ref: '#/components/schemas/Security_Entity_Analytics_API_EngineMetadata'
-            name:
-              type: string
-            source:
-              type: string
-            type:
-              type: string
-          required:
-            - name
-            - source
-            - type
+          $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityField'
         event:
+          additionalProperties: false
           type: object
           properties:
             ingested:
               format: date-time
               type: string
         service:
+          additionalProperties: false
           type: object
           properties:
+            entity:
+              $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityField'
             name:
               type: string
             risk:
               $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityRiskScoreRecord'
           required:
             - name
       required:
-        - service
         - entity
     Security_Entity_Analytics_API_StoreStatus:
       enum:
@@ -72237,40 +72287,31 @@ components:
         - exponential_avg_documents_indexed
         - exponential_avg_documents_processed
     Security_Entity_Analytics_API_UserEntity:
+      additionalProperties: false
       type: object
       properties:
         '@timestamp':
           format: date-time
           type: string
         asset:
+          additionalProperties: false
           type: object
           properties:
             criticality:
               $ref: '#/components/schemas/Security_Entity_Analytics_API_AssetCriticalityLevel'
           required:
             - criticality
         entity:
-          type: object
-          properties:
-            EngineMetadata:
-              $ref: '#/components/schemas/Security_Entity_Analytics_API_EngineMetadata'
-            name:
-              type: string
-            source:
-              type: string
-            type:
-              type: string
-          required:
-            - name
-            - source
-            - type
+          $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityField'
         event:
+          additionalProperties: false
           type: object
           properties:
             ingested:
               format: date-time
               type: string
         user:
+          additionalProperties: false
           type: object
           properties:
             domain:
@@ -72297,14 +72338,14 @@ components:
               type: string
             risk:
               $ref: '#/components/schemas/Security_Entity_Analytics_API_EntityRiskScoreRecord'
+              additionalProperties: false
             roles:
               items:
                 type: string
               type: array
           required:
             - name
       required:
-        - user
         - entity
     Security_Entity_Analytics_API_UserName:
       type: object
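
The `UpsertEntity` description in the first hunk says that only `entity.attributes.*`, `entity.lifecycle.*` and `entity.behavior.*` can be updated unless `force` is `true`, and the operation declares a `403` for "Operation on a restricted field". A minimal sketch of such an allow-list check is below. The function names and the status mapping are hypothetical, and treating `entity.id` as exempt (because it identifies the document rather than updating it) is an assumption, not something the diff states.

```typescript
type Json = { [key: string]: unknown };

// Prefixes that may be updated without `force`, as listed in the operation description.
const ALLOWED_PREFIXES = ["entity.attributes.", "entity.lifecycle.", "entity.behavior."];

// Flattens nested objects into dotted leaf paths, e.g. { a: { b: 1 } } gives ["a.b"].
function leafPaths(value: Json, prefix: string = ""): string[] {
  const paths: string[] = [];
  for (const key of Object.keys(value)) {
    const child = value[key];
    const path = prefix + key;
    if (typeof child === "object" && child !== null && !Array.isArray(child)) {
      paths.push(...leafPaths(child as Json, path + "."));
    } else {
      paths.push(path);
    }
  }
  return paths;
}

// Returns the leaf paths that fall outside the allow-list.
// Assumption: `entity.id` is exempt because it only selects the entity.
function restrictedPaths(body: Json): string[] {
  return leafPaths(body).filter(
    (path) =>
      path !== "entity.id" &&
      !ALLOWED_PREFIXES.some((allowed) => path.startsWith(allowed))
  );
}

// Hypothetical status mapping: 403 mirrors "Operation on a restricted field".
function statusFor(body: Json, force: boolean): 200 | 403 {
  return !force && restrictedPaths(body).length > 0 ? 403 : 200;
}

const example = { entity: { id: "romulo", attributes: { privileged: true }, name: "Romulo" } };
console.log(restrictedPaths(example), statusFor(example, false), statusFor(example, true));
```

Checking leaf paths rather than top-level keys keeps the rule aligned with the wildcard form (`entity.attributes.*`) used in the description.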
