Events generated by agent must comply with the common schema #253

Open
sdvendramini opened this issue Oct 28, 2024 · 13 comments
Assignees
Labels
level/epic Epic issue module/agent mvp Minimum Viable Product refinement type/enhancement Enhancement issue

Comments

@sdvendramini
Member

sdvendramini commented Oct 28, 2024

Parent Issue: #241

Description

The events generated by the agent must adhere to the common schema for consistency and compatibility across systems.

Details

Body format: stateless and stateful

Stateless
{
  "agent": {
      "id": "2887e1cf-9bf2-431a-b066-a46860080f56",
      "name": "agent1",
      "type": "endpoint",
      "version": "5.0.0",
      "groups": ["group1", "group2"],
      "host": {
          "hostname": "myhost",
          "os": {
              "name": "Amazon Linux 2",
              "platform": "Linux"
          },
          "ip": ["192.168.1.2"],
          "architecture": "x86_64"
      }
  }
}
{
  "module": "logcollector",
  "type": "file"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "tags": ["string"],
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
{
  "module": "inventory",
  "type": "package"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "tags": ["string"],
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
Stateful
{
  "agent": {
      "id": "2887e1cf-9bf2-431a-b066-a46860080f56",
      "name": "agent1",
      "type": "endpoint",
      "version": "5.0.0",
      "groups": ["group1", "group2"],
      "host": {
          "hostname": "myhost",
          "os": {
              "name": "Amazon Linux 2",
              "platform": "Linux"
          },
          "ip": ["192.168.1.2"],
          "architecture": "x86_64"
      }
  }
}
{
  "module": "inventory",
  "type": "package",
  "operation": "modified",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "add",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "delete",
  "id": "asdfsdfkdsj98237498325"
}
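
The layout above reads as a sequence of JSON objects: one agent metadata object, followed by a header object per event and, where applicable, its data object. A minimal Python sketch of assembling such a body follows. It assumes one JSON object per line, and `build_body`, `Header`, and `Data` are hypothetical names used only for illustration, not part of the agent.

```python
import json
from typing import Any, Optional

Header = dict[str, Any]
Data = Optional[dict[str, Any]]


def build_body(agent: dict[str, Any], events: list[tuple[Header, Data]]) -> str:
    """Serialize agent metadata, then one header (and optional data) per event."""
    lines = [json.dumps({"agent": agent})]
    for header, data in events:
        lines.append(json.dumps(header))
        # The final "delete" header in the example above has no data object after it.
        if data is not None:
            lines.append(json.dumps(data))
    return "\n".join(lines)


body = build_body(
    {"id": "2887e1cf-9bf2-431a-b066-a46860080f56", "name": "agent1", "groups": ["group1", "group2"]},
    [
        (
            {"module": "inventory", "type": "package", "operation": "modified", "id": "lskdjf023984902358"},
            {"scan_time": "2024-10-28T18:26:10.634Z", "package": {"name": "string", "version": "string"}},
        ),
        ({"module": "inventory", "type": "network", "operation": "delete", "id": "asdfsdfkdsj98237498325"}, None),
    ],
)
print(body)
```

Keeping the agent object first means the receiver can read the metadata once and apply it to every event that follows.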

Tasks

@wazuhci wazuhci moved this to Backlog in Release 5.0.0 Oct 28, 2024
@vikman90 vikman90 added mvp Minimum Viable Product refinement and removed mvp labels Oct 29, 2024
@LucioDonda
Member

Hi @sdvendramini, while I look into this:
Have you identified which fields, or in which situations, the agent generates non-ECS-compliant event fields?
Were they part of any particular module?
TIA

@GGP1
Member

GGP1 commented Oct 29, 2024

@LucioDonda the ECS templates have been modified recently, so it is highly likely that the agent is generating events with an outdated format.

I've been working on Update stateful events data models #26568 which covers the same case but for the Communications API POST /events/stateful endpoint.

Here are some of the structures we are accepting in JSON format.

FIM
{
  "agent": {
	"id": "string",
	"groups": []
  },
  "file": {
	"attributes": [
  	"string"
	],
	"name": "string",
	"path": "string",
	"gid": 0,
	"group": "string",
	"inode": "string",
	"mtime": "2024-10-28T18:26:10.634Z",
	"mode": "string",
	"size": 0,
	"target_path": "string",
	"type": "string",
	"uid": 0,
	"owner": "string",
	"hash": {
  	"md5": "string",
  	"sha1": "string",
  	"sha256": "string"
	}
  },
  "registry": {
	"key": "string",
	"value": "string"
  }
}
Inventory package
{
  "agent": {
	"id": "string",
	"groups": []
  },
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
	"architecture": "string",
	"description": "string",
	"installed": "2024-10-28T18:26:10.634Z",
	"name": "string",
	"path": "string",
	"size": 0,
	"type": "string",
	"version": "string"
  }
}
Inventory processes
{
  "agent": {
	"id": "string",
	"groups": []
  },
  "scan_time": "2024-10-28T18:26:10.634Z",
  "process": {
	"pid": 0,
	"name": "string",
	"parent": {
  	"pid": 0
	},
	"command_line": "string",
	"args": [
  	"string"
	],
	"user": {
  	"id": "string"
	},
	"real_user": {
  	"id": "string"
	},
	"saved_user": {
  	"id": "string"
	},
	"group": {
  	"id": "string"
	},
	"real_group": {
  	"id": "string"
	},
	"saved_group": {
  	"id": "string"
	},
	"start": "2024-10-28T18:26:10.635Z",
	"thread": {
  	"id": "string"
	}
  }
}
Inventory system
{
  "agent": {
	"id": "string",
	"groups": []
  },
  "scan_time": "2024-10-28T18:26:10.635Z",
  "host": {
	"architecture": "string",
	"hostname": "string",
	"os": {
  	"kernel": "string",
  	"full": "string",
  	"name": "string",
  	"platform": "string",
  	"version": "string",
  	"type": "string"
	}
  }
}
Vulnerability
{
  "agent": {
	"id": "string",
	"groups": [],
	"name": "string",
	"type": "string",
	"version": "string"
  },
  "host": {
	"os": {
  	"kernel": "string",
  	"full": "string",
  	"name": "string",
  	"platform": "string",
  	"version": "string",
  	"type": "string"
	}
  },
  "package": {
	"architecture": "string",
	"build_version": "string",
	"checksum": "string",
	"description": "string",
	"install_scope": "string",
	"installed": "2024-10-28T18:26:10.635Z",
	"license": "string",
	"name": "string",
	"path": "string",
	"reference": "string",
	"size": 0,
	"type": "string",
	"version": "string"
  },
  "scanner": {
	"source": "string",
	"vendor": "string"
  },
  "score": {
	"base": 0,
	"environmental": 0,
	"temporal": 0,
	"version": "string"
  },
  "category": "string",
  "classification": "string",
  "description": "string",
  "detected_at": "2024-10-28T18:26:10.635Z",
  "enumeration": "string",
  "id": "string",
  "published_at": "2024-10-28T18:26:10.635Z",
  "reference": "string",
  "report_id": "string",
  "severity": "string",
  "under_evaluation": true
}
Command result
{
  "document_id": "string",
  "result": {
      "code": "string",
      "message": "string",
      "data": "string"
  }
}

At the same time, those objects have to be inside the data field of a wrapper object that also includes a module field. For example, an inventory package event would look like this:

{
  "data": {
    "agent": {
      "id": "string",
      "groups": []
    },
    "scan_time": "2024-10-28T18:26:10.634Z",
    "package": {
      "architecture": "string",
      "description": "string",
      "installed": "2024-10-28T18:26:10.634Z",
      "name": "string",
      "path": "string",
      "size": 0,
      "type": "string",
      "version": "string"
    }
  },
  "module": "inventory_package"
}
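
As a rough illustration of the wrapper described above, the following Python sketch places a module-specific payload under `data` next to a `module` name. `wrap_event` is a hypothetical helper, and the non-empty check on `module` is an assumption of this sketch, not a documented rule of the endpoint.

```python
import json
from typing import Any


def wrap_event(module: str, data: dict[str, Any]) -> dict[str, Any]:
    """Return the wrapper object: the payload under "data" plus a "module" field."""
    if not module:
        # Assumption: an empty module name is never valid.
        raise ValueError("module must be a non-empty string")
    return {"data": data, "module": module}


package_event = wrap_event(
    "inventory_package",
    {
        "agent": {"id": "string", "groups": []},
        "scan_time": "2024-10-28T18:26:10.634Z",
        "package": {"name": "string", "version": "string"},
    },
)
print(json.dumps(package_event, indent=2))
```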

If you have any doubts or comments, we can arrange a meeting to discuss this further.

@cborla cborla self-assigned this Oct 30, 2024
@wazuhci wazuhci moved this from Backlog to In progress in Release 5.0.0 Oct 30, 2024
@vikman90
Member

vikman90 commented Oct 31, 2024

Module format

  • module.name: Name of the module
    • command
    • fim
    • inventory
    • vulnerability
    • sca
  • module.type: Data type, depending on the module (optional)
    • hotfix
    • network
    • package
    • port
    • process
    • system

Examples

{
  "module": {
    "name": "inventory",
    "type": "package"
  },
  "data": { ... }
}

{
  "module": {
    "name": "vulnerability"
  },
  "data": { ... }
}

{
  "module": {
    "name": "data"
  },
  "data": { ... } 
}
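
A minimal Python sketch of building this envelope with basic validation is shown below. The allowed values are copied from the lists in this comment and may be incomplete (the thread later also uses `logcollector` as a module name), so rejecting anything outside them is an assumption of the sketch. `make_envelope` is a hypothetical name.

```python
from typing import Any, Optional

# Values taken from the lists above; the real set may differ or grow.
MODULE_NAMES = {"command", "fim", "inventory", "vulnerability", "sca"}
MODULE_TYPES = {"hotfix", "network", "package", "port", "process", "system"}


def make_envelope(name: str, data: dict[str, Any], type_: Optional[str] = None) -> dict[str, Any]:
    """Build {"module": {...}, "data": {...}}, omitting module.type when not given."""
    if name not in MODULE_NAMES:
        raise ValueError(f"unknown module name: {name!r}")
    module: dict[str, Any] = {"name": name}
    if type_ is not None:
        if type_ not in MODULE_TYPES:
            raise ValueError(f"unknown module type: {type_!r}")
        module["type"] = type_
    return {"module": module, "data": data}


print(make_envelope("inventory", {"package": {"name": "string"}}, "package"))
print(make_envelope("vulnerability", {"id": "string"}))
```

This sketch does not check that a given type belongs to a given module, since the comment only says the type depends on the module.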

@vikman90
Member

vikman90 commented Oct 31, 2024

Stateless: Logcollector

{
  "module": { "name": "logcollector" },
  "data": {
    "file": { "path": "/var/log/syslog" },
    "event": { "original": "2024-10-31T16:21:25.198579+01:00 Rocket systemd-resolved[176]: Clock change detected. Flushing caches." }
  }
}

@cborla
Member

cborla commented Nov 1, 2024

Inventory analysis

The following analysis is based on the following sources.

The indexer (master) currently supports the following data structures for inventory.

@dataclass
class OS:
    """OS data model."""
    kernel: str
    full: str
    name: str
    platform: str
    version: str
    type: str
    family: str


@dataclass
class Host:
    """Host data model."""
    architecture: str
    hostname: str
    os: OS


@dataclass
class ProcessHash:
    md5: str


@dataclass
class Process:
    """Process data model."""
    hash: ProcessHash


@dataclass
class InventoryEvent(BaseModel):
    """Inventory events data model."""
    host: Host
    process: Process

    def get_index_name(self) -> str:
        """Get the index name for the event type.
        
        Returns
        -------
        str
            Index name.
        """
        return INVENTORY_INDEX
       

From the above classes of the Inventory Stateful event, we can obtain the following diagram.

InventoryEvent
│
├── Host
│   ├── architecture : str
│   ├── hostname : str
│   └── os : OS
│       ├── kernel : str
│       ├── full : str
│       ├── name : str
│       ├── platform : str
│       ├── version : str
│       ├── type : str
│       └── family : str
│
└── Process
    └── hash : ProcessHash
        └── md5 : str

The indexer model covers far less data, and far fewer data structures, than the agent's inventory currently collects. The inventory currently produces 9 types of structures, each with its corresponding information.

  • network_iface
  • network_protocol
  • network_address
  • packages
  • hotfixes
  • ports
  • processes
  • osinfo
  • hwinfo

As a first step, the agent's inventory can be adapted to the structure model proposed by the indexer.

@cborla
Member

cborla commented Nov 1, 2024

Inventory analysis

New sources.

@cborla
Member

cborla commented Nov 2, 2024

Update 1/11

  • Added a new queue field (module_type) to support the new message format for both stateless and stateful events. When the field is set, it is included in the message that is sent.
{
  "module": {
    "name": "inventory",
    "type": "package"
  },
  "data": { ... }
}

{
  "module": {
    "name": "vulnerability"
  },
  "data": { ... }
}
  • Update of the format of the messages to be sent to the server from logcollector.
{
  "module": { "name": "logcollector" },
  "data": {
    "file": { "path": "/var/log/syslog" },
    "event": { "original": "2024-10-31T16:21:25.198579+01:00 Rocket systemd-resolved[176]: Clock change detected. Flushing caches." }
  }
}
  • Updating of inventory tables and normalisation of fields. (WIP)

@cborla
Member

cborla commented Nov 4, 2024

Stateful: Inventory

The agent currently sends inventory messages in the following format. The fields still need to be adapted to ECS, and the format must also include the operation to be carried out.

{
    "data":
    {
        "argvs": null,
        "checksum": "ab94278230d240b66082ba6cbf52106cebff41ac",
        "cmd": null,
        "egroup": "root",
        "euser": "root",
        "fgroup": "root",
        "name": "kworker/u9:0-tt",
        "nice": -20,
        "nlwp": 1,
        "pgrp": 0,
        "pid": "86",
        "ppid": 2,
        "priority": 0,
        "processor": 2,
        "resident": 0,
        "rgroup": "root",
        "ruser": "root",
        "scan_time": "2024/11/02 01:55:47",
        "session": 0,
        "sgroup": "root",
        "share": 0,
        "size": 0,
        "start_time": 1730351047,
        "state": "I",
        "stime": 0,
        "suser": "root",
        "tgid": 86,
        "tty": 0,
        "utime": 0,
        "vm_size": 0
    },
    "operation": "DELETED",
    "type": "dbsync_processes"
}
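
To make the remaining adaptation concrete, here is a hedged Python sketch that converts the legacy process message above into a header/data pair shaped like the structures discussed earlier in this thread. The field mapping (`ppid` to `process.parent.pid`, `cmd` to `process.command_line`, `argvs` to `process.args`), the operation names other than `DELETED`, and treating `scan_time` as UTC are all assumptions made for illustration. The header `id` is omitted because the thread does not say how it is derived.

```python
from datetime import datetime, timezone
from typing import Any

# Assumed mapping; only "DELETED" appears in the message above.
OPERATIONS = {"INSERTED": "add", "MODIFIED": "modified", "DELETED": "delete"}


def to_iso(moment: datetime) -> str:
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    text = moment.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


def convert_process(message: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Map a legacy dbsync_processes message to a (header, data) pair."""
    legacy = message["data"]
    # Assumption: the legacy scan_time string is expressed in UTC.
    scan = datetime.strptime(legacy["scan_time"], "%Y/%m/%d %H:%M:%S").replace(tzinfo=timezone.utc)
    start = datetime.fromtimestamp(legacy["start_time"], tz=timezone.utc)
    header = {"module": "inventory", "type": "process", "operation": OPERATIONS[message["operation"]]}
    data = {
        "scan_time": to_iso(scan),
        "process": {
            "pid": int(legacy["pid"]),  # the legacy message carries pid as a string
            "name": legacy["name"],
            "parent": {"pid": legacy["ppid"]},
            "command_line": legacy["cmd"],
            "args": legacy["argvs"],
            "start": to_iso(start),
        },
    }
    return header, data


legacy_message = {
    "data": {"argvs": None, "cmd": None, "name": "kworker/u9:0-tt", "pid": "86", "ppid": 2,
             "scan_time": "2024/11/02 01:55:47", "start_time": 1730351047},
    "operation": "DELETED",
    "type": "dbsync_processes",
}
header, data = convert_process(legacy_message)
print(header)
print(data)
```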

@cborla
Member

cborla commented Nov 5, 2024

Agent metadata

  • Token. UUID
  • Header. user-agent
    • Agent version. 5.0.0
    • Agent type. Endpoint
    • Agent arch. x86_64
    • Agent platform. Linux
      Example: WazuhXDR/5.0.0 (Endpoint; x86_64; Linux)
  • Full details, in an object in the body
    • OS. Amazon Linux 2
    • Platform. amzn
    • Type. Endpoint
    • Version. 5.0.0
    • Arch. x86_64
    • Groups. ["pepa", "pepe"]
    • IP. 192.168.1.23
    • UUID. skjrowegj12355
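
The user-agent header listed above can be sketched in Python as follows. The format string follows the example in the list, while `AgentHeader` and `parse_user_agent` are hypothetical names, and the parsing regular expression is an assumption about what a receiver might accept.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentHeader:
    version: str
    type: str
    arch: str
    platform: str

    def format(self) -> str:
        """Render the header value, e.g. WazuhXDR/5.0.0 (Endpoint; x86_64; Linux)."""
        return f"WazuhXDR/{self.version} ({self.type}; {self.arch}; {self.platform})"


# Assumed grammar: product/version followed by three "; "-separated fields in parentheses.
_USER_AGENT = re.compile(
    r"^WazuhXDR/(?P<version>\S+) \((?P<type>[^;]+); (?P<arch>[^;]+); (?P<platform>[^;)]+)\)$"
)


def parse_user_agent(value: str) -> AgentHeader:
    """Parse a header value back into its four fields."""
    match = _USER_AGENT.match(value)
    if match is None:
        raise ValueError(f"unexpected user-agent: {value!r}")
    return AgentHeader(**match.groupdict())


header_value = AgentHeader("5.0.0", "Endpoint", "x86_64", "Linux").format()
print(header_value)
print(parse_user_agent(header_value))
```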

Body format: stateless and stateful

Stateless

{
  "agent": {
    "uuid": "UUID", 
    "groups": [ ], 
    "os": "Amazon Linux 2", 
    "platform": "Linux", 
    "type": "Endpoint", 
    "version": "5.0.0", 
    "ip": "192.168.1.2" 
  }
}
{
  "module": "logcollector",
  "type": "file"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "base": {
    "tags": "string"
  },
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
{
  "module": "inventory",
  "type": "package"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "base": {
    "tags": "string"
  },
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}

Stateful

{ 
  "agent": { 
    "uuid": "UUID", 
    "groups": [ ], 
    "os": "Amazon Linux 2", 
    "platform": "Linux", 
    "type": "Endpoint", 
    "version": "5.0.0", 
    "ip": "192.168.1.2"
  }
}
{
  "module": "inventory",
  "type": "package",
  "operation": "modified",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "add",
  "id": "lskdjf023984902358"
}
{
  "scan_time": "2024-10-28T18:26:10.634Z",
  "package": {
    "architecture": "string",
    "description": "string",
    "installed": "2024-10-28T18:26:10.634Z",
    "name": "string",
    "path": "string",
    "size": 0,
    "type": "string",
    "version": "string"
  }
}
{
  "module": "inventory",
  "type": "network",
  "operation": "delete",
  "id": "asdfsdfkdsj98237498325"
}

@cborla
Member

cborla commented Nov 6, 2024

A new column is added to the queue, to store the module metadata.

New queue structure:

module_name module_type metadata data

This allows the module metadata to be included in the new object and keeps each metadata/data pair together on a per-event basis.

{
  "module": "logcollector",
  "type": "file"
}

and

{
  "module": "inventory",
  "type": "network",
  "operation": "delete",
  "id": "asdfsdfkdsj98237498325"
}
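
One way to picture the new queue layout is the Python sketch below, which stores the four columns named above in an in-memory SQLite table. The choice of SQLite, the extra `id` column used for ordering, the column types, and the function names are assumptions for illustration only and are not taken from the agent's code.

```python
import json
import sqlite3
from typing import Any, Optional


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Create the queue table with the module_name, module_type, metadata and data columns."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # assumed ordering column
        " module_name TEXT NOT NULL,"
        " module_type TEXT,"
        " metadata TEXT NOT NULL,"
        " data TEXT)"
    )
    return conn


def push(conn: sqlite3.Connection, metadata: dict[str, Any], data: Optional[dict[str, Any]] = None) -> None:
    """Store one event: the metadata object and, when present, its data object."""
    conn.execute(
        "INSERT INTO queue (module_name, module_type, metadata, data) VALUES (?, ?, ?, ?)",
        (metadata["module"], metadata.get("type"), json.dumps(metadata),
         None if data is None else json.dumps(data)),
    )
    conn.commit()


def read_pairs(conn: sqlite3.Connection) -> list[tuple[dict[str, Any], Optional[dict[str, Any]]]]:
    """Return (metadata, data) pairs in insertion order without removing them."""
    rows = conn.execute("SELECT metadata, data FROM queue ORDER BY id").fetchall()
    return [(json.loads(meta), None if body is None else json.loads(body)) for meta, body in rows]


conn = open_queue()
push(conn, {"module": "logcollector", "type": "file"}, {"event": {"original": "string"}})
push(conn, {"module": "inventory", "type": "network", "operation": "delete", "id": "asdfsdfkdsj98237498325"})
print(read_pairs(conn))
```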

@cborla
Member

cborla commented Nov 7, 2024

Update 6/11

  • Added new metadata column for queue.
  • Adjusted unit tests for all modules or components that use the queue.
  • Started changes to support new Stateless structure.

@cborla
Member

cborla commented Nov 8, 2024

Update 7/11

  • Fixed UT.
  • Fixed rebase conflicts.
  • Completed logcollector Stateless changes.
  • Testing logcollector Stateless events.

@cborla
Member

cborla commented Nov 12, 2024

Update 8/11

  • Fixed logcollector Stateless.
  • Rebase to master.
  • Refactoring inventory tables to comply with the ECS format.
Stateless
{
  "module": "logcollector",
  "type": "file"
}
{
  "log": {
    "file": {
      "path": "string"
    }
  },
  "tags": ["string"],
  "event": {
    "original": "string",
    "ingested": "string",
    "module": "string",
    "provider": "string"
  }
}
Example
{
    "module": "logcollector",
    "type": "file"
}
{
    "event":
    {
        "ingested": "",
        "module": "logcollector",
        "original": "Testing message!",
        "provider": "syslog"
    },
    "log":
    {
        "file":
        {
            "path": "/tmp/test.log"
        }
    },
    "tags":
    [
        "mvp"
    ]
}

Update 11/11

  • After an E2E test, a problem was found in the event consumption mechanism, which has now been fixed.
  • Rebase to master.
  • PR ready to review and merge to master.
  • Queue updated.
  • Logcollector stateful message.
Example event collected with mock server.
[2024-11-12 02:25:06] POST /api/v1/events/stateless
Headers:
Host: localhost
User-Agent: WazuhXDR/5.0.0 (Endpoint; x86_64; Linux)
Accept: application/json
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJ3YXp1aCIsImF1ZCI6IldhenVoIENvbW11bmljYXRpb25zIEFQSSIsImlhdCI6MTczMTM3ODI5NCwiZXhwIjoxNzMxMzc4MzU0LCJ1dWlkIjoiZWRhYjllZjYtZjAyZC00YTRiLWJhYTQtZjJhZDEyNzg5ODkwIn0.aiAqjq2Nm9giF7jKGz8L8rsA1JXb5L25rNuKUZvwLAg
Content-Type: application/json
Content-Length: 404

Body:
{"agent":{"groups":[],"host":{"architecture":"x86_64","hostname":"chb-VBox","ip":"10.0.2.5","os":{"name":"Ubuntu","platform":"Linux"}},"id":"ee9009ba-f2db-4ac4-a74f-77f52c2d421a","type":"Endpoint","version":"5.0.0"}}
{"module":"logcollector","type":"file"}
{"event":{"ingested":"","module":"logcollector","original":"hola wazuh","provider":"syslog"},"log":{"file":{"path":"/tmp/test.log"}},"tags":["mvp"]}
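
For reference, a body like the one captured above can be split back into its parts with a short Python sketch. It assumes one JSON object per line and uses a heuristic that is not stated in the thread: an object with a top-level `module` key is treated as a header, and the next object without one is treated as its data. `parse_body` is a hypothetical name.

```python
import json
from typing import Any, Optional

Pair = tuple[dict[str, Any], Optional[dict[str, Any]]]


def parse_body(body: str) -> tuple[dict[str, Any], list[Pair]]:
    """Split a newline-delimited body into agent metadata and (header, data) pairs."""
    objects = [json.loads(line) for line in body.splitlines() if line.strip()]
    if not objects or "agent" not in objects[0]:
        raise ValueError("body must start with an agent object")
    events: list[Pair] = []
    for obj in objects[1:]:
        if "module" in obj:
            events.append((obj, None))  # a header opens a new event
        elif events and events[-1][1] is None:
            events[-1] = (events[-1][0], obj)  # data for the most recent header
        else:
            raise ValueError("data object without a preceding header")
    return objects[0]["agent"], events


example_body = """\
{"agent":{"groups":[],"host":{"architecture":"x86_64","hostname":"chb-VBox","ip":"10.0.2.5","os":{"name":"Ubuntu","platform":"Linux"}},"id":"ee9009ba-f2db-4ac4-a74f-77f52c2d421a","type":"Endpoint","version":"5.0.0"}}
{"module":"logcollector","type":"file"}
{"event":{"ingested":"","module":"logcollector","original":"hola wazuh","provider":"syslog"},"log":{"file":{"path":"/tmp/test.log"}},"tags":["mvp"]}
"""
agent, events = parse_body(example_body)
print(agent["id"], events)
```

The heuristic works for the payloads in this thread because data objects only carry `module` nested under `event`, never at the top level.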

Update 12/11

@vikman90 vikman90 added level/epic Epic issue and removed level/task Task issue level/epic Epic issue labels Nov 12, 2024
Projects
Status: In progress
Development

No branches or pull requests

6 participants