tracee event context has type overflows #3690

Open
@rafaeldtinoco

Description

While aware of #2870, we still need to support and fix the current tracee event context. It has data type overflows that are likely causing context to be lost: any uint32 value from the eBPF context might overflow when decoded into the Event struct. One example is the PIDNS value, declared as int, which defaults to very high numbers in my environment.

The new event.proto takes care of correct types (using google.protobuf.UInt32Value in the right places, kudos to @josedonizetti).

More details

While playing with some marshalling and unmarshalling ideas (for other output printers, for example), I got the following error after an event -> json -> parquet conversion:

{
  "level": "error",
  "ts": 1700017058.446953,
  "msg": "Error encoding event to parquet",
  "error": "integer overflow on token 4026531841",
  "event": {
    "timestamp": 1700017058317424171,
    "threadStartTime": 1699918738371868249,
    "processorId": 7,
    "processId": 463085,
    "cgroupId": 4794,
    "threadId": 463085,
    "parentProcessId": 462937,
    "hostProcessId": 463085,
    "hostThreadId": 463085,
    "hostParentProcessId": 462937,
    "userId": 1000,
    "mountNamespace": 4026531841,
    "pidNamespace": 4026531836,
    "processName": "code",
    "executable": {
      "path": ""
    },
    "hostName": "rugged",
    "containerId": "",
    "container": {},
    "kubernetes": {},
    "eventId": "257",
    "eventName": "openat",
    "matchedPolicies": [
      ""
    ],
    "argsNum": 4,
    "returnValue": 45,
    "syscall": "openat",
    "stackAddresses": null,
    "contextFlags": {
      "containerStarted": false,
      "isCompat": false
    },
    "threadEntityId": 1656711223,
    "processEntityId": 1656711223,
    "parentEntityId": 723639701,
    "args": [
      {
        "name": "dirfd",
        "type": "int",
        "value": -100
      },
      {
        "name": "pathname",
        "type": "const char*",
        "value": "/proc/463548/cmdline"
      },
      {
        "name": "flags",
        "type": "string",
        "value": ""
      },
      {
        "name": "mode",
        "type": "mode_t",
        "value": 0
      }
    ]
  }
}

and:

$ sudo lsns | grep 4026531836
4026531836 pid       316      1 root            /sbin/init

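and the namespace ID 4026531836 in binary is

11101111111111111111111111111100

which isn't an unsigned 32-bit overflow, but it is a signed 32-bit overflow: the highest bit is set, so the value exceeds the int32 maximum of 2147483647.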
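
The arithmetic above can be checked with a short sketch. The helper names `fitsInt32` and `reinterpretAsInt32` are made up for illustration and are not part of tracee; the value is the pidNamespace from the event above.

```go
package main

import (
	"fmt"
	"math"
)

// fitsInt32 reports whether v can be stored in a signed 32-bit integer.
// Hypothetical helper, for illustration only.
func fitsInt32(v uint32) bool {
	return v <= math.MaxInt32
}

// reinterpretAsInt32 keeps the same 32 bits but reads them as a signed value.
// Hypothetical helper, for illustration only.
func reinterpretAsInt32(v uint32) int32 {
	return int32(v)
}

func main() {
	var pidNS uint32 = 4026531836 // pidNamespace from the event above

	fmt.Printf("%032b\n", pidNS)           // 11101111111111111111111111111100
	fmt.Println(fitsInt32(pidNS))          // false
	fmt.Println(reinterpretAsInt32(pidNS)) // -268435460
}
```

Any consumer that stores this field in a signed 32-bit column either rejects the value or reads it as a negative number.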

This means that several Event fields declared as int should actually be uint32, to match the Context struct:

type Context struct {
	Ts              uint64
	StartTime       uint64
	CgroupID        uint64
	Pid             uint32
	Tid             uint32
	Ppid            uint32
	HostPid         uint32
	HostTid         uint32
	HostPpid        uint32
	Uid             uint32
	MntID           uint32
	PidID           uint32
	Comm            [16]byte
	UtsName         [16]byte
	Flags           uint32
	LeaderStartTime uint64
	ParentStartTime uint64
	EventID         events.ID // int32
	Syscall         int32
	MatchedPolicies uint64
	Retval          int64
	StackID         uint32
	ProcessorId     uint16
	_               [2]byte // padding
}
type Event struct {
	Timestamp             int          `json:"timestamp"`
	ThreadStartTime       int          `json:"threadStartTime"`
	ProcessorID           int          `json:"processorId"`
	ProcessID             int          `json:"processId"`
	CgroupID              uint         `json:"cgroupId"`
	ThreadID              int          `json:"threadId"`
	ParentProcessID       int          `json:"parentProcessId"`
	HostProcessID         int          `json:"hostProcessId"`
	HostThreadID          int          `json:"hostThreadId"`
	HostParentProcessID   int          `json:"hostParentProcessId"`
	UserID                int          `json:"userId"`
	MountNS               int          `json:"mountNamespace"`
	PIDNS                 int          `json:"pidNamespace"`
	ProcessName           string       `json:"processName"`
	Executable            File         `json:"executable"`
	HostName              string       `json:"hostName"`
	ContainerID           string       `json:"containerId"`
	Container             Container    `json:"container,omitempty"`
	Kubernetes            Kubernetes   `json:"kubernetes,omitempty"`
	EventID               int          `json:"eventId,string"`
	EventName             string       `json:"eventName"`
	PoliciesVersion       uint16       `json:"-"`
	MatchedPoliciesKernel uint64       `json:"-"`
	MatchedPoliciesUser   uint64       `json:"-"`
	MatchedPolicies       []string     `json:"matchedPolicies,omitempty"`
	ArgsNum               int          `json:"argsNum"`
	ReturnValue           int          `json:"returnValue"`
	Syscall               string       `json:"syscall"`
	StackAddresses        []uint64     `json:"stackAddresses"`
	ContextFlags          ContextFlags `json:"contextFlags"`
	ThreadEntityId        uint32       `json:"threadEntityId"`  // thread task unique identifier (*)
	ProcessEntityId       uint32       `json:"processEntityId"` // process unique identifier (*)
	ParentEntityId        uint32       `json:"parentEntityId"`  // parent process unique identifier (*)
	Args                  []Argument   `json:"args"`            // args are ordered according their appearance in the original event
	Metadata              *Metadata    `json:"metadata,omitempty"`
}
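
As a rough sketch of what matching widths could look like, the snippet below copies three of the fields one to one. `rawContext`, `typedEvent` and `toEvent` are hypothetical, trimmed stand-ins for the structs above; this is not the actual tracee conversion code, and the real fix may well take a different shape (for example, the new event.proto).

```go
package main

import "fmt"

// rawContext is a trimmed, hypothetical stand-in for the Context struct above.
type rawContext struct {
	Pid   uint32
	MntID uint32
	PidID uint32
}

// typedEvent is a hypothetical subset of Event whose field widths match rawContext.
type typedEvent struct {
	ProcessID uint32 `json:"processId"`
	MountNS   uint32 `json:"mountNamespace"`
	PIDNS     uint32 `json:"pidNamespace"`
}

// toEvent copies each field as-is; no conversion changes width or signedness.
func toEvent(c rawContext) typedEvent {
	return typedEvent{ProcessID: c.Pid, MountNS: c.MntID, PIDNS: c.PidID}
}

func main() {
	// Values taken from the event shown earlier in this issue.
	e := toEvent(rawContext{Pid: 463085, MntID: 4026531841, PidID: 4026531836})
	fmt.Printf("%+v\n", e)
}
```

With explicit uint32 fields, the declared type tells downstream encoders the real range of the value instead of leaving them to guess from a plain int.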

The Event struct dates from 2020 and, since the beginning, has declared uint32 values as int. We probably never paid close attention because most of the values rarely overflow (and if they did, I believe no test would catch it). But the default PID namespace IDs are overflowing in my environments. I found this by accident, because Parquet does not support unsigned integers in its data format.
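
A regression check for this class of bug could feed a namespace ID above the int32 maximum through a serialization round trip. The sketch below uses Go's standard `encoding/json` as a stand-in for the Parquet encoder, which is an assumption: the original error came from a different library. `signedNS`, `unsignedNS` and the two decode helpers are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// signedNS models a consumer that stores the namespace ID as a signed 32-bit value.
type signedNS struct {
	PIDNS int32 `json:"pidNamespace"`
}

// unsignedNS models a consumer that stores the namespace ID as an unsigned 32-bit value.
type unsignedNS struct {
	PIDNS uint32 `json:"pidNamespace"`
}

// decodeSigned returns an error when the JSON number does not fit in int32.
func decodeSigned(data []byte) (int32, error) {
	var v signedNS
	err := json.Unmarshal(data, &v)
	return v.PIDNS, err
}

// decodeUnsigned succeeds for any value in the uint32 range.
func decodeUnsigned(data []byte) (uint32, error) {
	var v unsignedNS
	err := json.Unmarshal(data, &v)
	return v.PIDNS, err
}

func main() {
	data := []byte(`{"pidNamespace": 4026531836}`)

	_, err := decodeSigned(data)
	fmt.Println("int32 consumer error:", err)

	v, err := decodeUnsigned(data)
	fmt.Println("uint32 consumer:", v, err)
}
```

A test built on values like these would fail as soon as a signed 32-bit consumer appears anywhere in the output path.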
