Description
While aware of #2870, we still need to support and fix the current tracee event context, because it has data type overflows that are likely causing context to be lost: every uint32 value coming from the eBPF context might overflow when decoded into the Event struct. One example is the PIDNS value, declared as int, which defaults to very high numbers in my env.
The new event.proto takes care of the correct types (using google.protobuf.UInt32Value in the right places, kudos to @josedonizetti).
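The width problem described above can be sketched in a few lines of Go. This is only an illustration, not tracee code: the helper name fitsInt32 is hypothetical, and the value is the PID namespace inode reported later in this issue.

```go
package main

import (
	"fmt"
	"math"
)

// fitsInt32 reports whether a uint32 value survives a conversion to a
// signed 32-bit integer without changing its numeric value.
// (Hypothetical helper, used only for this illustration.)
func fitsInt32(v uint32) bool {
	return v <= math.MaxInt32
}

func main() {
	pidNS := uint32(4026531836) // PID namespace inode from this report

	fmt.Println(fitsInt32(pidNS)) // false: above the int32 maximum
	fmt.Println(int32(pidNS))     // -268435460: wraps to a negative number
	fmt.Println(int64(pidNS))     // 4026531836: a wider signed type keeps it
}
```

Any consumer that narrows such a field to a signed 32-bit integer either wraps the value silently or, if it checks ranges, rejects it.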
More details
While playing with some marshalling and unmarshalling ideas (for other output printers, for example), I got the following error after an event -> JSON -> Parquet conversion:
{
  "level": "error",
  "ts": 1700017058.446953,
  "msg": "Error encoding event to parquet",
  "error": "integer overflow on token 4026531841",
  "event": {
    "timestamp": 1700017058317424171,
    "threadStartTime": 1699918738371868249,
    "processorId": 7,
    "processId": 463085,
    "cgroupId": 4794,
    "threadId": 463085,
    "parentProcessId": 462937,
    "hostProcessId": 463085,
    "hostThreadId": 463085,
    "hostParentProcessId": 462937,
    "userId": 1000,
    "mountNamespace": 4026531841,
    "pidNamespace": 4026531836,
    "processName": "code",
    "executable": {
      "path": ""
    },
    "hostName": "rugged",
    "containerId": "",
    "container": {},
    "kubernetes": {},
    "eventId": "257",
    "eventName": "openat",
    "matchedPolicies": [
      ""
    ],
    "argsNum": 4,
    "returnValue": 45,
    "syscall": "openat",
    "stackAddresses": null,
    "contextFlags": {
      "containerStarted": false,
      "isCompat": false
    },
    "threadEntityId": 1656711223,
    "processEntityId": 1656711223,
    "parentEntityId": 723639701,
    "args": [
      {
        "name": "dirfd",
        "type": "int",
        "value": -100
      },
      {
        "name": "pathname",
        "type": "const char*",
        "value": "/proc/463548/cmdline"
      },
      {
        "name": "flags",
        "type": "string",
        "value": ""
      },
      {
        "name": "mode",
        "type": "mode_t",
        "value": 0
      }
    ]
  }
}
and:
$ sudo lsns | grep 4026531836
4026531836 pid 316 1 root /sbin/init
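The PID namespace value 4026531836, which is 11101111111111111111111111111100 in binary, isn't quite an unsigned 32-bit overflow, but it is a signed 32-bit overflow: it fits in a uint32 but exceeds the int32 maximum of 2147483647.
This means that a bunch of Event fields that are int should actually be uint32: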
type Context struct {
    Ts              uint64
    StartTime       uint64
    CgroupID        uint64
    Pid             uint32
    Tid             uint32
    Ppid            uint32
    HostPid         uint32
    HostTid         uint32
    HostPpid        uint32
    Uid             uint32
    MntID           uint32
    PidID           uint32
    Comm            [16]byte
    UtsName         [16]byte
    Flags           uint32
    LeaderStartTime uint64
    ParentStartTime uint64
    EventID         events.ID // int32
    Syscall         int32
    MatchedPolicies uint64
    Retval          int64
    StackID         uint32
    ProcessorId     uint16
    _               [2]byte // padding
}
type Event struct {
    Timestamp             int          `json:"timestamp"`
    ThreadStartTime       int          `json:"threadStartTime"`
    ProcessorID           int          `json:"processorId"`
    ProcessID             int          `json:"processId"`
    CgroupID              uint         `json:"cgroupId"`
    ThreadID              int          `json:"threadId"`
    ParentProcessID       int          `json:"parentProcessId"`
    HostProcessID         int          `json:"hostProcessId"`
    HostThreadID          int          `json:"hostThreadId"`
    HostParentProcessID   int          `json:"hostParentProcessId"`
    UserID                int          `json:"userId"`
    MountNS               int          `json:"mountNamespace"`
    PIDNS                 int          `json:"pidNamespace"`
    ProcessName           string       `json:"processName"`
    Executable            File         `json:"executable"`
    HostName              string       `json:"hostName"`
    ContainerID           string       `json:"containerId"`
    Container             Container    `json:"container,omitempty"`
    Kubernetes            Kubernetes   `json:"kubernetes,omitempty"`
    EventID               int          `json:"eventId,string"`
    EventName             string       `json:"eventName"`
    PoliciesVersion       uint16       `json:"-"`
    MatchedPoliciesKernel uint64       `json:"-"`
    MatchedPoliciesUser   uint64       `json:"-"`
    MatchedPolicies       []string     `json:"matchedPolicies,omitempty"`
    ArgsNum               int          `json:"argsNum"`
    ReturnValue           int          `json:"returnValue"`
    Syscall               string       `json:"syscall"`
    StackAddresses        []uint64     `json:"stackAddresses"`
    ContextFlags          ContextFlags `json:"contextFlags"`
    ThreadEntityId        uint32       `json:"threadEntityId"`  // thread task unique identifier (*)
    ProcessEntityId       uint32       `json:"processEntityId"` // process unique identifier (*)
    ParentEntityId        uint32       `json:"parentEntityId"`  // parent process unique identifier (*)
    Args                  []Argument   `json:"args"`            // args are ordered according to their appearance in the original event
    Metadata              *Metadata    `json:"metadata,omitempty"`
}
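To make the type mismatch concrete, here is a minimal sketch of what a consumer that narrows these fields to signed 32 bits runs into. The two struct types and the decode helpers are hypothetical stand-ins, not tracee or Parquet code; only the JSON keys and the values are taken from the event above, and the standard library decoder is used in place of the Parquet encoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// signedNS mimics a consumer that stores namespaces as signed 32-bit values.
// (Hypothetical type, for illustration only.)
type signedNS struct {
	MountNS int32 `json:"mountNamespace"`
	PIDNS   int32 `json:"pidNamespace"`
}

// unsignedNS uses the width the eBPF context actually provides.
type unsignedNS struct {
	MountNS uint32 `json:"mountNamespace"`
	PIDNS   uint32 `json:"pidNamespace"`
}

// payload reuses the namespace values from the failing event.
const payload = `{"mountNamespace": 4026531841, "pidNamespace": 4026531836}`

func decodeSigned(data []byte) (signedNS, error) {
	var v signedNS
	err := json.Unmarshal(data, &v)
	return v, err
}

func decodeUnsigned(data []byte) (unsignedNS, error) {
	var v unsignedNS
	err := json.Unmarshal(data, &v)
	return v, err
}

func main() {
	// encoding/json refuses numbers that do not fit the target type.
	if _, err := decodeSigned([]byte(payload)); err != nil {
		fmt.Println("signed decode failed:", err)
	}

	// The same payload decodes cleanly into uint32 fields.
	u, err := decodeUnsigned([]byte(payload))
	fmt.Println(u.MountNS, u.PIDNS, err)
}
```

The same values pass or fail depending only on the signedness of the destination field, which matches the behavior reported in this issue.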
The Event struct dates from 2020 and, since the beginning, has had uint32 values declared as ints. We probably never paid close attention because most of the values don't overflow often (and if they did, I believe no test would catch it). But the default PID namespaces are overflowing in my environments. I only noticed this by accident, because Parquet does not support unsigned integers in its data format.
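For an output format that only offers signed integers, one possible workaround (an assumption on my side, not something decided in this issue) is to widen uint32 fields to int64 before encoding, since every uint32 value fits in an int64. The row type and the helper below are hypothetical names used only for this sketch:

```go
package main

import "fmt"

// signedRow is a hypothetical row type for a sink that only has signed
// integer columns.
type signedRow struct {
	MountNS int64
	PIDNS   int64
}

// toSignedRow widens the uint32 namespace values to int64. The conversion
// is lossless because the whole uint32 range fits in int64.
func toSignedRow(mountNS, pidNS uint32) signedRow {
	return signedRow{MountNS: int64(mountNS), PIDNS: int64(pidNS)}
}

func main() {
	row := toSignedRow(4026531841, 4026531836)
	fmt.Println(row.MountNS, row.PIDNS) // 4026531841 4026531836
}
```

This trades a wider column for correctness; fixing the field types in the Event struct is still needed so that the values are unambiguous at the source.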