Log raw events to a separate log file #38767
Conversation
This pull request does not have a backport label. To fixup this pull request, you need to add the backport labels for the needed branches.
💔 Build Failed

Build stats: Pipeline error

❕ Flaky test report: No test was executed to be analysed.

To re-run your PR in the CI, just comment with:
Pinging @elastic/elastic-agent (Team:Elastic-Agent)

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
- Unclear why the DialContext change is a part of this PR
- A linter issue needs to be fixed
- Looks like there is a forgotten replace directive that fails the check and docs CI steps.
This pull request is now in conflicts. Could you fix it? 🙏
- Use the event logger instead of "trace level" for debug logs containing events, both under Elastic-Agent and standalone Beats.
- Fix the flaky Python test. Moving the file instead of truncating it seems to solve the problem; maybe the write was not properly synced to disk. The advantage of moving the file is that we can inspect it after the test runs.
- Ensure the event data is not present in the normal log file.
- This commit extends the integration tests framework to read the events log file.
@belimawr my only concern is that I cannot make it work inside a k8s environment as described here: #38767 (comment). This is how I start Filebeat (built from your PR):

kubectl exec `kubectl get pod -n kube-system -l k8s-app=filebeat -o jsonpath='{.items[].metadata.name}'` -n kube-system -- bash -c "filebeat -e -c /etc/filebeat.yml -e logging.event_data.to_stderr=true"

And I still don't see any event logging, only my Filebeat logs:

root@kind-control-plane:/usr/share/filebeat# ls -lrt /var/log/containers/ | grep filebeat
lrwxrwxrwx 1 root root 92 May 22 07:36 filebeat-lkmpf_kube-system_filebeat-eff4163a56a2c461dc4f6ec77c971623869181261e251faec8ba2f145b5d1ff2.log -> /var/log/pods/kube-system_filebeat-lkmpf_241373d0-5916-4080-8c76-8e4b1152f7b5/filebeat/0.log

Should we test in k8s, or, to rephrase, is this out of scope? Am I doing something wrong in my testing?
Hi @gizas There is a typo in the command you ran: the second `-e` should be an uppercase `-E`, which is the flag used to set configuration options like `logging.event_data.to_stderr=true`.
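For reference, here is the same command with the flag corrected (a sketch derived from the command above, not a verbatim quote from the thread):

```sh
kubectl exec `kubectl get pod -n kube-system -l k8s-app=filebeat -o jsonpath='{.items[].metadata.name}'` \
  -n kube-system -- bash -c "filebeat -e -c /etc/filebeat.yml -E logging.event_data.to_stderr=true"
```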
I did test it again in Kubernetes (using Kind) with a fresh build from this PR and it works as expected. I edited the CLI arguments in the manifest directly because that felt easier for me. Here is the manifest I used: filebeat-kubernetes.yaml
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources:
  - jobs
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat
  # should be the namespace where filebeat is running
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""]
  resources:
  - configmaps
  resourceNames:
  - kubeadm-config
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: Role
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat-kubeadm-config
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: Role
  name: filebeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    filebeat.autodiscover:
      providers:
        - type: kubernetes
          node: ${NODE_NAME}
          namespace: default
          hints.enabled: true
          hints.default_config:
            type: filestream
            id: kubernetes-container-logs-${data.kubernetes.pod.name}-${data.kubernetes.container.id}
            paths:
              - /var/log/containers/*-${data.kubernetes.container.id}.log
            parsers:
              - container: ~
            prospector:
              scanner:
                fingerprint.enabled: true
                symlinks: true
            file_identity.fingerprint: ~

    output.discard:
      enabled: true

    logging:
      level: debug
      selectors:
        - processors
        - filestream
        - input.filestream
        - input
      # event_data:
      #   files:
      #     name: filebeat-events-data
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: "docker.elastic.co/beats/filebeat:8.15.0-SNAPSHOT"
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
          # "-E",
          # "logging.event_data.to_stderr=true"
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: elastic
        - name: ELASTICSEARCH_PASSWORD
          value: changeme
        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          # When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
```
A few things to notice:

- The output is `output.discard`, so events are read and logged but not sent anywhere; no Elasticsearch is needed for this test.
- Debug logging is enabled with the `processors`, `filestream`, `input.filestream` and `input` selectors, so events show up in the debug logs.
- The `logging.event_data` file settings and the `-E logging.event_data.to_stderr=true` CLI flags are commented out; uncomment them to exercise the event logger.
@gizas I also added the steps I'm using to test on Kubernetes to the PR description. I hope it helps. Let me know if you face any issues or need any more help.
@belimawr thank you so much for this! I would never have found the `-E`, no way!!!!
I guess this is important info that should be written down somewhere!!!
I created an issue about it last week: #39566. One thing I'm not 100% sure about is where/how to document it. Probably the manifest example is one of the best places to add it. We could add some rules to ensure Filebeat does not collect its own logs, or at least drops/does not collect debug logs. It will require a bit of time to design a robust solution. Just documenting "it's dangerous/don't do it" is not enough; we have to provide a good solution/alternative to our users.
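One possible shape for such a rule, sketched here as a hypothetical snippet (the `drop_event` processor and the `kubernetes.labels.k8s-app` field are standard Beats/autodiscover features, but this exact rule is not part of this PR):

```yaml
# Hypothetical: drop Filebeat's own container logs, matching on the pod
# label added by the Kubernetes autodiscover provider.
processors:
  - drop_event:
      when:
        equals:
          kubernetes.labels.k8s-app: "filebeat"
```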
Notes for Reviewers
Updating `elastic-agent-libs` to the latest version brought along some breaking changes in `transport.Dialer`, so this PR also updates all types implementing this interface to be compatible with the new interface definition.

Proposed commit message
This commit introduces a new logger core that can be configured through `logging.event_data` and is used to log any message that contains the whole event or could contain any sensitive data. This is accomplished by adding `log.type: event` to the log entry. The logger core is responsible for filtering the log entries and directing them to the correct files.

At the moment it is used by multiple outputs to log indexing errors containing the whole event and errors returned by Elasticsearch that can potentially contain the whole event.
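As a sketch, here is how the new settings could look in a Beat configuration; the `to_stderr` flag and the `filebeat-events-data` file name both appear elsewhere in this PR, while the surrounding structure is assumed from the commented-out snippet in the manifest above:

```yaml
logging:
  level: info
  # Separate handling for entries tagged with log.type: event
  event_data:
    to_stderr: false               # true sends raw-event logs to stderr instead of a file
    files:
      name: filebeat-events-data   # produces files matching filebeat-events-data*.ndjson
```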
Expected behaviour when running under Elastic-Agent
When running under Elastic-Agent, Beats are started with the CLI flags `-E logging.sensitive.to_stderr=true -E logging.sensitive.to_files=false`. The Elastic-Agent collects the stderr and stdout from the Beat, wraps every line in a JSON object containing some metadata, and logs it to the correct log file. The concept of an event log file will be added to the Elastic-Agent.

Checklist
- Entry added in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

- Update `<beat>.reference.yml`
- Update `elastic-agent-libs` once "[logp] Add typed loggers allowing log entries to go to different outputs" (elastic-agent-libs#171) is merged
- Merge `main` once "build(deps): bump github.com/elastic/elastic-agent-libs from 0.7.5 to 0.9.4" (#39013) is merged
- Merge `main` once "Disable logging when all outputs are disabled" (elastic-agent-libs#204) is merged
- Merge `main` once "Remove shipper" (#39584) is merged
- Merge `main` once "x-pack/filebeat/input/httpjson: skip flakey test on windows" (#39678) is merged, or "[Flaky Test] HTTPJSON input - TestMetrics/Test_pagination_metrics - expected non zero value for metric httpjson_interval_pages_execution_time" (#39676) is solved

How to test this PR locally
Standalone Filebeat
Start Filebeat with a configuration like the sketch below.
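A minimal sketch (hypothetical; the `filestream` input and the `output.discard` output mirror the Kubernetes manifest earlier in this thread):

```yaml
# Hypothetical minimal filebeat.yml for this test.
filebeat.inputs:
  - type: filestream
    id: event-logger-test
    paths:
      - /tmp/flog.log

# Drop events after processing; no Elasticsearch needed.
output.discard:
  enabled: true

# Debug logging so events are logged.
logging:
  level: debug
```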
Create the log file `/tmp/flog.log` with a few lines of content (a sample generator is sketched below).
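Any plain-text log lines will do; a hypothetical way to generate some content:

```sh
# Write 100 sample lines to the file Filebeat is watching.
for i in $(seq 1 100); do
  echo "$(date -u +%FT%TZ) INFO sample message $i" >> /tmp/flog.log
done
```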
Raw events should be logged to a different log file, in the same folder as the normal logs; the filename matches the glob `filebeat-events-data*.ndjson`. If you need to run the test again, either add more data to `/tmp/flog.log` or remove the `data` folder Filebeat created at startup; this will make Filebeat re-ingest the file. By default the logs go in a `logs` folder that's created in the same folder you're running Filebeat from; the files created when running this branch look like the listing below.
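A hypothetical listing (actual file names include the current date and will differ):

```sh
$ ls logs/
filebeat-20240527.ndjson
filebeat-events-data-20240527.ndjson
```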
Filebeat in Kubernetes
1. Package Filebeat: `DEV=true SNAPSHOT=true PACKAGES=docker PLATFORMS=linux/amd64 mage -v package` (adjust the platform as needed)
2. Create a cluster: `kind create cluster`
3. Load the image: `kind load docker-image docker.elastic.co/beats/filebeat:8.15.0-SNAPSHOT`
4. Start a log generator: `kubectl create deployment flog --image=mingrammer/flog -- flog -d 1 -s 1 -l`
5. Deploy Filebeat: `kubectl apply -f filebeat-kubernetes.yaml`
If you want to see the event logs going to stderr, uncomment the following lines from the configuration (shown below). If you already created the Filebeat pod, you'll have to delete it and recreate it with the new configuration.
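Concretely, these are the container args from the DaemonSet manifest shown earlier, with the flags uncommented:

```yaml
args: [
  "-c", "/etc/filebeat.yml",
  "-e",
  "-E",
  "logging.event_data.to_stderr=true"
]
```

After recreating the pod, event entries should show up on stderr and can be identified by their `log.type: event` field.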
filebeat-kubernetes.yaml
A couple of important things to notice about the provided Kubernetes manifest/Filebeat configuration:

- The image is a locally built `8.15.0-SNAPSHOT`, so it must be loaded into the cluster (see the `kind load docker-image` step above).
- Filebeat mounts `/var/log` from the host and autodiscovers container logs, so it can end up collecting its own logs; see #39566.
- The event-data logging settings are present but commented out, both in `filebeat.yml` and in the container args.
Related issues