@mergify mergify bot commented Sep 12, 2025

Proposed commit message

This commit correctly handles the binary encoding used by
journald. Now, when the message of an entry is in the binary format, we
convert it to a string instead of emitting the raw byte slice.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

Events whose message is encoded in the binary format are now handled correctly, instead of appearing as the string representation of a slice of bytes. This restores the behaviour from before we moved to journalctl.

How to test this PR locally

Run the tests

cd filebeat
go test -v -count=1 ./input/journald/...

Or only the tests added by this PR:

go test -v -count=1 -run=TestBinaryData ./input/journald

Run Filebeat using one of the test journals

cd filebeat/input/journald/testdata
gunzip --keep binary.journal.gz
cd ../../..

Run Filebeat with the following configuration:

filebeat.inputs:
  - type: journald
    id: jd-1
    paths:
      - input/journald/testdata/binary.journal

queue.mem:
  flush.timeout: 0

output.file:
  path: ${path.home}
  filename: output
  rotate_on_startup: false

logging:
  to_stderr: true

Ensure there are 9 entries in the output file:

wc -l output-*.ndjson

Look at the message field of every entry:

cat output-*.ndjson | jq '.message'

The output should look like this:

"\u0000\u0002\u0004\b\n\f\u000e\u0010\u0012"
"\u0000\n\u0014\u001e(2<FPZd"
"������������️������������������⠀������������������������������❗"
"FOO\\nBAR\\nFOO"
"🏠👁️🪵🪵🟠⠀🌊🟠🎀🪵💧❗"
"\u001b[?2004hroot@7aa80ab6eac4:/# echo foo bar\r"
"\u001b[?2004l\rfoo bar\r"
"\u001b[?2004hroot@7aa80ab6eac4:/# exit\r"
"\u001b[?2004l\rexit\r"

Alternatively, you can edit the configuration to send the events to
Elasticsearch and inspect them in Kibana:

Screenshot_2025-09-05_11-49-26

Using your own test data

The following Go program writes directly to journald's socket using
an encoding that supports binary data and strings containing
\n. The program is also available in filebeat/input/journald/README.md.

main.go

package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"log"
	"net"
)

func main() {
	jd, err := newJdWriter("experiment")
	if err != nil {
		log.Fatal(err)
	}
	defer jd.Close()

	// Two binary payloads and one string containing a literal backslash-n.
	messages := [][]byte{
		{0, 2, 4, 8, 10, 12, 14, 16, 18},
		{0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100},
		[]byte(`FOO\nBAR\nFOO`),
	}

	for _, msg := range messages {
		written, err := jd.Write(msg)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%d bytes written to Journald socket\n", written)
	}

}

type jdWriter struct {
	id   string
	conn net.Conn
}

func newJdWriter(id string) (jdWriter, error) {
	conn, err := net.Dial("unixgram", "/run/systemd/journal/socket")
	if err != nil {
		return jdWriter{}, fmt.Errorf("cannot open unix socket: %w", err)
	}

	jd := jdWriter{
		id:   id,
		conn: conn,
	}

	return jd, nil
}

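// Write sends msg to journald as the MESSAGE field using the binary-safe
// framing: field name, newline, 64-bit little-endian length, payload, newline.
func (j jdWriter) Write(msg []byte) (int, error) {
	w := &bytes.Buffer{}

	fmt.Fprintf(w, "SYSLOG_IDENTIFIER=%s\n", j.id)
	w.WriteString("MESSAGE")
	w.WriteString("\n")
	l := len(msg)
	if err := binary.Write(w, binary.LittleEndian, uint64(l)); err != nil {
		// Return the error so the caller decides how to handle it.
		return 0, fmt.Errorf("cannot encode message length: %w", err)
	}

	w.Write(msg)
	w.WriteString("\n")

	return j.conn.Write(w.Bytes())
}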

func (j jdWriter) Close() error {
	return j.conn.Close()
}

Then, in the input configuration, add a filter for the identifier you
defined:

filebeat.inputs:
  - type: journald
    id: jd-1
    syslog_identifiers:
      - experiment

Then run Filebeat and look at your output.
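The framing used by the `Write` method in the program above can be isolated and inspected without a journald socket. The sketch below is an illustration under assumptions, not part of this PR: the helper `encodeField` is hypothetical, and unlike the program above it falls back to the plain `NAME=value` form when the value has no newline, which journald's native protocol also accepts.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeField is a hypothetical helper that serialises one journal field.
// Values without a newline use the text form "NAME=value\n". Values that
// contain a newline use the binary-safe form: name, newline, 64-bit
// little-endian length, raw value, newline.
func encodeField(name string, value []byte) []byte {
	var buf bytes.Buffer
	buf.WriteString(name)

	if bytes.IndexByte(value, '\n') < 0 {
		buf.WriteByte('=')
		buf.Write(value)
		buf.WriteByte('\n')
		return buf.Bytes()
	}

	var size [8]byte
	binary.LittleEndian.PutUint64(size[:], uint64(len(value)))
	buf.WriteByte('\n')
	buf.Write(size[:])
	buf.Write(value)
	buf.WriteByte('\n')
	return buf.Bytes()
}

func main() {
	fmt.Printf("%q\n", encodeField("SYSLOG_IDENTIFIER", []byte("experiment")))
	fmt.Printf("%q\n", encodeField("MESSAGE", []byte("FOO\nBAR")))
}
```

Always using the length-prefixed form, as the program above does, is simpler and is what exercises the binary code path in the input.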

Screenshots

I also compared the implementation in this PR with Filebeat 8.12.0, which uses go-systemd instead of calling journalctl. Both handle the binary format in the same way; see the screenshot below.
Screenshot_2025-09-05_15-12-32

This is an automatic backport of pull request #46415 done by [Mergify](https://mergify.com).

Co-authored-by: Anderson Queiroz <[email protected]>
(cherry picked from commit f8907e3)
@mergify mergify bot added the backport label Sep 12, 2025
@mergify mergify bot requested a review from a team as a code owner September 12, 2025 16:49
@mergify mergify bot requested review from rdner and removed request for a team September 12, 2025 16:49
@mergify mergify bot requested a review from mauri870 September 12, 2025 16:49
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Sep 12, 2025
@github-actions github-actions bot added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Sep 12, 2025
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Sep 12, 2025

mergify bot commented Sep 15, 2025

This pull request has not been merged yet. Could you please review and merge it @belimawr? 🙏
