Lack of index on inode column in in_tail sqlite DB #10166

littlecatherine · 2025-04-02T18:08:48Z

Bug Report

Describe the bug
The SQL schema for the tail input state database does not create an explicit index on the inode column. Because SQL_GET_FILE looks files up by inode, this hurts query performance, especially as the number of records grows.

#define SQL_CREATE_FILES                                                \
    "CREATE TABLE IF NOT EXISTS in_tail_files ("                        \
    "  id      INTEGER PRIMARY KEY,"                                    \
    "  name    TEXT NOT NULL,"                                          \
    "  offset  INTEGER,"                                                \
    "  inode   INTEGER,"                                                \
    "  created INTEGER,"                                                \
    "  rotated INTEGER DEFAULT 0"                                       \
    ");"

#define SQL_GET_FILE                                                    \
    "SELECT * from in_tail_files WHERE inode=@inode order by id desc;"
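
To illustrate the effect, here is a minimal sketch using Python's built-in sqlite3 module rather than Fluent Bit itself. It recreates the table from SQL_CREATE_FILES in memory and asks SQLite for the query plan of the SQL_GET_FILE statement before and after adding an index. The index name idx_in_tail_files_inode is an assumption made for this example and is not taken from the Fluent Bit source.

```python
import sqlite3

# Schema copied from SQL_CREATE_FILES in the report.
SCHEMA = """
CREATE TABLE IF NOT EXISTS in_tail_files (
  id      INTEGER PRIMARY KEY,
  name    TEXT NOT NULL,
  offset  INTEGER,
  inode   INTEGER,
  created INTEGER,
  rotated INTEGER DEFAULT 0
);
"""

# Same statement as SQL_GET_FILE, using a named parameter.
GET_FILE = "SELECT * FROM in_tail_files WHERE inode=:inode ORDER BY id DESC;"

# Hypothetical index; the name is an assumption, not taken from Fluent Bit.
CREATE_INDEX = (
    "CREATE INDEX IF NOT EXISTS idx_in_tail_files_inode "
    "ON in_tail_files (inode);"
)


def query_plan(conn):
    """Return the detail strings from EXPLAIN QUERY PLAN for GET_FILE."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + GET_FILE, {"inode": 1}).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

plan_without_index = query_plan(conn)
conn.execute(CREATE_INDEX)
plan_with_index = query_plan(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index, the plan should describe a scan of in_tail_files; with it, the plan should mention a search using the index. The exact wording of the plan text varies between SQLite versions.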

To Reproduce

fluent-bit yaml config:

service:
  flush: 5
  grace: 5
  daemon: "off"
  dns.mode: UDP
  log_level: debug
  http_server: "on"
  http_listen: 0.0.0.0
  http_port: 2020
  coro_stack_size: 24576
  scheduler.cap: 2000
  scheduler.base: 5
  json.convert_nan_to_null: false
  sp.convert_from_str_to_num: true
  Health_Check: "on"
  Hot_Reload: "on"

pipeline:
  inputs:
    - db: /fluent-bit/data/in_tail.db
      name: tail
      path: /logs/sub1/*.log,/logs/sub2/*.log
  outputs:
    - name: stdout

Steps to reproduce the problem:

1. Create 6k log files matching /logs/sub1/*.log.
2. Create 400k log files matching /logs/sub2/*.log.
3. Run fluent-bit with the debug log level. On the first run, fluent-bit creates a new db. Note how long it takes to append all files in /logs/sub1/*.log during initialization.
4. Let it finish processing all files in /logs/sub2/*.log, then stop fluent-bit.
5. Run fluent-bit again using the existing db. Note how long it takes this time to append all files in /logs/sub1/*.log.

You will find that it takes much longer to process the same number of files as the number of records in the db grows.
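
The file-creation steps can be scripted. The following is a minimal sketch, assuming one short line per file is enough to make the tail input register it. The helper name create_log_files and the file contents are made up for this example; the 1.log, 2.log, ... naming follows the file name seen in the debug log.

```python
import os
import tempfile


def create_log_files(directory, count, line="sample log line\n"):
    """Create `count` files named 1.log .. count.log, each holding one line."""
    os.makedirs(directory, exist_ok=True)
    for i in range(1, count + 1):
        with open(os.path.join(directory, f"{i}.log"), "w") as fh:
            fh.write(line)
    return count


# Small demonstration in a temporary directory. For the reproduction, point
# the helper at /logs/sub1 (6k files) and /logs/sub2 (400k files) instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = os.path.join(tmp, "sub1")
    created = create_log_files(demo_dir, 5)
    demo_names = sorted(os.listdir(demo_dir))

print(created, demo_names)
```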

Example log message:
First run

[2025/03/25 14:31:24] [debug] [input:tail:in.tail.path] scanning path /logs/sub1/*.log
[2025/03/25 14:31:25] [debug] [input:tail:in.tail.path]  file will be read in POSIX_FADV_DONTNEED mode /logs/sub1/1.log
...
[2025/03/25 14:51:10] [debug] [input:tail:in.tail.path] 10000 new files found on path '/logs/sub1/*.log'
[2025/03/25 14:51:10] [debug] [input:tail:in.tail.path] scanning path /logs/sub2/*.log
...

Second run

[2025/03/25 15:41:17] [debug] [input:tail:in.tail.path] scanning path /logs/sub1/*.log
[2025/03/25 15:41:18] [debug] [input:tail:in.tail.path]  file will be read in POSIX_FADV_DONTNEED mode /logs/sub1/1.log
...
[2025/03/25 15:43:12] [debug] [input:tail:in.tail.path] 10000 new files found on path '/logs/sub1/*.log'
[2025/03/25 15:43:12] [debug] [input:tail:in.tail.path] scanning path /logs/sub2/*.log
...

Expected behavior
It should take approximately the same time to append the same number of files, regardless of the number of records in the table.

Your Environment

Version used: 3.2.4

littlecatherine added the status: waiting-for-triage label Apr 2, 2025
