Different sizing methods can cause  #260

Open
@BennettJames

Description

Problem

Messages can be clamped with the max_message_length parameter. When set, every message is truncated to at most that many characters.

However, the message is then validated against its bytesize, which differs from the string length (character count) whenever the message contains multi-byte characters. This means that even with max_message_length set to, say, 250k, well under the ~260k limit, a message can still exceed the limit if it contains many characters longer than 1 byte.
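To make the mismatch concrete, here is a small illustration of how character count and byte count diverge for UTF-8 strings. The variable names are only for this example and do not come from the plugin.

```ruby
# Character count vs. byte count for UTF-8 strings.
ascii  = "abcd"
hearts = "\u2764\u2764" # two U+2764 characters, 3 bytes each in UTF-8

puts ascii.length    # 4 characters
puts ascii.bytesize  # 4 bytes
puts hearts.length   # 2 characters
puts hearts.bytesize # 6 bytes
```

A limit expressed in characters therefore says nothing reliable about the size in bytes once non-ASCII text is involved.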

Steps to replicate

Here's a simplified reproduction of the issue:

def truncate_and_validate(msg, truncation, limit)
  if truncation
    msg = msg.slice(0, truncation)
  end

  if msg.bytesize > limit
    raise "message too long"
  end

  return msg
end

puts truncate_and_validate("abcd", 2, 2) # outputs 'ab'
puts truncate_and_validate("❤️❤️❤️❤️", 2, 2) # raises 'message too long'

Expected Behavior or What you need to ask

Message length clamping and validation should measure length in the same unit (bytes). E.g. do this for truncation:

if @max_message_length
  message = message.byteslice(0, @max_message_length)
end

One downside is that byteslice can cut through the middle of a multi-byte character and produce invalid UTF-8. In that case, the truncation may need an extra step to ensure the output remains validly encoded.
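One possible way to handle the invalid-UTF-8 concern is sketched below. It assumes the message is a UTF-8 encoded string; `truncate_to_bytes` is a hypothetical helper name used for illustration, not something the plugin provides. It relies on the standard `String#byteslice` and `String#scrub` methods.

```ruby
# Hypothetical helper (not part of the plugin): truncate a UTF-8 string
# to at most max_bytes bytes without leaving a partial character behind.
def truncate_to_bytes(message, max_bytes)
  return message if message.bytesize <= max_bytes

  # byteslice may cut inside a multi-byte character, leaving invalid bytes
  # at the end; scrub("") removes any invalid byte sequences.
  message.byteslice(0, max_bytes).scrub("")
end

puts truncate_to_bytes("abcd", 2)                  # ASCII: cut at 2 bytes
puts truncate_to_bytes("\u2764\u2764", 4).bytesize # partial second character dropped
```

Note that `scrub("")` also strips any invalid bytes that were already present earlier in the message, which may or may not be desirable. The result can be shorter than `max_bytes`, but never longer, so a subsequent bytesize check against the same limit should pass.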

Using Fluentd and CloudWatchLogs plugin versions

  • OS version
  • Bare Metal or within Docker or Kubernetes or others?
  • Fluentd v0.12 or v0.14/v1.0
    • paste result of fluentd --version or td-agent --version
  • Dependent gem versions
    • paste boot log of fluentd or td-agent
    • paste result of fluent-gem list, td-agent-gem list or your Gemfile.lock
