S3 Sink: org.apache.avro.SchemaParseException: Illegal character in #1822

@sbuliarca

Description

@sbuliarca

Hi!

We're backing up all topics using Parquet, with 'store.envelope'=true so the headers are stored as well. Here are the details:

Stream-reactor-version: 8.1.30
Kafka connect version: based on Docker image confluentinc/cp-kafka-connect:7.8.0
Backup config:

{
"connector.class": "io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector",
"tasks.max": "10",
"connect.s3.aws.auth.mode": "Default",
"connect.s3.compression.codec": "zstd",
"topics.regex": ".*",
"key.converter.schemas.enable": "false",
"connect.s3.kcql": "insert into msk-backup:msk-backup-parquet select * from * PARTITIONBY _topic, _partition STOREAS PARQUET PROPERTIES ('store.envelope'=true,'flush.count'=1000,'flush.interval'=300, 'flush.size'=50000000, 'partition.include.keys'=false);",
"name": "msk-s3-backup-sink-all-parquet",
"value.converter.schemas.enable": "false",
"connect.s3.compression.level": "9",
"connect.s3.aws.region": "eu-west-1",
"value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
"key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter"
}

The Problem:
When encountering a message with the following headers:

{"traceparent":"00-fe5798fdc39c42a21010f92fea8a85fc-dbb78d479a0b3099-01","reason-for-dlq":"test from actionTestDlq","original-topic":"beb-dlq-test"}

it fails with the error:

org.apache.avro.SchemaParseException: Illegal character in: original-topic
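The cause is the hyphen in the header key: the Avro specification only allows names matching `[A-Za-z_][A-Za-z0-9_]*`, so `original-topic` (and `reason-for-dlq`) cannot be used as an Avro record field name when the envelope schema is built. A minimal sketch of that name rule (this is the spec's regex, not the connector's actual validation code):

```python
import re

# Avro spec: a name must start with [A-Za-z_] and contain
# only [A-Za-z0-9_] afterwards. Hyphens are illegal.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_avro_name(name: str) -> bool:
    """Return True if `name` is a legal Avro record/field name."""
    return AVRO_NAME.fullmatch(name) is not None

for header in ("traceparent", "reason-for-dlq", "original-topic"):
    print(header, "->", is_valid_avro_name(header))
# traceparent -> True
# reason-for-dlq -> False
# original-topic -> False
```

So any topic carrying hyphenated header keys (common for W3C tracing and DLQ metadata) will trip this when headers are mapped into the Parquet/Avro envelope.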

I also checked on the latest version, 10.0.0, and the same issue happens.
