[Bug] Sink with CSV format and gz compression hangs during batch writes to Doris: cache byte counter is released incorrectly #613

@qq461613840

Description

Search before asking

  • I had searched in the issues and found no similar issues.

Version

25.1.0

What's Wrong?

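With the gz compression option enabled for the CSV format, the job hangs after running for some time. The log then repeats the following: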
2025-09-09 15:54:51,372 INFO org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - Cache full, waiting for flush, currentBytes: 314572855, maxBlockedBytes: 314572800
2025-09-09 15:54:52,335 INFO org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - bufferMap is empty, no need to flush null
2025-09-09 15:54:52,372 INFO org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - Cache full, waiting for flush, currentBytes: 314572855, maxBlockedBytes: 314572800
2025-09-09 15:54:53,372 INFO org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - Cache full, waiting for flush, currentBytes: 314572855, maxBlockedBytes: 314572800
2025-09-09 15:54:54,335 INFO org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - bufferMap is empty, no need to flush null
2025-09-09 15:54:54,373 INFO org.apache.doris.flink.sink.batch.DorisBatchStreamLoad [] - Cache full, waiting for flush, currentBytes: 314572855, maxBlockedBytes: 314572800

My configuration:
Properties props = new Properties();
props.setProperty("column_separator", ",");
props.setProperty("line_delimiter", "\n");
props.setProperty("format", "csv");
props.setProperty("compress_type", "gz");

return DorisExecutionOptions.builder()
        .setLabelPrefix(tableName + "-" + System.currentTimeMillis())
        .setDeletable(false)
        .setBatchMode(true)
        .setBufferFlushMaxRows(20000)
        .setBufferFlushIntervalMs(2000)
        .setStreamLoadProp(props)
        .build();

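After reading the source code, my hypothesis is:
cacheBeforeFlushBytes records the size before compression.
After the data is sent,
currentCacheBytes.getAndAdd(-respContent.getLoadBytes()); subtracts the size after compression.

As a result, the counter is never fully released. The leftover bytes accumulate with every flush until currentCacheBytes reaches maxBlockedBytes, which triggers "Cache full, waiting for flush", and the writer stays blocked indefinitely.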
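
The suspected mismatch can be reproduced in isolation. The following is a minimal sketch, not the connector's actual code: the class `CounterLeakSketch` and its methods are hypothetical. It assumes, as this report hypothesizes, that the counter is incremented by the uncompressed size and decremented by the compressed size that Doris reports as loaded. Only the names `currentCacheBytes` and `cacheBeforeFlushBytes` are taken from the report.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import java.util.zip.GZIPOutputStream;

public class CounterLeakSketch {

    // Hypothetical helper: builds a repetitive CSV payload of the given row count.
    static byte[] sampleCsv(int rows) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < rows; i++) {
            sb.append("1,alice,2025-09-09\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: gzip-compresses a payload in memory.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Simulates one buffer-and-flush cycle and returns what is left in the counter.
    // releaseByUncompressedSize = false mimics the suspected bug;
    // releaseByUncompressedSize = true mimics one possible fix.
    static long residualAfterFlush(byte[] raw, boolean releaseByUncompressedSize) {
        AtomicLong currentCacheBytes = new AtomicLong();

        // Records are counted at their uncompressed size when buffered.
        currentCacheBytes.getAndAdd(raw.length);
        long cacheBeforeFlushBytes = raw.length;

        // Assumption: stands in for respContent.getLoadBytes() with gz enabled.
        long loadBytes = gzip(raw).length;

        long released = releaseByUncompressedSize ? cacheBeforeFlushBytes : loadBytes;
        currentCacheBytes.getAndAdd(-released);
        return currentCacheBytes.get();
    }

    public static void main(String[] args) {
        byte[] raw = sampleCsv(20000);
        System.out.println("leaked bytes (release by compressed size):   "
                + residualAfterFlush(raw, false));
        System.out.println("leaked bytes (release by uncompressed size): "
                + residualAfterFlush(raw, true));
    }
}
```

If the hypothesis holds, every flush leaves a positive residue in the counter. Those residues would add up until the counter reaches maxBlockedBytes even though bufferMap is empty, which matches the log above. Releasing the amount recorded before the flush is one possible direction for a fix. Whether it fits the connector's retry and failure paths would need to be checked against the real source.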

What You Expected?

I expect this bug to be fixed.

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
