[Bug] Reading a table with an 'ipv6' column whose value is in the compressed form 'x::x' throws an exception #337

@ladebangbangde

Search before asking

  • I had searched in the issues and found no similar issues.

Version

25.0.1

What's Wrong?

When the connector reads a table that contains an IPV6 column holding a value in the compressed form "x::x", it throws a DorisException.

What You Expected?

Flaw Description:

I have identified the root cause of the issue, outlined as follows:

Scenario Setup:

  • A table is created via JDBC (e.g., using Navicat).
  • This table contains a column defined with the IPV6 type.
  • An IPv6 address value like x::x (where x represents hexadecimal digits) is stored in this column.

Reading Process:

The Doris Spark Connector is used to read data from this table. The read process flows through:

  1. DorisPartitionReader.next()
  2. AbstractThriftReader.next()
  3. hasNext() check
  4. Initialization of a RowBatch using the retrieved information.
  5. Execution then enters the convertArrowToRowBatch() method of the RowBatch class.

Problematic Conversion:

Within the type-specific switch logic handling the IPV6 column, the following conversion occurs:

String ipv6Str = new String(ipv6VarcharVector.get(rowIndex)); // Convert Arrow vector data to Java String
String ipv6Address = IPUtils.fromBigInteger(new BigInteger(ipv6Str)); // Attempt conversion via BigInteger

The Core Issue: Calling new BigInteger(ipv6Str) on a string that contains colons, such as the compressed form x::x, cannot succeed. The BigInteger(String) constructor accepts only an optional sign followed by decimal digits. A textual IPv6 address does not match that format, so the constructor throws a NumberFormatException.
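
The failure, and one possible guard against it, can be sketched in isolation. This is only an illustration and not the connector's code: the class Ipv6TextSketch and the method toIpv6Text are hypothetical names, and the sketch assumes that a decimal value, when one arrives, is the non-negative 128-bit address in big-endian order. It uses only java.math.BigInteger and java.net.InetAddress from the JDK.

```java
import java.math.BigInteger;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of the Doris Spark Connector.
class Ipv6TextSketch {

    // Returns the textual form of an IPv6 value that arrives either as
    // text ("2001:db8::1") or as a non-negative decimal number ("1").
    static String toIpv6Text(String raw) {
        try {
            if (raw.indexOf(':') >= 0) {
                // A literal address is only checked for format validity.
                return InetAddress.getByName(raw).getHostAddress();
            }
            // Assumption: the decimal form is the 128-bit address, big-endian.
            byte[] digits = new BigInteger(raw).toByteArray();
            byte[] address = new byte[16];
            int len = Math.min(digits.length, 16);
            // Right-align the low-order bytes in the 16-byte buffer.
            System.arraycopy(digits, digits.length - len, address, 16 - len, len);
            return InetAddress.getByAddress(address).getHostAddress();
        } catch (UnknownHostException | NumberFormatException e) {
            throw new IllegalArgumentException("Not an IPv6 value: " + raw, e);
        }
    }

    public static void main(String[] args) {
        try {
            new BigInteger("2001:db8::1"); // the kind of call the issue describes
        } catch (NumberFormatException e) {
            System.out.println("BigInteger rejected the textual form");
        }
        System.out.println(toIpv6Text("2001:db8::1")); // 2001:db8:0:0:0:0:0:1
        System.out.println(toIpv6Text("1"));           // 0:0:0:0:0:0:0:1
    }
}
```

Note that InetAddress returns the uncompressed form, and that an IPv4-mapped value such as ::ffff:1.2.3.4 comes back as an IPv4 address, so a real fix in RowBatch would probably want its own formatting step.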

How to Reproduce?

  1. Store an IPv6 address in compressed format (containing ::) in a database column. Make sure the column is mapped to a String type or is otherwise handled as a string during the read.
  2. Use the Doris Spark Connector to read data from this table.
  3. When the connector encounters the compressed IPv6 string (x::x) and passes it to new BigInteger(...), a NumberFormatException is thrown because the string is not a valid decimal number.

Anything Else?

Why does the "IPV6" branch in RowBatch use a break statement when it encounters a NULL value? I would have expected continue to make more sense. Please advise.
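
To make the question concrete, here is a minimal sketch of the difference, assuming the NULL check sits directly inside a per-row loop. The class NullHandlingSketch, the method convert, and the stopAtNull flag are hypothetical and stand in for the real RowBatch logic, which may be shaped differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a per-row conversion loop; not the RowBatch code.
class NullHandlingSketch {

    // Converts each value; stopAtNull selects break (true) or continue (false).
    static List<String> convert(List<String> column, boolean stopAtNull) {
        List<String> out = new ArrayList<>();
        for (String value : column) {
            if (value == null) {
                out.add(null);
                if (stopAtNull) {
                    break;    // leaves the loop: later rows are never converted
                }
                continue;     // skips only this row and moves on to the next
            }
            out.add(value.toUpperCase());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("a", null, "b");
        System.out.println(convert(column, true));  // [A, null]
        System.out.println(convert(column, false)); // [A, null, B]
    }
}
```

Under that assumption, break would silently drop every row after the first NULL in a batch, while continue would keep them.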

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
