@bienvenuushindi

Context:

While using DBDiff to compare data and schema, an issue was identified in how NULL values are handled during data comparison. When a column containing NULL was included in the MD5 hash computation, the hash for the whole row came out as NULL. The cause is the behavior of CONCAT in SQL, which returns NULL if any of the values being concatenated is NULL.

Root Cause:

When constructing MD5 hashes for row comparisons, the query builder called CONCAT without guarding against NULL values. Because MD5(NULL) is itself NULL, any row with a NULL column produced a NULL hash, so rows could be reported as identical even when they clearly differed.

For example:

SELECT CONCAT('value', NULL);  -- Result is NULL as shown below
+-----------------------+
| CONCAT('value', NULL) |
+-----------------------+
| NULL                  |
+-----------------------+
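
To make the failure mode and one possible guard concrete, here is a minimal MySQL sketch. It uses literal values only; no real DBDiff table, column, or query-builder code is shown, and wrapping each value in COALESCE is an assumption about one common way to address this, not necessarily the change made in this pull request.

```sql
-- Hypothetical literals stand in for column values.

-- Unguarded: the NULL propagates through CONCAT and then through MD5.
SELECT MD5(CONCAT('value', NULL)) AS row_hash;                -- NULL

-- Guarded: COALESCE swaps NULL for an empty string before concatenation,
-- so the hash is computed over 'value' and equals MD5('value').
SELECT MD5(CONCAT('value', COALESCE(NULL, ''))) AS row_hash;

-- Alternative guard: a sentinel string (hypothetical choice) keeps a NULL
-- column distinct from a column that holds a genuinely empty string.
SELECT MD5(CONCAT('value', COALESCE(NULL, '<NULL>'))) AS row_hash;
```

Replacing NULL with an empty string makes a NULL column and an empty-string column hash alike. The sentinel variant avoids that, provided the sentinel never occurs in the real data.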

@kroky
kroky commented Dec 17, 2024
