Add support for Tuples & Variants #368
Conversation
Force-pushed from 20be81e to c7912e5
For anyone interested, I've released the zip here.
Force-pushed from c7912e5 to 637ca42
The following tests are failing due to #367:
@jirislav Thanks for the contribution. I have disabled those tests in my latest merge to main. Can you align your branch with main and check whether the tests are still failing?
Force-pushed from dd5f75f to ee92764
@mzitnik Thank you! I have rebased my changes and adjusted the GitHub workflow definition to run the cloud tests only when the secrets are defined. All tests are passing except this one:

I have no idea what's wrong here; I haven't touched anything related (I think). When I inspect the actual table contents, the …
Force-pushed from ba966a0 to f2ba783
Ok, it turns out it was actually my fault. During the refactor, the … I'm glad you have this test. All tests are passing now.
Force-pushed from f2ba783 to 8adcb9c
It turns out that adding support for the Nested type is really easy, so I just did it and released it to the public here (in case somebody doesn't want to wait for the merge):
Force-pushed from dea37ab to ed98ef5
Thank you for contributing this! We're hoping to refactor serialization in the near-ish future, but this definitely helps cover gaps!
I'm seeing this error:

```
INFO 2024-05-29T14:39:51.163593458Z java.lang.IllegalArgumentException: DESCRIBE TABLE is never supposed to return Nested type. It should always yield its Array fields directly.
	at com.clickhouse.kafka.connect.sink.db.mapping.Column.extractColumn(Column.java:204)
	at com.clickhouse.kafka.connect.sink.db.mapping.Column.extractColumn(Column.java:159)
	at com.clickhouse.kafka.connect.sink.db.mapping.Column.extractColumn(Column.java:155)
	at com.clickhouse.kafka.connect.sink.db.helper.ClickHouseHelperClient.describeTable(ClickHouseHelperClient.java:220)
	at com.clickhouse.kafka.connect.sink.db.helper.ClickHouseHelperClient.extractTablesMapping(ClickHouseHelperClient.java:248)
	at com.clickhouse.kafka.connect.sink.db.ClickHouseWriter.updateMapping(ClickHouseWriter.java:123)
	at com.clickhouse.kafka.connect.sink.db.ClickHouseWriter.start(ClickHouseWriter.java:103)
	at com.clickhouse.kafka.connect.sink.ProxySinkTask.<init>(ProxySinkTask.java:61)
	at com.clickhouse.kafka.connect.sink.ClickHouseSinkTask.start(ClickHouseSinkTask.java:57)
	at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:329)
	at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:202)
	at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:259)
	at org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:237)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
```

The logs don't include the table that caused it, so I don't have a repro yet.
@dermasmid This has to be specific to the table you're using. I ran the … Note that we don't have the context of the table being parsed within the …
Here's the debug log:
ClickHouse version: …
Note: this column is not from the table I intend to write to, but because it's in the database I configured, it seems the connector is fetching it.
I opened issue #399.
Summary
Related issue: #286
I have implemented support for the Tuple & Variant data types by adding the `describe_include_subcolumns = 1` setting to the `DESCRIBE TABLE` query. As a result, I was able to get rid of the recursive Map parsing (since ClickHouse returns separate rows for the nested types) and unify it with the way Tuples are now parsed.

Essentially, the `Column` class is no longer responsible for parsing the complex data types (except `Array`, since this is not included in the `DESCRIBE TABLE` output). To fully parse a complex data type, the context of the previous columns has to be present, which is why the `Table` class is responsible for parsing those. I have added `TableTest`; feel free to inspect it first to get an idea of how it works.

Regarding the `Variant` type, I needed this to support Avro's union type. In the Kafka Connect ecosystem, it is represented as a `Struct` whose columns are named by the lowercase version of the type.

Please note that I've added the Lombok library, as the telescoping-constructors problem was getting out of hand in the `Column` class, so I refactored it to use the builder pattern.
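For context on the Lombok change: `@Builder` generates roughly the following hand-written equivalent, which avoids the telescoping constructors a class with many optional fields would otherwise need. This is an illustrative sketch with hypothetical fields, not the PR's actual `Column` class:

```java
// Hand-written equivalent of what Lombok's @Builder generates for a
// column-like class; the fields here are hypothetical examples.
public class ColumnSketch {
    private final String name;
    private final String type;
    private final boolean nullable;

    private ColumnSketch(Builder b) {
        this.name = b.name;
        this.type = b.type;
        this.nullable = b.nullable;
    }

    public static Builder builder() { return new Builder(); }

    public static class Builder {
        private String name;
        private String type;
        private boolean nullable;

        public Builder name(String name) { this.name = name; return this; }
        public Builder type(String type) { this.type = type; return this; }
        public Builder nullable(boolean nullable) { this.nullable = nullable; return this; }
        public ColumnSketch build() { return new ColumnSketch(this); }
    }

    @Override
    public String toString() {
        return name + " " + type + (nullable ? " NULL" : "");
    }

    public static void main(String[] args) {
        ColumnSketch c = ColumnSketch.builder().name("id").type("UInt64").build();
        System.out.println(c); // prints "id UInt64"
    }
}
```

With the annotation, Lombok produces this boilerplate at compile time, so each new optional attribute is one field plus one builder call site instead of another constructor overload.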