Closed
Description
Describe the issue
We have a problem with following flow:
id: flow1
namespace: tutorial
inputs:
- id: csv
type: FILE
tasks:
- id: op001
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{inputs.csv}}"
fieldSeparator: ;
- id: op01
type: io.kestra.plugin.serdes.json.IonToJson
from: "{{outputs.op001.uri}}"
- id: op1
type: io.kestra.plugin.scripts.python.Script
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
beforeCommands: []
outputFiles:
- output.jsonl
containerImage: ghcr.io/kestra-io/kestrapy
script: |-
import csv, json
from kestra import Kestra
file_uri = '{{ outputs.op01.uri }}'
output_url = 'output.jsonl'
headers = None
with open(file_uri, 'rb') as f:
# Open output file
with open(output_url, 'w') as outfile:
# Iterates over input json
for line in f:
data = json.loads(line)
if headers is None:
headers = data.keys()
new_data = dict(zip(headers, map(lambda x: ' ' + str(x) + ' ' if isinstance(x, (int, float)) else x, data.values())))
outfile.write(json.dumps(new_data)+"\n")
Kestra.outputs({ 'headers': ','.join(headers), 'headers_expression': ','.join(len(headers) * ['?']) })
We expect .csv file as an input. But we managed to upload .xlsx file instead and it has been processed by CsvToIon task and serialized to this broken .ion output
{'PK\x03\x04-\0\b\0\b\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x13\0\0\0[Content_Types].xml�S�n�0\x10����*6�PU\x15�C\x1f�\x16��\x03\\{�X�%����]\a8�R�':"q�cfgfW�d�q�ZCB\x13|��|�*�*h㻆},^�{Va�^K\x1b<4l\v�f��b\x1b\x01+��ذ>�� \x04�\x1e�D\x1e\"xBڐ��tL��R-e\a�v4�\x13*�\f>h���\tZ���z��\x17��\x18�Q2S,���H��\v�\x04v�`o\"�\x10�U�\x1bRٵC(2q��qa9S�\x1b"}
{'PK\x03\x04-\0\b\0\b\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x13\0\0\0[Content_Types].xml�S�n�0\x10����*6�PU\x15�C\x1f�\x16��\x03\\{�X�%����]\a8�R�':"&\x19"}
the problem happens in 3rd task that calls python script to extract headers from json output. The problem happens at line Kestra.outputs({ 'headers': ','.join(headers), 'headers_expression': ','.join(len(headers) * ['?']) })
as it sends to Kestra output headers
with unicode escape sequences, which causes worker container to crash with the following error message:
org.jooq.exception.DataException: SQL [insert into queues ("value", "type", "key") values (cast(? as jsonb), CAST(? AS queue_type), ?)]; ERROR: unsupported Unicode escape sequence
kestra-1 | Detail: \u0000 cannot be converted to text.
kestra-1 | Where: JSON data, line 1: ...s":{"vars":{"csv_hlavicka":"PK\u0003\u0004-\u0000...
kestra-1 | at org.jooq_3.19.10.POSTGRES.debug(Unknown Source)
kestra-1 | at org.jooq.impl.Tools.translate(Tools.java:3603)
kestra-1 | at org.jooq.impl.Tools.translate(Tools.java:3595)
kestra-1 | at org.jooq.impl.DefaultExecuteContext.sqlException(DefaultExecuteContext.java:827)
kestra-1 | at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:362)
kestra-1 | at org.jooq.impl.AbstractDelegatingQuery.execute(AbstractDelegatingQuery.java:115)
kestra-1 | at io.kestra.jdbc.runner.JdbcQueue.lambda$produce$0(JdbcQueue.java:111)
kestra-1 | at org.jooq.impl.DefaultDSLContext.lambda$transaction$5(DefaultDSLContext.java:592)
kestra-1 | at org.jooq.impl.DefaultDSLContext.lambda$transactionResult0$3(DefaultDSLContext.java:530)
kestra-1 | at org.jooq.impl.Tools$3$1.block(Tools.java:6325)
kestra-1 | at java.base/java.util.concurrent.ForkJoinPool.unmanagedBlock(Unknown Source)
kestra-1 | at java.base/java.util.concurrent.ForkJoinPool.managedBlock(Unknown Source)
kestra-1 | at org.jooq.impl.Tools$3.get(Tools.java:6322)
kestra-1 | at org.jooq.impl.DefaultDSLContext.transactionResult0(DefaultDSLContext.java:578)
kestra-1 | at org.jooq.impl.DefaultDSLContext.transactionResult(DefaultDSLContext.java:502)
kestra-1 | at org.jooq.impl.DefaultDSLContext.transaction(DefaultDSLContext.java:591)
kestra-1 | at io.kestra.jdbc.JooqDSLContextWrapper.lambda$transaction$1(JooqDSLContextWrapper.java:58)
kestra-1 | at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:243)
kestra-1 | at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
kestra-1 | at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:74)
kestra-1 | at dev.failsafe.internal.FallbackExecutor.lambda$apply$0(FallbackExecutor.java:51)
kestra-1 | at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:187)
kestra-1 | at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
kestra-1 | at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
kestra-1 | at io.kestra.core.utils.RetryUtils$Instance.wrap(RetryUtils.java:144)
kestra-1 | at io.kestra.core.utils.RetryUtils$Instance.runRetryIf(RetryUtils.java:103)
kestra-1 | at io.kestra.jdbc.JooqDSLContextWrapper.transaction(JooqDSLContextWrapper.java:55)
kestra-1 | at io.kestra.jdbc.runner.JdbcQueue.produce(JdbcQueue.java:101)
kestra-1 | at io.kestra.jdbc.runner.JdbcQueue.emit(JdbcQueue.java:121)
kestra-1 | at io.kestra.core.queues.QueueInterface.emit(QueueInterface.java:11)
kestra-1 | at io.kestra.core.runners.Worker.run(Worker.java:606)
kestra-1 | at io.kestra.core.runners.Worker.handleTask(Worker.java:286)
kestra-1 | at io.kestra.core.runners.Worker.lambda$run$7(Worker.java:241)
kestra-1 | at io.micrometer.core.instrument.internal.TimedRunnable.run(TimedRunnable.java:49)
kestra-1 | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
kestra-1 | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
kestra-1 | at java.base/java.lang.Thread.run(Unknown Source)
kestra-1 | Caused by: org.postgresql.util.PSQLException: ERROR: unsupported Unicode escape sequence
kestra-1 | Detail: \u0000 cannot be converted to text.
kestra-1 | Where: JSON data, line 1: ...s":{"vars":{"csv_hlavicka":"PK\u0003\u0004-\u0000...
kestra-1 | at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2725)
kestra-1 | at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2412)
kestra-1 | at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:371)
kestra-1 | at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:502)
kestra-1 | at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:419)
kestra-1 | at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:194)
kestra-1 | at org.postgresql.jdbc.PgPreparedStatement.execute(PgPreparedStatement.java:180)
kestra-1 | at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
kestra-1 | at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
kestra-1 | at org.jooq.tools.jdbc.DefaultPreparedStatement.execute(DefaultPreparedStatement.java:219)
kestra-1 | at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:458)
kestra-1 | at org.jooq.impl.AbstractDMLQuery.execute(AbstractDMLQuery.java:1068)
kestra-1 | at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:348)
kestra-1 | ... 32 common frames omitted
Environment
- Kestra Version: 0.20.12
Metadata
Metadata
Assignees
Type
Projects
Status
Done