Description
Expected Behavior
I have a SQS trigger and when a new message flows into the queue, it will convert into .jsonl
and pass the file uri as inputFiles
to kubernetes.PodCreate
. The file will be accessed inside the pod and processed.
Actual Behaviour
When I pass the nodeSelectors
and tolerations
to the kubernetes pod which will be deployed into different node (Not same as the kestra-worker deployed). Because of the kestra and task pod is in different node. busy-box
image is failed to upload the file that I am trying to pass it via flow.
But When I removed the node selectors and toleration, the inputFile upload works fine as it intended. From my observation it is only failed if kestra and newly creating task pod not in the same node. By the way I use Karpenter
to scale the EKS nodes up and down dynamically (Just passing the info if it is anything related to it).
Steps To Reproduce
the error log for task creating pod and failing
2024-09-17T19:19:27.693Z INFO Pod 'microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk' is created
2024-09-17T19:19:27.708Z DEBUG Received action 'ADDED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:06.043Z DEBUG Received action 'MODIFIED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:06.094Z DEBUG Received action 'MODIFIED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:09.811Z DEBUG Received action 'MODIFIED' on [Type: Pod, Namespace: dev, Name: microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk, Uid: c19ebc8e-e92d-4791-a60e-f3dfc61f18e8, Phase: Pending]
2024-09-17T19:20:10.128Z DEBUG Failed to call 'uploadInputFiles'
2024-09-17T19:20:11.684Z DEBUG Failed to call 'uploadMarker'
2024-09-17T19:20:13.034Z DEBUG Failed to call 'uploadMarker'
2024-09-17T19:20:15.318Z DEBUG Failed to call 'uploadMarker'
2024-09-17T19:20:15.331Z DEBUG Received close on [Type: PodWatcher]
2024-09-17T19:20:15.345Z INFO Pod 'microformboa-dev-boa-etl-part-1-async-textract-job-qjyzk' is deleted
2024-09-17T19:20:15.350Z TRACE io.kestra.core.utils.RetryUtils$RetryFailed: Stop retry, attempts 3 elapsed after 3 seconds
at io.kestra.core.utils.RetryUtils$Instance.lambda$exceptionFallback$4(RetryUtils.java:153)
at dev.failsafe.internal.FallbackImpl.apply(FallbackImpl.java:58)
at dev.failsafe.internal.FallbackExecutor.lambda$apply$0(FallbackExecutor.java:62)
at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:187)
at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:376)
at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:112)
at io.kestra.core.utils.RetryUtils$Instance.wrap(RetryUtils.java:144)
at io.kestra.core.utils.RetryUtils$Instance.run(RetryUtils.java:129)
at io.kestra.plugin.kubernetes.services.PodService.withRetries(PodService.java:147)
at io.kestra.plugin.kubernetes.services.PodService.uploadMarker(PodService.java:173)
at io.kestra.plugin.kubernetes.AbstractPod.uploadInputFiles(AbstractPod.java:84)
at io.kestra.plugin.kubernetes.PodCreate.run(PodCreate.java:193)
at io.kestra.plugin.kubernetes.PodCreate.run(PodCreate.java:39)
at io.kestra.core.runners.WorkerTaskThread.doRun(WorkerTaskThread.java:76)
at io.kestra.core.runners.AbstractWorkerThread.run(AbstractWorkerThread.java:57)
2024-09-17T19:20:15.350Z ERROR Stop retry, attempts 3 elapsed after 3 seconds
Environment Information
- Kestra Version: 0.187
- Plugin version: latest
- Operating System (OS / Docker / Kubernetes): Kubernetes (EKS)
- Java Version (If not docker):
Example flow
id: dev-boa-etl-part-1
namespace: microform.boa
triggers:
- id: trigger
type: io.kestra.plugin.aws.sqs.Trigger
accessKeyId: "{{kv(key='aws_access_key', errorOnMissing=true)}}"
secretKeyId: "{{kv(key='aws_secret_key', errorOnMissing=true)}}"
region: "eu-west-2"
serdeType: STRING
queueUrl: "queueurl"
maxRecords: 10
maxDuration: PT10S
tasks:
- id: to_json
type: io.kestra.plugin.serdes.json.IonToJson
from: "{{ trigger.uri }}"
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{read(outputs.to_json.uri)}}"
- id: async_textract_job
type: io.kestra.plugin.kubernetes.PodCreate
namespace: dev
inputFiles:
data.jsonl: "{{outputs.to_json.uri}}"
metadata:
labels:
company: microform.boa
task: boa-etl-pipeline-part-1
waitRunning: PT1H
waitUntilRunning: PT15M
spec:
containers:
- name: boa-etl-pipeline-part-1
image: 1tw678125321685.dkr.ecr.eu-west-2.amazonaws.com/boaml/etl:v1.0
command:
- python
- textract.py
- '--f'
- "{{workingDir}}/data.jsonl"
volumeMounts:
- name: environment-vars
mountPath: /app/.env
subPath: .env
readOnly: true
imagePullPolicy: IfNotPresent
nodeSelector:
resource-type: private-cpu
tolerations:
- key: private/cpu
operator: Exists
effect: NoSchedule
restartPolicy: OnFailure
volumes:
- name: environment-vars
secret:
secretName: boa-etl-env-vars
Metadata
Metadata
Assignees
Type
Projects
Status