Description
A recent upgrade of the pipelines code requires:
- Spark 3.5
- Hadoop 3.2
- Java 17
Some preparation work is needed to make the pipelines run on a later version of EMR.
Several issues need to be addressed:
- Bootstrap fails when using an EMR version later than the production version (EMR 5.34). Related to "Launch ingest dag with different emr version failed" #20.
- After fixing the bootstrap failure, launching the latest EMR version (EMR 7.9) still fails with a permissions issue when using ManagedPolicy (this may need to be reconfirmed).
- However, EMR 7.9 launches successfully when using the service role that ala-dev uses. The applications installed on EMR 7.9 run with Java 17, but the default Java version of the cluster is still Java 8, so the pipelines fail when they are called through the command-runner jar. EMR 7.9 comes with a few Java version alternatives, and the required one needs to be set explicitly in the first step once the cluster is launched. Setting it in the bootstrap fails.
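
The "set Java in the first step" workaround from the last item could look roughly like the sketch below. It only builds a step definition in the shape that boto3's EMR client accepts for `Steps`. The Java 17 path, the step name, the `ActionOnFailure` choice, and the use of `alternatives --set` are assumptions for illustration, not something confirmed in this issue. Check the real path on the cluster (for example with `alternatives --display java`) before relying on it.

```python
# Hypothetical location of the Java 17 binary; verify on the cluster first.
JAVA17_BIN = "/usr/lib/jvm/java-17-amazon-corretto.x86_64/bin/java"


def build_set_java_step(java_bin: str = JAVA17_BIN) -> dict:
    """Build an EMR step definition that switches the default `java` alternative.

    The step runs through command-runner.jar, so it can be placed ahead of
    the pipeline steps that are launched the same way.
    """
    # Switch the alternative, then print the version so the step log shows the result.
    command = f"sudo alternatives --set java {java_bin} && java -version"
    return {
        "Name": "Set default Java to 17",  # illustrative name
        "ActionOnFailure": "CANCEL_AND_WAIT",  # assumption: do not run pipelines on Java 8
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["bash", "-c", command],
        },
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_set_java_step(), indent=2))
```

The resulting dict would go first in the `Steps` list passed to `run_job_flow` or `add_job_flow_steps`. Note that an EMR step runs on the primary node, so this sketch does not change the default Java on core or task nodes.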