Add FlinkRunner support in the Beam Example #341
base: main
Conversation
<scope>runtime</scope>
</dependency>
</dependencies>
<build>
I wonder whether we really need this <build> section for the shade plugin. The other runners don't have it.
I keep the Shade plugin here because Flink runs user code on the TaskManagers, which need the full dependency closure packaged with the job.
For Flink we have three practical options:
- Uber-jar: the most repeatable path for submitting to an existing cluster, and the approach recommended by the official Flink docs.
- Local execution (no shade): for quick local checks we can run Beam's FlinkRunner with exec:exec (as below). This works but doesn't solve cluster submission.
- Attach external JARs at submit time: flink run -C file:///path/dep.jar … can inject dependencies without shading, but it is brittle and operationally heavier than a single uber-jar.
Given that we also want production-like testing against an existing Flink cluster, I suggest we keep the shade step in this module.
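For reference, a minimal shade configuration along the lines I have in mind; the plugin version and transformer set here are illustrative rather than the exact ones in this PR:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.5.1</version> <!-- illustrative version -->
      <executions>
        <execution>
          <!-- build the uber-jar during the package phase -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- merge META-INF/services files so Beam/Flink SPI lookups keep working -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
              <!-- set the Main-Class so the jar can be submitted without passing -c -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <mainClass>com.google.privacy.differentialprivacy.pipelinedp4j.examples.BeamExample</mainClass>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>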
I tried to use the assembly plugin, but it didn't work 🤔
The Shade plugin doesn't just package the jar; it also excludes conflicting META-INF entries.
It is also recommended by the official Flink docs, so I would like to keep it.
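For context, by META-INF exclusion I mean the usual signature-file filter; this snippet is illustrative and not necessarily the exact filter used in this PR:

<configuration>
  <filters>
    <filter>
      <!-- drop signature files pulled in from dependencies so the merged jar
           doesn't fail signature verification at runtime -->
      <artifact>*:*</artifact>
      <excludes>
        <exclude>META-INF/*.SF</exclude>
        <exclude>META-INF/*.DSA</exclude>
        <exclude>META-INF/*.RSA</exclude>
      </excludes>
    </filter>
  </filters>
</configuration>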
I'm giving approval in case your answers to my comments are negative; if you agree with the proposed changes instead, please send it for another round of review.
],
main_class = "com.google.privacy.differentialprivacy.pipelinedp4j.examples.BeamExample",
runtime_deps = [
    "@maven//:org_apache_beam_beam_runners_flink_1_18",
I wonder whether we can just merge this with the previous java_binary target and add one more runtime_dep. Less code, and since it is just an example, it should be fine to carry unnecessary deps.
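Roughly what I mean, as a sketch; apart from the Flink runner artifact, the target and dep names here are hypothetical:

java_binary(
    name = "BeamExample",
    main_class = "com.google.privacy.differentialprivacy.pipelinedp4j.examples.BeamExample",
    runtime_deps = [
        # existing runner deps stay as they are (hypothetical example)
        "@maven//:org_apache_beam_beam_runners_direct_java",
        # ...and the Flink runner is simply appended to the same list
        "@maven//:org_apache_beam_beam_runners_flink_1_18",
    ],
    deps = [":BeamExample_lib"],  # hypothetical library target
)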
fixed in ddf9636
<scope>runtime</scope>
</dependency>
</dependencies>
<build>
Before starting native Flink support, I tested Beam’s FlinkRunner and it worked well.
It cannot be run with exec:java due to Flink dependency conflicts. However, it runs correctly on a Flink cluster.
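For reference, the kind of cluster submission I have in mind looks roughly like this; the jar path is illustrative:

# build the shaded jar
mvn clean package

# submit the uber-jar to an existing Flink cluster
flink run \
  -c com.google.privacy.differentialprivacy.pipelinedp4j.examples.BeamExample \
  target/examples-shaded.jar \
  --runner=FlinkRunner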
Test Result