HashJoin has problematic interaction with Merge #61
Comments
I can reproduce this on MapReduce and have a tentative fix, but the new tests also fail on Tez, so I will need more time to find a resolution. The fix for MR is here: https://gist.github.com/cwensel/0b116bc7196af667d736f552ab8cb358
@cwensel thanks a ton for looking at this!
I went ahead and pushed the fix for MR (it worked fine in local). I’ll have to address the Tez issue at a later date, after any additional MR issues are resolved. You should see 3.3-wip-18 in 9 hours or so, assuming no test failures. Keep an eye out here: http://conjars.org/search?q=cascading-3.3
Also watch ‘recent versions’ here; they show up when published: http://conjars.org/cascading/cascading-core
If this is resolved for MR, feel free to close this issue.
See this graph on 3.2.1:
https://www.dropbox.com/s/iffadh9x7unrg5w/01-BalanceAssembly-init.dot.png?dl=0
You can see the full planner logs here:
https://www.dropbox.com/s/7qyc4a9pxtstwio/E552D2.tgz?dl=0
We are merging two HashJoins after some Each operations. In this particular graph, it seems the issue can be fixed by adding Checkpoints after all but one of the HashJoins. This is not a great solution, since it is hard to know what the graph will look like once many pipes are combined with functions.
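To make the workaround concrete, here is a minimal sketch of the "checkpoint all but one HashJoin branch" rule. It is a toy model in plain Java and does not use Cascading classes: `Node`, `hasUncheckpointedHashJoin`, and `checkpointAllButOne` are hypothetical names invented for illustration. It also assumes the Checkpoint is placed at the end of each branch, just before the Merge, which may not be the only placement that works.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these are NOT Cascading classes, just a stand-in pipe graph.
public class CheckpointSketch {

  // A node in a simplified pipe graph: a kind such as "Pipe", "Each",
  // "HashJoin", "Checkpoint" or "Merge", plus the nodes feeding into it.
  static final class Node {
    final String kind;
    final List<Node> inputs;

    Node(String kind, Node... inputs) {
      this.kind = kind;
      this.inputs = Arrays.asList(inputs);
    }

    @Override
    public String toString() {
      if (inputs.isEmpty()) return kind;
      List<String> parts = new ArrayList<>();
      for (Node in : inputs) parts.add(in.toString());
      return kind + "(" + String.join(", ", parts) + ")";
    }
  }

  // True if the branch is, or sits downstream of, a HashJoin
  // with no Checkpoint between that HashJoin and the branch end.
  static boolean hasUncheckpointedHashJoin(Node branch) {
    if (branch.kind.equals("Checkpoint")) return false;
    if (branch.kind.equals("HashJoin")) return true;
    for (Node in : branch.inputs) {
      if (hasUncheckpointedHashJoin(in)) return true;
    }
    return false;
  }

  // Rebuilds a Merge so that every HashJoin branch except the first
  // one encountered is wrapped in a Checkpoint (assumed placement).
  static Node checkpointAllButOne(Node merge) {
    if (!merge.kind.equals("Merge")) {
      throw new IllegalArgumentException("expected a Merge node");
    }
    List<Node> rewritten = new ArrayList<>();
    boolean keptOne = false;
    for (Node branch : merge.inputs) {
      boolean needsCare = hasUncheckpointedHashJoin(branch);
      if (needsCare && keptOne) {
        rewritten.add(new Node("Checkpoint", branch));
      } else {
        if (needsCare) keptOne = true;
        rewritten.add(branch);
      }
    }
    return new Node("Merge", rewritten.toArray(new Node[0]));
  }

  public static void main(String[] args) {
    // Two HashJoins, each followed by an Each, feeding one Merge.
    Node joinA = new Node("Each", new Node("HashJoin", new Node("Pipe"), new Node("Pipe")));
    Node joinB = new Node("Each", new Node("HashJoin", new Node("Pipe"), new Node("Pipe")));
    System.out.println(checkpointAllButOne(new Node("Merge", joinA, joinB)));
  }
}
```

In real code the corresponding classes would be `cascading.pipe.HashJoin`, `cascading.pipe.Checkpoint`, and `cascading.pipe.Merge`. The sketch also shows why the workaround is awkward: it needs to inspect the whole upstream graph of every Merge input, which is exactly the information that is hard to see when pipes are composed from many functions.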
It would be great to either have a clear rule to follow when generating the graphs, or to have this restriction removed, since we would like to use Cascading 3 in Scalding by default.
Thanks.