HIVE-20917: Support applyQuotesToAll option in OpenCSVSerde. #5513
base: master
OpenCSV version 3.10 added an option called "applyQuotesToAll". When it is set to false, output columns are quoted only when they need it. This commit adds support for that option in OpenCSVSerde. It is based on a patch from Sungwoo Park, which was in turn based on a patch from David Engel.
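To illustrate the difference the option makes, here is a minimal, self-contained sketch. It does not call OpenCSV or OpenCSVSerde; the class `QuoteSketch`, its methods, and the rule for when a column "needs" quoting (it contains the separator, the quote character, or a line break) are assumptions made for illustration and only approximate the library's behavior.

```java
import java.util.StringJoiner;

// Hypothetical illustration only; not the OpenCSV or Hive implementation.
class QuoteSketch {
    static final char SEPARATOR = ',';
    static final char QUOTE = '"';

    // Assumed rule: a value needs quoting if it contains a character
    // that would make an unquoted CSV field ambiguous.
    static boolean needsQuoting(String value) {
        return value.indexOf(SEPARATOR) >= 0
            || value.indexOf(QUOTE) >= 0
            || value.indexOf('\n') >= 0
            || value.indexOf('\r') >= 0;
    }

    // Formats one row. With applyQuotesToAll = true every column is quoted;
    // with false only the columns that need it are quoted.
    static String formatRow(String[] columns, boolean applyQuotesToAll) {
        StringJoiner row = new StringJoiner(String.valueOf(SEPARATOR));
        for (String column : columns) {
            if (applyQuotesToAll || needsQuoting(column)) {
                // Escape embedded quote characters by doubling them.
                String escaped = column.replace("\"", "\"\"");
                row.add(QUOTE + escaped + QUOTE);
            } else {
                row.add(column);
            }
        }
        return row.toString();
    }

    public static void main(String[] args) {
        String[] columns = {"a", "b,c"};
        System.out.println(formatRow(columns, true));  // "a","b,c"
        System.out.println(formatRow(columns, false)); // a,"b,c"
    }
}
```

With the option off, only the second column is quoted, because it contains the separator.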
LGTM, some minor things
LGTM +1, pending tests
@gigem,
This link returns a 404 for me. Do you have another one? In the meantime, I'll see if I can recreate the error locally.
It would seem Boolean.parseBoolean() is not parsing the empty/default property correctly. Unfortunately, I'm not able to reproduce the error (and test any fix) because earlier tests fail in my build. I'm a near-total Maven neophyte, so could someone please tell me the magic incantation to run just the OpenCSVSerde (or Serde) tests?
WDYM, there is no issue with
to run the test:
before change
after change quotes are removed
@gigem, shouldn't we apply the quotes by default if
Ah, I see what you mean now. It was not my intent to change any original defaults. However, this change was made a long time ago for our private use. It was only resurrected somewhat recently for our use with Hive/MR3, and I assumed that when working with them we'd kept any old behavior unchanged. I'll fix this as soon as I'm able to test it properly. I have more pressing, immediate needs to attend to first.
I don't think so. The desire to quote or not quote all columns can be independent of the actual quote character.
In that case, you could simply add
I think that would change the default behavior for unsuspecting users. After a little bit of research, I believe the following should fix both the test and default behavior.
My attempts to test it keep failing, though, due to unrelated build issues, and I'm reluctant to commit it until I'm sure it's finally correct.
The above change makes sense to me.
This fixes the test regression and maintains compatibility with previous versions before applyQuotesToAll support was added.
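As a hedged sketch of the kind of regression discussed above (not the code from this PR, which is not shown in the thread): `Boolean.parseBoolean()` returns false for a null argument, so reading an unset property directly silently turns quoting off, while supplying "true" as the fallback keeps the old quote-everything behavior. The class name, method names, and the property key used here are assumptions for illustration.

```java
import java.util.Properties;

// Hypothetical illustration only; not the actual OpenCSVSerde code.
class ApplyQuotesDefaultSketch {
    // Assumed property key, chosen to match the option name in this PR.
    static final String APPLY_QUOTES_TO_ALL = "applyQuotesToAll";

    // Naive lookup: an unset property yields null, and
    // Boolean.parseBoolean(null) is false, so quoting is disabled by default.
    static boolean naiveLookup(Properties props) {
        return Boolean.parseBoolean(props.getProperty(APPLY_QUOTES_TO_ALL));
    }

    // Default-preserving lookup: an unset property falls back to "true",
    // so tables that never set the option keep quoting every column.
    static boolean defaultPreservingLookup(Properties props) {
        return Boolean.parseBoolean(props.getProperty(APPLY_QUOTES_TO_ALL, "true"));
    }
}
```

Note that the fallback only applies when the property is absent. A property explicitly set to an empty string would still parse as false in this sketch.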
This is a revised version of #3718 and includes only the applyQuotesToAll changes. The other OpenCSV changes were committed separately.