Use of [:cntrl:] character class in tokenize() #5
Comments
Thank you for reporting this issue so thoroughly. I am traveling, so I can't make changes right away. It's good that you found the LC_ALL=C workaround, so that other people who need an immediate solution can have it. For a permanent fix I think I will change |
Thanks for taking care. Here is what I did locally, to fix the issue: |
Thank you very much. I will get to it after returning from my travel. Update: while the proposed fix certainly works, the |
See also dominictarr/JSON.sh#46 |
See also dominictarr/JSON.sh#48 |
Aimed at old "onetrueawk" versions as the preceding commit. See #10 for details.
The sample file incorrectly uses U+0092 for a single quote instead of U+2019. It can be fixed by running it through |
Closed ac086a4 |
The POSIX [:cntrl:] character class does not cover exactly the characters that must be escaped according to https://tools.ietf.org/html/rfc7159#section-7 (i.e. U+0000 through U+001F). [:cntrl:] also covers U+007F and, in a UTF-8 locale, all C1 control characters. See the example below, where I get an error in my locale "en_US.UTF-8". Apart from using LC_ALL=C, the error can be avoided by changing [:cntrl:] to the range defined in the spec: \x00-\x1F.
$ echo world_bank109.json | awk -f JSON.awk > /dev/null
world_bank109.json: expected <value> but got <"> at input token 263
, "productlinetype" : "L" , "project_abstract" : { "cdata" : <<">> T h e o b j e c t i
$ echo world_bank109.json | LC_ALL=C awk -f JSON.awk > /dev/null
(no error message here)
world_bank109.json.txt, which is line 109 from the world bank sample file at http://jsonstudio.com/resources/
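To make the mismatch concrete, here is a minimal sketch in Python rather than awk. It is not the JSON.awk code. It assumes that Unicode general category Cc is a reasonable stand-in for what [:cntrl:] matches in a UTF-8 locale; the actual set depends on the C library and the awk implementation. The helper names are hypothetical.

```python
import re
import unicodedata

# RFC 7159, section 7: only U+0000 through U+001F must be escaped in strings.
RFC_CONTROL = re.compile(r"[\x00-\x1f]")


def is_unicode_control(ch: str) -> bool:
    """Assumed stand-in for [:cntrl:] in a UTF-8 locale: Unicode category Cc."""
    return unicodedata.category(ch) == "Cc"


def must_escape(ch: str) -> bool:
    """True if the RFC requires this character to be escaped in a JSON string."""
    return RFC_CONTROL.fullmatch(ch) is not None


# Code points the broader class flags although the RFC allows them unescaped.
extra = [cp for cp in range(0x100)
         if is_unicode_control(chr(cp)) and not must_escape(chr(cp))]

print(f"{len(extra)} extra code points, "
      f"from U+{extra[0]:04X} to U+{extra[-1]:04X}")
```

Under that assumption, the extra code points are U+007F plus the C1 range U+0080 through U+009F. A tokenizer that treats this whole class as illegal inside strings would therefore reject a raw C1 character such as U+0092, even though the RFC permits it unescaped.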