Wrong encoding of non-ASCII characters in JSON #40
Comments
Hello @jscissr, you've raised a crucial aspect of encoding that hasn't received much attention until now. Your proposed solution may not be suitable for handling binary data. From my perspective, the appropriate approach would involve expanding the functionality of the
Yes, my solution doesn't handle binary data, but that's because JSON itself doesn't support it. If you want to support binary, you would have to base64-encode all strings. If, on the other hand, all strings in your redis database are UTF-8 encoded, then my solution passes the strings through as is, and you get UTF-8 encoded JSON. There is no need to decode the UTF-8 first. rdbtools parses UTF-8 and generates unicode escape sequences, but that is not necessary and increases file size.
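For the binary case mentioned above, here is a minimal Python sketch of the "base64-encode all strings" approach. The helper names `encode_value` and `decode_value` and the sample keys are made up for illustration; this is not how rdb-cli or rdbtools work.

```python
import base64
import json


def encode_value(raw: bytes) -> str:
    """Hypothetical helper: make arbitrary bytes safe to store in a JSON string."""
    return base64.b64encode(raw).decode("ascii")


def decode_value(text: str) -> bytes:
    """Hypothetical helper: recover the original bytes from the JSON string."""
    return base64.b64decode(text)


# One UTF-8 text value and one value that is not valid UTF-8.
values = {"demo": "Müller".encode("utf-8"), "blob": b"\xff\x00"}

doc = json.dumps({key: encode_value(raw) for key, raw in values.items()})
restored = {key: decode_value(text) for key, text in json.loads(doc).items()}
```

Every value survives the round trip, at the cost of output that is no longer human-readable and is roughly a third larger.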
We cannot make that assumption. Unless the user explicitly requests the use of UTF-8, the tool should default to printing what could be considered raw data, to prevent potential data corruption. The suggested conversion from UTF-8 to Unicode overlooks the fact that UTF-8 character encoding can vary in length, with sequences of up to 6 bytes.
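On sequence length: the original UTF-8 design allowed sequences of up to 6 bytes, but RFC 3629 restricts UTF-8 to code points up to U+10FFFF, which need at most 4 bytes. A quick Python check of the encoded lengths, with sample characters chosen only for illustration:

```python
# Encoded length in bytes for characters from different Unicode ranges.
samples = ["A", "ü", "€", "😀", chr(0x10FFFF)]
lengths = [len(ch.encode("utf-8")) for ch in samples]

# A standard decoder handles the variable length on its own.
raw = "".join(samples).encode("utf-8")
round_trip = raw.decode("utf-8")
```

Here `lengths` ranges from 1 to 4, and decoding the concatenated bytes gives back the original characters.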
Non-ASCII characters are encoded incorrectly by `rdb-cli dump.rdb json`.
Example:
Add a key to redis with `SET demo "Müller"`.
Run `rdb-cli dump.rdb json`.
The result is:
After unescaping, we get "MÃ¼ller" instead of "Müller".
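A sketch of how this kind of mis-encoding can arise. It assumes that each byte of the UTF-8 value is escaped as its own `\u00XX` sequence instead of the value being decoded first; that is a guess, not taken from the rdb-cli source. The helper `escape_bytewise` is hypothetical.

```python
import json


def escape_bytewise(raw: bytes) -> str:
    """Hypothetical buggy escaper: treats every byte as one code point.

    Quotes and backslashes are not handled; this is only a demonstration.
    """
    parts = []
    for b in raw:
        if b < 0x80:
            parts.append(chr(b))          # ASCII bytes pass through
        else:
            parts.append(f"\\u{b:04x}")   # each non-ASCII byte escaped alone
    return '"' + "".join(parts) + '"'


raw = "Müller".encode("utf-8")   # ü is stored as the two bytes C3 BC
doc = escape_bytewise(raw)
decoded = json.loads(doc)        # a JSON parser sees two separate characters

# The damage is reversible: map code points back to bytes, then decode as UTF-8.
repaired = decoded.encode("latin-1").decode("utf-8")
```

Under this assumption, the two bytes of "ü" come back as the two characters U+00C3 and U+00BC, which is the typical look of UTF-8 read as Latin-1.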
For comparison, rdbtools (which does not work with newer redis versions) outputs:
The simplest way to fix this is to avoid escaping non-ASCII characters entirely, and output them as is:
With this change, the result is:
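To illustrate the two valid options discussed here, escaping code points versus passing UTF-8 through, a minimal Python sketch using the standard `json` module. This is only an analogy; it is not the rdb-cli patch or the rdbtools implementation.

```python
import json

value = "Müller"

# Option 1: escape non-ASCII code points (ü is U+00FC, one code point).
escaped = json.dumps(value)

# Option 2: emit non-ASCII characters as is; the file is then UTF-8 encoded.
passthrough = json.dumps(value, ensure_ascii=False)

# Both forms parse back to the same string.
same = json.loads(escaped) == json.loads(passthrough) == value

# Size on disk when written as UTF-8.
size_escaped = len(escaped.encode("utf-8"))
size_passthrough = len(passthrough.encode("utf-8"))
```

Both outputs are valid JSON and decode to the same string. The pass-through form is shorter, which matches the file-size point made in the comments.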