-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix the content-type mismatch problem #70
base: master
Are you sure you want to change the base?
Conversation
We notice that Compy judge the content type based on the `content-type` field in the http response header. It is possible to cause failure (especially when performing decompression) when the `content-type` field does not match the real type of web content. A better way to recognize the real content type is to use the file format magic code in the file header. To this end, we should get the response.body first (from line 216 to line 225 in ./proxy/proxy.go) 1.To support precise recognition of JPEG files, we can check whether the first 2 bytes in response.body equals to 0xffd8 (from line 230 to line 232 in ./proxy/proxy.go). 2.To support precise recognition of PNG files, we can check whether the first 3 bytes in response.body equals to 0x474946 (from line 237 to line 239 in ./proxy/proxy.go). 3.To support precise recognition of gif files, we can check whether the first 7 bytes in response.body equals to 0x89504e470d0a1a (from line 233 to line 236 in ./proxy/proxy.go). In addition, we fix a syntax error (on line 144).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs some kind of test.
if body_data[0] == 0x47 && body_data[1] == 0x49 && body_data[2] == 0x46 { | ||
real_content_type = "image/png" | ||
} | ||
if real_content_type != "" && real_content_type != content_type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of test can you add for this behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi gaul,
we find that quite a few tyros may incorrectly configure the 'content-type' field in HTTP header.
There are many discussions about this stuff on the Internet.
Just to list a few:
https://stackoverflow.com/questions/69890597/web-server-content-type-not-configuring-correctly-on-apache
https://www.drupal.org/project/drupal/issues/1440534
This drives us to think about making compy more robust and friendly to these web pages with the above modifications :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any tests you can add to compy_test.go
?
Thanks for your detailed comments! |
We notice that Compy judge the content type based on the
content-type
field in the http response header. It is possible to cause failure (especially when performing decompression) when thecontent-type
field does not match the real type of web content.A better way to recognize the real content type is to use the file format magic code in the file header.
To this end, we should get the response.body first (from line 216 to line 225 in ./proxy/proxy.go)
1.To support precise recognition of JPEG files, we can check whether the first 2 bytes in response.body equals to 0xffd8 (from line 230 to line 232 in ./proxy/proxy.go).
2.To support precise recognition of PNG files, we can check whether the first 3 bytes in response.body equals to 0x474946 (from line 237 to line 239 in ./proxy/proxy.go).
3.To support precise recognition of gif files, we can check whether the first 7 bytes in response.body equals to 0x89504e470d0a1a (from line 233 to line 236 in ./proxy/proxy.go).
In addition, we fix a syntax error (on line 144).