Description
I have been scraping from data18 a lot lately and I have noticed that their scene pages, as initially loaded, provide a highly compressed, extra-cropped image that maxes out at 600px wide. I have also noticed that if you click the 'Gallery #0' button on the left, they provide a less-cropped, much less compressed image that seems to max out at 1000px wide. I have further noticed that the data18 scraper downloads the former image.
- Typical example
- The click-through image is only 688px wide (vs 600px), but it is much less compressed and less cropped, while the initially loaded image has obvious compression artifacts.
- Best-/worst-case example
- The click-through image is 1000px wide and a lot of resolution is being left on the table.
I have been using the following script to grab the better image:
https://gist.github.com/Trousers0788/2deabef9972398fbc6b0b62b1d195d26
It's rough and I haven't distilled exactly which headers/parameters are strictly necessary, but the point is that it's a working proof of concept that successfully downloads a better image.
It occurs to me that it might make things easier for myself (and better for others) if something similar were part of a scraper that's properly integrated with stash. I'm wholly unfamiliar with the codebase, but at a glance it seems like the primary challenge might be that the existing data18 scraper seems to just be a configuration for a generic scraper. I see there are other scrapers with custom logic, but they seem to be all custom logic (i.e. it doesn't seem like the two approaches can be combined).
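To make the idea a bit more concrete, here is a rough sketch of what a script-style scraper could look like. It assumes (I haven't verified this against the codebase) that a script scraper receives a JSON fragment containing the scene `url` on stdin, is expected to print a JSON result on stdout, and that the `image` field may be a base64 data URI. `scrape_scene`, `to_data_uri`, and `fetch_gallery_image` are hypothetical names; `fetch_gallery_image` stands in for the download logic from the gist above and is passed in rather than implemented here.

```python
import base64
import io
import json
from typing import Callable, TextIO


def to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def scrape_scene(
    stdin: TextIO,
    stdout: TextIO,
    fetch_gallery_image: Callable[[str], bytes],
) -> None:
    """Read a JSON fragment with a 'url' key and write a JSON result.

    fetch_gallery_image is a hypothetical callback that returns the bytes
    of the better (click-through) image for the given scene URL.
    """
    fragment = json.load(stdin)
    url = fragment["url"]
    image_bytes = fetch_gallery_image(url)
    # Assumption: the consumer accepts the image as a data URI.
    result = {"url": url, "image": to_data_uri(image_bytes)}
    json.dump(result, stdout)
```

In a real script this would be called as `scrape_scene(sys.stdin, sys.stdout, fetch_gallery_image)`, with the rest of the scene fields still coming from wherever the existing configuration gets them.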
Thoughts?