Data18 scraper downloads suboptimal image #2534

@Trousers0788

I have been scraping from data18 a lot lately, and I have noticed that their scene pages, as initially loaded, provide a highly compressed, more tightly cropped image that maxes out at 600px wide. If you click the 'Gallery #0' button on the left, they provide a less-cropped, much less compressed image that seems to max out at 1000px wide. The data18 scraper downloads the former image.

  • Typical example
    • The click-through image is only 688px wide (vs. 600px), but it is much less compressed and less cropped; the initially loaded image has obvious compression artifacts.
  • Best-/worst-case example
    • The click-through image is 1000px wide, so the 600px image the scraper downloads leaves a lot of resolution on the table.
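
To make the comparison in the examples above concrete, here is a minimal sketch of how a scraper could choose between the two images once both are known. It is not taken from the linked gist; `Candidate`, `pick_best`, and all field names are hypothetical, and file size is used only as a rough stand-in for "less compressed".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One downloadable cover image (hypothetical structure, for illustration)."""
    url: str
    width: int       # pixel width of the image
    size_bytes: int  # file size, a rough proxy for compression level


def pick_best(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Return the widest candidate, breaking ties by larger file size.

    Returns None when there are no candidates, so the caller can fall
    back to whatever image it already has.
    """
    return max(candidates, key=lambda c: (c.width, c.size_bytes), default=None)
```

With the widths reported above, the 688px or 1000px click-through image would win over the 600px page image; if both happened to be the same width, the larger file would be preferred.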

I have been using the following script to grab the better image:

https://gist.github.com/Trousers0788/2deabef9972398fbc6b0b62b1d195d26

It's rough, and I haven't worked out exactly which headers and parameters are strictly necessary, but the point is that it's a working proof of concept that successfully downloads a better image.

It occurs to me that it might make things easier for me (and better for others) if something similar were part of a scraper that's properly integrated with stash. I'm wholly unfamiliar with the codebase, but at a glance the primary challenge seems to be that the existing data18 scraper is just a configuration for a generic scraper. I see there are other scrapers with custom logic, but those seem to consist entirely of custom logic (i.e. it doesn't seem like the two approaches can be combined).
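
As a rough illustration of what combining the two approaches could look like, the sketch below treats the custom logic as a small post-processing step applied to whatever the generic scraper already returned. This is only an assumption about a possible design, not how stash works today: `upgrade_image`, `find_better`, and the `"url"` and `"image"` keys are hypothetical names chosen for the example.

```python
from typing import Callable, Optional


def upgrade_image(scene: dict, find_better: Callable[[str], Optional[str]]) -> dict:
    """Return a copy of `scene` with its image replaced when a better one is found.

    `scene` stands in for the result of the generic scraper; `find_better`
    stands in for custom logic (such as the linked script) that takes the
    scene page URL and returns a better image URL, or None.
    """
    result = dict(scene)  # leave the caller's dict untouched
    page_url = scene.get("url")
    if not page_url:
        return result
    try:
        better = find_better(page_url)
    except Exception:
        better = None  # on any failure, keep the image the generic scraper found
    if better:
        result["image"] = better
    return result
```

The point of this shape is that the existing configuration keeps doing all the metadata work, and a failure in the extra step only costs the image upgrade rather than the whole scrape.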

Thoughts?

Labels: enhancement
