Fixed issue with links not being found #298
base: master
Commits on Feb 5, 2020
Fixed issue with links not being found
Google recently changed the way it presents the image data, so the links were no longer being scraped. I worked out how to get the image URLs under the new system and made the changes needed for it to work. Unfortunately, Google no longer provides file format data, so I had to try to retrieve it from the URL of the image, which does not work in some cases.
Commit aa1f012
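As an illustration of retrieving the file format from the image URL, here is a minimal sketch. The function name `guess_format_from_url` and the `KNOWN_FORMATS` set are hypothetical and not taken from the patch; the second call shows the kind of URL for which this approach cannot work.

```python
import os
from urllib.parse import urlparse

# Hypothetical set of extensions treated as known image formats.
KNOWN_FORMATS = {"jpg", "jpeg", "png", "gif", "bmp", "webp", "svg", "ico"}


def guess_format_from_url(url):
    """Return a lowercase extension taken from the URL path, or "" if none is usable."""
    path = urlparse(url).path            # drops the query string and fragment
    ext = os.path.splitext(path)[1]      # e.g. ".JPG", or "" when there is no extension
    ext = ext.lstrip(".").lower()
    return ext if ext in KNOWN_FORMATS else ""


# The format is recoverable when the path ends in a known extension...
print(guess_format_from_url("https://example.com/photos/cat.JPG?width=640"))
# ...but not when the image is served from an extension-less endpoint.
print(guess_format_from_url("https://example.com/image?id=12345"))
```

When the result is empty, a caller would need some other fallback, for example inspecting the response's Content-Type header.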
Commits on Feb 9, 2020
Filtering out the image objects that had data[0]==2 removes the null items, so it no longer raises the error: "TypeError: 'NoneType' object is not subscriptable".
Commit 66f69d6
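A minimal sketch of that filter, assuming each image object is a list whose first element is a type code and whose second element is the payload (None for the null items). The sample data and the name `drop_null_items` are made up for illustration.

```python
def drop_null_items(image_objects):
    """Keep only entries whose first field is not 2 (assumed to mark the null items)."""
    return [obj for obj in image_objects if obj[0] != 2]


# Made-up sample: entries of type 1 carry a payload, entries of type 2 carry None.
sample = [
    [1, ["https://example.com/a.jpg", 640, 480]],
    [2, None],
    [1, ["https://example.com/b.png", 800, 600]],
]

# Without the filter, obj[1][0] on the second entry would raise
# "TypeError: 'NoneType' object is not subscriptable".
urls = [obj[1][0] for obj in drop_null_items(sample)]
print(urls)
```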
Commit 8b794e0
Commit a36a378
Commits on Feb 10, 2020
This system is not very flexible: it seems Google does not keep the target items in the same positions, so sometimes it doesn't work. I added a try-except in case there are more problems.
Commit fbc4a16
Commits on Mar 14, 2020
It is based on the patch by https://github.com/Joeclinton1, but for some reason we get an escaped string when fetching the results page directly (limit < 101) and an unescaped one when fetching the results page using Selenium. This is not the most elegant solution, but it works for me.
Commit ef577fc
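One way to cope with both variants is sketched below: try to parse the text as-is, and only unescape it if that fails. This is an assumption about how such a fallback could look, not the code from the commit; `load_payload` and the sample strings are hypothetical.

```python
import json


def load_payload(raw):
    """Parse a JSON payload that may or may not contain backslash escapes such as \\x22."""
    try:
        # Unescaped variant (assumed to be what the Selenium path returns).
        return json.loads(raw)
    except json.JSONDecodeError:
        # Escaped variant (assumed to be what the direct request returns).
        # Caveat: "unicode_escape" treats the bytes as Latin-1, so non-ASCII
        # characters in the payload would be mangled by this step.
        unescaped = raw.encode("utf-8").decode("unicode_escape")
        return json.loads(unescaped)


unescaped_text = '[1, ["https://example.com/a.jpg"]]'
escaped_text = r'[1, [\x22https://example.com/a.jpg\x22]]'

print(load_payload(unescaped_text) == load_payload(escaped_text))
```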
Commits on Mar 24, 2020
Commit 90e52a4
Commit 7db9a46
Commits on Mar 25, 2020
Commit 2cd6817
Commit 068712b
Commits on Jun 17, 2020
Google changed their format a little. Again.
Alexey Voinov committed Jun 17, 2020
Commit d8dd8a9
Commits on Jun 27, 2020
Merge pull request #2 from Joeclinton1/master
Merged master and patch-1
Commit 18b0e45
Commit 36f798f
Commit 620e7f5
Commits on Sep 6, 2020
Previously, the end_object for the data pack was found by searching for '</script>' and then going 4 characters back. However, a recent Google update added ', sideChannel: {}});' to the end of the data pack, which throws that off. To fix this, the end_object search now first finds '</script>' and then finds the first ']' to the left of that closing script tag. This should be more flexible.
Commit bcb2af3
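The search described above can be sketched as follows. The start marker, the function name `extract_data_pack`, and both sample pages are assumptions made for the example; only the "find '</script>', then the nearest ']' to its left" step comes from the commit message.

```python
import json

START_MARKER = "AF_initDataCallback"  # assumed marker for the start of the data pack


def extract_data_pack(page, start_marker=START_MARKER):
    """Return the JSON array text between the first '[' after the marker and the
    last ']' before the closing script tag."""
    marker = page.find(start_marker)
    if marker == -1:
        raise ValueError("start marker not found")
    start_object = page.find("[", marker)
    script_end = page.find("</script>", marker)
    if start_object == -1 or script_end == -1:
        raise ValueError("data pack boundaries not found")
    # First ']' to the left of the closing script tag.
    last_bracket = page.rfind("]", start_object, script_end)
    if last_bracket == -1:
        raise ValueError("closing bracket not found")
    return page[start_object:last_bracket + 1]


# Made-up pages: one without and one with the trailing sideChannel field.
old_page = "<script>AF_initDataCallback({key: 'ds:1', data:[1, [2, 3]]});</script>"
new_page = "<script>AF_initDataCallback({key: 'ds:1', data:[1, [2, 3]], sideChannel: {}});</script>"

for page in (old_page, new_page):
    print(json.loads(extract_data_pack(page)))
```

Because the end position is found relative to the last ']' rather than at a fixed offset, extra text between the array and the script tag does not affect the result, as long as that text contains no ']' of its own.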
Previously, if the data unpacking failed, it would tell the user that the URL could not be opened, which is the wrong exception. I fixed this by splitting the data unpacking and the URL opening into separate parts so that each can have its own exception. This should make it easier to identify what has gone wrong.
Commit 58a190b
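A rough sketch of that separation, using two hypothetical exception classes (`PageFetchError`, `DataUnpackError`) and helper names that do not come from the patch. Each step reports its own failure, so a parsing problem is no longer reported as a URL problem.

```python
import json
from urllib.request import Request, urlopen


class PageFetchError(Exception):
    """Raised when the results page cannot be opened (hypothetical name)."""


class DataUnpackError(Exception):
    """Raised when the page was fetched but its data could not be parsed (hypothetical name)."""


def fetch_page(url, timeout=10):
    try:
        request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError) as exc:
        # URLError and timeouts are OSError subclasses; a malformed URL raises ValueError.
        raise PageFetchError(f"Could not open URL: {url}") from exc


def unpack_data(text):
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise DataUnpackError("Page was fetched, but its data could not be unpacked") from exc


# Offline demonstration: each failure surfaces under its own exception type.
for action in (lambda: fetch_page("not-a-url"), lambda: unpack_data("not json")):
    try:
        action()
    except (PageFetchError, DataUnpackError) as err:
        print(type(err).__name__, "-", err)
```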
Commits on Jan 31, 2021
Commit aa817df
Commits on May 25, 2021
Commit 2a310f1
Delete google_images_download.py
Just added to the wrong directory by accident.
Commit c17c55d
Commit 4c5e6a4
Commits on Jun 1, 2021
Commit dd0b83d
Commits on Jun 16, 2021
Fix clicking on the "Show more results" button with Selenium.
- The button no longer has the "smb" id.
- We need to scroll down further before clicking.
Commit 2f9f801
Commits on Jun 30, 2021
Fix JSONDecodeError: Extra Data
This may have been caused by Google changing their Ajax response. Looking at the response, lines[4] contained only a single number and no JSON. Removing it and simply pulling from lines[3] seems to fix the issue. The problem only appeared when downloading more than 100 images, which requires launching ChromeDriver.
Commit df2e289
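The failure mode can be reproduced with a made-up response, as in the sketch below. The sample text, the function name `parse_ajax_response`, and the assumption that the earlier code parsed lines[3] and lines[4] joined together are all illustrative; only the "lines[4] is a bare number, lines[3] holds the JSON" observation comes from the commit message.

```python
import json

# Made-up response body: a prefix line, a blank line, a number, the JSON line,
# and a trailing line holding only a number.
sample_response = "\n".join([
    ")]}'",
    "",
    "1234",
    '[["placeholder", "payload"]]',
    "56",
    "",
])


def parse_ajax_response(text):
    """Parse only lines[3], ignoring the numeric line that follows it."""
    lines = text.split("\n")
    return json.loads(lines[3])


print(parse_ajax_response(sample_response))

# Joining the JSON line with the numeric line reproduces the error from the title.
lines = sample_response.split("\n")
try:
    json.loads(lines[3] + lines[4])
except json.JSONDecodeError as err:
    print("JSONDecodeError:", err.msg)
```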
Commits on Aug 25, 2021
We extracted images from json.loads(data)[31][0]... because json.loads(data)[31] was a list of 1 value. Now json.loads(data)[31] is a list of 2 values and we want the last one. Replacing 0 with -1 handles this new case, and also the old one if Google reverts the change.
Commit a8e28e2
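A small sketch of why the -1 index covers both layouts. The two payloads are fabricated stand-ins with only the length of the list at index 31 modelled, and `image_group` is a hypothetical helper name.

```python
import json


def image_group(data):
    """Return the last element of the list at index 31 of the decoded payload."""
    return json.loads(data)[31][-1]


# Fabricated payloads: 31 filler entries, then the list of interest at index 31.
old_layout = json.dumps([None] * 31 + [["images"]])            # list of 1 value
new_layout = json.dumps([None] * 31 + [["other", "images"]])   # list of 2 values

print(image_group(old_layout), image_group(new_layout))
```

With index 0, the second payload would have returned "other" instead of the image data.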
Commits on Sep 20, 2021
The time range feature has changed. I used this tweet thread to fix it: https://twitter.com/i/events/1174066444029419520. We could work on the time_range format to avoid changing the "API".
Commit 375b6bb
Commits on Sep 22, 2021
Remove time range from directory names
It is not very useful to have the time range expression in the image directory names.
Commit a0c18fd
Commits on Sep 26, 2021
Merge pull request #7 from NicolasGrosjean/patch-3
Get more than 400 images
Commit 7c91e00
Merge pull request #8 from matthewlehew/patch-1
Fix JSONDecodeError: Extra Data
Commit 9a0008d
Merge pull request #9 from NicolasGrosjean/patch-4
Manage API change
Commit 9070776
Merge pull request #10 from NicolasGrosjean/patch-6
Fix time_range argument
Commit e13cc55
Commit c773e1c
Commits on Feb 23, 2022
Update the URL building to use the new way of requesting an exact image size, thanks to this article: https://www.labnol.org/internet/google-image-size-search/26902/
Commit 36e5c06
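For illustration only, the sketch below assumes that the "new way" is the imagesize:WIDTHxHEIGHT search operator appended to the query text; the commit message does not spell this out, so treat that as an assumption. The function name `build_exact_size_url` is hypothetical.

```python
from urllib.parse import urlencode


def build_exact_size_url(keywords, width, height):
    """Build an image-search URL whose query carries an exact-size operator.

    Assumption: the exact size is requested with "imagesize:WIDTHxHEIGHT"
    placed in the query string itself.
    """
    query = f"{keywords} imagesize:{width}x{height}"
    params = urlencode({"q": query, "tbm": "isch"})  # tbm=isch selects image search
    return f"https://www.google.com/search?{params}"


print(build_exact_size_url("red panda", 1920, 1080))
```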
Commits on Mar 3, 2022
Merge pull request #12 from NicolasGrosjean/patch-7
Fix exact_size parameter
Commit ce512d9
Commits on Aug 5, 2022
Commit cf190d8
Commits on Aug 15, 2022
Commit ae03d01
Commits on Aug 18, 2022
Commit dcb4619
Commit 03671f3
Commits on Sep 23, 2022
Commit 945aeff
Commit dffca08
Commit 3f58a9a
Commits on Sep 24, 2022
Commit 219b850
Commits on Sep 26, 2022
Commit 1421a43
Commits on Sep 30, 2022
Commit 2e117f3
Merge pull request #26 from ellisbrown/patch-1
fix breaking change due to google's response format
Commit e91e6a3