Cannot Find Images for this Search Filter #280

vk379 · 2019-11-03T15:17:01Z

Hey,
I recently noticed that google_images_download fails to download images at random times.
For example, if I run a loop that downloads the same image 10 times, it succeeds most of the time but fails at random points: iteration 1 downloads the images successfully, but iteration 9 fails for no apparent reason, even though the arguments are kept constant.
Could use any and all help on this issue.
The Error I receive is:
Unfortunately all 50 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Thanks,
-vk379

S-Cardenas · 2019-11-11T03:53:06Z

I have been seeing this issue too and would love to know what's behind it. Could it be that Google prevents you from downloading images if it detects the same IP address making requests too quickly?

vk379 · 2019-11-11T12:34:52Z

I'm not entirely sure. I fixed the issue personally, but my code is really inefficient: I am basically running a while loop until it works, which means I call Google Images around 3-4 times before I get the URLs I want. Pretty brute force, but it does the job. I would love to see a proper solution for this issue though.
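For anyone wanting to try the same workaround, here is a minimal sketch of a bounded retry loop. It is not the code used above: `fetch_with_retries`, its parameters, and the `fetch` callable (whatever function returns your list of URLs) are all hypothetical names chosen for illustration. Capping the attempts and pausing between them avoids hammering Google in an endless loop.

```python
import time


def fetch_with_retries(fetch, max_attempts=4, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty result or attempts run out.

    `fetch` is a hypothetical zero-argument callable that returns a list of
    image URLs (empty when the search produced nothing).
    """
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result:
            return result
        if attempt < max_attempts:
            # Wait a little longer after each failed attempt.
            sleep(delay_seconds * attempt)
    return []
```

Passing `sleep` as a parameter is only there so the pause can be replaced in tests.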

S-Cardenas · 2019-11-11T21:16:57Z

I was thinking of doing the same thing but don't want to brute force it. I'm more worried about the root of the issue being Google related.

S-Cardenas · 2019-11-18T19:54:31Z

After some digging around, I found the issue to be the result of Google updating the response format. Traditionally, each of Google's image results came in the form of:

<div jscontroller="Q7Rsec" data-ri="0" class="rg_bx rg_di rg_el ivg-i" data-ved="0ahUKEwj0zuvFn_TlAhXJc98KHQC0CAcQMwhkKAAwAA">
    <a jsname="hSRGPd" href="#" jsaction="fire.ivg_o;mouseover:str.hmov;mouseout:str.hmou" class="rg_l" rel="noopener">
        <div class="THL2l"></div><img id="CixhSoPkCojjCM:" src="data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==" jsaction="load:str.tbn" class="rg_ic rg_i" alt="Image result for economy chart" data-deferred="1">
        <div class="rg_ilmbg"> 2161&nbsp;&#215;&nbsp;1910 </div>
    </a>
    <a class="iKjWAf irc-nic isr-rtc a-no-hover-decoration" href="#" jsaction="mouseover:m8Yy5c;mousedown:QEDpme;focus:QEDpme;" rel="noopener" target="_blank">
        <div class="mVDMnf nJGrxf">The $80 Trillion World Economy in One Chart</div>
        <div class="nJGrxf FnqxG"><span>visualcapitalist.com</span></div>
    </a>
    <div class="rg_meta notranslate">{"id":"CixhSoPkCojjCM:","isu":"visualcapitalist.com","itg":0,"ity":"jpg","oh":1910,"ou":"http://2oqz471sa19h3vbwa53m33yj-wpengine.netdna-ssl.com/wp-content/uploads/2018/10/world-economy-gdp.jpg","ow":2161,"pt":"The $80 Trillion World Economy in One Chart","rh":"visualcapitalist.com","rid":"vzfo7BtwQ7sOEM","rmt":0,"rt":0,"ru":"https://www.visualcapitalist.com/80-trillion-world-economy-one-chart/","sc":1,"st":"Visual Capitalist","th":211,"tu":"https://encrypted-tbn0.gstatic.com/images?q\u003dtbn:ANd9GcRxpTvHqGYeKsCQATZP0ChgnXw2b4PAzSyBWHkpYNfFE1oqrDi7kg\u0026s","tw":239}</div>
    <div class="ll0QOb"></div>
</div>

This library, google-images-download, then searches for each original image link by searching for rg_meta notranslate in the body of the response. This can be seen in the method _get_next_item of google_images_download.py.
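To make the lookup concrete, here is a simplified sketch of that kind of search. It is not the library's actual `_get_next_item` code; the function name `iter_rg_meta` is made up for this example, and it assumes each `rg_meta notranslate` div holds a single JSON object, as in the sample above.

```python
import json

MARKER = 'rg_meta notranslate'


def iter_rg_meta(html):
    """Yield the JSON object stored in each `rg_meta notranslate` div."""
    pos = 0
    while True:
        start = html.find(MARKER, pos)
        if start == -1:
            return  # marker absent: nothing (more) to parse
        json_start = html.find('{', start)
        if json_start == -1:
            return
        json_end = html.find('</div>', json_start)
        if json_end == -1:
            return
        # json.loads also decodes escapes such as \u003d inside the values.
        yield json.loads(html[json_start:json_end])
        pos = json_end
```

The original image link is then the `ou` field of each yielded object, for example `[meta["ou"] for meta in iter_rg_meta(page_html)]`.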

However, Google will occasionally send the response in a different format, such as this:

<div jsaction="IE7JUb:e5gl8b;MW7oTe:fL5Ibf;dtRDof:s370ud;R3mad:ZCNXMe;v03O1c:cJhY7b;" data-ved="2ahUKEwjbwvWwovTlAhWJNN8KHdolChQQMygAegUIARD_AQ" data-ictx="1" data-id="CixhSoPkCojjCM" jsname="N9Xkfe" data-ri="0" class="isv-r PNCib MSM1fd BUooTd" jscontroller="SI4J6c" jsmodel="uZbpBf sB4qxc" jsdata="j0Opre;CixhSoPkCojjCM;7" style="width:179px;" data-tbnid="CixhSoPkCojjCM" data-ct="0" data-cb="0" data-cl="0" data-cr="0" data-tw="239" data-ow="2161" data-oh="1910">
    <a class="wXeWr islib nfEiy mM5pbd" jsname="sTFXNd" jsaction="click:J9iaEb;" jsaction="mousedown:npT2md; touchstart:npT2md;" data-nav="1" tabindex="0" style="height:158px;">
        <div class="bRMDJf islir" jsname="DeysSe" style="background:rgb(248,69,133);width:179px; height:158px;" jsaction="mousedown:npT2md; touchstart:npT2md;"><img class="rg_i Q4LuWd tx8vtf" src="data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==" data-iid="12" data-iurl="https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRbMq0HH9-11adBbwZ2LP9hFwRpBoe6UbFGgUvr2zNe0O5dhxyp" jsname="Q4LuWd" alt="Image result for economy chart" /></div>
        <div class="c7cjWc"></div>
        <div class="h312td  RtIwE" jsname="bOERI">
            <div class="O1vY7"><span>2161 × 1910</span></div>
        </div>
        <div class="PiLIec" jsaction="click: gFs2Re"></div>
    </a>
    <a class="VFACy kGQAp" data-ved="2ahUKEwjbwvWwovTlAhWJNN8KHdolChQQr4kDegUIARCAAg" jsname="uy6ald" rel="noopener" target="_blank" href="https://www.visualcapitalist.com/80-trillion-world-economy-one-chart/" jsaction="focus:kvVbVb; mousedown:kvVbVb; touchstart:kvVbVb;">
        <div class="sMi44c lNHeqe">
            <div class="WGvvNb">The $80 Trillion World Economy in One Chart</div>
            <div class="fxgdke">visualcapitalist.com</div>
        </div>
    </a>
</div>

This type of response cannot be parsed correctly by the library because the library looks specifically for rg_meta notranslate to find the original image link. In this updated response, the rg_meta notranslate div and the original image link are absent from the element. The original image link is instead provided in a JavaScript callback, AF_initDataCallback.
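A quick way to tell which of the two formats a given response uses is to look for those two markers. This is only a sketch: the function name and the returned labels are invented for illustration, and it assumes the markers appear literally in the raw HTML.

```python
def detect_response_format(html):
    """Classify a Google Images response by the marker it contains."""
    if 'rg_meta notranslate' in html:
        return 'rg_meta'        # older format, parsable by the library
    if 'AF_initDataCallback' in html:
        return 'af_callback'    # newer format, links live in a script callback
    return 'unknown'
```

Logging this label for each failed request should show whether the "0 is all we got" error lines up with the newer format.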

It's not clear why Google is sending two differently formatted responses. It's also not clear if they will continue to do so in the future. This particular "bug" could be fixed via pull request, but we, the community, should be mindful of any future changes Google makes to the response format.
@vk379 @hardikvasa

vk379 · 2019-11-18T20:53:44Z

Wow, this is an amazing discovery. It explains why I am getting so many errors. I think Google has recently rolled out a lot of updates, specifically dealing with Google Search and the implementation of new NLP software called BERT. Maybe they changed a little bit of the code on their end during this, which could be causing these changes.

S-Cardenas · 2019-11-18T22:16:16Z

It seems like they're slowly or partially rolling it out since only some of their responses are different.

pixelicous · 2020-02-04T20:53:54Z

I just started using this library and have never managed to download anything. I also tried downloading chromedriver and specifying it in the arguments. Nada. I also tried looping until I get downloads; on the third try it always hangs on "Evaluating". Restarting the script seems to get the loop rolling again, until the third try.

Unfortunately all 100 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

S-Cardenas · 2020-02-04T21:53:50Z

@pixelicous Could you check whether the HTML being returned from Google matches either of the formats mentioned in: #280 (comment)

vk379 · 2020-02-05T01:20:49Z

This issue just got worse. My aforementioned method of looping until I get the URLs isn't working now, as the URLs almost never come up. This is a huge issue. I'm not sure if it's even being worked on though :(

galfaroth · 2020-02-05T11:35:16Z

I started to use this lib yesterday and was also unable to download almost anything. Sometimes some keywords worked, but some simple ones like "city" did not succeed.

pixelicous · 2020-02-05T11:39:37Z

@S-Cardenas Could you point me to how to get the HTML returned using the library?

From the other replies it seems I am not the only one 😿

AlNik999 · 2020-02-05T15:44:07Z

the same here. Can't load even a single apple image

S-Cardenas · 2020-02-05T15:49:25Z

@pixelicous I believe the s in https://github.com/hardikvasa/google-images-download/blob/master/google_images_download/google_images_download.py#L714 is the raw html that is returned from the server.

RiddlerQ · 2020-02-05T15:51:39Z

Same here, if i loop it like 100 times then one image sometimes downloads

S-Cardenas · 2020-02-05T21:30:30Z

Jakub, Impressive! What was the issue?

On Wed, Feb 5, 2020 at 4:16 PM Jakub Dobies wrote: After all day of pulling my hairs i finally repaired code. I will upload it after some minor fixes.

RiddlerQ · 2020-02-05T22:45:06Z

Like you said before, "start_line = s.find('rg_meta notranslate')" was the problem.

I deleted my post because my solution was simply to search for URLs by finding strings that start with "https://" and end with ".jpg"/".png"... I said that I would upload it, but my deadline is tomorrow and my solution doesn't work with arguments. I created a simplified version with only keywords and the number of images to download. It's a lot of work to add all of it, especially since I started learning Python 2 days ago ;'c
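The brute-force idea can be sketched with a regular expression. This is an illustration only, not the code from the deleted post: the name `find_image_urls` is made up, and the extension list (jpg, jpeg, png) is an assumption since the original list is elided.

```python
import re

# Starts with https://, stops at the first image extension, and never
# crosses quotes, whitespace, angle brackets or backslashes.
IMAGE_URL_RE = re.compile(
    r'https://[^"\'\s<>\\]+?\.(?:jpg|jpeg|png)', re.IGNORECASE
)


def find_image_urls(text):
    """Return unique image-looking URLs in order of first appearance."""
    seen = set()
    urls = []
    for match in IMAGE_URL_RE.finditer(text):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Because it ignores page structure, this kind of scan keeps working when the markup changes, but it can also pick up URLs that are not search results.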

I will have time after this week, so if no one more experienced than me starts to fix it, I will try to fully implement parameters etc.

vk379 · 2020-02-06T01:38:02Z

Dude, let me just say that you're a god if you were able to learn python in 2 days and come up with a fix. And honestly, I just need the image URLs man. I am using this library to simply search Images and gifs and get their URLs. Any help is appreciated.

vk379 · 2020-02-06T01:39:05Z

PS: Sorry about accidentally closing and reopening issue!

RiddlerQ · 2020-02-06T02:22:50Z

I will post a simplified version tomorrow and the full version in a week or two.

RiddlerQ · 2020-02-06T06:23:47Z

#298 Someone else beat me to it <3
And it's done well, not like my rookie code.

gonjumixproject · 2020-02-18T08:31:47Z

Any update on this one ? Because I have the same issue.. :(

Igor-Shabalin · 2020-02-23T06:47:18Z

Unfortunately all 10 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter! (((

miltosc · 2020-03-01T01:03:53Z

same here ...

googleimagesdownload --keywords "house" --limit 120 --chromedriver ./chromedriver -o ./downloaded-images/

result: Unfortunately all 120 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Please help...

Any other info you want please ask for, dont hesitate

SarthakDandriyal · 2020-03-04T06:26:47Z

Is there a solution for this error?
Is there some other library or method to download images from the internet?

Divyam10 · 2020-03-05T14:25:20Z

Use chrome extension guys. All images Download. It can download images over 700 too

SarthakDandriyal · 2020-03-05T14:28:24Z

@Divyam10 Can you guide me on how to use it in Python?

miltosc · 2020-03-06T15:36:33Z

@Divyam10 Which extension? Send a link.

RiddlerQ · 2020-03-18T03:07:25Z

I thought the problem was solved ;-;

I have my simple downloader on GitHub right now. I wanted to post it earlier, but I saw that someone here had fixed the issue with google_images_download not finding pics. If someone only needs to download or find URLs, my code works every time for me, but it's veeeerrryyyy simplified and I applied brute force for searching.

vk379 · 2020-03-21T19:39:59Z

@RiddlerQ Your new library is sick man, I'm already using it. Thanks a bunch dude.

RiddlerQ · 2020-03-23T01:15:57Z

@vk379 Thanks dude, you totally made my day <3

Math-crypto · 2020-04-05T05:30:29Z

@RiddlerQ Your library is veeeeeeeerrrryyyyyy good, thank you.
Is there any option to download "high-quality" photos? Most of the downloaded photos are just too small and vague.

RiddlerQ · 2020-04-06T00:24:37Z

I will update my library this week; I was very busy for the last few days. I'm adding the "high-quality" feature to my bucket list. I don't think this will be a problem: I intentionally skipped bigger pics because they took a long time to download.

FarisHijazi · 2020-04-07T21:03:03Z

@S-Cardenas Hello, yes indeed, Google has changed their response format. I have a script that downloads images as well, and I got around this issue (in a very ugly and hacky way).
I plan on submitting an issue here and a pull request as well. Just give me around a week and it should get done.

…itDataCallback()
The Google page contains info in a script variable `AF_initDataCallback`. See the JavaScript that parses it: https://gist.github.com/FarisHijazi/6c9ba3fb315d0ce9bfa62c10dfa8b2f8
This commit is an implementation of this code.
fix-2020-format
I have added an iterator that returns rg_meta objects
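As a rough illustration of what such an iterator could look like, here is a sketch in Python. It is not the code from the commit or the gist: the exact layout of the `AF_initDataCallback` payload is not described in this thread, so the sketch only isolates the script text that follows each callback and scans it for image-looking URLs. The function name, the extension list, and the choice to yield dicts with just an `ou` key (mirroring the old rg_meta field) are all assumptions.

```python
import re

CALLBACK = 'AF_initDataCallback('
URL_RE = re.compile(
    r'https?://[^"\'\s<>\\]+?\.(?:jpg|jpeg|png|gif)', re.IGNORECASE
)


def iter_callback_items(html):
    """Yield rg_meta-like dicts for image URLs found inside callback scripts."""
    pos = 0
    while True:
        start = html.find(CALLBACK, pos)
        if start == -1:
            return
        # Treat everything up to the closing script tag as the payload.
        end = html.find('</script>', start)
        if end == -1:
            end = len(html)
        for match in URL_RE.finditer(html, start, end):
            yield {'ou': match.group(0)}
        pos = end
```

URLs outside the callback scripts are ignored, and thumbnail links without a file extension are not matched.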

oDK1 · 2020-04-16T16:58:23Z

@RiddlerQ Total noob developer here. I'm using a cloud IDE (Python), installed your library through "pip install simple_image_download" and ran your Test1 file, but nothing shows up. Any idea why?

adtygan · 2020-07-29T15:11:03Z

@RiddlerQ Dude you are a God. You saved my project. I was breaking my head over how to scrape google images using cli and found this thread. Can't thank you enough.

ronnathaniel · 2020-09-19T05:30:18Z

Amazing package. 10/10 👍

nasiksami · 2020-10-16T22:26:14Z

BROOO !!!! What a package! I spent the whole day for this thing. Found few solutions and the easier one was with bing API.

from bing_image_downloader import downloader
downloader.download("bear", limit=250, output_dir='bear', adult_filter_off=True, force_replace=False)

But your one is a gem! Its so easy and with google images! Thank you!

nasiksami · 2020-10-16T22:27:34Z

@oDK1 Try checking your working directory folder, for example C:\Users\Nasik\simple_images\

S-Cardenas mentioned this issue Dec 3, 2019: Can't download any images #285 (Open)
tmallonee mentioned this issue Feb 5, 2020: Google's image search code has changed. jimkang/g-i-s#4 (Closed)
vk379 closed this as completed Feb 6, 2020
vk379 reopened this Feb 6, 2020
Norod mentioned this issue Feb 6, 2020: Reverse image feature not working any more #297 (Open)
monocongo mentioned this issue Feb 11, 2020: Unabe to download any images #299 (Closed)
SellersEvan mentioned this issue Feb 19, 2020: google change something... #302 (Open)
