You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have created a web scraping script using Playwright in Python that works fine until after some tries it gets rate limited. I have also implemented techniques such as rotating proxies and random user agents, but I am still getting rate-limited. After researching advanced scraping techniques like browser fingerprinting and how to spoof it to bypass rate limitations, it comes down to having an undetected browser and most good scraping libraries seem to exist in JavaScript like crawlee and fingerprint-suite, etc. However, I could not integrate those fingerprinting spoofing libraries in Python and am also not interested in using third-party solutions like scrapingbee, incogniton, or dolphin-anty. Moreover, I have also tried writing the script in Selenium and JavaScript Playwright but was not successful in making the script concurrent until I gave up on it. Moreover, It was easy to create a concurrent program using Playwright in Python.
In short, I need help in bypassing the rate limitation and implementing it in my current script. I am also open to other solutions like using a library in either JavaScript or Python that is easy to make concurrent and allows for multiple browser instances to be run concurrently.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
I have created a web scraping script using Playwright in Python that works fine until after some tries it gets rate limited. I have also implemented techniques such as rotating proxies and random user agents, but I am still getting rate-limited. After researching advanced scraping techniques like browser fingerprinting and how to spoof it to bypass rate limitations, it comes down to having an undetected browser and most good scraping libraries seem to exist in JavaScript like crawlee and fingerprint-suite, etc. However, I could not integrate those fingerprinting spoofing libraries in Python and am also not interested in using third-party solutions like scrapingbee, incogniton, or dolphin-anty. Moreover, I have also tried writing the script in Selenium and JavaScript Playwright but was not successful in making the script concurrent until I gave up on it. Moreover, It was easy to create a concurrent program using Playwright in Python.
In short, I need help in bypassing the rate limitation and implementing it in my current script. I am also open to other solutions like using a library in either JavaScript or Python that is easy to make concurrent and allows for multiple browser instances to be run concurrently.
Beta Was this translation helpful? Give feedback.
All reactions