Browsertrix Host
Self-Hosted
What change would you like to see?
I would like the system to pause crawls instead of failing them if a page behavior detects that the browser is not logged in on supported pages (e.g. Facebook).
This way you don't lose all the content crawled up to the moment you were logged out; you can re-enter the login details in the browser profile and continue crawling afterwards.
Pros: you can input many seeds for a given social media site etc. without needing to create many jobs with "Fail crawl if not logged in" enabled, which would also create more administrative work. There might be a small gap, with some missed seeds, but this is acceptable.
If this is implemented, it is a lot easier to start a crawl without needing to monitor it closely to see whether, e.g., a logged-in Facebook profile has been logged out. Otherwise, the crawl just continues but does not get the real content. Automatically pausing it would help in getting better, smaller, and more relevant data.
It might be great to have an option to get an email if logged out. This is similar to "Pause crawls instead of stopping when quotas are reached or archiving is disabled" (#2997).
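To make the requested behavior concrete, here is a minimal sketch of the decision I have in mind. All names (`OnLoggedOut`, `CrawlAction`, `next_action`) are hypothetical and only for illustration; they are not taken from the Browsertrix codebase or its API.

```python
from enum import Enum


class OnLoggedOut(Enum):
    """Hypothetical per-crawl setting: what to do when a logout is detected."""
    CONTINUE = "continue"  # keep crawling, even though content may be wrong
    FAIL = "fail"          # fail the crawl
    PAUSE = "pause"        # requested: pause so the profile can be fixed


class CrawlAction(Enum):
    """Hypothetical action the crawler takes after a page behavior runs."""
    CONTINUE = "continue"
    FAIL = "fail"
    PAUSE = "pause"


def next_action(logged_in: bool, policy: OnLoggedOut) -> CrawlAction:
    """Return what the crawl should do after a login check on a supported page."""
    if logged_in:
        # Nothing wrong: keep crawling regardless of the policy.
        return CrawlAction.CONTINUE
    # Logged out: apply the configured policy (the two enums share values).
    return CrawlAction(policy.value)
```

With the `PAUSE` policy, content crawled so far is kept and the crawl can be resumed after the browser profile has been logged in again.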
Additional details
No response