Releases: tijme/not-your-average-web-crawler
1.7.3
Improvement(s):
- The `www` subdomain will be treated the same as no subdomain
- Catch exceptions in all the callbacks
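
As an illustration of the `www` handling described above, here is a minimal sketch using only the Python standard library. It is not the crawler's actual implementation; the helper names `normalized_host` and `same_host` are hypothetical.

```python
from urllib.parse import urlsplit


def normalized_host(url: str) -> str:
    """Return the lowercase hostname with one leading 'www.' removed."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def same_host(url_a: str, url_b: str) -> bool:
    """Compare two URLs by hostname, treating 'www.' as no subdomain."""
    return normalized_host(url_a) == normalized_host(url_b)
```

Under this assumption, `https://www.example.com/a` and `https://example.com/b` compare as the same host, while `https://blog.example.com` does not match `https://example.com`.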
Fixed bug(s):
- Fixed possible invalid URL schemes when the `<base>` element was used.
1.7.2
Improvement(s):
- Exception catching in callback methods
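
The idea of catching exceptions in callbacks could look roughly like the sketch below, so that a faulty user callback does not take down the crawl. This is an assumption about the general technique, not the library's code; `safe_call` and its `default` parameter are hypothetical names.

```python
import traceback


def safe_call(callback, *args, default=None):
    """Run a user callback; on any exception, log it and return `default`."""
    try:
        return callback(*args)
    except Exception:
        # Print the traceback to stderr so the failure is visible,
        # then fall back to the default value instead of crashing.
        traceback.print_exc()
        return default
```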
1.7.1
Fixed bug(s):
- Possible deadlock when stopping the crawler using a `CrawlerAction`
1.7.0
If you are upgrading from 1.6.x to 1.7.x please use the migration guide.
New feature(s):
- Request methods (scope option)
- Request timeout (performance option)
Improvement(s):
- Close all threads safely on SIGINT
- Run `crawler_after_finish` callback on SIGINT
- Removed queue counting overhead
- Removed #anchors from URLs
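
Removing `#anchors` matters because two URLs that differ only in their fragment point to the same resource and would otherwise be queued twice. A minimal sketch with the standard library's `urllib.parse.urldefrag` follows; the function name `strip_anchor` is hypothetical and this is not necessarily how the crawler does it.

```python
from urllib.parse import urldefrag


def strip_anchor(url: str) -> str:
    """Return the URL without its '#fragment' part."""
    return urldefrag(url).url


# Both URLs collapse to one entry once the fragment is dropped.
unique = {strip_anchor(u) for u in (
    "https://example.com/page?x=1#intro",
    "https://example.com/page?x=1#details",
)}
```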
Fixed bug(s):
- On request error message was an object
1.6.5
New feature(s):
- Callback option before request start (in thread)
- Callback option after request finish (in thread)
Fixed bug(s):
- New line in default user agent
1.6.4
New feature(s):
- Callback option on request error
Fixed bug(s):
- Duplicate queue items on scheme change due to redirect
1.6.3
Fixed bug(s):
- Missing `.semver` file during installation via Git
1.6.0
If you are upgrading from 1.5.x to 1.6.x please use the migration guide.
New feature(s):
- Support for authentication (e.g. BasicAuth)
- Support for proxies (e.g. SOCKS)
- Added an option for debugging
Improvement(s):
- Request headers have some default headers
- Request headers are case insensitive
- User agent contains `nyawc` including its version
- Updated dependencies to newest versions
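
To show what case-insensitive request headers mean in practice, here is a small standard-library sketch of a mapping that ignores header-name casing. It is an illustration only, not the library's implementation; the class name `Headers` is hypothetical.

```python
from collections.abc import MutableMapping


class Headers(MutableMapping):
    """Mapping whose keys match regardless of letter case."""

    def __init__(self, initial=None):
        self._store = {}  # lowercase name -> (name as last written, value)
        for name, value in (initial or {}).items():
            self[name] = value

    def __setitem__(self, name, value):
        self._store[name.lower()] = (name, value)

    def __getitem__(self, name):
        return self._store[name.lower()][1]

    def __delitem__(self, name):
        del self._store[name.lower()]

    def __iter__(self):
        # Yield names in the spelling used when they were last set.
        return (name for name, _ in self._store.values())

    def __len__(self):
        return len(self._store)
```

With this sketch, setting `user-agent` after `User-Agent` overwrites the existing entry rather than adding a second header.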
1.5.2
1.5.1
Fixed bug(s):
- Missing `.semver` file during installation via pip