Description
As an attempt to address issue #176, I tried replacing the base_url https://info.health.nz/careers with https://info.health.nz/careers/ (adding a trailing slash). That URL returns a 301 to https://info.health.nz/careers without the slash, which CWAC should handle, but the run ends with zero pages scanned.
[{2025-10-14T15:40:49+1300} INFO crawler.py : 69 ] iterate_through_base_urls Starting test https://info.health.nz/careers/ ThreadPoolExecutor-0_4
[{2025-10-14T15:40:49+1300} INFO crawler.py : 482 ] crawl Starting crawl of https://info.health.nz/careers/ ThreadPoolExecutor-0_4
[{2025-10-14T15:40:59+1300} INFO filters.py : 236 ] process_url_headers https://info.health.nz/careers/ has status code 200 ThreadPoolExecutor-0_4
[{2025-10-14T15:40:59+1300} INFO crawler.py : 188 ] url_filter_prevent_intersections URL filtered out due to not starting with base_url https://info.health.nz/careers/ https://info.health.nz/ ThreadPoolExecutor-0_4
[{2025-10-14T15:40:59+1300} INFO crawler.py : 597 ] crawl Crawl exhausted all links https://info.health.nz/careers/ ThreadPoolExecutor-0_4
[{2025-10-14T15:40:59+1300} WARNING verify.py : 21 ] verify_axe_results VERIFY: https://info.health.nz/careers/ had 0 pages scanned, not 10 MainThread
Similar things are happening when trying to use:
https://www.mch.govt.nz/our-work/memorials-and-commemorations/oi-manawa-canterbury-earthquake-national-memorial/ with the slash which 301 redirects to
https://www.mch.govt.nz/our-work/memorials-and-commemorations/oi-manawa-canterbury-earthquake-national-memorial
and
https://www.mbie.govt.nz/business-and-employment/economic-growth/going-for-growth/
which 301 redirects to
https://www.mbie.govt.nz/business-and-employment/economic-growth/going-for-growth
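To illustrate how a trailing slash can interact with a "starts with base_url" filter, here is a minimal sketch. It is not CWAC's actual code: the function names `naive_prefix_match` and `slash_insensitive_match` are hypothetical, and the assumption that the filter is a plain string-prefix check is only a guess based on the log message above. It shows that the redirect target (without the slash) does not start with a base_url that has the slash, and one possible way to compare paths so that the two forms are treated as the same section.

```python
from urllib.parse import urlsplit


def naive_prefix_match(url: str, base_url: str) -> bool:
    """Hypothetical filter: keep a URL only if it starts with base_url verbatim."""
    return url.startswith(base_url)


def slash_insensitive_match(url: str, base_url: str) -> bool:
    """Hypothetical alternative: compare scheme, host and path segments,
    ignoring a trailing slash on either side."""
    u, b = urlsplit(url), urlsplit(base_url)
    if (u.scheme, u.netloc.lower()) != (b.scheme, b.netloc.lower()):
        return False
    base_path = b.path.rstrip("/")
    url_path = u.path.rstrip("/")
    # Match the base path itself, or anything below it on a segment boundary,
    # so "/careers-other" is not mistaken for a child of "/careers".
    return url_path == base_path or url_path.startswith(base_path + "/")


base_url = "https://info.health.nz/careers/"      # configured with the slash
redirected = "https://info.health.nz/careers"     # where the 301 points

print(naive_prefix_match(redirected, base_url))       # False
print(slash_insensitive_match(redirected, base_url))  # True
```

Whether this matches what happens inside url_filter_prevent_intersections would need to be confirmed against crawler.py; the sketch only demonstrates the string behaviour.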
Separately, if we use https://register.charities.govt.nz/, which 302 redirects to https://register.charities.govt.nz/CharitiesRegister/Search, neither the latter URL nor the 3 other pages on that site (/AdvancedSearch, /Account/LogOn, /PowerBI) are found.