Follow redirects #538
Comments
Some very poorly structured work on how to do this is in my repo: https://github.com/alphagov/data-insights-sandbox/tree/main/hyperlink_tester. At the moment it pulls the links from the gov.uk-knowledge-graph content embedded_links table and, for each link, returns:
I will refine this slightly to extract only the final item from the historic status codes and links, so that it answers the question raised in the original issue.
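A minimal sketch of that refinement, not the code in the repo. It assumes each link's redirect history is held as an ordered list of (status code, URL) hops, which may not match the tester's actual output. `Hop` and `final_hop` are hypothetical names, and the 301 status is illustrative only.

```python
from typing import NamedTuple


class Hop(NamedTuple):
    """One step in a redirect chain (hypothetical structure)."""
    status_code: int
    url: str


def final_hop(history: list[Hop]) -> Hop:
    """Return the last hop, i.e. where the link finally resolves."""
    if not history:
        raise ValueError("history is empty")
    return history[-1]


# Illustrative chain; the 301 status code is assumed, not measured.
history = [
    Hop(301, "https://www.gov.uk/guidance/visa-decision-waiting-times-applications-outside-the-uk"),
    Hop(200, "https://www.gov.uk/guidance/visa-processing-times-applications-outside-the-uk"),
]
final = final_hop(history)
```

Keeping only the final hop gives one resolved URL and one status code per link, which is all that is needed to match a link against its eventual destination.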
A user stumbled on this problem.
Those two mainstream pages link to https://www.gov.uk/guidance/visa-decision-waiting-times-applications-outside-the-uk, which redirects to https://www.gov.uk/guidance/visa-processing-times-applications-outside-the-uk, hence the user's expectation that GovSearch would include the pages in a search for ones that link to https://www.gov.uk/guidance/visa-processing-times-applications-outside-the-uk.
Trello
Suppose /old-page has been unpublished and redirected to /new-page. You want to find pages that link to /new-page, and you would like pages that still link to /old-page to appear in the search results.
This could be done for GOV.UK redirects in a similar way to how we follow taxons up the hierarchy, with a WITH RECURSIVE SQL statement.
For links to external sites, we'd have to visit the links to find out where they redirect to.
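The WITH RECURSIVE idea for GOV.UK redirects could look roughly like the sketch below. It runs against an in-memory SQLite database only so that it is self-contained. The `redirects` and `links` tables, their columns, and the sample rows are all hypothetical and are not the knowledge graph's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE redirects (from_url TEXT PRIMARY KEY, to_url TEXT NOT NULL);
CREATE TABLE links (source_url TEXT NOT NULL, target_url TEXT NOT NULL);
INSERT INTO redirects VALUES
  ('/older-page', '/old-page'),
  ('/old-page', '/new-page');
INSERT INTO links VALUES
  ('/page-a', '/new-page'),
  ('/page-b', '/old-page'),
  ('/page-c', '/older-page'),
  ('/page-d', '/unrelated');
""")

# Start from the searched URL and walk the redirects backwards, collecting
# every URL that eventually redirects to it. UNION (rather than UNION ALL)
# drops duplicates, so a redirect loop cannot recurse forever.
QUERY = """
WITH RECURSIVE aliases(url) AS (
    SELECT ?
    UNION
    SELECT r.from_url
    FROM redirects AS r
    JOIN aliases AS a ON r.to_url = a.url
)
SELECT DISTINCT l.source_url
FROM links AS l
JOIN aliases AS a ON l.target_url = a.url
ORDER BY l.source_url
"""


def pages_linking_to(conn: sqlite3.Connection, url: str) -> list[str]:
    """Pages that link to `url` directly or via a chain of redirects."""
    return [row[0] for row in conn.execute(QUERY, (url,))]


result = pages_linking_to(conn, "/new-page")
# result == ['/page-a', '/page-b', '/page-c']
```

Walking the redirects backwards from the searched URL keeps the recursion small, since it only visits URLs that lead to that one page rather than resolving every link in the table.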