Archived sul-embed resources not replaying #192

@edsu

While archiving library.stanford.edu with browsertrix-crawler we discovered that embedded SDR viewers on blog posts don’t display and show an error instead. See:

https://swap.stanford.edu/was/20230505152659/https://library.stanford.edu/blogs/special-collections-unbound/2022/11/born-digital-collections-opened-research-2022

Screenshot 2023-05-24 at 6 01 52 PM

A crawl of the same page appears to work in Archive-It, but it is actually showing the embeds from the live web rather than from the capture, which is missing those resources. Here’s what the browser dev tools Network panel looks like when viewing the Archive-It capture:

Screenshot 2023-05-24 at 12 28 26 PM (1)

While some of the embedded iframe has been captured, it looks like some resources that are critical for rendering were not. For example, https://purl.stanford.edu/pq546tq4448/iiif/manifest is not going through swap. These resources are not loaded by the browser on page load, but only when the embed scrolls into view:

Screen.Recording.2023-05-24.at.5.26.26.PM.mov

So it appears that browsertrix-crawler was not configured to scroll the page?
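
One way to confirm which embed resources are missing is to ask the replay service for each one directly. The sketch below is only an illustration: it assumes that swap serves captures under the same `<prefix>/<timestamp>/<original URL>` pattern seen in the capture link above, and that an HTTP 200 response means some capture of the resource exists (the service may redirect to the nearest capture). The helper names `replay_url` and `is_captured` are hypothetical and are not part of browsertrix-crawler or swap.

```python
from urllib.request import urlopen

# Replay prefix copied from the capture URL in this issue.
REPLAY_PREFIX = "https://swap.stanford.edu/was"


def replay_url(timestamp: str, original_url: str) -> str:
    """Build a replay URL as <prefix>/<timestamp>/<original URL>."""
    return f"{REPLAY_PREFIX}/{timestamp}/{original_url}"


def is_captured(timestamp: str, original_url: str, timeout: float = 10.0) -> bool:
    """Return True if the replay URL answers with HTTP 200.

    Assumption: a 200 means a capture exists. Redirects are followed,
    so this may match a capture at a different timestamp.
    """
    try:
        with urlopen(replay_url(timestamp, original_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # HTTPError, URLError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    # Timestamp and manifest URL are the ones mentioned in this issue.
    manifest = "https://purl.stanford.edu/pq546tq4448/iiif/manifest"
    print(replay_url("20230505152659", manifest))
    print("captured:", is_captured("20230505152659", manifest))
```

Running this against each URL that the Network panel shows coming from the live web would give a list of what the crawl needs to pick up once scrolling is enabled.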
