As the world becomes increasingly interconnected and climate change elevates the risk of zoonotic spillover events, the public becomes ever more susceptible to global-scale outbreaks. Traditional disease surveillance methods are prone to under-reporting and time lags. By contrast, Wikipedia pageviews offer a real-time, cost-effective, open-source resource for tracking online health-related information-seeking behavior, with the potential to enhance global disease surveillance. This paper investigates the value of anonymized country-level Wikipedia pageview data for predicting case incidence during the 2022-2024 mpox outbreak in the United States. The study combines several quantitative techniques to better understand the relationship between online behavior and disease dynamics. First, a lag analysis correlated mpox cases with pageviews for mpox-related Wikipedia articles at different time lags to assess how the directionality between pageviews and cases varies across articles. Second, a multivariate linear regression analysis was used to predict mpox incidence from pageview data. Finally, impulse response and Granger-causality tests were performed to further analyze the directionality of the relationship between online activity and mpox cases. The study's findings underscore the potential of Wikipedia traffic as a predictive tool for public health trends, revealing a bidirectional relationship between pageviews and mpox cases that unfolds over time. However, the predictive models struggled with accuracy, highlighting the need for further model refinement to adequately account for the complexity of online attention and disease dynamics.
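To make the lag analysis concrete, here is a minimal sketch of a lagged Pearson correlation between two weekly series. It is illustrative only and is not the code in `4-code`: the function names `pearson` and `lagged_correlations`, the `max_lag` parameter, and the toy numbers are all assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(pageviews, cases, max_lag):
    """Correlate pageviews at time t with cases at time t + lag.

    A positive lag means pageviews lead cases; a negative lag means
    cases lead pageviews.
    """
    results = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            # Drop the last `lag` pageview points and the first `lag` case points.
            x, y = pageviews[: len(pageviews) - lag], cases[lag:]
        else:
            # Drop the first `-lag` pageview points and the last `-lag` case points.
            x, y = pageviews[-lag:], cases[: len(cases) + lag]
        results[lag] = pearson(x, y)
    return results


# Made-up weekly series: cases repeat the pageview curve two weeks later.
pageviews = [1, 3, 8, 15, 9, 4, 2, 1, 1, 1]
cases = [0, 0, 1, 3, 8, 15, 9, 4, 2, 1]

correlations = lagged_correlations(pageviews, cases, max_lag=3)
best_lag = max(correlations, key=correlations.get)
print(best_lag, round(correlations[best_lag], 3))
```

In this toy setup the strongest correlation falls at the lag where pageviews lead cases. With real data, repeating the calculation per article shows whether attention to a given article tends to precede or follow reported cases.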
.
├── README.md
├── 1-proposal
│   ├── data-report
│   └── pre-analysis-plan
├── 2-literature
├── 3-data
│   ├── mpox-cases
│   ├── mpox-news
│   ├── mpox-studies
│   ├── wikipedia
│   └── output
├── 4-code
├── 5-tables
├── 6-figures
├── 7-paper
└── 8-poster