melofton/multi-model-chla-prediction

Code repository associated with Lofton et al., "The importance of a multi-model ensemble for predicting variable ecological time series across dynamic conditions"

Submitted as a research article to Ecological Applications

Guide for peer reviewers:

This repository contains all data, modeling code, and model output associated with the manuscript. Because running the full modeling workflow takes hours to days and requires high-performance computing for some models, we have provided:

  1. an example modeling workflow for the purposes of peer review using the ARIMA model in the manuscript, and
  2. a script that uses final model output to generate all final figures for the manuscript.

The example workflow is in the example_prediction_workflow.Rmd file in the top-level directory of the repository. Running it writes output to the example_workflow_output folder and renders the document as an .html file.
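As a minimal sketch of how a reviewer might run the example workflow from an R session, assuming the rmarkdown package is installed and the repository has been cloned locally. The helper name render_example_workflow and its repo_dir argument are hypothetical and are not part of the repository.

```r
# Hypothetical helper (not part of the repository): render the example
# workflow, given the path to the top-level directory of the cloned repo.
render_example_workflow <- function(repo_dir = ".") {
  rmd <- file.path(repo_dir, "example_prediction_workflow.Rmd")

  # Fail early with a clear message if the path is wrong
  if (!file.exists(rmd)) {
    stop("Could not find ", rmd, "; set repo_dir to the repository root.")
  }

  # rmarkdown is assumed to be installed; it is not checked by the repository
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("The rmarkdown package is required to render the workflow.")
  }

  # By default the document is knit from the directory that contains it,
  # so relative paths inside the .Rmd resolve from the repository root
  rmarkdown::render(rmd)
}
```

Calling render_example_workflow(".") from the repository root should render the workflow. Any R packages the workflow itself loads must be installed beforehand.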

We have also provided a script, generate_final_figures.R, in the top-level directory of this repository. It uses the final output from all models to generate all the final figures in the manuscript, as well as the supplementary figures related to modeling results.

Users who wish to apply the example workflow to other models will find additional guidelines at the bottom of example_prediction_workflow.Rmd. However, some models (LSTM and GLM-AED) have more complex workflows because of their structural complexity, long run times, and substantial dependencies. For these models, workflows were run on a high-performance computing cluster in a containerized environment. Specifically, the LSTM model workflow was run using a slightly modified version of the ml-verse container, which can be downloaded here, and the GLM-AED model workflow was run using a container developed by R. Quinn Thomas for the Forecasting Lake and Reservoir Ecosystems (FLARE) platform, which can be downloaded here. Setting up an appropriate computing environment to run the workflows for these models will likely require additional effort.

Repository folder structure:

  1. code contains all project code

    - archive: code that is no longer in use

    - function_library: custom functions associated with the project; each sub-folder within this directory contains the functions for one stage of the workflow (e.g., formatting data or generating predictions)

    - model_files: additional files for complex models (LSTM, GLM-AED, and OneDProcessModel)

    - workflow_scripts: scripts that scale up the example_prediction_workflow to be applied across many models

  2. data contains processed project data; the raw project data are downloaded directly from the Environmental Data Initiative repository and the Virginia Ecoforecast Reservoir Analysis forecasting challenge using custom functions and are therefore not stored in this repository

  3. example_workflow_output contains output files from the example_prediction_workflow.Rmd

  4. figures contains figures associated with the project; final figures are in the final_figures sub-folder

  5. model_output contains all model output associated with the project, including tables of parameters for fitted models and model diagnostics; the final model prediction results are provided in validation_output.csv
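To inspect the final prediction results without running any models, something like the following sketch may be a useful starting point. It assumes validation_output.csv sits directly inside the model_output folder, which the description above suggests but does not state explicitly. The helper name read_validation_output is hypothetical, and no column names are assumed.

```r
# Hypothetical helper (not part of the repository): load the final model
# prediction results as a data frame.
read_validation_output <- function(repo_dir = ".") {
  # Assumed location: <repo root>/model_output/validation_output.csv
  path <- file.path(repo_dir, "model_output", "validation_output.csv")

  if (!file.exists(path)) {
    stop("Could not find ", path, "; check repo_dir and the file location.")
  }

  # Keep text columns as character rather than converting them to factors
  read.csv(path, stringsAsFactors = FALSE)
}
```

For example, str(read_validation_output(".")) run from the repository root lists the columns and their types, which is a reasonable first check before reusing the results.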