- `01_download.R`: Download the latest NEON data products, storing the raw data in `<NEONSTORE_HOME>`.
- `02_targets.R`: Generate the target prediction variables from the raw data files.
- `03_forecast.R`: Run a dummy forecast and write the output to csv files.
- `04_score.R`: Score the forecast and write the results to csv files.
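The scripts are designed to run in sequence. A minimal sketch of driving the full workflow from R, assuming the working directory is the repository root:

```r
# Run the workflow scripts in order; each script writes the files the next
# one reads (paths assumed relative to the repository root).
source("01_download.R")   # pull the latest NEON data into <NEONSTORE_HOME>
source("02_targets.R")    # derive the target variables from the raw files
source("03_forecast.R")   # produce the benchmark (null-model) forecast
source("04_score.R")      # score available forecasts against the targets
```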
## `.Renviron`
The workflow uses a `.Renviron` file to configure behavior. These optional parameters allow the workflow scripts to publish data to and download data from the EFI server; they could easily be configured for a workflow using a different server.
- `NEON_TOKEN`: Optional; speeds up downloading of raw NEON data.
- `NEONSTORE_HOME`: Path to the neonstore, if you need to override the default.
- `AWS_ACCESS_KEY_ID`: Only needed to publish data to an AWS S3-style server, such as MINIO.
- `AWS_SECRET_ACCESS_KEY`: Ditto.
- `AWS_DEFAULT_REGION`: Set to `data` to download data from ecoforecast.org automatically.
- `AWS_S3_ENDPOINT`: Set to `ecoforecast.org` to download data automatically.
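For concreteness, an example `.Renviron` might look like the following; all values shown are placeholders, not real credentials or required settings:

```
# Optional: speeds up NEON downloads
NEON_TOKEN=your-neon-api-token
# Optional: override the default neonstore location
NEONSTORE_HOME=/path/to/neonstore
# Only needed when publishing to an S3-style server such as MINIO
AWS_ACCESS_KEY_ID=placeholder-key-id
AWS_SECRET_ACCESS_KEY=placeholder-secret
# Needed to download data from ecoforecast.org automatically
AWS_DEFAULT_REGION=data
AWS_S3_ENDPOINT=ecoforecast.org
```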
- Running `01_download.R` and `02_targets.R` will update the latest target data files and publish them to the EFI `targets` bucket. Some time after challenge entries are submitted, this workflow will thus produce an updated set of targets containing the true values for the sites and times covered by the teams' forecasts.
- The `04_score.R` script will score all `.csv.gz` forecasts found in the `forecasts/` bucket that start with `beetle-richness-forecast-<project_id>.csv.gz` or `beetle-abund-forecast-<project_id>.csv.gz`, respectively, and that conform to the same tabular structure used here (columns: `siteID`, `month`, `year`, `value`, `rep`). Scores are written to the `scores` bucket using filenames that correspond to the submission files, with `forecast` replaced by `score`. A scoring sketch appears after this list.
- `03_forecast.R` generates a benchmark forecast based on a simple null model (historical mean and standard deviation); a sketch of such a null forecast appears after this list. Teams entering the challenge will replace this script with their own, more involved forecasting methods, generating output forecasts that follow the file-naming convention above. These can be uploaded directly to the challenge server at URL TBD.
- The Challenge Coordinating Team server will use a `cron` job to run the full set of workflow scripts regularly (intervals TBD), resulting in updated raw data, derived data, the benchmark forecast, and scores for any submissions.
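As an illustration of the null-model benchmark and of the expected forecast format, here is a minimal sketch. It is not the exact contents of `03_forecast.R`; the target file name, forecast date, ensemble size, and the `example_team` project id are all assumptions.

```r
library(dplyr)
library(tidyr)
library(readr)

# Historical richness targets as produced by 02_targets.R (file name assumed)
targets <- read_csv("beetles-targets.csv.gz")

n_reps <- 500   # ensemble size (arbitrary choice)

# Null model: draw replicate predictions from each site's historical
# mean and standard deviation, here for a single future month.
forecast <- targets |>
  group_by(siteID) |>
  summarise(mu    = mean(value, na.rm = TRUE),
            sigma = sd(value, na.rm = TRUE),
            .groups = "drop") |>
  crossing(rep = seq_len(n_reps)) |>
  mutate(year  = 2021,            # forecast target date (assumed)
         month = 6,
         value = rnorm(n(), mean = mu, sd = sigma)) |>
  select(siteID, month, year, value, rep)

# The file name must follow the convention that 04_score.R looks for
write_csv(forecast, "beetle-richness-forecast-example_team.csv.gz")
```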
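The scoring rule applied by `04_score.R` is not spelled out above. As one hedged example, a proper score such as the continuous ranked probability score (CRPS) can be computed from the forecast ensemble and the observed targets, e.g. with the `scoringRules` package; the file names below follow the naming convention and column layout described in this list.

```r
library(dplyr)
library(readr)
library(scoringRules)

forecast <- read_csv("beetle-richness-forecast-example_team.csv.gz")
targets  <- read_csv("beetles-targets.csv.gz")   # observed values (file name assumed)

# Join observations onto the ensemble and compute CRPS per site and month
scores <- forecast |>
  inner_join(targets, by = c("siteID", "year", "month"),
             suffix = c("", "_obs")) |>
  group_by(siteID, year, month) |>
  summarise(crps = crps_sample(y = value_obs[1], dat = value),
            .groups = "drop")

# Score files mirror the forecast file names, with "forecast" -> "score"
write_csv(scores, "beetle-richness-score-example_team.csv.gz")
```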
All files are exposed via public buckets on MINIO for now. Automated functions still need to be added to publish the targets, the benchmark forecast, and possibly the benchmark scores in a persistent, version-controlled manner; details TBD.