Learning Objectives:

LO3a: Learn about the nature of reproducible research, workflow design, data management and manipulation, dynamic reporting, what the key requirements are, and which resources are available to support these (knowledge).

LO3b: Be able to use available resources to create a workflow for reproducible research (task).

Key components:

Factors that affect reproducibility of research.
Principles of reproducibility, and integrity and ethics in research.
What is the 'reproducibility crisis', and meta-analyses of reproducibility.
Open materials, reagents and hardware, including resources, repositories and standards.
Electronic lab notebooks.
Data analysis documentation and open research workflows.
Living figures, turning scripts into reproducible documents, and Markdown.
Pre-registration and prevention of p-hacking/HARK-ing (Hypothesising After Results are Known).
Reproducible analysis environments (virtualization).
Computing options and environments that allow collaborative and reproducible set-up.

Who to involve:

Individuals: Andy Byers, Anna Krystalli, Julien Colomb, Rutger Vos, Brian Nosek, Lorena Barba, Karl Broman, Victoria Stodden, John Ioannidis, Chris Chambers.
Organisations: FOSTER, Center for Open Science, COPE, Protocols.io, ROpenSci, Addgene, BITSS, Project TIER.
Other: GOSH Community, Software and Data Carpentry communities.

Key resources:

Tools

Open Science Framework (COS).
Existing reproducible research workshops/practical resources:
- Reproducible Research Workshop (CC-BY, April Clyburne-Sherin & Courtney Soderberg).
- Initial steps towards reproducible research (Karl Broman).
- The Open Science and Reproducible Research course (CC-BY, Christie Bahlai).
- Reproducibility Workshop: Best practices and easy steps to save time for yourself and other researchers (Code Ocean).
- Reproducibility in Science: A guide to enhancing reproducibility in scientific results and writing, ROpenSci.
- Reproducible Research using Jupyter Notebooks workshop (Data Carpentry).
- R markdown workshop (Liberate Science).
- rrtools: Tools for Writing Reproducible Research in R (Ben Marwick).
- Reproducible Research: Walking the Walk. SciPy 2014 workshop.
- Reproducible Python, PyCon 2018 workshop (repo, video).
- Reproducible Research: Principles and Methods for transparent science, online course in French + English, showing Reproducible Research practices. It offers three tracks: Jupyter+Python, RStudio+R, Emacs Org-Mode + Python/R.
ReproZip, an open source tool for full computational reproducibility.
Software Carpentry and Data Carpentry lessons.
Jupyter notebooks (and JupyterLab), R Markdown, Stencila.
Virtual Machines, Docker, Vagrant, BinderHub, nteract.io.
- Binder Documentation, for creating custom computing environments that can be shared and used by multiple remote users.
Statcheck, GRIM.
Scienceroot, the first blockchain-based scientific ecosystem.
Online repositories for open hardware: PLOS open source toolkit channel; Open Neuroscience; Open Plant Science; Appropedia; DocuBricks; Hackaday.io.
Bio-protocol, a peer reviewed protocol journal.
BMJ Open Science, a new journal that aims to improve the transparency, integrity and reproducibility of biomedical research.
Evernote, Labguru, sciNote.
AsPredicted.
The Sci-Gaia Open Science Platform.
Improving your statistical inferences, Daniel Lakens.
Statistical Thinking for the 21st Century, an open access online book for introductory statistics.
- Open Stats Lab, Kevin McIntyre.
R for Data Science.
- R tutorial: Introduction to cleaning data with R (DataCamp).
Nextflow, an open source tool that enables reproducible and portable computational workflows across clouds and clusters.

Research Articles and Reports

Reproducibility, Virtual Appliances, and Cloud Computing (Howe, 2012).
The Ironic Effect of Significant Results on the Credibility of Multiple-Study Articles (Schimmack, 2012).
Power failure: why small sample size undermines the reliability of neuroscience (Button et al., 2013).
Git can facilitate greater reproducibility and increased transparency in science (Ram, 2013).
Ten simple rules for reproducible computational research (Sandve et al., 2013).
Investigating Variation in Replicability: A "Many Labs" Replication Project (Klein et al., 2014).
An introduction to Docker for reproducible research (Boettiger, 2015).
Opinion: Reproducible research can still be wrong: Adopting a prevention approach (Leek and Peng, 2015).
Replicability vs. reproducibility - or is it the other way around? (Liberman, 2015).
The GRIM test: A simple technique detects numerous anomalies in the reporting of results in psychology (Brown and Heathers, 2016).
What does research reproducibility mean? (Goodman et al., 2016).
Recommendations for open data science (Gymrek and Farjoun, 2016).
Tools and techniques for computational reproducibility (Piccolo and Frampton, 2016).
Transparency, Reproducibility, and the Credibility of Economics Research (Christensen and Miguel, 2017).
A trust approach for sharing research reagents (Edwards et al., 2017).
Estimating the Reproducibility of Psychological Science (Nosek et al., 2017).
Digital Open Science: Teaching digital tools for reproducible and transparent research (Toelch and Ostwald, 2017).
Terminologies for reproducible research (Barba, 2018).
An introduction to statistical and data sciences via R (Ismay and Kim, 2018).
The practice of reproducible research: case studies and lessons from the data-intensive sciences (Kitzes et al., 2018).
bookdown: Authoring Books and Technical Documents with R Markdown (Xie, 2018).
Our path to better science in less time using open data science tools (Lowndes et al., 2017).
A model-centric analysis of openness, replication, and reproducibility (Baumgaertner et al., 2018).
Haves and Have nots must find a better way: The case for Open Scientific Hardware (Chagas, 2018).
Computational Reproducibility via Containers in Social Psychology (Green and Clyburne-Sherin, 2018).
Reproducible research practices, transparency, and open access data in the biomedical literature, 2015-2017 (Wallach et al., 2018).

Key Posts

Data hygiene and data provenance.
- A Data Cleaner's Cookbook.
- Storify by Dawn Bazely.
Failure is moving science forward, Christie Aschwanden.
5 keys to building open hardware, Joshua Pearce.
How to make replication the norm (Gertler et al., 2018).
Reproducibility PI Manifesto, Lorena Barba.
- How to run a lab for reproducible research, Lorena Barba.
- Essential skills for reproducible research computing (Barba et al., 2017).
A toolkit for data transparency takes shape, Jeffrey Perkel.

Other

Institutions, projects, and companies using or providing open hardware/materials: CERN's Open Hardware Repository and Open Hardware License; UFRGS Centro de Tecnologia Acadêmica (CTA); Michigan Tech Open Sustainability Technology research group; Open Plant; TReND in Africa; Open Lab Tools Cambridge University; PhotosynQ; PublicLab; BackyardBrains; OpenPCR; OpenROV; Prometheus Science; senseBox; Addgene.
Definition of Open Reproducible Research, FOSTER.
Data skills for reproducible science, Dale Barr and Lisa DeBruine.
Reversible Reproducible Documents, Noam Ross.
Global Open Science Hardware Roadmap, GOSH.
Open and Reproducible Science syllabus (Campbell, 2018).
Reproducible workflows, Brian Palmer.
EQUATOR network (Enhancing the QUAlity and Transparency Of health Research).
Knitr: Elegant, flexible, and fast dynamic report generation with R (Yihui Xie).
- Using Sweave and knitr (RStudio Support).
SoS, multi-language notebook (based on Jupyter Notebook) and workflow system for cost-effective reproducible analysis.
- Introduction to SoS Notebook, SoS Workflow Engine, and the power of backing a polyglot notebook with a workflow engine.
Decentralized research platforms such as DEIP.
FORRT - Framework for Open and Reproducible Research Training.

Tasks:

Find a core data set that is used throughout the examples.
- If possible, the dataset should have a diverse set of formats and styles for different types of analysis.
Designing a reproducible research workflow.
- Create a flowchart of options to help get you started; this can be created as a Google Doc and shared for collaboration.
- Check whether your collaborators, colleagues or supervisors are using the same tools.
- Use validated, standardized reagents where possible.
- Use an electronic lab notebook, and follow best practices for recording protocols, the steps actually taken, and the reagents used.
How well annotated are your code scripts? As a general rule of thumb, try to include one comment for every three lines of code. Bear in mind that the primary audience is future you and other people less familiar with your code.
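The rule of thumb above can be checked roughly with a short script. The following is a minimal sketch, assuming the scripts are written in Python; the function name `comment_ratio` is hypothetical, and docstrings are counted as code rather than as comments. A ratio near 1/3 matches "one comment for every three lines of code", but the number says nothing about whether the comments are useful.

```python
import io
import tokenize

# Token types that do not count as code on their own.
_NON_CODE = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def comment_ratio(source: str) -> float:
    """Return the number of comment lines per line of code in `source`.

    A line counts as a comment line if it contains a `#` comment, and as a
    code line if it contains any other meaningful token. A line with code
    followed by a trailing comment counts as both.
    """
    comment_lines = set()
    code_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comment_lines.add(tok.start[0])
        elif tok.type not in _NON_CODE:
            # Multi-line tokens (e.g. triple-quoted strings) span several lines.
            code_lines.update(range(tok.start[0], tok.end[0] + 1))
    if not code_lines:
        return 0.0
    return len(comment_lines) / len(code_lines)


if __name__ == "__main__":
    sample = "# add two numbers\nx = 1\ny = 2\nz = x + y\n"
    print(f"{comment_ratio(sample):.2f} comments per line of code")
```

To check a real script, read it from disk and pass the text in, for example `comment_ratio(open("analysis.py", encoding="utf-8").read())`, where `analysis.py` is a placeholder file name.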
Posting raw and cleaned data files.
- Post your data (raw and/or treated) online in a non-proprietary format.
- Make sure it is in a place where you can get a unique identifier for it.
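As an illustration of the two points above, here is a minimal sketch that writes tabular data to CSV (a non-proprietary format) and records a SHA-256 checksum for every file in the data folder. The function names, the `MANIFEST.sha256` file name, and the example rows are all assumptions made for this sketch. A checksum does not replace the persistent identifier (such as a DOI) issued by the repository you deposit in; it only lets others confirm that the files they downloaded match the ones you posted.

```python
import csv
import hashlib
import tempfile
from pathlib import Path


def write_csv(path, header, rows):
    """Write rows to a UTF-8 CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_dir, manifest_name="MANIFEST.sha256"):
    """Write '<digest>  <relative path>' for every file under data_dir."""
    data_dir = Path(data_dir)
    lines = []
    for p in sorted(data_dir.rglob("*")):
        if p.is_file() and p.name != manifest_name:
            rel = p.relative_to(data_dir).as_posix()
            lines.append(f"{sha256_of(p)}  {rel}")
    manifest = data_dir / manifest_name
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


if __name__ == "__main__":
    # Placeholder values for demonstration only, not real measurements.
    with tempfile.TemporaryDirectory() as tmp:
        write_csv(Path(tmp) / "raw.csv", ["sample_id", "value"], [["a", 1], ["b", 2]])
        print(write_manifest(tmp).read_text(encoding="utf-8"))
```

Depositing the manifest alongside the raw and cleaned files lets a reader verify each download before re-running an analysis.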
Write a study plan or protocol.
- Preregister your study design using AsPredicted, OSF, or Registered Reports.
- For clinical trials use ClinicalTrials.gov.
Set up a reproducible project using an electronic lab notebook to help organise and track your research.
- Track changes as your research develops using a version control system such as Git, hosted on a platform such as GitHub.
- Document everything done by creating a README file.
- Make sure to select an appropriate license for your repo.
- Convert the notebook into a standard research manuscript.
- In this manuscript, include in the caption of each figure and table all the code necessary to reproduce it.
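
The project set-up steps above could be bootstrapped with a small script. This is a minimal sketch, not a prescribed layout: the folder names in `LAYOUT`, the README section headings, the `environment.txt` file name, and both function names are assumptions chosen for illustration. It creates a project skeleton with a README stub and records the Python version and installed packages so the analysis environment is documented next to the code. Initialising version control and choosing a license remain manual steps.

```python
import platform
from importlib import metadata
from pathlib import Path

# Hypothetical folder layout; adapt it to your own project.
LAYOUT = ["data/raw", "data/processed", "code", "results", "docs"]

README_TEMPLATE = """# {title}

## What this project does

## How to reproduce the results

## License
"""


def create_skeleton(root, title):
    """Create the project folders and a README stub (never overwrites one)."""
    root = Path(root)
    for sub in LAYOUT:
        (root / sub).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(README_TEMPLATE.format(title=title), encoding="utf-8")
    return root


def record_environment(root, filename="environment.txt"):
    """Write the Python version, platform, and installed packages to a file."""
    lines = [
        f"python {platform.python_version()}",
        f"platform {platform.platform()}",
    ]
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    )
    out = Path(root) / filename
    out.write_text("\n".join(lines + packages) + "\n", encoding="utf-8")
    return out


if __name__ == "__main__":
    project = create_skeleton("my-project", "My reproducible project")
    print("Environment recorded in", record_environment(project))
```

Re-running `record_environment` before each release of the project keeps the recorded package versions in step with the ones actually used for the analysis.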
