Who are you and what is your research field? Include your name, affiliation, discipline, and the background or context of your overall research that is needed to introduce your specific case study.
The core of your case study is a diagrammatic workflow sketch, which depicts the entire pipeline of your workflow. Think of this like a circuit diagram: boxes represent steps, tools, or other disjoint components, and arrows show how the outputs of particular steps feed as inputs to others. This diagram will be complemented by a textual narrative.
We recommend the site draw.io for creating this diagram, but you are also welcome to sketch it by hand. While creating your diagram, be sure to consider:
- specialized tools and where they enter your workflow
- the "state" of the data at each stage
- collaborators
- version control repos
- products produced at various stages (graphics, summaries, etc)
- databases
- whitepapers
- customers
Each of the two example case studies includes a workflow diagram that you can also use for guidance.
Please save your diagram in PDF and SVG format alongside this completed case study template. Please also include the raw file containing your drawing, such as the XML file created by draw.io or a PPTX file if you used PowerPoint.
Referring to your diagram, describe your workflow for this specific project, from soup to nuts. Imagine walking a friend or a colleague through the basic steps, paying particular attention to links between steps. Don't forget to include "messy parts", loops, aborted efforts, and failures.
It may be helpful to consider the following questions, where interesting, applicable, and non-obvious from context. For each part of your workflow:
- Frequency: How often does the step happen and how long does it take?
- Who: Which members of your team participate (or not)?
- Manual/Automated: Is the step automated or does it involve human intervention (if so, is it recorded)?
- Tools: Which software or online tools are used in the step? How are they used?
In addition to detailing the steps of the workflow, you may wish to consider the following questions about the workflow as a whole:
- Data: Is your raw data online?
  - Is it citeable?
  - Does the license allow external researchers to publish a replication/confirmation of your published work?
- Software: Is the software online?
  - Is there documentation?
  - Are there tests?
  - Are there example input files alongside the code?
- Processing: Is your data processing workflow online?
  - Are the scripts documented?
  - Would an external researcher know what order to run them in?
  - Would they know what parameters to use?
(500-800 words)
Describe in detail the steps of a reproducible workflow which you consider to be particularly painful. How do you handle these? How do you avoid them? (200-400 words)
Discuss one or several sections of your workflow that you feel make your approach better than the "normal" non-reproducible workflow that others might use in your field. What does your workflow do better than the one used by your lesser-skilled colleagues and students, and why? What would you want them to learn from your example? (200-400 words)
If applicable, provide a detailed description of a particular specialized tool that plays a key role in making your workflow reproducible, if you think that the tool might be of broader interest or relevance to a general audience. (200-400 words)
Please provide short answers (a few sentences each) to these general questions about reproducibility and scientific research. Rough ideas are appropriate here, as these will not be published with the case study. Please feel free to answer all or only some of these questions.