Description
Most of the tools that we use to increase reproducibility (make, git, docker, vagrant, etc.) are taken directly from software engineering, but we don't really consider how appropriate they are for use in research. I think that research software has different issues to traditional software, though there is of course a lot of overlap.
I've seen this in writing maker vs using make for workflows: stringing together existing, reliable tools is a different proposition to stringing together bits of scientific software that are always changing (see notes here).
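To make that concrete, here is a minimal sketch of the sort of workflow Makefile I have in mind (every script and file name is a hypothetical placeholder, not a real project): make is perfectly happy to express the dependency graph, but each rule quietly assumes the underlying scripts still behave the way they did when the rule was written.

```make
# Sketch of a research workflow as a Makefile; all names are hypothetical.
# (Recipe lines must be indented with tabs.)

all: figures/fig1.pdf

data/clean.csv: scripts/clean_data.R data/raw.csv
	Rscript scripts/clean_data.R data/raw.csv data/clean.csv

results/sim.csv: scripts/simulate.py data/clean.csv
	python scripts/simulate.py data/clean.csv results/sim.csv

figures/fig1.pdf: scripts/plot.R results/sim.csv
	Rscript scripts/plot.R results/sim.csv figures/fig1.pdf

.PHONY: all
```

That works nicely when clean_data.R and simulate.py are stable tools; it works much less well when they are the bits of scientific software that change every week.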
Git is great for versioning code, but not so great for dealing with the bits of data that go along with the code: simulation outputs, GIS layers, etc. How to associate these bits with the code remains an unsolved problem.
Virtual machines are most useful when dealing with relatively stable software. I don't think we have clear workflows for baking in the unchanging bits alongside the "research bits" that never have a stable version (I'm currently dealing with this on a collaborative project with software engineers).
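As a sketch of what I mean (using docker rather than a full VM, with hypothetical base image, packages and paths), one pattern is to bake only the stable dependencies into the image and mount the ever-changing research code at run time:

```dockerfile
# Sketch only: base image, packages and paths are hypothetical.
FROM ubuntu:22.04

# The "unchanging bits": system libraries and pinned dependencies are
# baked into the image and benefit from layer caching.
RUN apt-get update && apt-get install -y --no-install-recommends \
        r-base python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /project

# The "research bits" are *not* baked in; mount them at run time, e.g.
#   docker run -v "$(pwd)":/project my-analysis-image make all
```

This keeps the stable layer reproducible, but it doesn't by itself record which version of the research code produced a given result, which is exactly the gap I keep running into.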
Research software is different to traditional software: for most scientists, the "final version" is designed to be run only once. I'm interested in working out where tools from software engineering are a good fit and where research-specific tooling is required. Are our problems actually different, or is this just a special case of Special Snowflake Syndrome?