content on dependency management #38

@nehamoopen

I think the section on dependency management could improve with some (re)exploration of the topic and (potential) reorganization of the content.

It would be nice to present options along the reproducibility spectrum, from easy (noting dependencies in a README file) to advanced (containerization). During the workshop, we dive into the easy and middle-ground solutions. We should also be prepared to explain the differences between these options better, for example how renv environments differ from Docker containers.

I'm going to organize the ideas per programming language for now:

R

  • easy: Suggest the annotater package to annotate package load calls.
  • easy: Use sessionInfo() to print version information about R, the OS, and attached or loaded packages, and automate writing its output into the README or another file.
  • middle-ground: Look into the groundhog package. This is an interesting solution but it has some caveats, see: https://www.brodrigues.co/blog/2023-01-12-repro_r/
  • advanced: Really figure out how renv works, including common issues. Two things to consider: (1) Should we ask participants to initialize renv at the beginning of the workshop, when they reorganize their project? (2) How do you ensure that renv records only the project libraries in the lockfile and not the system/global libraries? (The latter happens now and then during the workshop.)

Python

I'm not aware of easy and middle-ground solutions in Python that are comparable to the ones listed for R above. It might be nice to do some research into this, but I don't think such solutions need to be included in the workshop if they are not standard/best practices.
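
As a starting point for that research, here is a minimal sketch of what an "easy" option could look like in Python: a rough analogue of sessionInfo() built only from the standard library. This is an assumption about what might be useful, not an established practice. The function names session_info and write_session_info and the file name SESSION_INFO.txt are hypothetical. Note that it lists every package installed for the current interpreter, not only the ones the project imports.

```python
import platform
import sys
from importlib import metadata


def session_info() -> str:
    """Return a plain-text summary of the interpreter, OS, and installed packages."""
    lines = [
        f"Python {platform.python_version()} ({sys.executable})",
        f"OS: {platform.platform()}",
        "Installed packages:",
    ]
    # Collect (name, version) for every distribution visible to this interpreter;
    # a set removes duplicates, and entries without a name are skipped.
    packages = {
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }
    for name, version in sorted(packages, key=lambda pair: pair[0].lower()):
        lines.append(f"  {name}=={version}")
    return "\n".join(lines)


def write_session_info(path: str = "SESSION_INFO.txt") -> None:
    """Write the summary to a file (hypothetical default file name)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(session_info() + "\n")
```

Calling write_session_info() at the end of an analysis script would store the summary next to the project files, from where it could be copied into the README.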

  • advanced: Should we look into virtual environments for Python? venv is part of Python's standard library; I think there are also pipenv and conda environments, if you use those tools. As with renv, should we ask participants to initialize these environments at the beginning of the workshop?
  • advanced: Look into the generation of requirements.txt and environment.yml again. As with renv, how do you ensure that only the project libraries get recorded and not the system/global libraries? We also assume that everyone uses either pip or conda; is that correct?
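
One possible answer to the "project libraries only" question for pip users, sketched under assumptions: create the environment with the standard-library venv module with system_site_packages=False, then run pip freeze with the interpreter inside that environment, so that only packages installed there are listed. The helper names venv_python, create_env and write_requirements are hypothetical, and this sketch does not cover conda or environment.yml.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Return the path to the interpreter inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment that cannot see system-wide packages."""
    builder = venv.EnvBuilder(system_site_packages=False, with_pip=True)
    builder.create(env_dir)
    return venv_python(env_dir)


def write_requirements(env_dir: Path, output: Path) -> None:
    """Write the packages installed in the environment to a requirements file."""
    # Running pip through the environment's own interpreter limits the
    # listing to what is installed in that environment.
    frozen = subprocess.run(
        [str(venv_python(env_dir)), "-m", "pip", "freeze"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    output.write_text(frozen, encoding="utf-8")
```

In a workshop project this might be used as create_env(Path(".venv")) once at the start, followed by write_requirements(Path(".venv"), Path("requirements.txt")) after the needed packages have been installed into that environment.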

Other

  • Binder could be a demo or optional?
  • curate any MATLAB resources on the topic that participants have shared
  • provide links to Docker tutorials
