
conda-forge compatibility and eventual migration #666

Open
jchodera opened this issue Jan 8, 2017 · 16 comments

@jchodera (Member) commented Jan 8, 2017

cc: #662 #634 #645

I think the majority of us agree that omnia should start migrating to the excellent conda-forge build system. Even Continuum has officially embraced it. However, we need to do this migration in a way that minimizes disruption.

Why can't I just move my recipe to conda-forge right now?

For linux builds, omnia uses a Docker-based build image based on the CentOS 5-based holy build box, which uses an old version of glibc that is maximally compatible with older linux installations still found on many HPC clusters. On the other hand, conda-forge uses a Docker-based build image based on CentOS 6.

As a result, packages built with the conda-forge CentOS 6 build image will fail to run on the omnia CentOS 5 build image, causing problems for omnia recipes that want to use these as dependencies. For now, this means that any omnia recipes using compiled code must also be built on omnia.
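The incompatibility can be checked empirically: a binary built against CentOS 6's glibc (2.12) will reference symbol versions newer than CentOS 5's glibc (2.5) provides. A minimal sketch, assuming a Linux system with binutils installed (`/bin/sh` is just an illustrative target, not an omnia artifact):

```shell
#!/bin/sh
# Print the highest GLIBC symbol version a dynamically linked
# binary requires. Anything above GLIBC_2.5 will fail to load on
# a CentOS 5 (glibc 2.5) system such as the omnia build image.
max_glibc_version() {
    objdump -T "$1" 2>/dev/null \
        | grep -o 'GLIBC_[0-9.]*' \
        | sort -Vu \
        | tail -n 1
}

max_glibc_version /bin/sh
```

Running this against a wheel's or conda package's shared libraries would show whether a conda-forge-built artifact can load on the older image.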

What is the migration plan?

This is certainly open for discussion, but the current migration plan is to get to a point where recipe maintainers can begin to migrate to the conda-forge build infrastructure. This involves the following steps:

  • Update the omnia-build-box Dockerfiles and docker images to a CentOS 6-based image. This would allow us to build packages using the same glibc version as conda-forge. There are some difficulties with this, described in more detail below.
  • At this point, we can adopt one of two options: (1) copy packages from the conda-forge channel to the omnia channel using the anaconda CLI, a process that could potentially be automated; or (2) add the conda-forge channel to the conda build environment and recommend that omnia users include both channels in their environment.
  • Eventually, we will no longer add new recipes to omnia, but we would maintain the existing packages for reproducibility of published work. We may want to discuss long-term maintenance to ensure reproducibility in a separate thread.
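Option (1) above could be scripted with the anaconda-client CLI. A hedged sketch (the package name and version are placeholders, `anaconda copy` requires being logged in with write access to the omnia organization, and the exact flags should be verified against your anaconda-client version):

```shell
#!/bin/sh
# Sketch: mirror a conda-forge package into the omnia channel.
# With DRY_RUN=1 (the default here) the command is only printed,
# so the script can be inspected before running it for real.
DRY_RUN=${DRY_RUN:-1}

copy_pkg() {
    # anaconda copy takes owner/package/version plus a target owner.
    cmd="anaconda copy conda-forge/$1/$2 --to-owner omnia"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

copy_pkg numpy 1.11.3
```

A loop over a package list would make this a simple cron-able mirror job, which is what "can potentially be automated" amounts to.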

Why haven't we already migrated to a CentOS 6 based build system for conda-forge compatibility?

To support OpenMM, the omnia-build-box image includes some specially-installed build tools:

  • CUDA 8.0
  • AMD APP SDK 3.0
  • clang 3.8
  • TeXLive (and some additions)

None of these tools are available in the current conda-forge linux-anvil build image, so we can't use this directly.

At first, the simplest path to CentOS 6 appeared to be updating to the new holy-build-box CentOS 6 base image once this PR was completed. It's not yet clear whether that effort has stalled or whether work was simply suspended for the holidays.

As an alternative, I've started on a different approach that layers these tools on top of the conda-forge linux-anvil. This is incomplete, as additional troubleshooting is needed. But this may present a viable alternative once someone has put the effort in.
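The layering approach can be sketched as a Dockerfile that starts from the conda-forge image and adds the extra toolchain on top (a sketch only: the package list and installer steps are illustrative, and the actual omnia-linux-anvil Dockerfile may differ):

```dockerfile
# Sketch: layer OpenMM build tools on top of the conda-forge image.
FROM condaforge/linux-anvil

# System prerequisites for the extra toolchains (illustrative).
RUN yum install -y wget perl && yum clean all

# CUDA, the AMD APP SDK, clang, and TeXLive would each be layered
# here with their own installer steps, which is where the
# remaining troubleshooting effort lies.
```

The appeal of this approach is that anything built in the derived image shares the exact glibc and compiler baseline of conda-forge's linux-anvil.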

What do we need to do now?

I am in the process of restoring the mdtraj 1.8.0 recipe so we can still build packages that depend on mdtraj, like pyemma.

@mpharrigan (Contributor)
Can you ballpark the "difficulty" of migrating from

  • centos 5 hbb -> centos 6 hbb
  • centos 6 hbb -> linux-anvil
  • centos 5 hbb -> linux-anvil

If I'm understanding correctly, the plan outlined above would migrate from centos 5 hbb -> centos 6 hbb -> linux-anvil (when we move to conda-forge). The alternative plan would be to go centos 5 hbb -> linux anvil directly.

The first plan makes sense if the sum of the difficulty of the two steps is less than the one larger transition. I understand why this might/should be true, but I worry that each transition would be almost as difficult as going from centos 5 straight to linux-anvil [just a hunch, due to us always running into strange issues with these containers]

A complementary proposal

We could identify "leaf" packages on which nothing depends and move those to conda-forge incrementally. We could work our way up the dependency tree that way, incrementally moving omnia packages to conda-forge when it's safe to do so.

@jchodera (Member, Author) commented Jan 8, 2017

> If I'm understanding correctly, the plan outlined above would migrate from centos 5 hbb -> centos 6 hbb -> linux-anvil (when we move to conda-forge). The alternative plan would be to go centos 5 hbb -> linux anvil directly.

There are two different levels of effort we have to think about:

  • How much effort will it take us, the omnia maintainers, to retool the build framework?
  • How much effort will it take package maintainers to update their packages in omnia or migrate to conda-forge?

For the omnia maintainers, centos 5 hbb -> centos 6 hbb would be trivial to easy. My day of fiddling with a linux-anvil derivative in omnia suggests that transition will take some time (though it may not actually be difficult) and may require fixing some special recipes (like openmm) that rely on specific features of the build system.

Recipe maintainers would likely not have to do anything to keep their recipes working under omnia for any of these migrations, but for special recipes like openmm, using a linux-anvil derivative for omnia builds would likely shake out any issues ahead of time.

> The first plan makes sense if the sum of the difficulty of the two steps is less than the one larger transition. I understand why this might/should be true, but I worry that each transition would be almost as difficult as going from centos 5 straight to linux-anvil [just a hunch, due to us always running into strange issues with these containers]

I'm certainly open to going straight to the linux-anvil derivative, but someone needs to put in the time to make that work.

> We could identify "leaf" packages on which nothing depends and move those to conda-forge incrementally. We could work our way up the dependency tree that way, incrementally moving omnia packages to conda-forge when it's safe to do so.

I'm totally OK with this, but who would do the migration?

My hope was to get us to a CentOS 6 endpoint so that package maintainers can migrate their own packages as they see fit, which takes the pressure off the omnia maintainers. Suggesting we actively migrate packages makes more work for us, rather than less. Permitting migration of "leaf" packages is fine, however, if we can guarantee nobody is using them.

We do still have to decide what scheme we will use for allowing omnia packages to use conda-forge packages as dependencies. Any preference for (1) including conda-forge in the build environment channels, and requiring users add both conda-forge and omnia channels, vs (2) copying packages from conda-forge to omnia?
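For concreteness, option (1) would amount to users configuring both channels, e.g. in `.condarc` (a sketch; the channel order shown is just one possibility and determines resolution priority):

```yaml
# ~/.condarc -- sketch of option (1): users list both channels.
# Channels higher in the list win when a package exists in both.
channels:
  - omnia
  - conda-forge
  - defaults
```

Option (2) keeps the single-channel user experience at the cost of maintaining a mirror.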

@mpharrigan (Contributor)

> My hope was to get us to a CentOS 6 endpoint so that package maintainers can migrate their own packages as they see fit, which takes the pressure off the omnia maintainers. Suggesting we actively migrate packages makes more work for us, rather than less. Permitting migration of "leaf" packages is fine, however, if we can guarantee nobody is using them.

Yes, I was suggesting we permit / encourage "leaf" owners to migrate

> Any preference for (1) including conda-forge in the build environment channels, and requiring users add both conda-forge and omnia channels, vs (2) copying packages from conda-forge to omnia?

I think we'll have to do (2) for a while anyway, so users aren't confused about why `conda install -c omnia packagename` no longer works or no longer gets the latest version.

@mpharrigan (Contributor)

I'm going to try poking around with https://github.com/omnia-md/omnia-linux-anvil

@mpharrigan (Contributor) commented Jan 8, 2017

Started building packages (python 3.6, numpy 1.11) on a modified version of omnia-linux-anvil, uploaded here https://anaconda.org/mpharrigan/repo?label=omnia-anvil

Haven't attempted openmm yet

edit: but then I ran out of space on my SSD, where Docker was doing its thing by default. Currently trying to move all the Docker storage to my data partition :)

@jchodera (Member, Author) commented Jan 9, 2017

Awesome! Can you submit your Dockerfile changes as a PR? Once the PR is merged, it will automatically build an updated image on Docker Hub, and we can continue experimenting with it on Travis by just modifying .travis.yml for conda-dev-recipes or conda-recipes.

@marscher (Contributor) commented Feb 6, 2017

@jchodera texlive-core and clangdev are available on conda-forge, so you just need to pull in CUDA and the AMD APP SDK dynamically in the openmm conda build script and declare build-time dependencies on the existing packages. There should be no need for a custom build image.
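Under that scheme, the recipe's build-time dependencies might look something like this in meta.yaml (a sketch only; the exact package names on conda-forge should be verified):

```yaml
# meta.yaml (fragment, sketch only): build tools resolved from
# conda-forge instead of being baked into the Docker image.
requirements:
  build:
    - clangdev
    - texlive-core
    # CUDA and the AMD APP SDK would still be fetched in build.sh,
    # since they are not packaged on conda-forge.
```

The remaining custom pieces (CUDA, AMD APP SDK) are the ones discussed below.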

@jchodera (Member, Author) commented Feb 6, 2017

> so you just need to dynamically drag in cuda and amd sdk in the openmm conda build script and build-time depend on the existing packages.

Interesting! These packages are huge: CUDA is 1.3 GB and the AMD APP SDK is 167 MB. Do you think that will cause a problem? I suppose OpenMM release builds are infrequent, so as long as this fits within the CI time limits, we're OK.

The builds are handled in docker containers with root access, right?

Now that OpenMM 7.1 is out, we can start doing test builds of it in conda-forge.

@marscher (Contributor) commented Feb 6, 2017 via email

@jchodera (Member, Author) commented Feb 6, 2017

> Is the installed cuda sdk much smaller than the install file? I think Nvidia provides a network installer rpm, which only drags in the requested components (do you need all of them, e.g. documentation?).

Good point. We were grabbing the whole tarball and then installing just the components we needed to minimize the size of the Docker image, but in this case, we probably want the network installer to minimize the total data transfer.

> The download time is affected either way, whether you download these libs inside the container or they are already included in the Docker image (bigger image).

Not quite: we install only a minimal set of components and then discard the tarball/RPMs we don't use.

> For testing purposes you could just replace the image name in the circle.yml of the openmm-feedstock, if this is allowed. Since the build-image is derived from the same base, it shouldn't be a problem.

That was our original plan, but it sounds like we should first try your idea of modifying the build script to install the additional components and see if it works!

@salotz commented Oct 11, 2018

What is the status of this? I want to package something that depends on OpenMM.

@marscher (Contributor) commented Oct 11, 2018 via email

@Lnaden (Contributor) commented Oct 11, 2018

This is still true. If you want OpenMM, you would have to publish on Omnia or on your own channel.

@jchodera (Member, Author)

We think we're getting closer! Several versions of the CUDA Toolkit are now available on the Anaconda channel, so we are hoping we can use those to build CUDA projects on conda-forge:
https://anaconda.org/anaconda/cudatoolkit/files

But not all CUDA Toolkit versions are available, and contributions to the recipes to build the rest are actively being solicited: ContinuumIO/anaconda-recipes#140

@salotz commented Oct 11, 2018

Sounds like some good news. I might refactor my library into an OpenMM-less core (since OpenMM is just the killer-app default) and an extension, since I have a feeling the CUDA version treadmill isn't going to get easier anytime soon. Once the migration is fully done, does that mean omnia will dissolve (except for legacy reasons), or will it still be around?

@jchodera (Member, Author)

I think our hope is to eventually migrate everything to conda-forge and pitch in our efforts there!
