@allansp84
Contributor

@allansp84 allansp84 commented Oct 20, 2025

Pull Request Description

Add exclude_from parameter to copy primitive

Summary

This PR adds an optional exclude_from parameter to the copy primitive, enabling rsync-based selective file copying for Singularity and Apptainer builds.
Users can exclude specific files and directories during image construction, analogous to Docker’s .dockerignore, which helps when the source tree contains large datasets or codebases.


Motivation

When building complex scientific or HPC images, users often need to avoid copying:

  • Large datasets
  • Intermediate build artifacts
  • Cache directories or virtual environments

Previously, the copy primitive always performed a full copy.
This enhancement introduces optional filtering using rsync --exclude-from, eliminating the need for pre-cleanup scripts.


New Behavior

When exclude_from is specified, HPCCM now emits the copy logic inside the %setup section using rsync. Because %setup commands run on the build host, rsync must be installed there:

%setup
    mkdir -p ${SINGULARITY_ROOTFS}/opt/app
    rsync -av --exclude-from=.apptainerignore ./ ${SINGULARITY_ROOTFS}/opt/app/
%files

Otherwise, the behavior remains unchanged and continues to use the %files directive.
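
The branching described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the code in hpccm/primitives/copy.py: the function name render_singularity_copy is hypothetical, the default %files layout is assumed, and passing one --exclude-from flag per file is an assumption about how multiple exclusion files would be handled.

```python
def render_singularity_copy(src, dest, exclude_from=None):
    """Return Singularity recipe text for one copy (illustrative sketch only)."""
    if not exclude_from:
        # Default path: a plain %files entry (layout assumed for this sketch).
        return "%files\n    {0} {1}".format(src, dest)

    # Accept either a single file name or a list of file names.
    if isinstance(exclude_from, str):
        exclude_from = [exclude_from]

    # %setup runs on the host, so the destination is addressed via the rootfs.
    rootfs_dest = "${SINGULARITY_ROOTFS}" + dest

    # Assumption: one --exclude-from flag per exclusion file.
    flags = " ".join("--exclude-from={}".format(name) for name in exclude_from)

    lines = [
        "%setup",
        "    mkdir -p {}".format(rootfs_dest),
        # Trailing slashes make rsync copy the directory contents.
        "    rsync -av {0} {1}/ {2}/".format(
            flags, src.rstrip("/"), rootfs_dest.rstrip("/")
        ),
        # The empty trailing %files section mirrors the output shown above.
        "%files",
    ]
    return "\n".join(lines)


print(render_singularity_copy(".", "/opt/app", ".apptainerignore"))
```

Calling the helper without exclude_from falls through to the unchanged %files branch, which is what keeps existing recipes unaffected.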


Example Usage

# Single exclusion file
copy(src='.', dest='/opt/app', exclude_from='.apptainerignore')

# Multiple exclusion files
copy(src='data', dest='/opt/data', exclude_from=['.ignore1', '.ignore2'])
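
For context, an exclusion file is a plain list of rsync patterns, one per line; rsync ignores blank lines and lines starting with # or ;. The sketch below writes such a file from Python. The patterns and the write_ignore_file helper are hypothetical examples chosen to match the categories listed under Motivation, not part of HPCCM.

```python
from pathlib import Path

# Hypothetical patterns; adjust them to the project being packaged.
IGNORE_PATTERNS = """\
# Large datasets
data/
*.h5
# Intermediate build artifacts
build/
*.o
# Caches and virtual environments
__pycache__/
.venv/
"""


def write_ignore_file(path=".apptainerignore", patterns=IGNORE_PATTERNS):
    """Write rsync exclude patterns to `path` and return it as a Path."""
    target = Path(path)
    target.write_text(patterns)
    return target
```

A trailing slash restricts a pattern to directories, so data/ skips a data directory but would still copy a regular file named data.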

Implementation Details

  • Added optional exclude_from argument in hpccm/primitives/copy.py.
  • If provided (and not in Docker mode), the copy primitive uses rsync --exclude-from=<file> emitted in %setup.
  • Backward-compatible with existing copy usage.
  • No changes to Docker output or behavior.
  • Added logging.info diagnostic when exclude_from is used.

Testing

Added three new unit tests in test/test_copy.py:

  • test_exclude_from_single_singularity
  • test_exclude_from_multiple_singularity
  • test_exclude_from_docker_ignored

Verified that:

  • rsync and %setup sections are generated for Singularity.
  • %files remains present but only as an empty trailing section.
  • Docker builds ignore the new parameter.
  • All existing 696 tests continue to pass.

pytest -v test/test_copy.py -k exclude_from

Results:

3 new tests passed; together with the 696 existing tests, 699 tests pass in total.

Backward Compatibility

  • Docker builds: unaffected
  • Singularity/Apptainer: only affected when exclude_from is explicitly set
  • No changes to existing APIs or defaults
  • Fully compatible with _app, _mkdir, _post, and _from

Checklist

  • Code builds locally
  • Unit tests added
  • All existing tests pass
  • Docstring and examples updated
  • No backward-incompatible API changes

Related Discussions / References

  • Aligns Singularity copy behavior with Docker’s .dockerignore concept.
  • Requested feature in internal HPC and data-science workflows (e.g., CNPEM / LNLS builds).

@allansp84 allansp84 changed the title from "feat(copy): add exclude_from parameter for rsync-based copy in Singul…" to "feat(copy): add exclude_from parameter for rsync-based copy in Singularity builds" on Oct 20, 2025
…tring alphabetically

Renamed the parameter `exclude_from` to `_exclude_from` to follow HPCCM
convention that container framework-specific options begin with an underscore
(e.g., `_chown`, `_mkdir`, `_post`).

Also reordered the parameter documentation block in `copy.py` to maintain
alphabetical order within the class docstring for consistency.

Removes the redundant `logging.info()` statement in the rsync exclusion
branch of the copy primitive to keep logging output minimal and consistent
with other primitives.

Refactors the _exclude_from tests to use assertEqual() with full expected
recipe strings instead of multiple substring checks. This aligns the test
style with other HPCCM copy primitive tests.
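
The difference between the two test styles can be illustrated with a small unittest sketch. The render function is a stand-in that returns the recipe text shown in the PR description, and the test class name is hypothetical; neither is taken from test/test_copy.py.

```python
import unittest


def render(exclude_file):
    """Stand-in for the generated recipe text; not HPCCM code."""
    return "\n".join([
        "%setup",
        "    mkdir -p ${SINGULARITY_ROOTFS}/opt/app",
        "    rsync -av --exclude-from=" + exclude_file
        + " ./ ${SINGULARITY_ROOTFS}/opt/app/",
        "%files",
    ])


class ExcludeFromStyleSketch(unittest.TestCase):
    def test_full_expected_string(self):
        # Comparing the whole recipe catches stray or reordered lines
        # that a handful of substring checks would let through.
        expected = (
            "%setup\n"
            "    mkdir -p ${SINGULARITY_ROOTFS}/opt/app\n"
            "    rsync -av --exclude-from=.apptainerignore ./ "
            "${SINGULARITY_ROOTFS}/opt/app/\n"
            "%files"
        )
        self.assertEqual(render(".apptainerignore"), expected)
```

The full-string comparison also pins down the empty trailing %files section, so a later cleanup that removes it would show up as a deliberate test change.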

Note:
When `_exclude_from` is used, an empty %files section is still emitted after
the rsync-based %setup block. This is intentional to preserve compatibility
with the existing copy control flow. The extra section is harmless and may
be removed in a future cleanup.

@samcmill
Collaborator

Thanks for the contribution!

Would you please respond to the last minor comment? Otherwise, LGTM!

@allansp84
Contributor Author

All updates have been applied and verified. All tests are passing successfully.
Thanks a lot for the great feedback, @samcmill!

@samcmill samcmill merged commit 0c4bff7 into NVIDIA:master Oct 22, 2025
15 checks passed
@allansp84 allansp84 deleted the feature/exclude-from-copy branch October 23, 2025 12:04