This repository hosts a workflow to process HI data cubes produced by radio interferometers, in particular the large data cubes expected from future instruments like the SKA. It extracts radio sources and characterizes their main properties.
The workflow is managed and executed with the Snakemake workflow management system. It uses the spectral-cube package, built on the dask parallelization library and the astropy suite, to divide the large cube into smaller, overlapping subcubes. On each subcube we run SoFiA-2 to mask the data, find sources, and characterize their properties. Finally, the individual catalogs are cleaned and concatenated into a single catalog, and duplicates from the overlapping regions are eliminated. Diagnostic plots are produced with Jupyter notebooks.
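To illustrate the splitting step, below is a minimal sketch, not the workflow's actual code: the file names, grid size, and overlap are hypothetical, and the real logic lives in the split_subcube module.

```python
# Minimal sketch of the splitting step (hypothetical file names, grid size,
# and overlap; the real logic lives in the split_subcube module).
from spectral_cube import SpectralCube

N_SPLITS = 4   # pieces per spatial axis (assumed value)
OVERLAP = 40   # pixel overlap between neighboring subcubes (assumed value)

cube = SpectralCube.read("sky_full.fits", use_dask=True)  # lazy, dask-backed
_, ny, nx = cube.shape
step_y, step_x = ny // N_SPLITS, nx // N_SPLITS

for i in range(N_SPLITS):
    for j in range(N_SPLITS):
        # Extend each piece by OVERLAP pixels so border sources are fully
        # contained in at least one subcube.
        y0, y1 = max(0, i * step_y - OVERLAP), min(ny, (i + 1) * step_y + OVERLAP)
        x0, x1 = max(0, j * step_x - OVERLAP), min(nx, (j + 1) * step_x + OVERLAP)
        cube[:, y0:y1, x0:x1].write(f"subcube_{i}_{j}.fits", overwrite=True)
```

The overlap is what makes sources near subcube borders detectable in full, and it is also why the concatenated catalog must later be purged of duplicates.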
This repository contains the workflow used to find and characterize the HI sources in the data cube of the SKA Data Challenge 2 (SDC2). It was developed by the HI-FRIENDS team, and the workflow was executed on the SP-SRC cluster at the IAA-CSIC. Documentation can be found in the HI-FRIENDS SDC2 Documentation (more details below).
Following the FAIR principles, we aim to make the workflow as accessible as possible. The contents of this repository and the solution submitted to the SDC2 are published in this Zenodo record. The Snakemake workflow is also provided as Singularity and Docker containers, and it is published in WorkflowHub. Installation and execution instructions can be found in the online documentation developed in this repository.
For details on installing and using HI-FRIENDS, please visit the documentation: installation, execution.
This project is licensed under the GNU General Public License v3.0. See the full license here.
Please use this reference (it resolves to the most recent version in Zenodo): https://doi.org/10.5281/zenodo.5167659
The repository documentation can be found on the HI-FRIENDS SDC2 webpage, where you can find details on:
- The SKA Data Challenge 2
- The HI-FRIENDS solution to the SDC2
- Workflow general description
- The HI-FRIENDS team
- Methodology
- Data exploration
- Feedback from the workflow and logs
- Configuration
- Unit tests
- Software management and containerization
- Check conformance to coding standards
- Workflow Description
- Workflow definition diagrams
- Workflow file structure
- Output products
- Snakemake execution and diagrams
- Workflow installation
- Dependencies
- Installation
  1. Get conda
  2. Get the pipeline and install snakemake
- Deploy in containers
  - Docker
  - Singularity
  - Podman
- Use tarball of the workflow
- Use myBinder
- Workflow execution
- Preparation
- Basic usage and verification of the workflow
- Execution on a data cube
- SDC2 HI-FRIENDS results
- Our solution
- Score
- SDC2 Reproducibility award
- Reproducibility of the solution check list
- Developers
- define_chunks module
- eliminate_duplicates module
- filter_catalog module
- run_sofia module
- sofia2cat module
- split_subcube module
- Acknowledgments
More details can be found in CONTRIBUTING.md. In summary, nothing fancy, just:
- Fork this repo
- Commit your code
- Submit a pull request. It will be reviewed by the maintainers, who will give you feedback so you can iterate on it.
- Make sure existing tests pass
- Make sure your new code is properly tested and fully-covered
- Following The seven rules of a great Git commit message is highly encouraged
- When adding a new feature, branch from the master branch
As mentioned above, existing tests must pass, and new features must be properly tested and fully covered.
Code should be self-documenting, but any code that may be hard to understand must include comments to make it easier to review and maintain.
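To illustrate the testing expectation, the following is a minimal pytest-style sketch. The `remove_duplicates` function, its parameters, and the catalog columns are hypothetical stand-ins, not the actual API of the eliminate_duplicates module:

```python
# Hypothetical unit test sketch (the function and columns are assumptions,
# not the workflow's real API) illustrating the expected testing style.
import pandas as pd

def remove_duplicates(catalog, tolerance=1.0):
    """Toy stand-in: keep the first of any rows closer than `tolerance`
    in (ra, dec); the real logic lives in the eliminate_duplicates module."""
    keep = []
    for _, row in catalog.iterrows():
        if all(abs(row["ra"] - catalog.loc[k, "ra"]) > tolerance
               or abs(row["dec"] - catalog.loc[k, "dec"]) > tolerance
               for k in keep):
            keep.append(row.name)
    return catalog.loc[keep]

def test_remove_duplicates_keeps_single_entry():
    catalog = pd.DataFrame({"ra": [10.0, 10.0001, 50.0],
                            "dec": [-30.0, -30.0001, -45.0]})
    cleaned = remove_duplicates(catalog, tolerance=0.01)
    assert len(cleaned) == 2  # the two nearly identical rows collapse to one
```

Tests in this style can be run with `pytest` from the repository root.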