Benchmarking platform and challenge for source extraction from imaging data.
Develop algorithms interactively in standard environments using Docker and Jupyter notebooks. Have your algorithms automatically deployed and tested in the cloud using Spark.
- Go to the CodeNeuro notebooks
- Launch a notebook session and click on the neurofinder section
- Explore the notebooks to learn about data format and see example algorithm designs
- Sign up for an account on GitHub (if you don't already have one)
- Fork this repository
- Create a branch
- Add a folder inside `neurofinder/submissions` with the structure described below
- Submit your branch as a pull request and wait for your algorithm to be validated and run!
Submission structure:

```
neurofinder/submissions/user-name-algorithm-name/info.json
neurofinder/submissions/user-name-algorithm-name/run/run.py
neurofinder/submissions/user-name-algorithm-name/run/__init__.py
```
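The layout above can be scaffolded in one go. The username and algorithm name below are placeholders; substitute your own:

```shell
# Placeholder submission name -- replace with your-github-username-your-algorithm
SUB=neurofinder/submissions/your-username-your-algorithm

# Create the submission folder and the three required files
mkdir -p "$SUB/run"
touch "$SUB/info.json" "$SUB/run/run.py" "$SUB/run/__init__.py"

# Confirm the layout
ls -R neurofinder/submissions
```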
The file `info.json` should contain the following fields:
```json
{
  "algorithm": "name of your algorithm",
  "description": "description of your algorithm"
}
```
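Before opening a pull request, it's easy to confirm your `info.json` parses and carries both required fields. The `check_info` helper below is illustrative, not part of the repo; only the two field names come from the spec above:

```python
import json

# Required fields, per the submission spec
REQUIRED = ("algorithm", "description")

def check_info(text):
    """Parse info.json text and report any missing required fields."""
    info = json.loads(text)
    missing = [key for key in REQUIRED if key not in info]
    return info, missing

info, missing = check_info('{"algorithm": "my-algo", "description": "a demo"}')
print(missing)  # an empty list means both required fields are present
```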
The file `run.py` should contain a function `run` that accepts as input an `Images` object and an `info` dictionary, and returns a `SourceModel` (these classes are from Thunder's Source Extraction API). See the existing folder `neurofinder/submissions/example-user-example-algorithm/` for an example submission.
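A minimal sketch of the `run` signature follows. Thunder isn't assumed here, so a plain NumPy stack stands in for the `Images` object and a list of coordinate arrays stands in for the `SourceModel`; a real submission must return Thunder `Source` objects wrapped in a `SourceModel`:

```python
import numpy as np

def run(images, info):
    """Toy source extractor: threshold the time-averaged image.

    In a real submission `images` is a Thunder Images object and the
    return value is a thunder SourceModel; here stand-ins keep the
    logic runnable on its own.
    """
    mean = images.mean(axis=0)              # average over time
    mask = mean > mean.mean() + mean.std()  # crude intensity threshold
    coords = np.argwhere(mask)              # pixel coordinates of one "source"
    # A real algorithm would split coords into connected components and
    # wrap each component in a Source before building the SourceModel.
    return [coords] if coords.size else []

# Tiny synthetic movie: 5 frames of 8x8 pixels with one bright block
data = np.zeros((5, 8, 8))
data[:, 2:4, 2:4] = 10.0
sources = run(data, {})
print(len(sources))  # -> 1
```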
Jobs will be automatically run every few days on a dynamically-deployed Spark cluster, so be patient with your submissions. You will be notified of job status via comments on your pull request.
Data sets for evaluating algorithms have been generously provided by the following individuals and labs:
- Simon Peron & Karel Svoboda / Janelia Research Campus
- Adam Packer, Lloyd Russell & Michael Häusser / UCL
- Jeff Zaremba, Patrick Kaifosh & Attila Losonczy / Columbia
- Nicholas Sofroniew & Karel Svoboda / Janelia Research Campus
- Philipp Bethge and Fritjof Helmchen / University of Zurich (in preparation)
All data are hosted on Amazon S3, and training data are available through the CodeNeuro data portal.
All jobs will be run on an Amazon EC2 cluster in a standardized environment. Our notebooks service uses Docker containers to deploy an interactive version of this same environment running in Jupyter notebooks. This is useful for testing and developing algorithms, but is currently limited to only one node.
The environment has the following specs and included libraries:
- Python v2.7.6
- Spark v1.3.0
- Numpy v1.9.2
- Scipy v0.15.1
- scikit-learn v0.16.1
- scikit-image v0.10.1
- Matplotlib v1.4.3
as well as several additional libraries included with Anaconda. You can develop and test your code in a full cluster deployment by following these instructions to launch a cluster on EC2.
This repo includes a suite for validating and executing pull requests, storing the status of pull requests in a Mongo database, and outputting the results to S3. To run its unit tests:
- Install the requirements with `pip install -r /path/to/neurofinder/requirements.txt`
- Call `py.test` inside the base neurofinder directory