RxRx1 is the first dataset released by Recursion in the RxRx.ai series and was the topic of the NeurIPS 2019 CellSignal competition. It contains 125,510 images of 6-channel fluorescent cellular microscopy, taken from four kinds of cells perturbed by 1,138 siRNAs. The goal of the competition was to train models that could identify which siRNA was used in a given image taken from an experimental batch not seen in the training data. For more information about RxRx1 please visit RxRx.ai.
RxRx1 is part of a larger set of Recursion datasets that can be found at RxRx.ai and on GitHub. For questions about this dataset and others please email [email protected].
The metadata can be found in `metadata.csv` and downloaded from here. The schema of the metadata is as follows:
Attribute | Description |
---|---|
site_id | Unique identifier of a given site |
well_id | Unique identifier of a given well |
cell_type | Cell type tested |
dataset | The split that this site belongs to; train or test |
experiment | The experiment name: cell type and experiment number (e.g. HUVEC-1) |
plate | Plate number within the experiment |
well | Location on the plate |
site | Location in the well where the image was taken (1 or 2) |
well_type | Indicates whether the well is a treatment, negative_control, or positive_control |
sirna | The siRNA (ThermoFisher ID) that was introduced into the well |
sirna_id | The siRNAs mapped to integers for ease of classification tasks |
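
As a minimal sketch of working with this schema, the snippet below reads metadata rows with Python's standard `csv` module and counts sites per split. It assumes the CSV header names match the attribute names in the table above and that `plate`, `site`, and `sirna_id` always hold integers. The helper names (`load_metadata`, `count_by`) and the sample rows are hypothetical, made up for illustration only.

```python
import csv
import io
from collections import Counter

# Columns assumed to hold integers (an assumption based on the schema table).
INT_COLUMNS = ("plate", "site", "sirna_id")


def load_metadata(handle):
    """Read metadata rows from an open text handle into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        for col in INT_COLUMNS:
            row[col] = int(row[col])
        rows.append(row)
    return rows


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)


# Hypothetical sample rows; real values come from metadata.csv.
SAMPLE = io.StringIO(
    "site_id,well_id,cell_type,dataset,experiment,plate,well,site,"
    "well_type,sirna,sirna_id\n"
    "id-a,well-a,HUVEC,train,HUVEC-1,1,M23,1,treatment,sirna-x,0\n"
    "id-b,well-a,HUVEC,train,HUVEC-1,1,M23,2,treatment,sirna-x,0\n"
    "id-c,well-b,HUVEC,test,HUVEC-2,1,B02,1,negative_control,sirna-y,1\n"
)

rows = load_metadata(SAMPLE)
split_counts = count_by(rows, "dataset")
print(dict(split_counts))
```

For the real file, the same function could be called as `load_metadata(open("metadata.csv", newline=""))`, assuming the file has been downloaded to the working directory.
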
The images are found in `images/*` and can be downloaded from here (n.b. this is 47 GB).
The images are 512x512 8-bit `png` files. The image paths, such as `HUVEC-1/Plate1/M23_s2_w3.png`, can be read as:
- Experiment Name: Cell type and experiment number (HUVEC experiment 1)
- Plate Number: 1
- Well location on plate: row M, column 23
- Site: 2
- Channel: 3
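
The path convention above can be sketched as a small parser. This is only an illustration: the regular expression assumes every path follows the `<cell type>-<number>/Plate<n>/<well>_s<site>_w<channel>.png` shape of the example, and `ImagePath` and `parse_image_path` are hypothetical names, not part of any released tooling.

```python
import re
from typing import NamedTuple


class ImagePath(NamedTuple):
    experiment: str
    cell_type: str
    plate: int
    well: str
    site: int
    channel: int


# Pattern inferred from the single example path; treat it as an assumption.
_PATTERN = re.compile(
    r"(?:^|/)(?P<experiment>(?P<cell_type>[^/]+)-\d+)"
    r"/Plate(?P<plate>\d+)"
    r"/(?P<well>[A-Z]+\d+)_s(?P<site>\d+)_w(?P<channel>\d+)\.png$"
)


def parse_image_path(path: str) -> ImagePath:
    """Split an image path into experiment, plate, well, site and channel."""
    match = _PATTERN.search(path)
    if match is None:
        raise ValueError(f"unrecognized image path: {path!r}")
    return ImagePath(
        experiment=match["experiment"],
        cell_type=match["cell_type"],
        plate=int(match["plate"]),
        well=match["well"],
        site=int(match["site"]),
        channel=int(match["channel"]),
    )


parsed = parse_image_path("HUVEC-1/Plate1/M23_s2_w3.png")
print(parsed)
```
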
All six channels (`w1` - `w6`) make up a single image of a given site.
Physical resolution: 0.65 micron/pixel.
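
To illustrate how the six channel files of one site relate, and what the stated resolution implies for the imaged area, here is a short sketch. The function names are hypothetical, and the field-of-view figure is simply the image width multiplied by the stated resolution, assuming the 0.65 micron/pixel value applies to the 512x512 images as distributed.

```python
from pathlib import PurePosixPath

N_CHANNELS = 6            # channels w1..w6 per site
IMAGE_SIZE_PX = 512       # images are 512x512 pixels
MICRONS_PER_PIXEL = 0.65  # stated physical resolution


def channel_paths(experiment, plate, well, site):
    """Return the six per-channel PNG paths that make up one site image."""
    base = PurePosixPath(experiment) / f"Plate{plate}"
    return [
        str(base / f"{well}_s{site}_w{channel}.png")
        for channel in range(1, N_CHANNELS + 1)
    ]


def field_of_view_microns():
    """Side length of the imaged square, in microns."""
    return IMAGE_SIZE_PX * MICRONS_PER_PIXEL


paths = channel_paths("HUVEC-1", 1, "M23", 2)
fov = field_of_view_microns()
print(paths[2])
print(f"{fov:.1f} microns per side")
```

Each of the six files can then be read with an image library of your choice and stacked along a channel axis to form one 512x512x6 array per site.
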
- June 2019: original release for CellSignal; train images only
- December 2019: updated to include test images after completion of CellSignal competition
- August 2020: file organization updated and license changed to CC-BY-NC-SA
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.