Init spatial experiment #1

keviny2 · 2025-01-21T23:09:03Z

No description provided.

jkanche

Looks great! Awesome work!

src/spatialexperiment/SpatialExperiment.py

jkanche · 2025-01-22T14:49:50Z

Just remembered: slicing needs to be updated so that it also slices the image and spatial coords props. Technically, we only need to implement this function, similar to SingleCellExperiment here.
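For illustration, here is a minimal sketch of what slicing the extra props could look like. `ToySpatialExperiment` and `slice_columns` are hypothetical names, not the class or method in this PR, and dropping images whose sample no longer has any column is an assumption rather than confirmed behavior.

```python
from dataclasses import dataclass


@dataclass
class ToySpatialExperiment:
    """Hypothetical stand-in for illustration; not the class from this PR."""

    sample_ids: list      # one entry per column (stand-in for column_data["sample_id"])
    spatial_coords: list  # one (x, y) tuple per column
    img_data: list        # one dict per image, each with a "sample_id" key

    def slice_columns(self, cols):
        """Return a new object restricted to the column positions in `cols`."""
        kept_ids = [self.sample_ids[i] for i in cols]
        # Spatial coordinates are per column, so they are sliced with the same indices.
        kept_coords = [self.spatial_coords[i] for i in cols]
        # Assumption: images whose sample no longer has any column are dropped.
        remaining = set(kept_ids)
        kept_imgs = [img for img in self.img_data if img["sample_id"] in remaining]
        return ToySpatialExperiment(kept_ids, kept_coords, kept_imgs)


spe = ToySpatialExperiment(
    sample_ids=["s1", "s1", "s2"],
    spatial_coords=[(0, 0), (0, 1), (5, 5)],
    img_data=[{"sample_id": "s1"}, {"sample_id": "s2"}],
)
subset = spe.slice_columns([0, 1])
print(subset.spatial_coords)  # [(0, 0), (0, 1)]
print(subset.img_data)        # [{'sample_id': 's1'}]
```

The point of the sketch is only that the column indices drive all three pieces together, so the sliced object never carries coordinates or images for columns that were removed.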

keviny2 · 2025-01-25T00:52:04Z

Recording behavior regarding img_data and column_data.

Constructor

All sample_ids in img_data must be present in column_data['sample_id'], but not all sample_ids in column_data need to be present in img_data['sample_id'] (i.e. there can be sample_ids in column_data['sample_id'] that are not in img_data['sample_id']).

column_data setter

* Bioconductor only checks that the number of unique sample_ids in column_data remains constant, irrespective of the actual values of the new sample_ids.
* It is possible to set a new column_data with a completely different set of sample_ids (it doesn't even have to include the sample_id in img_data), as long as the number of unique sample_ids in the new column_data matches that of the old column_data. Confusingly, doing this will actually modify the sample_id in the img_data...

Here's an example:

```r
library(SpatialExperiment)
dir <- system.file(
    file.path("extdata", "10xVisium", "section1", "outs"),
    package = "SpatialExperiment")

# read in counts
fnm <- file.path(dir, "raw_feature_bc_matrix")
sce <- DropletUtils::read10xCounts(fnm)

# read in image data
img <- readImgData(
    path = file.path(dir, "spatial"),
    sample_id = "foo")

# read in spatial coordinates
fnm <- file.path(dir, "spatial", "tissue_positions_list.csv")
xyz <- read.csv(fnm, header = FALSE,
    col.names = c(
        "barcode", "in_tissue", "array_row", "array_col",
        "pxl_row_in_fullres", "pxl_col_in_fullres"))
xyz$sample_id <- c(rep("bar", 25), rep("foo", 25))

# construct feature metadata
rd <- S4Vectors::DataFrame(
    symbol = rowData(sce)$Symbol)

# construct 'SpatialExperiment'
spe <- SpatialExperiment(
    assays = list(counts = assay(sce)),
    rowData = rd,
    colData = DataFrame(xyz),
    spatialCoordsNames = c("pxl_col_in_fullres", "pxl_row_in_fullres"),
    imgData = img
)

# replace colData, swapping the sample_id `foo` for `baz`
xyz_new <- DataFrame(xyz)
xyz_new$sample_id <- c(rep("bar", 25), rep("baz", 25))
colData(spe) <- xyz_new

# no more `foo` in `sample_id`
colData(spe)

# `sample_id` changes to `baz`
imgData(spe)
```
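
To make the question concrete on the Python side, here is a rough model of the behavior observed above. `rename_img_sample_ids` is a hypothetical helper, not a function from Bioconductor or from this PR, and pairing old and new ids by column position is an assumption that simply reproduces the `foo` to `baz` change seen in the example.

```python
def rename_img_sample_ids(old_ids, new_ids, img_sample_ids):
    """Model of the observed setter behaviour; hypothetical helper.

    old_ids / new_ids: per-column sample_id values before and after the assignment.
    img_sample_ids: the sample_id column of img_data.
    """
    # The only check described above: the number of unique sample_ids is unchanged.
    if len(set(old_ids)) != len(set(new_ids)):
        raise ValueError("number of unique sample_ids must stay the same")
    # Assumption: old and new ids are paired by column position.
    mapping = dict(zip(old_ids, new_ids))
    # Image rows follow the rename; ids without a mapping are left untouched.
    return [mapping.get(s, s) for s in img_sample_ids]


old = ["bar"] * 25 + ["foo"] * 25
new = ["bar"] * 25 + ["baz"] * 25
print(rename_img_sample_ids(old, new, ["foo"]))  # ['baz']
```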

@jkanche How do you think we should handle this?

jkanche · 2025-01-25T06:58:12Z

> Constructor
>
> All sample_ids in img_data must be present in column_data['sample_id'], but not all sample_ids in column_data need to be present in img_data['sample_id'] (i.e. there can be sample_ids in column_data['sample_id'] that are not in img_data['sample_id'])

I can understand why this might be the case (if the image is huge or someone doesn't care too much about the image itself). So I propose that, in addition to the check you already have in the constructor, we add:

Rule (1): check whether all sample_ids in column_data are accounted for in img_data["sample_id"]; if they are not, we don't raise an exception but warn the user. After talking to the SPE folks, we may decide in the future to change this warning to an exception.

> column_data setter
>
> * Bioconductor only checks that the number of unique `sample_id`s in `column_data` remains constant, irrespective of the actual values of the new `sample_id`s
>
> * It is possible to set a new `column_data` with a completely different set of `sample_id`s (doesn't even have to include the `sample_id` in `img_data`); as long as the number of unique `sample_id`s in the new `column_data` match the old `column_data`. Confusingly, doing this will actually modify the `sample_id` in the `img_data`...

I don't understand the check on the number of unique sample_ids. So to be consistent, what makes sense to me right now is to do the same as in the constructor: all sample_ids in img_data need to be in column_data's sample_id column, and we apply Rule (1).

Does that make sense?

edit 1: If you also agree, let's document this behavior in the constructor and the column data setter methods.
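
As a starting point for that documentation, here is a minimal sketch of the two checks, the hard one on img_data and the warning-only Rule (1). `check_sample_ids` is a hypothetical name and signature, not the code in `_validators.py`, and the message texts are placeholders.

```python
import warnings


def check_sample_ids(column_sample_ids, img_sample_ids):
    """Hypothetical validator sketching the proposal above."""
    col_ids = set(column_sample_ids)
    img_ids = set(img_sample_ids)

    # Existing check: every image must belong to a sample present in column_data.
    unknown = img_ids - col_ids
    if unknown:
        raise ValueError(f"img_data sample_ids not in column_data: {sorted(unknown)}")

    # Rule (1): samples without an image only trigger a warning.
    without_image = col_ids - img_ids
    if without_image:
        warnings.warn(
            f"no image found for sample_ids: {sorted(without_image)}", UserWarning
        )


# 'bar' has no image, so this call warns instead of raising.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_sample_ids(["bar", "foo"], ["foo"])
print([str(w.message) for w in caught])
```

Calling the same function from both the constructor and the column_data setter would keep the two code paths consistent, and switching Rule (1) to an exception later would be a one-line change.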

keviny2 · 2025-01-27T19:14:50Z

@jkanche I think the reason why the sample_ids don't have to match in the column_data setter is so that you can 'rename' samples. See this test case: https://github.com/drighelli/SpatialExperiment/blob/devel/tests/testthat/test_SpatialExperiment-colData.R#L20

See #8

keviny2 added 10 commits January 16, 2025 15:26

EOD: progress on SpatialExperiment and SpatialImage

d42e8aa

EOD: Add suggestions by Jay - made some progress

abedc5b

Write getters and setters for spatial_coords

46db013

Write getters and setters for img_data

8d07afa

Implement get_img() and add_img()

aeaf63f

Coerce frames to BiocFrame

8b1e2c2

Add SpatialExperiment variables to copy

7b2cc80

Add unifnished printing section

34f1cc5

Reduce complexity of constructor

1fa867e

Remove type hints in docstrings

c5c308a

keviny2 requested a review from jkanche January 21, 2025 23:09

Remove sample_id from constructor

721089e

jkanche reviewed Jan 21, 2025

Implement getters and setters for spatial coordinates names

a35ae28

jkanche linked an issue Jan 23, 2025 that may be closed by this pull request

SpatialExperiment class #2

Closed

keviny2 added 5 commits January 23, 2025 09:48

Validate img_data and spatial_coords in constructor

6780ce2

Add validation for one-to-one mapping between sample_ids in column_data and img_data

0891311

Override column_data setter

ecbd3ce

Make getters and setters consistent

ae55ff0

Write first test

1fb5d79