
snippers: order of "select" and "snip" matters, but should it? #502

@sergpolly

Here is a simple example. Consider a simple CoolerSnipper:

snipper = CoolerSnipper(
    clr,
    view_df=view_df,
    cooler_opts={"balance": "weight"},
    min_diag=2,
)

where view_df is

chrom	start	end	name
chr1	100000000	150000000	foo
chr2	100000000	150000000	bar

Inputs are taken from cooltools/tests.
Now, if we select and snip in the "right" order:

# select and snip foo
foo_mat = snipper.select("foo", "foo")
foo_snip = snipper.snip(foo_mat, "foo", "foo", (100000000, 107000000, 120000000, 127000000))
# select and snip bar
bar_mat = snipper.select("bar", "bar")
bar_snip = snipper.snip(bar_mat, "bar", "bar", (100000000, 107000000, 120000000, 127000000))

results look like this:
download-1

When we change the order of select and snip, like so:

# select foo and bar:
foo_mat = snipper.select("foo", "foo")
bar_mat = snipper.select("bar", "bar")
# snip foo and bar:
foo_snip = snipper.snip(foo_mat, "foo", "foo", (100000000, 107000000, 120000000, 127000000))
bar_snip = snipper.snip(bar_mat, "bar", "bar", (100000000, 107000000, 120000000, 127000000))

the result looks like this:
download-2

Note the white bar on foo_snip: it appears because snipper.select("bar","bar") modified some instance attributes (_isnan1, _isnan2, etc.), so the values computed for "bar" are then used when snipping "foo" as well. We're lucky that it didn't crash in this case, because the dimensions of "foo" and "bar" are identical.
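To make the mechanism concrete, here is a minimal toy sketch of the same pattern. ToySnipper, its 1D data layout, and the single _isnan attribute are hypothetical stand-ins made up for illustration; this is not the cooltools implementation, only the "select stores a mask on the instance, snip reads it back" shape described above.

```python
class ToySnipper:
    """Hypothetical stand-in for a snipper; NOT the cooltools implementation."""

    def __init__(self, data):
        # data maps a region name to a list of values (None marks a masked bin)
        self.data = data
        self._isnan = None  # overwritten by every select() call

    def select(self, region):
        row = self.data[region]
        # Side effect: the mask is stored on the instance, not returned.
        self._isnan = [v is None for v in row]
        return row

    def snip(self, row, lo, hi):
        # Uses whatever mask the most recent select() left behind.
        return [None if self._isnan[i] else row[i] for i in range(lo, hi)]


snipper = ToySnipper({"foo": [1, 2, 3, 4], "bar": [5, None, 7, 8]})

# Interleaved order: snip sees the mask of its own region.
foo_ok = snipper.snip(snipper.select("foo"), 0, 4)

# Batched order: the mask left by "bar" is applied to "foo".
foo_row = snipper.select("foo")
bar_row = snipper.select("bar")
foo_bad = snipper.snip(foo_row, 0, 4)

print(foo_ok)   # [1, 2, 3, 4]
print(foo_bad)  # [1, None, 3, 4]
```

The masked bin in foo_bad plays the role of the white bar: it comes from "bar", not from "foo", and it goes unnoticed only because both rows have the same length.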

Anyhow, this is not a big deal right now given the way snippers are used (they're used in the right order, even in the multiprocessing scenario, I hope), but it is potentially confusing, and I wanted to document it as a fact of life. It would be nice to decouple selecting and snipping if others agree. This could also be added to the refactoring wishlist #227.
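One possible shape for such a decoupling, sketched on a toy example: select returns everything snip needs, so nothing is kept on the instance between calls. The Selection dataclass and StatelessToySnipper are hypothetical names invented for this sketch, and the 1D data layout is an assumption; none of this is existing cooltools API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Selection:
    """Hypothetical container: the selected data plus its own mask."""
    region: str
    row: List[Optional[float]]
    isnan: List[bool]


class StatelessToySnipper:
    """Toy sketch; select() has no side effects on the instance."""

    def __init__(self, data):
        # data maps a region name to a list of values (None marks a masked bin)
        self.data = data

    def select(self, region):
        row = self.data[region]
        # The mask travels with the selection instead of living on self.
        return Selection(region, row, [v is None for v in row])

    def snip(self, sel, lo, hi):
        # Only the mask that belongs to this selection is consulted.
        return [None if sel.isnan[i] else sel.row[i] for i in range(lo, hi)]


snipper = StatelessToySnipper({"foo": [1, 2, 3, 4], "bar": [5, None, 7, 8]})

# Batched order is now safe: each selection carries its own mask.
foo_sel = snipper.select("foo")
bar_sel = snipper.select("bar")
foo_snip = snipper.snip(foo_sel, 0, 4)
bar_snip = snipper.snip(bar_sel, 0, 4)

print(foo_snip)  # [1, 2, 3, 4]
print(bar_snip)  # [5, None, 7, 8]
```

With this layout the call order no longer matters, and a selection could be passed to a worker process without relying on instance state set by an earlier call.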
