-
-
Notifications
You must be signed in to change notification settings - Fork 516
Feature: add filter and remove functions to CooMatrix #1556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ve_row_column for clarity
Hi @SeabertYuan, I appreciate the effort on this PR, but I must also admit I'm a little skeptical if these additions carry their own weight. Could you perhaps say something about possible use cases for these methods? Given that a COO matrix is primarily intended as a way to accumulate matrix entries before conversion to either CSR/COO, I can't really think of a use case where you wouldn't just apply these filtering/removal operations directly while building the COO matrix in the first place. |
Absolutely, it's mostly a convenience thing. So in my particular use case, I need to read a lot of data and it's only known when reading the data which rows/columns need to be removed. For convenience, storing this data in memory as the Additionally, after the data is read, the user might choose to remove rows and columns multiple times. For example in my instance I am building a sparse matrix to run some sort of simulation result. In my case, the user can say "I want to run the simulation again with I can definitely see how I could just filter and remove before building the COO matrix but I would argue that even the original |
I see how these functions are useful to you. I'm just concerned that, well, they might not be useful to anyone else, as it seems to be a very niche use case to me. My general opinion is that COO is primarily an "in-between" format, used to keep unstructured matrix data until it can be turned into a more suitable format (CSR/CSC at present). Therefore I also don't think we want to expand the API with functions to manipulate the data directly in COO form unless there is some substantial demand for this functionality. At the moment I'm therefore inclined to not merge this PR at this time. If there's more demand for this kind of functionality in the future, I'd be happy to revisit this decision. |
Sounds good thanks for your time and input! |
Overview
Adds the following to
CooMatrix
in thenalgebra-sparse
crate:filter
remove_row
remove_column
remove_row_column
Details
Since the regular
DMatrix
supports the following operations:remove_row
remove_column
I tried to maintain a consistent API and added these functionalities to the
CooMatrix
in thenalgebra-sparse
crate. SinceVec
deletions are costly, this implementation allocates a newCooMatrix
based on the old one. There are two trade-offs with this:Copy
must be implemented forT
since we need to copy the values from the old matrix to the new oneTo offset (2) somewhat, I introduced a
remove_row_column
function when both a row and a column need to be removed from a sparse matrix to avoid an intermediate allocation.Differences from
filter
Additionally since
filter
exists inCscMatrix
andCsrMatrix
I felt that there wasn't any reason it shouldn't be inCooMatrix
. One might find it useful to "zero-out" parts of a sparse matrix since theCooMatrix
has the default behaviour of summing together duplicate entries.The
filter
method (implemented forCscMatrix
for example) does not change the structure of the matrix similar to the newfilter
method forCooMatrix
. Thefilter
method is used for the removal functions which change the shape of the matrix.Notes on testing
I'm not sure the exact testing standards, but I added some tests based on existing tests. I am currently using a forked version of
nalgebra-sparse
in a project and have used it as a "real-world" test which I have used to update the tests that I've written.