Inefficient MatrixVectorMult_DDRM.innerProduct

I have noticed that `innerProduct`

https://github.com/lessthanoptimal/ejml/blob/2c9d1dc9d7b1bee86da196887ebbc203b3340d65/main/ejml-ddense/src/org/ejml/dense/row/mult/MatrixVectorMult_DDRM.java#L338-L344

performs a lot worse than my original naive implementation: at least 2x slower, with the exact factor varying with matrix size (I tested sizes in the 1000s).

I think I figured out the reason.

The access to the matrix data likely thrashes the CPU cache, because it walks down a column of the row-major array: `B.data[k + i*cols]`, where `i` is incremented in the inner loop, so consecutive reads are `cols` elements apart.

If I swap the loops, I get back the lost speed.

Before I provide a PR, is there any reason this is done this way?


	for (int k = 0; k < B.numCols; k++) {
		double sum = 0;
		for (int i = 0; i < B.numRows; i++) {
			sum += a[offsetA + i]*B.data[k + i*cols];
		}
		output += sum*c[offsetC + k];
	}
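
For illustration, here is a minimal sketch of what swapping the loops could look like. It uses plain arrays instead of EJML types, and the class and method names are hypothetical; it is not the actual EJML code or the eventual PR. The first method mirrors the loop quoted above (inner loop strides by `cols`), the second iterates rows in the outer loop so the inner loop reads `data` sequentially.

```java
final class InnerProductSketch {

    // Mirrors the quoted loop: for each column k, walk down the rows.
    // Consecutive reads of data[] are cols elements apart.
    static double innerProductColumnOrder(double[] a, int offsetA,
                                          double[] data, int rows, int cols,
                                          double[] c, int offsetC) {
        double output = 0;
        for (int k = 0; k < cols; k++) {
            double sum = 0;
            for (int i = 0; i < rows; i++) {
                sum += a[offsetA + i]*data[k + i*cols];
            }
            output += sum*c[offsetC + k];
        }
        return output;
    }

    // Swapped loops: for each row i, walk along the row.
    // Consecutive reads of data[] are adjacent in memory.
    static double innerProductRowOrder(double[] a, int offsetA,
                                       double[] data, int rows, int cols,
                                       double[] c, int offsetC) {
        double output = 0;
        for (int i = 0; i < rows; i++) {
            double sum = 0;
            int rowStart = i*cols; // index of the first element of row i
            for (int k = 0; k < cols; k++) {
                sum += data[rowStart + k]*c[offsetC + k];
            }
            output += a[offsetA + i]*sum;
        }
        return output;
    }

    public static void main(String[] args) {
        // 2x3 row-major matrix [[1, 2, 3], [4, 5, 6]]
        double[] data = {1, 2, 3, 4, 5, 6};
        double[] a = {1, 2};
        double[] c = {1, 0, -1};
        System.out.println(innerProductColumnOrder(a, 0, data, 2, 3, c, 0)); // prints -6.0
        System.out.println(innerProductRowOrder(a, 0, data, 2, 3, c, 0));    // prints -6.0
    }
}
```

Both variants compute the same quantity, a^T B c, but they add the terms in a different order, so on general floating-point input the results may differ in the last few bits. Any speed difference would need to be measured with a proper benchmark on the sizes of interest.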

Inefficient MatrixVectorMult_DDRM.innerProduct #201
