Allow separate workspaces for convolution/pooling #27

hobofan · 2016-02-21T18:52:21Z

#18 reduces the memory usage for use cases where you need the workspace for both forward and backward (training).

I haven't done any testing, but it is possible that the forward workspace is smaller than the backward one, leading to higher memory usage than necessary in pure forward (inference) use cases.

This would also help separate the automatic convolution algorithm detection for those use cases, leading to quicker startup time (this should already be possible by using an Algo other than Auto, but it would be clearer).
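To make the memory argument concrete, here is a minimal sketch of how the required workspace size could differ between inference and training. It is not the library's API: `WorkspaceSizes`, `Mode`, `required`, and `shared_workspace_size` are hypothetical names, and the byte counts are assumed to come from whatever the backend reports for each pass (cuDNN, for instance, reports separate sizes for forward, backward-filter, and backward-data).

```rust
/// Hypothetical per-layer workspace requirements in bytes.
/// Assumption: the backend can report one size per pass.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WorkspaceSizes {
    pub forward: usize,
    pub backward_filter: usize,
    pub backward_data: usize,
}

/// Hypothetical switch between pure forward use and training.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Inference,
    Training,
}

impl WorkspaceSizes {
    /// Bytes needed for one layer in the given mode.
    /// Inference only needs the forward workspace; training needs
    /// a buffer large enough for the most demanding of the three passes.
    pub fn required(&self, mode: Mode) -> usize {
        match mode {
            Mode::Inference => self.forward,
            Mode::Training => self
                .forward
                .max(self.backward_filter)
                .max(self.backward_data),
        }
    }
}

/// Size of a single workspace reused by all layers: the maximum
/// per-layer requirement, or 0 if there are no layers.
pub fn shared_workspace_size(layers: &[WorkspaceSizes], mode: Mode) -> usize {
    layers.iter().map(|l| l.required(mode)).max().unwrap_or(0)
}
```

With sizes like these, an inference-only setup would allocate just the largest forward workspace, while a combined forward/backward workspace is sized by the largest backward pass even if it is never run.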

BREAKING CHANGE: All convolution functions now require a SharedTensor<u8> workspace to be passed. This allows for reuse of the workspace between different convolution operations and a global shared workspace. REFERENCE #27

hobofan added F-CUDA Perf-Memory labels Feb 21, 2016

hobofan added breaking refactor labels Feb 29, 2016