memory usage bottleneck #49
Sounds good to me! To clarify, does it currently take 200 GB for 40k seqlets even with this modification in place?
For 40k seqlets per metacluster, the peak memory usage with this modification is currently ~120 GB, according to slurm seff. Thanks!
Thanks Han.
Av - we still need to bring this down by a lot; it must fit in a Google Colab instance, i.e. around 12 GB max usage.
Maybe using a different implementation of Leiden/Louvain might help? Is the high memory usage a problem with the phenograph implementation? The Louvain/Leiden implementation that Laksshman and Akshay use for single-cell clustering (a much larger number of entities) seems to be very efficient with memory and speed. Maybe take a look at that.
On Sat, Nov 23, 2019, 12:23 AM Han Yuan wrote:
> For 40k seqlets per metacluster, currently the peak memory usage is ~120 GB with this modification put in, according to slurm seff. Thanks!
Yes, agreed. I don't recall seeing any evidence that the Louvain/Leiden implementation itself is causing the issue after Han's fix. Han's fix specifically addressed an issue in the borrowed-from-phenograph code that wrote the binary file subsequently read by Louvain.
Hi Avanti,
I've been trying to figure out the memory bottleneck when using tfmodisco. It turns out the initial dense matrix created by seqlets2patterns doesn't take that much memory with 40k seqlets per metacluster (~40 GB). I then narrowed it down to graph2binary() in modisco/cluster/phenograph/core.py. graph2binary() creates a really large list before writing it out to a binary file:
f.writelines([e for t in zip(ij, s) for e in t])
For a 4 GB sparse matrix, this list can be ~60 GB. By avoiding creating this list, I can run tfmodisco with 40k seqlets per metacluster within 200 GB of memory.
I've submitted a PR with this revision. I'm not too familiar with the codebase yet, so let me know if I missed anything.
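The core idea of the fix is to stream each packed edge record to the file as it is produced instead of materializing the entire list of records in memory first. A minimal sketch of that pattern, assuming a hypothetical record layout of two uint32 node ids plus a float64 weight per edge (this is illustrative, not the exact TF-MoDISco/phenograph binary format):

```python
import struct

def write_edges_streaming(f, edges, weights):
    """Write (i, j, weight) edge records to a binary file-like object.

    Unlike f.writelines([e for t in zip(ij, s) for e in t]), which first
    builds one giant in-memory list of every packed element, this loop
    packs and writes one record at a time, keeping peak memory usage
    independent of the number of edges.

    Record layout assumed here: little-endian uint32, uint32, float64.
    """
    pack = struct.Struct("<IId").pack
    for (i, j), w in zip(edges, weights):
        f.write(pack(i, j, w))
```

Buffered file I/O makes the per-record write calls cheap, so the streaming version trades a large transient allocation for essentially no speed penalty.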