
Overhead scales non-monotonically with pmap batch_size #311

@Socob


I’ve noticed that for large values of pmap’s batch_size parameter, the progress meter overhead becomes large. This is especially significant because the cases where you’d want to increase batch_size are the ones where individual items are cheap to compute, and there the overhead ends up dominating the actual computation.

Example:

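using Distributed
addprocs(6)  # start 6 local worker processes
@everywhere using ProgressMeter  # load ProgressMeter on every process
# Sweep 20 logarithmically spaced batch sizes from 1 to 10_000
for batch_size in round.(Int, 10 .^ range(0, 4, length=20))
    @show batch_size
    # Each item does no work, so the run time is essentially all overhead
    @showprogress pmap(i -> nothing, 1:200_000; batch_size);
end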

[Figure: plot of execution time vs. batch_size]

So it looks like (at least in this case) there’s a “sweet spot” around batch_size=20. I found this surprising, and I’m not sure what the reason is. It would be good if this behaved better for large batch_size.
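One way to narrow this down might be to time the same pmap call with and without the progress meter, so that the meter's share can be separated from pmap's own batching cost. The following is only a sketch: the helper name `time_pmap`, the choice of 2 workers, and the four batch sizes are illustrative assumptions, not part of the original report. No expected timings are given, since they depend on the machine.

```julia
using Distributed
addprocs(2)  # assumption: 2 local workers (the report above used 6)
@everywhere using ProgressMeter

# Hypothetical helper: run the same no-op pmap once without and once with
# the progress meter, and return both wall-clock times in seconds.
function time_pmap(n, batch_size)
    t_plain = @elapsed pmap(i -> nothing, 1:n; batch_size)
    t_meter = @elapsed @showprogress pmap(i -> nothing, 1:n; batch_size)
    return (; batch_size, t_plain, t_meter)
end

time_pmap(100, 10)  # warm-up run so compilation time is not measured

# Illustrative subset of batch sizes; the difference t_meter - t_plain
# approximates the overhead attributable to the progress meter.
for batch_size in (1, 20, 1_000, 10_000)
    r = time_pmap(200_000, batch_size)
    println(r, "  overhead = ", r.t_meter - r.t_plain)
end
```

If the difference grows with batch_size while t_plain keeps shrinking, that would point at the meter's update path rather than at pmap itself.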
