Consider partitioning eventspaces when using Subgroups #284

@denisalevi


Using Subgroups in Brian2CUDA can be rather costly because, for each neuron in the eventspace, we need to check whether it belongs to the subgroup during spike propagation, synaptic effect application, and spike and rate monitoring (see #283). I'm wondering if we could get around all those issues by partitioning eventspaces based on subgroups, with an additional counter per subgroup. This could e.g. be achieved by splitting CUDA blocks in the thresholder across subgroups. Synapse computations could then be performed based only on a subgroup's partition. Spike recorders could directly copy the spikes from a single partition, and rate recorders would have access to the number of spikes in their partition.
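To make the idea concrete, here is a minimal host-side Python sketch of a partitioned eventspace. It is only an illustration, not Brian2CUDA code: the function names, the layout (each subgroup's partition occupies the eventspace slots `[start, stop)` of that subgroup, unused slots hold -1) and the assumption that subgroups are contiguous and non-overlapping are all hypothetical choices made for this example.

```python
def threshold_partitioned(spiked, subgroups):
    """Fill a partitioned eventspace (hypothetical layout).

    spiked    -- list of bools, one per neuron (True if the neuron spiked)
    subgroups -- list of (start, stop) index ranges, assumed contiguous
                 and non-overlapping
    Returns (eventspace, counters), where counters[k] is the number of
    spikes in subgroup k and unused eventspace slots are -1.
    """
    eventspace = [-1] * len(spiked)
    counters = [0] * len(subgroups)
    for k, (start, stop) in enumerate(subgroups):
        write = start  # next free slot inside partition k
        for neuron in range(start, stop):
            if spiked[neuron]:
                eventspace[write] = neuron
                write += 1
        counters[k] = write - start
    return eventspace, counters


def spikes_of_subgroup(eventspace, counters, subgroups, k):
    """Return the spiking neuron IDs of subgroup k without any membership test."""
    start, _ = subgroups[k]
    return eventspace[start:start + counters[k]]


spiked = [False, True, True, False, True, False]
subgroups = [(0, 3), (3, 6)]
eventspace, counters = threshold_partitioned(spiked, subgroups)
```

With this layout, a spike recorder for subgroup `k` would copy `counters[k]` entries starting at the partition's first slot, and a rate recorder would only need `counters[k]`.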

This should probably be optional per NeuronGroup, something like a partition_by_subgroups flag, because while it might make subgroup computations faster, it would likely slow down computations on the full NeuronGroup:

  • For no or homogeneous delays, we call the effect application kernel with as many blocks as there are spiking neurons, and each block reads one spiking neuron ID. If we partition the eventspaces, each block would have to read all counters to determine the location of its neuron ID. The memory reads would be broadcast to all threads of the block, but with many subgroups each block would still have to perform one global memory read per counter. This would require some benchmarking to find out whether it makes sense.
  • We wouldn't be able to speed up the resetter as suggested in #272 ("Call reset kernel only with as many threads as there are spiking neurons (not as there are neurons in total)").
  • Spikemonitors would first have to get rid of all the -1 values in the eventspace (or we would have to perform costly copying of the eventspace during thresholding to remove all -1 values there).
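The first and last points can be sketched in plain Python as follows. This is again only an illustration under assumed conventions (per-subgroup partitions stored at the subgroup's `(start, stop)` slots, -1 marking unused slots); the helper names are hypothetical and not part of Brian2CUDA. `neuron_id_for_block` mimics what a single CUDA block would have to do on the full NeuronGroup: walk over the counters until it finds the partition containing its spike. `compact` mimics the extra pass a spike monitor on the full group would need.

```python
def neuron_id_for_block(block_idx, eventspace, counters, subgroups):
    """Find the spiking neuron ID handled by block `block_idx`.

    Walks the per-subgroup counters (one read per subgroup in the worst
    case) to locate the partition that holds the block's spike.
    """
    remaining = block_idx
    for k, count in enumerate(counters):
        if remaining < count:
            start, _ = subgroups[k]
            return eventspace[start + remaining]
        remaining -= count
    raise IndexError("block index exceeds total number of spikes")


def compact(eventspace):
    """Drop the -1 padding so the spikes of the full group are contiguous."""
    return [neuron for neuron in eventspace if neuron != -1]


# Hypothetical partitioned eventspace: two subgroups of three neurons each.
eventspace = [1, 2, -1, 4, -1, -1]
counters = [2, 1]
subgroups = [(0, 3), (3, 6)]

ids = [neuron_id_for_block(b, eventspace, counters, subgroups)
       for b in range(sum(counters))]
```

The loop over `counters` is the cost discussed in the first point: it grows with the number of subgroups, whereas an unpartitioned eventspace allows a direct lookup by block index.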

Alternatively, one could just encourage using multiple NeuronGroups instead of Subgroups in Brian2CUDA. If concurrent kernel execution is implemented, that might simply be the easier way, and it would produce exactly the partitioning I'm talking about. With too many NeuronGroups, though, this might increase compile times significantly as long as we keep one source file per codeobject. This would be especially relevant for denisalevi/brian2-network-multiplier.
