Right now the draw kernel is called once per source. The sources should instead be batched into a single kernel call, which avoids the per-call overhead and should improve performance.
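As a rough illustration of the proposed change, here is a minimal pure-Python sketch. It is not the project's actual code: `Source`, `Canvas`, `draw_kernel`, `draw_per_source`, and `draw_batched` are hypothetical names, and the "kernel" is a plain function that counts its own invocations. The sketch assumes the kernel can accept arrays of per-source parameters, so that one call covers all sources.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Source:
    # Hypothetical per-source parameters; the real kernel may take others.
    x: int
    y: int
    intensity: float


class Canvas:
    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        self.pixels: List[float] = [0.0] * (width * height)
        self.kernel_calls = 0  # counts how often the kernel is invoked


def draw_kernel(canvas: Canvas, xs: Sequence[int], ys: Sequence[int],
                intensities: Sequence[float]) -> None:
    """Stand-in for the draw kernel: one call processes N sources."""
    canvas.kernel_calls += 1
    for x, y, value in zip(xs, ys, intensities):
        # Skip sources that fall outside the canvas.
        if 0 <= x < canvas.width and 0 <= y < canvas.height:
            canvas.pixels[y * canvas.width + x] += value


def draw_per_source(canvas: Canvas, sources: Sequence[Source]) -> None:
    """Current behaviour: one kernel call for every source."""
    for s in sources:
        draw_kernel(canvas, [s.x], [s.y], [s.intensity])


def draw_batched(canvas: Canvas, sources: Sequence[Source]) -> None:
    """Proposed behaviour: pack all sources and call the kernel once."""
    if not sources:
        return  # nothing to draw, so no kernel call at all
    xs = [s.x for s in sources]
    ys = [s.y for s in sources]
    intensities = [s.intensity for s in sources]
    draw_kernel(canvas, xs, ys, intensities)


if __name__ == "__main__":
    sources = [Source(1, 1, 0.5), Source(2, 3, 1.0), Source(1, 1, 0.25)]
    per_source, batched = Canvas(4, 4), Canvas(4, 4)
    draw_per_source(per_source, sources)
    draw_batched(batched, sources)
    print(per_source.kernel_calls, batched.kernel_calls)
    print(per_source.pixels == batched.pixels)
```

Both paths produce the same pixels, but the batched path invokes the kernel once regardless of how many sources there are. In a real GPU implementation the packed parameter arrays would typically be uploaded to device memory once before the single launch, and the kernel would need to handle sources that write to the same pixel.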