Great library and excellent performance!
I propose that the predict method of SpanMarkerModel pass through the document ID (and sentence ID) it receives as input. This would simplify workflows, because each predicted entity could then be rejoined with other data directly on those IDs.
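To illustrate what I mean, here is a minimal sketch of the pass-through done outside the model. It does not call SpanMarker itself: `fake_predict` is a hypothetical stand-in for the model's predict call, and the `document_id`, `sentence_id`, and `text` keys, as well as the shape of the returned entity dicts, are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List

Entity = Dict[str, Any]
Row = Dict[str, Any]


def predict_with_ids(
    predict_fn: Callable[[List[str]], List[List[Entity]]],
    rows: List[Row],
) -> List[Entity]:
    """Run predict_fn on each row's text and copy the row's IDs onto every entity.

    Assumes predict_fn returns one list of entity dicts per input text,
    in the same order as the inputs.
    """
    texts = [row["text"] for row in rows]
    batch_predictions = predict_fn(texts)

    results: List[Entity] = []
    for row, entities in zip(rows, batch_predictions):
        for entity in entities:
            results.append(
                {
                    "document_id": row["document_id"],
                    "sentence_id": row["sentence_id"],
                    **entity,
                }
            )
    return results


def fake_predict(texts: List[str]) -> List[List[Entity]]:
    """Hypothetical stand-in for a model: tags every capitalized word."""
    return [
        [{"span": word, "label": "CAP"} for word in text.split() if word[0].isupper()]
        for text in texts
    ]


rows = [
    {"document_id": 0, "sentence_id": 0, "text": "Paris is in France"},
    {"document_id": 0, "sentence_id": 1, "text": "it rains"},
    {"document_id": 1, "sentence_id": 0, "text": "Tom left"},
]

results = predict_with_ids(fake_predict, rows)
for entity in results:
    print(entity)
```

With the IDs attached to each entity, joining the output back to other per-document or per-sentence data becomes a plain key lookup. If predict did this itself, a wrapper like the one above would not be needed.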
I am already working on a pull request for this.