SpanMarkerModel.predict(dataset) should also output document_id and sentence_id

Great library and excellent performance!

I would propose that using the `predict` method of `SpanMarkerModel` should also pass through the document ID (and sentence ID) that it received as input. This would make workflows easier, because then the output of the model can easily be rejoined with other data.

I am already working on a pull request for this one