The danafarber datasets (and others) each contain only a single category, so make_ngrams_dataset does not need to return (X, Y) for them; returning just X is enough.
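
A minimal sketch of one way this could look, assuming labels are passed as an optional argument. The signature, the `labels` and `n` parameters, and the character n-gram featurization are all hypothetical; the real make_ngrams_dataset may differ.

```python
from collections import Counter


def make_ngrams_dataset(docs, labels=None, n=2):
    """Hypothetical sketch, not the real implementation.

    Builds character n-gram counts per document (an assumed featurization).
    Returns X alone when no labels are given (single-category datasets),
    otherwise returns (X, Y).
    """
    # One Counter of n-gram frequencies per document.
    X = [
        Counter(doc[i:i + n] for i in range(len(doc) - n + 1))
        for doc in docs
    ]
    if labels is None:
        # Single-category case: there is nothing to predict, so no Y.
        return X
    if len(labels) != len(docs):
        raise ValueError("labels must have one entry per document")
    return X, list(labels)


# Single-category dataset: only X comes back.
X_only = make_ngrams_dataset(["abab", "abc"])

# Labeled dataset: the usual (X, Y) pair.
X, Y = make_ngrams_dataset(["abab", "abc"], labels=[0, 1])
```

A return type that changes with the arguments is easy to misuse, so an alternative would be to always return (X, Y) with Y set to None for single-category data.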