Extracting "intermediate", parsed data as a flat Table? #38

ddofer · 2022-08-28T09:26:39Z

Hi! Great dataset!
I am interested in experimenting with it for my own work, as well as comparing ML approaches on it. I want to get the data as a table (amenable to pandas and the like) while keeping the "raw" data: the raw text, the labels, a marker for which source each row comes from, the dates, the reviewer # as an ID column, etc.
I know the pipeline munges these features, but the output is too processed for my purposes. Where should I look in the code to get the intermediate outputs?

E.g. a CSV of all reviews and texts, with the raw variables (each in its own column), across all the datasets, along with the train/test splits?
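
To make the request concrete, here is a minimal sketch of the table shape I have in mind. The column names (`source`, `split`, `reviewer_id`, `date`, `text`, `label`) and the example rows are made up for illustration; they are assumptions, not names taken from the repository's code.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class ReviewRow:
    """One review per row. All field names here are hypothetical."""
    source: str          # which dataset the row came from
    split: str           # "train" or "test"
    reviewer_id: str     # reviewer number kept as an ID column
    date: Optional[str]  # ISO date string, or None if the source has no date
    text: str            # raw, unprocessed review text
    label: str           # raw label as given in the source


def rows_to_csv(rows: List[ReviewRow]) -> str:
    """Serialize rows to CSV text with one column per raw variable."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(ReviewRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


# Made-up rows, only to show the layout.
example_rows = [
    ReviewRow("dataset_a", "train", "r001", "2020-01-15",
              "Raw review text, untouched.", "positive"),
    ReviewRow("dataset_b", "test", "r002", None,
              "Another raw review.", "negative"),
]
csv_text = rows_to_csv(example_rows)
print(csv_text)
```

A file in this shape can be loaded directly with `pandas.read_csv`, and filtering on `source` or `split` would recover the individual datasets and the train/test partitions.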

Thanks!
