Dataset layout? #16

@hughperkins

Some questions on the data:

  • Is it fair to say that each line of the reviewsx... files, for the beer reviews, is laid out as follows?
[look] [smell] [feel] [taste] [overall]        [input words ...]

(I obtained this order from the 'deep brown color with a thin tan head that quickly dissipated' review, by comparing the numbers in the dataset with the ratings shown on the page at https://www.beeradvocate.com/beer/profile/144/30806/?ba=Will_Turner .)

  • Why are the datasets broken down into 'aspect1', 'aspect2', etc.?
    • Is each of these the result of the decorrelation described in section 5.1, 'Dataset', for that specific aspect?
    • Can I assume that aspect1 is the first aspect, in the order laid out inside the files, i.e. [look]?
    • Is this also true for 2 and 3, i.e.:
      • aspect2 is [smell], and
      • aspect3 is [feel]?
  • Which word vectors are you using? They look 200-dimensional. Maybe the 200-dimensional GloVe vectors from https://nlp.stanford.edu/projects/glove/, i.e. http://nlp.stanford.edu/data/glove.6B.zip ?
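
To make the first two questions concrete, here is a minimal sketch of how a line would be parsed if my guess about the layout is right. Everything in it is an assumption pending confirmation: the column order, the whitespace separation between scores and words, the aspectN-to-column mapping, and the names `parse_review_line` and `aspect_name`, which are hypothetical helpers and not part of the repository. The scores in the example line are made up for illustration and are not taken from the dataset.

```python
from typing import Dict, List, NamedTuple

# Assumed score order, taken from the question above; not confirmed.
ASPECT_NAMES = ["look", "smell", "feel", "taste", "overall"]


class Review(NamedTuple):
    scores: Dict[str, float]  # aspect name -> score
    words: List[str]          # review tokens following the scores


def parse_review_line(line: str) -> Review:
    """Split one line into five leading scores and the remaining words."""
    tokens = line.split()  # splits on any run of whitespace (spaces or tabs)
    n = len(ASPECT_NAMES)
    if len(tokens) < n:
        raise ValueError(f"expected at least {n} scores, got {len(tokens)} tokens")
    scores = {name: float(tok) for name, tok in zip(ASPECT_NAMES, tokens[:n])}
    return Review(scores=scores, words=tokens[n:])


def aspect_name(aspect_number: int) -> str:
    """Map the 1-based N of an 'aspectN' dataset to the assumed score column."""
    return ASPECT_NAMES[aspect_number - 1]


# Hypothetical line: made-up scores followed by the start of the review text.
example = "0.8 0.6 0.7 0.9 0.8\tdeep brown color with a thin tan head"
review = parse_review_line(example)
print(review.scores["look"], review.words[:3])
print(aspect_name(1), aspect_name(2), aspect_name(3))
```

If the real column order or the aspectN numbering differs, only `ASPECT_NAMES` would need to change.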
