- OCTIS version: 1.11.0
- Python version: 3.8
- Operating System: Windows 10
Description
Hello,
I am having trouble loading my custom dataset. I followed the guide in the main README, but I get the errors below.
What I Did
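from octis.dataset.dataset import Dataset
import pandas as pd
# Export the 'documents' column as a headerless TSV named corpus.tsv
df = pd.read_csv("/mnt/mydata/notebooks/data.csv")
df.to_csv('corpus.tsv', sep="\t", header=False, columns=['documents'])
# Load the folder as a custom OCTIS dataset
dataset = Dataset()
dataset.load_custom_dataset_from_folder("/mnt/mydata/notebooks")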
/opt/conda/lib/python3.8/site-packages/octis/dataset/dataset.py:330: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
final_df = df[df[1] == 'train'].append(df[df[1] == 'val'])
/opt/conda/lib/python3.8/site-packages/octis/dataset/dataset.py:331: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
final_df = final_df.append(df[df[1] == 'test'])
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/opt/conda/lib/python3.8/site-packages/octis/dataset/dataset.py in load_custom_dataset_from_folder(self, path, multilabel)
335
--> 336 self.__corpus = [d.split() for d in final_df[0].tolist()]
337 if len(final_df.keys()) > 2:
/opt/conda/lib/python3.8/site-packages/octis/dataset/dataset.py in <listcomp>(.0)
335
--> 336 self.__corpus = [d.split() for d in final_df[0].tolist()]
337 if len(final_df.keys()) > 2:
AttributeError: 'int' object has no attribute 'split'
During handling of the above exception, another exception occurred:
Exception Traceback (most recent call last)
<ipython-input-16-28e6bd2fc3cd> in <module>
1 dataset = Dataset()
----> 2 dataset.load_custom_dataset_from_folder("/mnt/mydata/notebooks")
/opt/conda/lib/python3.8/site-packages/octis/dataset/dataset.py in load_custom_dataset_from_folder(self, path, multilabel)
356 self._load_document_indexes(self.dataset_path + "/indexes.txt")
357 except:
--> 358 raise Exception("error in loading the dataset:" + self.dataset_path)
359
360 def fetch_dataset(self, dataset_name, data_home=None, download_if_missing=True):
Exception: error in loading the dataset:/mnt/mydata/notebooks
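One thing worth ruling out, given the AttributeError above: the inner error says that an entry in the first column of corpus.tsv was read as an int, and DataFrame.to_csv writes the row index as the first column unless index=False is passed. The sketch below is not a confirmed diagnosis. It only shows how that default changes what ends up in column 0. The sample documents are made up, and reading the file back with header=None is an assumption about how the loader parses corpus.tsv (suggested by the integer column labels df[1] and final_df[0] in the traceback).

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; only the 'documents' column name
# comes from the report above, the rows are invented for illustration.
df = pd.DataFrame(
    {"documents": ["first example document", "second example document"]}
)


def first_column_values(index):
    """Write df as a headerless TSV, read it back, return column 0 as a list."""
    buf = io.StringIO()
    df.to_csv(buf, sep="\t", header=False, columns=["documents"], index=index)
    buf.seek(0)
    # Assumption: the loader reads the TSV without a header row.
    loaded = pd.read_csv(buf, sep="\t", header=None)
    return loaded[0].tolist()


# pandas default (index=True), as in the to_csv call from the report:
# column 0 holds the integer row index, not the text.
with_index = first_column_values(index=True)

# With index=False, column 0 holds the document strings.
without_index = first_column_values(index=False)

print(with_index)
print(without_index)
```

If this turns out to be the cause, passing index=False to the to_csv call in the snippet above would put the document text in the first column. Note also that the snippet writes corpus.tsv to the current working directory, so it only reaches the loader if that directory is /mnt/mydata/notebooks.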