- Python 3.8 or higher
- pip (Python package installer)
- Clone the repository:

  ```bash
  git clone https://github.com/mabonmn/TANGO.git
  cd TANGO
  ```
- Create a virtual environment and activate it:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Launch Jupyter Notebook:

  ```bash
  jupyter notebook
  ```
- Open the desired notebook (e.g., `Dev_Basic.ipynb`, `scraper.ipynb`) and run the cells.
- Run the Python script directly:

  ```bash
  python scraper.py
  ```
- Purpose: This notebook demonstrates basic development and testing of the core functionalities.
- Sections:
- Data Loading
- Data Preprocessing
- Model Training
- Evaluation
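The four sections above can be sketched as one minimal, self-contained pipeline. This is illustrative only, not the notebook's actual code: the in-memory sample data and the toy majority-class "model" are stand-ins for whatever `Dev_Basic.ipynb` really trains.

```python
import csv
import io

# --- Data Loading: parse CSV rows (an in-memory sample stands in for a file) ---
sample_csv = "text,label\nGood movie,1\nBad film,0\nGreat plot,1\n"
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# --- Data Preprocessing: normalize the raw text ---
def preprocess(text):
    return text.lower().strip()

data = [(preprocess(r["text"]), int(r["label"])) for r in rows]

# --- Model Training: a toy majority-class baseline, for illustration only ---
labels = [y for _, y in data]
majority = max(set(labels), key=labels.count)

# --- Evaluation: accuracy of the baseline on the same data ---
accuracy = sum(y == majority for _, y in data) / len(data)
```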
- Purpose: This script is used to scrape data from the Wikipedia English corpus and save it as a CSV file.
- Usage:

  ```bash
  python scraper.py
  ```
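The general shape of the scraper's output step looks like the sketch below. The fetching itself (which `scraper.py` does against the English Wikipedia corpus) is stubbed out with sample data here, and the `save_articles_to_csv` helper is hypothetical, not taken from the script:

```python
import csv
import os
import tempfile

# Hypothetical output step: write (title, text) pairs to a CSV file.
# The actual Wikipedia fetching in scraper.py is replaced by sample data.
def save_articles_to_csv(articles, path):
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "text"])  # header row
        writer.writerows(articles)

# Stand-in for scraped content:
sample = [("Python (programming language)", "Python is a programming language.")]
out_path = os.path.join(tempfile.gettempdir(), "wiki_sample.csv")
save_articles_to_csv(sample, out_path)
```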
- Ensure you have the original dataset in the `dataset` directory with the name `train.csv`.
- Run the script:

  ```bash
  python runDataGen.py
  ```
- The script will generate augmented datasets and save them to `dataset/dataset_aug_train_all_new.csv`.
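As an illustration of what sentence augmentation means here, the toy transform below perturbs a sentence by swapping two adjacent words. It is only in the spirit of `runDataGen.py`; the script's actual augmentation methods are not shown in this README:

```python
import random

# Toy augmentation (illustrative, not runDataGen.py's method): swap two
# adjacent words to create a perturbed copy of a training sentence.
def swap_augment(sentence, rng):
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

rng = random.Random(0)  # fixed seed for reproducibility
augmented = [swap_augment(s, rng) for s in ["the quick brown fox", "hello"]]
```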
- Ensure you have the augmented dataset generated by `runDataGen.py` in the `dataset` directory with the name `dataset_aug_train_all_new.csv`.
- Run the script:

  ```bash
  python bertEval.py
  ```
- The script will evaluate the quality of sentence augmentations and save the results to `dataset/BERTEval.csv`.
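The scoring pattern behind such an evaluation can be sketched with a simple similarity measure between an original sentence and its augmentation. The sketch below uses cosine similarity over bag-of-words counts purely for illustration; `bertEval.py` presumably uses contextual BERT embeddings instead:

```python
import math
from collections import Counter

# Illustrative stand-in for BERT-based scoring: cosine similarity over
# plain word counts (no embeddings, no model download needed).
def cosine_similarity(a, b):
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0

# A word-order augmentation keeps the same word counts, so it scores 1.0
# under this crude metric -- exactly the kind of blind spot a contextual
# BERT-based evaluation is meant to avoid.
score = cosine_similarity("the quick brown fox", "the brown quick fox")
```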
- Purpose: This file contains augmented training data for the model.
- Usage: Load this CSV file into your data processing pipeline to train the model with augmented data.
- Purpose: This file contains the original augmented training data.
- Usage: Similar to the clean version, but this file may contain raw, unprocessed entries.
- Purpose: This file contains the original training data.
- Usage: Use this file for initial training and testing of the model.
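Loading any of these CSVs into a pipeline follows the same pattern. In the sketch below the column names (`text`, `label`) are assumptions, so check the actual header of `dataset/train.csv`; an in-memory sample stands in for the real file:

```python
import csv
import io

# Assumed columns "text" and "label" -- verify against the real CSV header.
sample = io.StringIO("text,label\nAn example sentence,1\n")
rows = [(r["text"], int(r["label"])) for r in csv.DictReader(sample)]
```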