
Commit
Merge pull request #52 from jmisilo/51-retraining
51 retraining
jmisilo authored Nov 6, 2022
2 parents e7519d4 + 2b22c04 commit ef9922b
Showing 10 changed files with 14 additions and 11 deletions.
Binary file modified examples/23012796.jpg
Binary file added examples/36979.jpg
Binary file removed examples/3787801.jpg
Binary file removed examples/7757242158.jpg
Binary file added examples/89407459.jpg
Binary file added examples/loss_lr.jpg
17 changes: 10 additions & 7 deletions readme.md
@@ -10,17 +10,17 @@ The Model uses prefixes as in the [ClipCap](https://arxiv.org/abs/2111.09734) pa

The Model was trained with a frozen CLIP, a fully trained Mapping Module (6x Transformer Encoder Layers) and with partially frozen GPT-2 (the first and last 14 layers were trained).

-The training process was carried out using the [Kaggle](https://www.kaggle.com/) P100 GPU. Training time is about 2 x 11h (106 epochs) with a linearly changing learning rate (from 0 to 0.0001908) and batch size 64. Originally, the Model was supposed to be trained longer - which results in a non-standard LR. *I also tried a longer training session (150 epochs), but overtraining was noticeable.*
+The training process was carried out using the [Kaggle](https://www.kaggle.com/) P100 GPU. Training time - about 3 x 11h (150 epochs) with a linear learning rate warmup (max LR `3e-3`) and batch size 64.

-### Example results
-
-![Example1](./examples/23012796.jpg)
-
-![Example2](./examples/3787801.jpg)
-
-![Example3](./examples/7757242158.jpg)
-
-As I said, the goal was to test the Model's ability to recognize the situation. In the next phase of the experiments, I will try to improve the Model process and parameters to achieve better captions with the same dataset.
+#### Loss and Learning Rate during training
+
+![LOSSxLR](./examples/loss_lr.jpg)
+
+### Example results
+
+![Example1](./examples/23012796.jpg)
+![Example2](./examples/36979.jpg)
+![Example3](./examples/89407459.jpg)

### Usage

@@ -36,7 +36,10 @@ Create environment and install requirements:

```bash
python -m venv venv
+# for windows
+.\venv\Scripts\activate
+# for linux/mac
source venv/bin/activate

pip install -r requirements.txt
```
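The linear learning rate warmup described in the updated readme (LR climbing from 0 to a maximum of `3e-3`) can be sketched in plain Python. The `warmup_steps` value below is a hypothetical illustration, not taken from the repository:

```python
def linear_warmup_lr(step, max_lr=3e-3, warmup_steps=1000):
    """Linear warmup: LR rises from 0 to max_lr over warmup_steps, then holds.

    warmup_steps is an assumed value for illustration; the actual schedule
    is defined in the repository's training code.
    """
    return max_lr * min(step / warmup_steps, 1.0)

print(linear_warmup_lr(0))     # 0.0 at the start of training
print(linear_warmup_lr(500))   # halfway through warmup: half of max LR
print(linear_warmup_lr(5000))  # past warmup: capped at max_lr
```

In PyTorch, the same shape is typically achieved by passing a lambda like this to `torch.optim.lr_scheduler.LambdaLR`.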
2 changes: 1 addition & 1 deletion src/training.py
@@ -90,7 +90,7 @@
start_epoch, total_train_loss, total_valid_loss = (
    load_ckp(ckp_path, model, optimizer, scheduler, scaler, device)
    if os.path.isfile(ckp_path) else
-    0, [], []
+    (0, [], [])
)

# build train model process with experiment tracking from wandb
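The one-character-pair fix above matters because Python's conditional expression binds tighter than the comma. A minimal sketch (the `load_state` helper and its dummy `load` default are hypothetical, standing in for `load_ckp`):

```python
def load_state(ckp_exists, load=lambda: (5, [1.0], [2.0])):
    # Fixed form: parentheses make the else-branch a single 3-tuple.
    start_epoch, train_loss, valid_loss = (
        load()
        if ckp_exists else
        (0, [], [])
    )
    return start_epoch, train_loss, valid_loss

# Without the parentheses, `x if cond else 0, [], []` parses as a 3-tuple
# whose FIRST element is the conditional -- not a choice between two
# 3-tuples -- so the loaded state ends up nested in position 0:
buggy = (load_state(True) if True else 0, [], [])
print(buggy)  # ((5, [1.0], [2.0]), [], [])
```

With the original spelling, unpacking silently assigned the whole loaded tuple to `start_epoch` and empty lists to the loss histories.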
4 changes: 2 additions & 2 deletions src/utils/config.py
@@ -15,8 +15,8 @@ class Config:
num_workers: int = 2
train_size: int = 0.84
val_size: int = 0.13
-    epochs: int = 200
-    lr: int = 6e-3
+    epochs: int = 150
+    lr: int = 3e-3
k: float = 0.33
batch_size_exp: int = 6
ep_len: int = 4
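Reading the updated values back as a self-contained sketch (the interpretation of `batch_size_exp` as a power-of-two exponent is an assumption, inferred from the readme's batch size 64 = 2**6; the `lr` annotation is written as `float` here, though the repo annotates it `int` even though `3e-3` is a float):

```python
from dataclasses import dataclass

@dataclass
class Config:
    # Mirrors the updated values in src/utils/config.py.
    epochs: int = 150
    lr: float = 3e-3
    batch_size_exp: int = 6  # assumed: batch size = 2 ** batch_size_exp

cfg = Config()
print(cfg.epochs, cfg.lr, 2 ** cfg.batch_size_exp)  # 150 0.003 64
```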
2 changes: 1 addition & 1 deletion src/utils/load_ckp.py
@@ -29,4 +29,4 @@ def download_weights(checkpoint_fpath):
Downloads weights from Google Drive.
'''

-    gdown.download('https://drive.google.com/uc?id=1lEufQVOETFEIhPdFDYaez31uroq_5Lby', checkpoint_fpath, quiet=False)
+    gdown.download('https://drive.google.com/uc?id=10ieSMMJzE9EeiPIF3CMzeT4timiQTjHV', checkpoint_fpath, quiet=False)
