Description
Good day,
I am trying to train on my own dataset, as in issue #215. I opted to load the weights from the model trained on the crowdAI dataset and then continue training on my own images from there.
Using issue #160 as reference, I loaded the weights from best.torch.
(By the way, is it correct to use self.load('.../experiments/mapping_challenge_baseline/checkpoints/unet/best.torch')?)
I also set self._initialize_model_weights = None.
However, this raised an error: 'module' object has no attribute '_rebuild_tensor_v2'
I was able to fix that via this thread.
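For reference, the workaround commonly suggested for this error is to backfill the missing function before the checkpoint is loaded. Below is a minimal sketch, assuming an older PyTorch build that still exposes torch._utils._rebuild_tensor. The helper name backfill_rebuild_tensor_v2 and the stand-in module are my own illustration, not part of PyTorch or this repository.

```python
import types


def backfill_rebuild_tensor_v2(utils_module):
    """Add `_rebuild_tensor_v2` to `utils_module` if it is missing.

    Returns True if the function was added, False if it already existed.
    (Hypothetical helper name, used here for illustration only.)
    """
    if hasattr(utils_module, "_rebuild_tensor_v2"):
        return False

    def _rebuild_tensor_v2(storage, storage_offset, size, stride,
                           requires_grad, backward_hooks):
        # Delegate to the older constructor, then set the two extra fields
        # that checkpoints saved by newer PyTorch versions carry.
        tensor = utils_module._rebuild_tensor(storage, storage_offset, size, stride)
        tensor.requires_grad = requires_grad
        tensor._backward_hooks = backward_hooks
        return tensor

    utils_module._rebuild_tensor_v2 = _rebuild_tensor_v2
    return True


# Stand-in module so the sketch runs without PyTorch installed.
fake_utils = types.ModuleType("fake_utils")
fake_utils._rebuild_tensor = lambda storage, offset, size, stride: types.SimpleNamespace()

patched = backfill_rebuild_tensor_v2(fake_utils)
print(patched)  # prints True

# With real PyTorch, this would run before loading the checkpoint:
#   import torch._utils
#   backfill_rebuild_tensor_v2(torch._utils)
```

The patch only adds the function when it is absent, so it should be harmless on newer PyTorch versions that already define it.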
Another error then occurred, which I also fixed via this thread.
Now, running python main.py train --pipeline_name unet_weighted no longer throws any errors, but training does not seem to start at all (epoch 0 is never printed).
Here is the full printout of the console:
/home/USER/Developer/anaconda3/envs/mapping/lib/python3.6/site-packages/sklearn/externals/joblib/__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
warnings.warn(msg, category=DeprecationWarning)
/home/USER/Developer/ML/open-solution-mapping-challenge/src/utils.py:132: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
config = yaml.load(f)
SHOW-325
https://ui.neptune.ml/shared/showroom/e/SHOW-325
2019-09-26 01-18-48 mapping-challenge >>> training
2019-09-26 01-18-54 steps >>> step xy_train adapting inputs
2019-09-26 01-18-54 steps >>> step xy_train transforming...
2019-09-26 01-18-54 steps >>> step xy_inference adapting inputs
2019-09-26 01-18-54 steps >>> step xy_inference transforming...
2019-09-26 01-18-54 steps >>> step loader adapting inputs
2019-09-26 01-18-54 steps >>> step loader transforming...
2019-09-26 01-18-54 steps >>> step unet unpacking inputs
2019-09-26 01-18-54 steps >>> step unet loading transformer...
2019-09-26 01-18-55 steps >>> step unet transforming...
2019-09-26 01-18-58 steps >>> step mask_resize adapting inputs
2019-09-26 01-18-58 steps >>> step mask_resize transforming...
100%|##########| 16/16 [00:01<00:00, 8.38it/s]
2019-09-26 01-18-59 steps >>> step mask_resize caching outputs...
2019-09-26 01-18-59 steps >>> step category_mapper adapting inputs
2019-09-26 01-18-59 steps >>> step category_mapper transforming...
100%|##########| 16/16 [00:00<00:00, 1761.53it/s]
2019-09-26 01-19-00 steps >>> step mask_erosion adapting inputs
2019-09-26 01-19-00 steps >>> step mask_erosion transforming...
100%|##########| 16/16 [00:00<00:00, 136956.87it/s]
2019-09-26 01-19-00 steps >>> step labeler adapting inputs
2019-09-26 01-19-00 steps >>> step labeler transforming...
100%|##########| 16/16 [00:00<00:00, 132.53it/s]
2019-09-26 01-19-00 steps >>> step mask_dilation adapting inputs
2019-09-26 01-19-00 steps >>> step mask_dilation transforming...
100%|##########| 16/16 [00:00<00:00, 92.15it/s]
2019-09-26 01-19-00 steps >>> step mask_resize loading output...
2019-09-26 01-19-00 steps >>> step score_builder adapting inputs
2019-09-26 01-19-00 steps >>> step score_builder transforming...
100%|##########| 16/16 [00:00<00:00, 18.44it/s]
2019-09-26 01-19-01 steps >>> step output adapting inputs
2019-09-26 01-19-01 steps >>> step output transforming...
(mapping) USER@debian:~/Developer/ML/open-solution-mapping-challenge$
No errors are reported, but training does not seem to start. Do you have any idea why this is the case? Thank you.