Training data:
Input: 13-band image of Slovenia; here is a 3-band (RGB) example:
Output: 3-band (RGB) image of Slovenia, colored by 9 classes according to the legend:
Architecture:
The image is split into 100x100 frames. Additional overlapping frames are taken to increase the amount of training data. Augmentation is optional: when it is enabled, each frame gets 2 additional randomly chosen transformations out of the 7 non-identity elements of D4 (the symmetry group of the square: rotations by 90, 180, and 270 degrees, plus four reflections). The model uses the U-Net architecture.
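The framing and augmentation steps can be sketched roughly as follows. This is a minimal NumPy illustration, not the project's actual code: the function names, the stride of 50 used for overlapping frames, the choice to drop incomplete edge frames, and sampling the 2 transformations without replacement are all assumptions.

```python
import numpy as np

FRAME = 100  # frame side length in pixels, as stated in the text


def split_into_frames(image, frame=FRAME, stride=FRAME):
    """Cut an (H, W, C) array into frame x frame tiles.

    stride == frame gives the plain non-overlapping grid; a smaller
    stride (50 is an assumed value) yields extra overlapping frames.
    Edge pixels that do not fill a whole frame are dropped (assumption).
    """
    height, width = image.shape[:2]
    frames = []
    for top in range(0, height - frame + 1, stride):
        for left in range(0, width - frame + 1, stride):
            frames.append(image[top:top + frame, left:left + frame])
    return frames


def d4_transforms():
    """Return the 7 non-identity elements of D4 as functions on (H, W, C) arrays."""
    # Rotations by 90, 180 and 270 degrees in the image plane.
    rotations = [lambda a, k=k: np.rot90(a, k, axes=(0, 1)) for k in (1, 2, 3)]
    # A horizontal flip followed by each rotation gives the four reflections.
    reflections = [
        lambda a, k=k: np.rot90(np.flip(a, axis=1), k, axes=(0, 1))
        for k in (0, 1, 2, 3)
    ]
    return rotations + reflections


def augment(frame_x, frame_y, rng, n_extra=2):
    """Return n_extra transformed copies of an (input, label) frame pair.

    The same transformation is applied to both arrays so that every
    input pixel stays aligned with its label pixel.
    """
    transforms = d4_transforms()
    # Sampling without replacement is an assumption, not stated in the text.
    chosen = rng.choice(len(transforms), size=n_extra, replace=False)
    return [(transforms[i](frame_x), transforms[i](frame_y)) for i in chosen]
```

With this sketch, a 250x300 image gives 6 frames on the plain grid and 20 frames with a stride of 50; calling `augment(x, y, np.random.default_rng(0))` on a frame pair returns 2 extra pairs.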
After training the model on the data from Slovenia, we run semantic segmentation on Novi Sad:
Overlapping frames only, 160 epochs: accuracy 71%.
Overlapping frames only, 60 epochs: accuracy 81%.
Overlapping frames and augmentation, 160 epochs: accuracy 76%.
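The notes do not say how accuracy is measured. One common choice for semantic segmentation, assumed here purely for illustration, is pixel accuracy: the fraction of pixels whose predicted class matches the reference class. The function name and the use of integer class maps are hypothetical.

```python
import numpy as np


def pixel_accuracy(predicted, reference):
    """Fraction of pixels where the predicted class equals the reference class.

    Both arguments are (H, W) integer class maps. This metric is an
    assumed definition; the notes do not specify the one actually used.
    """
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    if predicted.shape != reference.shape:
        raise ValueError("class maps must have the same shape")
    return float(np.mean(predicted == reference))
```

For example, a 2x2 prediction that matches the reference in 3 of its 4 pixels scores 0.75.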