Fine-tuning CLIP on RSICD
Created on July 22 | Last edited on August 3
Impact of image augmentations
Introducing image augmentations significantly reduces overfitting. The set of image augmentations used for this run is:
Other image augmentation strategies remain to be explored.
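As a starting point for such exploration, here is a minimal, standard-library-only sketch of two common augmentations, a random crop and a random horizontal flip. It is not the augmentation set used in this run; the function names (`random_crop`, `random_horizontal_flip`, `augment`), the nested-list image representation, and the default flip probability are all assumptions made for illustration. In practice an image library would do the same operations on real image tensors.

```python
import random
from typing import List, Optional

# Hypothetical representation: a grayscale image as rows of pixel values.
Image = List[List[int]]


def random_crop(image: Image, crop_h: int, crop_w: int, rng: random.Random) -> Image:
    """Return a crop_h x crop_w window taken at a random position."""
    height, width = len(image), len(image[0])
    if crop_h > height or crop_w > width:
        raise ValueError("crop size must not exceed image size")
    top = rng.randint(0, height - crop_h)   # inclusive bounds
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def random_horizontal_flip(image: Image, p: float, rng: random.Random) -> Image:
    """Mirror the image left-to-right with probability p; always returns a copy."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def augment(image: Image, crop_h: int, crop_w: int,
            flip_p: float = 0.5, rng: Optional[random.Random] = None) -> Image:
    """Apply a random crop followed by a random horizontal flip."""
    rng = rng or random.Random()
    cropped = random_crop(image, crop_h, crop_w, rng)
    return random_horizontal_flip(cropped, flip_p, rng)
```

Applying `augment` freshly at every epoch means the model rarely sees exactly the same pixels twice, which is the usual reason augmentation reduces overfitting on a small dataset.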
Run set: 2 runs
Impact of text augmentations
The RSICD dataset contains 5 captions per image. For many images, however, these are a single caption repeated 5 times. We used back-translation [ref] to make the captions more diverse. Specifically, MarianMT models [links] were used to generate the augmented captions.
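The back-translation idea can be sketched roughly as follows: translate a caption into a pivot language and back, and use the result in place of a repeated caption. This is a hedged illustration, not the code used for these runs. The helper names (`make_marian_translator`, `back_translate`, `diversify_captions`) are hypothetical, and the report does not say which pivot languages or checkpoints were used, so the checkpoint names in the usage comment are assumptions. The MarianMT loader imports `transformers` lazily, so the rest of the sketch works with any translator function.

```python
from typing import Callable, List, Sequence, Tuple

Translator = Callable[[str], str]


def make_marian_translator(model_name: str) -> Translator:
    """Wrap a MarianMT checkpoint as a str -> str function (needs transformers and torch)."""
    from transformers import MarianMTModel, MarianTokenizer  # imported lazily

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    def translate(text: str) -> str:
        batch = tokenizer([text], return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**batch)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

    return translate


def back_translate(caption: str, to_pivot: Translator, from_pivot: Translator) -> str:
    """Translate into the pivot language and back again."""
    return from_pivot(to_pivot(caption))


def diversify_captions(captions: Sequence[str],
                       round_trips: Sequence[Tuple[Translator, Translator]]) -> List[str]:
    """Replace repeated captions with back-translated variants when a new one is found."""
    seen = set()
    result = []
    for caption in captions:
        key = caption.strip().lower()
        if key in seen:
            # Try each pivot language until one yields an unseen caption.
            for to_pivot, from_pivot in round_trips:
                candidate = back_translate(caption, to_pivot, from_pivot)
                candidate_key = candidate.strip().lower()
                if candidate_key not in seen:
                    caption, key = candidate, candidate_key
                    break
            # If no round trip produced anything new, the duplicate is kept.
        seen.add(key)
        result.append(caption)
    return result


# Usage (assumed checkpoints; downloads models on first call):
# en_fr = make_marian_translator("Helsinki-NLP/opus-mt-en-fr")
# fr_en = make_marian_translator("Helsinki-NLP/opus-mt-fr-en")
# diversify_captions(five_captions, [(en_fr, fr_en)])
```

With a single pivot language, each repeated caption yields at most one new variant, so supplying several round trips is what gives more distinct captions per image.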
Run set: 2 runs
Authors
- Dev Vidhani