CoCa CLIP loss conditioning
Created on March 20 | Last edited on April 10
Results
The model was trained from epoch 60 to epoch 76 with CLIP similarity conditioning. Here is how it compares to the normal epoch 76 model (both evaluated at epoch 76):
Normal Zero-Shot ImageNet:
imagenet-zeroshot-val-top1: 0.7182
imagenet-zeroshot-val-top5: 0.9271
improves to
imagenet-zeroshot-val-top1: 0.7234
imagenet-zeroshot-val-top5: 0.9284
MSCOCO caption generation:
{"dataset": "mscoco_captions", "model": "coca_ViT-L-14", "pretrained": "/fsx/iejmac/open_clip_dev/open_clip/src/logs/adpt_coca/epoch_76.pt", "task": "mscoco_generative", "metrics": {"Bleu_1": 0.3084563745235632, "Bleu_2": 0.1880991219962997, "Bleu_3": 0.1151609364894224, "Bleu_4": 0.07211067085076828, "METEOR": 0.1212304498145956, "ROUGE_L": 0.2605558512821119, "CIDEr": 0.34406495909559154, "SPICE": 0.09105146670053557}, "language": "en"}
improves to
{"dataset": "mscoco_captions", "model": "coca_ViT-L-14", "pretrained": "/fsx/iejmac/open_clip_dev/open_clip/src/logs/adpt_coca/adpt_epoch_76.pt", "task": "mscoco_generative", "metrics": {"Bleu_1": 0.3239379083831527, "Bleu_2": 0.1973981451426982, "Bleu_3": 0.12121751751996586, "Bleu_4": 0.07490514239710015, "METEOR": 0.12607178324939913, "ROUGE_L": 0.2671355955745214, "CIDEr": 0.3552257461803831, "SPICE": 0.09305231270489032}, "language": "en"}
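To make the size of these gains easier to read, the numbers above can be turned into absolute and relative changes per metric. The following is a minimal sketch, not part of the original evaluation pipeline: the helper name `metric_deltas` and the dictionary names are hypothetical, and the values are copied by hand from the results reported above.

```python
# Values copied from the results above. "NORMAL" is the normal epoch 76
# checkpoint, "CONDITIONED" is the checkpoint trained with CLIP similarity
# conditioning. All names in this sketch are illustrative assumptions.
IMAGENET_NORMAL = {"top1": 0.7182, "top5": 0.9271}
IMAGENET_CONDITIONED = {"top1": 0.7234, "top5": 0.9284}

COCO_NORMAL = {
    "Bleu_1": 0.3084563745235632, "Bleu_2": 0.1880991219962997,
    "Bleu_3": 0.1151609364894224, "Bleu_4": 0.07211067085076828,
    "METEOR": 0.1212304498145956, "ROUGE_L": 0.2605558512821119,
    "CIDEr": 0.34406495909559154, "SPICE": 0.09105146670053557,
}
COCO_CONDITIONED = {
    "Bleu_1": 0.3239379083831527, "Bleu_2": 0.1973981451426982,
    "Bleu_3": 0.12121751751996586, "Bleu_4": 0.07490514239710015,
    "METEOR": 0.12607178324939913, "ROUGE_L": 0.2671355955745214,
    "CIDEr": 0.3552257461803831, "SPICE": 0.09305231270489032,
}


def metric_deltas(before, after):
    """Return {metric: (absolute_change, relative_change)} for shared metrics."""
    deltas = {}
    for name, old in before.items():
        if name in after:
            change = after[name] - old
            deltas[name] = (change, change / old)
    return deltas


imagenet_deltas = metric_deltas(IMAGENET_NORMAL, IMAGENET_CONDITIONED)
coco_deltas = metric_deltas(COCO_NORMAL, COCO_CONDITIONED)

for title, deltas in (("ImageNet zero-shot", imagenet_deltas),
                      ("MSCOCO captioning", coco_deltas)):
    print(title)
    for name, (absolute, relative) in deltas.items():
        print(f"  {name}: {absolute:+.4f} absolute, {relative:+.2%} relative")
```

Every metric moves in the positive direction. These are single-run comparisons, so the sketch only summarizes the reported numbers and says nothing about run-to-run variance.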