ReviewKD vs ReviewDKD vs DKD

Created on July 9 | Last edited on July 9

Top-1, Top-5 Accuracy comparison between methods



In these graphs, exponential smoothing of 0.1 was used. Run aggregation was set to "MAX" and the range was set to "Std. dev".
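To make these chart settings concrete, here is a minimal sketch of how a smoothed "MAX" line with a "Std. dev" band could be computed for one group of runs. It assumes a plain exponential moving average in which 0.1 is the weight kept from the previous smoothed value, and it assumes smoothing is applied to each run before aggregation. The plotting tool's exact formula and ordering may differ. The function names and the sample numbers are hypothetical.

```python
from statistics import pstdev


def ema_smooth(values, weight=0.1):
    """Exponentially smooth a series.

    `weight` is the share carried over from the previous smoothed value
    (assumed meaning of "smoothing of 0.1"); 0 means no smoothing.
    """
    smoothed = []
    last = None
    for v in values:
        # The first point is kept as-is; later points blend with the history.
        last = v if last is None else weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed


def aggregate_group(runs):
    """Per step, return (max across runs, population std. dev. across runs)."""
    line = [max(step_vals) for step_vals in zip(*runs)]
    band = [pstdev(step_vals) for step_vals in zip(*runs)]
    return line, band


# Hypothetical accuracy curves for two runs in one group (made-up numbers).
runs = [[60.0, 62.0, 64.0], [58.0, 66.0, 64.0]]
line, band = aggregate_group([ema_smooth(r) for r in runs])
```

With a weight as small as 0.1, each smoothed point stays close to the raw value, so the curves in the panels are only lightly smoothed.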

[Figure: two line-chart panels plotted against Step (x-axis, ticks from 200 to 450). The first panel shows Accuracy (y-axis, ticks from 20 to 70); the second shows Top-5 Accuracy (y-axis, ticks from 50 to 90). Both panels show the same 10 run groups:]
cifar100_baselines/reviewdkd-additive,resnet32x4,res8x4
cifar100_baselines/reviewdkd-UFS,resnet32x4,res8x4
cifar100_baselines/dkd_our_MIXUP,wrn_40_2,wrn_16_2
cifar100_baselines/dkd_simple_MIXUP,wrn_40_2,wrn_16_2
cifar100_baselines/dkd_our_MIXUP,vgg13,vgg8
cifar100_baselines/dkd_simple_MIXUP,vgg13,vgg8
cifar100_baselines/dkd_our_MIXUP,res32x4,res8x4
cifar100_baselines/dkd_simple_MIXUP,res32x4,res8x4
cifar100_baselines/reviewdkd,wrn_40_2,wrn_16_2,ReviewKD_loss*dkd_loss(l_s-l_t),optimized params
cifar100_baselines/reviewdkd,vgg13,vgg8, ReviewKD_loss*dkd_loss(l_s-l_t),optimized params
Run set: 261