
Key Word Spotting

Created on November 11|Last edited on November 13

Train initial model:

path: saved/models/kws_sheila/1110_220958/model_acc_0.97_epoch_90.pth
epoch: 100
  • 'mean_val_loss': 0.086
  • 'mean_val_acc': 0.968
  • 'mean_val_FA': 0.015
  • 'mean_val_FR': 0.016
  • 'au_fa_fr': 0.0003
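For readers unfamiliar with the FA, FR, and au_fa_fr metrics above, here is a minimal sketch of how such numbers are commonly computed. It assumes the standard definitions (false alarms divided by the number of negatives, false rejects divided by the number of positives) and a trapezoidal area under the FA-FR curve; the report does not state which normalisation or integration rule was actually used, and the function names `fa_fr_at_threshold` and `area_under_fa_fr` are hypothetical.

```python
def fa_fr_at_threshold(scores, labels, threshold):
    """Return (false-alarm rate, false-reject rate) at one decision threshold.

    scores: keyword probabilities; labels: 1 = keyword present, 0 = absent.
    Assumption: FA is normalised by negatives and FR by positives.
    """
    fp = fn = 0
    for score, label in zip(scores, labels):
        pred = 1 if score >= threshold else 0
        if pred == 1 and label == 0:
            fp += 1  # model fired on audio without the keyword
        elif pred == 0 and label == 1:
            fn += 1  # model missed the keyword
    return fp / labels.count(0), fn / labels.count(1)


def area_under_fa_fr(scores, labels):
    """Trapezoidal area under the FR-versus-FA curve (lower is better)."""
    # Sweep thresholds from strictest to loosest so FA never decreases.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = [fa_fr_at_threshold(scores, labels, t) for t in thresholds]
    area = 0.0
    for (fa0, fr0), (fa1, fr1) in zip(points, points[1:]):
        area += (fa1 - fa0) * (fr0 + fr1) / 2
    return area


# Toy data, not taken from the run reported here.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_fa, toy_fr = fa_fr_at_threshold(toy_scores, toy_labels, 0.5)
toy_area = area_under_fa_fr(toy_scores, toy_labels)
```

A perfectly separable score list gives an area of 0, which is why values close to zero, such as the 0.0003 reported above, indicate a good detector.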
Test model:
  • Num params: 564080
  • 'val_time_inference': 0.0026 sec
  • Model size in MB: 2.152
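The reported size is consistent with storing every parameter as a 4-byte float. The short check below assumes float32 weights and 1 MB = 2**20 bytes; the report does not say how the size was actually measured.

```python
num_params = 564080        # from the report above
bytes_per_param = 4        # assumption: float32 weights
size_mb = num_params * bytes_per_param / 2**20  # assumption: 1 MB = 2**20 bytes

print(f"{size_mb:.3f} MB")
```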
config:
config = {
    'verbosity': 2,
    'name': "train",
    'log_step': 50,
    'exper_name': f"kws_{key_word}",
    'key_word': key_word,
    'batch_size': 256,
    'len_epoch': 200,
    'learning_rate': 3e-4,
    'weight_decay': 1e-5,
    'num_epochs': 100,
    'n_mels': 40,            # number of mel bins in the mel spectrogram
    'kernel_size': (20, 5),  # kernel size of the convolution layer in the CRNN
    'stride': (8, 2),        # stride of the convolution layer in the CRNN
    'hidden_size': 128,      # size of the GRU hidden representation
    'gru_num_layers': 2,     # number of GRU layers in the CRNN
    'gru_num_dirs': 2,       # number of GRU directions (2 if bidirectional)
    'num_classes': 2,        # number of classes ("no word" or "sheila is in audio")
    'sample_rate': 16000,
    'device': str(device)
}
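To see what the convolution settings imply for the feature map fed to the GRU, the sketch below applies the usual output-size formula for an unpadded, undilated convolution. Several things here are assumptions not stated in the report: that the first entry of `kernel_size` and `stride` acts on the mel axis, that no padding is used, and that a one-second clip yields 101 spectrogram frames (a hop of 160 samples at 16 kHz). The helper name `conv_out_size` is hypothetical.

```python
def conv_out_size(n_in, kernel, stride):
    """Output length of a convolution with no padding and no dilation."""
    return (n_in - kernel) // stride + 1


n_mels = 40               # from the config
kernel_size = (20, 5)     # from the config; assumed order (mel, time)
stride = (8, 2)           # from the config; assumed order (mel, time)
n_frames = 101            # assumption: 1 s of 16 kHz audio, hop of 160 samples

freq_out = conv_out_size(n_mels, kernel_size[0], stride[0])
time_out = conv_out_size(n_frames, kernel_size[1], stride[1])

print(freq_out, time_out)
```

Under these assumptions the GRU input size per time step would be `freq_out` times the number of convolution output channels, which the config does not list.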

Run: treasured-universe-16