
CE loss

Created on August 2 | Last edited on August 2

[Figure: CE loss vs. train/global_step. x-axis ticks: 50k, 100k, 150k; y-axis range: 0.6 to 2. Series: Group 32k, Group 65k.]

Run groups: 32k (1), 65k (2)
| Field | Run 1 | Run 2 |
|---|---|---|
| State | Crashed | Crashed |
| Notes | | |
| User | huseinzol05 | huseinzol05 |
| Tags | | |
| Created | | |
| Runtime | 14h 37m 34s | 11m 30s |
| Sweep | - | - |
| _name_or_path | mesolitica/gemma3n-audio-encoder-whisper-decoder | mesolitica/gemma3n-audio-encoder-whisper-decoder |
| accelerator_config.even_batches | true | true |
| accelerator_config.non_blocking | false | false |
| accelerator_config.split_batches | false | false |
| accelerator_config.use_seedable_sampler | true | true |
| activation_dropout | 0 | 0 |
| activation_function | gelu | gelu |
| adafactor | false | false |
| adam_beta1 | 0.9 | 0.9 |
| adam_beta2 | 0.999 | 0.999 |
| adam_epsilon | 1.0000e-8 | 1.0000e-8 |
| add_cross_attention | false | false |
| apply_spec_augment | false | false |
| architectures | ["GemmaWhisperForConditionalGeneration"] | ["GemmaWhisperForConditionalGeneration"] |
| attention_dropout | 0 | 0 |
| auto_find_batch_size | false | false |
| auto_map.AutoModel | gemma_whisper.GemmaWhisperForConditionalGeneration | gemma_whisper.GemmaWhisperForConditionalGeneration |
| average_tokens_across_devices | false | false |
| batch_eval_metrics | false | false |
| begin_suppress_tokens | [220,50256] | [220,50256] |
| bf16 | true | true |
| bf16_full_eval | false | false |
| bos_token_id | 50257 | 50257 |
| chunk_size_feed_forward | 0 | 0 |
| classifier_proj_size | 256 | 256 |
| d_model | 1280 | 1280 |
| dataloader_drop_last | false | false |
| dataloader_num_workers | 32 | 32 |
| dataloader_persistent_workers | false | false |
| dataloader_pin_memory | true | true |
| dataloader_prefetch_factor | 2 | 2 |
| ddp_find_unused_parameters | false | false |
| ddp_timeout | 1800 | 1800 |
| debug | [] | [] |
| decoder_attention_heads | 20 | 20 |
| decoder_ffn_dim | 5120 | 5120 |
| decoder_layerdrop | 0 | 0 |
| decoder_layers | 4 | 4 |
| decoder_start_token_id | 50258 | 50258 |
| disable_tqdm | false | false |
| diversity_penalty | 0 | 0 |
| do_eval | false | false |
| do_predict | false | false |
| do_sample | false | false |