
Pythia 1B LoRA Fine-tuning: Trained Parameters Comparison


Comparing Trained Parameters

Comparing LoRA fine-tuning on the ShareGPT dataset (mixed English and Chinese). Each run fine-tunes after training a different subset of the base model's parameters (see the code sketch after this list):
  • Brown: fine-tuned after training only the embeddings (embed_in.weight, embed_out.weight).
  • Green: fine-tuned after training the embeddings plus attention (all of the above, plus layers.n.post_attention_layernorm.weight, layers.n.post_attention_layernorm.bias, layers.n.attention.query_key_value.weight, layers.n.attention.query_key_value.bias, layers.n.attention.dense.weight, layers.n.attention.dense.bias).
  • Dark Red: fine-tuned after training all parameters.
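
For reference, here is a minimal sketch of how such parameter groups can be selected by name, assuming the Hugging Face transformers and peft libraries. The LoRA hyperparameters (r, lora_alpha) and the unfreeze helper are illustrative assumptions, not taken from these runs; the parameter name substrings match the GPT-NeoX architecture that Pythia uses.

```python
# Minimal sketch: attach LoRA adapters, then unfreeze a chosen subset of
# base parameters by name substring. Hyperparameters are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1b")

# Attach LoRA adapters to the attention projections.
lora_config = LoraConfig(
    r=8,                                  # assumed rank
    lora_alpha=16,                        # assumed scaling
    target_modules=["query_key_value"],   # GPT-NeoX attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Name substrings for the parameter groups compared above.
TRAINABLE_PATTERNS = {
    "embed": ["embed_in.weight", "embed_out.weight"],          # Brown
    "embed+attention": [                                       # Green
        "embed_in.weight",
        "embed_out.weight",
        "post_attention_layernorm",
        "attention.query_key_value",
        "attention.dense",
    ],
}

def unfreeze(model, patterns):
    """Mark base parameters whose names contain any pattern as trainable."""
    for name, param in model.named_parameters():
        if any(p in name for p in patterns):
            param.requires_grad = True

unfreeze(model, TRAINABLE_PATTERNS["embed+attention"])  # e.g. the Green run
```

Matching on name substrings keeps the selection robust to the extra prefixes (e.g. base_model.model.) that peft adds when wrapping the model.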

[Charts: metric vs. train/epoch for the three run sets; full view (epoch 0–1, y ≈ 1.0–2.4) and zoomed view (epoch 0.2–0.8, y ≈ 1.8–2.2)]