Crysty Novel instructionFT Stage report
Trained on 1 A100 40GB
Created on August 14|Last edited on August 14
Training conclusion
Growing the dataset from 30MB to 60MB makes the model more robust at writing the novel portion. At epoch 10 we got a far better result, but story consistency is still not ideal. This might be caused by the RWKV_World model having a context length of only 4096. Next, we will train a new LoRA on a CTX_LEN=128K model to determine the influence of CTX_LEN.
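To make the context-length hypothesis concrete, here is a minimal sketch of how a long story gets cut into windows at two context lengths. It makes several assumptions. The story length of 50,000 tokens is hypothetical and was not measured from the dataset. "128K" is taken to mean 128 * 1024 tokens. The non-overlapping split is a simplification and not necessarily how the trainer samples data. The helper names are illustrative only.

```python
def split_into_windows(token_ids, ctx_len):
    """Split a token sequence into consecutive, non-overlapping windows
    of at most ctx_len tokens (a simplification, see the assumptions above)."""
    return [token_ids[i:i + ctx_len] for i in range(0, len(token_ids), ctx_len)]


def visible_fraction(total_tokens, ctx_len):
    """Fraction of a story of total_tokens that fits in one context window."""
    return min(1.0, ctx_len / total_tokens)


# Hypothetical story length, for illustration only (not taken from the dataset).
story_tokens = list(range(50_000))

CTX_4K = 4096
CTX_128K = 128 * 1024  # assumption: "128K" means 131072 tokens

windows_4k = split_into_windows(story_tokens, CTX_4K)
windows_128k = split_into_windows(story_tokens, CTX_128K)

print("windows at 4096:", len(windows_4k))
print("windows at 128K:", len(windows_128k))
print("fraction visible at 4096:", visible_fraction(len(story_tokens), CTX_4K))
```

Under these assumptions, the story is spread over many windows at 4096, so no single training sample contains both an early plot point and its later payoff. At 128K the whole story fits in one window, which is the effect the planned follow-up run is meant to test.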
Run set
2