
VITS fine-tuning

Created on June 30 | Last edited on July 5

Test samples

(Samples synthesized from text without reference audio)
Runs:
  • vits-ljs-freeman-angry: all weights fine-tuned from the LJSpeech checkpoint on the angry subset of the Morgan dataset
  • vits-vctk-freeman-angry: fine-tuned from the VCTK checkpoint with text_encoder frozen, on the angry subset of the Morgan dataset
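
The difference between the two runs is whether the text encoder receives gradient updates. The following is a minimal PyTorch sketch of freezing one submodule, not the training code used for these runs. ToyTTS and freeze_text_encoder are hypothetical names, the layer sizes and learning rate are arbitrary, and only the attribute name text_encoder is taken from the run description.

```python
import torch
from torch import nn


class ToyTTS(nn.Module):
    """Hypothetical stand-in for a VITS-style model; not the real architecture."""

    def __init__(self):
        super().__init__()
        self.text_encoder = nn.Linear(8, 16)  # attribute name from the run description
        self.decoder = nn.Linear(16, 4)       # placeholder for the rest of the model

    def forward(self, x):
        return self.decoder(self.text_encoder(x))


def freeze_text_encoder(model):
    """Disable gradients for the text encoder and return the remaining trainable parameters."""
    for param in model.text_encoder.parameters():
        param.requires_grad = False
    return [p for p in model.parameters() if p.requires_grad]


model = ToyTTS()
trainable = freeze_text_encoder(model)
# Only the still-trainable parameters are handed to the optimizer (assumed learning rate).
optimizer = torch.optim.AdamW(trainable, lr=2e-4)

# One dummy training step to show that the frozen weights do not move.
before = model.text_encoder.weight.clone()
loss = model(torch.randn(2, 8)).pow(2).mean()
loss.backward()
optimizer.step()
```

After the step, model.text_encoder.weight still equals before, while the decoder weights have been updated. Passing only the trainable parameters to the optimizer also keeps weight decay from touching the frozen ones.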

(Audio panel TestAudios/0-audio: no audio was logged for this key at step 1007822 or step 1007348.)


Combining Morgan with VCTK

Each sample represents a different "mood".
vits-vctk-freeman_x10 continues from the last checkpoint of vits-vctk-freeman, with the "Freeman" data upsampled to be 10 times more frequent.
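
One simple way to make a speaker's data 10 times more frequent is to repeat its entries in the training file list. The sketch below assumes that approach; the report does not say how the upsampling was actually done (a weighted sampler would be another option). The function upsample_speaker, the entry layout, the paths, and the speaker IDs are all hypothetical.

```python
def upsample_speaker(entries, speaker, factor):
    """Return a new list in which every entry of `speaker` appears `factor` times."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    result = []
    for entry in entries:
        repeats = factor if entry["speaker"] == speaker else 1
        result.extend([entry] * repeats)
    return result


# Hypothetical file list entries; paths and speaker IDs are illustrative only.
filelist = [
    {"path": "vctk/p225_001.wav", "speaker": "p225"},
    {"path": "vctk/p226_001.wav", "speaker": "p226"},
    {"path": "morgan/0001.wav", "speaker": "freeman"},
]

upsampled = upsample_speaker(filelist, speaker="freeman", factor=10)
```

The original list is left unchanged, and the repeated entries sit next to each other, so this relies on the data loader shuffling the list during training.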
