Fine-Tuning CLOOB-latent-diffusion
Training a CLOOB-conditioned latent diffusion model on the WikiArt dataset for the #huggan event.
Created on April 6 | Last edited on April 8
[Panel: demo_grid]
CLOOB-conditioned latent diffusion (https://github.com/JD-P/cloob-latent-diffusion) is similar to the CLIP-conditioned diffusion trained by Katherine Crowson (https://github.com/crowsonkb/v-diffusion-pytorch). Running the diffusion in a latent space rather than in pixel space makes training faster, and CLOOB helps because its text and image embeddings are more similar to each other than CLIP's. A text embedding can therefore stand in directly for an image embedding, which avoids the need for something like the text embed -> image embed prior used in the just-released DALL-E 2 (https://openai.com/dall-e-2/).
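To make the "no prior needed" point concrete, here is a toy sketch of the idea. It assumes the usual setup in which the denoiser is conditioned on image embeddings during training and on text embeddings at sampling time. The three vectors are hypothetical stand-ins, not real CLOOB outputs, and the two `conditioning_for_*` helpers are illustrative names, not functions from the cloob-latent-diffusion repo.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (NOT real CLOOB outputs).
image_embed = normalize([0.9, 0.1, 0.4])            # a painting
text_embed = normalize([0.8, 0.2, 0.5])             # its caption
unrelated_text_embed = normalize([-0.3, 0.9, -0.2])  # an unrelated caption

def conditioning_for_training(image_embed):
    # Assumption: during training the denoiser sees the embedding
    # of the target image itself.
    return image_embed

def conditioning_for_sampling(text_embed):
    # At sampling time there is no image yet, so the text embedding is
    # passed in directly. No text -> image prior sits in between.
    return text_embed

matched = cosine_similarity(image_embed, text_embed)
mismatched = cosine_similarity(image_embed, unrelated_text_embed)
print(f"matched pair similarity:    {matched:.3f}")
print(f"mismatched pair similarity: {mismatched:.3f}")
```

The closer matched text and image embeddings sit, the less the denoiser notices that it was handed a text embedding in place of the image embedding it saw during training.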
We're attempting to fine-tune the pretrained model released by @JD-P, using a dataset of paintings, in the hope of biasing future outputs in a more artistic direction. I've only just started the training run and will leave it going overnight on 2x A6000 GPUs and see what we get in the morning. You can monitor progress in this report (the demo grid updates every ~5 minutes) or by heading over to the project page that catalogues all my attempted runs: https://wandb.ai/johnowhitaker/jw-ft-cloob-latent-diffusion
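For anyone wanting a similar "refresh the demo grid every few minutes" setup, one simple way to do it is a wall-clock trigger checked inside the training loop. This is only a sketch: `PeriodicTrigger` is a hypothetical helper, not code from the training script, and the loop below uses a fake clock in place of real training steps.

```python
import time

class PeriodicTrigger:
    """Fires at most once per `interval_seconds` of elapsed time."""

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock
        self.last_fired = clock()

    def ready(self):
        now = self.clock()
        if now - self.last_fired >= self.interval:
            self.last_fired = now
            return True
        return False

# Demo with a fake clock so the example runs instantly.
fake_time = [0.0]
trigger = PeriodicTrigger(300, clock=lambda: fake_time[0])  # ~5 minutes

fired_at = []
for step in range(1000):
    fake_time[0] += 1.0  # pretend each training step takes one second
    # ... a real loop would run one optimisation step here ...
    if trigger.ready():
        # A real loop would sample the demo prompts here and log the
        # resulting image grid to the experiment tracker.
        fired_at.append(step)

print(fired_at)
```

Triggering on elapsed time rather than on step count keeps the refresh rate steady even if the step time changes, for example when moving between GPUs.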
Since we're fine-tuning a pretrained model, I don't expect dramatic improvements to the loss (below) but the visual aesthetic of the images should change noticeably over time.
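Because the expected change in loss is small, it can help to look at a smoothed curve rather than the raw per-step values. Below is a minimal sketch of exponential-moving-average smoothing. The loss values are made up for illustration and are not taken from this run, and `ema_smooth` is a hypothetical helper.

```python
def ema_smooth(values, weight=0.9):
    """Exponential moving average; higher `weight` means heavier smoothing."""
    if not values:
        return []
    smoothed = []
    running = values[0]  # seed with the first value to avoid a start-up bias
    for v in values:
        running = weight * running + (1 - weight) * v
        smoothed.append(running)
    return smoothed

# Hypothetical noisy losses with a slight downward drift (illustrative only).
raw_losses = [0.52, 0.47, 0.55, 0.49, 0.51, 0.46, 0.50, 0.45, 0.49, 0.44]
smooth_losses = ema_smooth(raw_losses, weight=0.9)

for raw, smooth in zip(raw_losses, smooth_losses):
    print(f"raw={raw:.2f}  smoothed={smooth:.3f}")
```

A small drift that is invisible in the noisy raw values is easier to see in the smoothed series, but the demo grid remains the better signal for a fine-tune like this one.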
EDIT: This trained for ~12 hours before we called it done. See some results in this thread:
A demo Space and write-up should be out soon. In the meantime, you can try it in this Colab: https://colab.research.google.com/drive/1HPu6tz44brMnKOU4G6Luhy1_iMgDl1fw?usp=sharing
Prompts for the demo grid:
- A photorealist detailed snarling goblin
- A fantasy painting of a city in a deep valley by Ivan Aivazovsky
- a rainy city street in the style of cyberpunk noir, trending on ArtStation
- An oil painting of A Vase Of Flowers
- oil painting of a candy dish of glass candies, mints, and other assorted sweets
- the Tower of Babel by J.M.W. Turner
- sketch of a 3D printer by Leonardo da Vinci
- The US Capitol Building in the style of Kandinsky
- a watercolor painting of a Christmas tree
- control room monitors televisions screens computers hacker lab, concept art, matte painting, trending on artstation
- A painting of a cat playing chess by Salvador Dali, oil on canvas
- A painting of a dog playing checkers by Picasso, oil on canvas
- A mysterious shining orb sits besides a vase of flowers, still life oil on canvas
- The HuggingFace emoji as an oil painting in an ornate frame
- An avocado armchair, an armchair in the shape of an avocado
- Panda mad scientist mixing sparkling chemicals, artstation
