
StyleGAN-Human: More Accurate Generation Of Full-Body Humans

StyleGAN is widely used for image generation, and a yet-to-be-released paper goes into how it can best be used to generate and modify full-body human figures.
Created on April 22|Last edited on April 22
A yet-to-be-released paper by Fu et al. looks to provide more insight into better ways to generate humans using StyleGAN and similar GAN models.
The paper's abstract, together with the accompanying overview video, describes a rigorously prepared and annotated dataset of full-body human figures the authors assembled out of dissatisfaction with existing datasets.
They also explain what they learned in the process about what a quality dataset for full-body human generation should provide, as well as how it should be used during training.
Furthermore, they go on to demonstrate what their trained model is capable of in terms of full-body human image generation and detail modification.



StyleGAN-Human's SHHQ dataset is extensive

The team behind StyleGAN-Human has prepared a dataset they call SHHQ (Stylish-Humans-HQ) that includes over 230,000 full-body fashion images, all with labelled attributes, standardized at a 1024x512 resolution. The dataset does pull from existing public datasets, but it is combed through to ensure every image meets the full-body requirement they are after.
The dataset covers the full diverse range of human characteristics, as well as a varied collection of fashion choices, including complicated patterns and unique poses. Every attribute important to the image is annotated.
Unlike most other similar datasets, this one contains only images featuring the entire human body, all in the style of a fashion model's photo: posing nicely for the camera against a clean background.
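To make the curation requirement concrete, here is a minimal sketch of how SHHQ-style records might be filtered for full-body images at the target resolution. The field names (`full_body`, `resolution`, `path`) are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical record schema; field names are assumptions, not SHHQ's real format.
TARGET_RES = (1024, 512)  # the standardized resolution described above

records = [
    {"path": "img_00001.png", "full_body": True,  "resolution": (1024, 512)},
    {"path": "img_00002.png", "full_body": False, "resolution": (1024, 512)},
    {"path": "img_00003.png", "full_body": True,  "resolution": (768, 512)},
]

def keep(record):
    """Keep only full-body images already at the target resolution."""
    return record["full_body"] and record["resolution"] == TARGET_RES

curated = [r["path"] for r in records if keep(r)]
print(curated)  # ['img_00001.png']
```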


Optimizing StyleGAN human image generation

The team behind StyleGAN-Human has identified three things in particular that they wish to share about training a GAN for full-body human image generation.
First, to successfully train a vanilla StyleGAN model, your dataset must exceed 40,000 images; otherwise the results will be largely unsatisfactory.
Second, a balanced distribution of face poses helps with rare face poses, but an even distribution of clothing patterns does not seem to help in generating complex patterns.
Third, models trained using the body's mid-point as the alignment point work better than models aligned on the center of the face or the position of the pelvis.
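The third finding can be sketched as a simple cropping rule: center the crop on the midpoint of the body rather than on the face or pelvis. The keypoint names and crop dimensions below are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative alignment sketch: center a fixed-size crop on the body's
# mid-point. Keypoints and crop size are assumptions for illustration.

def crop_box(head_keypoint, feet_keypoint, out_h=1024, out_w=512):
    """Return (left, top, right, bottom) of a crop centered on the
    midpoint between the head and the feet."""
    cx = (head_keypoint[0] + feet_keypoint[0]) / 2
    cy = (head_keypoint[1] + feet_keypoint[1]) / 2
    return (cx - out_w / 2, cy - out_h / 2, cx + out_w / 2, cy + out_h / 2)

head, feet = (300, 100), (300, 900)  # (x, y) pixel coordinates
print(crop_box(head, feet))  # (44.0, -12.0, 556.0, 1012.0)
```

A negative coordinate (as in the top edge here) would simply mean the crop needs padding before resizing.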

Applications

While there are a number of established and successful models trained to generate and modify faces, there don't seem to be any dedicated to doing the same with full human bodies.
Regardless, the team used a few models (InterFaceGAN, StyleSpace, and SeFa) to demonstrate low-to-high-level image alteration, fluidly changing features of an image such as pose, sleeve length, clothing color, and more. This extends to style-mixing, where one image is modified to match an attribute of another, and to fluid interpolation between two images.
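The smooth interpolation described above typically comes down to linearly blending two latent codes. In practice the codes would come from StyleGAN's latent space; here plain lists stand in as a minimal sketch.

```python
# Minimal sketch of linear interpolation between two latent codes, the
# mechanism behind the smooth morphs described above. Real codes would
# come from StyleGAN's latent space; short lists stand in here.

def lerp(w1, w2, t):
    """Blend latent codes w1 and w2; t=0 gives w1, t=1 gives w2."""
    return [(1 - t) * a + t * b for a, b in zip(w1, w2)]

w1, w2 = [0.0, 1.0, 2.0], [2.0, 1.0, 0.0]
frames = [lerp(w1, w2, t / 4) for t in range(5)]  # 5 evenly spaced codes
print(frames[2])  # [1.0, 1.0, 1.0]
```

Feeding each blended code through the generator yields the in-between images of the morph.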
Beyond general alterations, the power of InsetGAN, a concurrent work to StyleGAN-Human, allows for the seamless stitching of heads onto bodies. In the examples, heads from the FFHQ (Flickr-Faces-HQ) dataset and bodies from the SHHQ dataset are iteratively meshed together.
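InsetGAN's actual method jointly optimizes latent codes for the face and body generators; the final compositing step, though, is conceptually just blending a face crop into a body image with a mask. The toy sketch below shows only that compositing idea, with made-up array sizes.

```python
import numpy as np

# Toy illustration of the stitching idea: blend a face crop into a body
# image with an alpha mask. InsetGAN's real method jointly optimizes
# latent codes; this shows only the final compositing step.

body = np.zeros((8, 8))        # stand-in body image
face = np.ones((4, 4))         # stand-in face crop
alpha = np.full((4, 4), 0.8)   # blending weight for the face region

y, x = 1, 2                    # top-left corner where the face goes
region = body[y:y + 4, x:x + 4]
body[y:y + 4, x:x + 4] = alpha * face + (1 - alpha) * region
print(body[1, 2])  # 0.8
```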

Find out more

Tags: ML News