V0.2 was trained on a completely overhauled training set to reduce biases towards specific faces, outfits, and poses:
No repeating backgrounds or faces (I inpainted over every single face and any distracting background in the training set)
More diverse outfits (50% or more of the subjects were wearing denim jeans in the previous version)
More diverse poses (spread toes, feet touching/apart, feet up, knees flexed), though these can be hard to control in generations
Includes clean and dirty feet (specify preference in prompt)
Tip: If you like a generated image but it has missing or extra toes, try generating variations of the image with the same prompt at moderate strength (0.5-0.6). This usually fixes the toes within a few tries. It's much faster and usually more coherent than inpainting, as long as you're not too attached to the other minor details, which may change.
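The retry approach in the tip can be sketched roughly as below. This is only an illustration, not how any particular UI does it: `variation_settings`, `retry_variations`, `img2img`, and `looks_ok` are hypothetical names I made up. `img2img` stands in for whatever image-to-image function your tool provides, and `looks_ok` stands in for you checking the toes by eye.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

Image = TypeVar("Image")


def variation_settings(
    n_tries: int, low: float = 0.5, high: float = 0.6, base_seed: int = 0
) -> List[Tuple[int, float]]:
    """Return (seed, strength) pairs spread evenly across [low, high]."""
    if n_tries < 1:
        raise ValueError("n_tries must be at least 1")
    if n_tries == 1:
        # A single try uses the middle of the range.
        return [(base_seed, round((low + high) / 2, 3))]
    step = (high - low) / (n_tries - 1)
    return [(base_seed + i, round(low + i * step, 3)) for i in range(n_tries)]


def retry_variations(
    image: Image,
    prompt: str,
    img2img: Callable[[Image, str, float, int], Image],  # hypothetical backend call
    looks_ok: Callable[[Image], bool],                    # hypothetical check (manual in practice)
    n_tries: int = 4,
) -> Optional[Image]:
    """Try a few same-prompt variations; return the first accepted one, else None."""
    for seed, strength in variation_settings(n_tries):
        candidate = img2img(image, prompt, strength, seed)
        if looks_ok(candidate):
            return candidate
    return None
```

Each try keeps the prompt fixed and changes only the seed and the strength, so the overall composition should stay close to the original while the toes get redrawn.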
Training was based on the checkpoint OpenDalle v1.1, but it seems to work reasonably well with other checkpoints. It produces minor variations of a basic barefoot pose, seated on the ground with the soles of the feet facing the viewer, which none of the models I have tested so far could produce coherently.
It seems to work with women as well as men, and there is evidence that it can handle styles other than photographic realism. However, it works very poorly with subjects that are wearing footwear (although V0.2 seems to handle it slightly better than V0.1).