EDIT: after the negative feedback i should make clear that the model is not completely a SD 1.5 and because of that it needs a different approach on generating images, the negative embedding i'm linking here should facilitate the use of negative prompt which is quite hard to get right on this model.
The model is tested on Automatic1111.
If you're having troubles generating images - if they look that bad - it's possibly a problem of your graphics card (turn the random number generator to CPU).
Another trick is that the ENSD works better set to 99999 rather than 31337, check it out for yourself.
If you can't in any way reproduce the images, even with the embeddings, try copying the info from the example images, it should help.
https://civitai.com/models/93766/embeddings-pack-for-caravaggio-reupload-with-images
This model is not perfect and I'm very well aware of that, this was the first step of my current model which I'd rather not publish yet, anyway it still has plenty of power.
(VAE baked in, the model comes with some textual inversions, 768x768 resolution)
The procedure in making this model was first of all some merges, then i finetuned a 1.5 basic model with some of my black and white drawings/sketches (im not that much of an artist but it worked very good because it took the style in a nice way, making each humanoid output quite different from both anime and semirealistic models.
After i merged even the model finetuned on my drawings on this, then i took the final model and dissected it with the kohya extract diffusers tool... why? Because there are better text encoders around than the base one of Stable Diffusion! And in fact i scraped off Huggingface and tested each one to find the best compatibility. Now this model seems to understand well structured english, you can easily ask ChatGPT for a story and try it on this model, or copy your Midjourney prompts here... it works fine, but it ain't Midjourney, obviously.