Using AI art for social media

AI image generators aren’t going to replace anyone anytime soon, but they are going to be amazing tools

AI image generation took the world by storm a few months ago and set off a wave of graphics artists and designers worrying about their jobs and their art. For the most part, there was a lot of hype that hasn’t yet met reality, and it was that hype doing the worst damage. For one, AI is not really new to image generation: Photoshop and its analogues have been using AI to power various features of their platforms for years. Image generation is only an extension of this idea, and it doesn’t really replace graphics artists. People with keen eyes and creativity are still needed, as are strong skill sets in photo and graphics manipulation. Midjourney, Dall-E, Stable Diffusion, and the rest certainly haven’t displaced them.

It has, however, made it possible for people with lower artistic skills to produce something usable in their social media content. Take myself and this blog, for example. Before, I was scanning stock photos to find something decent enough for the banners. And that’s what I found: something *decent* enough. Now, using Midjourney, I get something rather sleek. It didn’t replace anyone, but it did make my own content marginally better.

As for the graphics artists on our team who are busy doing other stuff? They’re still busy doing other stuff. They use Midjourney and the others in their workflow now, but they still have to edit the results, and by and large these tools aren’t capable of the focused, bespoke, branded graphic design that most corporate clients require.

Generated pictures still need a great deal of editing, and sometimes it’s just easier to do it yourself, as you’ll see below. What’s interesting, though, is that at first glance it’s easy to say, “Wow!” But then you look a little bit closer and people look like they have elephantiasis, or are subjects of Picasso paintings, or just prompt a general “What in the heck is going on?!”

In this blog, then, I’ll focus on how social media content creators can upscale their skills to improve their own content. It isn’t intended for the marketing manager trying to figure out how to dump their graphics designer. By and large, that’s still impossible even with today’s technology, unless that manager is some kind of genius with prompt engineering and doesn’t have much else to do.

Using AI image generation for your content

Most content creators, especially those starting out, probably aren’t going to hire a graphics artist in the first place, especially when they face an extraordinarily tight budget and a business that probably won’t make them much money without a lot of time and energy dedicated to it. That’s just a fact. And a lot of graphics artists really can use the help in spinning out more content on top of their already busy workloads (every designer I know seems to be overworked as it is). So if anything, AI image generation can help a lot of different people up and down the line.

Here are just a few ways to use it to improve your content. I’ll explore more pointed realms in future blogs; these are just some first points of advice to help get people started.

Choose your platform

There are quite a few free and budget options for image generation these days, but most of them are based on two main models, OpenAI’s Dall-E 2 and Stable Diffusion, which (respectively) form the basis of the other most popular options, Bing Image Creator and Midjourney. There are pluses and minuses to each of them.

To illustrate those pluses and minuses, I’ll leave it to a pair of intentionally vague prompts (that is, lacking style, size, and other indicators). The prompts will include people who are not the explicit subject, instruments, and an animal, to give you a good sampling of how each generator manages common use-case scenarios. I’ll choose (what I think is) the best, least distorted image from the four that are rendered. In an upcoming blog, I’ll give some tips on how to refine your prompt.

Each platform works in a similar manner: describe your target image, figure out how to improve your prompt, rinse and repeat. As you’ll see, there are some pretty wild and inexplicable variations in the results. You might use all of these engines in your social media content creation, or just one or another depending on the style you’re after.

Dall-E 2

Dall-E 2 stems from the first publicly available AI image generator, Dall-E. It continues to struggle with faces, fingers, musical instruments, and I’m sure a few other categories. Generally, without further prompt modification, it errs toward a slightly… impressionistic style. It also excels at following directions to edit images with minimal “engineering” of the prompt. And it’s free, for the most part; you’re only constrained by a per-day limit.

People sitting, drinking, and smoking at a cafe listening to live jazz music

Frankly, I’m not sure why it didn’t attempt a realistic approach but went in quite an artsy direction instead. You could add “realistic” at the end of the prompt to aim it that way.

A white tiger sleeping in an ancient Hindu temple

Much better, and at first glance it definitely hits it. But then you notice the facial distortions…

Bing Image Creator

Based on Dall-E 2, Bing Image Creator is available both in the Bing search engine and in Bing Chat (just ask Bing to “make an image of…”). The human images can still be a bit Cubist, as in Dall-E, but it’s a definite improvement, and it can be highly accurate if you avoid fingers and faces. It works best with scenes, landscapes, and animals, as you’ll see below. Bing definitely takes a slightly different artistic direction: it’s more vibrant, with better depth of field and attention to detail. It gives you four results per prompt, and you can upscale one result at the cost of one credit. You get somewhere between 25 and 100 credits a day; I haven’t really figured it out, as it seems to keep changing.

One thing is for sure: depth of field and realism are a lot better, and it seems to be able to render instruments a little more believably.

People sitting, drinking, and smoking at a cafe listening to live jazz music

Somehow Bing decided on its own to bypass faces, which it has difficulty with when they’re not the subject. And is that a bassoon, a harp, or a baritone sax that guy is playing?

A white tiger sleeping in an ancient Hindu temple

Simply beautiful. Can’t joke around with this one. Looks like a photo I took at the zoo the other week.

Stable Diffusion

Like Dall-E, Stable Diffusion is another base model that others were built on. It’s free to use, and it tends to do faces a lot better than the Dall-E-based generators. Science fiction scenes are outright beautiful. However, you do need to do some engineering to get your prompts right. What’s great about Stable Diffusion, though, is that it has a lot of tools for prompting, including “negative prompts” to specify what you don’t want in the result, as well as a “prompt builder” that helps guide your prompts so they’re not so painfully general as the example ones I’ve written here.

People sitting, drinking, and smoking at a cafe listening to live jazz music

At first glance it’s passable; then you notice the faces. And is that a bear furry sitting on the bench? And why is this pic so small?

A white tiger sleeping in an ancient Hindu temple

A fairly nice tiger, but the only thing “sleeping” about it is that the eyes are apparently closed. Also, how many fingers do tigers have, anyway?

Midjourney

My personal favorite, Midjourney, is annoyingly run in Discord, which means you need a Discord account to enter your prompts. You type /imagine followed by your description to get a reply. It’ll also cost you. It definitely seems to be influenced by your previous prompts: since I use a lot of “digital art” tags, as you can see below, it seemed to take that into account. It has less difficulty with faces than any of the others, but you often get weird, Adolphe Sax-ian takes on instruments, and arms appear in the most absurd of places.

Instead of reinventing the wheel, so to speak, Midjourney was built on the Stable Diffusion dataset, but the way it handles that information is a bit tweaked, as you’ll see.

People sitting, drinking, and smoking at a cafe listening to live jazz music

This should reveal why MJ is my favorite. Though those are really creative musical instruments!

A white tiger sleeping in an ancient Hindu temple

And if this doesn’t sell you on MJ… wow…

Learn some prompt engineering

They call it “prompt engineering” like it’s a science. It’s not. There is certainly a way to approach it, and a mindset, but by and large it’s a great deal of experimentation, a pinch of patience, and a whole lot of luck. And, like the scientific method in Opposite Land, running the exact same prompt twice will rarely render the same result.

In another blog, I’ll get into the details of how to write a great prompt, but for now take these words of advice: be specific, experiment with your prompt by changing various words, trawl the Midjourney chat to see how other people are editing their prompts, and have fun!
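
A helpful habit while experimenting is to treat a prompt as structured parts (subject, style, extra descriptors) that you change one at a time, so you can tell which word actually moved the result. Here’s a minimal Python sketch of that idea. To be clear, the builder function is purely hypothetical: none of these generators expose anything like it, they all just accept plain text.

```python
# Hypothetical helper for experimenting with prompts one part at a time.
# The generators themselves only take a plain text string; this just
# assembles that string in a consistent, tweakable way.

def build_prompt(subject, style=None, modifiers=None, negative=None):
    """Assemble a prompt from a subject, an optional style tag, and extra
    descriptors. Returns (prompt, negative_prompt); the negative prompt
    only matters for Stable Diffusion-style tools that support one."""
    parts = [subject]
    if style:
        parts.append(style)
    if modifiers:
        parts.extend(modifiers)
    prompt = ", ".join(parts)
    negative_prompt = ", ".join(negative) if negative else ""
    return prompt, negative_prompt

prompt, negative = build_prompt(
    "a white tiger sleeping in an ancient Hindu temple",
    style="digital art",
    modifiers=["soft morning light", "wide shot"],
    negative=["extra fingers", "distorted face"],
)
print(prompt)
# a white tiger sleeping in an ancient Hindu temple, digital art, soft morning light, wide shot
```

Swapping a single modifier between runs, while leaving everything else fixed, makes it much easier to see what each word is doing to the output.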

Trawl the Midjourney Discord chat to see prompts in motion

The best way to learn prompt engineering (and to learn what there is to learn) is to trawl the Midjourney Discord chat. There you can see a feed of what people are prompting, and the results. Often you can watch them editing their prompt, trying to get an exact image. When you follow all the changes in their prompt, and the AI’s responses in turn, you start to get a picture of how to prompt. The best part: doing this doesn’t cost you any credits.

Edit your results

You might get some nice results, but there will almost always be something to edit, whether it’s an added finger or a misshapen rooftop. Undoubtedly, some of the best images you’ve seen shared online were manipulated further by the artist. This is normal, especially if you’re looking to add your own branded touches for social media. So your first step is going to be bringing that pic into an editor.

Combine images

As with the last bit of advice, it’s wrong to think the AI-generated image is the be-all-end-all. And sometimes it’s too hard to prompt the exact thing you want. So why not try two prompts, and combine the two results yourself in Photoshop? The sky becomes the limit here, and it can be very easy to do. Just add descriptors like “no background” to your prompt to make it easier to cut out the images and manipulate them further elsewhere.
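
If you don’t have Photoshop handy, even a few lines of Python with the Pillow library can do a basic composite. This is only a rough sketch: the two Image.new calls stand in for generated results you’d normally load from disk, and the size and coordinates are arbitrary.

```python
from PIL import Image

# Stand-ins for two generated results; in practice you'd load your saved
# files instead, e.g. Image.open("background.png").
background = Image.new("RGBA", (512, 512), (30, 60, 90, 255))
subject = Image.new("RGBA", (128, 128), (255, 255, 255, 255))

combined = background.copy()
# Paste the subject roughly centered. Passing the subject again as the
# third argument uses its alpha channel as a mask, so any transparent
# areas you've cut out stay transparent in the composite.
combined.paste(subject, (192, 192), subject)
combined.save("combined.png")
```

The mask argument is what makes a “no background” cut-out drop cleanly onto the new backdrop instead of pasting a solid rectangle.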

Conclusion

It’s clear that AI image generators, as they stand now, aren’t likely to replace anyone at work, but they will make it easier to create better products, whether you’re a graphics professional or a social media content creator. Which generator you use, whether all of them, one of them, or some combination, is of course up to you. Figure out which UI is easiest for your workflow and hone your prompting skills on that one. Though the results can be quite different, with patience and some added editing work you can get some pretty amazing pieces done.

Stay tuned for the next few blogs where I’ll hit up some tips on prompting. Good luck and happy creating!

 

May 16th, 2023 | AI, Social Media
