Can AI create?

With the new AI tech, people are worried about their jobs and their art. Should they be?

The latest generation of Artificial Intelligence has taken the world by storm, though not for the reasons that science fiction authors have imagined over the years. No one is worried about SkyNet hitting the skies and raining hellfire down on the human population (yet), but we are worried about intellectual property and people losing jobs. It’s in understanding these two fears that we can also understand how to best take advantage of AI in our own projects (the latter part I’ll delve into in the next blog).

What is AI?

First, of course, it’s necessary to understand what is and isn’t AI. I could explain it, but I thought maybe letting ChatGPT, one of the latest AI language models, explain it for me would be better:

“AI refers to the ability of computer systems to perform tasks that would normally require human intelligence, such as recognizing speech, interpreting images and videos, making decisions, and understanding natural language. It involves the use of algorithms and statistical models to process and analyze large volumes of data, learn from patterns and trends, and make predictions or decisions based on that information.”

This is where AI (or the developers behind the programs) gets into trouble. In order to perform, the AI must have access to “large volumes of data” to “learn” from. What are those “large volumes of data”? We must also understand that AI is not “learning” in the same respect that a human learns. A program’s “learning” means it now has the ability to copy – and remix – a set of data. That set of data being, apparently, anything on the Internet.

The act of creation

I’ll bring up the two popular neural network-based AI models that have made the biggest fuss lately: DALL-E and Midjourney. Both generate images from textual descriptions, which at times looks like a kind of science fiction magic. Notice, though, that I don’t use the word “create”, but rather “generate”. They are creating nothing. According to ChatGPT, which only speaks for itself and not Midjourney (pluralizations my own), they were “trained on a large dataset of text and image pairs… the model(s) learned to associate certain words and phrases in the input text with specific visual features and objects in the corresponding images.”

 


Read our blog on what AI means for composers.
Both DALL-E and Midjourney basically just scour their datasets for relationships between text searches and image results, then take those images and mash them up with other images, coming up with a “new” image. Of course, this creates a huge – and interesting – catastrophe in the realm of intellectual property. Many artists have done exactly the same thing – Andy Warhol comes to mind as the most blatant example. I’m neither a lawyer nor a philosopher, but it does bring up interesting questions of ethics. We can look to the music world for how they handle remixes – anything sampled from the original work is still owned by the original artist, and dues must be paid to the rights holder.

ChatGPT works in a very similar way. “The model learned to identify patterns and relationships in the text data, and to use this knowledge to generate coherent and relevant responses to new text input.” Whereas the image models look for relationships between text and images, ChatGPT looks for questions and answers, and uses past text to model the syntax of its new answer – which means you might get paragraphs that are lifted wholesale from other people’s writing elsewhere on the net.

As an experiment, I asked it to write a description of the game Company of Heroes 3 for a blog I was working on. It gave me a nice description all right, one that I had no idea was true or not (not having played the game myself), but the answer seemed somewhat… formulaic. I took the answer and placed the various sentences into Google. Sure enough, one questionable passage turned up a search result: a blog describing a Marvel game. It had nothing to do with Company of Heroes, but apparently the two have a lot to do with each other, according to ChatGPT, since it decided to reuse that exact, cliché-laden yet unique string of text.

So here you have two problems: not only is it straight-up plagiarizing text, it’s also giving me something that’s very possibly wrong. But then again, I did ask it to describe a game it had never played, so fair enough, right?
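If you’d like to run the same sanity check yourself without pasting sentences into Google one by one, a few lines of code can flag verbatim overlap. This is just an illustrative sketch – the sample texts and the shared_ngrams helper are my own invention, and a real plagiarism checker would need a far larger corpus than one blog post:

```python
import re

def shared_ngrams(generated, corpus_docs, n=8):
    """Return word n-grams that appear verbatim in both the generated
    text and any document in corpus_docs -- a crude plagiarism flag."""
    def ngrams(text, n):
        words = re.findall(r"[a-z']+", text.lower())
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    gen = ngrams(generated, n)
    hits = set()
    for doc in corpus_docs:
        hits |= gen & ngrams(doc, n)
    return hits

# Toy example: the "answer" reuses a long run of words from the "blog".
blog = "This sequel builds on its predecessor with refined mechanics and a sprawling campaign."
answer = "Fans agree this sequel builds on its predecessor with refined mechanics and a sprawling campaign mode."
print(shared_ngrams(answer, [blog], n=10))
```

The longer the matched n-gram, the less likely it’s coincidence – a shared run of ten words is a strong hint the text was lifted rather than composed.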

Remixing is not creation

I might not be a lawyer, but I am a musician. And I know, when a musician is remixing a track, they must get permission from the original creator of that track. AI scours datasets from who-knows-where and then remixes that work – whether an image or a text. The answer doesn’t have to be correct, because what is “correct” for an algorithm? By its very nature, after all, it cannot experience the real world. It doesn’t actually “know” anything, being unable to gather first-hand sense-experience, so what is “wrong” or “right” in any regard without that very important part to start from? Of course, our sense-experience itself could be a lie, and we could all be trapped in some advanced Matrix-like simulation, or we ourselves could be advanced AI systems… but let’s not dive into those rabbit holes yet.

On the other hand, remixing IS a form of creation. You’re mashing something up into a new order, adding new touches and sentiments that the original creator never thought of. When the AI copies from various sources and pulls them together into a novel image, is it not, in some way, “creating”?

What I believe will emerge is some form of waiver you’ll sign when posting a professional image, allowing it to be used for machine learning. That way, there’s legal proof of whether or not you allowed it. We also might see entire stock libraries created for the sole purpose of machine learning. Who knows. Ultimately it’s not about loss, but about adaptation.

Stealing jobs

Another big concern about Artificial Intelligence is that it will steal the jobs of creators like myself, who rely on writing or graphic design. Having used ChatGPT for a little while now, I can firmly say I’m not worried. It’s a great tool, and for now, that’s about all it is. In my next blog I’ll delve into some appropriate ways of using it for work.

Think of how many Photoshop Actions have been spun off into AI-powered features, making Photoshop easier to use – I’m not so sure that had a big effect on the market. Sure, some people who were selling their own Actions might have lost that income, but that can’t be a sizable number of people.

We’ve seen technology hitting the market before. The most famous example is the automatic teller machine (ATM). No one really wants to go back to the days of the bank teller, when the only way to withdraw cash was during banking hours and at bank locations. The convenience of the ATM is undeniable, but it came with the reality that some people did lose their jobs. Then again, new “ATM-adjacent” jobs emerged: suddenly you needed ATM manufacturers, maintenance techs, repair workers, hardware engineers, software engineers, systems security specialists, and other technical professionals.

We’ll undoubtedly see a similar shift in dynamics as Artificial Intelligence grows bigger, which means we have to be flexible in the work we do. Looking at the current state of the art, I think the biggest threat is to people who do stock photos and photos for hotels and corporations.

But then again, it will likely be those same people who take up the position of “prompt engineers”, implementing AI into their own work. Someone still needs to set up the AI, make sure it’s told the correct thing, correct its output, hand it to the manager, and so on – all work the manager probably doesn’t have time to do anyway. The ease of working with AI will also come at a premium. It’s not likely AI companies will give away their “AI labor” either. Just as with the current emerging models, where you must pay for premium services, you’ll continue to do that. And undoubtedly, the better the language and neural network models – with less chance of plagiarism and lawsuits – the higher that premium will become.

What is not AI?

Our stock music app, Smartsound Cloud.

We do operate with a lot of fancy algorithmic wizardry. But algorithms alone are not what make an Artificial Intelligence; most algorithms have no capacity for “learning”. Take our algorithm at Smartsound Cloud: the fact that you can expand and shorten the music as you see fit certainly seems magical. It seems like there must be an AI involved. But there isn’t. Our musicians write their songs and transitions, and then our engineers stitch up those transitions. Using an AI, we could have a much more hands-off process, “teaching” the AI how to do it a few times and then letting it work by itself on each track – creating, possibly, a more generic sound. Instead, our musicians and engineers listen to and work on each and every track, and adapt our algorithms to each individual piece.
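To give a feel for how far plain rules (with no “learning” anywhere) can go, here’s a toy sketch of fitting a song to a target length by stitching whole sections together. The section names, durations, and fit_to_length helper are all hypothetical – this is not our actual engine, just an illustration of deterministic, hand-tuned logic:

```python
def fit_to_length(sections, target, tolerance=2.0):
    """Greedily pick whole sections (intro first, outro last) whose
    durations sum closest to the target, within tolerance seconds.
    Purely rule-based: same input always gives the same output."""
    intro, outro = sections[0], sections[-1]
    chosen = [intro]
    total = intro["dur"] + outro["dur"]       # outro is always included
    for sec in sections[1:-1]:                # candidate middle sections
        if total + sec["dur"] <= target + tolerance:
            chosen.append(sec)
            total += sec["dur"]
    chosen.append(outro)
    return chosen, total

# Hypothetical track broken into pre-written sections with durations (seconds).
song = [
    {"name": "intro",  "dur": 8.0},
    {"name": "verse",  "dur": 16.0},
    {"name": "chorus", "dur": 12.0},
    {"name": "bridge", "dur": 10.0},
    {"name": "outro",  "dur": 6.0},
]
picked, length = fit_to_length(song, target=45.0)
print([s["name"] for s in picked], length)
```

Nothing here “learns” or generalizes; a human decided every rule, which is exactly the distinction between an algorithm and an AI.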

With Smartsound Cloud, there’s most definitely a real human touch every step of the way.

Stay tuned for the next blog where I delve into how we can use AI to help us with our social media and other creative endeavors. Subscribe to our newsletter here.

March 20th, 2023 | News, Opinion


About the Author:

Born and raised in Tulsa, Oklahoma, USA, after a long bout of traveling the world, Shawn Basey finally settled down in the fantastic town of Tbilisi, Georgia, in the foothills of the Caucasus Mountains. He has worked as the main blog and content writer and editor for Create Music since February 2020, and in his free time he plays accordion, makes electronic music, writes novels, and helps bars, podcasters, and YouTubers behind the scenes.
