Diving deeper into generative AI


In the following text, I make an exception on my blog and write in English. With this, I hope to provide some background and insights for my international connections and other curious readers. Maybe we can use AI for something cool. At the end are a few pictures I have made with Stable Diffusion, and more can be found here on my AI-focused Instagram.

Please keep in mind that this IG account of mine is just for fun and is not to be taken too seriously. 😊 So, full disclosure: I am a guy, nerd, sci-fi fan, and former RPGer. 🤓

Diving deeper into generative AI

For those who don’t already know, I dove deeper down the AI rabbit hole last year. I have kept an eye on AI progress since 2014 or so, which has given me a good understanding of its possibilities and development. So the disruption we are experiencing now did not surprise me. For example, I tinkered with earlier versions of GPT, Midjourney, Dall-E, and AIVA (AI for composing music) and read quite a few books about AI, machine learning, and the upcoming new industrial revolution in general.

I also completed the Elements of AI course by the University of Helsinki and Reaktor around 2018. I know it’s not much, but it still explained what machine learning and neural networks are. Unfortunately, I never had a chance to learn Python, which I was interested in. Or, to be more precise, I did not have the time or patience.

So, it was not AI that surprised me, but the speed of disruption occurring soon after ChatGPT was released in late 2022. And I guess the speed surprised all but those who work at OpenAI. 😅

Btw, this was not the first time I knew things were going to change profoundly. The Internet, social media, digital marketing, and the importance of video in it: I knew they would get big before people around me understood what was happening. Unfortunately, I mostly kept my thoughts to myself or shared them with family, friends, and colleagues, who I assume considered me a bit crazy at best. 😂

Trying to achieve as photoreal results as possible with Stable Diffusion is fun.

It takes a great team to rock.

Last autumn, I got a new boss, and new people always mean new possibilities. She listened carefully to what I explained about the upcoming (generative) AI revolution. It is pretty absurd how fast she understood the magnitude of the change, adopted the concept of modern generative AI, and grasped the impact it will have on all expert businesses.

Soon, one thing led to another, and she had us at Sherpa cooperating on AI with the Finnish energy company Neste Oyj. A big part of this was due to Neste’s CMO, who is known as a digital marketing & ROMI guru and trailblazer here in Finland, saying YES to our idea. For background, he and I had already had AI discussions somewhere around 2018, so I guess he knew that we at Sherpa had some insights that no others in our industry had. 😌

To cut some corners: to our knowledge, we did the world’s first digital marketing campaign that utilized AI in every part where it was possible at the time, and we did it at warp-drive speed. In this CampAIgn of ours, several AIs were used throughout the entire process, from the legal contract through insight and copy to the media plan and materials (read the more detailed process described here). We learned a lot and are, of course, eager to do more cool stuff for the brave brands willing to test new possibilities. 😉

Read more at https://sherpa.fi/tyot/campaign/

And yes, we are also aware of the fact that the first time is always the most difficult, and from there, the progress just starts. Try to be first in the world in something, and you know how easy it is… 😎

My focus and motivation

The CampAIgn was obviously just the tip of the iceberg. Since December 2022, I have spent hundreds of hours updating my AI knowledge, including most of my free time and my winter holiday. I feel privileged to be able to focus (as I have no kids and my wife fully supports my nerdiness, being kind of a nerd herself 🥰) and to get support from my boss.

Being a (former) sales director (my role was updated to Head of AI & Innovations in June 2023 to better reflect what I do) in an AI-positive creative agency is also a blessing. I think Sherpa provides a good place for my ambition to be at the forefront of combining creativity and AI wherever AI is the best fit (it’s not the best option for every purpose, ofc!). This, I believe, is an excellent sales strategy. The truth is that the marketing consultancy sector is more or less saturated with traditional creativity. AI definitely offers me and other like-minded souls a great advantage. Actually, I think that in this disruption, only those who are fast and willing to learn will survive.

My focus is to keep up with new innovations and try, test, and educate people. And hopefully also to have a lot of fun with AI.

In the big picture, I consider myself a creative nerd trying to remove the fear and resistance many feel toward AI at the moment because, in the end, I firmly believe that ASI (Artificial Super Intelligence) is our best hope of surviving ourselves. I see us humans as apes that are just too aggressive, dominant, and territorial to live in peace without the help of something more intelligent and wise than we are. Our purpose should be exploring the universe, not fighting with each other.

It is also fun to imagine, with the help of SD, what the future could look like.

My idea of fun is to learn new stuff.

Besides being somewhat gifted at seeing patterns in data, my idea of fun is to explore and learn new possibilities (most likely, this is why I also started Ph.D. studies last year, researching the impact of GenAI on communication and marketing). I have also been playing with computers since the 80s and doing CGI since the 90s, both for fun and money, so I guess it is no wonder I am somewhat hooked, especially on generative visual AI.

This is what disruption looks like.

Lately, this has meant exploring the possibilities of Stable Diffusion and, most recently, ControlNet. Rather than just goofing around and doing what people often do with SD (at least if you look at what kinds of models are trending on CivitAI, for example 🙈), I decided to start writing a study journal. It soon grew into a continuously updated presentation on the possibilities of this combo, GANs, and LLMs in general (and now, in the autumn of 2023, much of my work is coaching people on using GenAI). DM me on LinkedIn if you are interested in hearing more.

Working with AI is far from “pressing the button.”

Despite AI being disruptive in many ways, many AI tools are still far from “pressing the button.” Yeah, ChatGPT and Midjourney make it look easy, but there are so many more possibilities when one jumps down the rabbit hole, as I did. Midjourney is an excellent tool for fast fun, inspiration, and visualizations. Still, to really use AI as a visual design tool, SD is the weapon of choice (and it’s also good to note that it’s possible to combine MJ and SD for the best results).

Below is an example where I used my own picture to create a pose for AI and then transformed myself into something completely different (images generated with tech available in early 2023).

This example of OpenPose possibilities required quite a lot of time and effort; SD & ControlNet are not the easiest software to install and operate.

What a great time to be alive for a creative soul!

So, what are Stable Diffusion and ControlNet that I referred to earlier?

Stable Diffusion is “a text-to-image model empowering billions of people to create stunning art within seconds. It is a speed and quality breakthrough, meaning it can run on consumer GPUs.” (Stability AI’s announcement)

ControlNet, in turn, is an extension to Stable Diffusion that allows more control of the images generated with SD.

The Decoder sums up ControlNet as follows:

ControlNet is a simple method for fine-tuning Stable Diffusion. It can be used to create models for better controlling SD. The team behind ControlNet publishes a whole set of models for the image-to-image pipeline that, for example, automatically detect lines or collect depth information and use it as a template for further generation. ControlNet is open-source and can be used with WebUIs for Stable Diffusion.

Below is an example using the M-LSD line-detection algorithm. It detects the straight lines of a room, and those lines then guide the new image, so it is possible to redecorate rooms quite fast. What a great inspirational tool for anyone (tech-savvy) interested in interior design, with no need to learn 3D software or modeling.
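For the code-curious, the same idea can be sketched in Python with the Hugging Face `diffusers` and `controlnet_aux` libraries. This is only a rough sketch under assumptions, not necessarily the setup behind the pictures in this post: the helper names `pick_checkpoint` and `redecorate_room` are made up for illustration, the checkpoint IDs are assumed to still be available on the Hugging Face Hub, `base_model` must be a Stable Diffusion 1.5 checkpoint you supply yourself, and a CUDA GPU is assumed.

```python
# Hypothetical lookup table: control type -> public ControlNet checkpoint.
# Assumption: these Hub IDs are still available under these names.
CONTROLNET_CHECKPOINTS = {
    "mlsd": "lllyasviel/sd-controlnet-mlsd",          # straight-line detection
    "openpose": "lllyasviel/sd-controlnet-openpose",  # human pose
    "depth": "lllyasviel/sd-controlnet-depth",        # depth map
}


def pick_checkpoint(control_type: str) -> str:
    """Return the checkpoint ID for a control type (illustrative helper)."""
    key = control_type.strip().lower()
    if key not in CONTROLNET_CHECKPOINTS:
        raise ValueError(f"Unknown control type: {control_type!r}")
    return CONTROLNET_CHECKPOINTS[key]


def redecorate_room(photo_path: str, prompt: str, base_model: str, steps: int = 20):
    """Generate a new room image that keeps the straight lines of the photo.

    Heavy libraries are imported here so the helper above works without them.
    """
    import torch
    from controlnet_aux import MLSDdetector
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    # 1. Turn the photo into a line map (walls, windows, furniture edges).
    detector = MLSDdetector.from_pretrained("lllyasviel/ControlNet")
    line_map = detector(load_image(photo_path))

    # 2. Load the M-LSD ControlNet and attach it to the base SD model.
    controlnet = ControlNetModel.from_pretrained(
        pick_checkpoint("mlsd"), torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model, controlnet=controlnet, torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")  # assumption: a CUDA-capable GPU is present

    # 3. The prompt decides the style; the line map decides the layout.
    return pipe(prompt, image=line_map, num_inference_steps=steps).images[0]
```

Calling something like `redecorate_room("my_room.jpg", "a cozy scandinavian living room", base_model=...)` would return a PIL image that can be saved with `.save("new_room.png")`. Swapping `"mlsd"` for `"openpose"` (together with a pose detector instead of the line detector) is the same principle applied to body poses.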

Having learned 3D modeling and rendering in the past, I can say that this combo does, in a moment, pretty much what previously took years of learning the skill and then days of building the scene.

There are also many more ways to use the SD & ControlNet combo, and more features are coming out daily. As it is open source, the pace is FAST! I would be thrilled to have more time to explore them in more detail. At least the opportunities are enormous for marketing, branding, and entertainment businesses.

To conclude, at this stage of AI development, the future is being shaped right now. Therefore, it would be super cool to do more with AI together with people and brands that are as open to innovation as Neste is. 🦾

Like-minded souls: feel free to connect!

Please feel free to drop a comment if you are as excited about the possibilities of AI as I am. Possible business inquiries can be sent via DM through LinkedIn. 😊
