AI is GrAIt

26 August, 2024

I know that this opinion is not going to win me a whole lot of friends on the small web, or whatever we’re calling this crazy ride, but I think AI is pretty neat. And I know, some people will quibble that what we have now isn’t really AI, it’s just generative models, but listen, you know what I’m talking about when I say those two letters. Let’s not argue about definitions when we’re both thinking about the same thing.

I’ve had a fascination with AI in its various forms since forever, from the old text-based parsers of the ’70s through the ’90s to, well, the GPTs, I suppose. Those really set me off thinking more about AI in my daily life, as I know they did for many people. I spent some time chatting with Chatty, my ChatGPT instance. I generated a few images here and there to OoOoOoo and Aaaaah at what was possible. I was interested! But I didn’t really have a use for any of it until more recently.

Lately, I’ve been on a more creative kick, and a lot of that has been enabled specifically by AI tools. If you know me, you probably know that I’m on Second Life a lot. I’ve wanted to be able to create some mesh objects in-world, both as a creative outlet and to kind of give something back to the virtual ecosystem that I’ve been taking from for so many years. Not that I’m giving what I make away for free, but they’re cheap, at least.

So, along comes generative 3D modeling. There’s text-to-3D and image-to-3D, and both have their pros and cons. Text-to-3D tends to create more coherent models because it has some sort of prompt to work towards, but it can end up being less creative and leaning more on its training data. Image-to-3D does its best to map what you show it onto a model, with varying success depending on what the subject is and how the image is laid out. I’ve used both kinds of tools, but I’ve found that the greater flexibility of image-to-3D has worked better for me overall.

So, where do I get the images from? Am I just ripping pictures off of image search and slamming them in there? No, not me! I’m generating my own with text-to-image generators. I know that some of you will claim that’s exactly the same thing but shhhh, play along here!

Initially I was using Bing’s image generation tool (I know, Bing, right?) because I found it to be pretty creative and to give good results for my prompts. However, now I’m running Stable Diffusion locally on my old laptop, and I think it’s great! There are so many different models out there that produce totally different results from the same prompts. It’s fun to just spend hours seeing what you can get, or working towards a specific thing that you have in mind.
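If you want to poke at it yourself, here’s roughly what the bare-bones version looks like in Python with the Hugging Face diffusers library. The model name, prompt, and settings below are just examples (and there are friendlier front ends that wrap all of this up for you), so treat it as a sketch rather than my exact setup:

```
# Minimal text-to-image sketch using the Hugging Face diffusers library.
# Model name, prompt, and settings are examples; swap in whatever you like.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,   # use torch.float32 if you're on CPU
)
pipe = pipe.to("cuda")           # or "cpu" on an old laptop with no GPU

image = pipe(
    "a weathered brass lantern hanging from a mossy stone wall",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

image.save("lantern.png")
```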

Also, once I got Stable Diffusion working locally, I decided to get some language generation going as well. Why? Because it’s cool and fun to play with. I don’t have as much of an actual use for it, but again it’s fun to experiment with the different models and see what I can get back. Plus, Llama 3.1 running locally is pretty competitive with ChatGPT running remotely, so why would I give someone else my data when I can turn off my internet connection and just go to town?
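In case you’re wondering what “running locally” actually looks like, here’s one way to do it. This sketch assumes you’re using Ollama as the local runner and have already pulled the model (ollama pull llama3.1); other runners work just as well:

```
# Chat with a local Llama 3.1 through Ollama (https://ollama.com).
# Assumes the Ollama server is running and the model has been pulled.
import ollama

response = ollama.chat(
    model="llama3.1",
    messages=[
        {"role": "user", "content": "Describe a cozy fantasy tavern sign in one sentence."},
    ],
)

print(response["message"]["content"])
```

No API key needed, and it keeps working with the network cable unplugged.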

Anyway, I got a little off track there. So, now I can come up with an idea, generate an image from that idea, feed the image into an image-to-3D engine, and then take the resulting model and give it a nice tune-up in Blender. Yes, unless you’re happy to share some semi-wonky models with the world and put your name on them, this still requires a bit of time reshaping and repainting in Blender, but really, it’s a good skill to learn!
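Just to give a flavor of that cleanup step, here’s a rough sketch using Blender’s Python API (bpy). The file paths and the decimate ratio are placeholders, and most of the real reshaping and repainting still happens by hand in the viewport; this just covers the boring, repeatable part:

```
# Rough Blender (bpy) cleanup sketch: import a generated mesh, reduce its
# polygon count, and export Collada for Second Life's mesh uploader.
# Paths and the ratio are placeholders; run from Blender's Scripting tab.
import bpy

# Most image-to-3D tools can export glTF/GLB, so start from that.
bpy.ops.import_scene.gltf(filepath="/path/to/generated_model.glb")

for obj in bpy.context.selected_objects:
    if obj.type != 'MESH':
        continue
    bpy.context.view_layer.objects.active = obj
    # Generated meshes are usually far denser than SL needs.
    mod = obj.modifiers.new(name="Decimate", type='DECIMATE')
    mod.ratio = 0.25  # keep about a quarter of the faces; tune per model
    bpy.ops.object.modifier_apply(modifier=mod.name)

# SL's uploader wants Collada (.dae).
bpy.ops.wm.collada_export(filepath="/path/to/cleaned_model.dae", selected=True)
```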

As of this writing, I’ve created seventeen models using this workflow that I felt were good enough to list on my marketplace store in SL, and it feels great. These things really wouldn’t have existed if I hadn’t sat down and turned my text into digital reality, but there they are, and other people can enjoy them too! Now, I’m going to keep making more of them until I get bored with it, and then I’ll do something else.


