Being able to create something and see the finished product is a massive dopamine hit, and generative AI gives that same creative release to anyone able to write a text prompt.
It democratizes the entire process of creativity, allowing a musician with a brilliant song to create a music video from little more than a text prompt and an idea. An author can likewise bring voice and visuals to their story to promote it on YouTube or TikTok.
The problem is that, as with streaming, where you need dozens of subscriptions to get the full spectrum of brilliant shows and movies, there is no “one app to rule them all”.
Everything is now a service with a monthly subscription fee, which you have to pay to use the tool for more than a couple of prompts or to unlock the truly creative features. Just like streaming, it can get very expensive very quickly.
Why is everything $20 a month?
This time last year ChatGPT Plus was barely a pilot program with a handful of subscribers. Now almost every major AI tool has a paid plan including Google Gemini, Microsoft Copilot and Anthropic Claude. For some reason they all cost about $20 per month.
My view is that this is such a new product category that everyone followed the first product to launch and priced accordingly. OpenAI charged $20, so everyone else joined them.
This isn’t that big an issue, as they all also have free versions, albeit with a more limited set of features. For example, the free version of ChatGPT has no image analysis or generation, and Gemini doesn’t give you access to its most powerful underlying model unless you pay for Advanced.
The problem comes when you want to do more than just have a conversation or make the AI create text-based content. We’re seeing some impressive developments in AI video, image, music and even 3D model creation. None of which comes cheap.
There are free plans available
You can create a handful of images per month for free on platforms like Night Cafe, produce several free videos with Runway and even get free voice generation from ElevenLabs, but if you want to use them for more than a novelty you need to open your virtual wallet.
Let's look at an example. Imagine I’m a musician and I’ve recorded an amazing song I’m really proud of. It is just two minutes long. I want to use AI throughout, so I will pay for one month of all the tools I need to make it work.
First I select ChatGPT Plus and give it my lyrics, asking the AI to generate a frame-by-frame production sheet I can use to create the images and videos with AI tools. For each shot it gives me image prompts I can feed into an AI image tool.
AI video tools can currently generate only about three seconds per shot. For a two-minute song that works out to about 40 shots, or roughly five per verse or chorus. That means either 40 distinct images from something like MidJourney, or 10 prompts with 4 versions of each image.
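To make that arithmetic explicit, here is a rough Python sketch. It assumes the figures used in this example (a two-minute song, about three seconds per generated clip and four image versions per prompt), and the function name is purely illustrative.

```python
import math


def shots_needed(song_seconds, seconds_per_shot):
    """Number of clips needed to cover the song, rounded up."""
    return math.ceil(song_seconds / seconds_per_shot)


# Figures from the example: a two-minute song, ~3 seconds per AI clip.
song_seconds = 2 * 60
seconds_per_shot = 3
versions_per_prompt = 4  # image versions produced per prompt

total_shots = shots_needed(song_seconds, seconds_per_shot)
prompts = math.ceil(total_shots / versions_per_prompt)

print(f"Shots needed: {total_shots}")
print(f"Image prompts at {versions_per_prompt} versions each: {prompts}")
```

Change the song length or clip length and the shot count scales accordingly, which is why longer videos get expensive quickly.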
You then need to run each of those images through Runway or Pika Labs to create clips, remembering you may need to repeat one or more to get it to look exactly the way you want.
At an absolute minimum you’ll need a $20 ChatGPT Plus subscription, a $10 MidJourney subscription and a $12 Runway plan. If you want lip sync to make it look like one of your characters is singing, you’ll also need either a Pika Labs Pro plan for $70 or Synclabs for $19.
To make a short music video you are looking at anything from $42 to $112, and that is before you pay for a video editing application and any mastering.
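For anyone who wants to check the sums, here is a small Python sketch of the one-month bill. The prices are the ones quoted in this article and may well have changed; the dictionary and function names are made up for illustration.

```python
# One-month prices in US dollars, as quoted in this article (assumed, may be out of date).
BASE_TOOLS = {
    "ChatGPT Plus": 20,
    "MidJourney": 10,
    "Runway": 12,
}

# Optional lip sync tools, pick at most one.
LIP_SYNC_OPTIONS = {
    "Pika Labs Pro": 70,
    "Synclabs": 19,
}


def monthly_total(base_tools, extra=0):
    """Sum the base subscriptions plus an optional extra fee."""
    return sum(base_tools.values()) + extra


minimum = monthly_total(BASE_TOOLS)
print(f"Minimum (no lip sync): ${minimum}")

for name, price in LIP_SYNC_OPTIONS.items():
    print(f"With {name}: ${monthly_total(BASE_TOOLS, price)}")
```

The cheapest route skips lip sync entirely, and the most expensive adds the Pika Labs plan, which gives the two ends of the range above.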
What is the alternative?
There are open source versions of most of the services that you can run locally on a reasonably well-powered laptop. You can even get a chatbot running that looks just like ChatGPT. There are also free versions, or at least free tiers, for most tools, but they are limited in what they can do.
Tools like Pinokio offer one-click installs for a range of open source models, covering everything from MeloTTS for text-to-speech and Meta AudioGen for music creation to Stable Video Diffusion for generative AI video creation. But you need a hefty machine to make it work.
This compute requirement is also why it costs so much to use apps on the web like Runway, ChatGPT Plus or MidJourney. GPU time is expensive, and to create images or video from text you need the most expensive GPUs if you don’t want to wait hours.
Where is it going in the future?
One way to make AI cheaper is new chips that speed up processing. Groq makes one such chip, and the company is venturing into other types of generative content.
The other solution, potentially built on platforms like Groq, is a move towards all-inclusive services like LTX Studio, where you can create images, video, music, text and voice all in one place with a single subscription.
Right now it would be cheaper to find a forest somewhere and create a fun video using your iPhone, then enhance it with AI. That way you avoid the subscription fees, as you could probably get away with what is available on the free plans.
The flipside, and bringing us back to streaming, is that as it gets cheaper to create AI content, as AI content gets better and as all types of generative media can be made from a single prompt, the streaming platforms might start looking to AI for new types of storytelling.
Imagine being able to go to Netflix, search for something you want to watch (let’s say a superhero unicorn fighting crime on a future Mars base) and have AI make it for you, complete with a cool voice for the character, world building and a 10-part series.