Love is in the air this Valentine’s Day, and what better way to celebrate than with a song? Now, thanks to AI tools like Suno, making music to express your passion is easier than ever.
Inspired by the story of a Spanish artist marrying a hologram, and curious to see just how easy it is to make not just a song but a music video using AI, I decided to give it a try.
Using a combination of tools including ChatGPT, Suno, Leonardo, Stable Video and MidJourney, I created a music video for the generated song “Echoes of a Digital Heart.”
Picking a genre and generating lyrics
The first task was to come up with an idea for a song. I considered working on something for my wife as we’ve been together for 25 years this week, but in the end settled on going down the funny route with a track about the unrequited love between a man and an AI.
The next step was to come up with a genre and have ChatGPT write some lyrics. After a couple of terrible, never-to-be-spoken-of-again attempts at a classic country love song, I realized it needed to have more of an electronic feel and settled on 80s new wave.
Unlike Beyoncé, who smashed it with Texas Hold 'Em, AI doesn't have that country flair. It does seem to be able to tap into the style of A-ha, Ultravox and other icons of the 80s.
After that it was just a case of giving ChatGPT the basic concept of the story and the genre I wanted to play with, then asking it to generate the lyrics for a song complete with two verses and a chorus.
It came up with Echoes of a Digital Heart including the line: “Hearts and circuits, in a dance of despair, I'm lost in your code, a breathless air.” If that isn’t love, what is?
Making the music with Suno
Suno is still the best all-round AI music generator I’ve used. There are others coming up, and I think Google’s MusicFX does the best instrumentals, but Suno adds voice to its tunes.
I created a custom track in Suno, adding the lyrics from ChatGPT and the chosen genre. It can only produce up to about a minute of music in one go, so I split it into five segments to get a fully rounded three-minute song.
Each segment included either a verse and chorus, or pre-chorus and chorus. After you’ve generated one clip you can “continue from this clip” to keep the music and style and just change the lyrics. At the end you can “get whole song.”
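For anyone who wants to plan the segments before opening Suno, here is a minimal Python sketch of the arithmetic. It does not call Suno in any way. The one-minute cap, the five segments and the three-minute length come from the description above; the equal split and the order of verses and pre-choruses are assumptions made purely for illustration, and `plan_segments` is a hypothetical helper name.

```python
# Assumed figures taken from the article: roughly one minute per
# generation, five segments, and a song of about three minutes.
MAX_CLIP_SECONDS = 60
SONG_SECONDS = 180


def plan_segments(sections, song_seconds=SONG_SECONDS, cap=MAX_CLIP_SECONDS):
    """Give every segment an equal share of the song (a simplifying assumption)."""
    share = song_seconds / len(sections)
    if share > cap:
        raise ValueError(f"{share:.0f}s per segment exceeds the {cap}s cap")
    return [
        {"index": i + 1, "parts": parts, "start": round(i * share), "length": round(share)}
        for i, parts in enumerate(sections)
    ]


# Hypothetical ordering: the article only says each segment held either
# a verse and chorus, or a pre-chorus and chorus.
segments = plan_segments([
    ("verse", "chorus"),
    ("pre-chorus", "chorus"),
    ("verse", "chorus"),
    ("pre-chorus", "chorus"),
    ("pre-chorus", "chorus"),
])

for seg in segments:
    print(seg["index"], " + ".join(seg["parts"]), f"starts at {seg['start']}s")
```

With five segments each one only needs to cover about 36 seconds, comfortably inside the roughly one-minute limit, which leaves room for uneven sections.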
Making the music video
This was the hardest part of the entire endeavor and involved generating close to 80 individual images using a combination of MidJourney and Leonardo, then animating the best ones.
First I had to create a story outline for the song. I asked ChatGPT to plot a story based on the lyrics, then suggest imagery (assuming five-second clips) that would fit the narrative.
It did exactly as I asked and even broke down where the clips should go based on timestamps within the song. So I took its suggestions, tweaked the wording and fed them into Leonardo. I used the Australian startup's AI platform because it has impressive photorealism and video generation.
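The timestamp breakdown described above can also be laid out with a few lines of Python. This is only a sketch of the bookkeeping, not the output ChatGPT produced: `build_shot_list` is a hypothetical helper, the three-minute length is rounded to exactly 180 seconds, and the two sample prompts are invented placeholders.

```python
def build_shot_list(song_seconds, clip_seconds, descriptions):
    """Divide the song into fixed-length slots and attach an image prompt to each."""

    def stamp(seconds):
        # Format a number of seconds as m:ss.
        return f"{seconds // 60}:{seconds % 60:02d}"

    shots = []
    for i in range(song_seconds // clip_seconds):
        start = i * clip_seconds
        prompt = descriptions[i] if i < len(descriptions) else "(prompt still to write)"
        shots.append({
            "start": stamp(start),
            "end": stamp(start + clip_seconds),
            "prompt": prompt,
        })
    return shots


# Placeholder prompts for illustration only.
shots = build_shot_list(
    song_seconds=180,
    clip_seconds=5,
    descriptions=[
        "man alone at a glowing computer terminal, 1980s bedroom",
        "close-up of a face reflected in a CRT screen",
    ],
)

for shot in shots[:3]:
    print(shot["start"], "-", shot["end"], shot["prompt"])
```

A three-minute song at five seconds per clip works out to 36 slots, so generating well over that number of images leaves plenty of spares to choose the best from.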
Close to 2,000 credits later and I had about 90% of the video generated, but two things were missing: the dream sequence and a singer.
For the dream sequence I turned to a new tool I’ll be reviewing soon, Stable Video from Stability AI. Stable Video is an impressive video generator based on the Stable Video Diffusion model, the same one that powers Leonardo Motion but with more control.
For the singer I turned to MidJourney and asked it to create a British-looking singer from the 1980s with a retro AI vibe, looking straight to camera. It didn’t disappoint, although I had to use the MidJourney zoom out feature to get exactly what I wanted.
I then animated the singer image using Pika Labs to make the mouth move so it looked like he was performing the song.
Pulling it all together
The final stage was taking the myriad generated clips and the music and putting it all together. As I’m on a Mac and don’t have money for Final Cut Pro, I turned to iMovie to edit the final three-minute music video, and it gave me all the tools I needed.
It isn’t the greatest song of all time, there is a style shift between some of the clips in the music video, and there are a few moments where the narrative is confused, but I think overall the AI did a good job.