There are dozens of artificial intelligence music tools on the market, including from big tech companies like Google and Meta, but Suno has always stood out from the crowd.
Launching out of stealth mode in December last year, it first hit the headlines thanks to a partnership with Microsoft that made it accessible inside the Copilot chatbot.
What makes Suno different to the likes of MusicFX from Google or Meta’s AudioGen is that it also creates lyrics and vocals. This was a deliberate choice, and one that made training the model much more complicated, Keenan Freyberg, Suno co-founder, told Tom’s Guide.
“We want to enable anyone to have fun making music, and vocals are a big part of the fun,” he said. Version 3, which is now more widely available, brings radio-quality sound to the mix.
Creating a WOW moment
The first time I created a track using Suno AI I was shocked at just how well it generated a full song.
It isn’t perfect. There are still issues with phrasing, and it doesn’t always follow the genre in the prompt exactly, but it is orders of magnitude better than I could do on my own.
I play guitar, drums, and some piano and have dabbled with GarageBand, but I’m no musician in the composer or songwriter sense.
However, I do enjoy writing lyrics, and one potential use for Suno is as a way for a lyricist to get a “rough cut” of a song out of their imagination for later recording.
“We’re not trying to make music better, faster, or cheaper, whatever ‘better’ would even mean,” Freyberg told me.
“We’re always trying to explore entirely new ways to experience and engage with music, things you can uniquely do with AI,” he added.
They have also added dedicated instrumental support. I used this to create a haunting piano waltz for a video of a dancer made using Pika Labs. It captured the prompt perfectly.
How does Suno work?
There are two main modes in Suno AI: a basic, traditional AI-style text prompt with the option to make the track instrumental, and a custom mode where you can use your own lyrics, set a genre and give the song a title.
“Suno generates songs end-to-end. Each song (vocals, instruments, and all) is generated all at once,” Freyberg explained.
“This can be more challenging from a technical perspective, but we’ve found it produces higher-quality music than a sort of reverse stem separation approach, where you create the vocals, instruments, etc. separately then try to smoosh them together.”
Essentially it generates everything then gives you a complete track to listen to, including offering up the lyrics to read and a picture to illustrate the song.
What comes next for Suno?
Generating everything at once doesn’t mean Suno isn’t looking at stepping things up. Version 3 is already a step change in the quality of the songs produced, with vocals that sound more natural and less auto-tuned than was the case in Version 1.
“We’re just now getting to a point where fine-grained controls are becoming interesting,” Freyberg told me. There will be new features in future such as being able to “lock the parts of a song you like” and just regenerate the parts that didn’t really work as expected.
“I think these controls will enable people to engage with music at more points along the meme to masterpiece spectrum, which I’m really excited about,” he said. Adding that degree of control over the creative process could also potentially make the resulting song copyrightable by the user.
What genres work best on Suno?
It is basically a case of “leaving it to your imagination,” according to Freyberg. If you can think of it, Suno can create it. To test this out, I asked Claude 3 to suggest 50 genres and 50 one-line story ideas. I then wrote a Python script to create random prompts from those 100 items.
The first suggestion was a new age tango track about a society where it’s illegal to express emotion. It offered up lyrics like “emotions outlawed, desire concealed but beneath the surface, our spirits revealed.” The music was more tango than anything, but it sounded great.
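A script for this kind of test can be very short. The sketch below is illustrative rather than the exact script used: the two lists are small placeholders standing in for the 50 genres and 50 story ideas, and the function name `make_prompt` and the prompt wording are assumptions.

```python
import random

# Placeholder lists (assumption): the real test used 50 genres
# and 50 one-line story ideas.
GENRES = ["new age tango", "emo polka", "country western blues"]
STORY_IDEAS = [
    "a society where it's illegal to express emotion",
    "leaving the fridge open",
    "a long-haul trucker in space",
]


def make_prompt(genres, ideas, rng=random):
    """Pair one randomly chosen genre with one randomly chosen story idea."""
    genre = rng.choice(genres)
    idea = rng.choice(ideas)
    # Pick "A" or "An" based on the genre's first letter.
    article = "An" if genre[0].lower() in "aeiou" else "A"
    return f"{article} {genre} track about {idea}"


if __name__ == "__main__":
    # A seeded generator makes a run repeatable; drop the seed for fresh prompts.
    rng = random.Random(42)
    for _ in range(3):
        print(make_prompt(GENRES, STORY_IDEAS, rng))
```

Each generated line can be pasted straight into Suno's basic text prompt.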
“My Dad is a bit of a hobbyist music ethnographer. I had the good fortune to grow up in a home with an incredible, eclectic collection of CDs, so my taste is all over the place,” said Freyberg.
“I’m amazed by a lot of the genre x genre and genre x language crossovers — styles uniquely explorable with Suno. Trap sitar… Urdu jazzwave… Chinese bluegrass… strange bedfellows that work surprisingly well together. It’s fun to explore the usual suspects, but it’s a different experience altogether to explore uncharted territory.”
Moderating lyrics and music in Suno
Like any AI tool, Suno has the potential for misuse, including from people wanting to create songs that mimic famous artists, or songs with questionable lyrics.
The tool blocks any prompt that includes lyrics to other artists’ songs, as well as prompts that ask for a track “in the style of [artist]”. As Freyberg told me, “We’re not here to make a better Fake Drake.”
“We’re somewhat absolutist on copyright moderation, but traditional content moderation is more challenging in some ways,” he told me.
Suno uses third-party content moderation to look for harmful lyrics or dangerous content, but this isn’t an easy issue to solve. Freyberg said, “We’re actively exploring options that would enable us to take a more nuanced approach.”
“To make the understatement of the 21st century, content moderation is hard. It’s a challenge that routinely embroils companies with trillion-dollar market caps, and we’re trying to put our best foot forward as a small team of 12.”
How does v3 compare?
To put version 3 through its paces I asked some of my colleagues on Slack to suggest a random mix of genres and a topic.
We had everything from space trucking to country western blues to emo polka about leaving the fridge open, which sounded very punk.
I also tested the ability to continue from a clip and build a full track of about four minutes. It made some surprising changes to my lyric order, but these served to fit the music rather than to break it.
The sound quality of version 3 is a marked improvement, and it follows prompts more closely. While some vocals, particularly on country tracks, still sound artificial, it is a major step up from version 2.