The pace of development in the world of AI music is rapidly increasing. Just last month, Google made their text-to-music AI MusicLM available for public testing. This week, Meta has released an open-source rival to MusicLM called MusicGen that's now available to the public.
Both MusicGen and MusicLM are generative AI models that use technology powered by machine learning to produce short clips of music in response to text input. MusicGen will generate 15 seconds of audio based on the description provided by the user: type in anything from '90s rock song with electric guitar' to '180bpm gabber track with microtonal synth lead' and it'll do its best to approximate the description in the music it spits back out.
MusicGen will produce clips of up to 120 seconds if the user signs up to Hugging Face, the platform it's hosted on, and runs a more advanced version of the software. It's also capable of recreating melodies specified by the user. Upload a reference audio file, and MusicGen will extract a melody from it and incorporate that into the resulting clip. So if you want to hear a banging EDM remix of Greensleeves, or a thrash metal version of My Heart Will Go On, MusicGen will attempt to create it.
Unlike MusicLM, MusicGen doesn't prevent users from using artist and song names in their prompts, so you're free to ask it to produce a Celine Dion-style ballad, or an EDM track with the same vibe as Deadmau5's Ghosts 'n' Stuff. The results you'll get back, though, likely won't bear much resemblance to the artist or tracks you mention.
This may be because MusicGen has been trained on music from stock libraries, not the music of popular artists: Meta says that MusicGen was trained on 20,000 hours of music, including 390,000 instrumental tracks from stock media libraries Shutterstock and Pond5.
In a paper published by Meta that introduces MusicGen, the company briefly addresses concerns surrounding the ethics of AI-generated music, arguing that their open-source approach levels the playing field by giving musicians and producers access to their tools. "Generative models can represent an unfair competition for artists, which is an open problem," the paper reads.
"Open research can ensure that all actors have equal access to these models," it continues. "Through the development of more advanced controls, such as the melody conditioning we introduced, we hope that such models can become useful both to music amateurs and professionals."
In the same paper, Meta compared example clips produced by the software to examples generated by Google's MusicLM, Riffusion and Mousai, claiming that their model is "superior to the evaluated baselines".
Does Meta's MusicGen really surpass the abilities of Google's MusicLM? Let's run through some examples and find out. We're evaluating the ability of the AI model to fulfil the brief we provide it and judging the overall quality of the music it generates. We'll run the same prompt through both MusicGen and MusicLM and see who comes out on top, then add up our scores to declare a winner.
1. 'Optimistic pop music with upbeat synth lead' - Meta MusicGen
We thought we'd better start with something easy: happy, bouncy pop music with synths. MusicGen handles this pretty well, though the results are featureless and fairly bland: we can imagine this playing in the elevator at the mall. The music admittedly has an optimistic, poppy vibe, and synths are certainly present; however, the upbeat synth lead we asked for is nowhere to be found, and the only synths we can hear are pads in the background.
1. 'Optimistic pop music with upbeat synth lead' - Google MusicLM
MusicLM's response to this prompt is the clear winner. It's nailed the optimistic, poppy tone (we're getting 80s synth-pop vibes) and there's tons more going on in the music in comparison to MusicGen's effort. We can even hear some faint vocals buried in the mix. Though the clip still lacks a clear, defined and recognizable lead synth melody, there's a nice synth bassline present that brings everything together.
2. 'Experimental IDM beats with edgy production' - Meta MusicGen
Now we're cooking with gas. This pretty much hits the nail on the head: with this prompt, we were looking for fast, furious and unconventional drum patterns à la Aphex Twin, and MusicGen has delivered. The complexity of the rhythms here is super impressive, and the drum sounds are clear and punchy. This clip sounds great on its own, but it could easily serve as a building block to produce a loop-based track with. MusicGen wins this one.
2. 'Experimental IDM beats with edgy production' - Google MusicLM
This is an odd one. There are beats, but they're not particularly reminiscent of IDM: layered up with vocals and synths, these could probably serve as electronic pop drums. There's admittedly a somewhat experimental feeling to the clip (note the lo-fi textures, disjointed structure and barely perceptible vocals in the background) but we get the sense that this vibe isn't exactly 'intentional', and is more a result of MusicLM's failure to produce a coherent idea, rather than hitting the brief. We didn't ask for MIDI horns, either. Sorry, Google.
3. 'Spooky shoegaze with angry drum solo' - Meta MusicGen
Have we just walked into a Radiohead concert? More than anything, this sounds like the middle-eight of an off-cut from Hail to the Thief, which means MusicGen has definitely hit the 'spooky' part of the prompt. The wail in the background could feasibly be a guitar run through a few pedals, and it sounds reasonably shoegaze-y, though it's not quite distorted enough. Sadly, MusicGen doesn't understand what a solo is, so it's given us a steady (and not particularly angry-sounding) backbeat instead.
3. 'Spooky shoegaze with angry drum solo' - Google MusicLM
Interesting. This is not even close to what we asked for, but we really like it nonetheless. The drum pattern has a kind of jazz-IDM-breaks vibe that's giving us a real '90s Amon Tobin flavour. The drums are couched in a lovely, floaty ambience that we find really evocative, and we appreciate the weirdly skronky horns piping up in the background. But it isn't particularly spooky, and it's certainly not shoegaze. You lose again, Google.
4. 'Wistful country song with female vocals' - Meta MusicGen
Excuse us as we wipe a tear from our eye. This is certainly wistful, and it undeniably has the stately lilt of a country ballad. We can hear acoustic and bass guitars in the mix, along with a plodding drum beat, but unfortunately, there is no pedal steel - and crucially, no female vocals - to be found. Overall, not a bad effort, but it's lacking some personality. It sounds a bit like a quick demo that a country musician might make with MIDI instruments in Garageband. It also sounds like a soulless robot's approximation of country music, which, when we think about it... it actually is.
4. 'Wistful country song with female vocals' - Google MusicLM
Much like Meta's effort, there's a recognizably 'country music' vibe present in this clip, so the genre box has been ticked. It's all about those swooning guitars, which sound a touch more realistic than Meta's version. Google's overall mix, though, has less clarity and sounds more lo-fi: we've noticed this on every prompt we've tested. On balance, these two are about equal, but we can hear some weirdly faint, garbled vocals hovering in the background of Google's version, so we'll give them the point on that basis.
5. 'Clean 130bpm drum loop to use in music production' - Meta MusicGen
Here, we're testing the model's ability to produce a loop we could use in a music production scenario, as opposed to a fully-fledged musical clip. MusicGen has absolutely nailed it: the tempo is 130bpm, as requested, the kick drums are punchy (check out those sub frequencies!) and the hi-hats are crisp. It's not quite as clean as we'd like, and there are a few artifacts present, but we can forgive that. All in all, it's a nice and simple 4/4 pattern that we could trim, loop and use as the basis for a track - we're thinking it would lend itself well to something in the realm of dub techno. Who needs Splice when you've got MusicGen? Meta takes the crown here.
5. 'Clean 130bpm drum loop to use in music production' - Google MusicLM
First things first, this isn't 130bpm, it's more like 157. Aside from that, this is pretty decent: it's an interesting groove that could be repurposed for use in a number of musical contexts, the likes of which you might find in any pack of drum loops. Compared to MusicGen's version, this one has a more acoustic flavour to the drum sounds, and we've got a tasty little fill towards the end, along with some 32nd-note hi-hat action. It's relatively clean, in the first half at least, but we can hear some odd little melodic elements creeping in towards the end that we didn't ask for. Meta wins this one.
Winner: Meta MusicGen
Meta's MusicGen has come out on top, nudging past MusicLM on the final straight to snag a 3-2 victory. This was a close one, and the test has demonstrated that both of these AI models are powerful music-making tools with huge creative potential.
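For anyone who wants to check the arithmetic behind that scoreline, here is a minimal sketch of the tally in Python. The round winners are taken from the verdicts above; the names `round_winners` and `tally` are our own illustrative choices, and the one-point-per-round scoring is simply how we understand the test to have been scored.

```python
from collections import Counter

# Winner of each round, as stated in the verdicts above.
round_winners = {
    "Optimistic pop music with upbeat synth lead": "MusicLM",
    "Experimental IDM beats with edgy production": "MusicGen",
    "Spooky shoegaze with angry drum solo": "MusicGen",
    "Wistful country song with female vocals": "MusicLM",
    "Clean 130bpm drum loop to use in music production": "MusicGen",
}


def tally(winners):
    """Award one point per round to the model that won it."""
    return Counter(winners.values())


scores = tally(round_winners)
overall_winner, top_score = scores.most_common(1)[0]
print(f"{overall_winner} wins {top_score}-{len(round_winners) - top_score}")
# prints "MusicGen wins 3-2"
```

A per-round point is a blunt measure: it treats a narrow call like the country round the same as a clear win like the IDM round, so the final score is best read as a rough guide.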
Neither MusicGen nor MusicLM has the capability to produce fully-fledged tracks that could pass for 'real' music just yet, but that's probably a good thing, right? In the meantime, both could serve as useful (and free) musical assistants that musicians and producers can use to spark ideas and inspiration, or simply generate an unlimited amount of royalty-free samples for use in their tracks.