The rapid rise of OpenAI's artificial intelligence chatbot ChatGPT has left many wondering what else will be changed by generative AI tools. If a Google research paper released this week is anything to go by, songwriting will be changed, and perhaps the music industry along with it.
The paper describes a tool called MusicLM that “can transform whistled and hummed melodies according to the style described in a text caption.” It can also generate “high-fidelity music from text descriptions such as ‘a calming violin melody backed by a distorted guitar riff.’”
On the paper’s website, examples show results generated by the tool. In one instance, somebody hums “Bella Ciao,” an Italian folk song from the late 19th century. Then, based on that, the tool generates music with various instruments and styles, including guitar solo, string quartet, and jazz with saxophone.
“Google announces MusicLM: a model to generate music from text. Here are some crazy things it can do: 1. Given audio of a melody, it can generate new music inspired by that melody customized by prompts! Here's someone humming bella ciao turned into a cappella chorus, EDM, etc.” pic.twitter.com/HKDnXI1C8U
(Tweet by bleedingedge.ai, @bleedingedgeai, January 27, 2023)
"Whoa, this is bigger than ChatGPT to me. Google almost solved music generation, I'd say," tweeted Keunwoo Choi, an AI scientist at Gaudio Lab, an AI audio technology company.
“Think of MusicLM as the ChatGPT for music,” tweeted entrepreneur Martin Uetz, adding, “I can't wait for this to go mainstream.”
Generative AI vs. artists
Less eager might be musicians who’ve spent decades mastering their instruments, just as illustrators and graphic artists have been angered by AI tools that create impressive images from mere text prompts.
Among those AI art tools are Midjourney, Stable Diffusion, and DALL-E 2. One man recently used Midjourney to illustrate a children’s book. Impressed with the tool, he shared his experience on social media—and was stunned by the backlash from illustrators. And last year, an image generated with Midjourney won a prize at an art festival, which also angered artists.
The problem artists have with such tools is that they are trained on massive collections of digitized artworks without the artists' consent. A lawsuit recently filed in San Francisco by working artists describes Stable Diffusion and Midjourney as “collage tools that violate the rights of millions of artists.”
Indeed, copyright concerns are keeping Google AI from releasing MusicLM to the public. But startups might be more willing to release such technology into the wild.
Not that Big Tech isn’t also plowing resources into generative AI.
DALL-E is offered by ChatGPT maker OpenAI. Microsoft is investing billions in OpenAI and will use its technology in a wide variety of products, including the Bing search engine. That in turn has lit a fire under Google parent Alphabet, which is working on similar tools to answer the challenge.
As a tool, MusicLM is far from perfect, but it hints at where things are headed. The same can be said of ChatGPT itself. As billionaire Mark Cuban recently said of the AI chatbot, “imagine what GPT 10 is going to look like.”