Researchers at Johns Hopkins University claim to have come up with a new method of pitch correction that’s “more than just Auto-Tune on steroids” and uses AI to “enhance the naturalness and quality of pitch correction, surpassing previous tools.”
Those are some pretty big boasts, but the creators of Diff-Pitcher, as it’s known, are confident that they’ve found a new way of doing things that delivers better results.
Team member Jiarui Hai, a PhD student in the Whiting School of Engineering’s electrical and computer engineering department, says: “Diff-Pitcher is a generative deep neural network that takes pitch correction technology to a new level. Its precision and control can not only help musical artists and producers but also open new possibilities in areas such as voice rehabilitation and assistive technologies.”
What’s different about it, though? The researchers claim that, unlike traditional pitch correction software, which they say is trained on pairs of corrected and original vocals, Diff-Pitcher analyses the spectrogram of the original vocals that need to be corrected. It then identifies the target notes, predicts the adjustments needed to produce a corrected spectrogram, and transforms that corrected spectrogram into audio.
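To give a rough sense of what “identifying target notes” and “predicting adjustments” mean in the simplest possible case, here is a minimal Python sketch of conventional note-snapping: it takes a sung frequency, finds the nearest note and reports how far the pitch would need to move. This is not Diff-Pitcher’s method, which the researchers describe as a generative neural network working on spectrograms. The function name `nearest_note` is hypothetical, and the sketch assumes equal temperament with A4 tuned to 440 Hz.

```python
import math

A4_HZ = 440.0  # assumed tuning reference
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz):
    """Return (note name, target frequency in Hz, shift in cents).

    Hypothetical helper for illustration only: it snaps a single
    frequency to the closest equal-tempered note.
    """
    # Convert frequency to a fractional MIDI note number (A4 = 69).
    midi = 69 + 12 * math.log2(freq_hz / A4_HZ)
    # The target note is the closest whole semitone.
    target_midi = round(midi)
    target_hz = A4_HZ * 2 ** ((target_midi - 69) / 12)
    # Positive cents means the voice must be raised, negative lowered.
    cents = 100 * (target_midi - midi)
    name = NOTE_NAMES[target_midi % 12] + str(target_midi // 12 - 1)
    return name, target_hz, cents


if __name__ == "__main__":
    # A singer aiming for A4 who lands slightly sharp at 450 Hz.
    name, target_hz, cents = nearest_note(450.0)
    print(f"{name}: target {target_hz:.1f} Hz, shift {cents:.1f} cents")
```

Snapping every moment of a performance to the grid like this is what produces the familiar robotic effect; the researchers’ claim is that their approach keeps the voice sounding natural while still allowing control over the target pitch.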
“[The results sound] really natural,” says Hai, “and unlike in older ways of fixing pitch, we can still regulate how high or low the voice goes.”
The new technology was presented by Hai and lead researcher Mounya Elhilali, a professor in electrical and computer engineering, at the 2023 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. They believe that it could have benefits beyond the realms of music production, too: “The technology could revolutionize treatment for a spectrum of speech-related disorders, offering valuable support for post-laryngectomy patients and contributing to the voice rehabilitation of stroke victims,” says Hai.