Windows Central
Technology
Kevin Okemwa

Microsoft now has an AI that can turn hours of audio into text instantly — and businesses will love it

Microsoft AI CEO Mustafa Suleyman.

Microsoft is doubling down on its generative AI efforts with new in-house AI models, including "MAI-Transcribe-1". It's an advanced transcription model designed to deliver state-of-the-art speech-to-text accuracy across 25 of the world's most spoken languages, making it a strong candidate for meetings, closed captioning, and other forms of dictation.

MAI-Transcribe-1 will be available on Microsoft Foundry alongside MAI-Voice-1 and MAI-Image-2: "With this launch, MAI models will become broadly available for commercial use for the first time, enabling customers to evaluate and build with models across transcription, voice, and image generation," Microsoft says.

Microsoft says MAI-Voice-1 ships with hyper-realistic speech generation capabilities that preserve the speaker's identity and emotional range across long-form content. It also includes a new voice-prompting feature that can create custom brand voices from just one minute of audio.

MAI-Image-2 is Microsoft's new text-to-image generation model, which excels at natural lighting, accurate skin tones, and clear in-image text. What's more, it has ranked among the top three on the Arena.ai text-to-image leaderboard.


So, is Microsoft building its own AI camp?

It's no secret that Microsoft relies heavily on OpenAI's AI technology, which it has integrated across its tech stack. However, the tech giant has openly criticized the ChatGPT maker's GPT-4 technology, saying that it's too expensive and too slow to meet consumer needs.

Last year, Microsoft started developing its own in-house AI models and testing third-party ones for Copilot, potentially freeing itself from an overdependence on OpenAI for its AI efforts. Microsoft's AI CEO, Mustafa Suleyman, confirmed that the company is developing "off-frontier" AI models, but admitted that they'd come a close second to OpenAI's sophisticated technology.

Last month, Microsoft made some major changes to its Copilot leadership structure, splitting the division into four pillars: Copilot experience, Copilot platform, Microsoft 365 apps, and AI models.


Ex-Snap exec Jacob Andreou will lead Copilot experiences, both consumer and commercial, as an executive vice president reporting to Microsoft CEO Satya Nadella. Consequently, Microsoft's AI CEO, Mustafa Suleyman, will now double down on building in-house AI models for the company.

I guess Salesforce CEO Marc Benioff was onto something when he predicted that Microsoft wouldn't use OpenAI's technology in the future, following the announcement of the ChatGPT maker's now-abandoned $500 billion Stargate project designed to facilitate the construction of data centers across the United States.

