Tom’s Hardware
Anton Shilov

Microsoft says Word and Excel AI data scraping was not switched to enabled by default (Updated)


Edit 11/26/2024 7:00am PT: Microsoft has now stated via Twitter that the company does not use the data to train its large language models (AI models).



It is no secret that Microsoft Office includes Connected Experiences, which analyze content created by users. However, according to @nixCraft, an author of Cyberciti.biz, the Connected Experiences feature automatically gathers data from Word and Excel files to train the company's AI models. The feature is reportedly turned on by default, meaning user-generated content is included in AI training unless it is manually deactivated, and deactivating it is a convoluted process. Microsoft has yet to comment on the information, so take it with a grain of salt [EDIT: as stated above, Microsoft has now said it does not use this data to train its AI models].

This default setting allows Microsoft to use documents such as articles, novels, or other copyrighted or commercial works without explicit consent. The implications are significant for creators and businesses that rely on Microsoft Office for proprietary work, as their data could become part of the company's AI development. For this reason, anyone concerned about protecting their intellectual property or sensitive information should take action immediately.

To do so, users must actively opt out by finding and disabling the feature in settings. This means unchecking the 'Turn on optional connected experiences' box, which is enabled by default.

On a Windows PC, the steps are File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences, then unchecking the box. Requiring seven steps to disable a critical feature that is turned on automatically seems very convoluted.

Microsoft's approach mirrors a broad trend in the tech industry, where other companies have introduced similar features to train their AI models. While all AI models are trained on something generated by humans, using people's work without their consent is unethical, to put it mildly.

At the time of original publication, Microsoft had not publicly confirmed or denied that it uses content from Excel and Word documents generated by users of Microsoft Office to train its AI models. Nonetheless, there is a clause in Microsoft's Services Agreement that grants the company 'a worldwide and royalty-free intellectual property license to use Your Content.'

"To the extent necessary to provide the Services to you and others, to protect you and the Services, and to improve Microsoft products and services, you grant to Microsoft a worldwide and royalty-free intellectual property license to use Your Content, for example, to make copies of, retain, transmit, reformat, display, and distribute via communication tools Your Content on the Services," the clause reads.
