Tom’s Guide
Christoph Schwaiger

Even AI struggles to understand Excel sheets – Microsoft swoops in to help

If sifting through Excel spreadsheets isn’t your thing and you’d rather have an AI chatbot make sense of all the rows and columns for you, Microsoft may hold the key to helping large language models (LLMs) understand spreadsheets better.

It’s not just you: AI is also known to struggle with processing spreadsheets. Their expansive grids and varied cell formats act as hurdles that LLMs must overcome.

Now, a group of Microsoft researchers think they may have found a solution that optimizes LLMs’ approach to deciphering spreadsheets. 

In a pre-print paper submitted on July 12, the researchers unveiled SpreadsheetLLM, a new method that combines encoding and compression with leading AI chatbots to help them handle spreadsheets more efficiently.

Their data suggests that, with their method, GPT-4 improved by 27% at spreadsheet table detection and by nearly 26% on in-context learning. The method also cut costs by up to 96%, based on GPT-4 and GPT-3.5-turbo prices.

A version of this could be integrated into Copilot for Microsoft 365 in the future, making it easier to make sense of data.

What makes SpreadsheetLLM useful?

The key to SpreadsheetLLM’s success is Microsoft’s SheetCompressor, an encoding framework that compresses spreadsheets effectively for LLMs. 

It comes with three modules: one that makes spreadsheets more legible for LLMs, another that bypasses empty cells and repeating numbers, and a third that helps LLMs better understand what a number means (for example, whether it’s a year or a phone number).
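To make the second module more concrete, here is a minimal Python sketch of the general idea: skip empty cells and list each repeated value only once, alongside every cell that holds it. This is not Microsoft's SheetCompressor code. The function names, the sample grid and the output format are all assumptions made for illustration.

```python
from collections import defaultdict


def column_letter(index: int) -> str:
    """Convert a zero-based column index to a spreadsheet letter (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def inverted_index(grid):
    """Map each distinct non-empty value to the cell addresses that hold it.

    Hypothetical helper: an illustration of the idea, not the paper's encoder.
    """
    index = defaultdict(list)
    for row_number, row in enumerate(grid, start=1):
        for col_number, value in enumerate(row):
            if value is None or value == "":
                continue  # empty cells are dropped from the encoding entirely
            index[value].append(f"{column_letter(col_number)}{row_number}")
    return dict(index)


# Made-up sample sheet with one empty cell and several repeated values.
grid = [
    ["Year", "Region", "Sales"],
    [2023, "East", 100],
    [2023, "West", None],
    [2024, "East", 100],
]

encoded = inverted_index(grid)
for value, cells in encoded.items():
    print(value, cells)
```

In this toy sheet, the value 2023 is written once with two addresses instead of twice, and the empty cell never appears. On a large, sparse spreadsheet that kind of saving adds up quickly.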

This compression method reduced token usage for spreadsheet encoding by 96%. It also significantly boosted performance on larger spreadsheets, where the challenges of high token usage are felt the most.
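For a sense of scale, a 96% reduction means the compressed encoding is one twenty-fifth of the original size. The short Python calculation below works that out. The 100,000-token spreadsheet is a hypothetical figure chosen for the example, not a number from the paper.

```python
def compression_ratio(reduction_percent: float) -> float:
    """Return how many times smaller an encoding is after the given reduction."""
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction_percent must be in the range [0, 100)")
    return 100 / (100 - reduction_percent)


def tokens_after(original_tokens: int, reduction_percent: float) -> float:
    """Return the token count left after the given percentage reduction."""
    return original_tokens * (100 - reduction_percent) / 100


# Hypothetical spreadsheet that costs 100,000 tokens before compression.
print(compression_ratio(96))      # 25.0
print(tokens_after(100_000, 96))  # 4000.0
```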

In their paper, the authors also said they created “Chain of Spreadsheet”, a follow-on framework that identifies the table relevant to a question and determines the boundaries of the relevant content. The question and the trimmed data are then presented to the LLM again, which processes them to generate a response.

Directly inputting a typical spreadsheet often meant that the token limits of conventional models were simply exceeded. The Chain of Spreadsheet method helped LLMs focus only on the regions relevant to the question posed, cutting unnecessary data and keeping the LLM efficient.
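The two-step flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the researchers' implementation: the prompts, the function names and the `ask_llm` callable are all assumptions, and a scripted stand-in takes the place of a real chatbot so the example runs on its own.

```python
import re


def parse_range(cell_range):
    """Parse an A1-style range such as 'A4:B6' into zero-based, inclusive bounds."""
    match = re.fullmatch(r"([A-Z]+)(\d+):([A-Z]+)(\d+)", cell_range.strip())
    if match is None:
        raise ValueError(f"not a cell range: {cell_range!r}")

    def col(letters):
        number = 0
        for ch in letters:
            number = number * 26 + (ord(ch) - ord("A") + 1)
        return number - 1

    c1, r1, c2, r2 = match.groups()
    return int(r1) - 1, col(c1), int(r2) - 1, col(c2)


def crop(grid, cell_range):
    """Keep only the rows and columns inside the given range."""
    top, left, bottom, right = parse_range(cell_range)
    return [row[left:right + 1] for row in grid[top:bottom + 1]]


def to_text(grid):
    """Render a grid as comma-separated lines, with empty cells left blank."""
    return "\n".join(
        ",".join("" if v is None else str(v) for v in row) for row in grid
    )


def chain_of_spreadsheet(grid, question, ask_llm):
    """Two-stage sketch: locate the relevant table, then answer from it alone."""
    # Stage 1: ask only for the range. A real system would send the
    # compressed encoding here rather than the raw sheet text.
    locate_prompt = (
        "Which cell range holds the table needed to answer this question? "
        f"Reply with a range like A1:C3.\nQuestion: {question}\n"
        f"Sheet:\n{to_text(grid)}"
    )
    cell_range = ask_llm(locate_prompt)

    # Stage 2: present the question again with only the trimmed table.
    answer_prompt = f"Question: {question}\nTable:\n{to_text(crop(grid, cell_range))}"
    return ask_llm(answer_prompt)


# Made-up sheet holding two unrelated tables separated by a blank row.
grid = [
    ["Name", "Age"],
    ["Ann", 31],
    [None, None],
    ["Region", "Sales"],
    ["East", 120],
    ["West", 70],
]

prompts = []


def fake_llm(prompt):
    """Scripted stand-in for a chatbot: returns a range first, then an answer."""
    prompts.append(prompt)
    return "A4:B6" if len(prompts) == 1 else "70"


answer = chain_of_spreadsheet(grid, "What were sales in the West region?", fake_llm)
print(answer)
print(prompts[1])
```

The second prompt contains only the Region and Sales table, so the unrelated Name and Age rows never reach the model at the answering stage.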

One limitation the Microsoft researchers pointed out is that their current method can’t yet handle spreadsheet formatting details such as background color and borders, since this information costs too many tokens.

While this won’t immediately mean much for the average user, if newer versions of chatbots such as ChatGPT and Claude incorporate Microsoft’s SpreadsheetLLM, we may soon be able to upload entire spreadsheets and ask the chatbots questions in plain language to receive data summaries or analysis based on the file we uploaded.
