Artificial intelligence has a sustainability problem, and the solution, according to IBM’s former global head of AI, is a paradigm shift away from today’s large language models like OpenAI’s GPT-4 or Anthropic’s Claude.
Tools like ChatGPT run on LLMs: artificial neural networks that are trained on vast amounts of data scraped from the web and generate answers to text-based prompts.
Speaking at the Fortune Brainstorm AI Singapore conference last week, Seth Dobrin said the future could belong to small language models (SLMs) that are tailor-made to address specific applications and require far less energy to operate.
“These massive large models are not what the technology was built for. They’re cool, they’re fun, but that’s not the solution,” Dobrin, a general partner at venture capital fund 1infinity Ventures, told conference participants. “Look at using small, task-specific models.”
Experts have been warning that AI will not be able to reach its full potential until it solves its addiction to energy. Arm Holdings, a designer of power-efficient microchips for handheld devices, predicted earlier this year that GenAI could gobble up a quarter of all the electricity consumed in the United States come 2030.
It’s not just energy, either. Many data centers also use water, alongside air, to cool their servers as they crunch through terabytes of data in seconds.
Dobrin said too few people are even aware of the ecological impact when they use ChatGPT or Claude today.
“For every 25 to 50 prompts, depending on how big they are, you use about half a liter of water — just through evaporation,” he said. “We absolutely need new paradigms of cooling.”
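Taken at face value, that works out to roughly 10 to 20 milliliters of water evaporated per prompt. Here is a back-of-the-envelope sketch of the arithmetic; the half-liter and 25-to-50-prompt figures are Dobrin’s, and everything else is simple division:

```python
# Rough per-prompt water cost implied by Dobrin's figures:
# about 0.5 L of water evaporated per 25 to 50 prompts, depending on prompt size.
WATER_PER_BATCH_L = 0.5
PROMPTS_PER_BATCH_LOW = 25    # larger prompts -> fewer per half-liter
PROMPTS_PER_BATCH_HIGH = 50   # smaller prompts -> more per half-liter

ml_high = WATER_PER_BATCH_L * 1000 / PROMPTS_PER_BATCH_LOW   # 20 mL per prompt
ml_low = WATER_PER_BATCH_L * 1000 / PROMPTS_PER_BATCH_HIGH   # 10 mL per prompt

print(f"Implied water cost: roughly {ml_low:.0f} to {ml_high:.0f} mL per prompt")
```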
Unfortunately, as miniaturization advances and process technology shrinks to 2- and 3-nanometer nodes, the smallest circuit dimensions that can be printed onto a silicon wafer, chips’ thermal properties are becoming a bigger and bigger problem.
This presents a growing dilemma for the industry: fans and air conditioners cannot carry the heat away fast enough, and even cooling plates attached directly to chips lose effectiveness beyond a certain computational intensity.
Immersing servers in baths of oil to quickly dissipate heat
“We’re getting less and less efficient with every step, and this has coincided with the rise of AI, which unfortunately uses the most energy-intensive and hottest chips out there,” Tim Rosenfield, co-founder and co-CEO of Sustainable Metal Cloud, told the conference.
He believes SMC may have the answer. Its flexible, modular HyperCube data center hardware can roughly halve carbon emissions compared with a conventional air-cooled H100 HGX system by submerging servers directly in a bath of oil, a relatively new process known as immersion cooling.
Oil is far more effective than air at extracting heat and, unlike water, is not electrically conductive, making it better suited for server racks that will have to run bleeding-edge 2nm and 3nm AI training and inference chips.
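To put “far more effective” in rough perspective: a coolant’s capacity to carry heat scales with its volumetric heat capacity, which is density times specific heat. The sketch below uses typical textbook values for air and mineral oil, not figures from SMC:

```python
# Volumetric heat capacity (J per cubic meter per kelvin) = density * specific heat.
# Property values below are typical textbook figures, not vendor data from SMC.
AIR_DENSITY = 1.2            # kg/m^3, air at room temperature
AIR_SPECIFIC_HEAT = 1005.0   # J/(kg*K)
OIL_DENSITY = 850.0          # kg/m^3, typical mineral oil
OIL_SPECIFIC_HEAT = 1900.0   # J/(kg*K)

air_vhc = AIR_DENSITY * AIR_SPECIFIC_HEAT   # ~1.2e3 J/(m^3*K)
oil_vhc = OIL_DENSITY * OIL_SPECIFIC_HEAT   # ~1.6e6 J/(m^3*K)

print(f"Oil absorbs ~{oil_vhc / air_vhc:,.0f}x more heat per unit volume than air")
```

On a per-volume basis the gap is roughly three orders of magnitude, which is the core physical argument for dunking servers in oil rather than blowing air over them.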
“The problem that we’re trying to solve for—how to turn energy into knowledge as cost-effectively and efficiently as possible—starts with heat,” said Rosenfield. “That is one of the drivers of AI’s energy challenge.”
While immersing servers directly in nonconductive liquids can help dissipate ever-growing volumes of heat quickly and efficiently, the new technology comes with its own challenges in terms of upfront investment costs, maintenance, and repair.
As a result, Rosenfield’s plan is to offer it as a complete packaged deal he’s calling cooling-as-a-service.
Beyond new cooling technologies, venture capitalist Dobrin had one more bit of unconventional advice that might help minimize AI’s substantial carbon footprint.
Before reaching for a tool as powerful as ChatGPT or Claude, forget the GenAI hype for a moment and start by asking yourself whether something else might be equally suited for the task.
“Focus on the use case—what problem are you trying to solve?” he said. “Do you really need generative AI?”