OpenAI has moved on from unveiling its new GPT-4o model to announcing a deal with Reddit. The AI firm has inked the deal to gain access to real-time content from Reddit's Data API. This follows on from Reddit striking a similar deal with Google earlier in the year, which was believed to be worth about $60m.
While no financial terms were discussed in OpenAI's blog post announcing the agreement, the deal will also give Reddit the option to "bring new AI-powered features to Redditors and mods". And Reddit's share price jumped more than 10% following the announcement.
The final part of the deal involves OpenAI becoming a Reddit advertising partner.
While neither company mentioned what part training data plays in the deal, it's hard not to see OpenAI making use of the treasure trove of Reddit content, which could give ChatGPT even more context with which to handle queries and requests from users. The use of Reddit's posts for training data was explicitly stated in the Google deal, which revealed that it would give Mountain View "more efficient ways to train models".
Of course, Reddit has its downsides if we're discussing its suitability as a dataset. Unlike literature or regulated publications, its grammar is much looser, colloquialisms, inside jokes and memes are rampant, and there's also a lot of information on there that's just plain wrong.
"Reddit has become one of the internet's largest open archives of authentic, relevant, and always up to date human conversations about anything and everything," said Steve Huffman, Reddit co-founder and CEO. "Including it in ChatGPT upholds our belief in a connected internet, helps people find more of what they're looking for, and helps new audiences find community on Reddit."
Content vs AI
The deal also formalizes relations between an AI firm and a content company at a time when the two industries are at loggerheads. It's been well documented that multiple copyright owners have started legal claims against AI creators for scraping their content without permission.
How Reddit's own users, who went dark on over 7,000 subreddits last year to protest changes to its API pricing, will react to having their posts used to train AI remains to be seen.
But OpenAI has also agreed deals with publishers like the Financial Times and Associated Press in recent months.
The company is also busy rolling out its GPT-4o model, a multimodal AI that's faster and can understand text, image, video and audio prompts. There's no exact timeframe for when you'll get it, but once you've got access to 4o on your account, it will be available in both the mobile app and online as a free upgrade.