X has had its own AI chatbot, Grok, for a while, but it'd be fair to say it's not mentioned in the same breath as OpenAI's ChatGPT or Google Gemini.
That's not for want of trying, though, and with X's huge user base providing data for the model, a new version was always expected.
Now, the obviously named Grok-2 has entered beta. In a new blog post, X says it represents "a significant step forward from our previous model Grok-1.5, featuring frontier capabilities in chat, coding, and reasoning."
"At the same time, we are introducing Grok-2 mini, a small but capable sibling of Grok-2. An early version of Grok-2 has been tested on the LMSYS leaderboard under the name 'sus-column-r.' At the time of this blog post, it is outperforming both Claude 3.5 Sonnet and GPT-4-Turbo."
Grok-2 outperforms Claude 3.5 Sonnet and GPT-4-Turbo
So, what's new? As the graph above shows, the overall Elo score for an early model of Grok-2 beats out every comparable chatbot except for ChatGPT-4o and Google Gemini.
X also says that Grok-2 and its Mini counterpart "achieve performance levels competitive to other frontier models in areas such as graduate-level science knowledge (GPQA), general knowledge (MMLU, MMLU-Pro), and math competition problems (MATH)," while also pointing to vision-based tasks as an area of improvement.
Grok will also gain a new interface on X, as well as the option to generate images with a prompt. This is achieved through the integration of the popular Flux AI image generation model from Black Forest Labs.
Grok will be offered through a new enterprise API later this month, which X is promising will offer a "bespoke tech stack" as well as mandatory multi-factor authentication.