Artificial intelligence has already had a big year with the release of GPT-4o from OpenAI, Claude 3.5 Sonnet from Anthropic and the Gemini 1.5 family from Google, but Meta’s release of the massive 405-billion-parameter Llama 3.1 is a strong contender for the "most important" crown.
Meta released version 3.1 of its open-source Llama AI model family yesterday, and the model quickly gained a reputation as one of the most powerful and useful available, beating proprietary models from OpenAI and Anthropic on many major benchmarks.
Overall, its performance is roughly on par with GPT-4o or Claude 3.5 Sonnet: slightly worse at some things and slightly better at others. Yet it isn’t the performance that makes it significant; it's the fact that it is open source, widely available and can even be downloaded and run on your own machines.
This level of access to, and control over, a frontier-grade AI model is groundbreaking, as it will lead to new research, new types of models and advancements in areas that might not be worth the per-token cost of using GPT-4o or Claude 3.5 Sonnet.
If you don’t have a data center of your own, the smaller models can run on a good gaming laptop, and there are a multitude of cloud platforms and services offering access, including Groq and Perplexity. If you’re in the U.S., it's also available on WhatsApp and the Meta.ai chatbot.
Why is Llama 3.1 405b so significant?
Training a large language model is incredibly expensive. The recent focus has been on efficiency over scale, and even OpenAI has released a smaller model in GPT-4o mini.
However, as good as the smaller models are becoming, size does matter when it comes to frontier-level intelligence. With Llama 3.1 405b, Meta has found a compromise, packing trillion-parameter-class quality into a model less than half that size.
This is the first frontier-grade model to be released as open source, and Meta went a step further by allowing companies, organizations and individuals to use data generated by 405b to fine-tune, or even completely train, their own models.
Meta is releasing not just the model family but a full ecosystem, complete with sample applications and prompt guards for moderation and guardrails, and it has proposed a new API interface standard intended to make building AI applications easier.
Aside from being open source and offering advanced capabilities, a full ecosystem, smaller models and custom features, Llama 3.1 405b seems to excel at multilingual translation, general knowledge and mathematics. It is also easy to customize for specific needs.
Victor Botev, CTO of AI research company Iris.ai, described Llama 3.1 405b as a "significant step forward in democratizing access to AI technology." Being open and accessible, he said, makes it easier for researchers and developers to "build on state-of-the-art language AI without the barriers of proprietary APIs or expensive licensing fees."
Where can I try Llama 3.1 405b?
Meta announced the release on July 23, 2024: "Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context…"
Llama 3.1 405b might already be one of the most widely available AI models, although demand is so high that even normally reliable platforms like Groq are struggling with the load.
1. Meta.ai/WhatsApp
The best place to actually try it out is on Meta’s own Meta.ai chatbot or the WhatsApp messaging platform. Both give an "as intended" way to view and use the model, and on Meta.ai it comes with access to image generation and other features.
The downside to this option is that it is only available in the U.S., and Meta has done a good job of blocking VPNs. It also requires a Facebook or Instagram account.
2. Groq inference engine
I’ve long extolled the brilliance of Groq, the AI startup building chips designed to run AI models very quickly. It has provided easy access to all of the open-source models, including previous versions of the Llama family, and it now has all three Llama 3.1 models.
It does offer access to 405b, but demand was so high that availability has been limited, so the model may not appear when you visit, whether through the chatbot or through GroqCloud. It's still a great way to try out the 70b and 8b models, which have also been upgraded in Llama 3.1.
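For developers, Groq exposes an OpenAI-compatible REST endpoint, so querying a Llama 3.1 model from code is only a short script. The sketch below just builds the request rather than sending it; the endpoint path and the model identifier are assumptions based on Groq's OpenAI-compatible interface, so check GroqCloud's documentation for the model names currently on offer.

```python
import json
import os

# Hypothetical sketch: Groq's OpenAI-compatible chat completions endpoint.
# The URL path and model ID are assumptions — verify against GroqCloud docs.
GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "llama-3.1-70b-versatile") -> dict:
    """Return the URL, headers and JSON body for a single-turn chat request."""
    return {
        "url": GROQ_CHAT_URL,
        "headers": {
            # Reads the API key from the environment; empty if not set.
            "Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

request = build_chat_request("In one sentence, what is Llama 3.1?")
print(request["url"])
```

To actually send it, POST the body to the URL with your GroqCloud API key in the environment, using any HTTP client such as `urllib.request` or `requests`.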
3. Perplexity search
Perplexity is an amazing tool for searching the web, using a range of custom and public AI models to enhance the results returned by traditional web crawlers. It can also generate custom pages offering Wikipedia-style guides to a topic.
One of the newest models available through Perplexity is Llama 3.1 405b, but there is a catch: it is only available on the ‘pro’ plan, which costs $20 per month. That said, this is a good choice if you’re looking for a way to search the web and work with a range of AI models.
4. Hugging Face Chat
HuggingChat is something of a hidden gem in the AI chatbot space, offering access to a wide range of models, including some not available anywhere else, plus tools such as web search and image creation. You need a Hugging Face account, but it's easy to set up and get started.
It is completely free to use; once you’ve signed in, just go to settings and select Llama 3.1 405b. The downside to this platform is the learning curve: it doesn’t shy away from using full model names and descriptors, so it isn’t beginner-friendly.
5. Poe chat marketplace
Poe, the Quora-backed chatbot marketplace, works a bit like HuggingChat in that it gives you access to a range of models and lets you customize the way they interact with you, but with a more user-friendly, consumer-focused approach.
Unlike HuggingChat, which is relatively open and largely free to use, Poe charges "compute points" per message sent. You get a relatively generous amount per day for free, but 405b is an expensive model at 485 compute points per message, so you’ll only get half a dozen or so messages without paying $17 a month for a premium account.
Are there other alternatives?
If none of these work, or you want more control over how you use Llama 3.1 405b and you don’t have your own data center, it's worth looking to one of the multitude of cloud computing platforms, and not just Amazon's AWS, Google Cloud or Microsoft Azure, although all three offer access to the new model.
Snowflake, Cloudflare, Databricks, Nvidia AI Foundry and IBM Cloud are just a few of the places where you can sign up for a developer account and gain access to the open-source model. You can also try it directly, without signing up, from SambaNova Systems.