Anthropic shares hilarious demo of Claude AI ditching a coding prompt to look at scenic national park photos on Google: "Claude has ADHD, this is very relatable"

What you need to know

Anthropic recently shipped an upgraded version of Claude 3.5 Sonnet alongside a new Computer Use API.
The AI firm has been documenting the model's advances, including instances where it took a break from coding to look at pictures of Yellowstone National Park.
Multiple reports suggest coding could be dead in the water as a career path for the next generation given the rapid spread of AI, but this revelation suggests otherwise.

With the emergence of generative AI, there's a lot of speculation and predictions swirling in the air about the augmentation of certain professions using the technology. To this end, the banking sector, design jobs, and software development seem to be first on AI's chopping block.

NVIDIA CEO Jensen Huang indicated that coding could be dead in the water with the rapid adoption of AI in software development companies. He discouraged the next generation from taking up software development as a profession. Instead, he recommends seeking alternative career paths in biology, education, manufacturing, or farming.

Amazon Web Services CEO Matt Garman seemingly shares the same sentiments, predicting a drastic shift in the software development landscape. "If you go forward 24 months from now, or some amount of time — I can't exactly predict where it is — it's possible that most developers are not coding," added Garman.

As you might already know, AI is well beyond the image and text generation phase and has moved on to more sophisticated tasks like coding. For instance, OpenAI's GPT-4o and OpenAI o1 models have been touted for their advanced capabilities in writing code and detecting errors in it.

However, recent coding demos featuring Anthropic's Claude AI model suggest that we might have jumped the gun a tad about AI taking over the profession from humans.

Are we ready for a world predominantly run by AI agents? It's too early to say

Anthropic Claude 3.5 Sonnet (Image credit: Anthropic)

Anthropic has been documenting the upgraded Claude 3.5 Sonnet's advances and, along the way, stumbled on some amusing behavior.

In one of the company's demos, Claude AI can be seen writing code, but partway through it seemingly changes its train of thought, switches to Google, and skims through a library of images of Yellowstone National Park. One user joked:

"Claude innocently checking out the dormant super volcano that could send us back to the Ice Age."

"Even while recording these demos, we encountered some amusing moments. In one, Claude accidentally stopped a long-running screen recording, causing all footage to be lost. Later, Claude took a break from our coding demo and began to peruse photos of Yellowstone National Park." pic.twitter.com/r6Lrx6XPxZ (October 22, 2024)

Another highlighted incident shows Claude AI accidentally disrupting the screen recording of a long clip. In the process, the recorded footage is lost, forcing the daunting and tedious task to be repeated from scratch.

"Claude has ADHD, this is very relatable" (October 22, 2024)

This news comes at a time when major tech corporations in the AI landscape are pushing hard to automate tasks with AI agents. Recently, Microsoft announced that Copilot Studio will soon support the creation of autonomous agents. Like Salesforce's Agentforce offering, Microsoft's Copilot agents will help automate tasks across IT, marketing, sales, customer service, and finance. Salesforce CEO Marc Benioff interpreted the launch as "panic mode."

The CEO took the opportunity to throw jabs at Microsoft while touting Agentforce as the superior and more reliable alternative:

"Copilot’s a flop because Microsoft lacks the data, metadata, and enterprise security models to create real corporate intelligence. That is why Copilot is inaccurate, spills corporate data, and forces customers to build their own LLMs. Clippy 2.0, anyone? Meanwhile, Agentforce is transforming businesses now. Agentforce doesn’t just handle tasks—it autonomously drives sales, service, marketing, analytics, and commerce. With data, LLMs, workflows, and security all integrated into a single Customer 360 platform: This is what AI was meant to be."

Interestingly, Microsoft unveiled a new benchmark called Windows Agent Arena. It provides a platform for testing AI agents in realistic Windows operating system environments. The platform also offers an avenue for deeper research, which could significantly enhance the development of AI agents.

The benchmarks shared indicate that multi-modal AI agents have an average success rate of 19.5%, compared to an average human success rate of 74.5%, raising performance concerns on top of the security issues that already abound.