Last week a global IT outage left planes grounded, emergency services offline and people unable to work. The exact cause is still under investigation, but it appears to have been a buggy software update from cybersecurity company CrowdStrike.
Getting to grips with exactly what happened and how can be difficult as it was so far-reaching and had such a significant impact, which led me to ask: can AI help?
To get an answer out of artificial intelligence for a very recent news story I’d need to pick a model capable of browsing the internet. That puts Copilot, Gemini, Perplexity and ChatGPT as the primary contenders and — as Microsoft was badly hit — I picked Copilot.
Then I thought it would be worth seeing how it compares to another model, so I’ve put Microsoft Copilot and Google Gemini head-to-head on a series of ‘interview-type’ questions I might have asked a cybersecurity expert or a computing specialist — including who is to blame.
Creating the prompts
Unlike other experiments where the prompts need to cover a range of topics and often require starting a new chat for each, this was specifically contained within a single message window.
The AI had to be able to remember the first question I asked by the end, to ensure it flowed like an interview and not just a disjointed run of queries.
Q: Can you explain what happened with the most recent CrowdStrike global IT outage?
The first prompt was obvious as the idea was to get both chatbots to explain what happened. This tests their ability to find up-to-date information by searching the web, then analyze and present it clearly and concisely.
Microsoft Copilot
According to Microsoft Copilot, “CrowdStrike distributed a faulty update to its security software” that “caused approximately 8.5 million computers running Microsoft Windows to crash, rendering them unable to restart properly.”
Given how badly Microsoft was affected, and that Copilot is Microsoft’s flagship AI, I’d say this was a fair and balanced summary with no embellishment or attempt to hide the impact.
Copilot said: “The incident is considered the largest outage in the history of information technology, affecting daily life, businesses, and governments worldwide. Industries such as airlines, hospitals, banks, and more were heavily impacted, resulting in an estimated financial damage of about $10 billion.”
It did make a point of stressing that a fix was released within hours, but that manual repairs were necessary, which caused lingering outages. It didn’t go into a lot of detail but had a cited source for every claim and comment, which was a useful addition.
Google Gemini
Gemini was a little more specific in its opening, talking about the Falcon software rather than a generic update to security software. It also mentions Windows' Blue Screen of Death (BSOD), which Copilot avoided completely.
It said: “The update, intended to enhance security for Windows hosts, contained a defect that triggered a "blue screen of death" error on an estimated 8.5 million Microsoft Windows devices worldwide.”
Google’s chatbot explained the scale of the disruption in a similar way to Copilot and that CrowdStrike quickly issued a fix.
It went further in explaining the delays to restoring services, explaining: “The recovery process was time-consuming, requiring manual removal of the faulty code on each device. Experts warned that it could take weeks for global tech infrastructure to fully recover from the outage.”
Winner: Gemini for a more detailed response
Q: Should businesses think twice about using CrowdStrike as a vendor?
This next prompt was asked immediately after the answers came back for the first question and put the AIs in a position where they had to offer an opinion. It also tests their reasoning skills.
Microsoft Copilot
I didn’t get much in the way of a response from Copilot and its reply began a weird fascination with using emojis in responses, such as 🛡️. It opened with “While the recent CrowdStrike global IT outage was significant, it’s essential to consider the broader context.”
Copilot went on to defend the fact CrowdStrike is a well-established cybersecurity company that also has a “track record of effective threat detection and incident response.”
The AI said any business impacted should weigh the incident against its wider security needs and evaluate CrowdStrike’s “response and transparency during the outage” but that they should also consider alternative vendors. “Ultimately, due diligence and risk assessment are crucial when choosing any security solution.”
Google Gemini
Gemini gave me a pro and con list for whether to consider using CrowdStrike as a security vendor, but also opened with: “While CrowdStrike is a reputable cybersecurity provider, this outage highlights potential risks associated with their services.”
Its pros for keeping CrowdStrike include “advanced threat detection and response capabilities, protecting against various cyber threats,” and a “strong reputation in the industry, serving numerous organizations globally.”
Gemini also praised CrowdStrike for acknowledging the issue and quickly releasing a recovery solution, demonstrating a “commitment to resolving problems.”
But, it wasn’t all good news for the cybersecurity company as Gemini also declared that the “outage underscores the potential for service disruptions due to software updates or other issues,” and that “relying heavily on one vendor can increase vulnerability if that vendor experiences problems.”
It concluded by suggesting businesses should “consider using multiple vendors for different security aspects to mitigate the risk of a single point of failure.”
Winner: Gemini for more detail and the pro/con list
Q: How could something like this be prevented in the future?
This is the million-dollar question and something AI should be useful for as it can weigh up all the data and present conclusions based on hard facts. The obvious answer will be improved quality assurance and testing, plus better diversification of security providers.
Microsoft Copilot
Copilot gave a good balanced response, breaking it down into bullet points starting with the need for better quality assurance and testing. “Rigorous testing of software updates before deployment is crucial. Automated and manual testing should cover various scenarios to catch potential issues,” it declared.
Microsoft’s AI also suggested CrowdStrike switch to gradual rollouts for updates, starting with a small subset of systems and monitoring performance before expanding to more devices.
It also called for comprehensive backup and recovery plans, monitoring and alerts throughout the system, transparency with vendors and users and collaboration with users.
Copilot concluded with this: “Remember that no system is entirely immune to issues, but proactive measures can significantly reduce the risk of large-scale disruptions.” Then, for reasons only known to itself, added these emojis: 🌐🔒.
Google Gemini
Gemini gave a much more comprehensive response, splitting the answer into two categories — for software vendors like CrowdStrike and for businesses. This allowed it to give a more nuanced response for required changes in each category.
For software vendors like CrowdStrike, Gemini says they need a more stringent quality assurance and testing procedure before deploying any updates. It goes further than Copilot in offering an idea: "real-world simulations and extensive beta testing on diverse systems."
Gemini also suggested phased rollouts and transparency but added a need for automated rollbacks to "quickly detect and revert faulty updates, minimizing the impact on users and systems."
When it comes to businesses, Gemini says diversification is key, not relying on a single vendor for critical systems to "mitigate the risk of a single point of failure."
It also says there is a need for an incident response plan, regular backups and tracking the industry to ensure all best practices are being followed.
Where Gemini wins this one is the next step, where it also goes further in suggesting a need for better collaboration between vendors and regulatory frameworks to set out standards for updates, testing and deployment.
Winner: Gemini for a detailed, well-structured response
Q: Should there be financial or legal consequences for CrowdStrike?
Next, I pushed both Copilot and Gemini to make a judgement call, something AI almost always refuses to do because of guardrails. But as this is a relatively cut-and-dried issue, I wanted to see if they'd take the bait.
Microsoft Copilot
Copilot does exactly what I expected: it outlines the details and offers a list of balanced pros and cons on both sides of the argument without offering any actual specific guidance beyond "weigh up the options" and investigate.
"Determining financial or legal consequences for CrowdStrike would depend on various factors," Copilot wrote, suggesting it would depend on different contracts, how severe the outage was and whether it broke specific laws in different locations.
In terms of contracts, it could be, according to Copilot, that it failed to meet service level agreements that specify uptime and liability, in which case it could find itself subject to financial penalties as part of the contract.
Copilot suggested that legal consequences would require negligence or inadequate quality control to have led to the faulty update, in which case affected parties "might seek compensation." That is still an open question, as we don't yet know what led to the faulty update.
"In summary," Copilot ended: "while financial and legal consequences are possible, a thorough investigation would be needed to determine the appropriate course of action." It then added the now customary emojis: 🕵️♂️💼.
Google Gemini
Gemini was similarly guarded and nuanced in its response, likewise offering a breakdown of where it could have breached contracts, regulatory requirements or even laws, including suggesting that: "Arguably, they failed to exercise due care in testing and deploying the update."
It also goes on to suggest that: "Holding CrowdStrike accountable could deter similar incidents in the future and encourage greater diligence in the software industry."
Unlike Copilot it doesn't just outline where it might be liable but also offers arguments against consequences in a more nuanced response than I'd have expected from a large language model.
These include that it appears to be unintentional, that there is limited legal precedent for holding vendors liable for disruption and that there is a degree of shared responsibility due to a lack of adequate recovery plans from some businesses.
"Ultimately," Gemini ended: "deciding on appropriate consequences for CrowdStrike requires careful consideration of various factors and a balanced approach that considers both accountability and the potential for future improvements in the industry."
Winner: Gemini for the more nuanced response but it was close
Q: Does the CrowdStrike outage raise concerns about IT reliance on Windows?
This is the final question and one I've dubbed the 'bias test'. Microsoft owns one of the competitors in this AI match-up, so it's only fair to see if its AI will offer a balanced view on a potentially unbalanced question.
Microsoft Copilot
I didn't get much out of Copilot for this one, just a single short paragraph suggesting it highlights the importance of diversifying IT infrastructure.
This is a good thing, as it doesn't outright defend keeping Windows as the de facto standard. It said: "The recent CrowdStrike outage highlights the importance of diversifying IT infrastructure."
Copilot added that while Windows is widely used, "relying solely on any single operating system can pose risks," suggesting that "diversification allows businesses to mitigate the impact of such incidents and maintain continuity."
It also finished with a pair of emojis: 🖥️🌐.
Google Gemini
Gemini's response was both much longer and more unequivocal. Asked if it raises concerns about IT reliance on Windows, Gemini simply starts with "Yes, the CrowdStrike outage does raise concerns about IT reliance on Windows."
It then explains why: heavy reliance creates a single point of failure, it can lead to a cascade effect where one update triggers a chain reaction impacting the underlying OS, and limited alternatives make it difficult to switch quickly in an outage.
The chatbot added some balance, suggesting that it does not necessarily mean Windows is inherently flawed. "The issue stemmed from a faulty software update interacting with Windows, not a fundamental flaw in the operating system itself," Gemini cautioned.
It then went on to reiterate some of the points from earlier in the conversation including a need to diversify, ensure robust testing, and to have adequate recovery systems in place.
"While Windows remains a widely used and generally reliable operating system, the CrowdStrike outage underscores the importance of taking a proactive approach to risk management and ensuring resilience in IT infrastructure," it concluded without the need to resort to emojis.
Winner: Gemini for the forthright and balanced response
Winner: Copilot vs Gemini
On results alone this is a clear win for Google Gemini. Its responses were more detailed and well thought out. It didn't use emojis at the end of every line and was able to reference moments from earlier in the conversation.
That isn't the full picture though. From a user perspective, especially if I'm on my phone or looking for a quick update on a breaking story, Copilot may have been more useful. It also offered citations and links for every comment, although you can get that from Google if you click the G icon under each response.
What we're starting to see, now the AI chatbot space is maturing, is a diversification based on general user profile, need and taste. Built into Windows and Microsoft 365, Copilot has to be more general purpose than the Gemini web app. Microsoft also creates a more consistent response across all platforms.
I personally prefer Gemini to Copilot, but I prefer GPT-4o, the underlying AI model powering Copilot, to Gemini 1.5 Pro.