What can businesses learn from the CrowdStrike chaos?

26.07.2024

After a faulty software update from CrowdStrike hit 8.5 million Microsoft Windows devices and caused global chaos, we look at what happened, how, and why. The PC Support Group and its clients have not been affected, as we do not use CrowdStrike as part of our recommended security stack. Even so, our CEO, Phil Bird, thinks it’s vital to highlight the lessons that end-user businesses and software developers should take away from it.

The Worst Cyber Event in History

The scale of the disruption, which began on 18 July (according to Microsoft) or 19 July (according to CrowdStrike), makes it the worst cyber event in history. It surpasses the WannaCry cyber-attack of 2017, which affected 300,000 computers in 150 countries. A faulty file in a CrowdStrike software update caused many Windows computers to crash with a blue screen, leading to major disruption across a wide variety of industries globally, including:

Airlines

Airlines faced massive disruptions with thousands of flights cancelled or grounded, leading to chaos at major airports around the globe. Passengers endured long waits as airlines grappled with schedules and customer service failures.

Healthcare

Hospitals were thrown into disarray, with delays in procedures and tech failures. Some facilities had to go manual, disrupting patient care and cancelling services. Pharmacies couldn't fill prescriptions, leaving customers without vital medications.

Financial Services

Banks stumbled with transaction hiccups, leading to ATM outages and online banking crashes. Customers faced major inconveniences as financial services faltered.

Media and Broadcasting

Sky News and other broadcasters went dark, highlighting a critical dependence on secure IT infrastructure. The public missed out on crucial updates during the outage.

Emergency Services

Emergency call centres were hampered, slowing response times and risking public safety. Delays in emergency responses raised serious concerns.

Retail

Retailers faced chaos with broken point-of-sale systems and online glitches, stalling transactions and scrambling inventory management. Both physical and online sales took a hit.

CrowdStrike has released a fix for the issue, but it requires each affected device to be manually rebooted in safe mode, creating significant work for IT departments. The effects of the incident are ongoing and expected to last at least a week.

The widespread disruption shows the interconnected nature of modern businesses and the risks of single points of failure. Companies must not only enhance their cybersecurity but also closely manage the cybersecurity practices of their partners.

The incident may lead to legal and financial repercussions, prompting businesses to improve their insurance and legal strategies. The warning from CrowdStrike about increased cyber-attack risks calls for heightened vigilance against phishing. Businesses should educate employees on recognising threats and adopt comprehensive cybersecurity measures to prevent future disruptions.

So, what’s the takeaway for business owners?

Our CEO, Phil Bird, is unequivocal in his views. Firstly, users of CrowdStrike and other cyber-security software should not “throw the baby out with the bathwater”. Cyber-security software remains a key element in protecting businesses and individuals from malicious cyber-attacks, which are much more common than a mistake like this. It is vital for companies to continue to bolster their IT system security with multiple layers of cyber-security.

Secondly, it’s important to recognise that major IT problems like this will happen again, and just because you or your business avoided it this time doesn’t mean you can relax. All businesses, large and small, need to ask, “What happens if part or all of my IT systems fail?”

SMEs often ignore the potential consequences of such outages and have no data recovery or business continuity plans in place. Every company leader should ensure that a solid plan is in place for each of the parts that make up their IT systems. What if they can’t access their CRM system? What if they can’t send or receive emails? What if their VoIP phone system doesn’t work? What if their internet connection goes down? The same question applies to every other system the business relies on.
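
For readers who like to see this written down, the “what if” questions above can be kept as a simple checklist. The following is only a minimal sketch in Python, not a business continuity tool: the system names come from the examples above, while the fallback descriptions and the find_gaps helper are hypothetical placeholders for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class System:
    name: str
    fallback: Optional[str]  # None means no workaround has been documented yet


def find_gaps(systems):
    """Return the names of systems that have no documented fallback."""
    return [s.name for s in systems if not s.fallback]


# Hypothetical inventory: replace with the systems your own business relies on.
inventory = [
    System("CRM", "Weekly export of key contacts held offline"),
    System("Email", None),
    System("VoIP phones", "Divert the main number to mobiles"),
    System("Internet connection", None),
]

gaps = find_gaps(inventory)
print("No fallback documented for:", ", ".join(gaps))
```

Any system that appears in the output is one where the plan still has a hole to fill.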

Thirdly, this major issue will have repercussions for all software developers. CrowdStrike will understandably hold a post-mortem and ask, “How did this happen?”, but all major software providers, not least Microsoft, need to consider the implications.

Why wasn’t this software tested thoroughly enough to catch this bug before it was rolled out? With software now being developed at pace to stay ahead of the game, has the art of solid software testing been lost because developers think they can quickly recover from any mistake and roll out a fix in a flash?

It is now commonplace for software providers to roll out updates whenever they like, with users having little say. As an IT managed service provider, we find this is a recurring scenario, and an incredibly disruptive one. We frequently find that software which was working perfectly well one day will suddenly cause problems because an update was installed without the knowledge, agreement or control of us or our clients. This has to stop! Software updates should be offered to users with an explanation of what they are for, and then those users (or their IT support providers) should decide if and when to accept them. This simple change, back to how it used to be, would have prevented the issues we are seeing today.
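
To make the idea of “if and when” concrete, here is a minimal sketch in Python of an approval step sitting in front of an update. It is an illustration under stated assumptions, not a description of how CrowdStrike, Microsoft or any real update tool works. The Update and Device classes, the devices_to_update function and the example names are all hypothetical, and the small “pilot” group that receives an update before everyone else is our own assumption about one sensible way to control the “when”.

```python
from dataclasses import dataclass


@dataclass
class Update:
    name: str
    description: str        # the explanation offered to the user
    approved: bool = False  # set by the user or their IT support provider


@dataclass
class Device:
    name: str
    ring: str  # "pilot" (receives updates first) or "everyone"


def devices_to_update(update, devices, pilot_passed):
    """Return the names of devices that should receive the update now."""
    if not update.approved:
        # Nothing is installed until someone has read the explanation and agreed.
        return []
    if not pilot_passed:
        # Approved, but not yet proven: only the pilot devices receive it.
        return [d.name for d in devices if d.ring == "pilot"]
    # Approved and the pilot ran without problems: every device receives it.
    return [d.name for d in devices]


# Hypothetical fleet and update, for illustration only.
fleet = [
    Device("office-pc-01", "pilot"),
    Device("office-pc-02", "everyone"),
    Device("reception-pc", "everyone"),
]
update = Update("Security agent 2.1", "Refreshes threat definitions")

print(devices_to_update(update, fleet, pilot_passed=False))  # not approved yet
update.approved = True
print(devices_to_update(update, fleet, pilot_passed=False))  # pilot devices only
print(devices_to_update(update, fleet, pilot_passed=True))   # all devices
```

The point of the sketch is the order of the checks: nothing is installed without approval, and even an approved update reaches a handful of machines before it reaches the whole business.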

And finally, Microsoft need to consider how and why any software can crash the Windows operating system. The operating system should be a protected environment that other software runs within, not something other software can change at will. Microsoft must make Windows more stable and secure.

This issue has caused chaos, but future incidents could be worse if we don’t learn from it.