11 October 2021
Unless you have been living under a huge rock, you must be aware of the global outage that Facebook suffered hours ago. It lasted for over six hours and affected Facebook, WhatsApp, Instagram, and Oculus VR. None of these services were working, and many third-party offerings that rely on Facebook login stopped working as well. The good news is that all of those apps and services have resumed operations, and we have started to get answers about why it happened.
Initial reports suggested that Facebook might be under some kind of hacking attack that had brought down all of its services. However, these claims have now been officially denied. The crux of the matter is that configuration changes to the company's routers caused the massive downtime. The outage even affected the internal tools and systems that the company's teams and engineers use to communicate with each other, which is why it took so long to get everything back in order.
Santosh Janardhan, VP of Infrastructure at Facebook, said:
"Our engineering teams have learned that configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt."
It was the longest outage Facebook has experienced since the one in 2019. That one lasted for over 14 hours and was also said to be the result of server configuration changes. If you want an in-depth technical explanation of what happened, Cloudflare has shared an explainer that you can read.