EXPLAINER: What caused Amazon's outage? Will there be more?

by Associated Press - December 23, 2021

Robotic vacuum cleaners wouldn’t start. Doorbell cameras stopped watching for package thieves, though some of those deliveries were canceled anyway. Netflix and Disney movies got interrupted and The Associated Press had trouble publishing the news.

A major outage in Amazon’s cloud computing network on Dec. 7 severely disrupted services at a wide range of U.S. companies for hours, raising questions about the vulnerability of the internet and its concentration in the hands of a few firms.

That uncertainty was underscored Wednesday when Amazon reported another outage that, while much shorter and less disruptive than the Dec. 7 problem, still caused trouble for many of its cloud customers. On a status dashboard for the service, Amazon reported that a power failure in one of its data centers had disrupted customers whose tasks ran on its servers.

Power was restored after about 45 minutes, although the company said some customers continued to experience problems almost 12 hours following the outage. Hardware failures in the affected data center forced some Amazon customers to restart their cloud-based systems entirely.

HOW DID IT HAPPEN?

Amazon has still said nothing about what, exactly, went wrong in the early December outage. The company limited its communications at the time to terse technical explanations on an Amazon Web Services dashboard and a brief statement delivered via spokesperson Richard Rocha that acknowledged the outage had affected Amazon’s own warehouse and delivery operations but said the company was “working to resolve the issue as quickly as possible.” It didn’t immediately respond to further questions Wednesday.

The incident at Amazon Web Services mostly affected the eastern U.S., but still impacted everything from airline reservations and auto dealerships to payment apps and video streaming services to Amazon’s own massive e-commerce operation.

WHAT IS AWS?

Amazon Web Services is a cloud-service operation — it stores its customers’ data, runs their online activities and more — and a huge profit center for Amazon. It holds roughly 40% of the $64 billion global cloud infrastructure market, a larger share than its closest rivals Microsoft, Alibaba and Google, combined, according to research firm Gartner.

It was formerly run by Amazon CEO Andy Jassy, who succeeded founder Jeff Bezos in July.

TOO MANY EGGS IN ONE BASKET?

Some cybersecurity experts have warned for years about the potentially ugly consequences of allowing a handful of big tech companies to dominate key internet operations.

“The latest AWS outage is a prime example of the danger of centralized network infrastructure,” said Sean O’Brien, a visiting lecturer in cybersecurity at Yale Law School. “Though most people browsing the internet or using an app don’t know it, Amazon is baked into most of the apps and websites they use each day.” O’Brien said it’s important to build a new network model that resembles the peer-to-peer roots of the early internet. Big outages have already knocked huge swaths of the world offline, as happened during an October Facebook incident.

Even under the current model, companies do have some options to split their services between different cloud providers, although it can be complicated, or to at least make sure they can move their services to a different region run by the same provider. The Dec. 7 outage mostly affected Amazon’s “US East 1” region.

“Which means if you had critical systems only available in that region, you were in trouble,” said Servaas Verbiest, lead cloud evangelist at Sungard Availability Services. “If you heavily embraced the AWS ecosystem and are locked into using solely their services and functions, you must ensure you balance your workloads between regions.”
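For readers who run systems in the cloud, the idea Verbiest describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how any Amazon customer actually works: the function name, the region list and the simulated health check are all hypothetical, and a real deployment would rely on the provider's own health-check and routing tools.

```python
from typing import Callable, Optional, Sequence

# Illustrative region identifiers; the choice and order are assumptions.
REGIONS = ["us-east-1", "us-west-2"]


def first_healthy_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first region whose health check passes, or None."""
    for region in regions:
        if is_healthy(region):
            return region
    return None


# Simulate an outage in the primary region (hypothetical health check).
down = {"us-east-1"}
chosen = first_healthy_region(REGIONS, lambda r: r not in down)
print(chosen)
```

A system that lists only one region has nowhere to fall back to, which is the situation Verbiest warns about.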

HASN’T THIS HAPPENED BEFORE?

Yes. The last major AWS outage was in November 2020. There have been numerous other disruptive and lengthy internet outages involving other providers. In June, the behind-the-scenes content distributor Fastly suffered a failure that briefly took down dozens of major internet sites including CNN, The New York Times and Britain’s government home page. Another that month affected provider Akamai during peak business hours in Asia.

In the October outage, Facebook — now known as Meta Platforms — blamed a “faulty configuration change” for an hours-long worldwide outage that took down Instagram and WhatsApp in addition to its titular platform.

WHAT ABOUT THE GOVERNMENT?

It was unclear how, or whether, the Dec. 7 outage affected governments, but many of them also rely on Amazon and its rivals.

Among the most influential organizations to rethink its approach of depending on a single cloud provider was the Pentagon, which in July canceled a disputed cloud-computing contract with Microsoft that could eventually have been worth $10 billion. It will instead pursue a deal with both Microsoft and Amazon and possibly other cloud service providers such as Google, Oracle and IBM.

The National Security Agency earlier this year awarded Amazon a contract with a potential estimated value of $10 billion to be the sole manager of the NSA’s own migration to cloud computing. The contract is known by its agency code name “Wild and Stormy.” The Government Accountability Office in October sustained a bid protest by Microsoft, finding that certain parts of the NSA’s decision were “unreasonable,” although the full decision is classified.
