I’d be surprised if those in IT were not aware of this, but Amazon Web Services had significant issues yesterday, with ripple effects extending into today, including internet slowness. The outage was reported to have started around 10:45 AM, and the AWS Status page was updated around 6 PM to indicate partial service restoration.
Insider reported that the culprit was a sudden increase in traffic that caused congestion across multiple network devices in Northern Virginia, the biggest region for AWS data centers.
Outages were reported across services such as Disney+, Netflix, Amazon’s own e-commerce store, Zoom, and much of Amazon’s internal warehouse systems and delivery services. Of note for MSPs were ConnectWise reports of issues with Chat, Single Sign On, and the Manage Email Connector related to the AWS outage.
Why do we care?
Outages need to be unique for me to cover them, and this one, which impacted Amazon’s own physical goods delivery during the holidays, is exactly that and worth noting.
Several MSPs noted the design concern of ConnectWise acting as the middleman for SSO on their systems. That’s my angle on why we care: outages of cloud systems are something to plan for, and so is graceful failover of business operations. Note I said business operations, not just technical ones. It was notable to me that while systems were offline, an airline website I was using gracefully redirected me to phone reservations, versus the manual workarounds reported by MSPs trying to log in to their backend systems.
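To make that kind of graceful failover concrete, here is a minimal sketch of a login entry point that checks a dependency’s health and falls back to a manual, phone-based path when it is unreachable. The health endpoint URL, phone number, and function names are hypothetical illustrations, not taken from ConnectWise or any specific vendor.

```python
# Minimal sketch of graceful degradation when a cloud SSO dependency is down.
# The URL and fallback phone number below are hypothetical, for illustration only.
import requests

SSO_HEALTH_URL = "https://sso.example.com/health"  # hypothetical health endpoint


def sso_available(timeout_seconds: float = 2.0) -> bool:
    """Return True if the SSO provider answers its health check quickly."""
    try:
        response = requests.get(SSO_HEALTH_URL, timeout=timeout_seconds)
        return response.status_code == 200
    except requests.RequestException:
        # Timeouts and connection errors are treated as "SSO is down."
        return False


def login_entry_point() -> str:
    """Route users to SSO when it is healthy, otherwise to a manual fallback."""
    if sso_available():
        return "Redirecting to single sign on..."
    # Degraded mode: keep the business operation moving, the way the airline
    # sent customers to phone reservations instead of a dead login page.
    return ("Sign on is temporarily unavailable. "
            "Call the service desk at 555-0100 to verify identity and continue.")


if __name__ == "__main__":
    print(login_entry_point())
```

The point of the sketch is that the fallback path is decided in advance and built into the workflow, rather than improvised as a manual workaround once the outage is already underway.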
The value is as much in the expertise of planning for the outage and handling it gracefully as in the design of the system itself… which is why we care.