Network connection issues between AWS eu-west and Bitbucket Cloud
Incident Report for Atlassian Bitbucket
Postmortem

Incident

Some Bitbucket customers hosted in the EU-WEST-1 AWS region experienced slow connections when cloning repositories. The Bitbucket Support team received reports of these slow connections and opened an incident so that our SRE team could investigate. The engineers correlated the time the reports started coming in with a networking change that had updated the route for Bitbucket production traffic between Dublin and where Bitbucket is hosted. Once this correlation was found, the change was rolled back, which addressed the immediate problem.

Root cause and prevention

To put the networking route change back in place, our network engineers began investigating the root cause of the problem. After a couple of days of troubleshooting and data analysis, they traced it to physical cross-connect issues that were causing CRC errors on the provider side. These errors caused slow download speeds in one direction between where Bitbucket is hosted and Dublin. The physical cross-connect issues have been fixed, which addresses the root cause. The networking route change is now back in place with no incidents. Additionally, the redundant link between these locations, which had been planned all along but was waiting to be installed, is now also in place.
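
As a rough illustration of why a small rate of corrupted frames can noticeably slow a long-lived transfer such as a clone, the sketch below applies the widely used Mathis et al. approximation for steady-state TCP throughput under random loss. The segment size, round-trip time, and loss rates are hypothetical values chosen for illustration; they are not measurements from this incident.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on steady-state TCP throughput, in bits per second.

    Uses the Mathis et al. approximation:
        throughput <= (MSS / RTT) * C / sqrt(p), with C = sqrt(3/2).
    All inputs here are illustrative assumptions, not incident data.
    """
    if mss_bytes <= 0:
        raise ValueError("mss_bytes must be positive")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)


if __name__ == "__main__":
    # Hypothetical path: 1460-byte segments, 100 ms round-trip time.
    for p in (0.0001, 0.001, 0.01):
        mbps = mathis_throughput_bps(1460, 0.1, p) / 1e6
        print(f"loss rate {p:.4f}: about {mbps:.2f} Mbit/s per connection")
```

Under this model the throughput bound scales with 1/sqrt(p), so a hundredfold increase in the loss rate lowers the bound by a factor of ten. Because frames dropped for CRC errors in one direction affect only the data flowing that way, the slowdown can appear in one direction only.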

Our Network Engineering team runs several ping tests (large packet sizes, short timeouts, etc.) when commissioning new circuits; however, none of these tests caught this particular issue. The team is adding new steps to perform thorough performance testing before bringing circuits into production.
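
One plausible reason a ping test can pass on a circuit with a low error rate is simple probability: a short run of probes has a good chance of seeing no loss at all. The sketch below is a minimal model, assuming each probe is lost independently with the same probability; the loss rate and probe counts are hypothetical and not taken from this incident.

```python
def prob_ping_test_sees_no_loss(loss_rate: float, probes: int) -> float:
    """Probability that every probe in a ping run gets through.

    Assumes each probe is lost independently with probability `loss_rate`
    (an illustrative simplification, not a measured property of the circuit).
    """
    if not 0 <= loss_rate <= 1:
        raise ValueError("loss_rate must be in [0, 1]")
    if probes < 0:
        raise ValueError("probes must be non-negative")
    return (1.0 - loss_rate) ** probes


if __name__ == "__main__":
    # Hypothetical loss rate of 0.1% per packet.
    for n in (100, 1_000, 100_000):
        p_clean = prob_ping_test_sees_no_loss(0.001, n)
        print(f"{n} probes: {p_clean:.4f} chance of a run with zero loss")
```

With an assumed loss rate of 0.1%, a 100-probe ping run completes without a single lost packet about 90% of the time, whereas a sustained transfer sends enough packets that loss is almost certain to show up. This is one way to see why a throughput test adds coverage that a short ping test lacks.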

Posted Jun 26, 2017 - 17:10 UTC

Resolved
This incident has been resolved.
Posted Jun 15, 2017 - 16:27 UTC
Monitoring
The connection issues from the AWS EU West region to the Bitbucket Cloud platform are resolved. We will continue to monitor to make sure the issue is fully resolved.
Posted Jun 15, 2017 - 15:52 UTC
Investigating
We are currently investigating reports of connection issues from the AWS EU West region to the Bitbucket Cloud platform.
Posted Jun 15, 2017 - 14:54 UTC