TEAM Implements Network Infrastructure Enhancements
08/22/17 | 9:00AM | Posted by Bob Pelzer, Systems Architect, IT
“Always on” is a must when it comes to providing business software in the cloud (otherwise known as Software as a Service, or SaaS). If the network is down, our customers can’t access their software to run their business, and that’s a big problem. That’s why TEAM has made our network and infrastructure a top priority.
Over the last year, TEAM’s IT Infrastructure group has been working to increase the redundancy of our network. We added another internet provider at our primary datacenter in Papillion, Nebraska, and a separate, distinct internet provider at our secondary datacenter in Kansas City, Missouri. Now, all TEAM SaaS clients are distributed across these multiple internet providers based on the fastest path. If one provider loses service, the network infrastructure recognizes the loss of connectivity and reroutes traffic to the other providers.
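The routing behavior described above can be illustrated with a small sketch. It is an illustration only, not TEAM’s actual configuration: the post does not describe the mechanism, and in practice this kind of decision is usually made by network equipment rather than application code. The provider names, latency figures, and the `pick_provider` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    name: str          # hypothetical label, not a real carrier
    latency_ms: float  # most recent measured round-trip time (assumed metric)
    healthy: bool      # result of the most recent connectivity check


def pick_provider(providers: List[Provider]) -> Optional[Provider]:
    """Return the healthy provider with the lowest latency, or None if all are down."""
    candidates = [p for p in providers if p.healthy]
    return min(candidates, key=lambda p: p.latency_ms, default=None)


# Hypothetical providers with made-up latency values.
providers = [
    Provider("provider-a", 18.0, True),
    Provider("provider-b", 25.0, True),
    Provider("provider-c", 40.0, True),
]

# Normal operation: traffic takes the fastest available path.
first_choice = pick_provider(providers)

# Simulate the fastest provider losing service; traffic moves to the next fastest.
providers[0].healthy = False
fallback = pick_provider(providers)

print(first_choice.name, "->", fallback.name)
```

The point of the sketch is simply that traffic prefers the fastest path while every provider is up, and shifts to the remaining providers as soon as a connectivity check fails.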
In June 2017, our datacenter provider experienced an internet connectivity issue, and the system did exactly what it should have: it recognized the interruption and began rerouting traffic to the alternate providers. Because we have multiple internet providers, all traffic was diverted within 30 seconds, and the outage affected only a fraction of our total users. Any affected user would have experienced a short pause in service and, at worst, would have had to log back on to resume their active session. Had we not taken these steps to increase our resiliency and availability, our SaaS customers would have experienced an internet outage of about 3.5 hours.
We have also made great strides in improving our ability to recover from other forms of disaster. We increased the Storage Area Network (SAN) capacity and hardware at our backup facility to closely match our primary site. We also added server/compute capacity, along with improved network switches and firewalls. As a result, our secondary site now more closely mirrors the primary site, improving business continuity.
In addition to hardware improvements, we fully implemented Microsoft SQL Server Availability Groups to automatically synchronize databases between the primary and secondary datacenters. That means we have backups and copies of live production servers sent over a private network to our secondary site. These improvements have cut our disaster recovery time from more than 12 hours to around 4 hours in the case of a complete loss of the primary datacenter.
And that’s just the start. Our team will continue to work on our secondary site and to evaluate our capacity, bandwidth, and performance needs at both sites. We plan to perform test failovers from our primary site to our secondary site to ensure all systems remain 100 percent functional. All this continued effort to improve and upgrade our infrastructure processes is just another way we keep our customers first.