I’m sorry for the worst week ever in the history of ZippyKid and its happy customers. We’ve been having slowdowns and downtime across the board. We’ve isolated the problem to a strained and maxed-out firewall, which is operating at 80% CPU capacity and near the limits of its bandwidth capability.
We’re working very closely with Rackspace to get this managed device replaced, but due to the size of our network, doing so requires downtime and coordination of the complex rules across the board. I’m really sorry for all the delays and for not being able to give a good estimate of when things will be back to stable. Our best guess is that by Friday night everything should be 100% stable, and zippy again.
Once again, I apologize profusely for this downtime and for my lack of communication throughout it. I’ve been busy trying to figure out the problem, and then working with Rackspace on a solution and the procedures to implement it thoroughly.
I messed up. I didn’t see this coming sooner; I got caught up in the growth and in dealing with new customers. This is a reminder to me that we’re a startup, a small company, and we need to slow down every now and then and take inventory of where we are.
Once we fix these major issues with the existing network (we’re adding 3X capacity to the firewall), we’ll be able to roll out some of the major things we’re trying to introduce to the world of hosting.
The following two charts are near-real-time graphs of the traffic being pushed through just two of our systems. Now imagine similar traffic across 20+ systems, and one firewall trying to manage it all.