Saurabh Arora


How to handle unexpected downtime?

Okay, your website is down. You know it. How do you let others know? The following outlines the steps to take and the processes to put in place to meet customer expectations in such scenarios. Although it applies mostly to online businesses serving B2B segments, some of it applies to paid B2C segments as well.


First and foremost, it’s important to accept that something is not working. It’s important to understand the problem along three dimensions:

  • Whole web site or subset of features/site not working
  • Completely non-functional or functional but at a sub-optimal level
  • Impacts all users or part of the user base (segment based on geography, type of products etc)

For example, feature X may be running very slowly for a set of users with large data sets.
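
If you track incidents in code, the three dimensions above can be captured in a small record. The following is only a minimal sketch: the class names, enum values and the `summary` helper are my own illustrative choices, not part of any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    WHOLE_SITE = "whole site"
    SUBSET = "subset of features"


class Severity(Enum):
    DOWN = "completely non-functional"
    DEGRADED = "functional but sub-optimal"


class Reach(Enum):
    ALL_USERS = "all users"
    SEGMENT = "a segment of users"


@dataclass
class Incident:
    """One problem, described along the three dimensions."""
    scope: Scope
    severity: Severity
    reach: Reach
    detail: str = ""

    def summary(self) -> str:
        # Build a one-line description that can be reused in every channel.
        text = f"{self.scope.value}, {self.severity.value}, affecting {self.reach.value}"
        return f"{text} ({self.detail})" if self.detail else text


# The "feature X" example from the text, expressed as a record.
incident = Incident(
    Scope.SUBSET,
    Severity.DEGRADED,
    Reach.SEGMENT,
    detail="feature X is slow for users with large data sets",
)
print(incident.summary())
```

Having one agreed description of the problem makes it easier to say the same thing to customers, the sales team and the customer care team.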

Now that we understand the problem, it’s time to communicate the same.


a) To customers

  • Status Page: One should host a separate status page showing the status of all products. Usually, it’s a sub-domain of your domain that your users can visit to see the current status. Ideally, it should be hosted on a separate server, independent of your main production servers and on a different ISP. The status page can take the shape of a blog or a single page showing today’s status. See below samples from Google and Dreamhost.
    [Image: Google Apps Status Dashboard]
    [Image: Dreamhost Status Blog]
  • Email/SMS: One should also email and SMS the user base (with prior consent obtained at sign-up), informing them of the problem and the likely resolution time (more on this below).
  • Online: If the site is only partially down, a notice can be put up on the site (linked to your status page) to inform users of the problem.
  • Social Media: One should update the current status on social media channels such as Twitter and Facebook. Don’t forget to post an update once the problem is resolved.

b) To Sales Team

  • Believe me, there’s nothing more embarrassing than the site not working during a demo to a potential client. It’s disastrous for the salesperson, and the customer may feel that your company is not competent to deliver its services. If possible, host the demo site separately (it can be on a separate server with a separate database).
  • Often, a customer who bought your product/service offline through a direct salesperson will call that salesperson first. The salesperson then contacts the customer care team, increasing the number of calls the customer care team has to handle. Hence, it’s important that the sales team is also informed of the problem at the earliest (ideally by SMS/email, since they are most likely in the field).

c) To Customer Care Team

  • Generally, if you provide telephone/online chat support to your paid customers, the customer care team is the first to be notified of the problem by customers. Hence, it’s important that the customer care team knows about the problem and the likely resolution time.
  • Further, it’s important that the customer care executive (CCE) doesn’t give a cookie-cutter answer such as “Your complaint has been noted and it will be resolved within 24 hours”. A better approach is to say something like, “We are aware of the issue [explain the actual issue with the possible root cause, e.g. one of the servers is down] and our engineers are working hard to resolve it. Currently, we are not in a position to give a definite time, but we expect it to be resolved in the next X hours. You should receive an email/SMS once it’s resolved”. Generally, most organizations define X as the maximum time allowed to fix a problem of a given category.
  • On a side note, this team should be empowered to give on-the-spot discounts to customers based on their judgement. For example, the team may be able to give customers one extra day of service free, or X% off their next purchase, to compensate for the loss.
Customer care team shld talk 2 technical folks 2 know likely resolution time. Book answer of 24 hrs doesn't fly. #Airtel #broadband #fail
Saurabh Arora
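
The "X hours" in the suggested customer care reply can come from a simple lookup of the maximum resolution time per problem category. Here is a rough sketch of that idea; the category names and hour values are hypothetical placeholders, and the function name is my own.

```python
# Hypothetical maximum resolution times (in hours) per problem category.
# Every organization would define its own categories and limits.
MAX_RESOLUTION_HOURS = {
    "server_down": 4,
    "slow_feature": 8,
    "billing": 24,
}


def support_message(issue: str, category: str) -> str:
    """Build a reply that names the actual issue and the expected time."""
    hours = MAX_RESOLUTION_HOURS[category]  # KeyError for an unknown category
    return (
        f"We are aware of the issue ({issue}) and our engineers are working "
        f"hard to resolve it. Currently, we are not in a position to give a "
        f"definite time, but we expect it to be resolved in the next {hours} "
        f"hours. You should receive an email/SMS once it is resolved."
    )


print(support_message("one of the servers is down", "server_down"))
```

The point is that the executive reads out a time tied to the actual problem category rather than a blanket 24 hours.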

Post crisis

Needless to say, the engineering team should perform a root cause analysis once the problem has been resolved. However, what’s more important is to forward some (or most, depending on the severity of the problem) of the client conversations (emails, phone calls) to everyone in the engineering team. Hearing customers firsthand is the second-best way (the best is to meet and observe them in person) to know them and better understand the severity of the problem. What may seem a trivial bug to an engineer might be a big problem for (some of) the customers. I may cover this aspect in detail in another post.

Also, in some cases, it may be important to compensate the impacted users for the duration of the downtime. Say someone bought your product for just a day and was unable to use it for 4 hours. This is generally useful for subscription-based products where the customer is billed monthly.
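
One possible way to size such compensation is a pro-rata credit: refund the share of the price that corresponds to the lost hours. This is only one policy among many, and the function name and prices below are made up for illustration.

```python
def downtime_credit(price: float, period_hours: float, downtime_hours: float) -> float:
    """Pro-rata credit for the part of a paid period lost to downtime."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    # Never credit less than nothing or more than the whole period.
    lost = min(max(downtime_hours, 0), period_hours)
    return round(price * lost / period_hours, 2)


# A one-day purchase (hypothetical price of 120) with 4 hours of downtime.
print(downtime_credit(price=120, period_hours=24, downtime_hours=4))

# A monthly subscription (hypothetical price of 300, 30 days) with 4 hours of downtime.
print(downtime_credit(price=300, period_hours=30 * 24, downtime_hours=4))
```

For the one-day customer the credit is a sixth of the price, while for the monthly subscriber the same 4 hours is a much smaller share, which is why some businesses prefer a flat gesture such as an extra free day.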

Is there anything I have missed? Can you think of Indian online businesses that communicate effectively in such scenarios?




[This site has not been updated for a while]
Saurabh Arora is crunching numbers at Facebook. Previously, he got his hands dirty doing product development, online customer acquisition, product marketing and online revenue generation for one of India’s leading online job portals.
