Going Viral Without Crashing: Preparing Your Website for Heavy Traffic

  • Chris Sulham
  • April 27, 2015

What do the Eiffel Tower, Best Buy, and XeroShoes have in common? They can all tell you how the opportunity of a viral event can go to waste if you aren’t ready for it.

The Eiffel Tower’s website crashed on March 31 when Google’s doodle drove too much traffic to it. XeroShoes’ site went down earlier this year when it was featured on an episode of ABC’s Shark Tank. Best Buy’s site crashed in 2014 when an influx of Black Friday traffic overwhelmed its servers, costing the company untold sales on the busiest shopping day of the year.

Every company dreams of going viral, but dreams can become nightmares when the proper infrastructure isn't in place.

The Stakes

Websites often go down when a surge of traffic, predictable or otherwise, overwhelms the server’s ability to respond to requests. This is the principle behind Distributed Denial-of-Service (DDoS) attacks: with enough traffic and load, servers will eventually crash or slow to a crawl. Without a website that’s optimized for the amount of traffic it receives, the viral event you’ve been hoping for can do as much harm to your brand as a deliberate attack.

A slowed or crashed website isn’t just an inconvenience. An Akamai study showed that 47% of users expect websites to take no more than 2 seconds to load, and grow impatient or leave when it takes longer. KISSMetrics’ infographic, “How Loading Time Affects Your Bottom Line,” shows that by the time 10 seconds have passed, well over 30% of users have fled – and the numbers only head north from there.

This translates to losing customers. Apica’s Peak Performance Blog reports that 8 out of 10 people will not return to a slow site, and that 3 out of 8 people will tell others about their experience. So if your website isn’t ready for a sudden influx of traffic from a viral event, it can cost you in users, time, and even reputation.

Capitalizing On Viral Success

Here’s the good news: You don’t have to get blindsided. Not even the so-called “Front Page of the Internet” was able to take down one of our clients.

In March, one of the largest global research and public policy organizations in the world published a post on Reddit. It wasn’t the first post they had published there, but it was the first one to get significant numbers of “upvotes” on Reddit. Soon it made it onto the main feed – which all of Reddit’s more than 3 million members see.

"Within 48 hours they had a few hundred thousand page views and the post had been rebroadcast by several major outlets. Traffic to the site increased by nearly 2,500% – the equivalent of six months of normal traffic realized in a single day!"

It was a dream come true for the client. And their website withstood the deluge gracefully – no crashed servers and no decrease in page load times. That’s because we had helped them optimize their infrastructure for just such an event.

Shore Up Your Floodgates

In 2014, this organization had realized that its website’s performance, speed, and scalability weren’t adequate. They worked with Velir to build a new infrastructure that could handle traffic even in extreme scenarios.

The Google Web Master Blog expounds on the standard best practices for getting your site ready for a traffic spike or viral event. We’ve taken this a step further and embraced the following techniques to make our clients’ websites high-performing and resilient.

Performance profiling tests - These can give you a better idea of what’s really behind performance issues on your site. They show which files might be driving up your site’s load times and point out bottlenecks such as inefficient database queries or searches that retrieve too many records. Ideally, these profiling tests are run throughout development, at the end of each release.
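To illustrate the idea (rather than any particular vendor’s tool), here is a minimal profiling sketch using Python’s built-in cProfile. The slow_lookup function is a hypothetical stand-in for an unindexed database query; the profiler’s report surfaces it as the hot spot driving up the “page render” time:

```python
import cProfile
import io
import pstats

def slow_lookup(records, target):
    # Deliberately inefficient linear scan, standing in for an
    # unindexed database query or an over-broad search.
    return [r for r in records if r == target]

def render_page():
    records = list(range(50_000))
    return slow_lookup(records, 42_000)

# Profile the page render and collect the most expensive calls.
profiler = cProfile.Profile()
profiler.enable()
render_page()
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(5)  # report the top 5 hotspots by cumulative time
report = buffer.getvalue()
```

Running a report like this at the end of each release makes regressions visible before they reach production.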

Application Instrumentation - Profiling in a sandbox environment can be useful, but nothing beats real-time information on the performance of your live site in production.  Tools like New Relic or Dynatrace help you measure and monitor the impact of traffic on your application’s actual server. They let you keep an eye on its health and alert you when the server is getting close to its capacity limits.
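Full-featured APM products do far more, but the core idea of instrumentation can be sketched in a few lines: record each request’s duration and raise a flag when a rolling average approaches a threshold. Everything below (the RequestMonitor class, the 2-second threshold) is illustrative and not any real tool’s API:

```python
import time
from collections import deque

class RequestMonitor:
    """Toy in-app instrumentation: keeps a rolling window of request
    timings and flags when the average nears a capacity threshold -
    a crude stand-in for what New Relic or Dynatrace do at scale."""
    def __init__(self, window=100, alert_threshold_s=2.0):
        self.samples = deque(maxlen=window)
        self.alert_threshold_s = alert_threshold_s

    def record(self, duration_s):
        self.samples.append(duration_s)

    @property
    def rolling_average(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def should_alert(self):
        return self.rolling_average > self.alert_threshold_s

monitor = RequestMonitor(alert_threshold_s=2.0)

def handle_request(work_s):
    start = time.perf_counter()
    time.sleep(work_s)  # placeholder for real request handling
    monitor.record(time.perf_counter() - start)

for _ in range(5):
    handle_request(0.001)
```

The value of production instrumentation is exactly this kind of early warning: you hear about approaching capacity limits before your visitors do.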

Cache Strategies - Almost all websites are designed to store certain data in a cache to preserve load speed. Static content, like the About page or the Contact page, can go in the cache so that requests for them don’t tax the site’s processing capacity or interfere with other, more dynamic requests. In high-traffic situations, the cache helps keep capacity available to handle the unusual number of requests. Certain search requests can also be cached to alleviate the burden on the search server. Even small caching optimizations during page load help; each may account for only a small portion of total load time, but in the aggregate, these tweaks pay real dividends.
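A minimal sketch of the page-cache idea, assuming a simple TTL (time-to-live) policy; the PageCache class and the expensive_render stand-in are hypothetical, not a real framework’s API:

```python
import time

class PageCache:
    """Minimal TTL cache for rendered pages. Static pages ('About',
    'Contact') are served from memory so repeat requests don't hit
    the rendering pipeline or the database."""
    def __init__(self, ttl_s=300):
        self.ttl_s = ttl_s
        self._store = {}  # path -> (rendered_html, expires_at)

    def get(self, path, render_fn):
        entry = self._store.get(path)
        now = time.monotonic()
        if entry and entry[1] > now:
            return entry[0]                       # cache hit
        html = render_fn(path)                    # cache miss: render once
        self._store[path] = (html, now + self.ttl_s)
        return html

renders = 0
def expensive_render(path):
    global renders
    renders += 1                                  # count how often we actually render
    return f"<html>{path}</html>"

cache = PageCache(ttl_s=300)
first = cache.get("/about", expensive_render)
second = cache.get("/about", expensive_render)    # served from cache
```

During a traffic spike, the difference between rendering a page once and rendering it on every request is the difference between a busy server and a crashed one.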

Content Delivery Networks (CDNs) - In some circumstances, the best strategy may be to offload your traffic to a CDN. This takes some of the load off your servers so that no matter how heavy the traffic gets, they won’t be handling it alone. It’s a surefire way to handle traffic, but it’s also the most expensive, so do your due diligence on its effectiveness and value for your particular situation.
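One common way to let a CDN shoulder that load is through standard HTTP Cache-Control headers, which tell shared caches how long they may keep a response. This is a generic sketch, not any specific CDN’s configuration; the split between static and dynamic paths is illustrative:

```python
def cdn_cache_headers(path, max_age_s=3600):
    """Build response headers that let a CDN cache static assets at
    the edge. 's-maxage' applies to shared caches (the CDN), while
    'max-age' applies to browsers."""
    static = path.endswith((".css", ".js", ".png", ".jpg", ".woff2"))
    if static:
        # Cache aggressively at the edge; static assets rarely change.
        return {"Cache-Control": f"public, max-age={max_age_s}, s-maxage={max_age_s * 24}"}
    # Dynamic pages: have the CDN pass them through to the origin.
    return {"Cache-Control": "no-store"}
```

With headers like these in place, a traffic spike against static assets never reaches your origin servers at all.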

Load Testing - Load testing is the process of creating a simulated environment and testing its capacity for responding to requests. The results of these tests show you what is likely to happen on the real site in different conditions and guide your decisions about how to bulk up the network’s capacity in case of increased traffic.

Visual Studio Ultimate has built-in load testing capability. You can easily build out and run your load tests on a remote server and simulate traffic to your website to see how it reacts.
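Independent of any particular tool, the core of a load test is many concurrent clients timing their requests. Here is a toy sketch using Python’s thread pool, with a fake_request stub standing in for real HTTP calls against a staging server:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(_):
    """Stand-in for an HTTP request to a staging server; a real load
    test would issue an actual GET against your site and time it."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulated server response time
    return time.perf_counter() - start

# Simulate 50 concurrent users issuing 200 requests in total.
with ThreadPoolExecutor(max_workers=50) as pool:
    latencies = list(pool.map(fake_request, range(200)))

avg_latency = sum(latencies) / len(latencies)
worst_latency = max(latencies)
```

Watching how average and worst-case latency change as you raise the number of simulated users tells you where your capacity ceiling actually is.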

Image optimization - Many sites have relatively large images, with file sizes between 1.5 and 5 MB. However, mobile phone users browsing your site may only need 100x100-pixel versions of those images. Sitecore, a customer experience and content management system that we frequently use, has a great image scaling API. This and some snazzy JavaScript make it possible for the website to serve up the appropriately sized image for the requesting device. Scaling images on demand like this drastically reduces the amount of data transmitted, and you can cache the scaled images on the server and CDN.
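A simple sketch of the device-aware scaling idea: pick the smallest breakpoint that covers the requesting device and pass it as a width query parameter. The scaled_image_url helper and its breakpoints are hypothetical; Sitecore’s media handler accepts width parameters, but check your own platform’s API:

```python
def scaled_image_url(base_url, device_width, breakpoints=(100, 320, 768, 1200)):
    """Pick the smallest breakpoint wide enough for the device, then
    request an on-demand scaled rendition via a width query parameter.
    Scaled renditions can be cached on the server and the CDN."""
    for width in breakpoints:
        if device_width <= width:
            return f"{base_url}?w={width}"
    return base_url  # very large screens get the full-size image

url = scaled_image_url("/media/hero.jpg", device_width=100)
```

A 100-pixel-wide phone screen now downloads a few kilobytes instead of a multi-megabyte original, and every device in the same breakpoint shares one cached rendition.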

Prepare For Success

Viral events have as much capacity for harm as for good – more, in fact. If your server goes down under ordinary traffic conditions, you may be able to recover quickly and limit the damage, but a website crash during a high-traffic period can have a widespread effect on the perception of your brand. There’s a lot you can do to protect your brand from that outcome, and many tools to help you do it. With the proper infrastructure in place, you can enjoy your viral successes and look forward to more.