Fast-Forward Performance – The Future Looks Bright

Note: This is a cross-post, the original post can be found as part of the 2014 perf calendar.


Generally, I prefer to mention the bad news first:

  • Slow websites will always exist.
  • Websites will continue to become more complex and bigger.
  • Our demand for speed will certainly not decline, and our patience will not grow.

These facts shouldn’t come as a surprise to anyone who cares about web performance.

Predicting the future is difficult, and science has not yet made time travel possible so that we could peek ahead at what will happen to web performance. However, have you paid attention to the W3C activities recently? There is some really exciting performance stuff cooking.

My contribution to this year’s performance calendar is to tell you what convenient features we can expect in the future when dealing with web performance.

The future is (almost) here

The good news is that the W3C Web Performance Working Group and browser vendors have acknowledged that performance is an important piece of web development. They have already pushed out, and continue to propose, new standards and implementations for those performance APIs.

The purpose of a web API is to provide you with better access to your users’ browser and device. Performance APIs can ease the process of accurately measuring, controlling, and enhancing the users’ performance. In addition, new protocols and HTML elements have been proposed to help serve content even faster and more optimized to users. Prior to these enhancements, it was impossible for developers to accurately measure their website performance.

Please note, I added a browser compatibility table at the end of this post so you can verify each API against current browser support.

I’m excited about the future of web performance and this post describes why.

Can I get an API with that?

Several performance APIs already exist, and new ones are currently being worked on. To ensure quality and interoperability, W3C standards go through a specification maturity process, as shown below, starting from step 1 (“Editor’s Draft”) and ending at step 5 (“W3C Recommendation”). Most start landing in browsers (“behind a flag”) during the “Working Draft” phase and get refined over time. After step 3 (“Candidate Recommendation”), developers can expect the API feature to be released un-prefixed in some browsers.

Let’s take a closer look at each API listed in the boxes above, from right to left.

Navigation Timing

This specification defines an interface for web applications to access timing information related to navigation and elements. – W3C

The Navigation Timing API helps measure real user data such as bandwidth, latency, or the overall page load time for the main page, and it is mainly used to collect real user monitoring (RUM) data.

The API allows developers to inquire about the page’s performance via JavaScript through the PerformanceTiming interface.

var page = performance.timing,
    plt = page.loadEventStart - page.navigationStart;
// Page load time (PLT) output for a specific browser/user, in ms
console.log(plt);

Navigation timing covers metrics of the entire page. To gather metrics about individual resources, please check out the Resource Timing API further down below.

You can use this API to collect performance metrics about your user, especially when using RUM as one of your measurement techniques.
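
As a rough sketch of what such a RUM collection step might look like, the helper below derives a few phase durations from an object shaped like performance.timing. The attribute names come from the PerformanceTiming interface; the function name summarizeTiming and the choice of phases are my own assumptions for illustration, not something the specification prescribes.

```javascript
// Hypothetical helper: derives a few phase durations (in ms) from an object
// shaped like performance.timing. The attribute names follow the
// PerformanceTiming interface; the helper itself is only illustrative.
function summarizeTiming(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,
    connect: t.connectEnd - t.connectStart,
    // Time to first byte, measured here from the start of the navigation.
    ttfb: t.responseStart - t.navigationStart,
    domReady: t.domContentLoadedEventStart - t.navigationStart,
    pageLoad: t.loadEventStart - t.navigationStart
  };
}

// Browser usage: wait for the load event so that loadEventStart is set.
// window.addEventListener('load', function () {
//   console.log(summarizeTiming(performance.timing));
// });
```

Passing the timing object in as a parameter keeps the calculation separate from the browser, so the same function can be reused on data that was already sent to your server.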

Navigation Timing 2 has been announced and will replace the first version.

High Resolution Timing

This specification defines a JavaScript interface that provides the current time in sub-millisecond resolution and such that it is not subject to system clock skew or adjustments. – W3C

var perf = performance.now();
// console output 439985.4570000316

When it comes to performance, accurate measurements are very beneficial. The High Resolution Timing API supports floating point timestamps providing measurements to a microsecond level of detail.
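
A minimal sketch of how these timestamps are typically used: take one reading before and one after a piece of work and subtract them. The helper name elapsedMs and the loop being timed are placeholders I made up for illustration.

```javascript
// Hypothetical helper: runs fn once and returns how long it took in
// milliseconds (with a fractional part), based on performance.now().
function elapsedMs(fn) {
  var start = performance.now();
  fn();
  return performance.now() - start;
}

// Placeholder workload: sum the first 100000 integers.
var duration = elapsedMs(function () {
  var sum = 0;
  for (var i = 0; i < 100000; i++) {
    sum += i;
  }
  return sum;
});
console.log('Loop took ' + duration.toFixed(3) + ' ms');
```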

Page Visibility

This specification defines a means for site developers to programmatically determine the current visibility state of the page in order to develop power and CPU efficient web applications. – W3C

The visibilitychange event is fired on document whenever the page becomes visible or hidden.

document.addEventListener('visibilitychange', function (event) {
  if (document.hidden) {
    // Page currently hidden.
  } else {
    // Page currently visible.
  }
});

This event is very helpful to programmatically determine the current visibility state of the page. For example, the API can be applied if your user has several browser tabs open and you don’t want specific content to execute (e.g. playing a video, or rotating images in a carousel). Especially on mobile devices, this can be a great advantage in reducing battery consumption for your users when your page is open in an inactive tab but not visible.
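
The carousel case could be sketched roughly like this. The helper pauseWhenHidden, the nextSlide function and the 3000 ms interval are assumptions made up for the example; only document.hidden and the visibilitychange event come from the API itself.

```javascript
// Hypothetical helper: calls pause() when the page becomes hidden and
// resume() when it becomes visible again. `doc` is normally `document`.
function pauseWhenHidden(doc, pause, resume) {
  doc.addEventListener('visibilitychange', function () {
    if (doc.hidden) {
      pause();
    } else {
      resume();
    }
  });
}

// Browser usage with a carousel timer (nextSlide is a placeholder):
// var timer = setInterval(nextSlide, 3000);
// pauseWhenHidden(document,
//   function () { clearInterval(timer); },
//   function () { timer = setInterval(nextSlide, 3000); });
```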

Here is a neat sample page illustrating the firing of the visibilitychange event.

Resource Timing

This specification defines an interface for web applications to access the complete timing information for resources in a document. – W3C

The Resource Timing API is a bit newer and not as well supported as the Navigation Timing API. It lets you dig deeper into the behaviour of each individual resource on a page. Imagine you put an image on your page but don’t know how it performs in the real world; you would therefore like to know the Time to First Byte (TTFB) metric for this image.

As an example, let’s pick the performance calendar logo (

var img = window.performance.getEntriesByName("")[0];
var ttfb = parseInt(img.responseStart - img.startTime, 10),
    total = parseInt(img.responseEnd - img.startTime, 10);
console.log(ttfb); // output 93 (in ms)
console.log(total); // output 169 (in ms)
// you could log this somewhere in a database or
// send an image beacon to your server
logPerformanceData('main logo', ttfb, total);

If the Timing-Allow-Origin header is set by third party providers, you can even check the performance of third party resources on your page.

Beyond the main page’s performance (via Navigation Timing API), you can track real user experiences on a more granular basis (i.e. resource-basis). By having knowledge of this data, you can find potential performance bottlenecks for a specific resource.
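
To hunt for such bottlenecks across all resources rather than a single image, something along these lines might work. The helper name slowestResources and the rounding are my own choices; it expects entries shaped like the ones returned by performance.getEntriesByType('resource'). Note that cross-origin entries without a Timing-Allow-Origin header report responseStart as 0, so their ttfb value is not meaningful.

```javascript
// Hypothetical helper: returns the `count` slowest entries from a list of
// resource timing entries as { name, ttfb, total } objects, rounded to ms.
function slowestResources(entries, count) {
  return entries
    .map(function (e) {
      return {
        name: e.name,
        ttfb: Math.round(e.responseStart - e.startTime),
        total: Math.round(e.responseEnd - e.startTime)
      };
    })
    .sort(function (a, b) { return b.total - a.total; }) // slowest first
    .slice(0, count);
}

// Browser usage:
// console.log(slowestResources(performance.getEntriesByType('resource'), 5));
```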

Performance Timeline

This specification defines an unified interface to store and retrieve performance metric data. This specification does not cover individual performance metric interfaces. – W3C

The Performance Timeline specification defines a unifying interface to retrieve the performance data collected via Navigation Timing, Resource Timing and User Timing.

// gets all entries in an array
var perf = performance.getEntries();
for (var i = 0; i < perf.length; i++) {
  console.log("Asset Type: " +
    perf[i].name +
    " Duration: " +
    perf[i].duration);
}

Check out the detailed post by Andrea Trasatti on the performance interface. He created a tool to generate HAR files from the performance timeline API, which provides you with a timeline view of performance metrics as they happen. You can plot the results as well. Andy Davies created a great waterfall bookmarklet to illustrate this.

Battery Status

This specification defines an API that provides information about the battery status of the hosting device. — Source

The API provides you with access to the battery status of your users’ battery-driven devices.

You can query the charging, chargingTime, dischargingTime and level attributes, and listen for events that fire when these values change.

var battery =
  navigator.battery ||
  navigator.webkitBattery ||
  navigator.mozBattery;
if (battery) {
  console.log("Battery charging? " + (battery.charging ? "Yes" : "No"));
  console.log("Battery level: " + battery.level * 100 + " %");
  console.log("Battery charging time: " + battery.chargingTime + " seconds");
  console.log("Battery discharging time: " + battery.dischargingTime + " seconds");
}

More samples and details are posted on the Mozilla Battery Status API page, as well as here

By knowing the users’ battery state, you could serve content based on the status (e.g. don’t send energy intensive elements to the user if the battery level is below 20%).
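
Such a rule could be sketched as a small decision function. The name shouldServeLightContent and the policy itself (only go light when the device is discharging and below the threshold) are assumptions for illustration; charging and level are the attributes described above, with level ranging from 0.0 to 1.0.

```javascript
// Hypothetical policy: serve the lighter experience when the device is
// discharging and its level (0.0 to 1.0) is below the given threshold.
function shouldServeLightContent(battery, threshold) {
  if (!battery) {
    return false; // API not available: keep the default experience
  }
  return !battery.charging && battery.level < threshold;
}

// Browser usage with the 20% rule from above:
// if (shouldServeLightContent(navigator.battery, 0.2)) { /* skip heavy assets */ }
```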

User Timing

User Timing provides a simple JavaScript API to mark and measure application-specific performance metrics with the help of the same high-resolution timers. – W3C

With the User Timing API, you can set markers to measure specific blocks or functions of your application. The calculated elapsed time can be an indicator for good or bad performance.

performance.mark("start");
loadSomething();
performance.mark("end");
performance.measure("measureIt", "start", "end");
var markers = performance.getEntriesByType("mark");
var measurements = performance.getEntriesByName("measureIt");
console.log("Markers: ", markers);
console.log("Measurements: ", measurements);

function loadSomething() {
  // some crazy cool stuff here :)
  console.log(1 + 1);
}

The markers can help you focus on specific activities on your page and measure important milestones when your application/website is being executed.


Beacon

This specification defines an interoperable means for site developers to asynchronously transfer small HTTP data from the User Agent to a web server. – W3C

With the Beacon API, you can send analytics or diagnostic data from the user agent to the server. Because the data is sent asynchronously, you won’t block the rendering of the page.

// url is the address of your logging endpoint
navigator.sendBeacon(url,
  "any information you want to send");

Here is a neat demo page.

You can use the recommended beacon to carry performance information to a specific URL for further RUM analysis.
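
A minimal sketch of that idea, assuming a made-up collection endpoint ('/rum') and a made-up helper name (reportMetrics). Only navigator.sendBeacon is part of the API; it returns true when the browser accepted the data for delivery.

```javascript
// Hypothetical helper: serializes metrics as JSON and queues them with
// sendBeacon when available. Returns true if the data was queued.
// `nav` is normally `navigator`.
function reportMetrics(nav, url, metrics) {
  if (nav && typeof nav.sendBeacon === 'function') {
    return nav.sendBeacon(url, JSON.stringify(metrics));
  }
  return false; // the caller may fall back to XMLHttpRequest or an image beacon
}

// Browser usage ('/rum' is a placeholder endpoint):
// window.addEventListener('unload', function () {
//   reportMetrics(navigator, '/rum', { plt: 1234 });
// });
```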

Animation Timing

This document defines an API web page authors can use to write script-based animations where the user agent is in control of limiting the update rate of the animation. The user agent is in a better position to determine the ideal animation rate based on whether the page is currently in a foreground or background tab, what the current load on the CPU is, and so on. Using this API should therefore result in more appropriate utilization of the CPU by the browser. – W3C

Instead of using setTimeout or setInterval to create animations, use requestAnimationFrame. This method grants the browser control over how many frames it can render; aiming to match the screen’s refresh rate (usually 60fps) will result in a jank-free experience. It can also throttle animations if the page loses visibility (e.g., the user switches tabs), dramatically decreasing power consumption and CPU usage.
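
As a rough sketch, a time-based animation loop could look like the following. The helper name animate, the element id and the 500 ms duration are placeholders of mine; the only API used is requestAnimationFrame, which passes a timestamp to its callback.

```javascript
// Hypothetical helper: calls draw(progress) once per frame, with progress
// going from 0 to 1 over durationMs. `raf` is normally
// window.requestAnimationFrame.
function animate(raf, durationMs, draw) {
  var start = null;
  function frame(now) {
    if (start === null) {
      start = now; // the first frame defines the start time
    }
    var progress = Math.min((now - start) / durationMs, 1);
    draw(progress);
    if (progress < 1) {
      raf(frame); // ask the browser for the next frame
    }
  }
  raf(frame);
}

// Browser usage (the element id is a placeholder):
// var box = document.getElementById('box');
// animate(window.requestAnimationFrame.bind(window), 500, function (p) {
//   box.style.transform = 'translateX(' + (p * 200) + 'px)';
// });
```

Basing the progress on the timestamp rather than on a frame counter keeps the animation duration the same even when the browser renders fewer frames.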

Check out Microsoft’s demo page comparing setTimeout with requestAnimationFrame.

Smoother animations result in happy users, low CPU usage, and low power consumption.

Resource Hints

This specification defines a means for site developers to programmatically give the User Agent hints on the download priority of a resource. This will allow User Agents to more efficiently manage the order in which resources are downloaded. – W3C

Predictive browsing is a great way to serve your users with exactly what they want to see or retrieve next. “Pre-browsing” refers to an attempt to predict the users’ activity with your page (i.e. is there anything we can load prior to the user requesting it?).

The following pre-browsing attributes are meant to help you with pre-loading assets on your page.


For example, if you set up tracking on your page, you probably know where your users are headed most often. You could use resource hints to pre-load subsequent resources of the next page to allow for quicker loading of that consecutive page.
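
A minimal sketch of how such hints could be generated from your tracking data. It assumes the rel values dns-prefetch, preconnect, prefetch and prerender from the Resource Hints specification; the helper name hintTag and the URL are made up, and href is assumed to be a trusted value (no escaping is done).

```javascript
// rel values assumed from the Resource Hints specification; browser
// support for each of them varies.
var HINT_TYPES = ['dns-prefetch', 'preconnect', 'prefetch', 'prerender'];

// Hypothetical helper: builds a <link> hint tag for the page head.
function hintTag(rel, href) {
  if (HINT_TYPES.indexOf(rel) === -1) {
    throw new Error('Unknown resource hint: ' + rel);
  }
  return '<link rel="' + rel + '" href="' + href + '">';
}

// '/checkout.html' stands in for the page your users visit next most often.
console.log(hintTag('prefetch', '/checkout.html'));
```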

Other proposals (not supported yet)

  • Frame Timing

    This specification defines an interface to help web developers measure the performance of their applications by giving them access to frame performance data to facilitate smoothness (i.e. Frames per Second and Time to Frame) measurements. – W3C

  • Navigation Error Logging

    This specification defines an interface to store and retrieve error data related to the previous navigations of a document. – W3C

Protocols, standards, and new HTML elements


HTTP/2

HTTP/2 and the SPDY protocol (developed by Google) allow several concurrent HTTP requests to run across one TCP session, and provide data compression of HTTP headers.

It’s no secret that SPDY has been a huge motivation for revamping the HTTP protocol. SPDY is not HTTP/2; however, when the HTTP/2 proposals were introduced in 2012, SPDY’s specifications were adopted as a starting point (one single TCP connection, HTTP header compression, Server Push etc.), see more here.

When HTTP was introduced more than two decades ago, latency wasn’t something that was necessarily thought about. In HTTP/1.1, only the client can initiate a request. Even if the server knows the client needs a resource, it has no mechanism to inform the client. Therefore, it must wait to receive a request for the resource from the client. HTTP/2 promises to make HTTP requests cheaper, reducing the need for us to come up with techniques (or maybe hacks?), such as CSS image sprites, inlining etc., to minimize the number of HTTP requests needed.

The HTTP/2.0 encapsulation enables more efficient use of network resources and reduced perception of latency by allowing header field compression and multiple concurrent messages on the same connection. It also introduces unsolicited push of representations from servers to clients. — HTTP/2.0 Draft 4


WebP

WebP is an image format, offering both lossy and lossless compression, that promises optimized image delivery for the web. WebP is open source and developed by Google. It is currently only supported in Chrome, Opera and Android, but promises 30% smaller file sizes than a comparable JPEG image. Big websites such as Facebook have started to adopt it with great success: 90% of images sent to Facebook and Messenger for Android use the WebP format, and they see file size decreases of up to 80% from PNG to WebP.

You can convert your images using a WebP converter like ImageMagick or others.

One of the drawbacks is that as long as not all browsers support this new format, you will need to save two versions of the image, one in WebP and one in the legacy image format, resulting in more storage costs.
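
One way to choose between the two stored versions is to decide on the server, based on the request’s Accept header. The sketch below assumes that browsers with WebP support advertise image/webp in that header; the helper name pickImage and the file names are made up for illustration.

```javascript
// Hypothetical helper: picks the WebP variant when the Accept header
// advertises image/webp, otherwise the legacy file extension.
function pickImage(acceptHeader, basename, legacyExt) {
  var supportsWebP = /\bimage\/webp\b/.test(acceptHeader || '');
  return basename + '.' + (supportsWebP ? 'webp' : legacyExt);
}

// 'hero' and 'jpg' are placeholder values.
console.log(pickImage('image/webp,*/*;q=0.8', 'hero', 'jpg'));
console.log(pickImage('*/*', 'hero', 'jpg'));
```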

Some examples of WebP images can be found here.

Picture element and srcset attribute

This specification defines the HTML picture element and extends the img and source elements to allow authors to declaratively control or give hints to the user agent about which image resource to use, based on the screen pixel density, viewport size, image format, and other factors. – W3C

The <picture> element and srcset attribute provide two ways of getting responsive images in the browser. The srcset attribute allows you to target particular screen densities with images that have been scaled. The picture element, on the other hand, is primarily used for “art directed” content, where the contents of the image change based on a CSS breakpoint. The standard <img> tag serves both as a fallback and as the actual image container. Together, picture and srcset help deal with different device sizes and accommodate different image sizes on responsive websites.

<picture>
  <source media="(min-width: 1280px)" srcset="large-hero.jpg, large-hero-2.jpg 2x">
  <source media="(min-width: 600px)" srcset="med-hero.jpg, med-hero-2.jpg 2x">
  <source srcset="small-hero.jpg, small-hero-2.jpg 2x">
  <img src="hero-1.jpg" alt="Hero image">
</picture>

Responsive image solutions can help save bandwidth by providing the most optimized image to the users’ screen width and device.

Browser support overview

The number in each column is the first browser version that supports the specification; later versions support it as well.

Specification Internet Explorer Firefox Chrome Safari Opera iOS Safari Android
Navigation Timing 9 31 all 8 26 8 (not 8.1) 4.1
High Resolution Timing 10 31 all 8 26 8 (not 8.1) 4.4
Page Visibility 10 31 all 7 26 7.1 4.4
Resource Timing 10 34 all 26 4.4
Battery Status* 31 (partially) 38 26
User Timing 10 all 26 4.4
Beacon 31 39 26
Animation Timing 10 31 all 6.1 26 7.1 4.4
Resource Hints Canary limited
Frame Timing
Navigation Error Logging
WebP* all 26 4.1
Picture element and srcset attribute * 38 26

*Not part of Web Performance Working Group

More information can be found here and here.

Ready, set, go?

The W3C brings together industry and community to make performance recommendations and specifications a reality so that developers can implement them. Please note, not all browsers implement the APIs exactly according to specification, so make sure to verify their functionality for your supported browser list.

With power comes great responsibility and while offering all these new techniques and APIs to developers, we will need to make sure that we understand their power. Browsers and content delivery networks (CDNs) have helped us quite a bit in prioritizing and optimizing web delivery. However, giving developers additional tools to boost web performance is much appreciated.

Stay up-to-date

Your best look into the future is to subscribe to the Web Performance Working Group mailing list for any performance related updates.


This blog post is a compilation of different sections from my upcoming book “Lean Websites”, where I discuss not only web performance APIs, but also provide general guidance on how to create lean websites. Feel free to pre-order your copy before the book officially launches in 2015.

Happy Holidays, Everyone!

“Lean Websites” – The ultimate performance bootcamp

I’m extremely delighted to let you know that I’ve started to write my very first book. The book is called “Lean Websites” and focuses on front-end web performance.

“Lean Websites” will help you understand today’s web performance hurdles and guide you through a fun performance bootcamp with the goal to shave off some unnecessary page weight and increase the speed of your site.

Check out the link for more details.

Follow-up 3rd party footprint

The following post outlines the links, tools and articles I mentioned in my 3rd party footprint talk

Main Slides

Slides (Webdirections, Melbourne) and Slides (Velocity NYC 2014)

Shared Links and Articles

Tools and Tricks

WebPagetest Results

The Answer to Your Mobile Strategy and Performance Could Simply be…ESI

ESI stands for Edge Side Includes and is an XML-based markup language. ESI support is offered by content delivery network (CDN) vendors like Akamai and F5 and also now Varnish (but only a very limited subset). If you use any of these vendors, you have ESI at your disposal. ESI can be used for caching purposes; in today’s post, however, I will focus on how it could help you with your mobile strategy and performance.

The W3C states:
“ESI allows for dynamic content assembly at the edge of the network, whether it is in a Content Delivery Network, end-user’s browser, or in a “Reverse Proxy” right next to the origin server.” (

So, let’s pay attention to “dynamic content assembly” in the context of this blog post’s title.

If you are familiar with Server-Side-Include (SSI) and XML/XSLT, you will have no problem understanding ESI. It also supports the same access to variables based on HTTP request attributes, e.g. you can easily check for HTTP_USER_AGENT or HTTP_HEADER. ESI can also include fragments or snippets of additional content via an include command. To make this even more powerful, ESI supports conditional processing, which means logic can be applied to execute specific content <include/>s based on specific conditions, e.g. user agents. All of this is done at the edge; the user will never get content they are not supposed to receive. Additionally, when processing ESI, there is no need to go back to the origin for processing, hence the load at the origin is cut down.

What is (your) Mobile Strategy?

A mobile strategy or approach could range from “We don’t have one”, or “We swear by responsive web design” to “We take mobile very seriously and have dedicated sites”. If you opt for the first statement, please read this and then come back. If you opt for the second or third statement, please continue reading. While there are several options out there to do device detection via PHP or any other server-side language, I’d like to provide several insights on how the same can be achieved with ESI. Similar to other redirect strategies out there, ESI can be used to redirect users to a different site based on the visitor’s user agent. The redirect occurs at the edge and is faster than putting the redirect logic at the origin; hence, you experience performance improvements. ESI is a powerful and cheap tool, and your way to a proper mobile device strategy could work by following the steps below.

To continue, please go to my guest post for Stoyan’s perf calendar, December 3:

“Grunt” your way to frontend performance optimization

Performance optimization has been more than ever in the spotlight of web developers, especially for mobile web developers who have to understand and know by heart the challenges and constraints of mobile devices: these devices run off a battery that e.g. drains faster if performance is not taken seriously. The devices are powered by smaller CPUs than desktop devices. Unknown factors like latency and network connectivity challenge developers to build slim, light-weight and fast websites. Data plans still remain expensive and inconsiderate use of served data by web developers should not be ignored.

Clearly, performance is (and should) not (be) an after-thought anymore. When web developers create websites, performance can influence the success or failure of a web product. We’ve been hearing from leading performance advocates like Ilya Grigorik that speed is a feature and should not only be thought of just before a product hits production but rather as an essential part of the web product development cycle.

For example, instead of minifying and concatenating CSS and JavaScript files manually, tools and processes are out there that can help and put these performance tasks into an automated workflow, and more importantly right from the beginning of a product development cycle.

I’ve been using Maven to run most of the automated performance optimization at work; however, I’ve always been interested in using Grunt for the same purpose. Grunt is a JavaScript-based task runner, created for web products, that can be leveraged to make performance part of the deployment process.

In today’s blog post I will be sharing some of the plugins for Grunt that can be used to speed up and automate performance optimization. At the end of the blog post, I will present performance results that will show that frontend optimization (FEO) can be fun and easily be automated to cut page load time.

Google’s “Make the Web Fast” team recommends frontend best practices, and Steve Souders’ “High Performance Web Sites” outlines rules that can be applied for FEO. I decided to pick two of Steve’s rules, “Make Fewer HTTP Requests” and “Minify JavaScript” (and CSS, HTML), and to use Grunt plugins that can help automate them. So here it goes.

Note: The post assumes that you’ve worked with Grunt before and know how it’s been installed, and how to install plugins (I won’t go into details, however links at the bottom will help you)

“Let’s grunt it up”

(All plugin headings in this post are clickable links to their appropriate pages)


Montage helps you sprite images to reduce HTTP requests. You will need ImageMagick to be installed. Alternatively, you can also try out grunt-spritefiles.


This plugin is useful when you want to develop and debug a version of your site that doesn’t use the minified and concatenated version of your JavaScript or CSS files. A comment block is wrapped around the JavaScript and CSS assets that will be concatenated into your destination file during deployment.

<!-- build:js js/magic.min.js -->
<script src="js/1.js"></script>
<script src="js/2.js"></script>
<script src="js/3.js"></script>
<script src="js/4.js"></script>
<!-- endbuild -->

will become

<script src="js/magic.min.js"></script>

You can use grunt-processhtml instead.


Alternatively you could use grunt-closure-compiler instead of combining concat and uglify and cssmin for JavaScript and CSS files.


Uglify and concat go almost hand in hand and should be used together: the concat plugin first combines all defined JavaScript files, and uglify then minifies the combined code into a single line. Uglify only works on JavaScript files; use cssmin for CSS files.


Same logic and idea as uglify: once your CSS files are all concatenated, use cssmin to shrink several lines of CSS code into a single one.


Combine CSS and JavaScript files with this plugin. It allows you to reduce your HTTP requests from one per file to just one for the combined file.
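
To show how concat, uglify and cssmin might fit together, here is a Gruntfile.js sketch. It is not the Gruntfile from my sample page: all file paths and target names are placeholders, and it assumes the grunt-contrib-concat, grunt-contrib-uglify and grunt-contrib-cssmin plugins are installed.

```javascript
// Illustrative Gruntfile.js sketch; all file paths are placeholders.
function configure(grunt) {
  grunt.initConfig({
    // 1. Combine the individual source files.
    concat: {
      js: {
        src: ['js/1.js', 'js/2.js', 'js/3.js', 'js/4.js'],
        dest: 'build/magic.js'
      },
      css: {
        src: ['css/*.css'],
        dest: 'build/magic.css'
      }
    },
    // 2. Minify the combined JavaScript.
    uglify: {
      js: {
        files: { 'build/magic.min.js': ['build/magic.js'] }
      }
    },
    // 3. Minify the combined CSS.
    cssmin: {
      css: {
        files: { 'build/magic.min.css': ['build/magic.css'] }
      }
    }
  });

  grunt.loadNpmTasks('grunt-contrib-concat');
  grunt.loadNpmTasks('grunt-contrib-uglify');
  grunt.loadNpmTasks('grunt-contrib-cssmin');

  // Running `grunt` without arguments executes the three steps in order.
  grunt.registerTask('default', ['concat', 'uglify', 'cssmin']);
}

// Grunt loads the function exported from Gruntfile.js.
if (typeof module !== 'undefined') {
  module.exports = configure;
}
```

Listing concat before the two minifiers in the default task matters, because uglify and cssmin read the files that concat writes.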


imagemin minifies JPG and PNG images. It’s a handy Grunt plugin if you don’t know whether the assets you got handed from your designer (or yourself) are optimized for the web yet. By using this tool, you have the peace of mind that you use image files in an efficient way. Alternatively, you could use grunt-smushit; it’s based on Yahoo’s great smushit tool that is available in the YSlow plugin for several browsers.


This plugin encodes images as base64 and leverages the technique of data URIs for images, something that can be used inline with CSS to reduce HTTP requests and hence reduce page load time. I’ve written a blog post where this is explained in more detail.


This plugin uses the htmlcompressor tool to minify and compress HTML files. The options parameter is handy to tweak your compression; my example uses compressJS and preserveServerScript to also compress inline JavaScript and to preserve server script tags in case I wanted to include some SSI code.


Use grunt-exec to run SPOFcheck, an excellent tool developed by the eBay team to identify bad 3rd party script includes. I didn’t include the scripts asynchronously, hence SPOFcheck complains and recommends changes to avoid a single point of failure (SPOF).

“It’s Magic” Sample Page

I created a simple page themed “It’s magic” where I applied all mentioned plugins. You can find the files including Gruntfile.js here. Please note, I intentionally didn’t put a lot of effort into the styling (It is supposed to look as simple and cheesy as it feels to you)

In a nutshell, the page has a logo, uses jQuery from the Google CDN, and includes a simple jQuery gallery with previous and next buttons. Simple JavaScript and CSS files are being used.


The logo is a PNG file, and the images are not optimized. There are several CSS and JavaScript files that are all included individually, neither minified nor concatenated.


This file is the one that Grunt will create for you. Visually, the file doesn’t look that different, besides the fact that the title has changed. See for yourself.

Below are screenshots of the two pages (and links) side by side, the one on the left before Grunt tasks were applied. The right one shows the page after Grunt tasks were applied. For the user they both look the same (except for the heading).

without/magic.html with/magic.html
No magic here!

Can you spot the differences?

  1. The logo was transformed into a data URI
  2. The title has changed from It’s not magic to It’s magic
  3. The local CSS and JavaScript files were minified and concatenated
  4. The HTML was compressed, and comments were taken out automatically
  5. The next and previous buttons were converted to a sprite file
  6. On build, SPOFcheck was applied and gave us the following warnings so we could address possible SPOF issues


Let’s take a look under the hood

Here are the waterfalls for both versions:

Without magic

With magic

WebPagetest Results

I ran WebPagetest for both files, with 9 runs each for IE8 on a DSL connection, and used the median run. Here are the performance results:

  1. Without magic results
  2. With magic results
  • HTTP requests dropped by ~48%
  • Page load time (PLT) dropped by ~10%
  • File sizes dropped by ~10%

Even if those numbers are not that high (mostly due to the simplicity of the experiment), they show that Grunt can help you automatically optimize your deployment process.

As you can tell by the sample code, there are many mix and match options available, depending on the magnitude and granularity of your page structure. Nevertheless, this little sample shows how to use Grunt to optimize performance and illustrates what is possible. Feel free to use the code as a starting point, and tweak or customize it to your liking.

General references and info to get you started with Grunt

Follow-up on my talk “Embracing Performance in Today’s Multi-Platform Macrocosm”

Hello everyone, this is a follow-up blog post on my talk presented at BDConf in San Diego and WebExpo in Prague.

If you landed here because you’ve typed in the URL after attending my talk, great! Thanks for making it all the way here. I hope you enjoyed my talk.

If you landed here via Google, Twitter or any other sites, I welcome you too, of course! You might want to first have a look at my slides (see link below) before clicking on any of the other links below.

Either way, feel free to leave a comment or contact me via twitter with my handle @bbinto.



Slides available on SlideShare

Links and articles, recommended content

Maven Tools

Continuous Integration Tools (<3)

General Links

Image Credits

The Power of a Private HTTP Archive Instance: Finding a Representative Performance Baseline

(Note: cross-posted at

Be honest, have you ever wanted to play Steve Souders for a day and pull some revealing stats or trends about some web sites of your choice? Or maybe dig around the HTTP Archive? You can do that and more by setting up your own HTTP Archive. The HTTP Archive is a fantastic tool to track, monitor, and review how the web is built. You can dig into trends around page size, page load time, content delivery network (CDN) usage, distribution of different mimetypes, and many other stats. With the integration of WebPagetest, it’s a great tool for synthetic testing as well.

You can download an HTTP Archive MySQL dump (warning: it’s quite large) and the source code from the download page and dissect a snapshot of the data yourself.  Once you’ve set up the database, you can easily query anything you want.


You need MySQL, PHP, and your own webserver running. As I mentioned above, HTTP Archive relies on WebPagetest—if you choose to run your own private instance of WebPagetest, you won’t have to request an API key. I decided to ask Patrick Meenan for an API key with limited query access. That was sufficient for me at the time. If I ever wanted to use more than 200 page loads per day, I would probably want to set up a private instance of WebPagetest.

To find more details on how to set up an HTTP Archive instance yourself and any further advice, please check out my blog post.


Going back to the scenario I described above: the real motivation is that often you don’t want to throw your website(s) in a pile of other websites (e.g. not related to your business) to compare or define trends. Our digital property at the Canadian Broadcasting Corporation (CBC) spans dozens of URLs that have different purposes and audiences. For example, CBC Radio covers most of the Canadian radio landscape, CBC News offers the latest breaking news, CBC Hockey Night in Canada offers great insights on anything related to hockey, and CBC Video is the home for any video available on CBC. It’s valuable for us to not only compare to the top 100K Alexa sites but also to verify stats and data against our own pool of web sites.

In this case, we want to use a set of predefined URLs that we can collect HTTP Archive stats for. Hence a private instance can come in handy: we can run tests every day, every week, or just every month to gather information about the performance of the sites we’ve selected. From there, it’s easy to not only compare trends from the public HTTP Archive to our own instance as a performance baseline, but also have a great amount of data in our local database to run queries against and to do proper performance monitoring and investigation.

Visualizing Data

The beautiful thing about having your own instance is that you can be your own master of data visualization: you can now create more charts in addition to the ones that came out of the box with the default HTTP Archive setup. And if you don’t like Google chart tools, you may even want to check out D3.js or Highcharts instead.

The image below shows all mime types used by CBC web properties that are captured in our HTTP archive database, using D3.js bubble charts for visualization.

Mime types distribution for CBC web properties using D3.js bubble visualization. The data were taken from the requests table of our private HTTP Archive database.


Querying the Database

Sometimes, you want to get some questions answered without creating a chart. That’s when you can query the MySQL tables directly. Let’s run a simple query on the requests table.

For example, some of the CBC sites use YUI, some use jQuery—but we would really like to avoid having pages serve both. A simple sample query like the one below could help identify those sites:

SELECT req_referer
FROM requests
WHERE url LIKE "%/jquery_.js%" OR url LIKE "%/i/l/yui/%"
GROUP BY req_referer
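As written, the query above lists every page that loads either library. A minimal sketch of narrowing it down to pages that load both, using an in-memory SQLite table as a stand-in for the MySQL requests table. The sample rows, the example.* hosts and the exact LIKE patterns are assumptions made for illustration, not data from our instance:

```python
import sqlite3

# In-memory stand-in for the HTTP Archive "requests" table (two columns only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (req_referer TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?)",
    [
        # Hypothetical sample rows, not real crawl data.
        ("http://a.example/", "http://cdn.example/js/jquery.min.js"),
        ("http://a.example/", "http://cdn.example/i/l/yui/yui-min.js"),
        ("http://b.example/", "http://cdn.example/js/jquery.min.js"),
        ("http://c.example/", "http://cdn.example/i/l/yui/yui-min.js"),
    ],
)

# Keep only referers with at least one jQuery request AND one YUI request.
BOTH_LIBRARIES = """
SELECT req_referer
FROM requests
GROUP BY req_referer
HAVING SUM(CASE WHEN url LIKE '%/jquery%.js%' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN url LIKE '%/i/l/yui/%' THEN 1 ELSE 0 END) > 0
ORDER BY req_referer
"""

pages_with_both = [row[0] for row in conn.execute(BOTH_LIBRARIES)]
print(pages_with_both)
```

The HAVING clause does the work here: grouping by referer and counting matches per library separately is what distinguishes "both" from "either".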

And More …

We will share more of the queries and insights we’ve gathered from our HTTP Archive instance that helped us identify bottlenecks. In addition, we will also discuss how this setup came in very handy to discover problems with some unnecessary page weight that we thought we didn’t have.

Join our talk at the Velocity conference in Santa Clara in June titled “The Canadian Public Broadcaster on A Diet: Slimming Down for A Whole Nation.” The talk will not only cover the private HTTP Archive instance but furthermore cover many other aspects of how to focus on (mobile) web fitness and how to “slim down.”

Related Posts to our Talk

Warming up for Velocity 2013 in Santa Clara

I have attended several conferences in the last few years. The first one that really changed my “developer life” was the Velocity 2011 conference in Santa Clara. I have always been interested in optimizing and being diligent about the web; however, what I learned during those three days in Santa Clara has influenced my everyday life and the way I see performance.

I truly admire each and every speaker and attendee at the conference because they all share the same passion: Optimizing the web and making performance count. I am honoured to announce that it is my turn this year to give back to that same community of people and share what I have learned over the last few years and what I have been applying at Canada’s public broadcaster, the CBC.

Our talk “The Canadian Public Broadcaster On A Diet: Slimming Down For A Whole Nation” will focus on (mobile) web fitness and how to “slim down”. Tips and tricks will be shared about how to stay in shape when developing (mobile) sites for millions of people.

My talented co-worker Blake and I will be talking about how we apply performance optimization at the CBC, one of Canada’s largest web properties with over 5 million pages. As a publicly funded organization, all Canadian eyes are on us making sure we stay on budget and deliver quality and optimized content to users.

While Blake will be talking more about the backend, server and CDN aspects of performance optimization and tips, I will be sharing information about how we optimize and tweak performance from a frontend development and automated deployment perspective, basically – how to get and stay in shape.

Don’t worry; this definitely will not be your typical boring and horrifying boot-camp experience! Our talk will utilize fun and catchy analogies to explain the weight and performance of pages. I will be your honest CBC “fitness trainer”, telling the audience about the page weight of our sites on multiple platforms, how we measure performance and set budgets. However, putting our content on a scale will tell the truth: a content breakdown of our pages will help the audience understand how content is structured and where we can “slim down”, but also where a fitness routine cannot help.

Keep us company while we share some insights about setting up our own HTTP Archive instance as a tool – or how I would describe it: the BMI of web sites – to compare our own weight to the public HTTP Archive instance. We will share some queries from our HTTP Archive database to help identify bottlenecks, and we will tell you about how we discovered problems with some unnecessary weight that we thought we didn’t have.

Additionally, sweet and dangerous temptations will be placed in front of your eyes, the kinds that we all have to deal with when creating high traffic sites, including 3rd party scripts that could significantly harm the performance of our sites when not properly implemented. We compare client-side versus server-side 3rd party implementation. We will also reveal the amount of improvement we saw in load time once we turned off all ads on our mobile touch site for a weekend.

During our talk, you will also hear about our fitness stack regarding how we monitor our fitness level, and why it is so important to stay on a strict exercise schedule and avoid gaining too much unwanted weight, which can happen without even knowing it. If you want to exercise and stay in shape, there are tons of great tools out there to help you achieve that. We will cover how we organize and optimize our sites, our releases and deployment and how easily you can include tools in your deployment process to automate performance optimization.

If you want to know how we use RUM in combination with synthetic testing, and what our RUM numbers reveal, then you shouldn’t miss out on our talk.

Lastly, we will explain the challenges that we have faced as the national news broadcaster in a world of ever changing news, with the potential for a breaking news story at any moment that could drive our traffic through the roof, and how we need to respond to that.

Come join our talk and if you like, wear your favorite running shoes because you never know, you might want to start exercising right after.

We look forward to meeting you all!

More details to our scheduled talk and location:

Performance check: CBC’s logo as pure CSS, Data URI and simple PNG on the scale

There is always room for improvement. Period.

Think about the 100 meter men’s sprint. I am amazed that human beings still continue to become faster and improve their performance.

I’m not Usain Bolt - I can’t run 100 meters in 9.58 seconds but I might be able to run (mobile) websites under 10 seconds.

Today, I want to focus on a technique I first heard about at the Velocity Conference in 2011 in Santa Clara and how to compare it with other ways to serve images in HTML pages.

A data URI embeds data (e.g. images) directly in a document, typically as base64-encoded text. It can be used to improve performance.

Data URI as “Performance Enhancer”

Instead of requesting, for example, a PNG image, you could encode it as base64 and serve it inline with your HTML code. That way, you save one HTTP request: right there, 200ms saved. Instead of putting it inline, you could also put it encoded in an external stylesheet.
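To make the encoding step concrete, here is a minimal sketch in Python. The helper name to_data_uri and the .logo class are made up for illustration, and the 8-byte PNG signature stands in for a real image file:

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Return a base64 data URI for the given raw bytes."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder input: the 8-byte PNG file signature instead of a real image.
png_bytes = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_bytes)

# Option 1: inline in the HTML markup.
inline_html = f'<img src="{uri}" alt="logo">'
# Option 2: inside a stylesheet rule (hypothetical class name).
css_rule = f'.logo {{ background-image: url("{uri}"); }}'

print(inline_html)
print(css_rule)
```

For a real image you would read the bytes from disk (for example with open(path, "rb").read()) and pass them to the same helper.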

Watch out for caching limitations though. Data URIs can’t be cached on their own: they don’t have a standalone cache policy because they are part of the file that includes them. So they can only piggyback on other cacheable assets, e.g. CSS or HTML files.

As Nicholas explains, if you put data URI images inline with HTML, they can only be cached as part of the HTML cache control header. If the HTML is not cached, the entire content of the HTML markup, including the inline data URI, will be re-downloaded every single time. That approach, if the image is big, can slow down and weigh down your page. Hence, one option is to put it in stylesheets, because they can have an aggressive cache while the HTML page can have no cache or a very limited cache.

Limitations and Browser Support

Think about the browser audience of the site you want to leverage data URIs for. If you target modern browsers, including new mobile devices (because that’s where you really want to focus on performance the most), you might be able to ignore the following limitations and accept the slightly restricted list (thanks to Fire) of supported browsers.

  • Firefox 2+
  • Opera 7.2+ – data URIs must not be longer than 4100 characters
  • Chrome (all versions)
  • Safari (all versions)
  • Internet Explorer 8+ (data URIs must be smaller than 32KB)
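The two size limits in the list above are easy to check before shipping a data URI. A small sketch, assuming that “32KB” means 32 * 1024 bytes and that the data URI is plain ASCII (so characters and bytes are the same count); the function name fits_limits is made up for illustration:

```python
OPERA_7_2_MAX_CHARS = 4100   # "must not be longer than 4100 characters"
IE8_MAX_BYTES = 32 * 1024    # "smaller than 32KB" (assumed to be 32 * 1024)


def fits_limits(data_uri: str) -> dict:
    """Report whether a data URI stays within the two limits listed above."""
    length = len(data_uri)  # base64 data URIs are ASCII: 1 char == 1 byte
    return {
        "opera_7_2": length <= OPERA_7_2_MAX_CHARS,
        "ie8": length < IE8_MAX_BYTES,
    }


# Hypothetical URIs of different lengths, padded with "A" characters.
small = "data:image/png;base64," + "A" * 1000
medium = "data:image/png;base64," + "A" * 10000
large = "data:image/png;base64," + "A" * 40000

for name, candidate in (("small", small), ("medium", medium), ("large", large)):
    print(name, len(candidate), fits_limits(candidate))
```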

Motivation for Comparison

I’ve been reading a lot about web performance techniques and for some reason the data URI practice stuck with me. I started off by creating the CBC gem (logo) in CSS to verify if CSS performs better than serving images. While I was playing around with that, I thought: why not add another dimension to the test and check the performance of the CBC logo as data URI? Voilà, I had my basic scenario for the test:

Check the performance of the CBC logo as

  1. An image in pure CSS
  2. A plain PNG image as background image
  3. A data URI (in CSS and inline with HTML)

Setting up the Test

The purpose of the test was to figure out what kind of presentation for the CBC gem would be the fastest and slimmest.

Prerequisites and Conditions

  • All HTML and CSS files were minified and use the same markup (except for the logo in pure CSS, which needed a few more div classes to render the circles)
  • Each HTML version was tested with empty cache
  • Performance tests were run with WebPagetest (10 runs in a browser on a simulated 3G connection) to find the median.

1. Logo in pure CSS (30px x 30px)

[Screenshot: pure CSS logo, 30px x 30px]
Description: Thankfully, the CBC gem consists of circles, 1/2 and 1/4 circles, and those shapes can easily be created with CSS. I used this page to help me get started. Instead of setting up a fixed size and color, I decided to use SASS to be more flexible with my settings for the logo. The scss file lets me define the color and size of the gem.
Note: The pure CSS logo has a few issues with some of the 1/4 circles, probably due to some math formulas I didn’t get right in the SASS. I believe this can be ignored for the test. For that reason, though, this version cannot be used as the official CBC gem.

2. Plain PNG Image (30px x 30px)

[Screenshot: PNG image logo, 30px x 30px]
Description: Simple PNG file included in the CSS as a background image. CSS included in main HTML.

3. Data URI in CSS (30px x 30px)

[Screenshot: data URI logo, 30px x 30px]
Description: I used Nicholas’ tool to create my CSS files including the data URI. However, there are many tools to help you create your own data URI encoded files.

You can see from the browser screenshots above that all logos look pretty much the same to the user.

Test Results

[Screenshot: WebPagetest test results, 30px x 30px]

The results show that the median load times for the logo as pure CSS and as data URI are almost the same, whereas the logo as a background image in CSS took the longest.

I looked at the base64 string and wondered how big it would be if I had used a bigger image. So I googled and found the following: “It’s not worth to use data URIs for large images. A large image has a very long data URI string (base64 encoded string) which can increase the size of CSS file.” (source). I decided to test it out myself. So my next question was: “How would the test above turn out if I used a bigger CBC gem logo?” I picked a width and height of 300px. While I was preparing the 300px x 300px pages, I also decided to create another version of the data URI, not part of the CSS but inline within the HTML.

1. Logo in pure CSS (300px x 300px)

There was not much of a difference in terms of markup and setup for the pure CSS and PNG in CSS versions. I updated the SASS in cbcgem.scss to accommodate a logo of 300px x 300px instead of 30px x 30px. The file size didn’t change much because it is all based on math calculations.

2. Plain PNG Image (300px x 300px)

Instead of loading gem-30.png, I created a version gem-300.png and updated the CSS.

3a. Data URI in CSS (300px x 300px)

[Screenshot: data URI in CSS, 300px x 300px]
I noticed that, as expected, the size of the data URI encoding increased dramatically from the 30px x 30px encoded image to the 300px x 300px image (almost 10 times; see the full view of the screenshot).
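The growth follows directly from how base64 works: every 3 bytes of input become 4 characters of output, so the encoded string is always roughly a third bigger than the image file it came from. A quick sketch of that arithmetic; the byte sizes below are made-up round numbers, not the sizes of the actual gem files:

```python
import base64
import math


def base64_length(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


# Hypothetical file sizes in bytes, chosen only to show the ratio.
for size in (300, 3000):
    predicted = base64_length(size)
    actual = len(base64.b64encode(bytes(size)))  # encode `size` zero bytes
    print(f"{size} bytes -> predicted {predicted}, actual {actual} characters")
```

In other words, the data URI grows in step with the underlying PNG file, plus the fixed ~33% base64 overhead.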

3b. Data URI inline within HTML (300px x 300px)

[Screenshot: data URI inline within HTML, 300px x 300px]
Instead of pasting the long base64 string into the CSS, I added it as an img src to the HTML page (see the full view of the screenshot).

I used WebPagetest again to run 10 tests to find the median.

[Screenshot: WebPagetest test results, 300px x 300px]
The links to the WebPagetest results can be found at the bottom of this post.

Observations & Take-Aways

  • Creating simple shapes in CSS (via SASS) is highly scalable because it doesn’t influence the size of the CSS file significantly. The size of the CSS file won’t change much if I choose to produce a 300px x 300px logo or a 10px x 10px logo. For all tests performed this solution seems to be the most efficient and fastest one.  
  • I couldn’t confirm the claim that data URIs aren’t worth using to improve performance once the encoded image is bigger than 1-2kB. Looking at the last test round (300px x 300px), the results show that the page with the encoded image is still faster than the page with a 300px x 300px PNG image.
  • It is interesting to note that the inline data URI version is faster than the data URI CSS version (and almost as fast as the pure CSS version). With 2 HTTP requests and a total size of 4kB, its median load time was faster than that of the version serving the data URI via CSS.

Further Readings and References

WebPagetest Results

CBC Gem 30px x 30px

CBC Gem 300px x 300px

Simulating Frontend SPOF – The day a tiny 3rd party script almost slowed down the entire Internet

How realistic is it really that a script that you didn’t even write could dramatically slow down your site and other major sites as well? Keep reading: scripts can slow down sites, and it hurts to watch!

I watched the Fluent talk by Steve Souders from 2012 about High Performance Snippets (a must-watch for all SPOF fans) and got inspired to test how an “innocent” 3rd party script (btw. I call them 3rd party monsters) that is not loaded properly could result in a single point of failure (SPOF) and make a site very slooooooow to load.

Developers are always proud and optimistic about their code, and when it comes to including 3rd party scripts, basically code they don’t usually touch, they assume those 3rd party providers like Google never go down, a non-responsive ad server like DoubleClick won’t hurt or Twitter won’t have server failures. 3rd party script developers try their best to make it easy and painless for us to include their high performing scripts into our sites. That’s a fact. True but also not true. They can only do so much. If you don’t properly include the script on your end and their service goes down, their high performing code won’t be able to help you at all. The rule of thumb is to include those scripts asynchronously. That way you make sure that your content won’t be blocked from rendering in case the 3rd party service is down.

However, scripts that use document.write can’t be loaded asynchronously (unfortunately). Read more about this in the great Krux post and some of Steve Souders’ posts.

It’s kind of like the elephant in the room to me: you pray, for example, that Twitter doesn’t go down, while you are too afraid to test it out, or are over-confident that it won’t break your page, or simply don’t know how you would test this scenario in the first place. Am I right? Well, what if you could run a quick test on a web site and pretend all of its 3rd party scripts and providers were down? Let’s play the “3rd party scripts game”: would your web site still render… how confident are you?

Simulating SPOF – Slow down your own site until it really hurts

Are you ready for this? First, edit your hosts file to point to a blackhole IP address for simulation (I used the blackhole IP address Steve shared in his talk on slide 9).

sudo vi /etc/hosts

While setting up my test, I don’t want to play the really bad gal (yet) and assume all 3rd party providers are down. I’d like to start with the simplest yet most used and harmful domain. A lot of web sites include ads and use DoubleClick.

So let’s use this domain for our blackhole test. By all means, you can add more 3rd party scripts to your hosts file.

// add this line to your hosts file

Once you’ve updated your hosts file, remember to flush your DNS cache afterwards.

dscacheutil -flushcache
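If you want to blackhole several 3rd party domains at once, the hosts entries can be generated instead of typed by hand. A small sketch: the IP address and domain names below are placeholders from reserved documentation ranges, not the blackhole address from Steve’s slides or real ad domains, so substitute your own values:

```python
from typing import Iterable


def hosts_entries(blackhole_ip: str, domains: Iterable[str]) -> str:
    """Build hosts-file lines that point each domain at the blackhole IP."""
    return "\n".join(f"{blackhole_ip}\t{domain}" for domain in domains)


# Placeholder values only: replace with the blackhole IP and the 3rd party
# domains you actually want to simulate an outage for.
PLACEHOLDER_IP = "203.0.113.1"
PLACEHOLDER_DOMAINS = ["ads.example.com", "widgets.example.net"]

entries = hosts_entries(PLACEHOLDER_IP, PLACEHOLDER_DOMAINS)
print(entries)
```

The printed lines can be pasted into the hosts file; remember to remove them again once you are done testing.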

Now, open your browser (with cache disabled so your browser is not using any DoubleClick scripts from the cache). Type in your site’s URL and be prepared for the worst. How long will it take for the website to load?

That’s a very easy (scary) and quick way of evaluating what is on your critical rendering path and obviously (now) what should not be on it anymore!

I ran this test on our site and let me tell you, it hurt. Period. It took almost 1.5 min for our site to display useful content while faking that DoubleClick was down. The browser finally gave up.


I wasn’t ready to stop the game. I wondered if it’s just our site that doesn’t properly handle the outage of one single domain. So I continued and tried the following random websites and measured the time it took to see useful content on those.

URL | Time passed to see useful content
… | ~4.5 mins
… | ~2.5 mins
… | Fine, didn’t seem to use DoubleClick
… | Fine, they seemed to be doing the proper handling
… | Surprise, surprise: Facebook doesn’t use DoubleClick. They use their own, so no real delay here.


If you don’t want to edit your hosts file and want to get more concrete waterfall and timing information as well as video capture, try out what Steve Souders suggested in his Fluent talk: use the scripts (now SPOF) box on WebPagetest to include DNS changes. The results will give you great details on how the website performed, with and without SPOF.

SPOF doubleclick

Note: I’ve tried WebPagetest SPOF myself and didn’t notice a big difference between the non-SPOF and SPOF versions; my suspicion is that WebPagetest might not be using an empty cache for the SPOF setting. The tests I ran manually on my local machine showed a more visible negative impact of the SPOFs (I shall confirm this).

3rd party scripts are everywhere

It was verified last month that 18% of the world’s top 300K URLs load jQuery from Google hosted libraries. So that means, in theory, that if that service goes down and a web site uses jQuery from there (and doesn’t have a fallback), the site might not work at all. Isn’t that scary? If you develop for a web site that already uses a CDN, don’t use Google’s CDN for scripts like jQuery. Avoid those 3rd party dependencies as much as possible.

I ran two queries on my local HTTP Archive database (dump from March 2013) and followed the same filter that Steve Souders used above. I restricted the query to only look at the 292,297 distinct URLs from the March 1, 2013 crawl (with their respective unique pageids). I wanted to see how many of the top 300K URLs use Twitter widgets and any sort of Facebook scripts (without distinguishing whether they were loaded synchronously or asynchronously).



13% of the Top 300K URLs include Twitter scripts somewhere on their page.



29% of the Top 300K URLs include Facebook scripts somewhere on their page.

Feel free to extend this exercise to include more 3rd party domains.

Cached 3rd party scripts

You can’t really rely much on the cache settings of your 3rd party scripts to ignore their outage if it happens for less than a few hours. 3rd party providers tend to set a very low cache time on their scripts to make it flexible for them to change the file frequently.

That setup plays against you in the case where you don’t load 3rd party scripts asynchronously. For example Twitter’s widget.js has a cache time set to 30 mins (only). I wonder what change could be so important for Twitter that can’t wait for more than 30min to be loaded on sites consuming this widget.js file.

So imagine the following: You go to a site with the Twitter widget loaded synchronously at the top of the page (bad!) at 9 AM (getting the latest, freshest version of widget.js). Twitter goes down at 9:10 AM. You go back to the same site at 9:15 AM and everything is still fine; you won’t see any problems because you are getting the cached Twitter widget script from the browser cache. What if Twitter is still down at 9:40 AM and you visit the same page again? You are now past the cache expiry time and your browser will request a new version of the Twitter script, trying to reach the Twitter server that is still down. You are now getting a time out response for the Twitter script that (with the setup described above) will block the page content from rendering. Bottom line: you wouldn’t be able to see any content until Twitter is back up or the browser gives up on the request. It’s easy to check those cache times yourself, e.g. use Chrome dev tools and check out the response headers from those 3rd party scripts.
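The timeline above boils down to a simple freshness check. A minimal sketch, assuming the 30 minute cache time mentioned for widget.js; the calendar date is arbitrary and the function name is made up for illustration:

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=30)  # cache time quoted above for widget.js


def served_from_cache(fetched_at: datetime, now: datetime,
                      max_age: timedelta = MAX_AGE) -> bool:
    """True while the cached copy is still fresh and no request is needed."""
    return now - fetched_at < max_age


# Arbitrary date; only the times of day matter for the scenario.
first_visit = datetime(2013, 1, 1, 9, 0)
second_visit = datetime(2013, 1, 1, 9, 15)
third_visit = datetime(2013, 1, 1, 9, 40)

for visit in (second_visit, third_visit):
    if served_from_cache(first_visit, visit):
        print(visit.strftime("%H:%M"), "cached copy used, outage goes unnoticed")
    else:
        print(visit.strftime("%H:%M"), "cache expired, browser must reach Twitter")
```

With a synchronously loaded script, the second branch is exactly the moment the page starts to block.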

The screenshot below shows Twitter and Facebook’s cache-control settings:



In order to really focus on your site’s performance, you need to isolate the (potentially bad) performance of 3rd party monsters (the ones that you decided to invite to your site). Don’t make your users wait for your own content if a 3rd party provider is down.