It all gets pretty overwhelming. How do you know how well you're doing? Which metrics should you use? Where do you even start when looking at your own performance? Let's start by stepping back and remembering why performance is important.
Whilst performance is inherently technical in its implementation, it has a direct impact on the overall user experience. Users come to your site to complete a task, such as reading an article or making a purchase. Anything which gets in the way of completing that task risks distracting users and may lead them to abandon your site – costing you both page views and revenue.
A number of usability studies show that there are boundaries of acceptable response times in application design, after which the user will perceive a delay. Typical recommendations, such as those in Usability Engineering by Jakob Nielsen, are a response time of 0.1 seconds for something to feel instant, or under 1 second to avoid disrupting the user's flow of thought.
There are a number of case studies which demonstrate the correlation between page load times and user engagement which help validate these guidelines. For example, the BBC found that, for every additional second a page takes to load, 10 per cent of users leave the site.
User-centric performance metrics
As there is a known correlation between performance and user engagement, what we want to find are the metrics which best measure performance as perceived by users. Google provide four questions to keep in mind when choosing your performance metrics.
- Is it happening?
  - Did the navigation start successfully? Has the server responded?
- Is it useful?
  - Has enough content rendered that users can engage with it?
- Is it usable?
  - Can users interact with the page, or is it busy?
- Is it delightful?
  - Are the interactions smooth and natural, free of lag and jank?
We're going to use these questions to help ensure we use the right metrics for understanding a typical user experience on our site. We can think of these as user-centric performance metrics.
Web pages don't just appear in the browser fully formed. They are typically made up of dozens of network requests of different types, all loading in real time, visible to the user.
In many ways, this progressive loading is one of the best "features" of the web – it enables the page to be useful even if it has not fully loaded or when some of the requests have failed. It also gives the user clearer feedback that something is happening than an empty screen would. The downside is that it makes poor performance painfully visible to the user.
If your page is dependent on lots of requests before it can display anything to the user, it will feel slower than a site which displays content very quickly but then continues loading for a longer overall period. This is known as perceived performance, and is where art gets mixed with science as you utilise the progressive nature of the web to ensure your webpages aren't just technically fast but feel fast too. Sometimes changes which make your site slower will actually make the page feel faster – which makes measuring based on perception really important.
This whole progressive loading situation means that there is no single metric that categorically tells you how well you're doing. Instead, you need to use multiple metrics that cover the key milestones in the page load. This is especially true as the various metrics that are available all take a different perspective – some measure when something is visible on the page whilst others might check how much time the CPU is spending processing long tasks.
For example, we want to know how long it takes until the user can see something on the page so that we can answer the "is it happening?" question. To measure this, we can use the First Contentful Paint metric.
Similarly, the point where the majority of our content is visible on the page will answer the "is it useful?" question which can be done using Largest Contentful Paint. Time to Interactive helps answer "is it usable?" by tracking when enough of the logic of the page has loaded for users to be able to interact with it, whilst First Input Delay helps with "is it delightful?" by letting us know just how quickly our pages can respond to interaction to enable a smooth experience without lag.
Visualising these milestones as a graph helps highlight some of the potential pitfalls of only considering the overall load time. Focusing on the early milestones may mean the page content starts to appear quickly, but if the page is unresponsive because of a slow Time to Interactive then it'll feel broken to a user. Similarly, if you focus just on reducing Time to Interactive but users are still spending a long time with a blank screen as the page loads they will feel like the page is unresponsive and abandon the site before it's even finished loading.
User-centric metrics are an area that has been rapidly evolving over the last few years. A lot of the performance metrics that have historically been used, like the onload event, tell us about performance as seen internally by the browser, which doesn't necessarily translate into how it is perceived by the user. User-centric metrics help fill this gap with ever-increasing accuracy.
Within this space there are still a number of different metrics available, with some being best suited to lab-based synthetic testing whilst others are best suited to data from real user monitoring (RUM).
I typically use six metrics for getting a complete understanding of the page load from start to finish:
- First Contentful Paint
- Largest Contentful Paint
- Time to Interactive
- Total Blocking Time
- First Input Delay
- Cumulative Layout Shift
Google's Web Vitals initiative aims to simplify and standardise web performance metrics. An evolving subset of these, known as Core Web Vitals, act as the key indicators of page experience that all sites should pay attention to, and will be used as a ranking factor in search results. The Core Web Vital metrics are currently Largest Contentful Paint, First Input Delay and Cumulative Layout Shift.
A key consideration when choosing your metrics is how you will go about measuring them. You'll want to make sure you're picking the metrics that can be measured by your performance monitoring tooling. Some metrics are unique to either synthetic or RUM tools, whilst others can be measured by both.
With the Core Web Vital metrics, Google has committed to making these available across a range of their tools and services giving you a number of ways you can start tracking your performance.
User-centric metrics help you measure your performance as experienced by users. What they don't tell you is why your site took the time it did to load. To understand the why, we need to look at the factors which influenced your performance.
The metrics which tell us about these factors are known as proxy metrics – on their own they only give you an indication of what your performance might be like, but combining them with user-centric metrics helps you diagnose and improve your performance.
For each user-centric metric there are a number of different proxy metrics that could be contributing towards it. Think of user-centric metrics as the ones you benchmark against and monitor over time, whilst your proxy metrics are more like a toolbox where you'll focus on the relevant ones for a shorter period of time, as and when required.
As user-centric metrics are looking at key milestones of the page load they tend to use time as the unit they measure. Proxy metrics use a wider range of units, including time, quantity and size.
There are a lot of proxy metrics to choose from. Here are a few examples:
- Time to First Byte – how long it takes for the server to start sending data to the user
- Total number of assets – how many files does the page need for it to load correctly?
- Size of page – size in bytes of the various assets on the page
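To make these three examples concrete, here is a minimal sketch of how they could be derived from timing data. The field names mirror the browser's Resource Timing API (`requestStart`, `responseStart`, `transferSize`), but the entries themselves are made-up numbers for illustration:

```javascript
// Hypothetical resource timing entries for a page load (times in ms)
const entries = [
  { name: "/", requestStart: 40, responseStart: 180, transferSize: 14000 },
  { name: "/app.js", requestStart: 200, responseStart: 320, transferSize: 90000 },
  { name: "/hero.jpg", requestStart: 210, responseStart: 450, transferSize: 240000 },
];

// Time to First Byte: how long the server took to start responding
// to the initial document request (the first entry here)
const ttfb = entries[0].responseStart - entries[0].requestStart;

// Total number of assets the page requested
const assetCount = entries.length;

// Size of page: total bytes transferred across all assets
const pageBytes = entries.reduce((sum, e) => sum + e.transferSize, 0);

console.log(ttfb, assetCount, pageBytes); // 140 3 344000
```

Note how the three metrics use different units – time, quantity and size – which is typical of proxy metrics.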
Let's say it's taking a long time for your page to start displaying anything when it loads. Typically, this means you'll have a slow First Contentful Paint time, as your page isn't painting anything for the user to see.
To understand why, there are a few proxy metrics we could look at.
Digging into these metrics gives you an in-depth understanding of how your page is behaving, and why you have a slow First Contentful Paint.
Earlier, I mentioned the six user-centric metrics that I typically use when looking at web performance. There's plenty of in-depth information available elsewhere on what these metrics mean, how to measure them and what impacts them, but here's a quick summary of each.
First Contentful Paint
The time taken, in seconds, for the page to first start displaying text or images. 2 seconds or less is considered fast. This metric is useful for demonstrating to the user that something is happening – the page has started loading.
Largest Contentful Paint
Point in the page load when the main content has loaded, defined as the largest image or text block within the viewport. This is a good measurement of how long it takes for the page content to be visible on the page, and therefore for the page to start being useful. Reaching Largest Contentful Paint within 2.5 seconds is considered good.
First Input Delay
The delay, or latency, between a user first interacting with your site – such as by pressing a button – and your code being able to respond to that interaction. This isn't a measurement of how long your event handler itself takes, but of how long the browser is blocked from executing your code because the main thread is occupied by other tasks. A First Input Delay of 100 ms or less is considered good – corresponding to the maximum time usability studies allow for an interaction to feel instant.
As First Input Delay requires a human interaction it can only be measured with real user monitoring tools. See Total Blocking Time for a synthetic alternative.
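A simplified sketch of what the metric captures, using made-up numbers: if the user's input lands while the main thread is busy with a long task, the browser can't begin handling it until that task finishes.

```javascript
// mainThreadBusy lists [start, end] windows (in ms) where the main
// thread is occupied by long tasks; inputTime is when the user first
// interacts. This is an illustrative model, not the browser's own code.
function firstInputDelay(mainThreadBusy, inputTime) {
  for (const [start, end] of mainThreadBusy) {
    if (inputTime >= start && inputTime < end) {
      // The handler can't run until the current task finishes
      return end - inputTime;
    }
  }
  return 0; // main thread was free: the input is handled immediately
}

console.log(firstInputDelay([[1000, 1400], [2000, 2600]], 2250)); // 350
console.log(firstInputDelay([[1000, 1400]], 1500)); // 0
```

The same page load can therefore produce very different First Input Delay values depending on exactly when each user happens to interact – one reason this metric needs real user data.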
Time to Interactive
Total Blocking Time
Total Blocking Time makes for a good synthetic partner to First Input Delay, and typically correlates well with Time to Interactive but there are some differences in the story it tells us.
It's calculated by summing, for each long task between First Contentful Paint and Time to Interactive, the portion of that task's duration beyond 50 milliseconds. So rather than telling us when the page is fully interactive, Total Blocking Time tells us just how much work the page was doing to get to that state.
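The calculation itself is simple enough to sketch directly (task durations here are made-up examples):

```javascript
// Total Blocking Time: sum the portion of each long task beyond the
// 50 ms threshold. Tasks of 50 ms or less contribute nothing.
function totalBlockingTime(taskDurations) {
  return taskDurations
    .filter((duration) => duration > 50)
    .reduce((sum, duration) => sum + (duration - 50), 0);
}

// Tasks of 120 ms, 70 ms and 40 ms contribute 70 + 20 + 0:
console.log(totalBlockingTime([120, 70, 40])); // 90
```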
If your page has a long Total Blocking Time then there's a very high chance that a user attempting to interact with the page as it loads will find the page "locked" and noticeably unresponsive – creating that link with First Input Delay.
300 milliseconds or less is considered a good Total Blocking Time.
Cumulative Layout Shift
Rather than measuring time, Cumulative Layout Shift is instead a score showing how much your layout moves around (shifts) for a typical user. These layout shifts are likely to be visually jarring for the user and could be caused by things like your webfonts loading, or new elements being inserted causing the rest of the layout to shift.
Cumulative Layout Shift is the sum of all layout shifts experienced on a page. Each layout shift score is calculated by multiplying two fractions: impact and distance. The impact fraction measures how much of the viewport was affected in the frames before and after the layout shift, whereas the distance fraction measures the distance within the viewport that the element moved between the two frames.
layout shift score = impact fraction * distance fraction
Combining both impact and distance helps to ensure we track both small elements moving around the page a lot, and larger elements moving about a little bit on the page. A Cumulative Layout Shift of less than 0.1 is considered good.
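The scoring above can be sketched as a short calculation over made-up shift values:

```javascript
// Each layout shift scores impact fraction * distance fraction;
// Cumulative Layout Shift is the sum of all of them.
function cumulativeLayoutShift(shifts) {
  return shifts.reduce(
    (sum, s) => sum + s.impactFraction * s.distanceFraction,
    0
  );
}

// Two example shifts: a large element moving a little (0.5 * 0.1),
// and a small element moving further (0.1 * 0.25)
const cls = cumulativeLayoutShift([
  { impactFraction: 0.5, distanceFraction: 0.1 },
  { impactFraction: 0.1, distanceFraction: 0.25 },
]);
console.log(cls); // ≈ 0.075, just below the 0.1 "good" threshold
```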
Lighthouse Aggregate Scores
We've established that it takes multiple user-centric metrics to really understand the user experience of your site performance.
One of the most popular synthetic performance testing tools, Lighthouse, provides an aggregate percentage score for your performance based on six different user-centric metrics.
This gives you a nice and straightforward score, but on its own it doesn't help you understand the overall user experience or how to go about improving it. The Lighthouse documentation itself advises that it might be "more useful to think of your site performance as a distribution of scores, rather than a single number".
Performance is best measured as the timing of a number of key milestones as the page loads. These milestones are known as user-centric metrics, giving us an understanding of how users perceive your performance.
Start with user-centric metrics, but use proxy metrics as secondary information to help delve into the technical reasons behind the timings they show.
Don't limit the proxy metrics you use – think of them like a toolbox where you pick the right metrics for the area you are focusing on.
Cover photo by patricia serna on Unsplash.