Subscribe to receive notifications of new posts:

Prototyping optimizations with Cloudflare Workers and WebPageTest

01/08/2020

6 min read

This article was originally published as part of  Perf Planet's 2019 Web Performance Calendar.

Have you ever wanted to quickly test a new performance idea, or see if the latest performance wisdom is beneficial to your site? As web performance appears to be a stochastic process, it is really important to be able to iterate quickly and review the effects of different experiments. The challenge is to be able to arbitrarily change requests and responses without the overhead of setting up another internet facing server. This can be straightforward to implement by combining two of my favourite technologies : WebPageTest and Cloudflare Workers. Pat Meenan sums this up with the following slide from a recent getting the most of WebPageTest presentation:

So what is Cloudflare Workers and why is it ideally suited to easy prototyping of optimizations?

Cloudflare Workers

From the documentation :

Cloudflare Workers provides a lightweight JavaScript execution environment that allows developers to augment existing applications or create entirely new ones without configuring or maintaining infrastructure.A Cloudflare Worker is a programmable proxy which brings the simplicity and flexibility of the Service Workers event-based fetch API from the browser to the edge. This allows a worker to intercept and modify requests and responses.

worker-4

With the Service Worker API you can add an EventListener to any fetch event that is routed through the worker script and modify the request to come from a different origin.

Cloudflare Workers also provides a streaming HTMLRewriter to enable on the fly modification of HTML as it passes through the worker. The streaming nature of this parser ensures latency is minimised as the entire HTML document does not have to be buffered before rewriting can happen.

Setting up a worker

It is really quick and easy to sign up for a free subdomain at workers.dev which provides you with 100,000 free requests per day. There is a quick-start guide available here.To be able to run the examples in this post you will need to install Wrangler, the CLI tool for deploying workers. Once Wrangler is installed run the following command to download the example worker project:    

wrangler generate wpt-proxy https://github.com/xtuc/WebPageTest-proxy

You will then need to update the wrangler.toml with your account_id, which can be found in the dashboard in the right sidebar. Then configure an API key with the command:

wrangler config

Finally, you can publish the worker with:  

wrangler publish

At this the point, the worker will be active at

https://wpt-proxy.<your-subdomain>.workers.dev.

WebPageTest OverrideHost  

Now that your worker is configured, the next step is to configure WebPageTest to redirect requests through the worker. WebPageTest has a feature where it can re-point arbitrary origins to a different domain. To access the feature in WebPageTest, you need to use the WebPageTest scripting language "overrideHost" command, as shown:

This example will redirect all network requests to www.bbc.co.uk to wpt-proxy.prf.workers.dev instead. WebPageTest also adds an x-host header to each redirected request so that the destination can determine for which host the request was originally intended:    

x-host: www.bbc.co.uk

The script can process multiple overrideHost commands to override multiple different origins. If HTTPS is used, WebPageTest can use HTTP/2 and benefit from connection coalescing:  

overrideHost www.bbc.co.uk wpt-proxy.prf.workers.dev    
overrideHost nav.files.bbci.co.uk wpt-proxy.prf.workers.dev
navigate https://www.bbc.co.uk

 It also supports wildcards:  

overrideHost *bbc.co.uk wpt-proxy.prf.workers.dev    
navigate https://www.bbc.co.uk

There are a few special strings that can be used in a script when bulk testing, so a single script can be re-used across multiple URLs:

  • %URL% - Replaces with the URL of the current test
  • %HOST% - Replaces with the hostname of the URL of the current test
  • %HOSTR% - Replaces with the hostname of the final URL in case the test URL does a redirect.

A more generic script would look like this:    

overrideHost %HOSTR% wpt-proxy.prf.workers.dev    
navigate %URL% 

Basic worker

In the base example below, the worker listens for the fetch event, looks for the x-host header that WebPageTest has set and responds by fetching the content from the orginal url:

/* 
* Handle all requests. 
* Proxy requests with an x-host header and return 403
* for everything else
*/

addEventListener("fetch", event => {    
   const host = event.request.headers.get('x-host');        
   if (host) {          
      const url = new URL(event.request.url);          
      const originUrl = url.protocol + '//' + host + url.pathname + url.search;             
      let init = {             
         method: event.request.method,             
         redirect: "manual",             
         headers: [...event.request.headers]          
      };          
      event.respondWith(fetch(originUrl, init));        
   } 
   else {           
     const response = new Response('x-Host headers missing', {status: 403});                
     event.respondWith(response);        
   }    
});

The source code can be found here and instructions to download and deploy this worker are described in the earlier section.

So what happens if we point all the domains on the BBC website through this worker, using the following config:  

overrideHost    *bbci.co.uk wpt.prf.workers.dev    
overrideHost    *bbc.co.uk  wpt.prf.workers.dev    
navigate    https://www.bbc.co.uk

configured to a 3G Fast setting from a UK test location.

Comparison of BBC website if when using a single connection. 

Before After
beforeBBC2-1 afterBBC2

The potential performance improvement of loading a page over a single connection, eliminating the additional DNS lookup, TCP connection and TLS handshakes, can be seen  by comparing the filmstrips and waterfalls. There are several reasons why you may not want or be able to move everything to a single domain, but at least it is now easy to see what the performance difference would be.  

HTMLRewriter

With the HTMLRewriter, it is possible to change the HTML response as it passes through the worker. A jQuery-like syntax provides CSS-selector matching and a standard set of DOM mutation methods. For instance you could rewrite your page to measure the effects of different preload/prefetch strategies, review the performance savings of removing or using different third-party scripts, or you could stock-take the HEAD of your document. One piece of performance advice is to self-host some third-party scripts. This example script invokes the HTMLRewriter to listen for a script tag with a src attribute. If the script is from a proxiable domain the src is rewritten to be first-party, with a specific path prefix.

async function rewritePage(request) {  
  const response = await fetch(request);    
    return new HTMLRewriter()      
      .on("script[src]", {        
        element: el => {          
          let src = el.getAttribute("src");          
          if (PROXIED_URL_PREFIXES_RE.test(src)) {
            el.setAttribute("src", createProxiedScriptUrl(src));
          }           
        }    
    })    
    .transform(response);
}

Subsequently, when the browser makes a request with the specific prefix, the worker fetches the asset from the original URL. This example can be downloaded with this command:    

wrangler generate test https://github.com/xtuc/rewrite-3d-party.git

Request Mangling

As well as rewriting content, it is also possible to change or delay a request. Below is an example of how to randomly add a delay of a second to a request:

addEventListener("fetch", event => {    
    const host = event.request.headers.get('x-host');    
    if (host) { 
//....     
    // Add the delay if necessary     
    if (Math.random() * 100 < DELAY_PERCENT) {       
      await new Promise(resolve => setTimeout(resolve, DELAY_MS));     
    }    
    event.respondWith(fetch(originUrl, init));
//...
}

HTTP/2 prioritization

What if you want to see what the effect of changing the HTTP/2 prioritization of assets would make to your website? Cloudflare Workers provide custom http2 prioritization schemes that can be applied by setting a custom header on the response. The cf-priority header is defined as <priority>/<concurrency> so adding:    

response.headers.set('cf-priority', “30/0”);    

would set the priority of that response to 30 with a concurrency of 0 for the given response. Similarly, “30/1” would set concurrency to 1 and “30/n” would set concurrency to n. With this flexibility, you can prioritize the bytes that are important for your website or run a bulk test to prove that your new  prioritization scheme is better than any of the existing browser implementations.

Summary

A major barrier to understanding and innovation, is the amount of time is takes to get feedback. Having a quick and easy framework, to try out a new idea and comprehend the impact, is key. I hope this post has convinced you that combining WebPageTest and Cloudflare Workers is an easy solution to this problem and is indeed magic

We protect entire corporate networks, help customers build Internet-scale applications efficiently, accelerate any website or Internet application, ward off DDoS attacks, keep hackers at bay, and can help you on your journey to Zero Trust.

Visit 1.1.1.1 from any device to get started with our free app that makes your Internet faster and safer.

To learn more about our mission to help build a better Internet, start here. If you're looking for a new career direction, check out our open positions.
Cloudflare WorkersSpeedServerlessWorkers SitesDevelopersDeveloper Platform

Follow on X

Cloudflare|@cloudflare

Related posts

June 23, 2023 1:00 PM

How we scaled and protected Eurovision 2023 voting with Pages and Turnstile

More than 162 million fans tuned in to the 2023 Eurovision Song Contest, the first year that non-participating countries could also vote. Cloudflare helped scale and protect the voting application based.io, built by once.net using our rapid DNS infrastructure, CDN, Cloudflare Pages and Turnstile...