Sunday, March 13, 2011

Minimize round-trip times

Round-trip time (RTT) is the time it takes for a client to send a request and the server to send a response over the network, not including the time required for data transfer. That is, it includes the back-and-forth time on the wire, but excludes the time to fully download the transferred bytes (and is therefore unrelated to bandwidth). For example, for a browser to initiate a first-time connection with a web server, it must incur a minimum of 3 RTTs: 1 RTT for DNS name resolution; 1 RTT for TCP connection setup; and 1 RTT for the HTTP request and first byte of the HTTP response. Many web pages require dozens of RTTs.
RTTs vary from less than one millisecond on a LAN to over one second in the worst cases, e.g. a modem connection to a service hosted on a different continent from the user. For small download file sizes, such as a search results page, RTT is the major contributing factor to latency on "fast" (broadband) connections. Therefore, an important strategy for speeding up web page performance is to minimize the number of round trips that need to be made. Since the majority of those round trips consist of HTTP requests and responses, it's especially important to minimize the number of requests that the client needs to make and to parallelize them as much as possible.
  1. Minimize DNS lookups
  2. Minimize redirects
  3. Avoid bad requests
  4. Combine external JavaScript
  5. Combine external CSS
  6. Combine images using CSS sprites
  7. Optimize the order of styles and scripts
  8. Avoid document.write
  9. Avoid CSS @import
  10. Prefer asynchronous resources
  11. Parallelize downloads across hostnames

Minimize DNS lookups

Overview

Reducing the number of unique hostnames from which resources are served cuts down on the number of DNS resolutions that the browser has to make, and therefore, RTT delays.

Details

Before a browser can establish a network connection to a web server, it must resolve the DNS name of the web server to an IP address. Since DNS resolutions can be cached by the client's browser and operating system, if a valid record is still available in the client's cache, there is no latency introduced. However, if the client needs to perform a DNS lookup over the network, the latency can vary greatly depending on the proximity of a DNS name server that can provide a valid response. All ISPs have DNS servers which cache name-IP mappings from authoritative name servers; however, if the caching DNS server's record has expired and needs to be refreshed, it may need to traverse several nodes in the DNS serving hierarchy, sometimes around the globe, to find an authoritative server. If the DNS resolvers are under load, they can queue DNS resolution requests, which further adds to the latency. In other words, in theory, DNS resolution takes 1 RTT to complete, but in practice, the latency can vary significantly due to DNS resolver queuing delays. DNS lookups are therefore especially worth reducing, more so than most other kinds of requests.
The validity of a DNS record is determined by the time-to-live (TTL) value set by its primary authoritative server; many network administrators set the TTL to a low value (between 5 minutes and 24 hours) to allow for quick updates in case network traffic needs to be shifted around. (However, many DNS caches, including browsers, are "TTL disobeyers" and keep the cached record for longer than instructed by the origin server, up to 30 minutes in some cases.) There are a number of ways to mitigate DNS lookup time — such as increasing your DNS records' time-to-live setting, minimizing CNAME records (which require additional lookups), replicating your name servers in multiple regions, and so on — but these go beyond the scope of web application development, and may not be feasible given your site's network traffic management requirements.
Instead, the best way to limit DNS-lookup latency from your application is to minimize the number of different DNS lookups that the client needs to make, especially lookups that delay the initial loading of the page. The way to do that is to minimize the number of different hostnames from which resources need to be downloaded. However, because there are benefits from using multiple hostnames to induce parallel downloads, this depends somewhat on the number of resources served per page. The optimal number is somewhere between 1 and 5 hosts (1 main host plus 4 hosts on which to parallelize cacheable resources). As a rule of thumb, you shouldn't use more than 1 host for fewer than 6 resources; fewer than 2 resources on a single host is especially wasteful. It should never be necessary to use more than 5 hosts (not counting hosts serving resources over which you have no control, such as ads).

Recommendations

Use URL paths instead of hostnames wherever possible.
If you host multiple properties on the same domain, assign those properties to URL paths rather than separate hostnames. For example, instead of hosting your developer site on developer.example.com, host it on www.example.com/developer. Unless there are good technical reasons to use different hostnames, e.g. to implement DNS-based traffic load-balancing policies, there's no advantage in using a hostname over a URL path to encode a product name. In fact, the latency improvements of consolidating properties on a single hostname can actually enhance user experience: users can link between properties in a single browsing session with no additional DNS lookup penalty. In addition, reusing domains allows the browser to reuse TCP connections more frequently, further reducing the number of round trips. If one property receives a lot of traffic, reusing that hostname for other properties can also increase the DNS cache hit rate for first-time visits, since the likelihood of a valid mapping already existing on a local caching server is higher.
Serve early-loaded JavaScript files from the same hostname as the main document.
It's especially important to minimize lookups in the "critical path". We define the critical path as the code and resources required to render the initial view of a web page. In particular, external JavaScript files that you own and that are loaded from the document head, or early in the document body, should be served from the same host as the main document. Most browsers block other downloads and rendering while JavaScript files are being downloaded, parsed and executed. Adding DNS lookup time to this process further delays page load time. If it's not possible to serve those files from the same hostname as the containing document, defer loading them, if possible. The exception is for JS files that are shared by pages served off multiple domains: in this case, serving the file from a unique URL to increase the cache hit rate may outweigh the DNS lookup overhead.
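As an illustrative sketch (the hostnames and file names are hypothetical), assuming the main document is served from www.example.com:
<head>
<!-- Early-loaded script served from the same hostname as the document:
     no extra DNS lookup in the critical path -->
<script type="text/javascript" src="http://www.example.com/js/init.js"></script>
</head>
<body>
...
<!-- Script from a different hostname, deferred to the end of the body so its
     DNS lookup doesn't delay the initial rendering -->
<script type="text/javascript" src="http://widgets.example.org/widget.js"></script>
</body>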

Minimize redirects

Overview

Minimizing HTTP redirects from one URL to another cuts out additional RTTs and wait time for users.

Details

Sometimes it's necessary for your application to redirect the browser from one URL to another. There are several reasons web applications issue redirects:
  • To indicate the new location of a resource that has moved.
  • To track clicks and impressions and log referring pages.
  • To reserve multiple domains, allow for "user-friendly" or "vanity" domains and URLs, and catch misspelled/mistyped URLs.
  • To connect between different parts of a site or application, different country-code top-level domains, different protocols (HTTP to HTTPS), different security policies (e.g. unauthenticated and authenticated pages) etc.
  • To add a trailing slash to URL directory names to make their contents accessible to the browser.
Whatever the reason, redirects trigger an additional HTTP request-response cycle and add round-trip-time latency. It's important to minimize the number of redirects issued by your application — especially for resources needed for starting up your homepage. The best way to do this is to restrict your use of redirects to only those cases where it's absolutely technically necessary, and to find other solutions where it's not.

Recommendations

Eliminate unnecessary redirects.
Here are some strategies for simply eliminating unnecessary redirects: 
  • Never reference URLs in your pages that are known to redirect to other URLs. Your application needs to have a way of updating URL references whenever resources change their location.
  • Never require more than one redirect to get to a given resource. For instance, if C is the target page, and there are two different start points, A and B, both A and B should redirect directly to C; A should never redirect intermediately to B.
  • Minimize the number of extra domains that issue redirects but don't actually serve content. Sometimes there is a temptation to redirect from multiple domains in order to reserve name space and catch incorrect user input (misspelled/mistyped URLs). However, if you train users into thinking they can reach your site from multiple URLs, you can wind up in a costly cycle of buying up new domains just to stop cybersquatters from taking over every variant of your name.
Use server rewrites for user-typed URLs.
Many web servers support internal "rewrites". These allow you to configure mappings from one URL to another; when a client requests an invalid URL, the server automatically remaps it to the correct one and serves the resource, without issuing a redirect. Be sure to use them to catch URLs you can't control. Never use them as a means of easily updating URL references in your pages; you should always refer to one resource with a single URL. Also avoid using them for cacheable resources if possible. The automatic addition of the required trailing slash at the end of directory names is an example of a user-typed URL that would make a good candidate for the rewrite mechanism.
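For example, here is a minimal Apache mod_rewrite sketch (the URLs are hypothetical, and mod_rewrite must be enabled) that internally maps a commonly mistyped URL to the real resource without issuing a redirect:
RewriteEngine On
# No [R] flag, so this is an internal rewrite rather than an HTTP redirect
RewriteRule ^/developers/?$ /developer/index.html [L]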
Track web traffic in the background.
To track traffic into and between their various properties, some websites use intermediate redirects to a page that does logging for all properties on a central standalone server. However, because such redirects always add latency between page transitions, it's good to avoid them and to find other ways of logging page views in the background. One popular way of recording page views in an asynchronous fashion is to include a JavaScript snippet at the bottom of the target page (or as an onload event handler), that notifies a logging server when a user loads the page. The most common way of doing this is to construct a request to the server for a "beacon", and encode all the data of interest as parameters in the URL for the beacon resource. To keep the HTTP response very small, a  transparent 1x1-pixel image is a good candidate for a beacon request. A slightly more optimal beacon would use an HTTP 204 response ("no content") which is marginally smaller than a 1x1 GIF. Here is a trivial example that assumes that www.example.com/logger is the logging server, and that requests an image called beacon.gif. It passes the URL of the current page and the URL of the referring page (if there is one) as parameters:
<script type="text/javascript">
 // encodeURIComponent (rather than encodeURI) ensures that "&", "=" and "?"
 // in the URLs are escaped properly when embedded as query parameters.
 var thisPage = location.href;
 var referringPage = (document.referrer) ? document.referrer : "none";
 var beacon = new Image();
 beacon.src = "http://www.example.com/logger/beacon.gif?page=" + encodeURIComponent(thisPage)
     + "&ref=" + encodeURIComponent(referringPage);
</script>
This type of beacon is best included at the very end of the page's HTML to avoid competing with other HTTP requests that are actually needed to render the page contents. In that way, the request is made while the user is viewing the page, so no additional wait time is added.
Prefer HTTP over JavaScript or meta redirects.
There are several ways to issue a redirect:
  • Server-side: You configure your web server to issue a 3xx HTTP response code (most commonly 301 ("moved permanently") or 302 ("found"/"moved temporarily")), with a Location header set to the new URL.
  • Client-side: You include a meta tag with the http-equiv="refresh" attribute, or set the JavaScript window.location object (directly or via its replace() method), in the head of the HTML document.
If you must use a redirect mechanism, prefer the server-side method over client-side methods. Browsers are able to handle HTTP redirects more efficiently than meta and JavaScript redirects. For example, JS redirects can add parse latency in the browser, while 301 or 302 redirects can be processed immediately, before the browser parses the HTML document. In addition, according to the HTTP/1.1 specification, 301 and 302 responses can be cached by the browser. This means that even if the resource itself is not cacheable, the browser can at least look up the correct URL in its local cache. 301 responses are cacheable by default unless otherwise specified. To make a 302 response cacheable, you need to configure your web server to add an Expires or Cache-Control max-age header (see Leverage browser caching for details). The caveat here is that many browsers don't actually honor the spec, and won't cache either 301 or 302 responses; see Browserscope for a list of conforming and non-conforming browsers.
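For illustration, a cacheable 302 response might look like the following (the target URL and lifetime are hypothetical); the Cache-Control header lets the browser reuse the redirect for an hour without re-requesting it:
HTTP/1.1 302 Found
Location: http://www.example.com/spring-sale/
Cache-Control: max-age=3600
Content-Length: 0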

Example

Google Analytics uses the image beacon method to track inbound, internal, and outbound traffic on any web page owned by an Analytics account holder. The account owner embeds a reference to an external JavaScript file in the web page, which defines a function called trackPageview(). At the bottom of the document body, the page includes a JavaScript snippet that calls this function when a viewer requests the page. The trackPageview() function constructs a request for a 1x1-pixel image called __utm.gif, with multiple parameters in the URL. The parameters specify variables such as the page URL, referring page, browser settings, user locale, and so on. When the Analytics server gets the request, it logs the information and can serve it to account holders when they sign in to the reporting site.

Avoid bad requests

Overview

Removing "broken links", or requests that result in 404/410 errors, avoids wasteful requests.

Details

As your website changes over time, it's inevitable that resources will be moved and deleted. If you don't update your frontend code accordingly, the server will issue 404 "Not found" or 410 "Gone" responses. These are wasteful, unnecessary requests that lead to a bad user experience and make your site look unprofessional. And if such requests are for resources that can block subsequent browser processing, such as JS or CSS files, they can virtually "crash" your site. In the short term, you should scan your site for such links with a link checking tool, such as the crawl errors tool in Google's Webmaster Tools, and fix them. Long term, your application should have a way of updating URL references whenever resources change their location.

Recommendations

Avoid using redirects to handle broken links.
Wherever possible, you should update the links to resources that have moved, or delete those links if the resources have been removed. Avoid using HTTP redirects to send users to the requested resources, or to serve a substitute "suggestion" page. As described above, redirects slow down your site and are best avoided as much as possible.

Combine external JavaScript

Overview

Combining external scripts into as few files as possible cuts down on RTTs and delays in downloading other resources.

Details

Good front-end developers build web applications in modular, reusable components. While partitioning code into modular software components is a good engineering practice, importing modules into an HTML page one at a time can drastically increase page load time. First, for clients with an empty cache, the browser must issue an HTTP request for each resource, and incur the associated round trip times. Secondly, most browsers prevent the rest of the page from being loaded while a JavaScript file is being downloaded and parsed. (For a list of which browsers do and do not support parallel JS downloads, see Browserscope.)
Here is an example of the download profile of an HTML file containing requests for 13 different .js files from the same domain; the screen shot is taken from Firebug's Net panel over a DSL high-speed connection with Firefox 3.0+:

All files are downloaded serially, and take a total of 4.46 seconds to complete. Now here is the profile for the same document, with the same 13 files collapsed into 2 files:

The same 729 kilobytes now take only 1.87 seconds to download. If your site contains many JavaScript files, combining them into fewer output files can dramatically reduce latency.
However, there are other factors that come into play to determine the optimal number of files to be served. First, it's also important to defer loading JS code that is not needed at a page's startup. Secondly, some code may have different versioning needs, in which case you will want to separate it out into its own file. Finally, you might have to serve JS from domains that you don't control, such as tracking scripts or ad scripts. We recommend a maximum of 3, but preferably 2, JS files.

It often makes sense to use many different JavaScript files during the development cycle, and then bundle those JavaScript files together as part of your deployment process. See below for recommended ways of partitioning your files. You would also need to update all of your pages to refer to the bundled files as part of the deployment process.

Recommendations

Partition files optimally.
Here are some rules of thumb for combining your JavaScript files in production:
  • Partition the JavaScript into 2 files: one JS file containing the minimal code needed to render the page at startup; and one JS file containing the code that isn't needed until the page load has completed (see the sketch after this list).
  • Serve as few JavaScript files from the document <head> as possible, and keep the size of those files to a minimum.
  • Serve JavaScript of a rarely visited component in its own file. Serve the file only when that component is requested by a user.
  • For small bits of JavaScript code that shouldn't be cached, consider inlining that JavaScript in the HTML page itself.
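Here is a minimal sketch of the two-file partition described above (the file names are hypothetical):
<head>
<!-- startup.js: only the code needed to render the initial view -->
<script type="text/javascript" src="startup.js"></script>
</head>
<body>
...page content...
<!-- deferred.js: everything that can wait until after the page has loaded -->
<script type="text/javascript" src="deferred.js"></script>
</body>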
Position scripts correctly in the document head.
Whether a script is external or inline, it's beneficial to position it in the correct order with respect to other elements, to maximize parallel downloads.

Combine external CSS

Overview

Combining external stylesheets into as few files as possible cuts down on RTTs and delays in downloading other resources.

Details

As with external JavaScript, multiple external CSS files incur additional RTT overhead. If your site contains many CSS files, combining them into fewer output files can reduce latency. We recommend a maximum of 3, but preferably 2, CSS files.
It often makes sense to use many different CSS files during the development cycle, and then bundle those CSS files together as part of your deployment process. See below for recommended ways of partitioning your files. You would also need to update all of your pages to refer to the bundled files as part of the deployment process.

Recommendations

Partition files optimally.
Here are some rules of thumb for combining your CSS files in production:
  • Partition the CSS into 2 files: one CSS file containing the minimal code needed to render the page at startup; and one CSS file containing the code that isn't needed until the page load has completed.
  • Serve CSS of a rarely visited component in its own file. Serve the file only when that component is requested by a user.
  • For CSS that shouldn't be cached, consider inlining it.
  • Don't use CSS @import from a CSS file.
Position stylesheets correctly in the document head.
It's beneficial to position references to external CSS in the correct order with respect to scripts, to enable parallel downloads.

Combine images using CSS sprites

Overview

Combining images into as few files as possible using CSS sprites reduces the number of round-trips and delays in downloading other resources, reduces request overhead, and can reduce the total number of bytes downloaded by a web page.

Details

Similar to JavaScript and CSS, downloading multiple images incurs additional round trips. A site that contains many images can combine them into fewer output files to reduce latency.

Recommendations

Sprite images that are loaded together
Combine images that are loaded on the same page and that are always loaded together. For instance, a set of icons that are loaded on every page should be sprited. Dynamic images that change with each pageview, such as profile pictures or other images that change frequently, may not be good candidates for spriting.
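For example, a minimal sprite sketch (the file name, icon sizes, and offsets are hypothetical): two 16x16 icons stacked vertically in a single icons.png are selected with background-position:
.icon        { width: 16px; height: 16px; background: url(icons.png) no-repeat; }
.icon-home   { background-position: 0 0; }     /* icon at the top of the sprite */
.icon-search { background-position: 0 -16px; } /* icon 16px further down */
A page can then display either icon from the one downloaded image, e.g. <span class="icon icon-home"></span>.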
Sprite GIF and PNG images first
GIF and PNG images use lossless compression and can thus be sprited without reducing the quality of the resulting sprited image.
Sprite small images first
Each request incurs a fixed amount of request overhead. The time it takes a browser to download small images can be dominated by the request overhead. By combining small images, you can reduce this overhead from one request per image to one request for the entire sprite.
Sprite cacheable images
Spriting images with long caching lifetimes means that the image will not have to be re-fetched once it's cached by the browser.
Using a spriting service
Spriting services such as SpriteMe can make it easier to build CSS sprites.
Minimize the amount of "empty space" in the sprited image
In order to display an image, the browser must decompress and decode the image. The size of the decoded representation of the image is proportional to the number of pixels in the image. Thus, while empty space in a sprited image may not significantly impact the size of the image file, a sprite with undisplayed pixels increases the memory usage of your page, which can cause the browser to become less responsive.
Sprite images with similar color palettes
Spriting images with more than 256 colors can cause the resulting sprite to use the PNG truecolor type instead of the palette type, which can increase the size of the resulting sprite. To generate optimal sprites, combine images that share the same 256 color palette. If there is some flexibility in the colors in your images, consider reducing the resulting sprite's color palette to 256 colors.

Optimize the order of styles and scripts

Overview

Correctly ordering external stylesheets and external and inline scripts enables better parallelization of downloads and speeds up browser rendering time.

Details

Because JavaScript code can alter the content and layout of a web page, the browser delays rendering any content that follows a script tag until that script has been downloaded, parsed and executed. However, more importantly for round-trip times, many browsers block the downloading of resources referenced in the document after scripts until those scripts are downloaded and executed. On the other hand, if other files are already in the process of being downloaded when a JS file is referenced, the JS file is downloaded in parallel with them. For example, let's say you have 3 stylesheets and 2 scripts, and you specify them in the following order in the document:
<head>
<link rel="stylesheet" type="text/css" href="stylesheet1.css" />
<script type="text/javascript" src="scriptfile1.js" />
<script type="text/javascript" src="scriptfile2.js" />
<link rel="stylesheet" type="text/css" href="stylesheet2.css" />
<link rel="stylesheet" type="text/css" href="stylesheet3.css" />
</head>
Assuming that each one takes exactly 100 milliseconds to download, that the browser can maintain up to 6 concurrent connections for a single host (for more information about this, see Parallelize downloads across hostnames), and that the cache is empty, the download profile will look something like this:

The last two stylesheets must wait until the JS files have finished downloading. The total download time equals the time it takes to download both JS files plus the largest of the remaining CSS files (in this case 100 ms + 100 ms + 100 ms = 300 ms). Merely changing the order of the resources to this:
<head>
<link rel="stylesheet" type="text/css" href="stylesheet1.css" />
<link rel="stylesheet" type="text/css" href="stylesheet2.css" />
<link rel="stylesheet" type="text/css" href="stylesheet3.css" />
<script type="text/javascript" src="scriptfile1.js" />
<script type="text/javascript" src="scriptfile2.js" />
</head>
Will result in the following download profile:

100 ms is shaved off the total download time. For very large stylesheets that can take longer to download, the savings could be more.
Therefore, since stylesheets should always be specified in the head of a document for better performance, it's important, where possible, that any external JS files that must be included in the head (such as those that write to the document) follow the stylesheets, to prevent delays in download time.
Another, more subtle, issue is caused by the presence of an inline script following a stylesheet, such as the following:
<head>
<link rel="stylesheet" type="text/css" href="stylesheet1.css" />
<script type="text/javascript">
 document.write("Hello world!");
</script>
<link rel="stylesheet" type="text/css" href="stylesheet2.css" />
<link rel="stylesheet" type="text/css" href="stylesheet3.css" />
<link rel="alternate" type="application/rss+xml" href="front.xml" title="Say hello" />
<link rel="shortcut icon" type="image/x-icon" href="favicon.ico">
</head>
In this case, the reverse problem occurs: the first stylesheet actually blocks the inline script from being executed, which then in turn blocks other resources from being downloaded. Again, the solution is to move the inline scripts to follow all other resources, if possible, like so:
<head>
<link rel="stylesheet" type="text/css" href="stylesheet1.css" />
<link rel="stylesheet" type="text/css" href="stylesheet2.css" />
<link rel="stylesheet" type="text/css" href="stylesheet3.css" />
<link rel="alternate" type="application/rss+xml" title="Say hello" href="front.xml" />
<link rel="shortcut icon" type="image/x-icon" href="favicon.ico">
<script type="text/javascript">
   document.write("Hello world!");
</script>
</head>

Recommendations

Put external scripts after external stylesheets if possible.
Browsers execute stylesheets and scripts in the order in which they appear in the document. If the JS code has no dependencies on the CSS files, you can move the CSS files before the JS files. If the JS code does depend on the CSS contained in an external file — for example, styles that are needed for output you are writing to the document in the JS code — this isn't possible.
Put inline scripts after other resources if possible.
Putting inline scripts after all other resources prevents blocking of other downloads, and it also enables progressive rendering. However, if those "other resources" are external JS files on which the inline scripts depend, this might not be possible. In this case, it's best to move the inline scripts before the CSS files.

Avoid document.write

Overview

Using document.write() to fetch external resources, especially early in the document, can significantly increase the time it takes to display a web page.

Details

Modern browsers use speculative parsers to more efficiently discover external resources referenced in HTML markup. These speculative parsers help to reduce the time it takes to load a web page. Since speculative parsers are fast and lightweight, they do not execute JavaScript. Thus, using JavaScript's document.write() to fetch external resources makes it impossible for the speculative parser to discover those resources, which can delay the download, parsing, and rendering of those resources.
Using document.write() from external JavaScript resources is especially expensive, since it serializes the downloads of the external resources. The browser must download, parse, and execute the first external JavaScript resource before it executes the document.write() that fetches the additional external resources. For instance, if external JavaScript resource first.js contains the following content:
document.write('<script src="second.js"><\/script>');
The download of first.js and second.js will be serialized in all browsers. Using one of the recommended techniques described below can reduce blocking and serialization of these resources, which in turn reduces the time it takes to display the page.

Recommendations

Declare resources directly in HTML markup
Declaring resources in HTML markup allows the speculative parser to discover those resources. For instance, instead of calling document.write from an HTML <script> tag like so:
<html>
<body>
<script>
document.write('<script src="example.js"><\/script>');
</script>
</body>
</html>
insert the document.written script tag directly into the HTML:
<html>
<body>
<script src="example.js"></script>
</body>
</html>
Prefer asynchronous resources
In some cases, it may not be possible to declare resources directly in HTML. For instance, if the URL of the resource is determined dynamically on the client, JavaScript must be used to construct that URL. In these cases, try to use asynchronous loading techniques.
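As an illustrative sketch (the URL scheme and file name are hypothetical), a script whose URL depends on a runtime value can be loaded with a dynamically created script element rather than document.write:
<script type="text/javascript">
// The locale is only known at runtime, so the URL must be built in JavaScript
var locale = (navigator.language || "en").substring(0, 2);
var node = document.createElement('script');
node.src = 'http://www.example.com/js/messages_' + locale + '.js';
// Appending the element starts a download that doesn't block the parser
document.getElementsByTagName('head')[0].appendChild(node);
</script>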
Use "friendly iframes"
In some cases, such as optimization of legacy code that cannot be loaded using other recommended techniques, it may not be possible to avoid document.write. In these cases, friendly iframes can be used to avoid blocking the main page.
A friendly iframe is an iframe on the same origin as its parent document. Resources referenced in friendly iframes load in parallel with resources referenced on the main page. Thus, calling document.write in a friendly iframe does not block the parent page from loading. Despite not blocking the parent page, using document.write in a friendly iframe can still slow down the loading of the content in that iframe, so other recommended techniques should be preferred over the "friendly iframe" technique.
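A minimal sketch of the technique (legacy.js is a hypothetical script that calls document.write internally): the iframe shares the parent page's origin, so the legacy code runs inside it without blocking the parent document:
<iframe id="legacy-frame" src="about:blank" style="border:0"></iframe>
<script type="text/javascript">
var frame = document.getElementById('legacy-frame');
var doc = frame.contentWindow.document; // same origin as the parent page
doc.open();
doc.write('<script src="legacy.js"><\/script>');
doc.close();
</script>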

Avoid CSS @import

Overview

Using CSS @import in an external stylesheet can add additional delays during the loading of a web page.

Details

CSS @import allows stylesheets to import other stylesheets. When CSS @import is used from an external stylesheet, the browser is unable to download the stylesheets in parallel, which adds additional round-trip times to the overall page load. For instance, if first.css contains the following content:
@import url("second.css")
The browser must download, parse, and execute first.css before it is able to discover that it needs to download second.css.

Recommendations

Use the <link> tag instead of CSS @import
Instead of @import, use a <link> tag for each stylesheet. This allows the browser to download stylesheets in parallel, which results in faster page load times:
<link rel="stylesheet" href="first.css">
<link rel="stylesheet" href="second.css">

Prefer asynchronous resources

Overview

Fetching resources asynchronously prevents those resources from blocking the page load.

Details

When a browser parses a traditional script tag, it must wait for the script to download, parse, and execute before rendering any HTML that comes after it. With an asynchronous script, however, the browser can continue parsing and rendering HTML that comes after the async script, without waiting for that script to complete. When a script is loaded asynchronously, it is fetched as soon as possible, but its execution is deferred until the browser's UI thread is not busy doing something else, such as rendering the web page.

Recommendations

JavaScript resources that aren't needed to construct the initial view of the web page, such as those used for tracking/analytics, should be loaded asynchronously. Some scripts that display user-visible content may also be loaded asynchronously, especially if that content is not the most important content on the page (e.g. it is below the fold).
Use a script DOM element
Using a script DOM element maximizes asynchronous loading across current browsers:
<script>
var node = document.createElement('script');
node.type = 'text/javascript';
node.async = true;
node.src = 'example.js';
// Insert the new node into the DOM, here before the first existing script element
var firstScript = document.getElementsByTagName('script')[0];
firstScript.parentNode.insertBefore(node, firstScript);
</script>
Using a script DOM element with an async attribute allows for asynchronous loading in Internet Explorer, Firefox, Chrome, and Safari. By contrast, at the time of this writing, an HTML <script> tag with an async attribute will only load asynchronously in Firefox 3.6 and Chrome 8, as other browsers do not yet support this mechanism for asynchronous loading.
Load Google Analytics asynchronously
The newest version of the Google Analytics snippet uses asynchronous JavaScript. Pages that use the old snippet should upgrade to the asynchronous version.

Parallelize downloads across hostnames

Overview

Serving resources from two different hostnames increases parallelization of downloads.

Details

The HTTP 1.1 specification (section 8.1.4) states that browsers should allow at most two concurrent connections per hostname (although newer browsers allow more than that: see Browserscope for a list). If an HTML document contains references to more resources (e.g. CSS, JavaScript, images, etc.) than the maximum allowed on one host, the browser issues requests for the maximum allowed number of resources and queues the rest. As soon as some of those requests finish, the browser issues requests for the next resources in the queue, and it repeats the process until it has downloaded all the resources. In other words, if a page references more than X external resources from a single host, where X is the maximum number of connections allowed per host, the browser must download them sequentially, X at a time, incurring 1 RTT for every X resources. The total number of round trips is roughly N/X, where N is the number of resources to fetch from the host. For example, if a browser allows 4 concurrent connections per hostname, and a page references 100 resources on the same domain, it will incur 1 RTT for every 4 resources, and a total download time of 25 RTTs.
You can get around this restriction by serving resources from multiple hostnames. This "tricks" the browser into parallelizing additional downloads, which leads to faster page load times. However, using multiple concurrent connections can cause increased CPU usage on the client, and introduces additional round-trip time for each new TCP connection setup, as well as DNS lookup latency for clients with empty caches. Therefore, beyond a certain number of connections, this technique can actually degrade performance. The optimal number of hosts is generally believed to be between 2 and 5, depending on various factors such as the size of the files, bandwidth and so on. If your pages serve large numbers of static resources, such as images, from a single hostname, consider splitting them across multiple hostnames using DNS aliases. We recommend this technique for any page that serves more than 10 resources from a single host.  (For pages that serve fewer resources than this, it's overkill.) 

To set up additional hostnames, you can configure subdomains in your DNS database as CNAME records that point to a single A record, and then configure your web server to serve resources from the multiple hosts. For even better performance, if all or some of the resources don't make use of cookie data (which they usually don't), consider making all or some of the hosts subdomains of a cookieless domain. Be sure to allocate the resources evenly among the different hostnames, and in the pages that reference the resources, use the CNAMEd hostnames in the URLs.
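For example (the hostnames are hypothetical), with static1 and static2 configured as CNAMEs for the same server, image references can be split evenly between them:
<img src="http://static1.example.com/images/logo.png" />
<img src="http://static1.example.com/images/nav.png" />
<img src="http://static2.example.com/images/banner.jpg" />
<img src="http://static2.example.com/images/photo.jpg" />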
If you host your static files using a CDN, your CDN may support serving these resources from more than one hostname. Contact your CDN to find out.

Recommendations

Balance parallelizable resources across hostnames.
Requests for most static resources, including images, CSS, and other binary objects, can be parallelized. Balance requests to all these objects as much as possible across the hostnames. If that's not possible, as a rule of thumb, try to ensure that no one host serves more than 50% more than the average across all hosts. So, for example, if you have 40 resources, and 4 hosts, each host should serve ideally 10 resources; in the worst case, no host should serve more than 15. If you have 100 resources and 4 hosts, each host should serve 25 resources; no one host should serve more than 38. On the other hand, many browsers do not download JavaScript files in parallel*, so there is no benefit from serving them from multiple hostnames. So when balancing resources across hostnames, remove any JS files from your allocation equation. *For a list of browsers that do and do not support parallel downloading of JavaScript files, see Browserscope.
Prevent external JS from blocking parallel downloads.
When downloading external JavaScript, many browsers block downloads of all other types of files on all hostnames, regardless of the number of hostnames involved. To prevent JS downloads from blocking other downloads (and to speed up the JS downloads themselves), use the techniques described above: combine external JavaScript files, serve early-loaded scripts from the same hostname as the main document, and load scripts that aren't needed at startup asynchronously.
Always serve a resource from the same hostname.
To improve the browser cache hit rate, the client should always fetch a resource from the same hostname. Make sure all page references to the same resource use the same URL.

Example

To display its map images, Google Maps delivers multiple small images called "tiles", each of which represents a small portion of the larger map. The browser assembles the tiles into the complete map image as it loads each one. For this process to appear seamless, it's important that the tiles download in parallel and as quickly as possible. To enable the parallel download, the application assigns the tile images to four hostnames, mt0, mt1, mt2 and mt3. So, for example, in Firefox 3.0+, which allows up to 6 parallel connections per hostname, up to 24 requests for map tiles could be made in parallel. The following screen shot from Firebug's Net panel shows this effect in Firefox: 15 requests across the hostnames mt[0-3] are made in parallel:


Optimize caching

Most web pages include resources that change infrequently, such as CSS files, image files, JavaScript files, and so on. These resources take time to download over the network, which increases the time it takes to load a web page. HTTP caching allows these resources to be saved, or cached, by a browser or proxy. Once a resource is cached, a browser or proxy can refer to the locally cached copy instead of having to download it again on subsequent visits to the web page. Thus caching is a double win: you reduce round-trip time by eliminating numerous HTTP requests for the required resources, and you substantially reduce the total payload size of the responses. Besides leading to a dramatic reduction in page load time for subsequent user visits, enabling caching can also significantly reduce the bandwidth and hosting costs for your site.
  1. Leverage browser caching
  2. Leverage proxy caching

Leverage browser caching

Overview

Setting an expiry date or a maximum age in the HTTP headers for static resources instructs the browser to load previously downloaded resources from local disk rather than over the network.

Details

HTTP/S supports local caching of static resources by the browser. Some of the newest browsers (e.g. IE 7, Chrome) use a heuristic to decide how long to cache all resources that don't have explicit caching headers. Other older browsers may require that caching headers be set before they will fetch a resource from the cache; and some may never cache any resources sent over SSL.
To take advantage of the full benefits of caching consistently across all browsers, we recommend that you configure your web server to explicitly set caching headers and apply them to all cacheable static resources, not just a small subset (such as images). Cacheable resources include JS and CSS files, image files, and other binary object files (media files, PDFs, Flash files, etc.). In general, HTML is not static, and shouldn't be considered cacheable. 
HTTP/1.1 provides the following caching response headers:
  • Expires and Cache-Control: max-age. These specify the “freshness lifetime” of a resource, that is, the time period during which the browser can use the cached resource without checking to see if a new version is available from the web server. They are "strong caching headers" that apply unconditionally; that is, once they're set and the resource is downloaded, the browser will not issue any GET requests for the resource until the expiry date or maximum age is reached.
  • Last-Modified and ETag. These specify some characteristic about the resource that the browser checks to determine if the files are the same. In the Last-Modified header, this is always a date. In the ETag header, this can be any value that uniquely identifies a resource (file versions or content hashes are typical). Last-Modified is a "weak" caching header in that the browser applies a heuristic to determine whether to fetch the item from cache or not. (The heuristics are different among different browsers.) However, these headers allow the browser to efficiently update its cached resources by issuing conditional GET requests when the user explicitly reloads the page. Conditional GETs don't return the full response unless the resource has changed at the server, and thus have lower latency than full GETs.
It is important to specify one of Expires or Cache-Control max-age, and one of Last-Modified or ETag, for all cacheable resources. It is redundant to specify both Expires and Cache-Control: max-age, or to specify both Last-Modified and ETag.
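As a minimal Apache sketch (assuming mod_expires is enabled and the listed file types are appropriate for your site), static resources can be given a one-year freshness lifetime; for files served from disk, Apache sends the Last-Modified header automatically:
ExpiresActive On
<FilesMatch "\.(js|css|png|jpg|jpeg|gif)$">
  # Sets an Expires date one year after each request, plus an equivalent Cache-Control: max-age
  ExpiresDefault "access plus 1 year"
</FilesMatch>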

Recommendations

Set caching headers aggressively for all static resources.
For all cacheable resources, we recommend the following settings:
  • Set Expires to a minimum of one month, and preferably up to one year, in the future. (We prefer Expires over Cache-Control: max-age because it is more widely supported.) Do not set it to more than one year in the future, as that violates the RFC guidelines. If you know exactly when a resource is going to change, setting a shorter expiration is okay. But if you think it "might change soon" but don't know when, you should set a long expiration and use URL fingerprinting (described below). Setting caching aggressively does not "pollute" browser caches: as far as we know, all browsers clear their caches according to a Least Recently Used algorithm; we are not aware of any browsers that wait until resources expire before purging them.
  • Set the Last-Modified date to the last time the resource was changed. If the Last-Modified date is far enough in the past, chances are the browser won't refetch it.
Use fingerprinting to dynamically enable caching.
For resources that change occasionally, you can have the browser cache the resource until it changes on the server, at which point the server tells the browser that a new version is available. You accomplish this by embedding a fingerprint of the resource in its URL (i.e. the file path). When the resource changes, so does its fingerprint, and in turn, so does its URL. As soon as the URL changes, the browser is forced to re-fetch the resource. Fingerprinting allows you to set expiry dates long into the future even for resources that change more frequently than that. Of course, this technique requires that all of the pages that reference the resource know about the fingerprinted URL, which may or may not be feasible, depending on how your pages are coded.
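For instance (the path and fingerprints are hypothetical), a page might reference a stylesheet through a URL that embeds a content hash; when the file changes, the hash, and therefore the URL, changes with it:
<!-- Original reference -->
<link rel="stylesheet" type="text/css" href="/static/styles.d41d8cd98f.css" />
<!-- After the stylesheet's content changes, pages reference the new fingerprinted URL -->
<link rel="stylesheet" type="text/css" href="/static/styles.9e107d9d37.css" />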
Set the Vary header correctly for Internet Explorer.
Internet Explorer does not cache any resources served with a Vary header that contains fields other than Accept-Encoding and User-Agent. To ensure these resources are cached by IE, strip any other fields from the Vary header, or remove the Vary header altogether if possible.
Avoid URLs that cause cache collisions in Firefox.
The Firefox disk cache hash functions can generate collisions for URLs that differ only slightly, namely only on 8-character boundaries. When resources hash to the same key, only one of the resources is persisted to disk cache; the remaining resources with the same key have to be re-fetched across browser restarts. Thus, if you are using fingerprinting or are otherwise programmatically generating file URLs, to maximize cache hit rate, avoid the Firefox hash collision issue by ensuring that your application generates URLs that differ on more than 8-character boundaries.
Use the Cache-Control: public directive to enable HTTPS caching for Firefox.
Some versions of Firefox require the Cache-Control: public header to be set in order for resources sent over SSL to be cached on disk, even if the other caching headers are explicitly set. Although this header is normally used to enable caching by proxy servers (as described below), proxies cannot cache any content sent over HTTPS, so it is always safe to set this header for HTTPS resources.

Example

For the stylesheet used to display the user's calendar after login, Google Calendar embeds a fingerprint in its filename: calendar/static/fingerprint_keydoozercompiled.css, where the fingerprint key is a 128-bit hexadecimal number. At the time of the screen shot below (taken from Page Speed's Show Resources panel), the fingerprint was set to 82b6bc440914c01297b99b4bca641a5d:

The fingerprinting mechanism allows the server to set the Expires header to exactly one year ahead of the request date; the Last-Modified header to the date the file was last modified; and the Cache-Control: max-age header to 31536000 (one year in seconds). To cause the client to re-download the file in case it changes before its expiry date or maximum age, the fingerprint (and therefore the URL) changes whenever the file's content does.

Leverage proxy caching

Overview

Enabling public caching in the HTTP headers for static resources allows the browser to download resources from a nearby proxy server rather than from the more distant origin server.

Details

In addition to browser caching, HTTP provides for proxy caching, which enables static resources to be cached on public web proxy servers, most notably those used by ISPs. This means that even first-time visitors to your site can benefit from caching: once a static resource has been requested by one user through the proxy, that resource is available for all other users whose requests go through that same proxy. Since those locations are likely to be in closer network proximity to your users than your servers, proxy caching can result in a significant reduction in network latency. Also, proxy caching effectively gives you free web site hosting, since responses served from proxy caches don't draw on your servers' bandwidth at all.
You use the Cache-control: public header to indicate that a resource can be cached by public web proxies in addition to the browser that issued the request. With some exceptions (described below), you should configure your web server to set this header to public for cacheable resources.

Recommendations

Don't include a query string in the URL for static resources.
Most proxies, most notably Squid up through version 3.0, do not cache resources with a "?" in their URL even if a Cache-control: public header is present in the response. To enable proxy caching for these resources, remove query strings from references to static resources, and instead encode the parameters into the file names themselves.
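For example (the URLs and versioning scheme are hypothetical):
<!-- Many proxies won't cache this resource because of the "?" in its URL -->
<link rel="stylesheet" type="text/css" href="/css/styles.css?v=12" />
<!-- Encoding the version in the file name keeps the resource proxy-cacheable -->
<link rel="stylesheet" type="text/css" href="/css/styles_v12.css" />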
Don't enable proxy caching for resources that set cookies.
Setting the header to public effectively shares resources among multiple users, which means that any cookies set for those resources are shared as well. While many proxies won't actually cache any resources with cookie headers set, it's better to avoid the risk altogether. Either set the Cache-Control header to private or serve these resources from a cookieless domain.
Be aware of issues with proxy caching of JS and CSS files.
Some public proxies have bugs that cause them to ignore the presence of the Content-Encoding response header. This can result in compressed versions being delivered to client browsers that cannot properly decompress the files. Since these files should always be gzipped by your server, to ensure that the client can correctly read the files, do either of the following:
  • Set the Cache-Control header to private. This disables proxy caching altogether for these resources. If your application is multi-homed around the globe and relies less on proxy caches for user locality, this might be an appropriate setting.
  • Set the Vary: Accept-Encoding response header. This instructs the proxies to cache two versions of the resource: one compressed, and one uncompressed. The correct version of the resource is delivered based on the client request header. This is a good choice for applications that are singly homed and depend on public proxies for user locality.
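A minimal Apache sketch of the second option (assuming mod_headers is enabled):
<FilesMatch "\.(js|css)$">
  # Tell proxies to cache the compressed and uncompressed variants separately
  Header append Vary Accept-Encoding
</FilesMatch>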

Web Performance Best Practices

When you profile a web page with Page Speed, it evaluates the page's conformance to a number of different rules. These rules are general front-end best practices you can apply at any stage of web development. We provide documentation of each of the rules here, so whether or not you run the Page Speed tool — maybe you're just developing a brand new site and aren't ready to test it — you can refer to these pages at any time. We give you specific tips and suggestions for how you can best implement the rules and incorporate them into your development process.

About the performance best practices

Page Speed evaluates performance from the client point of view, typically measured as the page load time. This is the elapsed time between the moment a user requests a new page and the moment the page is fully rendered by the browser. The best practices cover many of the steps involved in page load time, including resolving DNS names, setting up TCP connections, transmitting HTTP requests, downloading resources, fetching resources from cache, parsing and executing scripts, and rendering objects on the page. Essentially, Page Speed evaluates how well your pages eliminate these steps altogether, parallelize them, or shorten the time they take to complete. The best practices are grouped into five categories that cover different aspects of page load optimization.

Send us your feedback

We would appreciate any feedback you would like to give about the rules described in these pages. If you have suggestions on how to make these best practices better (or how to document them better!), post them to our discussion group at page-speed-discuss.

Security Considerations for mod_pagespeed

Any change to a website has the possibility of introducing new security holes. mod_pagespeed is not an exception to this rule. This document covers specific security concerns to keep in mind when using mod_pagespeed.

Untrusted Content

Any time you reference untrusted content on your website, you are at risk of security attack. This is most clear for JavaScript, which will have access to your domain's cookies because of the Same Origin Policy. It can also be true for CSS, which can contain JavaScript references (e.g. the IE behavior property described in this W3C reference and at this reference by SitePoint®). Even images in certain situations can be used in attacks (e.g. the GIFAR attack).
Caution: Do not reference untrusted content on your website. If you do store user content or other untrusted content, keep it on a separate cookie-less domain and do NOT tell mod_pagespeed to rewrite from that domain to your main cookied domain.

Private Content

mod_pagespeed rewrites and, effectively, proxies resources referenced in the main HTML document. It respects public caching headers, so if a resource is not explicitly marked as publicly cacheable, mod_pagespeed will neither rewrite nor re-serve it. However, mod_pagespeed will re-serve resources which ARE publicly cacheable. If you serve private content as publicly cacheable, mod_pagespeed will proxy it to anyone who requests its URL. Note that any public proxy on the Internet can do the same thing.
Caution: Explicitly mark private content as not publicly cacheable.

Friday, March 11, 2011

Experimenting with mod_pagespeed

Configuring mod_pagespeed_examples

mod_pagespeed ships with a directory of sample HTML, JavaScript, Image, and CSS files to demonstrate the rewrite passes that it executes. These also form the basis of an installation smoke-test to ensure that the configured system is operating correctly. Assuming the files are installed in /var/www/mod_pagespeed_example, the following configuration file fragment will enable them to be served using reasonable caching headers.
# These caching headers are set up for the mod_pagespeed example, and
    # also serve as a demonstration of good values to set for the entire
    # site, if it is to be optimized by mod_pagespeed.

    <Directory /var/www/mod_pagespeed_example>
      # To show that mod_pagespeed rewrites web pages, we must
      # turn off Etags for HTML files and eliminate caching altogether.
      # mod_pagespeed should rewrite HTML files each time they are served.
      # The first time mod_pagespeed sees an HTML file, it may not optimize
      # it fully.  It will optimize better after the second view.  Caching
      # defeats this behavior.
      <FilesMatch "\.(html|htm)$">
        Header unset Etag
        Header set Cache-control "max-age=0, no-cache, no-store"
      </FilesMatch>

      # Images, styles, and JavaScript are all cache-extended for
      # a year by rewriting URLs to include a content hash.  mod_pagespeed
      # can only do this if the resources are cacheable in the first place.
      # The origin caching policy, set here to 10 minutes, dictates how
      # frequently mod_pagespeed must re-read the content files and recompute
      # the content-hash.  As long as the content doesn't actually change,
      # the content-hash will remain the same, and the resources stored
      # in browser caches will stay relevant.
      <FilesMatch "\.(jpg|jpeg|gif|png|js|css)$">
        Header unset Etag
        Header set Cache-control "public, max-age=600"
      </FilesMatch>
    </Directory>

Trying out mod_pagespeed using mod_proxy

Ideally, you will experiment with mod_pagespeed on an Apache server that is already serving its own content. However, to experiment with mod_pagespeed on an Apache server that does not serve its own content, you can set up Apache as proxy:
# Proxy configuration file to enable mod_pagespeed to rewrite external
    # content.  In this configuration we assume a browser proxy,
    # pointing to HOSTNAME:80.

    LoadModule proxy_module /etc/apache2/modules/mod_proxy.so
    # Depends: proxy
    LoadModule proxy_http_module /etc/apache2/modules/mod_proxy_http.so

    <IfModule mod_proxy.c>
      ProxyRequests On
      ProxyVia On

      # limit connections to LAN clients
      <Proxy *>
        AddDefaultCharset off
        Order Deny,Allow
        Allow from all
      </Proxy>
      ProxyPreserveHost On
      ProxyStatus On
      ProxyBadHeader Ignore

      # Enable/disable the handling of HTTP/1.1 "Via:" headers.
      # ("Full" adds the server version; "Block" removes all outgoing Via: headers)
      # Set to one of: Off | On | Full | Block
      ProxyVia On
    </IfModule>
Set the browser proxy to point to that proxy server, and you will then be able to view any Internet site rewritten by Apache and mod_pagespeed.

mod_pagespeed System Integration

Configuring Caching

mod_pagespeed requires publicly cacheable resources to provide maximum benefit. As discussed in the "Cache Extender" filter, the origin TTL specified in the Apache configuration file dictates how quickly changes made to the source can propagate to users' browser caches. However, using mod_pagespeed, resources referenced statically from HTML files will be served with a one-year cache lifetime, but with a URL that is versioned using a content hash.
The cache settings suggested above for mod_pagespeed_example also serve as our recommended starting point for ensuring that your sites' content is cacheable, and thus rewritable by mod_pagespeed.

Configuring Server-Side Cache for mod_pagespeed

In order to rewrite resources, mod_pagespeed must cache them on the server. The output filter must be configured with paths where it can write cache files, and tuned to limit the amount of disk space consumed. The file-based cache has a built-in LRU mechanism to remove old files, targeting a certain total disk space usage and a certain interval for the cleanup process. It is also useful to have a small in-memory write-through LRU cache that's kept in each Apache process. Keep in mind that in pre-fork mode, Apache spawns dozens of processes, so the total memory consumed (ModPagespeedLRUCacheKbPerProcess * num_processes) must fit within the capacity of the HTTP server.
The default values, which perform reasonably well, are:
ModPagespeedFileCacheSizeKb          102400
    ModPagespeedFileCacheCleanIntervalMs 3600000
    ModPagespeedLRUCacheKbPerProcess     1024
    ModPagespeedLRUCacheByteLimit        16384
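As a rough, hypothetical sizing example: with the default ModPagespeedLRUCacheKbPerProcess of 1024 and an Apache prefork server running about 50 worker processes, the in-memory LRU caches would consume roughly 50 * 1024 KB, or about 50 MB, in total. The file-based cache is capped separately by ModPagespeedFileCacheSizeKb (100 MB with the defaults above).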

mod_pagespeed requires a file-path for the cache. You can use the following suggested setting:
ModPagespeedFileCachePath            "/var/mod_pagespeed/cache/"
mod_pagespeed also requires a second file-path. It is not currently used, but is reserved for future use as a shared database in a multi-server environment. You can use the following suggested setting:
ModPagespeedGeneratedFilePrefix      "/var/mod_pagespeed/files/"

Setting the URL fetcher timeout

When mod_pagespeed attempts to rewrite a resource for the first time, it must fetch it via HTTP. The default timeout for fetches is 5 seconds. A directive can be applied to change the timeout:
ModPagespeedFetcherTimeoutMs timeout_value_in_milliseconds
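For example, to allow slow origin fetches up to 10 seconds (a value chosen purely for illustration):
ModPagespeedFetcherTimeoutMs         10000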
These directives cannot be used in .htaccess files or <Directory> scopes.

mod_pagespeed: URL Control

Restricting Rewriting Via Wildcards: ModPagespeedDisallow and ModPagespeedAllow

By default, mod_pagespeed rewrites all HTML files served by your server, along with all resources (CSS, images, JavaScript) referenced from that HTML whose origin matches the HTML file or is authorized via ModPagespeedDomain. However, this can be restricted using wildcards, via the directives:
ModPagespeedAllow wildcard_spec
  ModPagespeedDisallow wildcard_spec
These directives are evaluated in sequence for each resource, to determine whether the resource should be considered for rewriting. This is best illustrated with an example.

Example 1: Excluding JavaScript files that cannot be rewritten

Some JavaScript files are sensitive to their own names, which they discover as they traverse the DOM. Any rewriting of these files is therefore invalid: we cannot cache-extend or minify them. When such files are identified, we can exclude them from the rewriting process via:
ModPagespeedDisallow */jquery-ui-1.8.2.custom.min.js
  ModPagespeedDisallow */js_tinyMCE.js

Example 2: Specifying explicitly which types of files can be rewritten

By default, every resource referenced by HTML from authorized domains is rewritten, as if there were an implicit
ModPagespeedAllow *
at the beginning of every configuration. To change the default to be exclusive, issue
ModPagespeedDisallow *
and then follow it with which files you want to include. For example:
ModPagespeedDisallow *
  ModPagespeedAllow http://*example.com/*.html
  ModPagespeedAllow http://*example.com/*/images/*.png
  ModPagespeedAllow http://*example.com/*/styles/*.css
  ModPagespeedDisallow */images/captcha/*
The later directives take priority over the earlier ones, so the Captcha images will not be rewritten.
Note: Wildcards include *, which matches zero or more characters, and ?, which matches exactly one character. Unlike in Unix shells, the / directory separator is not special and can be matched by either * or ?. Resource URLs are always expanded into their absolute form before the wildcards are applied.
These directives can be used in .htaccess files and <Directory> scopes.
Note: The names in wildcards are not evaluated with respect to the <Directory> scope or the directory containing the .htaccess file. The wildcards are evaluated against the fully expanded URL. So if you want to match js_tinyMCE.js you must prefix it with its full path, including http:// and domain, or use a wildcard, as shown above.
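For example, to exclude the js_tinyMCE.js file mentioned above, either of the following (the hostname here is purely illustrative) will match it:
ModPagespeedDisallow http://www.example.com/js/js_tinyMCE.js
  ModPagespeedDisallow */js_tinyMCE.js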

Restricting mod_pagespeed from combining resources across paths

Note: New feature as of 0.9.15.1
By default, filters that combine multiple resources together are allowed to combine multiple resources across paths. This works well much of the time, but if there are Apache directory-level access controls, or path-specific cookies associated with JavaScript files, then you may need to turn off this feature.
ModPagespeedCombineAcrossPaths off
These directives can be used in .htaccess files and <Directory> scopes.

Limiting the maximum generated URL segment length

Note: New feature as of 0.9.15.1
The maximum URL size is generally limited to about 2k characters due to Internet Explorer: See http://support.microsoft.com/kb/208427/EN-US. Apache servers by default impose a further limitation of about 250 characters per URL segment (text between slashes). mod_pagespeed circumvents this limitation, but if you employ proxy servers in your path you may need to re-impose it by overriding the setting here. The default setting is 1024.
ModPagespeedMaxSegmentLength 250
These directives can be used in .htaccess files and <Directory> scopes.

mod_pagespeed Authorizing and Mapping Domains

Authorizing Domains

In addition to optimizing HTML, mod_pagespeed optimizes the resources (JavaScript, CSS, images) referenced from it, but restricts itself to resources served from domains that are explicitly listed in the configuration file. For example:
ModPagespeedDomain http://example.com
    ModPagespeedDomain http://cdn.example.com
mod_pagespeed will rewrite resources found from these two explicitly listed domains. Additionally, it will rewrite resources that are served from the same domain as the HTML file, or are specified as a path relative to the HTML. When resources are rewritten, their domain and path are not changed. However, the leaf name is changed to encode rewriting information that can be used to identify and serve the optimized resource.
These directives can be used in .htaccess files and <Directory> scopes.

Mapping Origin Domains

In order to improve the performance of web pages, mod_pagespeed must examine and modify the content of resources referenced on those pages. To do that, it must fetch those resources using HTTP, using the URL reference specified on the HTML page.
In some cases, the URL specified in the HTML file is not the best URL to use to fetch the resource from the Apache server. Scenarios where this is a concern include:
  1. The server is behind a load balancer, and it is more efficient to reference the server directly by its IP address, or as 'localhost'.
  2. The server has a special DNS configuration.
  3. The server is behind a firewall that prevents outbound connections.
  4. The server is running in a CDN or proxy, and must go back to the origin server for the resources.
In these situations the remedy is to map the origin domain:
ModPagespeedMapOriginDomain origin_to_fetch_from origin_specified_in_html
Wildcards can also be used in the origin_specified_in_html, e.g.
ModPagespeedMapOriginDomain localhost *.example.com
By specifying a source domain in this directive, you are authorizing mod_pagespeed to rewrite resources found in that domain. For example, in the above directive, '*.example.com' is authorized for rewrites from HTML files, but 'localhost' is not. See ModPagespeedDomain.
When mod_pagespeed fetches resources from a mapped origin domain, it specifies the source domain in the Host: header in the request.
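As a minimal sketch (the hostnames are illustrative): given
ModPagespeedMapOriginDomain localhost www.example.com
a resource referenced in the HTML as http://www.example.com/styles/site.css would be fetched from the local server, with the request carrying the header "Host: www.example.com", so that a name-based virtual host configuration can still route the request to the right site.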
These directives can be used in .htaccess files and <Directory> scopes.

Mapping Rewrite Domains

When mod_pagespeed rewrites a resource, it updates the HTML to refer to the resource by its new name. Generally mod_pagespeed leaves the resource at the same origin and path that was originally found in the HTML. However, it is possible to map the domain of rewritten resources. Examples of why this might be desirable include:
  1. Serving static content from a cookieless domain, to reduce the size of HTTP requests from the browser (see Minimizing Payload).
  2. Moving content to a Content Delivery Network (CDN).
This is done using the configuration file directive:
ModPagespeedMapRewriteDomain domain_to_write_into_html domain_specified_in_html
Wildcards can also be used in the domain_specified_in_html, e.g.
ModPagespeedMapRewriteDomain cdn.example.com *example.com
Note: It is the responsibility of the site administrator to ensure that Apache httpd is installed with mod_pagespeed on the domain_to_write_into_html. This might be a separate server, or there might be a single server with multiple domains mapped into it. The files must be accessible via the same path on the destination server as was specified in the HTML file. No other files should be stored on the domain_to_write_into_html -- it should be functionally equivalent to domain_specified_in_html.
For example, if mod_pagespeed cache_extends http://www.example.com/styles/style.css to http://cdn.example.com/styles/style.css.pagespeed.ce.HASH.css, then cdn.example.com will have to have a mechanism in place to either rewrite that file in place, or refer back to the origin server to pull the rewritten content.
Note: It is the responsibility of the site administrator to ensure that moving resources onto other domains does not create a security vulnerability. In particular, if the target domain has cookies, then any JavaScript loaded from a resource moved to that domain will gain access to those cookies. In general, moving resources to a cookieless domain is a great way to improve security. Be aware that CSS can load JavaScript in certain environments.
By specifying a domain in this directive, either as source or destination, you are authorizing mod_pagespeed to rewrite resources found in that domain. See ModPagespeedDomain.
These directives can be used in .htaccess files and <Directory> scopes.

Sharding Domains

Best practices suggest minimizing round-trip times by parallelizing downloads across hostnames. mod_pagespeed can partially automate this for resources that it rewrites, using the directive:
ModPagespeedShardDomain domain_to_shard shard1,shard2,shard3...
Wildcards cannot be used in this directive.
This distributes the domains of rewritten URLs among the specified shards. For example:
ModPagespeedShardDomain example.com static1.example.com,static2.example.com
Using this directive, mod_pagespeed will distribute roughly half the resources rewritten from example.com into static1.example.com, and the rest to static2.example.com. You can specify as many shards as you like. The optimum number of shards is a topic of active research, and is browser-dependent. Configuring between 2 and 4 shards should yield good results. Changing the number of shards will cause mod_pagespeed to choose different names for resources, resulting in a partial cache flush.
When used in combination with ModPagespeedMapRewriteDomain, the rewrite mappings are applied first, and then the shard is selected. Origin domains are always tracked, so that when a browser sends a sharded URL back to the Apache server, mod_pagespeed can find the underlying resource.
Let's look at an example:
ModPagespeedShardDomain example.com static1.example.com,static2.example.com
  ModPagespeedMapRewriteDomain example.com www.example.com
  ModPagespeedMapOriginDomain localhost example.com
In this example, example.com and www.example.com are "tied" together via ModPagespeedMapRewriteDomain. The origin-mapping to localhost propagates automatically to www.example.com, static1.example.com, and static2.example.com. So when mod_pagespeed cache-extends an HTML stylesheet reference http://www.example.com/styles.css, it will be:
  1. Fetched by the server rewriting the HTML from localhost
  2. Rewritten to http://example.com/styles.css.pagespeed.ce.HASH.css
  3. Sharded to http://static1.example.com/styles.css.pagespeed.ce.HASH.css
Note: It is the responsibility of the site administrator to set up the shard entries in their DNS or CNAME configuration. Also, please see the note above about the servers for rewrite domains -- this applies to sharded domains as well: they must have access to the same content as the original domain.
By specifying a domain in this directive, either as source or destination, you are authorizing mod_pagespeed to rewrite resources found in that domain. See ModPagespeedDomain.
This directive can be used in .htaccess files and <Directory> scopes. However, you should be very careful about using ModPagespeedShardDomain in .htaccess files: to maximize browser-cache effectiveness, sharding should be consistent across an entire web site.

Configuring mod_pagespeed Filters

Rewriting Level

mod_pagespeed offers two default "levels" to simplify configuration: PassThrough and CoreFilters. The CoreFilters set contains filters that the mod_pagespeed team believes are safe for most web sites. If you use the CoreFilters set, your site will continue to get faster as mod_pagespeed is updated with new filters.
To disable the CoreFilters, you can specify
ModPagespeedRewriteLevel PassThrough
and then enable specific filters with the ModPagespeedEnableFilters directive. The default level is CoreFilters. The core set of filters currently comprises:
add_head
   combine_css
   extend_cache
   inline_css
   inline_javascript
   insert_img_dimensions
   rewrite_images
   trim_urls

Enabling and Disabling Specific Filters

To turn off specific filters in the core set, specify:
ModPagespeedDisableFilters filtera,filterb
For example, if you want to use the core set of filters, but specifically disable rewrite_images and combine_css, you can use:
ModPagespeedDisableFilters rewrite_images,combine_css
The ModPagespeedEnableFilters configuration file directive allows specification of one or more filters by name, separated by commas. You can use any number of ModPagespeedEnableFilters directives, each of which can contain multiple filter names separated by commas. For example:
ModPagespeedRewriteLevel PassThrough
    ModPagespeedEnableFilters combine_css,extend_cache,rewrite_images
    ModPagespeedEnableFilters rewrite_css,rewrite_javascript
The order of the directives in the configuration file is not important: the rewriters are run in the pre-defined order presented in the table:

Filter name                   In core set  Brief description
add_head                      Yes          Adds a <head> element to the document if not already present
combine_heads                 No           Combines multiple <head> elements found in the document into one
strip_scripts                 No           Removes all script tags from the document, to help run experiments
outline_css                   No           Externalizes large blocks of CSS into a cacheable file
outline_javascript            No           Externalizes large blocks of JavaScript into a cacheable file
move_css_to_head              No           Moves CSS elements into the <head>
combine_css                   Yes          Combines multiple CSS elements into one
rewrite_css                   No           Rewrites CSS files to remove excess whitespace and comments
make_google_analytics_async   No           Converts synchronous use of the Google Analytics API to asynchronous
rewrite_javascript            No           Rewrites JavaScript files to remove excess whitespace and comments
inline_css                    Yes          Inlines small CSS files into the HTML document
inline_javascript             Yes          Inlines small JavaScript files into the HTML document
rewrite_images                Yes          Optimizes images, re-encoding them, removing excess pixels, and inlining small images
insert_img_dimensions         Yes          Adds width/height attributes to <img> tags that lack them
remove_comments               No           Removes comments in HTML files (but not in inline JavaScript or CSS)
trim_urls                     Yes          Shortens URLs by making them relative to the base URL
collapse_whitespace           No           Removes excess whitespace in HTML files (avoiding <pre>, <script>, <style>, and <textarea>)
elide_attributes              No           Removes attributes that are not significant according to the HTML spec
extend_cache                  Yes          Extends the cache lifetime of all resources by signing URLs with a content hash
remove_quotes                 No           Removes quotes around HTML attributes that are not lexically required
add_instrumentation           No           Adds JavaScript to the page to measure latency and send it back to the server

Tuning the Filters

Once the rewriters are selected, some of them may also be tuned. These parameters control the inlining and outlining thresholds of various resources.
ModPagespeedCssInlineMaxBytes        2048
    ModPagespeedImgInlineMaxBytes        2048
    ModPagespeedJsInlineMaxBytes         2048
    ModPagespeedCssOutlineMinBytes       3000
    ModPagespeedJsOutlineMinBytes        3000
Note: The default settings are reasonable and intuitive, but as of this writing (February 2011) they have not been experimentally tuned.
These directives can be used in .htaccess files and <Directory> scopes.
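For example, to allow somewhat larger images to be inlined for one part of a site, a <Directory> block or .htaccess file could raise the image threshold (the value below is illustrative only):
ModPagespeedImgInlineMaxBytes        4096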

mod_pagespeed Installation and Configuration

Installation Tips

mod_pagespeed is available in binary form as a Debian package for Linux distributions such as Ubuntu, installable with dpkg. It is also available as an RPM package for CentOS or compatible Linux distributions.
You can browse or check out the source code in the open source repository.

Configuring the Module

mod_pagespeed contains an Apache "output filter" plus several content handlers.
Note: The location of the configuration file is dependent on the Linux distribution on which mod_pagespeed is installed.
On Debian/Ubuntu Linux distributions, the directory will be:
/etc/apache2/mods-available
On CentOS/Fedora, the directory will be:
/etc/httpd/conf.d
The mod_pagespeed configuration directives should be wrapped inside an IfModule:
<IfModule pagespeed_module>
....
</IfModule>
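A minimal sketch of such a configuration might look like the following (the LoadModule path is illustrative and varies by distribution; the cache paths are the suggested values from above, and the output filter line is described below):
LoadModule pagespeed_module /usr/lib/apache2/modules/mod_pagespeed.so
    <IfModule pagespeed_module>
      ModPagespeed on
      ModPagespeedFileCachePath           "/var/mod_pagespeed/cache/"
      ModPagespeedGeneratedFilePrefix     "/var/mod_pagespeed/files/"
      AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html
    </IfModule>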

Configuring Handlers

mod_pagespeed contains three handlers:
  1. A default handler to serve optimized resources
  2. mod_pagespeed_statistics: shows server statistics since startup, from which one can compute average latency, and thereby measure the effectiveness of various rewriting passes
  3. mod_pagespeed_beacon: part of the infrastructure we provide for measuring page latency.
The following settings for the handlers can be used as a guideline:
# Uncomment the following line if you want to disable statistics entirely.
    # ModPagespeedStatistics off

    # This page shows statistics about the mod_pagespeed module.
    <Location /mod_pagespeed_statistics>
        Order allow,deny
        # One may insert other "Allow from" lines to add hosts that are
        # allowed to look at generated statistics.  Another possibility is
        # to comment out the "Order" and "Allow" options from the config
        # file, to allow any client that can reach the server to examine
        # statistics.  This might be appropriate in an experimental setup or
        # if the Apache server is protected by a reverse proxy that will
        # filter URLs to avoid exposing these statistics, which may
        # reveal site metrics that should not be shared otherwise.
        Allow from localhost
        SetHandler mod_pagespeed_statistics
    </Location>

    # This handles the client-side instrumentation callbacks which are injected
    # by the add_instrumentation filter.
    <Location /mod_pagespeed_beacon>
          SetHandler mod_pagespeed_beacon
    </Location>

Setting up the Output Filter

The output filter is used to parse, optimize, and re-serialize HTML content that is generated elsewhere in the Apache server.
# Direct Apache to send all HTML output to the mod_pagespeed output handler.
    AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html
Note: mod_pagespeed automatically enables mod_deflate for compression.

Turning the module on and off

Turning OFF mod_pagespeed

To turn off mod_pagespeed completely, insert as the top line of pagespeed.conf:
ModPagespeed off
These directives can be used in .htaccess files and <Directory> scopes.

Turning ON mod_pagespeed

To turn mod_pagespeed ON, insert as the top line of pagespeed.conf:
ModPagespeed on

Lower-casing HTML element and attribute names

Note: New feature as of 0.9.16.1
HTML is case-insensitive, whereas XML and XHTML are not. Web performance best practices suggest using lowercase keywords, and mod_pagespeed can safely make that transformation in HTML documents.
In general, mod_pagespeed determines whether a document is HTML via the Content-Type HTTP header and the DOCTYPE declaration. However, many web sites serve Content-Type: text/html for resources that are actually XML documents.
If mod_pagespeed lowercases keywords in XML pages, it can break the consumers of such pages, such as Flash. To be conservative and avoid breaking such pages, mod_pagespeed does not lowercase HTML element and attribute names by default. However, you can sometimes achieve a modest improvement in the size of compressed HTML by enabling this feature with:
ModPagespeedLowercaseHtmlNames on
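For illustration, with this option enabled, markup written as <DIV ID="menu"> would be emitted as <div id="menu">; only the element and attribute names are lower-cased, not the attribute values.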
These directives can be used in .htaccess files and <Directory> scopes.

Risks

This switch is only risky in the presence of XML files that are incorrectly served with Content-Type: text/html. Lower-casing XML element and attribute names may affect whatever software is reading the XML.

.htaccess files and Directory scopes

The .htaccess file can be used to control most of the directives in mod_pagespeed. This is functionally equivalent to specifying mod_pagespeed directives in a <Directory> scope. Note, however, that the file-matching implied by the <Directory> scope, or the directory of the .htaccess file, is only relevant to the HTML file, and not to any of the resources referenced from the HTML file. To restrict resources by directory, you must use the ModPagespeedAllow and ModPagespeedDisallow directives described above, using full paths or wildcards in those directives.
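For example, a (hypothetical) .htaccess file for a /blog/ section that should only have its own HTML and images rewritten could use full-URL wildcards:
ModPagespeedDisallow *
  ModPagespeedAllow http://www.example.com/blog/*.html
  ModPagespeedAllow http://www.example.com/blog/images/*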

Directives that cannot be used with .htaccess and <Directory> scope

ModPagespeedFetcherTimeoutMs
ModPagespeedFileCacheCleanIntervalMs
ModPagespeedFileCachePath
ModPagespeedFileCacheSizeKb
ModPagespeedGeneratedFilePrefix
ModPagespeedLRUCacheByteLimit
ModPagespeedLRUCacheKbPerProcess

Directives that can be used with .htaccess and <Directory> scope

ModPagespeed
ModPagespeedAllow
ModPagespeedBeaconUrl
ModPagespeedCombineAcrossPaths
ModPagespeedCssInlineMaxBytes
ModPagespeedCssOutlineMinBytes
ModPagespeedDisableFilters
ModPagespeedDisallow
ModPagespeedDomain
ModPagespeedEnableFilters
ModPagespeedImgInlineMaxBytes
ModPagespeedJsInlineMaxBytes
ModPagespeedJsOutlineMinBytes
ModPagespeedLowercaseHtmlNames
ModPagespeedMapOriginDomain
ModPagespeedMapRewriteDomain
ModPagespeedRewriteLevel
The advantage of .htaccess is that it can be used in environments where the site administrator does not have access to the Apache configuration. However, there is a significant per-request overhead from processing .htaccess files. See The Apache HTTP Server Documentation:
Note: You should avoid using .htaccess files completely if you have access to httpd main server config file. Using .htaccess files slows down your Apache server. Any directive that you can include in a .htaccess file is better set in a <Directory> block, as it will have the same effect with better performance.
Another mechanism available to configure mod_pagespeed for multiple distinct sites is VirtualHost.