Friday, March 11, 2011

mod_pagespeed Installation and Configuration

Installation Tips

mod_pagespeed is available in binary form as a Debian package for Linux distributions such as Ubuntu, installable with dpkg. It is also available as an RPM package for CentOS or compatible Linux distributions.
You can browse or check out the source code in the open source repository.

Configuring the Module

mod_pagespeed contains an Apache "output filter" plus several content handlers.
Note: The location of the configuration file is dependent on the Linux distribution on which mod_pagespeed is installed.
On Debian/Ubuntu Linux distributions, the directory will be:
/etc/apache2/mods-available
On CentOS/Fedora, the directory will be:
/etc/httpd/conf.d
The mod_pagespeed configuration directives should be wrapped inside an IfModule:
<IfModule pagespeed_module>
....
</IfModule>

Configuring Handlers

mod_pagespeed contains three handlers:
  1. A default handler to serve optimized resources
  2. mod_pagespeed_statistics: shows server statistics since startup, from which one can compute average latency, and thereby measure the effectiveness of various rewriting passes
  3. mod_pagespeed_beacon: part of the infrastructure we provide for measuring page latency.
The following settings for the handlers can be used as a guideline:
    # Uncomment the following line if you want to disable statistics entirely.
    # ModPagespeedStatistics off

    # This page shows statistics about the mod_pagespeed module.
    <Location /mod_pagespeed_statistics>
        Order allow,deny
        # One may insert other "Allow from" lines to add hosts that are
        # allowed to look at generated statistics.  Another possibility is
        # to comment out the "Order" and "Allow" options from the config
        # file, to allow any client that can reach the server to examine
        # statistics.  This might be appropriate in an experimental setup or
        # if the Apache server is protected by a reverse proxy that will
        # filter URLs to avoid exposing these statistics, which may
        # reveal site metrics that should not be shared otherwise.
        Allow from localhost
        SetHandler mod_pagespeed_statistics
    </Location>

    # This handles the client-side instrumentation callbacks which are injected
    # by the add_instrumentation filter.
    <Location /mod_pagespeed_beacon>
        SetHandler mod_pagespeed_beacon
    </Location>

Setting up the Output Filter

The output filter is used to parse, optimize, and re-serialize HTML content that is generated elsewhere in the Apache server.
    # Direct Apache to send all HTML output to the mod_pagespeed output handler.
    AddOutputFilterByType MOD_PAGESPEED_OUTPUT_FILTER text/html
Note: mod_pagespeed automatically enables mod_deflate for compression.

Turning the module on and off

Turning OFF mod_pagespeed

To turn off mod_pagespeed completely, insert as the top line of pagespeed.conf:
ModPagespeed off
This directive can be used in .htaccess files and <Directory> scopes.
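
As a rough sketch of the scoped usage, the fragment below leaves mod_pagespeed on for the server as a whole but turns it off for a single directory. The path /var/www/html/legacy is a hypothetical example, not something a default install provides.

```apache
<IfModule pagespeed_module>
    # Server-wide default.
    ModPagespeed on

    # Hypothetical directory; substitute a real path on your server.
    <Directory "/var/www/html/legacy">
        ModPagespeed off
    </Directory>
</IfModule>
```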

Turning ON mod_pagespeed

To turn mod_pagespeed ON, insert as the top line of pagespeed.conf:
ModPagespeed on

Lower-casing HTML element and attribute names

Note: New feature as of 0.9.16.1
HTML is case-insensitive, whereas XML and XHTML are not. Web performance best practices suggest using lowercase keywords, and mod_pagespeed can safely make that transformation in HTML documents.
In general, mod_pagespeed determines whether a document is HTML from the Content-Type header in the HTTP response and from the DOCTYPE. However, many web sites serve resources that are actually XML documents with Content-Type: text/html.
If mod_pagespeed lowercases keywords in XML pages, it can break the consumers of such pages, such as Flash. To be conservative and avoid breaking such pages, mod_pagespeed does not lowercase HTML element and attribute names by default. However, you can sometimes achieve a modest improvement in the size of compressed HTML by enabling this feature with:
ModPagespeedLowercaseHtmlNames on
This directive can be used in .htaccess files and <Directory> scopes.
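
For instance, a .htaccess file placed in a directory known to serve only genuine HTML might contain just the following. This is a sketch; it assumes the server's AllowOverride setting lets .htaccess files set mod_pagespeed directives for that directory.

```apache
# .htaccess sketch: enable lower-casing for this directory only.
<IfModule pagespeed_module>
    ModPagespeedLowercaseHtmlNames on
</IfModule>
```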

Risks

This switch is only risky in the presence of XML files that are incorrectly served with Content-Type: text/html. Lower-casing XML element and attribute names may affect whatever software is reading the XML.

.htaccess files and Directory scopes

The .htaccess file can be used to control most of the directives in mod_pagespeed. This is functionally equivalent to specifying mod_pagespeed directives in a <Directory> scope. Note, however, that the file-matching implied by the <Directory> scope, or the directory of the .htaccess file, is only relevant to the HTML file, and not to any of the resources referenced from the HTML file. To restrict resources by directory, you must use the ModPagespeedAllow and ModPagespeedDisallow directives, using full paths or wildcards in those directives.
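
To illustrate the last point, here is a minimal sketch that keeps resources under particular paths from being rewritten. The domain and paths are hypothetical; the patterns are wildcards matched against resource URLs.

```apache
<IfModule pagespeed_module>
    # Hypothetical pattern: leave everything under /private/ untouched.
    ModPagespeedDisallow http://www.example.com/private/*
    # A wildcard can also stand in for the host part of the URL.
    ModPagespeedDisallow */third_party/*.js
</IfModule>
```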

Directives that cannot be used with .htaccess and <Directory> scope

ModPagespeedFetcherTimeoutMs
ModPagespeedFileCacheCleanIntervalMs
ModPagespeedFileCachePath
ModPagespeedFileCacheSizeKb
ModPagespeedGeneratedFilePrefix
ModPagespeedLRUCacheByteLimit
ModPagespeedLRUCacheKbPerProcess
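
Since these directives are not accepted in scoped contexts, they would normally sit at the top level of the mod_pagespeed configuration file. A sketch follows; the cache path and the numeric values are illustrative assumptions, not recommendations.

```apache
<IfModule pagespeed_module>
    # Illustrative values only; adjust to your disk space and traffic.
    ModPagespeedFileCachePath            "/var/mod_pagespeed/cache/"
    ModPagespeedFileCacheSizeKb          102400
    ModPagespeedFileCacheCleanIntervalMs 3600000
</IfModule>
```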

Directives that can be used with .htaccess and <Directory> scope

ModPagespeed
ModPagespeedAllow
ModPagespeedBeaconUrl
ModPagespeedCombineAcrossPaths
ModPagespeedCssInlineMaxBytes
ModPagespeedCssOutlineMinBytes
ModPagespeedDisableFilters
ModPagespeedDisallow
ModPagespeedDomain
ModPagespeedEnableFilters
ModPagespeedImgInlineMaxBytes
ModPagespeedJsInlineMaxBytes
ModPagespeedJsOutlineMinBytes
ModPagespeedLowercaseHtmlNames
ModPagespeedMapOriginDomain
ModPagespeedMapRewriteDomain
ModPagespeedRewriteLevel
The advantage of .htaccess is that it can be used in environments where the site administrator does not have access to the Apache configuration. However, there is a significant per-request overhead from processing .htaccess files. See The Apache HTTP Server Documentation:
Note: You should avoid using .htaccess files completely if you have access to httpd main server config file. Using .htaccess files slows down your Apache server. Any directive that you can include in a .htaccess file is better set in a <Directory> block, as it will have the same effect with better performance.
Another mechanism available to configure mod_pagespeed for multiple distinct sites is VirtualHost.
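
A hedged sketch of that approach: two name-based virtual hosts, one with mod_pagespeed on and one with it off. The host names and document roots are hypothetical, and the sketch assumes your mod_pagespeed version honors ModPagespeed inside a <VirtualHost> block.

```apache
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"
    <IfModule pagespeed_module>
        ModPagespeed on
    </IfModule>
</VirtualHost>

<VirtualHost *:80>
    ServerName static.example.com
    DocumentRoot "/var/www/static"
    <IfModule pagespeed_module>
        ModPagespeed off
    </IfModule>
</VirtualHost>
```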
