Resin 4.0: Server Caching


    Server caching can speed dynamic pages to near-static speeds. When pages created by database queries only change every 15 minutes, e.g. on CNN, Wikipedia, or Slashdot, Resin can cache the results and serve them like static pages. Because Resin's caching depends only on HTTP headers, it works for any JSP, servlet, or PHP response.

    Resin's caching operates like a proxy cache, looking at HTTP headers to compare hash codes or digests or simply caching for a static amount of time. Since the proxy cache follows the HTTP standards, applications like Mediawiki will automatically see dramatic performance improvement with no extra work. You can even cache REST-style GET requests.

    Because the cache supports advanced headers like "Vary", it can cache different versions of the page depending on the browser's capabilities. Gzip-enabled browsers will get the cached compressed page while more primitive browsers will see the plain page. With "Vary: Cookie", you can return a cached page for anonymous users, and still return a custom page for logged-in users.

    Overview

    For many applications, enabling the proxy cache can improve your application's performance dramatically. When Quercus runs Mediawiki with caching enabled, Resin can return results as fast as static pages. Without caching, the performance is significantly slower.

    Mediawiki Performance
    CACHE PERFORMANCE      NON-CACHE PERFORMANCE
    4316 requests/sec      29.80 requests/sec

    To enable caching, your application will need to set a few HTTP headers. While a simple application can just set a timed cache with max-age, a more sophisticated application can generate page hash digests with ETag and short-circuit repeated If-None-Match requests.

    If subsections of your pages are cacheable but the main page is not, you can cache servlet includes just like caching top-level pages. Resin's include processing will examine the headers set by your include servlet and treat them just like a top-level cache.

    HTTP Caching Headers

    HTTP Server to Client Headers
    HEADER                                DESCRIPTION
    Cache-Control: private                Restricts caching to the browser only, forbidding proxy-caching.
    Cache-Control: max-age=n              Specifies a static cache time in seconds for both the browser and proxy cache.
    Cache-Control: s-maxage=n             Specifies a static cache time in seconds for the proxy cache only.
    Cache-Control: no-cache               Disables caching entirely.
    ETag: hash or identifier              Unique identifier for the page's version. Hash-based values are better than date/versioning, especially in clustered configurations.
    Last-Modified: time of modification   Accepted by Resin's cache, but not recommended in clustered configurations.
    Vary: header-name                     Caches a separate version for each value of the named client header, e.g. Cookie or Accept-Encoding.

    HTTP Client to Server Headers
    HEADER                                DESCRIPTION
    If-None-Match                         Specifies the ETag value for the page
    If-Modified-Since                     Specifies the Last-Modified value for the page
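
    The Cache-Control rows above can be read as a small decision procedure. The following is a minimal sketch of how a shared (proxy) cache might interpret the header, not Resin's actual implementation. The class and method names are hypothetical, and the rule that s-maxage takes precedence over max-age for a proxy cache follows the HTTP caching standard.

```java
import java.util.Locale;
import java.util.OptionalLong;

/** Illustrative only: reads Cache-Control the way a proxy cache might. */
final class CacheControlSketch {

  /**
   * Returns how many seconds a proxy cache may keep the response, or
   * empty if proxy caching is forbidden or no lifetime is given.
   */
  static OptionalLong proxyLifetimeSeconds(String cacheControl) {
    if (cacheControl == null) {
      return OptionalLong.empty();
    }

    long maxAge = -1;
    long sMaxAge = -1;

    for (String part : cacheControl.split(",")) {
      String directive = part.trim().toLowerCase(Locale.ROOT);

      if (directive.equals("private") || directive.equals("no-cache")) {
        // browser-only or uncacheable: the proxy cache must not store it
        return OptionalLong.empty();
      } else if (directive.startsWith("s-maxage=")) {
        sMaxAge = parseSeconds(directive.substring("s-maxage=".length()));
      } else if (directive.startsWith("max-age=")) {
        maxAge = parseSeconds(directive.substring("max-age=".length()));
      }
    }

    // s-maxage applies to the proxy cache only and wins when present
    long lifetime = sMaxAge >= 0 ? sMaxAge : maxAge;
    return lifetime >= 0 ? OptionalLong.of(lifetime) : OptionalLong.empty();
  }

  private static long parseSeconds(String value) {
    try {
      return Long.parseLong(value.trim());
    } catch (NumberFormatException e) {
      return -1; // malformed value such as "15s": treated as absent
    }
  }

  public static void main(String[] args) {
    System.out.println(proxyLifetimeSeconds("max-age=15"));
    System.out.println(proxyLifetimeSeconds("max-age=60, s-maxage=600"));
    System.out.println(proxyLifetimeSeconds("private, max-age=60"));
  }
}
```

    The value must be a bare number of seconds. A unit suffix such as "15s" is not valid in Cache-Control, which is why the sketch treats it as missing.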

    Cache-Control: max-age

    Setting the max-age header will cache the results for a specified time. For heavily loaded pages, even setting short expires times can significantly improve performance. Pages using sessions should set a "Vary: Cookie" header, so anonymous users will see the cached page, while logged-in users will see their private page.

    Example: 15s cache
    <%@ page session="false" %>
    <%! int counter; %>
    <%
    response.addHeader("Cache-Control", "max-age=15");
    %>
    Count: <%= counter++ %>
    

    max-age is useful for database-generated pages that are updated continuously but slowly. To cache fixed content, i.e. something which has a valid hash value like a file, you can use ETag with If-None-Match.

    ETag and If-None-Match

    The ETag header specifies a hash or digest code for the generated page to further improve caching. The browser or cache will send the ETag as an If-None-Match value when it checks for any page updates. If the page is the same, the application will return a 304 Not Modified response with an empty body. Resin's FileServlet automatically provides this capability for static pages. In general, the ETag is the most effective caching technique, although it requires a bit more work than max-age.

    To handle clustered servers in a load-balanced configuration, the calculated ETag should be a hash of the result value, not a timestamp or version. Since each server behind a load balancer will generate a different timestamp for the files, each server would produce a different tag, even though the generated content was identical. So either producing a hash or ensuring the ETag value is the same is critical.
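
    A content hash can be computed with the standard Java library. The following is a minimal sketch, assuming the generated page is available as a byte array before it is written. The class name, the method name, and the choice of SHA-256 with URL-safe Base64 are illustrative assumptions, not something Resin prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Illustrative only: derives an ETag value from the response content. */
final class ContentEtagSketch {

  /** Identical bytes give the same tag on every server in the cluster. */
  static String etagFor(byte[] body) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(body);

      // URL-safe Base64 without padding avoids '+', '/' and '=' in the tag
      return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this is unexpected
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    byte[] page = "Count: 1".getBytes(StandardCharsets.UTF_8);
    System.out.println(etagFor(page));
  }
}
```

    Because the tag depends only on the bytes, two backend servers that generate identical content produce identical tags, whatever their file timestamps are.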

    ETag servlets will often also use <cache-mapping> configuration to set a max-age or s-maxage. The browser and proxy cache will cache the page without revalidation until max-age runs out. When the time expires, they will use If-None-Match to revalidate the page.

    When using ETag, your application will need to look for the If-None-Match header on incoming requests. If the value is the same, your servlet can return 304 Not Modified. If the value differs, you'll return the new content and hash.

    Example: ETag servlet
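    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;
    
    public class MyServlet extends HttpServlet
    {
      @Override
      public void doGet(HttpServletRequest req, HttpServletResponse res)
        throws ServletException, IOException
      {
        // getCurrentEtag() is application code: a hash of the current content
        String etag = getCurrentEtag();
    
        // the client sends back the ETag it has cached, if any
        String ifNoneMatch = req.getHeader("If-None-Match");
    
        if (ifNoneMatch != null && ifNoneMatch.equals(etag)) {
          // content unchanged: reply 304 with an empty body
          res.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
          return;
        }
    
        // new or changed content: send the current ETag with the full response
        res.setHeader("ETag", etag);
    
        ... // generate response
      }
    }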
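
    The servlet above compares the header with a single exact value. HTTP also allows If-None-Match to carry a comma-separated list of tags, weak tags prefixed with W/, or a lone "*". The following is a hedged sketch of a more tolerant comparison. The class and method names are hypothetical, and the comma split is a simplification that would mishandle a tag containing a comma.

```java
/** Illustrative only: tolerant If-None-Match comparison. */
final class IfNoneMatchSketch {

  /** True when any tag listed in the header matches the current ETag. */
  static boolean matches(String ifNoneMatch, String currentEtag) {
    if (ifNoneMatch == null || currentEtag == null) {
      return false;
    }

    String current = normalize(currentEtag);

    for (String candidate : ifNoneMatch.split(",")) {
      String tag = candidate.trim();

      // "*" matches any current representation of the page
      if (tag.equals("*") || normalize(tag).equals(current)) {
        return true;
      }
    }

    return false;
  }

  /** Drops the weak prefix and surrounding quotes before comparing. */
  private static String normalize(String tag) {
    String value = tag.trim();

    if (value.startsWith("W/")) {
      value = value.substring(2);
    }

    if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
      value = value.substring(1, value.length() - 1);
    }

    return value;
  }

  public static void main(String[] args) {
    System.out.println(matches("xm8vz29I", "xm8vz29I"));
    System.out.println(matches("UXi456Ww", "xM81x+3j"));
  }
}
```

    In the servlet, the equals() test could be replaced by a call such as IfNoneMatchSketch.matches(ifNoneMatch, etag).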
    
    Example: HTTP headers for ETag match
    C: GET /test-servlet HTTP/1.1
    
    S: HTTP/1.1 200 OK
    S: ETag: xm8vz29I
    S: Cache-Control: max-age=15
    S: ...
    
    C: GET /test-servlet HTTP/1.1
    C: If-None-Match: xm8vz29I
    
    S: HTTP/1.1 304 Not Modified
    S: Cache-Control: max-age=15
    S: ...
    
    Example: HTTP headers for ETag mismatch
    C: GET /test-servlet HTTP/1.1
    C: If-None-Match: UXi456Ww
    
    S: HTTP/1.1 200 OK
    S: ETag: xM81x+3j
    S: Cache-Control: max-age=15
    S: ...
    

    Expires

    Although max-age tends to be easier and more flexible, an application can also set the Expires header to enable caching, when the expiration date is a specific time instead of an interval. For heavily loaded pages, even setting short expires times can significantly improve performance. Sessions should be disabled for caching.

    The following example sets the expiration to 15 seconds from now, so the counter should update at most once every 15 seconds.

    Example: expires
    <%@ page session="false" %>
    <%! int counter; %>
    <%
    long now = System.currentTimeMillis();
    response.setDateHeader("Expires", now + 15000);
    %>
    Count: <%= counter++ %>
    

    Expires is useful for database-generated pages that are updated continuously but slowly. To cache fixed content, i.e. something which has a valid hash value like a file, you can use ETag with If-None-Match.
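
    The servlet API's setDateHeader converts the millisecond timestamp into the HTTP date format for you. To show what ends up on the wire, here is a small sketch that produces the same kind of header text with java.time. The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Illustrative only: formats an Expires value as an HTTP date. */
final class ExpiresHeaderSketch {

  // HTTP dates are always expressed in GMT with English day and month names
  private static final DateTimeFormatter HTTP_DATE =
      DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
          .withZone(ZoneOffset.UTC);

  /** Returns the header text for "now plus lifetime", both in milliseconds. */
  static String expiresValue(long nowMillis, long lifetimeMillis) {
    return HTTP_DATE.format(Instant.ofEpochMilli(nowMillis + lifetimeMillis));
  }

  public static void main(String[] args) {
    // 15 seconds after the epoch: prints "Thu, 01 Jan 1970 00:00:15 GMT"
    System.out.println(expiresValue(0L, 15000L));

    // 15 seconds from the current time, as in the JSP example
    System.out.println(expiresValue(System.currentTimeMillis(), 15000L));
  }
}
```

    Because Expires is an absolute time, it depends on the server clock being reasonably accurate. A max-age interval does not have that dependency.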

    If-Modified-Since

    The If-Modified-Since header lets you cache based on an underlying change date. For example, the page may only change when an underlying source page changes. Resin lets you easily use If-Modified-Since by overriding methods in HttpServlet or in a JSP page.

    Because of the clustering issues mentioned in the ETag section, it's generally recommended to use ETag and If-None-Match and avoid If-Modified-Since. In a load-balanced environment, each backend server would generally have a different Last-Modified value, which would effectively disable caching for a proxy cache or a browser that switched from one backend server to another.

    The following page only changes when the underlying 'test.xml' page changes.

    <%@ page session="false" import="java.io.File" %>
    <%!
    int counter;
    
    public long getLastModified(HttpServletRequest req)
    {
      String path = req.getRealPath("test.xml");
      return new File(path).lastModified();
    }
    %>
    Count: <%= counter++ %>
    

    If-Modified-Since pages are useful in combination with the cache-mapping configuration.
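
    The comparison behind a 304 response is simple but has one trap: HTTP dates carry whole seconds, while file timestamps are in milliseconds. The following sketch shows the check under that assumption. The class and method names are hypothetical, and -1 stands for a missing header, which is what HttpServletRequest.getDateHeader returns in that case.

```java
/** Illustrative only: decides whether If-Modified-Since allows a 304. */
final class IfModifiedSinceSketch {

  /**
   * @param lastModifiedMillis    change time of the underlying resource
   * @param ifModifiedSinceMillis header value in milliseconds, or -1 if absent
   */
  static boolean isNotModified(long lastModifiedMillis,
                               long ifModifiedSinceMillis) {
    if (ifModifiedSinceMillis < 0) {
      return false; // no header: always send the full page
    }

    // compare in whole seconds, the resolution of an HTTP date
    long lastModifiedSeconds = lastModifiedMillis / 1000;
    long sinceSeconds = ifModifiedSinceMillis / 1000;

    return lastModifiedSeconds <= sinceSeconds;
  }

  public static void main(String[] args) {
    System.out.println(isNotModified(1_000_500L, 1_000_000L));
    System.out.println(isNotModified(2_000_000L, 1_000_000L));
  }
}
```

    Without the truncation, a file modified at 1000.5 seconds would never match a header value of 1000 seconds, and the page would be regenerated on every request.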

    Vary

    In some cases, you'll want to have separate cached pages for the same URL depending on the capabilities of the browser. Using gzip compression is the most important example. Browsers which can understand gzip-compressed files receive the compressed page while simple browsers will see the uncompressed page. Using the "Vary" header, Resin can cache different versions of that page.

    Example: Vary caching on gzip
    <%
      response.addHeader("Cache-Control", "max-age=3600");
      response.addHeader("Vary", "Accept-Encoding");
    %>
    
    Accept-Encoding: <%= request.getHeader("Accept-Encoding") %>
    

    The "Vary" header can be particularly useful for caching anonymous pages, i.e. using "Vary: Cookie". Logged-in users will get their custom pages, while anonymous users will see the cached page.
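
    One way to picture Vary is as an extension of the cache key: the URL alone no longer identifies the cached page, and the values of the listed request headers are added to it. The following sketch illustrates that idea only. It is not how Resin stores entries, and the class name, method name, and key layout are assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: builds a cache key that honors the Vary header. */
final class VaryKeySketch {

  /**
   * @param url            the request URL
   * @param vary           the response's Vary header, or null
   * @param requestHeaders the request headers, name to value (not null)
   */
  static String cacheKey(String url, String vary,
                         Map<String, String> requestHeaders) {
    // header names are case-insensitive in HTTP
    Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    headers.putAll(requestHeaders);

    StringBuilder key = new StringBuilder(url);

    if (vary != null) {
      for (String name : vary.split(",")) {
        String header = name.trim().toLowerCase(Locale.ROOT);

        if (header.isEmpty()) {
          continue;
        }

        String value = headers.get(header);
        key.append('\n').append(header).append('=');
        key.append(value == null ? "" : value);
      }
    }

    return key.toString();
  }

  public static void main(String[] args) {
    Map<String, String> gzipBrowser = new HashMap<>();
    gzipBrowser.put("Accept-Encoding", "gzip");

    Map<String, String> plainBrowser = new HashMap<>();

    String a = cacheKey("/page.jsp", "Accept-Encoding", gzipBrowser);
    String b = cacheKey("/page.jsp", "Accept-Encoding", plainBrowser);

    System.out.println(a.equals(b)); // false: two separate cache entries
  }
}
```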

    Included Pages

    Resin can cache subpages even when the top page can't be cached. Sites allowing user personalization will often design pages with jsp:include subpages. Some subpages are user-specific and can't be cached. Others are common to everybody and can be cached.

    Resin treats subpages as independent requests, so they can be cached independently of the top-level page. To try it, save the cached counter example below as expires.jsp and create a top-level page that includes it:

    Example: top-level non-cached page
    <% if (! session.isNew()) { %>
    <h1>Welcome back!</h1>
    <% } %>
    
    <jsp:include page="expires.jsp"/>
    
    Example: cached include page
    <%@ page session="false" %>
    <%! int counter; %>
    <%
    response.setHeader("Cache-Control", "max-age=15");
    %>
    Count: <%= counter++ %>
    

    Caching Anonymous Users

    The Vary header can be used to implement anonymous user caching. A user who is not logged in gets a cached page. A logged-in user gets their own page. This feature will not work if anonymous users are assigned cookies for tracking purposes.

    To make anonymous caching work, you must set the Vary: Cookie header. If you omit the Vary header, Resin will use the max-age to cache the same page for every user.

    Example: 'Vary: Cookie' for anonymous users
    <%@ page session="false" %>
    <%! int counter; %>
    <%
    response.addHeader("Cache-Control", "max-age=15");
    response.addHeader("Vary", "Cookie");
    
    String user = request.getParameter("user");
    %>
    User: <%= user %> <%= counter++ %>
    

    The top page must still set the max-age or If-Modified header, but Resin will take care of deciding if the page is cacheable or not. If the request has any cookies, Resin will not cache it and will not use the cached page. If it has no cookies, Resin will use the cached page.

    When using Vary: Cookie, user tracking cookies will make the page uncacheable even if the page is the same for all users. Resin chooses to cache or not based on the existence of any cookies in the request, whether they're used or not.
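
    The rule described above can be written down in a few lines. This is a sketch of the decision as the text states it, not Resin's code, and the class and method names are hypothetical.

```java
import java.util.Locale;

/** Illustrative only: the anonymous-user rule for "Vary: Cookie". */
final class AnonymousCacheSketch {

  /**
   * @param varyHeader   the page's Vary header, or null if it sets none
   * @param cookieHeader the request's Cookie header, or null if absent
   * @return true if the shared cached page may be used for this request
   */
  static boolean mayServeFromCache(String varyHeader, String cookieHeader) {
    if (!variesOnCookie(varyHeader)) {
      // without Vary: Cookie the same cached page goes to every user
      return true;
    }

    // any cookie at all, used or not, bypasses the cache
    return cookieHeader == null || cookieHeader.trim().isEmpty();
  }

  private static boolean variesOnCookie(String varyHeader) {
    if (varyHeader == null) {
      return false;
    }

    for (String name : varyHeader.split(",")) {
      if (name.trim().toLowerCase(Locale.ROOT).equals("cookie")) {
        return true;
      }
    }

    return false;
  }

  public static void main(String[] args) {
    System.out.println(mayServeFromCache("Cookie", null));
    System.out.println(mayServeFromCache("Cookie", "JSESSIONID=abc"));
    System.out.println(mayServeFromCache("Cookie", "tracker=1"));
  }
}
```

    The third call shows the tracking-cookie problem: the cookie has nothing to do with login state, yet the request still bypasses the cache.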

    Configuration

    cache

    child of cluster

    <cache> configures the proxy cache (requires Resin Professional). The proxy cache improves performance by caching the output of servlets, JSP pages, and PHP pages. For database-heavy pages, this caching can improve performance and reduce database load by several orders of magnitude.

    The proxy cache uses a combination of a memory cache and a disk-based cache to save large amounts of data with little overhead.

    Management of the proxy cache uses the ProxyCacheMXBean.

    <cache> Attributes
    ATTRIBUTE                 DEFAULT   DESCRIPTION
    path                      cache/    Path to the persistent cache files.
    disk-size                 1024M     Maximum size of the cache saved on disk.
    enable                    true      Enables the proxy cache.
    enable-range              true      Enables support for the HTTP Range header.
    entries                   8192      Maximum number of pages stored in the cache.
    max-entry-size            1M        Largest page size allowed in the cache.
    memory-size               8M        Maximum heap memory used to cache blocks.
    rewrite-vary-as-private   false     Rewrite Vary headers as Cache-Control: private to avoid browser and proxy-cache bugs (particularly IE).

    <cache> schema
    element cache {
      disk-size?
      & enable?
      & enable-range?
      & entries?
      & path?
      & max-entry-size?
      & memory-size?
      & rewrite-vary-as-private?
    }
    
    Example: enabling proxy cache
    <resin xmlns="http://caucho.com/ns/resin">
        <cluster id="web-tier">
            <cache entries="16384" disk-size="2G" memory-size="256M"/>
    
            <server id="a" address="192.168.0.10"/>
    
            <host host-name="www.foo.com"/>
        </cluster>
    </resin>
    

    rewrite-vary-as-private

    Because not all browsers understand the Vary header, Resin can rewrite Vary to a Cache-Control: private. This rewriting will cache the page with the Vary in Resin's proxy cache, and also cache the page in the browser. Any other proxy caches, however, will not be able to cache the page.

    The underlying issue is a limitation of browsers such as IE. When IE sees a Vary header it doesn't understand, it marks the page as uncacheable. Since IE only understands "Vary: User-Agent", this would mean IE would refuse to cache gzipped pages or "Vary: Cookie" pages.

    With the <rewrite-vary-as-private> tag, IE will cache the page since it's rewritten as "Cache-Control: private" with no Vary at all. Resin will continue to cache the page as normal.

    cache-mapping

    cache-mapping assigns a max-age and Expires to a cacheable page, i.e. a page with an ETag or Last-Modified setting. It does not affect pages that already set max-age or Expires themselves. The FileServlet takes advantage of cache-mapping because it provides ETag support for static files.

    Often, you want a long Expires time for a page to a browser. For example, any gif will not change for 24 hours. That keeps browsers from asking for the same gif every five seconds; that's especially important for tiny formatting gifs. However, as soon as that page or gif changes, you want the change immediately available to any new browser or to a browser using reload.

    Here's how you would set the Expires to 24 hours for a gif, based on the default FileServlet.

    Example: caching .gif files for 24h
    <web-app xmlns="http://caucho.com/ns/resin">
    
      <cache-mapping url-pattern='*.gif'
                     expires='24h'/>
    </web-app>
    

    The cache-mapping automatically generates the Expires header. It only works for cacheable pages that set Last-Modified or ETag. It will not affect pages explicitly setting Expires or non-cacheable pages. So it's safe to create a cache-mapping for *.jsp even if only some are cacheable.

    Debugging caching

    When designing and testing your cached page, it's important to see how Resin is caching the page. To turn on logging for caching, you'll add the following to your resin.xml:

    Example: adding caching log
    <resin xmlns="http://caucho.com/ns/resin">
    
      <logger name="com.caucho.server.cache" level="fine"/>
    
      ...
    
    </resin>
    

    The output will look something like the following:

    [10:18:11.369] caching: /images/caucho-white.jpg etag="AAAAPbkEyoA" length=6190
    [10:18:11.377] caching: /images/logo.gif etag="AAAAOQ9zLeQ" length=571
    [10:18:11.393] caching: /css/default.css etag="AAAANzMooDY" length=1665
    [10:18:11.524] caching: /images/pixel.gif etag="AAAANpcE4pY" length=61
    
    ...
    
    [10:18:49.303] using cache: /css/default.css
    [10:18:49.346] using cache: /images/pixel.gif
    [10:18:49.348] using cache: /images/caucho-white.jpg
    [10:18:49.362] using cache: /images/logo.gif
    

    Administration

    /resin-admin

    block cache miss ratio

    The block cache miss ratio tells how often Resin needs to access the disk to read a cache entry. Most cache requests should come from memory to improve performance, but cache entries are paged out to disk when the cache gets larger. It's very important to keep the <memory-size> tag of the <cache> large enough so the block cache miss ratio is small.

    proxy cache miss ratio

    The proxy cache miss ratio measures how often cacheable pages must go to their underlying servlet instead of being cached. The miss ratio does not measure the non-cacheable pages.

    invocation miss ratio

    The invocation miss ratio measures how often Resin's invocation cache misses. The invocation cache is used for both cacheable and non-cacheable pages to save the servlet and filter chain construction. A miss of the invocation cache is expensive, since it will not only execute the servlet, but also force the servlet and filter chain to be rebuilt. The <entries> field of the <cache> controls the invocation miss ratio.

    BlockManagerMXBean

    BlockManagerMXBean returns statistics about the block cache. Since Resin's block cache is used for the proxy cache as well as clustered sessions and JMS messages, the performance of the block cache is very important. The block cache is a memory/paging cache on top of a file-based backing store. If the block cache misses, a request needs to go to disk, otherwise it can be served directly from memory.

    BlockManagerMXBean ObjectName
    resin:type=BlockManager
    
    BlockManagerMXBean.java
    public interface BlockManagerMXBean {
      public long getBlockCapacity();
    
      public long getHitCountTotal();
      public long getMissCountTotal();
    }
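
    The two counters are enough to compute a block cache miss ratio yourself. The sketch below assumes the ratio is defined as misses divided by total lookups, which is the usual definition but is not spelled out on this page. The class and method names are hypothetical.

```java
/** Illustrative only: derives a miss ratio from hit and miss totals. */
final class MissRatioSketch {

  /**
   * @param hitCount  a value such as getHitCountTotal()
   * @param missCount a value such as getMissCountTotal()
   * @return the fraction of lookups that missed, between 0.0 and 1.0
   */
  static double missRatio(long hitCount, long missCount) {
    long total = hitCount + missCount;

    // no lookups yet: report 0 rather than dividing by zero
    return total == 0 ? 0.0 : (double) missCount / total;
  }

  public static void main(String[] args) {
    // 90 hits and 10 misses: one lookup in ten went to disk
    System.out.println(missRatio(90, 10));
  }
}
```

    Since the counters are totals since startup, sampling them twice and using the differences gives the ratio for a recent interval.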
    

    ProxyCacheMXBean

    The ProxyCacheMXBean provides statistics about the proxy cache as well as operations to clear the cache. The hit and miss counts tell how effectively the cache is improving performance.

    ProxyCacheMXBean ObjectName
    resin:type=ProxyCache
    
    ProxyCacheMXBean.java
    public interface ProxyCacheMXBean {
      public long getHitCountTotal();
      public long getMissCountTotal();
    
      public CacheItem []getCacheableEntries(int max);
      public CacheItem []getUncacheableEntries(int max);
    
      public void clearCache();
      public void clearCacheByPattern(String hostRegexp, String urlRegexp);
      public void clearExpires();
    }
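
    The counters can be read through the standard JMX API. The sketch below assumes the code runs inside the Resin JVM and that the bean is visible through the platform MBeanServer under exactly the ObjectName shown above. Both are assumptions to verify against your installation. The attribute name HitCountTotal follows the usual JMX mapping of getHitCountTotal(), and the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.OptionalLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Illustrative only: reads a numeric MBean attribute if the bean exists. */
final class ProxyCacheJmxSketch {

  static OptionalLong readCounter(MBeanServer server,
                                  String objectName,
                                  String attribute) {
    try {
      ObjectName name = new ObjectName(objectName);

      if (!server.isRegistered(name)) {
        return OptionalLong.empty(); // e.g. not running inside Resin
      }

      Object value = server.getAttribute(name, attribute);

      return value instanceof Number
          ? OptionalLong.of(((Number) value).longValue())
          : OptionalLong.empty();
    } catch (JMException e) {
      // malformed name, unknown attribute, or a failure inside the bean
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();

    System.out.println("hits:   "
        + readCounter(server, "resin:type=ProxyCache", "HitCountTotal"));
    System.out.println("misses: "
        + readCounter(server, "resin:type=ProxyCache", "MissCountTotal"));
  }
}
```

    Outside Resin the bean is not registered, so both lines print an empty result instead of failing.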
    

    Copyright © 1998-2011 Caucho Technology, Inc. All rights reserved.
    Resin® is a registered trademark, and Quercus™, Amber™, and Hessian™ are trademarks of Caucho Technology.