httpd 3.0. Note that some of the features appear only in the final version 3.0, not in all of the prereleases.
A document is never cached if an Authorization header field is in the request.
This section describes httpd's current cache design.
httpd provides a way to specify a set of URL patterns such that only matching URLs are cached.
httpd provides another directive for specifying a set of URL patterns that should never be cached.
If an Expires HTTP header field is specified by the remote server, it is always used; the cache file is considered up-to-date until that time is reached.
In particular, if the document expires immediately or within a very short time (a couple of minutes), it is never even written to the cache. This saves resources, because the same file is very unlikely to be requested again within a minute. In practice, an apparent lifetime of one minute may well just reflect a machine clock that is off by one minute, in which case expiry should really be immediate.
Documents with an invalid Expires header line are never cached.
The Expires header field is the only correct way to determine whether a document should be cached; strictly speaking, documents without this field should never be cached. However, since in practice the Expires header is only extremely rarely given by current HTTP servers, it is necessary to use approximate algorithms to calculate some kind of expiry date for documents that lack one. This is all the more necessary because the Expires field is part of only the HTTP protocol, not of the other WWW protocols.
httpd handles this via a last-modification factor, or LM factor for short. This factor specifies the fraction of the time since last modification for which the file is approximated to remain up-to-date. For example, with an LM factor of 0.1, a file that was changed ten hours ago will be considered up-to-date for one hour, and a file that was modified a month ago will expire after three days.
The LM factor can be specified differently for URLs matching different URL patterns.
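The approximation above can be sketched as follows; this is a minimal illustration of the idea, not httpd's actual code, and the function name and time units are made up:

```python
import time

def lm_expiry(last_modified: float, now: float, lm_factor: float) -> float:
    """Approximate an expiry time as a fraction (the LM factor) of the
    document's age since its last modification. Times are Unix epoch
    seconds (an assumption for this sketch)."""
    age = now - last_modified
    return now + lm_factor * age

# A file changed ten hours ago, with LM factor 0.1, stays fresh for one hour.
now = time.time()
fresh_until = lm_expiry(now - 10 * 3600, now, 0.1)
```

With LM factor 0, every such document expires immediately, matching the safe setting recommended below for script-generated output.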
However, since documents without a last-modified field are very often produced on the fly by CGI scripts, it is safest to keep this value at zero, or very small. After all, most script responses should never be cached but rather regenerated by the script every time, because the content usually changes.
If a CGI script produces output that is valid for a certain time, it should express this by returning an Expires header field.
As an example, consider the result of a database lookup where the database is updated every night at 2:30; clearly the same query will return the same results at least until 2:30, so that time should be specified as the expiry time.
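Such a script could be sketched as follows; the 2:30 refresh time comes from the example above, while the helper name and use of Python are illustrative assumptions:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def next_update(now: datetime) -> datetime:
    """Expiry time for a database refreshed every night at 2:30:
    query results stay valid until the next refresh."""
    candidate = now.replace(hour=2, minute=30, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate

# Emit an explicit Expires header so proxies know how long to cache.
now = datetime.now(timezone.utc)
print("Content-Type: text/html")
print("Expires: " + format_datetime(next_update(now), usegmt=True))
print()
```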
With httpd it is possible to configure a cache refresh interval for URLs matching a given pattern. This causes the proxy server to check that a file is still up-to-date once more than the maximum allowed time has passed since the last check, even if the file would still seem up-to-date according to its expiry date.
As a special case, specifying a refresh interval of zero causes a check against the remote server on every cache access. This is ideal for users who always need the absolutely most up-to-date version, but still want faster response times and savings in network costs. It remains cheap because all the checks are performed using a conditional GET request, which sends the document only if it has changed, and otherwise tells the proxy to use its cached copy.
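The refresh-interval rule reduces to a simple time comparison; a minimal sketch, with the function name assumed rather than taken from httpd:

```python
def needs_refresh(last_check: float, refresh_interval: float,
                  now: float) -> bool:
    """True when an up-to-date check is due: at least the configured
    refresh interval has passed since the last check. A zero interval
    forces a check on every access."""
    return now - last_check >= refresh_interval
```

For instance, with a 60-second interval a file checked 90 seconds into the past would not be rechecked until 150 seconds; with a zero interval every access triggers a check.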
When httpd already has a cached version of a document, albeit an expired one, it issues a conditional GET request, which causes the document to be sent to the proxy only if it has changed since it was last written to the cache. If the document has changed, it gets an expiry date in the normal fashion when it is written to the cache.
If the document hasn't changed, httpd recalculates a new expiry date for it, using the old last-modification date with the LM factor approximation; that is, of course, only if no expiry date was explicitly given in the "Not modified" response.
The access protocol name (such as http, ftp or gopher) is used as the first path component, the hostname (with an optional port part) as the second component, and the rest of the URL is taken directly as the rest of the pathname.
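This naming scheme can be sketched directly; the cache root directory here is a made-up example, not httpd's configured default:

```python
from urllib.parse import urlparse

def cache_path(url: str, cache_root: str = "/var/cache/httpd") -> str:
    """Map a URL to a cache file name: protocol as the first path
    component, host (with optional port) as the second, and the rest
    of the URL taken directly as the rest of the pathname."""
    parts = urlparse(url)
    return "/".join([cache_root, parts.scheme, parts.netloc,
                     parts.path.lstrip("/")])

cache_path("http://info.cern.ch:80/hypertext/WWW/TheProject.html")
# -> /var/cache/httpd/http/info.cern.ch:80/hypertext/WWW/TheProject.html
```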
Since httpd was probably the first WWW proxy ever to provide caching, we picked this particular design to make it easy to debug the caching system, and at times to go and fix the cache by hand.
This will eventually be replaced by something more efficient. The current design clearly has some flaws; for example, DNS aliases for host names cause the same documents to be cached multiple times. This could be solved by using the IP address string instead of the host name.
httpd is given a certain amount of disk space for its cache. When the specified limit is reached, httpd performs garbage collection, removing cache files that haven't been accessed lately or that have expired.
If disk space is a critical factor, that is, if it's desirable that httpd keep its cache to a minimum, it always removes all the expired files during garbage collection. However, this is wasteful: often files that have expired have not in fact changed, and a simple conditional GET request could verify this and make them up-to-date again (the "Not modified" HTTP response contains a new expiry date, or one can be calculated from the data in the cache database).
Alternatively, httpd has a mode in which it lets its cache fill up and removes only files that haven't been used in a long time (as usual, this time can be configured according to URL patterns). All expired files are kept until it is absolutely necessary to sacrifice some of them to make room for new ones. If a sufficient amount of disk space is available, this situation is never reached, and httpd is able to get optimal performance for conditional GET requests.
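This fill-up-then-evict policy amounts to least-recently-used removal; a minimal sketch, with the data layout assumed purely for illustration:

```python
def collect_garbage(entries: dict, limit_bytes: int) -> dict:
    """Evict least-recently-used cache files until the total size fits
    the limit; expired but recently used files survive, so they can
    later be revalidated cheaply with a conditional GET.
    `entries` maps filename -> (size_bytes, last_access_time)."""
    total = sum(size for size, _ in entries.values())
    # Sacrifice the least recently used files first.
    for name, (size, _last_access) in sorted(entries.items(),
                                             key=lambda kv: kv[1][1]):
        if total <= limit_bytes:
            break
        del entries[name]
        total -= size
    return entries
```

If the cache never exceeds the limit, the loop exits immediately and nothing is removed, matching the "situation is never reached" case above.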
It is also possible to configure httpd never to connect to remote hosts, that is, to run in cache-only mode. This is useful when there is no network connection, e.g. on a portable machine that is not connected to the network for the moment.