However, there are many problems that need to be coped with once caching is introduced. How long is it possible to keep a document in the cache and still be sure that it is up-to-date? How to decide which documents are worth caching and for how long?
Document expiry has been foreseen in the HTTP protocol which contains an object header specifying the expiry date of an object. However, currently there are very few servers that actually give the expiry information, and until servers start sending it more commonly we will have to rely on other, more heuristic approaches, like only making a rough estimate of the time to live for an object.
More importantly, since many of the documents in the Web are "living" documents, specifying an expiry date for them is generally a difficult task. A given document may remain unchanged for a relatively long time, then suddenly change. This change may have been unforeseen by the document author and so wouldn't be accurately reflected in the expiry information.