Annotation of libwww/Library/src/HTCache.html, revision 2.21
2.1 frystyk 1: <HTML>
2: <HEAD>
2.16 frystyk 3: <TITLE>W3C Sample Code Library libwww Persistent Cache Manager</TITLE>
2.1 frystyk 4: </HEAD>
5: <BODY>
2.10 frystyk 6: <H1>
7: Persistent Cache Manager
8: </H1>
2.1 frystyk 9: <PRE>
10: /*
11: ** (c) COPYRIGHT MIT 1995.
12: ** Please first read the full copyright statement in the file COPYRIGH.
13: */
14: </PRE>
2.10 frystyk 15: <P>
2.11 frystyk 16: The cache contains details of persistent files which contain the contents
2.10 frystyk 17: of remote documents. The existing cache manager is somewhat naive, especially
2.11 frystyk 18: in its garbage collection, but it is just an example of how it can be
19: done. However, it is a fully HTTP/1.1 compliant cache manager. More advanced
20: implementations are welcome!
2.10 frystyk 21: <P>
22: This module is implemented by <A HREF="HTCache.c">HTCache.c</A>, and it is
2.18 frystyk 23: a part of the <A HREF="http://www.w3.org/Library/">W3C Sample Code Library</A>.
2.1 frystyk 24: <PRE>
25: #ifndef HTCACHE_H
26: #define HTCACHE_H
27:
2.10 frystyk 28: #include "WWWLib.h"
2.1 frystyk 29: </PRE>
2.10 frystyk 30: <H2>
2.11 frystyk 31: Initialize and Terminate the Persistent Cache
2.10 frystyk 32: </H2>
33: <P>
2.21 ! frystyk 34: The <CODE>cache_root</CODE> is the URI of the location of the persistent
! 35: cache. An example is "<CODE>file:/tmp/w3c-lib</CODE>". If
! 36: <CODE>cache_root</CODE> is <CODE>NULL</CODE> then the cache root is determined
! 37: using the following algorithm:
! 38: <OL>
! 39: <LI>
! 40: Look for any environment variables (if supported) in the following order:
! 41: <CODE>WWW_CACHE</CODE>, <CODE>TMP</CODE>, and <CODE>TEMP</CODE>. If none
! 42: are set then fall back on "<CODE>/tmp</CODE>".
! 43: <LI>
! 44: Append the folder name "<CODE>w3c-cache</CODE>" to the root identified above.
! 45: </OL>
! 46: <P>
! 47: The <CODE>cache_root</CODE> location does not have to exist; it will be created
! 48: automatically if it does not. An empty string will make '/' the cache root.
! 49: <P>
! 50: The <CODE>size</CODE> is the total cache size in MBytes; the default size is 20M. The cache
! 51: cannot be smaller than 5M.
! 52: <P>
! 53: We can only enable the cache if we are not in <A HREF="HTLib.html#Secure">secure
! 54: mode</A>, because in secure mode we cannot access the local file system. This is for example
! 55: the case if using an application as a telnet shell.
2.11 frystyk 56: <PRE>
57: extern BOOL HTCacheInit (const char * cache_root, int size);
2.10 frystyk 58: </PRE>
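<P>
The fallback algorithm described above can be sketched in standalone C. This
is only an illustration, not the code in HTCache.c: the function name
<CODE>resolve_cache_root</CODE> is hypothetical, and whether libwww turns the
resulting path into a <CODE>file:</CODE> URI is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libwww: writes the chosen cache root
   into buf. Returns buf on success, or NULL if the result does not fit. */
const char * resolve_cache_root (const char * cache_root,
                                 char * buf, size_t len)
{
    static const char * vars[] = { "WWW_CACHE", "TMP", "TEMP" };
    const char * base = NULL;
    size_t i;
    int n;

    assert(buf != NULL && len > 0);
    if (cache_root) {
        /* An explicit root wins; the empty string means "/" */
        n = snprintf(buf, len, "%s", *cache_root ? cache_root : "/");
        return (n >= 0 && (size_t) n < len) ? buf : NULL;
    }

    /* Step 1: first environment variable that is set and non-empty */
    for (i = 0; i < sizeof(vars) / sizeof(vars[0]) && !base; i++) {
        const char * value = getenv(vars[i]);
        if (value && *value) base = value;
    }
    if (!base) base = "/tmp";

    /* Step 2: append the folder name, avoiding a doubled slash */
    n = snprintf(buf, len, "%s%sw3c-cache", base,
                 base[strlen(base) - 1] == '/' ? "" : "/");
    return (n >= 0 && (size_t) n < len) ? buf : NULL;
}
```

With none of the three variables set, the sketch yields
<CODE>/tmp/w3c-cache</CODE>.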
59: <P>
2.11 frystyk 60: After the cache has been terminated it can not be used anymore unless you
61: do another <CODE>HTCacheInit()</CODE> call.
62: <PRE>
63: extern BOOL HTCacheTerminate (void);
2.10 frystyk 64: </PRE>
2.11 frystyk 65: <H2>
66: Cache Mode Parameters
67: </H2>
2.10 frystyk 68: <P>
2.21 ! frystyk 69: The persistent cache has a set of overall parameters that you can adjust.
2.10 frystyk 70: <H3>
2.11 frystyk 71: Enable and Disable the Cache
2.10 frystyk 72: </H3>
73: <P>
2.11 frystyk 74: The cache can be temporarily suspended by using the enable/disable flag.
75: This does not prevent the cache from being enabled or disabled at a later point
76: in time.
77: <PRE>
78: extern void HTCacheMode_setEnabled (BOOL mode);
79: extern BOOL HTCacheMode_enabled (void);
2.10 frystyk 80: </PRE>
81: <H3>
2.11 frystyk 82: What is the current Cache Root?
2.10 frystyk 83: </H3>
84: <P>
2.11 frystyk 85: Return the value of the cache root. The cache root can only be set through
2.21 ! frystyk 86: the <CODE>HTCacheInit()</CODE> function. The string returned MUST be freed
! 87: by the caller.
2.11 frystyk 88: <PRE>
2.21 ! frystyk 89: extern char * HTCacheMode_getRoot (void);
2.10 frystyk 90: </PRE>
91: <H3>
2.11 frystyk 92: Total Cache Size
2.10 frystyk 93: </H3>
94: <P>
2.11 frystyk 95: We set the default cache size to 20M. We set the minimum size to 5M in order
96: not to get into weird problems while writing the cache. The size is given
97: in MBytes and is also returned in MBytes.
2.14 frystyk 98: We don't consider the metainformation as part of the total cache size, which
99: is the reason why the minimum cache size should not be less than 5M.
2.11 frystyk 100: <PRE>
101: extern BOOL HTCacheMode_setMaxSize (int size);
102: extern int HTCacheMode_maxSize (void);
2.10 frystyk 103: </PRE>
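<P>
A minimal sketch of how the 5M minimum could be applied to a requested size.
The helper name, the <CODE>MEGA</CODE> constant and the choice to raise a
too-small request to the minimum (rather than reject it) are assumptions made
for illustration; the text above does not specify them.

```c
#include <assert.h>
#include <limits.h>

#define MEGA          0x100000L  /* assumption: 1 MByte = 2^20 bytes */
#define MIN_TOTAL_MB  5          /* minimum stated in the text above */

/* Hypothetical helper, not a libwww function: the byte budget that would
   take effect for a requested size in MBytes. Assumption: a request below
   the minimum is raised to the minimum instead of being rejected. */
long total_budget_bytes (int requested_mb)
{
    int mb = requested_mb < MIN_TOTAL_MB ? MIN_TOTAL_MB : requested_mb;
    assert(mb <= LONG_MAX / MEGA);   /* guard against overflow */
    return (long) mb * MEGA;
}
```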
104: <H3>
2.19 frystyk 105: Max Size of a Single Cache Entry
106: </H3>
107: <P>
2.20 frystyk 108: It is also possible to control the max size of a single cache entry so that
109: the cache doesn't get filled with a few very large cached entries.
110: The default max size for a single cached entry is 3M. The value indicated
111: must be in MBytes; for example, a value of 3 would mean 3 MBytes.
2.19 frystyk 112: <PRE>
113: extern BOOL HTCacheMode_setMaxCacheEntrySize (int size);
114: extern int HTCacheMode_maxCacheEntrySize (void);
115: </PRE>
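<P>
The per-entry limit amounts to a simple size comparison, sketched below. The
helper <CODE>entry_fits</CODE> is hypothetical, and treating an entry of
exactly the limit as acceptable is an assumption.

```c
#include <assert.h>

#define MEGA 0x100000L   /* assumption: 1 MByte = 2^20 bytes */

/* Hypothetical helper, not a libwww function: nonzero if a body of
   `bytes` bytes is within a per-entry limit given in MBytes.
   Assumption: an entry of exactly the limit still fits. */
int entry_fits (long bytes, int max_entry_mb)
{
    assert(bytes >= 0 && max_entry_mb > 0);
    return bytes <= (long) max_entry_mb * MEGA;
}
```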
116: <H3>
2.10 frystyk 117: How do we handle Expiration of Cached Objects?
118: </H3>
119: <P>
120: There are various ways of handling an <CODE>Expires</CODE> header when it is met in
2.11 frystyk 121: a <I>history list</I>. Either it can be ignored altogether, the user can
122: be notified with a warning, or the document can be reloaded automatically.
123: This flag decides which action is taken. The default action is
2.10 frystyk 124: <CODE>HT_EXPIRES_IGNORE</CODE>. In <CODE>HT_EXPIRES_NOTIFY</CODE> mode,
125: we push a message onto the Error stack which is presented to the user.
2.4 frystyk 126: <PRE>
127: typedef enum _HTExpiresMode {
128: HT_EXPIRES_IGNORE = 0,
129: HT_EXPIRES_NOTIFY,
130: HT_EXPIRES_AUTO
131: } HTExpiresMode;
132:
2.11 frystyk 133: extern void HTCacheMode_setExpires (HTExpiresMode mode);
134: extern HTExpiresMode HTCacheMode_expires (void);
135: </PRE>
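<P>
The three modes map onto the three behaviours described above. The sketch
below repeats the enum so that it compiles on its own; the
<CODE>ExpiredAction</CODE> type and the function name are hypothetical labels
and are not part of libwww.

```c
#include <assert.h>

typedef enum _HTExpiresMode {      /* mirrors the declaration above */
    HT_EXPIRES_IGNORE = 0,
    HT_EXPIRES_NOTIFY,
    HT_EXPIRES_AUTO
} HTExpiresMode;

/* Hypothetical outcome labels, for illustration only */
typedef enum { SHOW_AS_IS, SHOW_WITH_WARNING, RELOAD } ExpiredAction;

/* What happens to an expired document met in the history list */
ExpiredAction action_for_expired (HTExpiresMode mode)
{
    assert(mode <= HT_EXPIRES_AUTO);
    switch (mode) {
    case HT_EXPIRES_NOTIFY: return SHOW_WITH_WARNING; /* warn the user */
    case HT_EXPIRES_AUTO:   return RELOAD;            /* reload silently */
    case HT_EXPIRES_IGNORE:
    default:                return SHOW_AS_IS;        /* the default mode */
    }
}
```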
136: <H3>
137: Disconnected Operation
138: </H3>
139: <P>
140: The cache can be set to handle disconnected operation where it does not use
2.20 frystyk 141: the network to validate entries and does not attempt to load new documents.
2.11 frystyk 142: All requests that can not be fulfilled by the cache will be returned with
143: a <CODE>"504 Gateway Timeout"</CODE> response. There are two modes of how
2.21 ! frystyk 144: the cache can operate in disconnected mode:
2.20 frystyk 145: <DL>
146: <DT>
147: <EM>No network activity at all</EM>
148: <DD>
149: Here it uses only its own persistent cache.
150: <DT>
151: <EM>Forward all disconnected requests to a proxy cache</EM>
152: <DD>
153: Here it uses the HTTP/1.1 cache-control to indicate that the proxy should
154: operate in disconnected mode. This mode only really makes sense when you
155: are using a proxy, of course.
156: </DL>
2.11 frystyk 157: <PRE>
158: typedef enum _HTDisconnectedMode {
159: HT_DISCONNECT_NONE = 0,
160: HT_DISCONNECT_NORMAL = 1,
161: HT_DISCONNECT_EXTERNAL = 2
162: } HTDisconnectedMode;
163:
164: extern void HTCacheMode_setDisconnected (HTDisconnectedMode mode);
165: extern HTDisconnectedMode HTCacheMode_disconnected (void);
166: extern BOOL HTCacheMode_isDisconnected (HTReload mode);
2.1 frystyk 167: </PRE>
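<P>
The following sketch shows how the mode could decide the fate of a request
that the local cache cannot satisfy. The <CODE>MissAction</CODE> type and the
function name are hypothetical. For the external mode, HTTP/1.1 defines the
<CODE>only-if-cached</CODE> cache-control directive, which makes a cache answer
with 504 when it has no usable entry; that libwww sends exactly this directive
is an assumption here.

```c
#include <assert.h>

typedef enum _HTDisconnectedMode {   /* mirrors the declaration above */
    HT_DISCONNECT_NONE     = 0,
    HT_DISCONNECT_NORMAL   = 1,
    HT_DISCONNECT_EXTERNAL = 2
} HTDisconnectedMode;

/* Hypothetical outcome labels, for illustration only */
typedef enum { USE_NETWORK, FAIL_504, ASK_PROXY_CACHE_ONLY } MissAction;

/* What to do with a request that the local cache cannot fulfil */
MissAction action_on_cache_miss (HTDisconnectedMode mode)
{
    assert(mode <= HT_DISCONNECT_EXTERNAL);
    switch (mode) {
    case HT_DISCONNECT_NORMAL:   return FAIL_504;  /* no network at all */
    case HT_DISCONNECT_EXTERNAL: return ASK_PROXY_CACHE_ONLY;
    case HT_DISCONNECT_NONE:
    default:                     return USE_NETWORK;
    }
}
```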
2.10 frystyk 168: <H2>
2.12 frystyk 169: The Cache Index
170: </H2>
171: <P>
172: The persistent cache keeps an index of its current entries so that garbage
173: collection and lookup become more efficient. This index is stored automatically
174: at regular intervals so that we don't get out of sync. Also, it is automatically
175: loaded at startup and saved at closedown of the cache.
176: <H3>
177: Reading the Cache Index
178: </H3>
179: <P>
180: Read the saved set of cached entries from disk. We only allow the index to
181: be read when there are no entries in memory. That way we can ensure consistency.
182: <PRE>
183: extern BOOL HTCacheIndex_read (const char * cache_root);
184: </PRE>
185: <H3>
186: Write the Cache Index
187: </H3>
188: <P>
189: Walk through the list of cached objects and save them to disk. We overwrite
190: any existing version but that is normally OK as we have already read its
191: contents.
192: <PRE>
193: extern BOOL HTCacheIndex_write (const char * cache_root);
194: </PRE>
195: <H2>
2.11 frystyk 196: The HTCache Object
2.10 frystyk 197: </H2>
198: <P>
2.11 frystyk 199: The cache object is what we store about a cached object in memory.
200: <PRE>
201: typedef struct _HTCache HTCache;
202: </PRE>
203: <H3>
2.12 frystyk 204: Create and Update a Cache Object
2.11 frystyk 205: </H3>
206: <P>
2.10 frystyk 207: Filling the cache is done as all other transportation of bulk data in libwww
208: using <A HREF="HTStream.html">streams</A>. The cache object creator is a
209: stream which in many cases sits on a <A HREF="HTTee.html">T stream</A> so
210: that we get the original feed and at the same time can parse the contents.
2.14 frystyk 211: <P>
212: In some situations, we want to append data to an already existing cache entry.
213: This is the case when a user has interrupted a download and we are stuck with
214: a subpart of the document. If the user later on wishes to download the object
215: again we can issue a range request and continue from where we were. This
216: will in many situations save a lot of bandwidth.
2.11 frystyk 217: <PRE>
2.14 frystyk 218: extern HTConverter HTCacheWriter, HTCacheAppend;
2.11 frystyk 219: </PRE>
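<P>
To illustrate the range request mentioned above: with N bytes already cached,
an HTTP/1.1 client can ask for the remainder with the header value
<CODE>bytes=N-</CODE>, since byte offsets start at zero. The helper below is a
hypothetical standalone sketch and does not show how
<CODE>HTCacheAppend</CODE> is wired into a request.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not a libwww function: format the value of a Range
   header asking for everything from byte offset `cached_bytes` to the end
   of the entity. Returns nonzero on success, zero if buf is too small. */
int format_resume_range (long cached_bytes, char * buf, size_t len)
{
    int n;
    assert(buf != NULL && cached_bytes >= 0);
    n = snprintf(buf, len, "bytes=%ld-", cached_bytes);
    return n >= 0 && (size_t) n < len;
}
```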
2.12 frystyk 220: <P>
221: This function writes the metainformation along with the data object stored
222: by the HTCacheWriter stream above. If no headers are available then the meta
223: file is empty.
224: <PRE>
2.14 frystyk 225: extern BOOL HTCache_writeMeta (HTCache * cache, HTRequest * request,
226: HTResponse * response);
2.12 frystyk 227: </PRE>
228: <P>
229: In case we received a "<CODE>304 Not Modified</CODE>" response then we do
230: not have to touch the body but must merge the metainformation with the previous
231: version. Therefore we need a special metainformation update function.
232: <PRE>
2.14 frystyk 233: extern BOOL HTCache_updateMeta (HTCache * cache, HTRequest * request,
234: HTResponse * response);
2.12 frystyk 235: </PRE>
2.11 frystyk 236: <H3>
2.18 frystyk 237: Check Cached Entry
238: </H3>
239: <P>
240: After we get a response back, we should check whether we can still cache
241: an entry and/or we should add an entry for a resource that has just been
242: created so that we can remember the etag and other things. The latter allows
243: us to guarantee that we don't lose data due to the lost update problem.
244: <PRE>
245: extern HTCache * HTCache_touch (HTRequest * request, HTResponse * response,
246: HTParentAnchor * anchor);
247: </PRE>
248: <P>
249: <H3>
2.11 frystyk 250: Load a Cached Object
251: </H3>
252: <P>
253: Loading a cached object is also done as all other loads in libwww by using
254: a <A HREF="HTProt.html">protocol load module</A>. For the moment, this load
255: function handles the persistent cache as if it were on the local file system, but in fact
256: it could be anywhere.
257: <PRE>
2.15 frystyk 258: extern HTProtCallback HTLoadCache;
2.11 frystyk 259: </PRE>
260: <H3>
261: Delete a Cache Object
262: </H3>
263: <P>
264: Remove an HTCache object from memory and from disk. You must explicitly remove
265: a lock before this operation can succeed.
266: <PRE>
267: extern BOOL HTCache_remove (HTCache * cache);
268: </PRE>
269: <H3>
2.13 frystyk 270: Delete All Cache Objects in Memory
2.11 frystyk 271: </H3>
272: <P>
273: Destroys all cache entries in memory but does not write anything to disk.
274: Use the index methods above for doing that. We do not delete the disk contents.
275: <PRE>
276: extern BOOL HTCache_deleteAll (void);
2.10 frystyk 277: </PRE>
278: <H3>
2.13 frystyk 279: Delete All Cache Objects and File Entries
280: </H3>
281: <P>
282: Destroys all cache entries in memory <B>and</B> on disk. This call basically
283: resets the cache to the initial state but it does not terminate the cache.
284: That is, you don't have to reinitialize the cache before you can use it again.
285: <PRE>
286: extern BOOL HTCache_flushAll (void);
287: </PRE>
288: <H3>
2.11 frystyk 289: Find a Cached Object
2.10 frystyk 290: </H3>
291: <P>
292: Verifies if a cache object exists for this URL and if so returns the
293: cached object. It does not verify whether the object is valid or not,
294: for example it might have expired. Use the cache validation methods for checking
295: this.
2.11 frystyk 296: <PRE>
297: extern HTCache * HTCache_find (HTParentAnchor * anchor);
298: </PRE>
299: <H3>
300: Verify if an Object is Fresh
301: </H3>
302: <P>
303: This function checks whether a document has expired or not. The check is
304: based on the metainformation passed in the anchor object. The function returns
305: the level of validation needed for getting a fresh version. We also check
306: the cache control directives in the request to see if they change the freshness
307: decision.
308: <PRE>
309: extern HTReload HTCache_isFresh (HTCache * me, HTRequest * request);
310: </PRE>
311: <H3>
312: Register a Cache Hit
313: </H3>
314: <P>
315: As a cache hit may occur in several places, we have a public function where
316: we can declare a download to be a true cache hit. The number of hits a cache
317: object has affects its status when we are doing garbage collection.
318: <PRE>
319: extern BOOL HTCache_addHit (HTCache * cache);
320: </PRE>
321: <H3>
322: Find the Location of a Cached Object
323: </H3>
324: <P>
325: If we have a valid entry in the cache then we also need a location where
326: we can get it. Hopefully, we may be able to access it through one of our
327: protocol modules, for example the <A HREF="WWWFile.html">local file module</A>.
328: The name returned is in URL syntax and must be freed by the caller.
329: <PRE>
330: extern char * HTCache_name (HTCache * cache);
331: </PRE>
332: <H3>
333: Locking a Cache Object
334: </H3>
335: <P>
336: While we are creating a new cache object or while we are validating an existing
337: one, we must have a lock on the entry so that no other requests can get
338: to it in the meantime. A lock can be broken if the same request tries to
339: create the cache entry again. This means that we have tried to validate the
340: cache entry but we got a new shipment of bytes back from the origin server
341: or an intermediary proxy.
342: <PRE>
343: extern BOOL HTCache_getLock (HTCache * cache, HTRequest * request);
344: extern BOOL HTCache_breakLock (HTCache * cache, HTRequest * request);
345: extern BOOL HTCache_hasLock (HTCache * cache);
346: extern BOOL HTCache_releaseLock (HTCache * cache);
2.1 frystyk 347: </PRE>
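<P>
The locking rules described above can be modelled with a toy entry that
records which request holds the lock. The <CODE>ToyEntry</CODE> type and the
<CODE>toy_</CODE> functions are hypothetical stand-ins; the real
<CODE>HTCache</CODE> structure is opaque, and its locking code may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a cache entry; owner == NULL means unlocked */
typedef struct { const void * owner; } ToyEntry;

/* Take the lock for `request`; fails if any request already holds it */
int toy_get_lock (ToyEntry * e, const void * request)
{
    assert(e != NULL && request != NULL);
    if (e->owner != NULL) return 0;
    e->owner = request;
    return 1;
}

/* Break the lock; only the request that holds it may do so */
int toy_break_lock (ToyEntry * e, const void * request)
{
    assert(e != NULL && request != NULL);
    if (e->owner != request) return 0;
    e->owner = NULL;
    return 1;
}

/* Nonzero if some request holds the lock */
int toy_has_lock (const ToyEntry * e)
{
    assert(e != NULL);
    return e->owner != NULL;
}

/* Drop the lock unconditionally */
int toy_release_lock (ToyEntry * e)
{
    assert(e != NULL);
    e->owner = NULL;
    return 1;
}
```

In this model a second request cannot take or break a lock held by the first,
which matches the rule that only the same request may break its own lock.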
348: <PRE>
349: #endif
350: </PRE>
2.10 frystyk 351: <P>
352: <HR>
2.9 frystyk 353: <ADDRESS>
2.21 ! frystyk 354: @(#) $Id: HTCache.html,v 2.20 1999/01/26 13:55:47 frystyk Exp $
2.9 frystyk 355: </ADDRESS>
2.10 frystyk 356: </BODY></HTML>