Annotation of libwww/Library/src/HTAnchor.html, revision 2.47
2.7 timbl 1: <HTML>
2: <HEAD>
2.47 ! frystyk 3: <!-- Changed by: Henrik Frystyk Nielsen, 16-Jul-1996 -->
2.44 frystyk 4: <TITLE>W3C Reference Library libwww Anchor Class</TITLE>
2.8 timbl 5: </HEAD>
2.6 timbl 6: <BODY>
2.42 frystyk 7: <H1>
2.44 frystyk 8: The Anchor Class
2.42 frystyk 9: </H1>
2.15 frystyk 10: <PRE>
11: /*
2.23 frystyk 12: ** (c) COPYRIGHT MIT 1995.
2.15 frystyk 13: ** Please first read the full copyright statement in the file COPYRIGH.
14: */
15: </PRE>
2.42 frystyk 16: <P>
17: An anchor represents a region of a hypertext document which is linked to
18: another anchor in the same or a different document. Another name for anchors
19: would be URLs as an anchor represents all we know about a URL - including
20: where it points to and who points to it. Because the anchor objects
21: represent the part of the Web that the application has been in touch with, it is often
22: useful to maintain the anchors throughout the lifetime of the application.
23: It would actually be most useful if we had persistent anchors so that an
2.44 frystyk 24: application could build up more knowledge about the Web topology.
2.42 frystyk 25: <P>
26: This module is implemented by <A HREF="HTAnchor.c">HTAnchor.c</A>, and it
27: is a part of the <A HREF="http://www.w3.org/pub/WWW/Library/"> W3C Reference
28: Library</A>.
2.15 frystyk 29: <PRE>
30: #ifndef HTANCHOR_H
1.1 timbl 31: #define HTANCHOR_H
2.24 frystyk 32:
2.18 frystyk 33: </PRE>
2.42 frystyk 34: <H2>
35: Types defined and used by the Anchor Object
36: </H2>
37: <P>
38: This is a set of widely used type definitions used throughout the Library:
2.24 frystyk 39: <PRE>
2.44 frystyk 40: #include "WWWUtil.h"
2.35 frystyk 41:
2.24 frystyk 42: typedef HTAtom * HTFormat;
43: typedef HTAtom * HTLevel; /* Used to specify HTML level */
2.40 frystyk 44: typedef HTAtom * HTEncoding; /* C-E and C-T-E */
2.24 frystyk 45: typedef HTAtom * HTCharset;
46: typedef HTAtom * HTLanguage;
2.35 frystyk 47:
48: typedef struct _HTAnchor HTAnchor;
49: typedef struct _HTParentAnchor HTParentAnchor;
50: typedef struct _HTChildAnchor HTChildAnchor;
2.28 frystyk 51:
2.44 frystyk 52: #include "HTLink.h"
53: #include "HTMethod.h"
2.35 frystyk 54: </PRE>
2.42 frystyk 55: <H2>
56: The Anchor Class
57: </H2>
58: <P>
59: We have three variants of the Anchor object - I guess some would call them
60: superclass and subclasses ;-) <A NAME="Generic"></A>
61: <H3>
2.44 frystyk 62: <A NAME="Generic">Anchor Base Class</A>
2.42 frystyk 63: </H3>
64: <P>
65: This is the super class of anchors. We often use this as an argument to the
66: functions that accept both parent anchors and child anchors. We separate
67: the first link from the others to avoid the many small mallocs involved in
68: creating a list. Most anchors only point to one place. <A NAME="parent"></A>
69: <H3>
70: <A NAME="parent">Anchor for a Parent Object</A>
71: </H3>
72: <P>
2.44 frystyk 73: These anchors point to the whole contents of any resource accessible by a
74: URI. The parent anchor now contains all known metainformation about that
75: object and in some cases the parent anchor also contains the document itself.
76: Often we get the metainformation about a document via the entity headers
77: in the HTTP specification.
2.42 frystyk 78: <H3>
79: <A NAME="child">Anchor for a Child Object</A>
80: </H3>
81: <P>
2.44 frystyk 82: A child anchor is an anchor object that points to a subpart of a hypertext
83: document. In HTML this is represented by the <CODE>NAME</CODE> attribute of the
84: Anchor element.
2.42 frystyk 85: <P>
86: After we have defined the data structures we must define the methods that
87: can be used on them. All anchors are kept in an internal hash table so that
88: they are easier to find again.
89: <H3>
90: Find/Create a Parent Anchor
91: </H3>
92: <P>
93: This one is for a reference (link) which is found in a document, and might
94: not be already loaded. The parent anchor returned can either be created on
95: the spot or is already in the hash table.
2.18 frystyk 96: <PRE>
2.37 frystyk 97: extern HTAnchor * HTAnchor_findAddress (const char * address);
2.18 frystyk 98: </PRE>
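<P>
The find-or-create behaviour described above can be sketched with a small
chained hash table. This is only an illustration and not the code in
HTAnchor.c: the names <CODE>Entry</CODE>, <CODE>find_address</CODE>,
<CODE>hash_address</CODE> and the table size are hypothetical, and a real
anchor carries much more than its address.

```c
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 101              /* hypothetical bucket count */

/* Hypothetical stand-in for an anchor: only the address is kept */
typedef struct Entry {
    char *         address;         /* key: the address string */
    struct Entry * next;            /* next entry in the same bucket */
} Entry;

static Entry * table[TABLE_SIZE];   /* all entries live here */

static unsigned hash_address (const char * s)
{
    unsigned h = 0;
    while (*s) h = h * 31u + (unsigned char) *s++;
    return h % TABLE_SIZE;
}

/* Return the entry already in the table, or create one on the spot */
Entry * find_address (const char * address)
{
    unsigned h;
    Entry * e;
    if (!address) return NULL;
    h = hash_address(address);
    for (e = table[h]; e; e = e->next)
        if (strcmp(e->address, address) == 0) return e;   /* found */
    if ((e = malloc(sizeof *e)) == NULL) return NULL;
    if ((e->address = malloc(strlen(address) + 1)) == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->address, address);
    e->next = table[h];             /* insert at head of the bucket */
    table[h] = e;
    return e;
}
```

Calling <CODE>find_address</CODE> twice with the same string returns the
same object, which is the property that makes anchors easy to find again.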
2.42 frystyk 99: <H3>
100: Find/Create a Child Anchor
101: </H3>
102: <P>
103: This one is for a new child anchor being edited into an existing document.
104: The parent anchor must already exist but the child returned can either be
105: created on the spot or is already in the hash table. The <EM>tag</EM> is
106: the part that's after the '#' sign in a URI.
2.18 frystyk 107: <PRE>
2.32 frystyk 108: extern HTChildAnchor * HTAnchor_findChild (HTParentAnchor *parent,
2.37 frystyk 109: const char * tag);
2.7 timbl 110: </PRE>
2.42 frystyk 111: <H3>
112: Find/Create a Child Anchor and Link to Another Parent
113: </H3>
114: <P>
115: Find a child anchor with a given parent and possibly a <EM>tag</EM>,
116: and (if passed) link this child to the URI given in the <EM>href</EM>. As
117: we really want typed links, the caller should also indicate what the type
118: of the link is (see HTTP spec for more information). The link is
119: <EM>relative</EM> to the address of the parent anchor.
2.18 frystyk 120: <PRE>
2.43 eric 121: extern HTChildAnchor * HTAnchor_findChildAndLink (
122: HTParentAnchor * parent, /* May not be 0 */
2.37 frystyk 123: const char * tag, /* May be "" or 0 */
124: const char * href, /* May be "" or 0 */
2.35 frystyk 125: HTLinkType ltype); /* May be 0 */
2.18 frystyk 126: </PRE>
2.42 frystyk 127: <H3>
128: Delete an Anchor
129: </H3>
130: <P>
131: All outgoing links from parent and children are deleted, and this anchor
132: is removed from the sources list of all its targets. We also delete the targets.
133: If this anchor's source list is empty, we delete it and its children.
2.18 frystyk 134: <PRE>
2.32 frystyk 135: extern BOOL HTAnchor_delete (HTParentAnchor *me);
2.20 frystyk 136: </PRE>
2.42 frystyk 137: <H3>
138: Delete all Anchors
139: </H3>
140: <P>
141: Deletes <EM>all</EM> anchors and returns a list of all the objects (hyperdocs)
142: hanging off the parent anchors found while doing it. The application may keep
143: its own list of <CODE>HyperDoc</CODE>s, but this function returns it anyway.
144: It is <EM>always</EM> up to the application to delete any
145: <CODE>HyperDoc</CODE>s. If <CODE>objects</CODE> is NULL then no hyperdocs are
146: returned. Returns YES if OK, else NO.
147: <P>
148: <B>Note:</B> This function is different from cleaning up the history list!
2.20 frystyk 149: <PRE>
2.32 frystyk 150: extern BOOL HTAnchor_deleteAll (HTList * objects);
2.18 frystyk 151: </PRE>
2.42 frystyk 152: <H2>
2.44 frystyk 153: <A NAME="links">Links and Anchors</A>
2.42 frystyk 154: </H2>
155: <P>
2.44 frystyk 156: Anchor objects are bound together by <A HREF="HTLink.html">Link objects</A>
157: that carry information about the type of link, whether we have followed
158: the link, etc. Any anchor object can have zero, one, or many links but the
159: normal case is one. Therefore we treat this in a special way.
160: <H3>
161: Handling the Main Link
162: </H3>
163: <P>
164: Any outgoing link can at any time be the main destination.
165: <PRE>
166: extern BOOL HTAnchor_setMainLink (HTAnchor * anchor, HTLink * link);
167: extern HTLink * HTAnchor_mainLink (HTAnchor * anchor);
168:
169: extern HTAnchor * HTAnchor_followMainLink (HTAnchor * anchor);
170: </PRE>
2.42 frystyk 171: <H3>
2.44 frystyk 172: Handling the Sub Links
2.42 frystyk 173: </H3>
2.44 frystyk 174: <PRE>
175: extern BOOL HTAnchor_setSubLinks (HTAnchor * anchor, HTList * list);
176: extern HTList * HTAnchor_subLinks (HTAnchor * anchor);
177: </PRE>
178: <H2>
179: Relations Between Children and Parents
180: </H2>
181: <P>
182: As always, children and parents have a complicated relationship and the libwww
183: Anchor class is no exception.
184: <H3>
2.42 frystyk 185: Who is Parent?
2.44 frystyk 186: </H3>
2.42 frystyk 187: <P>
2.18 frystyk 188: For parent anchors this returns the anchor itself.
2.44 frystyk 189: <PRE>extern HTParentAnchor * HTAnchor_parent (HTAnchor *me);
2.18 frystyk 190: </PRE>
2.44 frystyk 191: <H3>
2.42 frystyk 192: Does it have any Anchors within it?
2.44 frystyk 193: </H3>
194: <P>
195: Does this parent anchor have any children?
196: <PRE>extern BOOL HTAnchor_hasChildren (HTParentAnchor *me);
197: </PRE>
198: <H2>
2.45 frystyk 199: Anchor Addresses
2.44 frystyk 200: </H2>
201: <P>
202: There are two addresses of an anchor: the URI that was passed when the anchor
203: was created and the physical address that's used when the URI is going to
204: be requested. The two addresses may be different if the request is going
2.45 frystyk 205: through a proxy or a gateway or it may have been mapped through a rule file.
2.44 frystyk 206: <H3>
207: Logical Address
208: </H3>
209: <P>
210: Returns the full URI of the anchor, child or parent, as it was when the anchor
211: was created. The result is a malloc'd string to be freed by the caller.
212: <PRE>extern char * HTAnchor_address (HTAnchor * me);
2.18 frystyk 213: </PRE>
2.42 frystyk 214: <H3>
2.45 frystyk 215: Expanded Logical Address
216: </H3>
217: <P>
218: When expanding URLs within a hypertext document, the base address is taken
219: from the first of the following values that is present:
220: <UL>
221: <LI>
222: <CODE>Content-Base</CODE> header
223: <LI>
224: <CODE>Content-Location</CODE> header
225: <LI>
226: Logical address
227: </UL>
228: <PRE>extern char * HTAnchor_expandedAddress (HTAnchor * me);
229: </PRE>
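<P>
The order of preference listed above can be sketched as a simple fallback
chain. The helper <CODE>pick_base</CODE> is hypothetical and is not part of
the Library; it also assumes that each candidate is already an absolute
address, whereas a real <CODE>Content-Location</CODE> value may be relative
and would need to be resolved first.

```c
#include <stddef.h>

/*
** Hypothetical helper: return the first candidate that is present
** (non-NULL and non-empty), in the order Content-Base, Content-Location,
** logical address. No string is copied; the caller keeps ownership.
*/
const char * pick_base (const char * content_base,
                        const char * content_location,
                        const char * logical)
{
    if (content_base && *content_base) return content_base;
    if (content_location && *content_location) return content_location;
    return logical;                 /* last resort: the logical address */
}
```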
230: <H3>
2.44 frystyk 231: Physical address
2.42 frystyk 232: </H3>
233: <P>
2.44 frystyk 234: Contains the physical address after we have looked for proxies etc.
235: <PRE>extern char * HTAnchor_physical (HTParentAnchor * me);
236: extern void HTAnchor_setPhysical (HTParentAnchor * me, char * protocol);
2.45 frystyk 237: extern void HTAnchor_clearPhysical (HTParentAnchor * me);
2.44 frystyk 238: </PRE>
239: <H2>
240: Entity Body Information
241: </H2>
242: <P>
2.42 frystyk 243: A parent anchor can have a data object bound to it. This data object
244: can for example be a parsed version of an HTML document that knows how to present
245: itself to the user, or it can be an unparsed data object. The application is completely
246: free to use this possibility, but a typical usage would be to manage
247: the data object as part of a memory cache.
2.18 frystyk 248: <PRE>
2.35 frystyk 249: extern void HTAnchor_setDocument (HTParentAnchor *me, void * doc);
250: extern void * HTAnchor_document (HTParentAnchor *me);
2.18 frystyk 251: </PRE>
2.44 frystyk 252: <H2>
253: Entity Header Information
254: </H2>
255: <P>
256: The anchor object also contains all the metainformation that we know about
257: the object.
2.42 frystyk 258: <H3>
2.44 frystyk 259: Clear All Header Information
2.42 frystyk 260: </H3>
2.44 frystyk 261: <PRE>extern void HTAnchor_clearHeader (HTParentAnchor *me);
2.42 frystyk 262: </PRE>
263: <H3>
264: Cache Information
265: </H3>
266: <P>
267: If the cache manager finds a cached object, it is registered in the anchor
268: object. This way the <A HREF="HTFile.html">file loader</A> knows that it
269: is a MIME data object. The cache manager does not know whether the data object
270: is out of date (for example if an <EM>Expires:</EM> header is in the MIME
271: header). This is for the <A HREF="HTMIME.html">MIME parser</A> to find out.
2.44 frystyk 272: <PRE>extern BOOL HTAnchor_cacheHit (HTParentAnchor * me);
2.42 frystyk 273: extern void HTAnchor_setCacheHit (HTParentAnchor * me, BOOL cacheHit);
274: </PRE>
275: <H3>
2.47 ! frystyk 276: Is the Object Cachable?
! 277: </H3>
! 278: <P>
! 279: The various cache-control headers and directives decide whether an object
! 280: is cachable or not. Check these methods before starting caching!
! 281: <PRE>extern BOOL HTAnchor_cachable (HTParentAnchor * me);
! 282: extern BOOL HTAnchor_setCachable (HTParentAnchor * me, BOOL mode);
! 283: </PRE>
! 284: <H3>
! 285: Cache Control Directives
! 286: </H3>
! 287: <P>
! 288: The cache control directives are all part of the cache control header and
! 289: control the behavior of any intermediate cache between the user agent and
! 290: the origin server. <CODE>Cache-control</CODE> directives can be sent in both
! 291: directions - that is - from the server to the client and vice versa. We keep
! 292: the incoming ones here and the out-going as part of the
! 293: <A HREF="HTReq.html">request object</A>.
! 294: <PRE>
! 295: extern BOOL HTAnchor_addCacheControl (HTParentAnchor * anchor,
! 296: char * token, char * value);
! 297: extern BOOL HTAnchor_deleteCacheControl (HTParentAnchor * anchor);
! 298: extern HTAssocList * HTAnchor_cacheControl (HTParentAnchor * anchor);
! 299: </PRE>
! 300:
! 301: Some useful helper functions for handling specific cache directives:
! 302:
! 303: <PRE>
! 304: extern time_t HTAnchor_maxAge (HTParentAnchor * anchor);
! 305: extern BOOL HTAnchor_mustRevalidate (HTParentAnchor * anchor);
! 306: </PRE>
! 307:
! 308: <H3>
2.42 frystyk 309: Is the Anchor searchable?
310: </H3>
2.44 frystyk 311: <PRE>extern void HTAnchor_clearIndex (HTParentAnchor * me);
2.42 frystyk 312: extern void HTAnchor_setIndex (HTParentAnchor * me);
313: extern BOOL HTAnchor_isIndex (HTParentAnchor * me);
314: </PRE>
315: <H3>
2.44 frystyk 316: Anchor Title
2.42 frystyk 317: </H3>
318: <P>
319: We keep the title in the anchor so that we can refer to it later in the history
320: list etc. We can also obtain the title element if it is passed as an HTTP
321: header in the response. Any title element found in an HTML document will
322: overwrite a title given in an HTTP header.
2.44 frystyk 323: <PRE>extern const char * HTAnchor_title (HTParentAnchor *me);
2.32 frystyk 324: extern void HTAnchor_setTitle (HTParentAnchor *me,
2.37 frystyk 325: const char * title);
2.32 frystyk 326: extern void HTAnchor_appendTitle (HTParentAnchor *me,
2.37 frystyk 327: const char * title);
2.18 frystyk 328: </PRE>
2.42 frystyk 329: <H3>
2.44 frystyk 330: Content Base
331: </H3>
332: <P>
333: The <CODE>Content-Base</CODE> header may be used for resolving relative URLs
334: within the entity.
335: <PRE>extern char * HTAnchor_base (HTParentAnchor * me);
336: extern BOOL HTAnchor_setBase (HTParentAnchor * me, char * base);
337: </PRE>
338: <H3>
339: Content Location
340: </H3>
341: <P>
342: Content location can either be an absolute or a relative URL. If it is relative
343: then parse it relative to the <CODE>Content-Base</CODE> header or the request
344: URI.
345: <PRE>extern char * HTAnchor_location (HTParentAnchor * me);
346: extern BOOL HTAnchor_setLocation (HTParentAnchor * me, char * location);
347: </PRE>
348: <H3>
2.42 frystyk 349: Media Types (Content-Type)
350: </H3>
2.18 frystyk 351: <PRE>
2.32 frystyk 352: extern HTFormat HTAnchor_format (HTParentAnchor *me);
353: extern void HTAnchor_setFormat (HTParentAnchor *me,
354: HTFormat form);
2.18 frystyk 355: </PRE>
2.42 frystyk 356: <H3>
2.44 frystyk 357: Content Type Parameters
358: </H3>
359: <P>
360: The Anchor object stores all content parameters in an Association list so
361: here you will always be able to find them. We also have a few methods for
362: the special cases: <CODE>charset</CODE> and <CODE>level</CODE> as they are
363: often needed.
364: <PRE>
365: extern HTAssocList * HTAnchor_formatParam (HTParentAnchor * me);
366:
367: extern BOOL HTAnchor_addFormatParam (HTParentAnchor * me,
368: const char * name, const char * value);
369: </PRE>
370: <H4>
2.42 frystyk 371: Charset parameter to Content-Type
2.44 frystyk 372: </H4>
2.18 frystyk 373: <PRE>
2.32 frystyk 374: extern HTCharset HTAnchor_charset (HTParentAnchor *me);
2.44 frystyk 375: extern BOOL HTAnchor_setCharset (HTParentAnchor *me,
2.32 frystyk 376: HTCharset charset);
2.18 frystyk 377: </PRE>
2.44 frystyk 378: <H4>
2.42 frystyk 379: Level parameter to Content-Type
2.44 frystyk 380: </H4>
2.21 frystyk 381: <PRE>
2.32 frystyk 382: extern HTLevel HTAnchor_level (HTParentAnchor * me);
2.44 frystyk 383: extern BOOL HTAnchor_setLevel (HTParentAnchor * me,
2.32 frystyk 384: HTLevel level);
2.22 frystyk 385: </PRE>
2.42 frystyk 386: <H3>
387: Content Language
388: </H3>
2.22 frystyk 389: <PRE>
2.39 frystyk 390: extern HTList * HTAnchor_language (HTParentAnchor * me);
391: extern BOOL HTAnchor_addLanguage (HTParentAnchor *me, HTLanguage lang);
2.21 frystyk 392: </PRE>
2.42 frystyk 393: <H3>
394: Content Encoding
395: </H3>
2.18 frystyk 396: <PRE>
2.39 frystyk 397: extern HTList * HTAnchor_encoding (HTParentAnchor * me);
398: extern BOOL HTAnchor_addEncoding (HTParentAnchor * me, HTEncoding enc);
2.18 frystyk 399: </PRE>
2.42 frystyk 400: <H3>
401: Content Transfer Encoding
402: </H3>
2.18 frystyk 403: <PRE>
2.40 frystyk 404: extern HTEncoding HTAnchor_transfer (HTParentAnchor *me);
405: extern void HTAnchor_setTransfer (HTParentAnchor *me,
406: HTEncoding transfer);
2.18 frystyk 407: </PRE>
2.42 frystyk 408: <H3>
409: Content Length
410: </H3>
2.18 frystyk 411: <PRE>
2.41 frystyk 412: extern long int HTAnchor_length (HTParentAnchor * me);
413: extern void HTAnchor_setLength (HTParentAnchor * me, long int length);
414: extern void HTAnchor_addLength (HTParentAnchor * me, long int deltalength);
2.18 frystyk 415: </PRE>
2.42 frystyk 416: <H3>
2.44 frystyk 417: Content MD5
418: </H3>
419: <PRE>
420: extern char * HTAnchor_md5 (HTParentAnchor * me);
421: extern void HTAnchor_setMd5 (HTParentAnchor * me, const char * hash);
422: </PRE>
423: <H3>
2.42 frystyk 424: Allowed methods (Allow)
425: </H3>
2.18 frystyk 426: <PRE>
2.36 frystyk 427: extern HTMethod HTAnchor_methods (HTParentAnchor * me);
428: extern void HTAnchor_setMethods (HTParentAnchor * me, HTMethod methodset);
429: extern void HTAnchor_appendMethods (HTParentAnchor * me, HTMethod methodset);
2.18 frystyk 430: </PRE>
2.42 frystyk 431: <H3>
432: Version
433: </H3>
2.18 frystyk 434: <PRE>
2.35 frystyk 435: extern char * HTAnchor_version (HTParentAnchor * me);
2.37 frystyk 436: extern void HTAnchor_setVersion (HTParentAnchor * me, const char * version);
2.28 frystyk 437: </PRE>
2.42 frystyk 438: <H3>
439: Date
440: </H3>
441: <P>
2.28 frystyk 442: Returns the date that was registered in the RFC822 header "Date".
443: <PRE>
2.35 frystyk 444: extern time_t HTAnchor_date (HTParentAnchor * me);
2.37 frystyk 445: extern void HTAnchor_setDate (HTParentAnchor * me, const time_t date);
2.28 frystyk 446: </PRE>
2.42 frystyk 447: <H3>
448: Last Modified Date
449: </H3>
450: <P>
2.28 frystyk 451: Returns the date that was registered in the RFC822 header "Last-Modified".
452: <PRE>
2.35 frystyk 453: extern time_t HTAnchor_lastModified (HTParentAnchor * me);
2.37 frystyk 454: extern void HTAnchor_setLastModified (HTParentAnchor * me, const time_t lm);
2.28 frystyk 455: </PRE>
2.42 frystyk 456: <H3>
2.44 frystyk 457: Entity Tag
458: </H3>
459: <P>
460: Entity tags are used for comparing two or more entities from the same requested
461: resource. It is a cache validator much in the same way <I>Date</I> can be.
462: The difference is that we can't always trust the date and time stamp and
463: hence we must have something stronger.
464: <PRE>extern char * HTAnchor_etag (HTParentAnchor * me);
465: extern void HTAnchor_setEtag (HTParentAnchor * me, const char * etag);
466: extern BOOL HTAnchor_isEtagWeak (HTParentAnchor * me);
467: </PRE>
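<P>
In HTTP a weak entity tag is written with a <CODE>W/</CODE> prefix in front
of the quoted tag. A check along those lines might look like the sketch
below. The helper <CODE>is_weak_etag</CODE> is hypothetical, and it is an
assumption that the stored tag still carries the prefix as received; this
page does not say how <CODE>HTAnchor_isEtagWeak</CODE> is implemented.

```c
#include <string.h>

/*
** Hypothetical helper: returns 1 if the entity tag string starts with
** the case-sensitive "W/" prefix, 0 otherwise (including for NULL).
*/
int is_weak_etag (const char * etag)
{
    if (!etag) return 0;
    return strncmp(etag, "W/", 2) == 0;
}
```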
468: <H3>
2.47 ! frystyk 469: Age Header
! 470: </H3>
! 471: <P>
! 472: The <CODE>Age</CODE> response-header field conveys the sender's estimate
! 473: of the amount of time since the response (or its revalidation) was generated
! 474: at the origin server. A cached response is "fresh" if its age does not exceed
! 475: its freshness lifetime.
! 476: <PRE>
! 477: extern time_t HTAnchor_age (HTParentAnchor * me);
! 478: extern void HTAnchor_setAge (HTParentAnchor * me, const time_t age);
! 479: </PRE>
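<P>
The freshness rule stated above is a single comparison. The sketch below
is not Library code: <CODE>is_fresh</CODE> is a hypothetical helper, and
treating <CODE>(time_t) -1</CODE> as "value unknown" is an assumption made
for the example. The freshness lifetime would typically come from a
<CODE>max-age</CODE> directive or an <CODE>Expires</CODE> date.

```c
#include <time.h>

/*
** Hypothetical helper: a response is fresh if its age does not exceed
** its freshness lifetime. Both values are in seconds. An unknown value,
** assumed here to be (time_t) -1, is never considered fresh.
*/
int is_fresh (time_t age, time_t freshness_lifetime)
{
    if (age == (time_t) -1 || freshness_lifetime == (time_t) -1)
        return 0;
    return age <= freshness_lifetime;
}
```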
! 480: <H3>
2.42 frystyk 481: Expires Date
482: </H3>
2.28 frystyk 483: <PRE>
2.35 frystyk 484: extern time_t HTAnchor_expires (HTParentAnchor * me);
2.37 frystyk 485: extern void HTAnchor_setExpires (HTParentAnchor * me, const time_t exp);
2.18 frystyk 486: </PRE>
2.42 frystyk 487: <H3>
488: Derived from
489: </H3>
2.18 frystyk 490: <PRE>
2.35 frystyk 491: extern char * HTAnchor_derived (HTParentAnchor *me);
2.37 frystyk 492: extern void HTAnchor_setDerived (HTParentAnchor *me, const char *derived_from);
2.18 frystyk 493: </PRE>
2.44 frystyk 494: <H2>
2.42 frystyk 495: Status of Header Parsing
2.44 frystyk 496: </H2>
2.42 frystyk 497: <P>
2.47 ! frystyk 498: This is primarily for internal use. It is so that we can check whether the
! 499: header has been parsed or not.
2.44 frystyk 500: <PRE>extern BOOL HTAnchor_headerParsed (HTParentAnchor *me);
2.32 frystyk 501: extern void HTAnchor_setHeaderParsed (HTParentAnchor *me);
2.7 timbl 502: </PRE>
2.18 frystyk 503: <PRE>
504: #endif /* HTANCHOR_H */
505: </PRE>
2.42 frystyk 506: <P>
507: <HR>
2.39 frystyk 508: <ADDRESS>
2.47 ! frystyk 509: @(#) $Id: HTAnchor.html,v 2.46 1996/07/18 03:56:29 frystyk Exp $
2.39 frystyk 510: </ADDRESS>
2.42 frystyk 511: </BODY></HTML>