Based on experience with WOFF 1.0, which is widely deployed, this specification was developed to provide improved compression and thus lower use of network bandwidth, while still allowing fast decompression even on mobile devices. This is achieved by combining a content-aware preprocessing step and improved entropy coding, compared to the Flate compression used in WOFF 1.0.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Supporting material, including results of compression measurements, may be found in the companion WOFF 2.0 Evaluation Report.
This is an Editor's Draft of WOFF 2.0. It may contain material not reviewed by the Fonts Working Group and is subject to change.
This document was developed by the WebFonts Working Group. The Working Group expects to advance this Working Draft to Recommendation Status.
Please send comments about this document to email@example.com (with public archive).
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1.1. Notational Conventions
2. Overall file structure and basic data types
2.1. Data types
2.2. WOFF2 Header
3. Table directory format
4. Compressed data format
4.1. Transformed glyf table format
4.2. Decoding of variable-length X and Y coordinates
4.3. Transformed loca table format
4.4. Table order constraints
5. Extended Metadata Block
6. Private Data Block
Appendix A: Internet Media Type Registration
This document specifies the WOFF2 font packaging format. This format was designed to provide a reasonably easy-to-implement compression of font data with significantly better compression than previous techniques, suitable for use with CSS @font-face rules. The improvement in compression rates, compared to the previously developed WOFF 1.0 format [WOFF1], is realized through improved entropy coding and a font data preprocessing and optimization step that reduces the built-in redundancy of various font data structures. Details of the WOFF 2.0 development history can be found in [WOFF2ER].
The all-uppercase key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. If these words occur in lower- or mixed case, they should be interpreted in accordance with their normal English meaning.
This document includes sections of text that are called out as "Notes" and set off from the main text of the specification. These notes are intended as informative explanations or clarifications, to serve as hints or guides to implementers and users, but are not part of the normative text.
The structure of WOFF2 files is similar to that of SFNT and WOFF 1.0 font files, in that there is a header containing a table directory, followed by the data for those tables. The SFNT structure is described fully in the TrueType [TrueType], OpenType [OpenType], and ISO "Open Font Format" [OFF] specifications. However, it differs in some important respects from SFNT. Most notably, the data for the tables is compressed in a single stream comprising all the tables. The compression algorithm is Brotli [Brotli].
|WOFF2Header||File header with basic font type and version, along with offsets to metadata and private data blocks.|
|TableDirectory||Directory of font tables, containing size and other info.|
|CompressedFontData||Contents of font tables, compressed for storage in the WOFF2 file.|
|ExtendedMetadata||An optional block of extended metadata, represented in XML format and compressed for storage in the WOFF2 file.|
|PrivateData||An optional block of private data for the font designer, foundry, or vendor to use.|
|UInt8||8-bit unsigned integer.|
|Int16||16-bit signed integer in 2's complement format, stored big-endian.|
|UInt16||16-bit unsigned integer, stored big-endian.|
|255UInt16||Variable-length encoding of a 16-bit unsigned integer for optimized intermediate font data storage.|
|UIntBase128||Variable-length encoding of 32-bit unsigned integers.|
255UInt16 is a variable-length encoding of an unsigned integer in the range 0 to 65535 inclusive. This data type is intended to be used as an intermediate representation of various font values, which are typically expressed as UInt16 but represent relatively small values. Depending on the encoded value, the length of the data field may be one to three bytes, as described in the following table:
|Data Type||Syntax||Description and Comments|
|UInt8||Code||if (Code < 253) Value = Code; /* [0..252] */|
|if ((Code == 254) || (Code == 255))|
|UInt8||Value1||if (Code == 255) Value = 253 + Value1; /* [253..508] */ if (Code == 254) Value = 506 + Value1; /* [506..761] */|
|else if (Code == 253)|
|UInt16||Value||Value; /* [0..65535] */|
Note that the encoding is not unique. For example, the value 506 can be encoded as [255, 253], [254, 0], and [253, 1, 250]. An encoder may produce any of these, and a decoder must accept them all, although encoders should choose shorter encodings, and should be consistent in choice of encoding for the same value, as this will tend to compress better.
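The decoding rules in the table above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function name `read_255uint16` and the `(value, next_position)` return convention are assumptions made for this example.

```python
def read_255uint16(data, pos=0):
    """Decode one 255UInt16 starting at data[pos].

    Returns (value, next_pos). The name and return shape are
    illustrative assumptions, not part of the format.
    """
    code = data[pos]
    if code < 253:
        # Single-byte form: the code is the value itself, [0..252].
        return code, pos + 1
    if code == 255:
        # Two-byte form covering [253..508].
        return 253 + data[pos + 1], pos + 2
    if code == 254:
        # Two-byte form covering [506..761].
        return 506 + data[pos + 1], pos + 2
    # code == 253: a big-endian UInt16 follows, covering [0..65535].
    return (data[pos + 1] << 8) | data[pos + 2], pos + 3
```

With this sketch, the byte sequences [254, 0] and [253, 1, 250] both decode to 506, which illustrates why a decoder has to accept more than one encoding of the same value.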
UIntBase128 is a different variable length encoding of unsigned integers, suitable for values up to 2^32-1. A UIntBase128 encoded number is a sequence of bytes for which the most significant bit is set for all but the last byte, and clear for the last byte. The number itself is base 128 encoded in the lower 7 bits of each byte. Thus, a decoding procedure for a UIntBase128 is: start with value = 0. Consume a byte, setting value = old value times 128 + (byte bitwise-and 127). Repeat last step until the most significant bit of byte is false.
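The decoding procedure just described can be written as a short Python sketch. The function name `read_uint_base128` and the `(value, next_position)` return convention are assumptions for illustration. The sketch follows the prose literally and does not add any limit on the number of bytes consumed, which a production decoder would likely want.

```python
def read_uint_base128(data, pos=0):
    """Decode one UIntBase128 starting at data[pos].

    Returns (value, next_pos). Hypothetical helper for illustration only.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        # The lower 7 bits carry the next base-128 digit.
        value = value * 128 + (byte & 127)
        # A clear most significant bit marks the last byte.
        if (byte & 0x80) == 0:
            return value, pos
```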
The WOFF 2.0 header includes an identifying signature and provides the information about the compressed and uncompressed sizes of encapsulated font data. It also indicates the specific kind of font data included in the WOFF 2.0 file, font version number and provides offsets to additional data blocks included in the file.
|UInt32||signature||0x774F4632 'wOF2'|
|UInt32||flavor||The "sfnt version" of the input font.|
|UInt32||length||Total size of the WOFF file.|
|UInt16||numTables||Number of entries in directory of font tables.|
|UInt16||reserved||Reserved; set to 0.|
|UInt32||totalSfntSize||Total size needed for the uncompressed font data, including the sfnt header, directory, and font tables (including padding).|
|UInt32||totalCompressedSize||Total length of the compressed data block.|
|UInt16||majorVersion||Major version of the WOFF file.|
|UInt16||minorVersion||Minor version of the WOFF file.|
|UInt32||metaOffset||Offset to metadata block, from beginning of WOFF file.|
|UInt32||metaLength||Length of compressed metadata block.|
|UInt32||metaOrigLength||Uncompressed size of metadata block.|
|UInt32||privOffset||Offset to private data block, from beginning of WOFF file.|
|UInt32||privLength||Length of private data block.|
The interpretation of the WOFF2 Header is the same as the WOFF Header in [WOFF1]. The signature has the value of 0x774F4632 ('wOF2'), which distinguishes it from WOFF 1.0 files.
(Probable todo: copy relevant parts of the WOFF1 spec and adapt).
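As an informal illustration of the header layout, the following Python sketch unpacks the fixed-size header and checks the signature. It assumes that the UInt32 signature is the first field, ahead of flavor, as in the WOFF 1.0 header, and that all fields are stored big-endian. The names `WOFF2_HEADER`, `Woff2Header` and `parse_woff2_header` are hypothetical.

```python
import struct
from collections import namedtuple

# Assumed layout: signature first (as in WOFF 1.0), then the fields in the
# order listed in the header table, all big-endian with no padding.
WOFF2_HEADER = struct.Struct(">IIIHHIIHHIIIII")

Woff2Header = namedtuple(
    "Woff2Header",
    "signature flavor length numTables reserved totalSfntSize "
    "totalCompressedSize majorVersion minorVersion metaOffset metaLength "
    "metaOrigLength privOffset privLength",
)

WOFF2_SIGNATURE = 0x774F4632  # 'wOF2'


def parse_woff2_header(data):
    """Unpack the header from the start of data and verify the signature."""
    if len(data) < WOFF2_HEADER.size:
        raise ValueError("data is shorter than the WOFF2 header")
    header = Woff2Header._make(WOFF2_HEADER.unpack_from(data, 0))
    if header.signature != WOFF2_SIGNATURE:
        raise ValueError("not a WOFF2 file: bad signature")
    return header
```

Under these assumptions the header occupies 48 bytes, and the table directory begins immediately after it.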
The table directory is an array of WOFF2 table directory entries, as defined below. The directory follows immediately after the WOFF2 file header; therefore, there is no explicit offset in the header pointing to this block. Its size is dependent on the exact content; thus, the best strategy for decoding is to process the file as a stream, rather than trying to access it randomly. Each table directory entry specifies the size of a single font data table, as well as information indicating whether to apply an additional transform.
The format of each individual table directory entry is as follows:
|UInt8||flags||table type and flags|
|UInt32||tag||4-byte tag (optional)|
|UIntBase128||origLength||length of original table|
|UIntBase128||transformLength||transformed length (optional)|
The interpretation of the flags field is as follows. Bits [0..5] contain an index to the "known tag" table, which represents tags likely to appear in fonts. If the tag is not present in this table, then the value of this bit field is 63. Bits 6 and 7 are reserved for future extensions.
Whether a table tag is encoded with a known table tag or explicitly including the four-byte tag has no semantic significance; it is simply a choice of encoding intended to improve compression efficiency. Similarly, whether a particular four-byte tag is present or not in the known table is not a normative statement about whether such tags should be included in web fonts.
There is a predefined extension mechanism for any custom table tag that is not included in the known tag list (or for any new standard tags that may be defined in the future). If bits [0..5] of the flags byte have the value 63 (0x3f), then following the flag byte is a 4-byte arbitrary tag value. Otherwise, the tag field is omitted in the TableDirectoryEntry structure, and is derived from bits [0..5] of the flag byte from the fixed Known Table Tags table, given below.
|Known Table Tags|
|15||EBDT||31||MATH||47||fvar||63||arbitrary tag follows|
Please note that according to the SFNT-based font format specifications all table tags should consist of four characters. Table tags with fewer than four characters, such as 'cvt ' (tag value 0x63767420), are padded with trailing spaces.
Following the flags are one to two length values, each in UIntBase128 unsigned integer encoding. The origLength field specifies the length of the table in an uncompressed version of the font. Optionally, for those tables that are subjected to additional transformations, the transformLength specifies the length of the transformed version of the table.
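Reading a single table directory entry could look like the Python sketch below. It is an illustration under stated assumptions: `known_tags` is a caller-supplied mapping from index to 4-byte tag (the full Known Table Tags list is not reproduced here), and `is_transformed` is a caller-supplied predicate that decides whether a transformLength follows. All function and parameter names are hypothetical.

```python
def read_uint_base128(data, pos):
    """Decode one UIntBase128; returns (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = value * 128 + (byte & 127)
        if (byte & 0x80) == 0:
            return value, pos


def read_directory_entry(data, pos, known_tags, is_transformed):
    """Read one entry; returns (tag, orig_length, transform_length, next_pos).

    transform_length is None when the table carries no transform.
    """
    flags = data[pos]
    pos += 1
    index = flags & 0x3F  # bits [0..5]; bits 6 and 7 are reserved
    if index == 63:
        # An arbitrary 4-byte tag follows the flags byte.
        tag = bytes(data[pos:pos + 4])
        pos += 4
    else:
        tag = known_tags[index]
    orig_length, pos = read_uint_base128(data, pos)
    transform_length = None
    if is_transformed(tag):
        transform_length, pos = read_uint_base128(data, pos)
    return tag, orig_length, transform_length, pos
```

Because the transform decision is made on the resolved tag rather than on the flags index, the sketch treats a table the same way whether its tag was given by index or spelled out explicitly.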
The CompressedFontData field in a WOFF2 file contains the concatenation of data for each table in the font, in the order that entries appear in the table directory, then compressed using the Brotli compression algorithm [Brotli]. The process of decoding the table data in a WOFF2 font file can be specified by decompressing the byte-level compression of the CompressedFontData field, yielding a "table data block", then applying additional decoding steps as described below. An actual implementation is free to combine these steps or perform some of the steps in an incremental or streaming fashion, but the results must be consistent with the sequential process as specified here.
Certain tables (such as glyf and loca tables, identified by their corresponding tags), are subject to additional transforms. If a font table is not transformed, then the table data appears in the compressed stream in literal form, and occupies origLength bytes of the table data block. If the table is transformed, then the table data must be additionally processed by a transformation specified below. In this case, the transformed table occupies transformLength bytes of the table data block.
The known table flag values should not be relied upon in determining the presence of transformed tables. For example, the glyf table can be represented in the table directory either with flag = 10 and no tag, or with flag = 63 followed by the 'glyf' tag.
The sum of the origLength (for non-transformed tables) and transformLength (for transformed tables) fields in the table directory MUST equal the size of the uncompressed table data block.
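A decoder can check this constraint before processing the table data block. The Python sketch below assumes each directory entry has been reduced to an `(orig_length, transform_length)` pair, with `transform_length` set to `None` for tables that are not transformed; that representation and the function name are assumptions for illustration.

```python
def expected_table_data_size(entries):
    """Sum the lengths that each table occupies in the table data block.

    entries: iterable of (orig_length, transform_length) pairs, where
    transform_length is None for non-transformed tables.
    """
    total = 0
    for orig_length, transform_length in entries:
        if transform_length is None:
            total += orig_length       # literal table data
        else:
            total += transform_length  # transformed table data (may be 0)
    return total
```

Comparing the result against the actual length of the decompressed block gives an early signal that a file is malformed.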
A transform MUST be applied to two types of tables: glyf, representing outline data, and loca, representing the offsets of the individual glyphs within a glyf table, if these tables are present in a font. Additional constraints apply, as specified in section 4.4. The glyf table transformation is specified in section 4.1, and the loca table transformation is specified in section 4.3.
The WOFF 2.0 transformations applied to certain tables are designed to reduce or eliminate the built-in redundancies of the SFNT format and restructure the font data stream for more efficient entropy encoding. As a result, the reconstructed font data will retain the exact functionality of the input font file, but due to certain possible encoding variations (such as various levels of optimization of outline point coordinates in the 'glyf' table, or differences in offset calculations of the 'loca' table) different WOFF2 decoders may produce an output file that will not be a bitwise match to the input font file. These differences will invalidate the 'DSIG' table, if one is present; therefore, a compliant WOFF2 encoder SHOULD remove the DSIG table from the input font data prior to applying the transformation and entropy coding steps.
WOFF 2.0 encoders SHOULD also set bit 11 of the 'flags' field of the head table (see [OFF] specification) to indicate that a recreated font file was subject to a lossless modifying transform.
The glyf table transformation is intended to reduce redundant information and provide a more efficient encoding of the actual TrueType outlines of glyphs. The modified transformation is specified below and is based on a similar transformation described in the MicroType Express [MTX] specification. The reference to MTX is informative; the details of the modified transformation are stated below and this section is normative.
For greater compression effectiveness, the glyf table is split into seven substreams, to group like data together. The transformed table consists of a header with the size of each of the substreams, followed by the substreams in sequence. During the decoding process the reverse transformation takes place, where data from various separate substreams are recombined to create a complete glyph record for each entry of the original glyf table.
|Transformed glyf Header|
|Data Type||Semantic||Description and value type (if applicable)|
|UInt16||numGlyphs||Number of glyphs|
|UInt16||indexFormat||Offset format for loca table, should be consistent with indexToLocFormat of the original head table (see [OFF] specification)|
|UInt32||nContourStreamSize||Size of nContour stream (a stream of Int16 values)|
|UInt32||nPointsStreamSize||Size of nPoints stream (a stream of 255UInt16 values)|
|UInt32||flagStreamSize||Size of flag stream (a stream of UInt8 values)|
|UInt32||glyphStreamSize||Size of glyph stream (a stream of variable-length encoded values, see description below)|
|UInt32||compositeStreamSize||Size of composite stream (a stream of variable-length encoded values, see description below)|
|UInt32||bboxStreamSize||Size of bbox stream (a stream of Int16 values)|
|UInt32||instructionStreamSize||Size of instruction stream (a stream of UInt8 values)|
|UInt8||bboxBitmap[n]||Bitmap indicating explicit bounding boxes|
The format is best characterized by describing the decoding process, especially indications of what are valid and invalid data. It is up to the encoder to produce transformed data that is valid and decodes to the desired font data. Note also that this format specifies the decoded result at the semantic level, not specific byte streams.
Included in the Transformed glyf Header is a bboxBitmap indicating for each glyph whether it contains an explicit bounding box, or whether the bounding box is to be inferred from the coordinate values. If the bounding box is to be inferred, the explicit xMin, yMin, xMax and yMax values must be calculated at the time of decoding the outline point coordinates. The total number of bytes in this bitmap is equal to 4 * ((numGlyphs + 31) / 32). The bits are packed so that glyph number 0 corresponds to the most significant bit of the first byte, glyph number 7 corresponds to the least significant bit of the first byte, glyph number 8 corresponds to the most significant bit of the second byte, and so on. A 1 value indicates an explicitly set bounding box.
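The bitmap size and bit order described above can be illustrated with a small Python sketch. The function names are hypothetical, and the division in the size formula is taken to be integer division.

```python
def bbox_bitmap_size(num_glyphs):
    """Number of bytes in bboxBitmap: 4 * ((numGlyphs + 31) / 32)."""
    return 4 * ((num_glyphs + 31) // 32)


def has_explicit_bbox(bbox_bitmap, glyph_id):
    """True if the bit for glyph_id is set.

    Glyph 0 maps to the most significant bit of the first byte,
    glyph 7 to its least significant bit, glyph 8 to the most
    significant bit of the second byte, and so on.
    """
    return bool(bbox_bitmap[glyph_id >> 3] & (0x80 >> (glyph_id & 7)))
```

The formula rounds the bitmap up to a multiple of four bytes, so a font with 33 glyphs uses 8 bytes of bitmap.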
After reading the TransformedGlyphHeader, the decoding process iterates one glyph at a time. For each glyph, it reads zero or more bytes from each of the streams referenced in the TransformedGlyphHeader. Also, at the point of reconstructing a glyph, a decoder should store for each glyph the corresponding offset in the reconstructed glyph table, and this data will collectively become the contents of the reconstructed loca table (see section 4.3 below for more information about the reconstruction of the loca table).
The reconstruction process for a single glyph consists of performing the following steps:
1. Read an Int16 from the nContour stream. Store this in the numberOfContours field in the reconstructed TrueType glyph. The interpretation of the field is the same as in the TrueType spec; if it is zero, the glyph is empty. If it is positive, the glyph is simple and the value represents the number of contours in the outline. If the nContour value is equal to -1 (0xffff), then the glyph is composite.
For a simple glyph, the process continues as follows:
2. Read numberOfContours 255UInt16 values from the nPoints stream. Each of these is the number of points of that contour. Convert this into the endPtsOfContours array by computing the cumulative sum, then subtracting one. For example, if the values in the stream are [2, 4], then the endPtsOfContours array is [1, 5]. Also, the sum of all the values in the array is the total number of points in the glyph, nPoints. In the example given, the value of nPoints is 6.
3. Read nPoints UInt8 values from the flags stream. Each corresponds to one point in the reconstructed glyph outline. The interpretation of the flag byte is described in detail in section 4.2.
4. For each point (i.e. nPoints times), read a number of point coordinate bytes from the glyph stream. The number of point coordinate bytes is a function of the flag byte read in the previous step: for (flag & 0x7f) in the range 0 to 83 inclusive, it is one byte; in the range 84 to 119 inclusive, it is two bytes; in the range 120 to 123 inclusive, it is three bytes; and in the range 124 to 127 inclusive, it is four bytes. Decode these bytes according to the procedure specified in section 4.2 to reconstruct delta-x and delta-y values of the glyph point coordinates. Store these delta-x and delta-y values in the reconstructed glyph using the standard TrueType glyph encoding [OFF] section 5.3.3.
5. Read one 255UInt16 value from the glyph stream, which is instructionLength, the number of instruction bytes.
6. Read instructionLength bytes from instructionStream, and store these in the reconstituted glyph as instructions.
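Two of the simple-glyph steps above lend themselves to a short Python sketch: converting per-contour point counts into endPtsOfContours (step 2) and finding how many coordinate bytes to read for a point (step 4). The function names are hypothetical, and the second function assumes that the byte count depends only on the lower seven bits of the flag, in line with the flag layout described in section 4.2.

```python
from itertools import accumulate


def end_pts_of_contours(points_per_contour):
    """Cumulative sum of the per-contour point counts, minus one."""
    return [total - 1 for total in accumulate(points_per_contour)]


def coordinate_byte_count(flag):
    """Number of glyph-stream bytes holding the coordinates of one point."""
    index = flag & 0x7F  # the top bit is the on-/off-curve indicator
    if index < 84:
        return 1
    if index < 120:
        return 2
    if index < 124:
        return 3
    return 4
```

For the example in step 2, `end_pts_of_contours([2, 4])` returns `[1, 5]`, and the total number of points is `sum([2, 4])`, which is 6.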
For a composite glyph (nContour == -1), the following steps take the place of steps 2-6 above:
2a. Read a UInt16 from compositeStream. This is interpreted as a component flag word as in the TrueType spec. Based on the flag values, there are between 4 and 14 additional argument bytes, interpreted as glyph index, arg1, arg2, and optional scale or affine matrix.
3a. Read the number of argument bytes as determined in step 2a from the composite stream, and store these in the reconstructed glyph. If the flag word read in step 2a has the FLAG_MORE_COMPONENTS bit (1 << 5) set, go back to step 2a.
4a. If any of the flag words had the FLAG_WE_HAVE_INSTRUCTIONS bit (1 << 8) set, then read the instructions from the glyph and store them in the reconstructed glyph, using the same process as described in steps 5 and 6 above.
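The loop over components in steps 2a to 4a can be sketched in Python as follows. The flag constants other than the two named above are taken from the TrueType composite glyph description; the function names and the returned tuple shape are assumptions for illustration.

```python
# Component flag bits as defined for TrueType composite glyphs.
ARG_1_AND_2_ARE_WORDS = 1 << 0
WE_HAVE_A_SCALE = 1 << 3
FLAG_MORE_COMPONENTS = 1 << 5
WE_HAVE_AN_X_AND_Y_SCALE = 1 << 6
WE_HAVE_A_TWO_BY_TWO = 1 << 7
FLAG_WE_HAVE_INSTRUCTIONS = 1 << 8


def component_argument_bytes(flags):
    """Bytes following a flag word: glyph index, arg1/arg2, optional scale."""
    size = 2  # glyph index
    size += 4 if flags & ARG_1_AND_2_ARE_WORDS else 2
    if flags & WE_HAVE_A_SCALE:
        size += 2
    elif flags & WE_HAVE_AN_X_AND_Y_SCALE:
        size += 4
    elif flags & WE_HAVE_A_TWO_BY_TWO:
        size += 8
    return size


def read_components(stream, pos=0):
    """Walk one composite glyph; returns (components, have_instructions, next_pos)."""
    components = []
    have_instructions = False
    while True:
        flags = int.from_bytes(stream[pos:pos + 2], "big")
        pos += 2
        count = component_argument_bytes(flags)
        components.append((flags, bytes(stream[pos:pos + count])))
        pos += count
        if flags & FLAG_WE_HAVE_INSTRUCTIONS:
            have_instructions = True
        if not flags & FLAG_MORE_COMPONENTS:
            return components, have_instructions, pos
```

The smallest and largest results of `component_argument_bytes` are 4 and 14, matching the range given in step 2a.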
Finally, for both simple and composite glyphs, if the corresponding bit in the bounding box bit vector is set, then additionally read 4 Int16 values from the bbox stream, representing xMin, yMin, xMax, and yMax, respectively, and record these into the corresponding fields of the reconstructed glyph. If the corresponding bit in the bounding box bit vector is not set, then derive the bounding box by computing the minimum and maximum x and y coordinates in the outline, and storing that.
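For a simple glyph without an explicit bounding box, the derivation amounts to accumulating the decoded deltas into absolute coordinates and taking their extremes. The Python sketch below assumes the deltas are available as a list of `(dx, dy)` pairs with at least one point, and that the first delta is relative to (0,0); the function name is hypothetical.

```python
from itertools import accumulate


def derive_bbox(deltas):
    """Return (xMin, yMin, xMax, yMax) for a non-empty list of (dx, dy) deltas."""
    xs = list(accumulate(dx for dx, _ in deltas))
    ys = list(accumulate(dy for _, dy in deltas))
    return min(xs), min(ys), max(xs), max(ys)
```

Since the running sums are needed anyway to interpret the outline, a streaming decoder can track the four extremes while it reads the points rather than in a separate pass.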
A composite glyph MUST have an explicitly supplied bounding box. The motivation is that computing bounding boxes for composite glyphs is more complicated, and would require resolving references to component glyphs taking into account composite glyph instructions and the specified scales of individual components, which would conflict with a purely streaming implementation of font decoding.
A simple glyph data structure defines all contours that comprise a glyph outline, which are represented by a sequence of on- and off-curve coordinate points. These point coordinates are encoded as delta values representing the incremental values between the previous and current corresponding X and Y coordinates of a point; the first point of each outline is relative to the (0,0) point. To minimize the size of the dataset of point coordinate values, each point is represented as a (flag, xCoordinate, yCoordinate) triplet. The flag value is stored in a separate data stream and the coordinate values are stored as part of the glyph data stream using a variable-length encoding format consuming a total of 2-5 bytes per point.
The most significant bit of a flag indicates whether the point is an on- or off-curve point. The remaining seven bits of the flag determine the format of the X and Y coordinate values and specify 128 possible combinations of indices, which have been assigned taking into consideration the typical statistical distribution of data found in TrueType fonts. When X and Y coordinate values are recorded using nibbles (either 4 bits per coordinate or 12 bits per coordinate), the bits are packed in the byte stream with the most significant bit of the X coordinate first, followed by the value for the Y coordinate (most significant bit first). As a result, the size of the glyph dataset is significantly reduced, and the grouping of similar values (flags, coordinates) in separate and contiguous data streams allows more efficient application of the entropy coding applied as the second stage of the encoding process.
Each of the 128 index values defines the following properties, which are specified in detail in the table below:
Please note that “Byte Count” field reflects total size of the triplet (flag, xCoordinate, yCoordinate), including ‘flag’ value that is encoded in a separate stream.
|Index||Byte Count||X bits||Y bits||Delta X||Delta Y||X sign||Y sign|
For additional information and background on the triplet encoding, please see section 5.11 of the MTX proposal [MTX].
The transformLength of the transformed loca table MUST always be zero. The origLength MUST be the appropriate size (generally numGlyphs times a size per glyph, where that size per glyph is two bytes when indexFormat (defined in section 4.1. Transformed glyf table format) is zero, otherwise four bytes).
The loca table MUST be reconstructed when the glyf table is decoded. The process for reconstructing the loca table is simply to record the offsets for individual glyphs obtained while reconstructing the glyf table as specified in section 4.1.
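Serializing the recorded offsets could look like the following Python sketch. It relies on the standard loca conventions from the SFNT specifications: the short format (indexFormat zero) stores each offset divided by two as a UInt16, and the long format stores each offset as a UInt32. The function name is hypothetical, and whether the list includes a final end-of-table offset is left to the caller.

```python
import struct


def build_loca(offsets, index_format):
    """Pack glyph offsets into loca table bytes (big-endian)."""
    if index_format == 0:
        # Short format: offsets are stored halved, so they must be even
        # and small enough to fit in 16 bits after halving.
        if any(offset % 2 or offset > 0x1FFFE for offset in offsets):
            raise ValueError("offset not representable in short loca format")
        return struct.pack(">%dH" % len(offsets), *(o // 2 for o in offsets))
    return struct.pack(">%dI" % len(offsets), *offsets)
```

A decoder that builds the table this way can compare the length of the result against the origLength recorded for the loca table in the table directory.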
The following constraints on valid WOFF2 files are intended to facilitate a memory-efficient WOFF 2.0 file transfer and decoding process. For a font with TrueType outlines, the glyf table MUST be encoded with the transform, which results in significantly smaller file size.
The loca table MUST follow the glyf table in the table directory. Additional constraints on the loca table in this case are given in section 4.3.
In addition to table order, there is an additional constraint for transformed glyf tables: the origLength field MUST specify an adequate amount of space to represent the reconstructed glyf table. Since there are multiple valid reconstructions of a glyph, according to the encoding rules specified in section 5.3.3 of [OFF], it is necessary to specify a nominal size for a reconstructed glyph. Depending on the context, this nominal size may be greater than, less than, or equal to the actual size of the glyf table in the source font being compressed.
The nominal size of a glyph is the size of encoding a glyph according to section 5.3.3 of [OFF], applying the following rules to resolve choices between multiple valid encodings:
Note that the nominal size is not necessarily minimal. For one, while section 5.3.4 of [OFF] specifies in a note that offsets "should" be multiples of 4, fonts with other alignments are allowed. In addition, a very aggressively optimizing font generation tool may be able to exploit a situation where a coordinate encoding other than the maximally compact may allow a longer run of identical flag values, thus saving bytes. In both of these cases, the nominal size may be larger than the size of the glyf table in the source font.
Regardless of the relation between original font table size and nominal size, an encoder MUST supply an origLength value for the transformed glyf table which is greater than or equal to the nominal size. A decoder MAY reject a font not satisfying this constraint. A tool for validating WOFF2 font files SHOULD check this constraint. Note: the motivation for not requiring decoders to check this strictly is to allow implementation freedom for the decoding process, and not require the performance overhead of computing both nominal and actual sizes.
The WOFF2 file MAY contain a block of extended metadata. The interpretation of this block is exactly the same as WOFF 1. However, while in a WOFF 1 file the extended metadata block is stored with zlib compression, in a WOFF2 file it MUST be stored compressed with Brotli. The rationale of this change is to minimize the total number of byte-level compression algorithms needed to implement WOFF2, and also because Brotli is expected to achieve better compression ratios than zlib in most cases.
(Probable todo: copy relevant parts of the WOFF1 spec or provide a normative reference).
The WOFF file MAY include a block of arbitrary data, allowing font creators to include whatever information they wish. The content of this data MUST NOT affect font usage or load behavior of user agents. User agents should make no assumptions about the content of a private block; it may (for example) contain ASCII or Unicode text, or some vendor-defined binary data, and it may be compressed or encrypted, but it has no publicly defined format. Conformant user agents will not assume anything about the structure of this data. Only the font developer or vendor responsible for the private block is expected to understand its contents.
The private data block, if present, MUST be the last block in the WOFF file, following all the font tables and any extended metadata block. The private data block MUST begin on a 4-byte boundary in the WOFF file, with up to three null bytes inserted as padding after any preceding metadata block if needed to ensure this. The end of the private data block MUST correspond to the end of the WOFF file.
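The alignment rule can be expressed as a one-line computation. In the Python sketch below, `end_of_previous_block` is the file offset just past the preceding block; the function name is an assumption for illustration.

```python
def padding_before_private_block(end_of_previous_block):
    """Number of null bytes (0 to 3) needed to reach a 4-byte boundary."""
    return -end_of_previous_block % 4
```

The private data block then starts at `end_of_previous_block + padding_before_private_block(end_of_previous_block)`, which is the value an encoder would record in privOffset.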
This appendix registers a new MIME media type, in conformance with BCP 13 and W3CRegMedia.
Fonts are interpreted data structures that represent collections of glyph outlines, metrics and layout information for various languages and writing systems. Currently, there are many standardized font data tables that allow an unspecified number of entries, and where existing, predefined data fields allow storage of binary data with variable length. There is a significant risk that the flexibility of font data structures may be exploited to hide malicious binary content disguised as a font data component.
WOFF 2.0 is based on the table-based SFNT (scalable font) format which is highly extensible and offers an opportunity to introduce additional data structures when needed. However, this same extensibility may present specific security concerns – the flexibility and ease of defining new data structures makes it easy for any arbitrary data to be added and hidden inside a font file.
WOFF 2.0 fonts may contain 'hints' for the alignment of graphical elements of the glyphs with the target display pixel grid, and depending on the font technology utilized in the creation of a font these hints may represent active code interpreted and executed by the font rasterizer. Even though they operate within the confines of the glyph outline conversion system and have no access outside the font rendering machinery, hint instructions can be quite complex, and a maliciously designed complex font could cause undue resource consumption (e.g. memory or CPU cycles) on a machine interpreting it. Indeed, fonts are sufficiently complex that most if not all interpreters cannot be completely protected from malicious fonts without undue performance penalties.
The widespread use of fonts as a necessary component of visual content presentation warrants careful attention to security considerations whenever a font is either embedded into an electronic document or transmitted alongside media content as a linked resource.
WOFF 2.0 uses Brotli compression. The WOFF2 header contains the length of the uncompressed font data including the sfnt header, directory, and font tables (including padding). Applications may therefore constrain the size of memory buffer allocated for decompression and may stop writing if a maliciously crafted WOFF file in fact contains more data than is indicated.
WOFF 2.0 does not provide privacy protections internally; if needed, these should be provided externally.
WOFF 2.0 has a private data block facility, which may contain arbitrary binary data. WOFF 2.0 does not provide a means to access this, or to execute any code contained therein. As with WOFF 1.0, it is required that the content of this block not affect font rendering in any way.
WOFF 2.0 is an improvement on WOFF 1.0. The two formats have different Internet Media Types and may be used in parallel.
This media type registration is extracted from the WOFF 2.0 specification at W3C.
WOFF 2.0 is used by Web browsers, often in conjunction with HTML and CSS.
Chris Lilley (firstname.lastname@example.org).
The WOFF2 specification is a work product of the World Wide Web Consortium's WebFonts Working Group.
The W3C has change control over this specification.