Copyright © 2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
This CSS3 module defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white space handling, and text transformation.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This CSS module has been produced as a combined effort of the W3C Internationalization Activity and the Style Activity, and is maintained by the CSS Working Group. It also includes contributions made by participants in the XSL Working Group (members only).
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Feedback on this draft should be posted to the (archived) public mailing list www-style@w3.org (see instructions) with [css3-text] in the subject line. You are strongly encouraged to complain if you see something stupid in this draft. The editors will do their best to respond to all feedback.
The following features are at risk and may be cut from the spec during its CR period if there are no (correct) implementations:
‘full-width’ value of ‘text-transform’
‘tab-size’ property
‘start end’ value of ‘text-align’
‘text-justify’ property
‘word-spacing’
‘word-spacing’
‘hanging-punctuation’ property
‘white-space’ property
This module describes the typesetting controls of CSS; that is, the features of CSS that control the translation of source text to formatted, line-wrapped text. Various CSS properties provide control over case transformation, white space collapsing, text wrapping, line breaking rules and hyphenation, alignment and justification, spacing, and indentation.
Font selection is covered in CSS Fonts Level 3 [CSS3-FONTS].
Features for decorating text, such as underlines, emphasis marks, and shadows, (previously part of this module) are covered in CSS Text Decoration Level 3 [CSS3-TEXT-DECOR].
Bidirectional and vertical text are addressed in CSS Writing Modes Level 3 [CSS3-WRITING-MODES].
This draft describes features that are specific to certain scripts. There is an ongoing discussion about where these features belong: in existing CSS properties, in new CSS properties, or perhaps in other specifications.
This module, together with [CSS3-TEXT-DECOR], replaces and extends the text-level features defined in [CSS21] chapter 16.
This specification follows the CSS property definition conventions from [CSS21]. Value types not defined in this specification are defined in CSS Level 2 Revision 1 [CSS21]. Other CSS modules may expand the definitions of these value types: for example [CSS3VAL], when combined with this module, expands the definition of the <length> value type as used in this specification.
In addition to the property-specific values listed in their definitions, all properties defined in this specification also accept the inherit keyword as their property value. For readability it has not been repeated explicitly.
A grapheme cluster is what a language user considers to be a character or a basic unit of the script. The term is described in detail in the Unicode Technical Report: Text Boundaries [UAX29]. This specification uses the extended grapheme cluster definition in [UAX29] (not the legacy grapheme cluster definition). The UA may further tailor the definition as allowed by Unicode. Within this specification, the ambiguous term character is used as a friendlier synonym for grapheme cluster. See Characters and Properties for how to determine the Unicode properties of a character.
A letter for the purpose of this specification is a character belonging to one of the Letter or Number general categories in Unicode. [UAX44]
The rendering characteristics of a character divided by an element boundary are undefined: it may be rendered as belonging to either side of the boundary, or as some approximation of belonging to both. Authors are forewarned that dividing grapheme clusters by element boundaries may give inconsistent or undesired results.
The content language of an element is the
(human) language the element is declared to be in, according to the rules
of the document
language. For example, the rules for determining the content language of an HTML element
use the lang attribute and are defined in [HTML5], and the rules for determining
the content language of an XML
element use the xml:lang attribute and are defined in [XML10]. Note that it is
possible for the content language
of an element to be unknown.
Other terminology and concepts used in this specification are defined in [CSS21] and [CSS3-WRITING-MODES].
‘text-transform’ property

| Name: | text-transform |
|---|---|
| Value: | none | capitalize | uppercase | lowercase | full-width |
| Initial: | none |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | as specified |
This property transforms text for styling purposes. (It has no effect on the underlying content.) Values have the following meanings:
‘none’
‘capitalize’
‘uppercase’
‘lowercase’
‘full-width’
The definition of “word” used for ‘capitalize’ is UA-dependent; [UAX29] is suggested (but not
required) for determining such word boundaries. Authors should not expect
‘capitalize’ to follow language-specific
titlecasing conventions (such as skipping articles in English).
The following example converts the ASCII characters used in abbreviations in Japanese text to their fullwidth variants so that they lay out and line break like ideographs:
abbr:lang(ja) { text-transform: full-width; }
The case mapping rules for the character repertoire specified by the Unicode Standard can be found on the Unicode Consortium Web site [UNICODE]. The UA must use the full case mappings for Unicode characters, including any conditional casing rules, as defined in Default Case Algorithm section. If (and only if) the content language of the element is, according to the rules of the document language, known, then any appropriate language-specific rules must be applied as well. These minimally include, but are not limited to, the language-specific rules in Unicode's SpecialCasing.txt.
For example, in Turkish there are two “i”s, one with a dot—“İ” and “i”— and one without—“I” and “ı”. Thus the usual case mappings between “I” and “i” are replaced with a different set of mappings to their respective undotted/dotted counterparts, which do not exist in English. This mapping must only take effect if the content language is Turkish (or another Turkic language that uses Turkish casing rules); in other languages, the usual mapping of “I” and “i” is required. This rule is thus conditionally defined in Unicode's SpecialCasing.txt file.
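As a rough illustration of the language dependence described here, the sketch below assumes markup that declares its content language with the HTML lang attribute. The class name `.shout` and the sample words are hypothetical and not part of this specification.

```css
/* Hypothetical class name. Assumes markup such as
   <p lang="tr" class="shout">ılık ilk</p>
   <p lang="en" class="shout">ink</p> */
.shout {
  text-transform: uppercase;
}

/* In the Turkish paragraph a UA applying Turkish casing rules would map
   "i" to "İ" (U+0130) and "ı" (U+0131) to "I".
   In the English paragraph the usual mapping of "i" to "I" applies.
   The style rule is identical; only the declared content language differs. */
```

The same declaration can therefore produce different glyphs depending on the declared language, which is why leaving the content language unknown disables the language-specific mappings.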
The definition of fullwidth and halfwidth forms can be found on the
Unicode consortium web site at [UAX11]. The mapping to fullwidth
form is defined by taking code points with the <wide>
or the <narrow> tag in their
Decomposition_Mapping in [UAX44]. For the
<narrow> tag, the mapping is from the code point to the
decomposition (minus <narrow> tag), and for the
<wide> tag, the mapping is from the decomposition
(minus the <wide> tag) back to the original code point.
Text transformation happens after white
space processing, which means that ‘full-width’ transforms only preserved U+0020 spaces to
U+3000.
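The ordering of white space processing and text transformation can be sketched with two rules. The selectors are hypothetical, and the comments describe the behavior this section implies rather than any tested implementation.

```css
/* Hypothetical selectors, for illustration only. */

/* All spaces in the source are preserved, so each U+0020 that survives
   white space processing would be transformed to U+3000. */
pre.fullwidth-demo {
  white-space: pre;
  text-transform: full-width;
}

/* A run of spaces first collapses to a single space; only that one
   remaining space would then be transformed to U+3000. */
p.fullwidth-demo {
  white-space: normal;
  text-transform: full-width;
}
```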
A future level of CSS may introduce the ability to create
custom mapping tables for less common text transforms, such as by an
‘@text-transform’ rule similar to ‘@counter-style’ from [CSS-COUNTER-STYLES-3].
‘white-space’ property

| Name: | white-space |
|---|---|
| Value: | normal | pre | nowrap | pre-wrap | pre-line |
| Initial: | not defined for shorthand properties |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | see individual properties |
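This property specifies two things: whether and how white space inside the element is collapsed, and whether lines may wrap at unforced soft wrap opportunities.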
Values have the following meanings, which must be interpreted according to the White Space Processing and Line Breaking rules:
‘normal’
‘pre’
‘nowrap’
Like ‘normal’, this value collapses white
space; but like ‘pre’, it does not allow
wrapping.
‘pre-wrap’
Like ‘pre’, this value preserves white
space; but like ‘normal’, it allows wrapping.
‘pre-line’
Like ‘normal’, this value collapses
consecutive spaces and allows wrapping, but preserves segment breaks in the source as forced line breaks.
The following informative table summarizes the behavior of various ‘white-space’
values:
| | New Lines | Spaces and Tabs | Text Wrapping |
|---|---|---|---|
| ‘normal’ | Collapse | Collapse | Wrap |
| ‘pre’ | Preserve | Preserve | No wrap |
| ‘nowrap’ | Collapse | Collapse | No wrap |
| ‘pre-wrap’ | Preserve | Preserve | Wrap |
| ‘pre-line’ | Preserve | Collapse | Wrap |
See White Space Processing Rules
for details on how white space collapses. An informative summary of
collapsing (‘normal’ and ‘nowrap’) is presented below:
See Line Breaking for details on wrapping behavior.
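The values summarized in the table might be applied as follows. This is a minimal sketch: all selectors are hypothetical, and the comments restate the table rather than add requirements.

```css
/* Hypothetical selectors, for illustration only. */

/* Source listings: preserve new lines, spaces and tabs; never wrap. */
pre.listing { white-space: pre; }

/* Console output: preserve all white space, but wrap long lines. */
.console-output { white-space: pre-wrap; }

/* Verse: keep the author's line breaks, collapse runs of spaces, wrap. */
.verse { white-space: pre-line; }

/* Short labels: collapse white space and keep the text on one line. */
.nav-label { white-space: nowrap; }

/* Ordinary prose: collapse white space and wrap (the 'normal' row). */
p { white-space: normal; }
```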
The source text of a document often contains formatting that is not relevant to the final rendering: for example, breaking the source into segments (lines) for ease of editing or adding white space characters such as tabs and spaces to indent the source code. CSS white space processing allows the author to control interpretation of such formatting: to preserve or collapse it away when rendering the document. White space processing in CSS interprets white space characters only for rendering: it has no effect on the underlying document data.
White space processing in CSS is controlled with the ‘white-space’
property.
CSS does not define document segmentation
rules. Segments could be separated by a particular newline sequence (such
as a line feed or CRLF pair), or delimited by some other mechanism, such
as the SGML RECORD-START and RECORD-END tokens.
For CSS processing, each document language–defined segment break, CRLF
sequence (U+000D U+000A), carriage return (U+000D), and line feed (U+000A)
in the text is treated as a segment break,
which is then interpreted for rendering as specified by the ‘white-space’
property.
Note that the document parser may have not only normalized any segment breaks, but also collapsed other space characters or otherwise processed white space according to markup rules. Because CSS processing occurs after the parsing stage, it is not possible to restore these characters for styling. Therefore, some of the behavior specified below can be affected by these limitations and may be user agent dependent.
Note that anonymous inlines consisting entirely of collapsible white space are removed from the rendering tree. See [CSS21] section 9.2.2.1.
Control characters (Unicode class Cc) other than tab (U+0009), line feed (U+000A), and carriage return (U+000D) are ignored for the purpose of rendering.
White space processing affects only spaces (U+0020), tabs (U+0009), and segment breaks.
For each inline (including anonymous inlines) within an inline formatting context, white space characters are handled as follows, ignoring bidi formatting characters as if they were not there:
If ‘white-space’ is set to ‘normal’, ‘nowrap’, or
‘pre-line’, white space characters are
considered collapsible and are processed by
performing the following steps:
If ‘white-space’ is set to ‘pre-wrap’, any sequence of spaces is treated as a
sequence of non-breaking spaces. However, a soft wrap opportunity exists at
the end of the sequence.
Then, the entire block is rendered. Inlines are laid out, taking bidi
reordering into account, and wrapping as
specified by the ‘white-space’ property.
The following example illustrates the interaction of white-space collapsing and bidirectionality. Consider the following markup fragment, taking special note of spaces (with varied backgrounds and borders for emphasis and identification):
<ltr>A <rtl> B </rtl> C</ltr>
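where the <ltr> element represents a left-to-right
embedding and the <rtl> element represents a
right-to-left embedding. If the ‘white-space’ property is set to ‘normal’, the white-space processing model would result
in the following:
The space before the B would collapse with the space after the A.
The space before the C would collapse with the space after the B.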
This would leave two spaces, one after the A in the left-to-right embedding level, and one after the B in the right-to-left embedding level. This is then ordered according to the Unicode bidirectional algorithm, with the end result being:
A  BC
Note that there are two spaces between A and B, and none between B and C. This is best avoided by putting spaces outside the element instead of just inside the opening and closing tags and, where practical, by relying on implicit bidirectionality instead of explicit embedding levels.
When ‘white-space’ is ‘pre’, ‘pre-wrap’, or
‘pre-line’, segment
breaks are not collapsible and
are instead transformed into a preserved line feed (U+000A).
For other values of ‘white-space’, segment breaks are collapsible, and are either transformed
into a space (U+0020) or removed depending on the context before and after
the break:
Note that the white space processing rules have already removed any tabs and spaces after the segment break before these checks take place.
Comments on how well this would work in practice would be very much appreciated, particularly from people who work with Thai and similar scripts. Note that browser implementations do not currently follow these rules (although IE does in some cases transform the break).
As each line is laid out,
‘tab-size’ property.
‘white-space’
set to ‘pre-wrap’ the UA may visually
collapse their character advance widths.
White space that was not removed or collapsed during the white space processing steps is called preserved white space.
‘tab-size’ property

| Name: | tab-size |
|---|---|
| Value: | <integer> | <length> |
| Initial: | 8 |
| Applies to: | block containers |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | the specified integer or length made absolute |
This property determines the tab size used to render preserved tab characters (U+0009). Integers represent the measure as multiples of the space character's advance width (U+0020). Negative values are not allowed.
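A short sketch of both value types follows. The selectors are hypothetical; because the property only affects preserved tab characters, each rule also sets a ‘white-space’ value that preserves them.

```css
/* Hypothetical selectors, for illustration only. */

/* Integer value: tab stops every 4 space-character advance widths. */
pre.listing {
  white-space: pre;
  tab-size: 4;
}

/* Length value: tab stops every 2em, independent of the space width. */
pre.wide-tabs {
  white-space: pre-wrap;
  tab-size: 2em;
}

/* Negative values such as 'tab-size: -2' are not allowed and would be
   treated as invalid declarations. */
```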
When inline-level content is laid out into lines, it is broken across line boxes. Such a break is called a line break. When a line is broken due to explicit line-breaking controls, or due to the start or end of a block, it is a forced line break. When a line is broken due to content wrapping (i.e. when the UA creates unforced line breaks in order to fit the content within the measure), it is a soft wrap break. The process of breaking inline-level content into lines is called line breaking.
Wrapping is only performed at an allowed break point, called a soft wrap opportunity.
In most writing systems, in the absence of hyphenation a soft wrap opportunity occurs only at word boundaries. Many such systems use spaces or punctuation to explicitly separate words, and soft wrap opportunities can be identified by these characters. Scripts such as Thai, Lao, and Khmer, however, do not use spaces or punctuation to separate words. Although the zero width space (U+200B) can be used as an explicit word delimiter in these scripts, this practice is not common. As a result, a lexical resource is needed to correctly identify soft wrap opportunities in such texts.
In several other writing systems (including Chinese, Japanese, Yi, and sometimes also Korean), a soft wrap opportunity is based on syllable boundaries, not word boundaries. In these systems a line can break anywhere except between certain character combinations. Additionally, the level of strictness in these restrictions can vary with the typesetting style.
CSS does not fully define where soft wrap opportunities occur; however, some controls are provided to distinguish common variations.
Further information on line breaking conventions can be found in [JLREQ] and [JIS4051] for Japanese, [ZHMARK] for Chinese, and in [UAX14] for all scripts in Unicode.
Any guidance for appropriate references here would be much appreciated.
When determining line breaks:
Regardless of the ‘white-space’ value, lines always break at
each preserved forced break character:
for all values, line-breaking behavior defined for the BK, CR, LF, CM,
NL, and SG line breaking classes in [UAX14] must be honored.
When ‘white-space’ allows wrapping, line breaking
behavior defined for the WJ, ZW, and GL line-breaking classes in [UAX14] must be
honored.
‘/’ and the ‘e’. The UA may use the width of the containing
block, the text's language, and other factors in assigning priorities. As
long as care is taken to avoid such awkward breaks, allowing breaks at
appropriate punctuation other than spaces is recommended, as it results
in more even-looking margins, particularly in narrow measures.
‘line-break’ property

| Name: | line-break |
|---|---|
| Value: | auto | loose | normal | strict |
| Initial: | auto |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value |
This property specifies the strictness of line-breaking rules applied within an element: particularly how wrapping interacts with punctuation and symbols. Values have the following meanings:
‘auto’
‘loose’
‘normal’
‘strict’
It is unclear how Korean should be handled here. It should
perhaps not be included in the lists below (alongside Chinese and
Japanese). Also, the behavior of ‘word-break:
keep-all’ may be more appropriate if it also triggered ‘strict’ line-breaking here, at least in the case of
‘auto’. See this
thread for further discussion of these issues.
CSS distinguishes between three levels of strictness in the rules for text wrapping. The precise set of rules in effect for each level is up to the UA and should follow language conventions. However, this specification does require that:
‘strict’
line breaking and allowed in ‘normal’ and
‘loose’:
CJ. (See LineBreak.txt
in [UNICODE].)
‘normal’
and ‘strict’ line breaking and allowed in
‘loose’:
IN. (See LineBreak.txt
in [UNICODE].)
These rules should be cross-checked against JLREQ and any differences verified.
In the recommended list above, no distinction is made among the levels of strictness in non-CJK text: only CJK codepoints are affected, unless the text is marked as Chinese or Japanese, in which case some additional common codepoints are affected. However a future level of CSS may add behaviors affecting non-CJK text.
The CSSWG recognizes that in a future edition of the specification finer control over line breaking may be necessary to satisfy high-end publishing requirements.
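The strictness levels might be selected per language and per context, as in the hedged sketch below. The selectors are hypothetical, and the precise rule set behind each keyword is up to the UA as described in this section.

```css
/* Hypothetical selectors, for illustration only. */

/* Body text in Japanese: apply the most stringent set of rules. */
:lang(ja) {
  line-break: strict;
}

/* Narrow captions: allow the least restrictive rules so that short
   lines can break in more places. */
.caption:lang(ja) {
  line-break: loose;
}

/* Elsewhere: let the UA choose, for example based on line length. */
body {
  line-break: auto;
}
```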
‘word-break’ property

| Name: | word-break |
|---|---|
| Value: | normal | keep-all | break-all |
| Initial: | normal |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value |
This property specifies soft wrap opportunities between letters, i.e. where it is “normal” and permissible to break lines of text.
For example, in some styles of CJK typesetting, English words are
allowed to break between any two letters, rather than only at spaces or
hyphenation points; this can be enabled with ‘word-break:break-all’.
An example of English text embedded in Japanese being broken at an arbitrary point in the word.
As another example, Korean has two styles of line-breaking: between any
two Korean syllables (‘word-break: normal’)
or, like English, mainly at spaces (‘word-break:
keep-all’).
각 줄의 마지막에 한글이 올 때 줄 나눔 기 /* break between syllables */
준을 “글자” 또는 “어절” 단위로 한다.

각 줄의 마지막에 한글이 올 때 줄 나눔 /* break only at spaces */
기준을 “글자” 또는 “어절” 단위로 한다.
To enable additional break opportunities only in the case of
overflow, see ‘overflow-wrap’.
Values have the following meanings:
‘normal’
‘break-all’
In addition to ‘normal’ soft wrap opportunities, lines
may break between any two letters (except
where forbidden by the ‘line-break’ property). Hyphenation is not
applied. This option is used mostly in a context where the text is
predominantly using CJK characters with few non-CJK excerpts and it is
desired that the text be better distributed on each line.
‘keep-all’
‘line-break’) except where opportunities exist
due to dictionary-based breaking. Otherwise this option is equivalent to
‘normal’. In this style, sequences of CJK
characters do not break.
This is sometimes seen in Korean (which uses spaces between words), and is also useful for mixed-script text where CJK snippets are mixed into another language that uses spaces for separation.
Symbols that line-break the same way as letters of a particular category are affected the same way as those letters.
Here's a mixed-script sample text:
这是一些汉字, and some Latin, و کمی نوشتنن عربی, และตัวอย่างการเขียนภาษาไทย.
The break-points are determined as follows (indicated by ‘·’):
‘word-break: normal’
这·是·一·些·汉·字,·and·some·Latin,·و·کمی·نوشتنن·عربی,·และ·ตัวอย่าง·การเขียน·ภาษาไทย.
‘word-break: break-all’
这·是·一·些·汉·字,·a·n·d·s·o·m·e·L·a·t·i·n,·و·ﮐ·ﻤ·ﻰ·ﻧ·ﻮ·ﺷ·ﺘ·ﻦ·ﻋ·ﺮ·ﺑ·ﻰ,·แ·ล·ะ·ตั·ว·อ·ย่·า·ง·ก·า·ร·เ·ขี·ย·น·ภ·า·ษ·า·ไ·ท·ย.
‘word-break: keep-all’
这是一些汉字,·and·some·Latin,·و·کمی·نوشتنن·عربی,·และตัวอย่างการเขียนภาษาไทย.
When shaping scripts such as Arabic are allowed to break within words
due to ‘break-all’, the characters must still
be shaped as if the word were not broken.
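The two non-default values could be used along these lines. This is a sketch under stated assumptions: the class name `.cjk-body` is hypothetical, and the markup is assumed to declare Korean content with lang="ko".

```css
/* Hypothetical selectors, for illustration only. */

/* Predominantly CJK text with a few Latin excerpts: allow breaks between
   any two letters so that lines fill more evenly. Hyphenation is not
   applied at these breaks. */
.cjk-body {
  word-break: break-all;
}

/* Korean set in the style that breaks mainly at spaces rather than
   between any two syllables. */
:lang(ko) {
  word-break: keep-all;
}
```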
Hyphenation allows the controlled splitting of words to improve the layout of paragraphs, typically splitting words at syllabic or morphemic boundaries and visually indicating the split (usually by inserting a hyphen, U+2010). In some cases, hyphenation may also alter the spelling of a word. Regardless, hyphenation is a rendering effect only: it must have no effect on the underlying document content or on text selection or searching.
Hyphenation occurs when the line breaks at a valid hyphenation opportunity, which creates a
soft wrap opportunity within
the word. In CSS it is controlled with the ‘hyphens’ property. CSS Text Level 3 does not
define the exact rules for hyphenation; however, UAs are strongly
encouraged to optimize their line-breaking implementation to choose good
break points and appropriate hyphenation points. Hyphenation opportunities
are considered when calculating ‘min-content’ intrinsic sizes.
CSS also provides the ‘overflow-wrap’ property, which can allow
arbitrary breaking within words when the text would otherwise overflow its
container.
‘hyphens’ property

| Name: | hyphens |
|---|---|
| Value: | none | manual | auto |
| Initial: | manual |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value |
This property controls whether hyphenation is allowed to create more soft wrap opportunities within a line of text. Values have the following meanings:
‘none’
‘manual’
In Unicode, U+00AD is a conditional "soft hyphen" and U+2010 is an unconditional hyphen. Unicode Standard Annex #14 describes the role of soft hyphens in Unicode line breaking. [UAX14] In HTML, &shy; represents the soft hyphen character which suggests a hyphenation opportunity.
ex&shy;ample
‘auto’
Correct automatic hyphenation requires a hyphenation resource
appropriate to the language of the text being broken. The UA is therefore
only required to automatically hyphenate text for which the author has
declared a language (e.g. via HTML lang or XML
xml:lang) and for which it has an appropriate hyphenation
resource.
When shaping scripts such as Arabic are allowed to break within words due to hyphenation, the characters must still be shaped as if the word were not broken.
For example, if the word “نوشتنن” were hyphenated, it would appear as “ﻧﻮﺷ-ﺘﻦ” not as “ﻧﻮﺵ-ﺗﻦ”.
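A hedged sketch of how the three values might be combined in one style sheet. The selectors are hypothetical, and automatic hyphenation is assumed to depend on the markup declaring a language (for example lang="en" on an ancestor) and on the UA having a hyphenation resource for it.

```css
/* Hypothetical selectors, for illustration only. */

/* Body text: let the UA hyphenate automatically where it has a
   hyphenation resource for the declared language. */
article {
  hyphens: auto;
}

/* Headings: rely only on explicit opportunities in the content,
   such as U+00AD soft hyphens (the initial value). */
article h1 {
  hyphens: manual;
}

/* Code and identifiers: never hyphenate, even at soft hyphens. */
code {
  hyphens: none;
}
```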
‘word-wrap’/‘overflow-wrap’ property

| Name: | overflow-wrap/word-wrap |
|---|---|
| Value: | normal | break-word |
| Initial: | normal |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value |
This property specifies whether the UA may arbitrarily break within a
word to prevent overflow when an otherwise-unbreakable string is too long
to fit within the line box. It only has an effect when ‘white-space’ allows
wrapping. Possible values:
‘normal’
‘word-break:
keep-all’ may be relaxed to match ‘word-break:
normal’
if there are no otherwise-acceptable break points in the line.
‘break-word’
Soft wrap opportunities
introduced by ‘overflow-wrap: break-word’ are
not considered when calculating ‘min-content’
intrinsic sizes.
For legacy reasons, UAs must treat ‘word-wrap’ as an alternate name for the ‘overflow-wrap’
property, as if it were a shorthand of ‘overflow-wrap’.
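Because both names are accepted, authors sometimes declare both, as in the following sketch. The selectors are hypothetical, and declaring the legacy name first is an assumption about older UAs, not a requirement of this specification.

```css
/* Hypothetical selectors, for illustration only. */

/* User-supplied text that may contain long unbreakable strings such as
   URLs: allow a break inside a word only to avoid overflow. */
.comment {
  word-wrap: break-word;      /* legacy alternate name */
  overflow-wrap: break-word;
}

/* The property has no effect here, because this 'white-space' value
   does not allow wrapping. */
.comment pre {
  white-space: pre;
  overflow-wrap: break-word;
}
```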
‘text-align’ property

| Name: | text-align |
|---|---|
| Value: | start | end | left | right | center | justify | match-parent | start end |
| Initial: | start |
| Applies to: | block containers |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value, except for ‘match-parent’ which computes as defined below |
This property describes how the inline-level content of a block is aligned along the inline axis if the content does not completely fill the line box. Values have the following meanings:
‘start’
‘end’
‘left’
‘text-orientation’.) [CSS3-WRITING-MODES]
‘right’
‘text-orientation’.) [CSS3-WRITING-MODES]
‘center’
‘justify’
‘text-justify’
property, in order to exactly fill the line box.
‘match-parent’
This value behaves the same as ‘inherit’ (computes to its parent's computed
value) except that an inherited ‘start’
or ‘end’ keyword is interpreted against
its parent's ‘direction’ value and
results in a computed value of either ‘left’ or ‘right’.
‘start end’
Specifies ‘start’ alignment of the first
line and any line immediately after a forced line break; and ‘end’ alignment of any remaining lines.
A block of text is a stack of line boxes. This property specifies how the inline-level boxes within each line box align with respect to the start and end sides of the line box. Alignment is not with respect to the viewport or containing block.
In the case of ‘justify’, the UA may stretch
or shrink any inline boxes by adjusting their
text. (See ‘text-justify’.) If an element's white space is
not collapsible, then the UA is not required to
adjust its text for the purpose of justification and may instead treat the
text as having no expansion
opportunities. If the UA chooses to adjust the text, then it must
ensure that tab stops continue to line up as required by the white space processing rules.
If (after justification, if any) the inline contents of a line box are too long to fit within it, then the contents are start-aligned: any content that doesn't fit overflows the line box's end edge.
See Bidirectionality and line boxes for details on how to determine the start and end edges of a line box.
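The less familiar values can be sketched as follows. The selectors are hypothetical, and the comments describe the computed values this section defines rather than observed UA behavior.

```css
/* Hypothetical selectors, for illustration only. */

/* A right-to-left container using the initial alignment. */
.rtl-container {
  direction: rtl;
  text-align: start;
}

/* A left-to-right child that should keep its parent's visual alignment:
   the inherited 'start' is interpreted against the parent's 'direction'
   (rtl), so this would compute to 'right' rather than 'start'. */
.rtl-container > .ltr-child {
  direction: ltr;
  text-align: match-parent;
}

/* First line (and lines after forced breaks) start-aligned, remaining
   lines end-aligned. This value is listed as at risk in this draft. */
.pull-quote {
  text-align: start end;
}
```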
‘text-align-last’ property

| Name: | text-align-last |
|---|---|
| Value: | auto | start | end | left | right | center | justify |
| Initial: | auto |
| Applies to: | block containers |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value |
This property describes how the last line of a block or a line right
before a forced line break is
aligned. If a line is also the first line of the block or the first line
after a forced line break, then,
unless ‘text-align’ assigns an explicit first line
alignment (via ‘start end’), ‘text-align-last’ takes precedence over ‘text-align’.
If ‘auto’ is specified,
content on the affected line is aligned per ‘text-align’ unless
‘text-align’ is
set to ‘justify’. In this case, content is
justified if ‘text-justify’ is ‘distribute’ and start-aligned otherwise. All other
values have the same meanings as in ‘text-align’.
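A minimal sketch of the interaction with ‘text-align’, using hypothetical selectors:

```css
/* Hypothetical selectors, for illustration only. */

/* Justified paragraphs whose last line (and any line right before a
   forced line break) is centered instead of start-aligned. */
p.justified {
  text-align: justify;
  text-align-last: center;
}

/* With 'auto', the last line of a justified paragraph would be
   start-aligned, unless 'text-justify' is 'distribute', in which case
   it would be justified as well. */
p.default-last-line {
  text-align: justify;
  text-align-last: auto;
}
```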
‘text-justify’ property

| Name: | text-justify |
|---|---|
| Value: | auto | none | inter-word | distribute |
| Initial: | auto |
| Applies to: | block containers and, optionally, inline elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | specified value |
This property selects the justification method used when a line's
alignment is set to ‘justify’ (see ‘text-align’). The
property applies to block containers, but the UA may (but is not required
to) also support it on inline elements. It takes the following values:
‘auto’
One possible algorithm is to choose the appropriate
justification behavior based on the language of the paragraph e.g.
following [JLREQ]
for Japanese, using cursive elongation for Arabic, using ‘inter-word’ for English, etc. Another possibility is
to use a justification method that is a simple universal compromise for
all writing systems, such as primarily expanding word separators along with secondarily
expanding between CJK and Southeast Asian letters.
An example of cursively-justified Arabic text rendered by Tasmeem.
Mixed-script text with ‘text-justify:
auto’: this interpretation uses a universal-compromise
justification method, expanding at spaces as well as between CJK and
Southeast Asian letters.
‘none’
This value is intended for use in user stylesheets to improve readability or for accessibility purposes.
‘inter-word’
‘word-spacing’ on the line, i.e. the primary
expansion opportunities are
at word separators. This behavior is
typical for languages that separate words using spaces, like English or
Korean.
Mixed-script text with ‘text-justify:
inter-word’
‘distribute’
‘letter-spacing’ on the line (except between
letters in cursive scripts such as Arabic), i.e.
the primary expansion
opportunities are between adjacent characters where both characters in the pair are
non-cursive. This value is sometimes used in e.g. Japanese.
Mixed-script text with ‘text-justify:
distribute’
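The justification methods might be chosen per language, as in this hedged sketch. The selectors are hypothetical, the value names are those of this draft, and the property itself is listed as at risk.

```css
/* Hypothetical selectors, for illustration only. */

/* English: expand primarily at word separators. */
p:lang(en) {
  text-align: justify;
  text-justify: inter-word;
}

/* Japanese: expand primarily between adjacent non-cursive characters. */
p:lang(ja) {
  text-align: justify;
  text-justify: distribute;
}

/* A user style sheet could turn justification off for readability,
   even where the author requested 'text-align: justify'. */
p {
  text-justify: none;
}
```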
When justifying text, the user agent takes the remaining space between
the ends of a line's contents and the edges of its line box, and
distributes that space throughout its contents so that the contents
exactly fill the line box. If the ‘letter-spacing’ and ‘word-spacing’
property values allow it, the user agent may also distribute negative
space, putting more content on the line than would otherwise fit under
normal spacing conditions. The exact justification algorithm is
UA-dependent; however, CSS provides some general guidelines which should
be followed when a justification method other than ‘auto’ is specified.
The guidelines in this level of CSS do not describe a complete justification algorithm. They are merely a minimum set of requirements that a complete algorithm should meet. Limiting the set of requirements gives UAs some latitude in choosing a justification algorithm that meets their needs and desired balance of quality, speed, and complexity.
For instance, a basic but fast ‘inter-word’
justification algorithm might use a simple greedy method for determining
line breaks, then distribute leftover space. This algorithm could follow
the guidelines by expanding word spaces first, expanding between letters
only if ‘word-spacing’ hit a limit.
A more sophisticated but slower ‘inter-word’
justification algorithm might use a Knuth/Plass method where expansion opportunities and
limits were assigned weights and assessed with other line breaking
considerations. This algorithm could follow the guidelines by giving more
weight to word separators than
letter spacing.
An expansion opportunity is a point where the justification algorithm may alter spacing within the text. The UA divides expansion opportunities into different priority levels: within a level, all expansion opportunities are expanded or compressed at the same priority.
When determining expansion
opportunities, characters from the Unicode Symbols (S*) and
Punctuation (P*) classes are generally treated the same as a letter: in the case of ‘inter-word’, as a Latin letter, in the case of ‘distribute’, as a Han letter, and in the case of ‘auto’, as a letter of the dominant script. However, by
typographic tradition there may be additional rules controlling the
justification of symbols and punctuation. Therefore, the UA may reassign
specific characters or introduce additional levels of prioritization to
handle expansion opportunities
involving symbols and punctuation.
For example, there are traditionally no expansion opportunities between consecutive EM DASH U+2014, HORIZONTAL BAR U+2015, HORIZONTAL ELLIPSIS U+2026, or TWO DOT LEADER U+2025 characters [JLREQ]; thus a UA might assign these characters to a “never” prioritization level. As another example, certain fullwidth punctuation characters are considered to contain an expansion opportunity in Japanese. The UA might therefore assign these characters to a higher prioritization level than the opportunities between ideographic characters.
The ‘word-spacing’ property can specify limits on
expansion opportunities
introduced by word separators. How
any remaining space is distributed once all expansion opportunities reach
their limits is up to the UA.
If the inline contents of a line cannot be stretched to the full width
of the line box, then they must be aligned as specified by the ‘text-align-last’ property. (If ‘text-align-last’ is ‘justify’, then they must be aligned as for ‘center’ if ‘text-justify’ is ‘distribute’, and as ‘start’
otherwise.)
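The fallback described above can be sketched as follows. This is only an illustration under the syntax of this draft: the class names are hypothetical, and support in any particular UA is not implied.

```css
/* Hypothetical class names; property values as defined in this draft. */
p.cjk {
  text-align: justify;
  text-align-last: justify;
  text-justify: distribute;
  /* If a line cannot be stretched to the full width of the line box,
     its contents are aligned as for 'center'. */
}

p.latin {
  text-align: justify;
  text-align-last: justify;
  text-justify: inter-word;
  /* If a line cannot be stretched to the full width of the line box,
     its contents are aligned as for 'start'. */
}
```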
3.8 Line Adjustment in [JLREQ] gives an example of a set of
rules for how a text formatter can justify Japanese text. It describes
rules for cases where the ‘text-justify’ property is ‘auto’.
Note that the rules described in the document specifically target Japanese. Therefore they may produce non-optimal results when used to justify other languages such as English. To make the rules more applicable to other scripts, the UA could, for instance, omit the rule to compress half-width spaces (rule a. of 3.8.3).
The UA may enable or break optional ligatures or use other font features such as alternate glyphs or glyph compression to help justify the text under any method. This behavior is not controlled by this level of CSS.
CSS offers control over text spacing via the ‘word-spacing’ and
‘letter-spacing’ properties, which specify
additional space around word
separators or between characters,
respectively. Level 3 offers the ability to control the justification
behavior of ‘word-spacing’. In addition the ‘word-spacing’
property can now be specified in percentages, making it possible to, for
example, double or eliminate word spacing.
In the following example, word spacing is halved, but may expand up to its full amount if needed for text justification.
p { word-spacing: -50% 0%; }
‘word-spacing’ property

| Name: | word-spacing |
|---|---|
| Value: | [ normal \| <length> \| <percentage>]{1,3} |
| Initial: | normal |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | refers to width of the affected glyph |
| Media: | visual |
| Computed value: | an optimum, minimum, and maximum value, each consisting of either an absolute length, a percentage, or the keyword ‘normal’ |
This property specifies the optimum, maximum, and minimum spacing (in
that order) between “words”. Missing values are assumed to be ‘normal’. Values are interpreted as defined below:
‘normal’
‘normal’ optimum spacing value computes to zero.
‘<length>’
‘<percentage>’
Additional spacing is applied to each word separator character left in the text after the white space processing rules have been applied, and should be applied half on each side of the character unless otherwise dictated by typographic tradition.
The following example will make all the spaces between words in Arabic be rendered as zero-width, and double the width of each space in English:
:lang(ar) { word-spacing: -100%; }
:lang(en) { word-spacing: 100%; }
The following example adds half the width of the “0” glyph to the spacing of each word-separator character [CSS3VAL]:
p { word-spacing: 0.5ch; }
In the absence of justification the optimum spacing must be used. The
text justification process may alter the spacing from its optimum (see the
‘text-justify’
property, above) but must not violate the minimum spacing limit and should
also avoid exceeding the maximum. The UA may also use the difference
between the minimum/maximum limits and the optimum as input into a
weighting algorithm for justification.
The used optimum and maximum spacing are floored at the minimum. Similarly, if the maximum is less than the optimum, then the used optimum is limited to the used maximum.
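As a worked illustration of these limits, the following sketch annotates a few declarations with the used values that would result. It assumes the value order stated above (optimum, maximum, minimum); the class names are hypothetical.

```css
/* Assumed order of values: optimum, maximum, minimum. */

p.a { word-spacing: 0.1em 0.5em 0.2em; }
/* The optimum (0.1em) is below the minimum (0.2em),
   so the used optimum is 0.2em; the used maximum stays 0.5em. */

p.b { word-spacing: 0.4em 0.3em 0.1em; }
/* The maximum (0.3em) is below the optimum (0.4em),
   so the used optimum is limited to 0.3em. */

p.c { word-spacing: 0.1em 0.05em 0.2em; }
/* Both the optimum and the maximum are below the minimum (0.2em),
   so both are floored at 0.2em. */
```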
Normal spacing: Although ‘normal’ minimum and maximum spacing limits are
UA-defined, they must be defined relative to the optimum so that the
limits increase and decrease with changes to the optimum spacing. Normal
limits may also vary according to the value of the ‘text-justify’
property, the element's language, some measure of the amount of text on a
line (e.g. block width divided by font size), and/or other factors.
Word-separator characters include the space (U+0020), the no-break space (U+00A0), the Ethiopic word space (U+1361), the Aegean word separators (U+10100, U+10101), the Ugaritic word divider (U+1039F), and the Phoenician word separator (U+1091F). If there are no word-separator characters, or if a word-separating character has a zero advance width (such as the zero width space U+200B), then the user agent must not create additional spacing between words. General punctuation and fixed-width spaces (such as U+3000 and U+2000 through U+200A) are not considered word-separator characters.
‘letter-spacing’ property

| Name: | letter-spacing |
|---|---|
| Value: | normal \| <length> |
| Initial: | normal |
| Applies to: | all elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | an absolute length or the keyword ‘normal’ |
This property specifies additional spacing (tracking) between adjacent
characters. Letter-spacing is applied
after bidi
reordering and is in addition to any ‘word-spacing’. Values have the following
meanings:
‘normal’
‘<length>’
CSS2.1 defines that ‘letter-spacing’ values other than ‘normal’ forbid the adjustment of letter-spacing during
justification. This means that tracking can't be used in conjunction with,
e.g. CJK justification methods. However, allowing it would mean we need a
control for disabling it; German, for example, avoids letter-spacing for
justification because it's used for emphasis. See this
thread.
Letter-spacing must not be applied at the beginning or at the end of a line. The total letter spacing between two adjacent characters (after bidi reordering) is specified by and rendered within the innermost element that contains the boundary between the two characters.
For example, given the markup
<P>a<LS>b<Z>cd</Z><Y>ef</Y></LS>g</P>
and the style sheet
LS { letter-spacing: 1em; }
Z { letter-spacing: 0.3em; }
Y { letter-spacing: 0.4em; }
the spacing would be as [noted] below:
a[0]b[1em]c[0.3em]d[1em]e[0.4em]f[0]g
Letter-spacing ignores zero-width characters (such as those from the
Unicode Cf category). For example, ‘letter-spacing’ applied to
A&zwsp;B is identical to AB.
For the purpose of ‘letter-spacing’, each consecutive run of
atomic inlines (such as images and inline blocks) is treated as a single
character.
When the effective letter-spacing between two characters is not zero
(due to either justification or non-zero
computed ‘letter-spacing’), user agents should not apply
optional ligatures.
UAs may apply letter-spacing to cursive scripts. In this case, UAs should extend the space between disjoint characters as specified above and extend the visible connection between cursively connected characters by the same amount (rather than leaving a gap). The UA may use glyph substitution or other font capabilities to spread out the letters. If the UA cannot expand a cursive script without breaking the cursive connections, it should not apply letter-spacing between characters of that script at all.
Current UAs just put gaps between joined letters in cursive scripts.
Edge effects control the indentation of lines with respect to other
lines in the block (‘text-indent’) and how content is measured at
the start and end edges of a line (‘hanging-punctuation’).
‘text-indent’ property

| Name: | text-indent |
|---|---|
| Value: | [ <length> \| <percentage> ] && hanging? && each-line? |
| Initial: | 0 |
| Applies to: | block containers |
| Inherited: | yes |
| Percentages: | refers to width of containing block |
| Media: | visual |
| Computed value: | the percentage as specified or the absolute length, plus any keywords as specified |
This property specifies the indentation applied to lines of inline
content in a block. The indent is treated as a margin applied to the start
edge of the line box. Unless otherwise specified via the ‘each-line’ and/or ‘hanging’ keywords, only lines
that are the first
formatted line of an element are affected. For example, the first line
of an anonymous block box is only affected if it is the first child of its
parent element.
Values have the following meanings:
‘<length>’
‘<percentage>’
‘each-line’
‘hanging’
If ‘text-align’ is ‘start’ and ‘text-indent’ is ‘5em’ in left-to-right text with no floats present,
then the first line of text will start 5em into the block:
Since CSS1 it has been possible to indent the first line of a block element using the 'text-indent' property.
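The optional keywords in the value grammar can be combined with the length or percentage. The following is a brief sketch, assuming the keyword semantics of this draft (‘hanging’ inverts which lines are indented; ‘each-line’ also indents the first line after each forced line break). The class names are hypothetical.

```css
/* Hypothetical class names; syntax follows the Value grammar above. */

p.body {
  text-indent: 2em;            /* indent only the first formatted line */
}

dd.entry {
  text-indent: 2em hanging;    /* indent every line except the first */
}

p.verse {
  text-indent: 1em each-line;  /* indent the first line and each line
                                  that follows a forced line break */
}
```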
Note that since the ‘text-indent’ property inherits, when specified
on a block element, it will affect descendant inline-block elements. For
this reason, it is often wise to specify ‘text-indent:
0’ on elements that are specified ‘display:
inline-block’.
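A minimal sketch of that advice follows; the ‘.badge’ class name is hypothetical.

```css
p {
  text-indent: 2em;
}

/* Without this reset, the inline-block would inherit the 2em indent
   and apply it to its own first line. */
p .badge {
  display: inline-block;
  text-indent: 0;
}
```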
‘hanging-punctuation’ property

| Name: | hanging-punctuation |
|---|---|
| Value: | none \| [ first \|\| [ force-end \| allow-end ] \|\| last ] |
| Initial: | none |
| Applies to: | inline elements |
| Inherited: | yes |
| Percentages: | N/A |
| Media: | visual |
| Computed value: | as specified |
This property determines whether a punctuation mark, if one is present, hangs and may be placed outside the line box (or in the indent) at the start or at the end of a line of text.
Note that if there is not sufficient padding on the block
container, ‘hanging-punctuation’ can trigger overflow.
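One way an author might leave room for hanging marks is sketched below. The padding amount is an arbitrary assumption, not a recommended value, and the amount actually needed depends on the font and the marks that hang.

```css
blockquote {
  /* Inherited by the inline content of the block. */
  hanging-punctuation: first allow-end;

  /* Room on both sides so that a hanging mark does not overflow
     the block container; 1em is an illustrative guess. */
  padding-left: 1em;
  padding-right: 1em;
}
```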
When a punctuation mark hangs, it is not considered when measuring the line's contents for fit, alignment, or justification. Depending on the line's alignment, this may (or may not) result in the mark being placed outside the line box.
Values have the following meanings:
‘none’
‘first’
‘last’
‘force-end’
‘allow-end’
Non-zero start and end borders/padding between a hangable mark and the edge of the line prevent the mark from hanging. For example, a period at the end of an inline box with end padding does not hang at the end edge of a line. At most one punctuation character may hang at each edge of the line.
A hanging punctuation mark is still enclosed inside its inline box and participates in text justification: its character advance width is just not measured when determining how much content fits on the line, how much the line's contents need to be expanded or compressed for justification, or how to position the content within the line box for text alignment.
Stops and commas allowed to hang include:
| U+002C | , | COMMA |
| U+002E | . | FULL STOP |
| U+060C | ، | ARABIC COMMA |
| U+06D4 | ۔ | ARABIC FULL STOP |
| U+3001 | 、 | IDEOGRAPHIC COMMA |
| U+3002 | 。 | IDEOGRAPHIC FULL STOP |
| U+FF0C | , | FULLWIDTH COMMA |
| U+FF0E | . | FULLWIDTH FULL STOP |
| U+FE50 | ﹐ | SMALL COMMA |
| U+FE51 | ﹑ | SMALL IDEOGRAPHIC COMMA |
| U+FE52 | ﹒ | SMALL FULL STOP |
| U+FF61 | 。 | HALFWIDTH IDEOGRAPHIC FULL STOP |
| U+FF64 | 、 | HALFWIDTH IDEOGRAPHIC COMMA |
The UA may include other characters as appropriate.
The CSS Working Group would appreciate it if UAs that include other characters would inform the working group of such additions.
The ‘allow-end’ and ‘force-end’ values are two variations of hanging punctuation
used in East Asia.

p {
text-align: justify;
hanging-punctuation: allow-end;
}

p {
text-align: justify;
hanging-punctuation: force-end;
}
The punctuation at the end of the first line for ‘allow-end’ does not hang, because it fits without
hanging. However, if ‘force-end’ is used, it
is forced to hang. The justification measures the line without the
hanging punctuation. Therefore when the line is expanded, the punctuation
is pushed outside the line.
The start and
end
edges of a line box are determined by the inline base direction of
the line box. In most cases, this is given by its containing block's
computed ‘direction’.
However if its containing block has ‘unicode-bidi:
plaintext’ [CSS3-WRITING-MODES],
the line box's inline base direction must be determined by the
base direction of the bidi paragraph to which it belongs:
that is, the bidi paragraph for which the line box holds content.
An empty line box (i.e. one that contains no atomic inlines or characters
other than the line-breaking character, if any), takes its inline base
direction from the preceding line box (if any), or, if this is the
first line box in the containing block, then from the ‘direction’ property of the containing block.
In the following example, assuming the <block> is a
preformatted block (‘display: block; white-space:
pre’) inheriting ‘text-align: start’,
every other line is right-aligned:
<block style="unicode-bidi: plaintext">
Latin
و·کمی
Latin
و·کمی
Latin
و·کمی
</block>
Note that the inline base direction determined here applies
to the line box itself, and not to its contents. It affects ‘text-align’, ‘text-align-last’, ‘text-indent’, and
‘hanging-punctuation’, i.e. the position and
alignment of its contents with respect to its edges. It does not affect
the formatting or ordering of its content.
In the following example:
<para style="display: block; direction: rtl; unicode-bidi:plaintext"> "<quote style="unicode-bidi:plaintext">שלום!</quote>", he said. </para>
The result should be a left-aligned line looking like this:
"!שלום", he said.
The line is left-aligned (despite the containing block having ‘direction: rtl’) because the containing block (the
<para>) has ‘unicode-bidi:plaintext’, and the line box belongs to a
bidi paragraph that is LTR. This is because that paragraph's first
character with a strong direction is the LTR "h" from "he". The RTL
"שלום!" does precede the "he", but it sits in its own bidi-isolated
paragraph that is not immediately contained by the
<para>, and is thus irrelevant to the line box's
alignment. From the standpoint of the bidi paragraph immediately
contained by the <para> containing block, the
<quote>’s bidi-isolated paragraph inside it is, by
definition, just a neutral U+FFFC character, so the immediately-contained
paragraph becomes LTR by virtue of the "he" following it.
<fieldset style="direction: rtl">
<textarea style="unicode-bidi:plaintext">

Hello!

</textarea>
</fieldset>
As expected, the "Hello!" should be displayed LTR (i.e. with the
exclamation mark on the right end, despite the
<textarea>’s ‘direction:rtl’) and left-aligned.
This makes the empty line following it left-aligned as well, which means
that the caret on that line should appear at its left edge. The first
empty line, on the other hand, should be right-aligned, due to the RTL
direction of its containing paragraph, the
<textarea>.
Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.
All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]
Examples in this specification are introduced with the words “for
example” or are set apart from the normative text with
class="example", like this:
This is an example of an informative example.
Informative notes begin with the word “Note” and are set apart from
the normative text with class="note", like this:
Note, this is an informative note.
Conformance to CSS Text Level 3 is defined for three conformance classes:
A style sheet is conformant to CSS Text Level 3 if all of its declarations that use properties defined in this module have values that are valid according to the generic CSS grammar and the individual grammars of each property as given in this module.
A renderer is conformant to CSS Text Level 3 if, in addition to interpreting the style sheet as defined by the appropriate specifications, it supports all the features defined by CSS Text Level 3 by parsing them correctly and rendering the document accordingly. However, the inability of a UA to correctly render a document due to limitations of the device does not make the UA non-conformant. (For example, a UA is not required to render color on a monochrome monitor.)
An authoring tool is conformant to CSS Text Level 3 if it writes style sheets that are syntactically correct according to the generic CSS grammar and the individual grammars of each feature in this module, and meet all other conformance requirements of style sheets as described in this module.
So that authors can exploit the forward-compatible parsing rules to assign fallback values, CSS renderers must treat as invalid (and ignore as appropriate) any at-rules, properties, property values, keywords, and other syntactic constructs for which they have no usable level of support. In particular, user agents must not selectively ignore unsupported component values and honor supported values in a single multi-value property declaration: if any value is considered invalid (as unsupported values must be), CSS requires that the entire declaration be ignored.
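The fallback pattern this rule enables can be sketched as follows. Which values a given UA supports is an assumption made only for illustration.

```css
p {
  text-align: left;    /* fallback for UAs without support for 'start' */
  text-align: start;   /* wins in UAs that support it; otherwise the
                          whole declaration is ignored */
}

blockquote {
  text-indent: 1em;
  /* A UA that does not support 'each-line' must ignore this entire
     declaration and keep 1em; it must not treat it as just '2em'. */
  text-indent: 2em each-line;
}
```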
To avoid clashes with future CSS features, the CSS2.1 specification reserves a prefixed syntax for proprietary and experimental extensions to CSS.
Prior to a specification reaching the Candidate Recommendation stage in the W3C process, all implementations of a CSS feature are considered experimental. The CSS Working Group recommends that implementations use a vendor-prefixed syntax for such features, including those in W3C Working Drafts. This avoids incompatibilities with future changes in the draft.
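For illustration, an author targeting an experimental implementation might pair a vendor-prefixed declaration with the unprefixed form defined in this module. The ‘-moz-’ prefix is shown only as an example of the prefixed syntax.

```css
pre {
  -moz-tab-size: 4;  /* vendor-prefixed, experimental form */
  tab-size: 4;       /* unprefixed form, as defined in this module */
}
```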
Once a specification reaches the Candidate Recommendation stage, non-experimental implementations are possible, and implementors should release an unprefixed implementation of any CR-level feature they can demonstrate to be correctly implemented according to spec.
To establish and maintain the interoperability of CSS across implementations, the CSS Working Group requests that non-experimental CSS renderers submit an implementation report (and, if necessary, the testcases used for that implementation report) to the W3C before releasing an unprefixed implementation of any CSS features. Testcases submitted to W3C are subject to review and correction by the CSS Working Group.
Further information on submitting testcases and implementation reports can be found on the CSS Working Group’s website at http://www.w3.org/Style/CSS/Test/. Questions should be directed to the public-css-testsuite@w3.org mailing list.
For this specification to be advanced to Proposed Recommendation, there must be at least two independent, interoperable implementations of each feature. Each feature may be implemented by a different set of products; there is no requirement that all features be implemented by a single product. For the purposes of this criterion, we define the following terms:
The specification will remain Candidate Recommendation for at least six months.
This specification would not have been possible without the help from: Ayman Aldahleh, Bert Bos, Tantek Çelik, Stephen Deach, John Daggett, Martin Dürst, Laurie Anna Edlund, Ben Errez, Yaniv Feinberg, Arye Gittelman, Ian Hickson, Martin Heijdra, Richard Ishida, Masayasu Ishikawa, Michael Jochimsen, Eric LeVine, Ambrose Li, Håkon Wium Lie, Chris Lilley, Ken Lunde, Nat McCully, Shinyu Murakami, Paul Nelson, Chris Pratley, Marcin Sawicki, Arnold Schrijver, Rahul Sonnad, Michel Suignard, Takao Suzuki, Frank Tang, Chris Thrasher, Etan Wexler, Chris Wilson, Masafumi Yabe and Steve Zilles.
Major changes include:
inter-ideograph’, ‘inter-cluster’, and ‘kashida’ values of ‘text-justify’.
letter-spacing’.
word-spacing’ to
default missing values to ‘normal’.
text-align’.
Significant details updated:
This appendix is informative. It is intended to help UA developers implement a default style sheet, but UA developers are free to ignore or change it.
/* make list items and option elements align together */
li, option { text-align: match-parent; }
If you find any issues, recommendations to add, or corrections, please send the information to www-style@w3.org with [css3-text] in the subject line.
This appendix is normative.
The following scripts in Unicode 6 are considered to be cursive scripts, and do not admit expansion opportunities between their letters: Arabic, Mandaic, Mongolian, N'Ko, Phags Pa, Syriac.
The following list defines the order of text operations. (Implementations are not bound to this order as long as the resulting layout is the same.)
| Property | Values | Initial | Applies to | Inh. | Percentages | Media |
|---|---|---|---|---|---|---|
| hanging-punctuation | none \| [ first \|\| [ force-end \| allow-end ] \|\| last ] | none | inline elements | yes | N/A | visual |
| hyphens | none \| manual \| auto | manual | all elements | yes | N/A | visual |
| letter-spacing | normal \| <length> | normal | all elements | yes | N/A | visual |
| line-break | auto \| loose \| normal \| strict | auto | all elements | yes | N/A | visual |
| overflow-wrap | normal \| break-word | normal | all elements | yes | N/A | visual |
| tab-size | <integer> \| <length> | 8 | block containers | yes | N/A | visual |
| text-align | start \| end \| left \| right \| center \| justify \| match-parent \| start end | start | block containers | yes | N/A | visual |
| text-align-last | auto \| start \| end \| left \| right \| center \| justify | auto | block containers | yes | N/A | visual |
| text-indent | [ <length> \| <percentage> ] && hanging? && each-line? | 0 | block containers | yes | refers to width of containing block | visual |
| text-justify | auto \| none \| inter-word \| distribute | auto | block containers and, optionally, inline elements | yes | N/A | visual |
| text-transform | none \| capitalize \| uppercase \| lowercase \| full-width | none | all elements | yes | N/A | visual |
| white-space | normal \| pre \| nowrap \| pre-wrap \| pre-line | not defined for shorthand properties | all elements | yes | N/A | visual |
| word-break | normal \| keep-all \| break-all | normal | all elements | yes | N/A | visual |
| word-spacing | [ normal \| <length> \| <percentage>]{1,3} | normal | all elements | yes | refers to width of the affected glyph | visual |
| word-wrap | normal \| break-word | normal | all elements | yes | N/A | visual |