Line Break Behavior Details
The following table enumerates the line breaking semantics of all characters in a manner consistent with CSS3 Text.
The columns of the table are defined as follows:
- Code - Unicode code point (in hexadecimal), except - which means all other characters not explicitly listed
- UAX14 - Line breaking class assigned by UAX14
- ICU() - Behavior implemented by ICU when primary language subtag of locale is not specified, not 'ja', or otherwise not explicitly supported
- ICU(ja) - Behavior implemented by ICU when primary language subtag of locale is 'ja' (or equivalent)
- Loose() - Behavior prescribed by CSS3 when line-break is loose and content language is not Chinese, Japanese, or Korean
- Loose(cjk) - Behavior prescribed by CSS3 when line-break is loose and content language is Chinese, Japanese, or Korean
- Normal() - Behavior prescribed by CSS3 when line-break is normal and content language is not Chinese, Japanese, or Korean
- Normal(cjk) - Behavior prescribed by CSS3 when line-break is normal and content language is Chinese, Japanese, or Korean
- Strict() - Behavior prescribed by CSS3 when line-break is strict and content language is not Chinese, Japanese, or Korean
- Strict(cjk) - Behavior prescribed by CSS3 when line-break is strict and content language is Chinese, Japanese, or Korean
- Character Name - Unicode character name
The values of the UAX14 column designate (a subset of the) line breaking classes defined by UAX14 as follows:
- BA - Break After
- CJ - Conditional Japanese Starter
- EX - Exclamation/Interrogation
- IN - Inseparable
- IS - Infix Numeric Separator
- NS - Nonstarters
- PO - Postfix Numeric
- PR - Prefix Numeric
- - - as otherwise defined by UAX14
The values of the ICU() through Strict(cjk) columns designate the following breaking behavior:
- A - break permitted after
- B - break permitted before
- B/A - break permitted before or after
- XA - break excluded after
- XB - break excluded before
- XP - break excluded between any pair in class
- - - as otherwise defined by default line breaking behavior
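One way to read the behavior codes above is as a pair of independent before/after constraints. The following is a minimal sketch under that assumption; the `Behavior` class, its field names, and the `decode` helper are hypothetical and are not part of ICU, CSS3 Text, or WebKit. A side that a code does not mention is modeled as `None` (not stated by the table).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Behavior:
    # True = break permitted, False = break excluded,
    # None = the code says nothing about this side (assumption of this sketch).
    before: Optional[bool] = None
    after: Optional[bool] = None
    # True only for XP: no break between any pair of characters in the class.
    no_break_within_class: bool = False


# Hypothetical decoding of the codes used in the ICU() through Strict(cjk) columns.
BEHAVIORS = {
    "A": Behavior(after=True),
    "B": Behavior(before=True),
    "B/A": Behavior(before=True, after=True),
    "XA": Behavior(after=False),
    "XB": Behavior(before=False),
    "XP": Behavior(no_break_within_class=True),
    "-": None,  # defer to default line breaking behavior
}


def decode(code: str) -> Optional[Behavior]:
    """Return the constraints for a table code, or None for the default."""
    return BEHAVIORS[code]
```

Returning `None` for `-` keeps "use the default behavior" distinct from "no constraint stated", which a caller would otherwise be unable to tell apart.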
ISSUE: Need to verify behavior for U+2010 (hyphen) and U+2013 (en dash) below. UAX14 characterizes both as BA (break permitted after) but doesn't address break-before behavior, while CSS3 Text prescribes certain break-before behavior but not break-after behavior.
Code | UAX14 | ICU() | ICU(ja) | Loose() | Loose(cjk) | Normal() | Normal(cjk) | Strict() | Strict(cjk) | Character Name |
0021 | EX | XB | XB | - | B/A | - | XB | - | XB | exclamation mark |
0024 | PR | XA | XA | - | B/A | - | XA | - | XA | dollar sign |
0025 | PO | XB | XB | - | B/A | - | XB | - | XB | percent sign |
003A | IS | XB | XB | - | B/A | - | XB | - | XB | colon |
003B | IS | XB | XB | - | B/A | - | XB | - | XB | semicolon |
003F | EX | XB | XB | - | B/A | - | XB | - | XB | question mark |
00A2 | PO | XB | XB | - | B/A | - | XB | - | XB | cent sign |
00A3 | PR | XA | XA | - | B/A | - | XA | - | XA | pound sign |
00A5 | PR | XA | XA | - | B/A | - | XA | - | XA | yen sign |
00B0 | PO | XB | XB | - | B/A | - | XB | - | XB | degree sign |
2010 | BA | A | A | - | B/A | - | B/A | - | XB | hyphen |
2013 | BA | A | A | - | B/A | - | B/A | - | XB | en dash |
2025 | IN | XP | XP | B/A | B/A | XP | XP | XP | XP | two dot leader |
2026 | IN | XP | XP | B/A | B/A | XP | XP | XP | XP | ellipsis |
2030 | PO | XB | XB | - | B/A | - | XB | - | XB | per mille sign |
2032 | PO | XB | XB | - | B/A | - | XB | - | XB | prime |
2033 | PO | XB | XB | - | B/A | - | XB | - | XB | double prime |
203C | NS | XB | XB | - | B/A | - | XB | - | XB | double exclamation mark |
2047 | NS | XB | XB | - | B/A | - | XB | - | XB | double question mark |
2048 | NS | XB | XB | - | B/A | - | XB | - | XB | question exclamation mark |
2049 | NS | XB | XB | - | B/A | - | XB | - | XB | exclamation question mark |
20AC | PR | XA | XA | - | B/A | - | XA | - | XA | euro sign |
2103 | PO | XB | XB | - | B/A | - | XB | - | XB | degree celsius |
2116 | PR | XA | XA | - | B/A | - | XA | - | XA | numero sign |
3005 | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | ideographic iteration mark |
301C | NS | XB | XB | - | B/A | - | B/A | - | XB | wave dash |
303B | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | vertical ideographic iteration mark |
3041 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small a |
3043 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small i |
3045 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small u |
3047 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small e |
3049 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small o |
3063 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small tu |
3083 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small ya |
3085 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small yu |
3087 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small yo |
308E | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small wa |
3095 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small ka |
3096 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small ke |
309D | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | hiragana iteration mark |
309E | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | hiragana voiced iteration mark |
30A0 | NS | XB | XB | - | B/A | - | B/A | - | XB | katakana-hiragana double hyphen |
30A1 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small a |
30A3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small i |
30A5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small u |
30A7 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small e |
30A9 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small o |
30C3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small tu |
30E3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ya |
30E5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small yu |
30E7 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small yo |
30EE | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small wa |
30F5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ka |
30F6 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ke |
30FB | NS | XB | XB | - | B/A | - | XB | - | XB | katakana middle dot |
30FC | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana-hiragana prolonged sound mark |
30FD | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | katakana iteration mark |
30FE | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | katakana voiced iteration mark |
31F0 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ku |
31F1 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small si |
31F2 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small su |
31F3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small to |
31F4 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small nu |
31F5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ha |
31F6 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small hi |
31F7 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small hu |
31F8 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small he |
31F9 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ho |
31FA | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small mu |
31FB | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ra |
31FC | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ri |
31FD | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ru |
31FE | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small re |
31FF | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ro |
FF01 | EX | XB | XB | - | B/A | - | XB | - | XB | fullwidth exclamation mark |
FF04 | PR | XA | XA | - | B/A | - | XA | - | XA | fullwidth dollar sign |
FF05 | PO | XB | XB | - | B/A | - | XB | - | XB | fullwidth percent sign |
FF1A | NS | XB | XB | - | B/A | - | XB | - | XB | fullwidth colon |
FF1B | NS | XB | XB | - | B/A | - | XB | - | XB | fullwidth semicolon |
FF1F | EX | XB | XB | - | B/A | - | XB | - | XB | fullwidth question mark |
FF65 | NS | XB | XB | - | B/A | - | XB | - | XB | halfwidth katakana middle dot |
FF67 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small a |
FF68 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small i |
FF69 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small u |
FF6A | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small e |
FF6B | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small o |
FF6C | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small ya |
FF6D | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small yu |
FF6E | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small yo |
FF6F | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small tu |
FF70 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana-hiragana prolonged sound mark |
FFE0 | PO | XB | XB | - | B/A | - | XB | - | XB | fullwidth cent sign |
FFE1 | PR | XA | XA | - | B/A | - | XA | - | XA | fullwidth pound sign |
FFE5 | PR | XA | XA | - | B/A | - | XA | - | XA | fullwidth yen sign |
- | - | - | - | - | - | - | - | - | - | all other characters |
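The table above can be consumed mechanically. The sketch below parses a few rows copied from it and looks up the behavior code for a character in a given column, falling back to the "all other characters" row. The helper names (`split_row`, `parse_table`, `lookup`) are hypothetical, and only a small excerpt of the rows is included for illustration.

```python
HEADER = ("Code | UAX14 | ICU() | ICU(ja) | Loose() | Loose(cjk) | Normal() | "
          "Normal(cjk) | Strict() | Strict(cjk) | Character Name |")

# A small excerpt of the table above; the full table would be parsed the same way.
ROWS = """\
0021 | EX | XB | XB | - | B/A | - | XB | - | XB | exclamation mark |
2010 | BA | A | A | - | B/A | - | B/A | - | XB | hyphen |
3041 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small a |
- | - | - | - | - | - | - | - | - | - | all other characters |
"""


def split_row(line):
    """Split one pipe-delimited row into stripped cells, ignoring the trailing pipe."""
    return [cell.strip() for cell in line.strip().rstrip("|").split("|")]


COLUMNS = split_row(HEADER)


def parse_table(text):
    """Map code point (int) to its row; the fallback row is stored under None."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        row = dict(zip(COLUMNS, split_row(line)))
        key = None if row["Code"] == "-" else int(row["Code"], 16)
        table[key] = row
    return table


def lookup(table, ch, column):
    """Return the behavior code for character ch in the named column."""
    row = table.get(ord(ch), table[None])
    return row[column]


TABLE = parse_table(ROWS)
```

For example, `lookup(TABLE, "\u2010", "Normal(cjk)")` reads the hyphen row, while any character not listed in the excerpt falls through to the `-` row.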
Implementation Details
- If -webkit-line-break is auto, content language is not Chinese, Japanese, or Korean, and ICU does not provide a tailored set of rules that applies to the content language, then use the ICU() column's behavior;
- If -webkit-line-break is auto, content language is not Chinese, Japanese, or Korean, and ICU does provide a tailored set of rules that applies to the content language, then use the ICU() column's behavior modulo application of tailored rules;
- If -webkit-line-break is auto and content language is Chinese, Japanese, or Korean, then use the Normal(cjk) column's behavior;
- If -webkit-line-break is loose and content language is not Chinese, Japanese, or Korean, then use the Loose() column's behavior;
- If -webkit-line-break is loose and content language is Chinese, Japanese, or Korean, then use the Loose(cjk) column's behavior;
- If -webkit-line-break is normal and content language is not Chinese, Japanese, or Korean, then use the Normal() column's behavior;
- If -webkit-line-break is normal and content language is Chinese, Japanese, or Korean, then use the Normal(cjk) column's behavior;
- If -webkit-line-break is strict and content language is not Chinese, Japanese, or Korean, then use the Strict() column's behavior;
- If -webkit-line-break is strict and content language is Chinese, Japanese, or Korean, then use the Strict(cjk) column's behavior;
- If -webkit-line-break is after-white-space, then use the procedure defined in Handling of after-white-space.
For implementation purposes, content language is determined using the procedure for determining the language of a node.
For implementation purposes, default line breaking behavior is interpreted as the behavior implemented by ICU when applying either ICU's default rules or locale specific rules tailored to an identified content language.
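The column selection rules in this section can be sketched as a small function. This is an illustration only, not WebKit's implementation: the function names are hypothetical, and it assumes that "Chinese, Japanese, or Korean" is detected by the primary language subtags `zh`, `ja`, and `ko`. For `auto` with a non-CJK language it returns `ICU()` in both cases; any locale-specific tailoring would be applied on top of that column by ICU itself.

```python
from typing import Optional

# Assumption of this sketch: CJK content is identified by primary language subtag.
CJK_SUBTAGS = {"zh", "ja", "ko"}


def primary_subtag(lang: str) -> str:
    """Return the lowercased primary subtag of a language tag such as 'ja-JP'."""
    return lang.replace("_", "-").split("-", 1)[0].lower()


def select_column(line_break: str, content_language: str) -> Optional[str]:
    """Return the table column to use, or None for after-white-space,
    which is handled by a separate procedure."""
    if line_break == "after-white-space":
        return None
    is_cjk = primary_subtag(content_language) in CJK_SUBTAGS
    if line_break == "auto":
        # CJK content uses the Normal(cjk) column; everything else uses ICU().
        return "Normal(cjk)" if is_cjk else "ICU()"
    if line_break in ("loose", "normal", "strict"):
        name = line_break.capitalize()
        return f"{name}(cjk)" if is_cjk else f"{name}()"
    raise ValueError(f"unsupported -webkit-line-break value: {line_break!r}")
```

For instance, `select_column("strict", "ja-JP")` selects the Strict(cjk) column, and `select_column("auto", "en")` selects ICU().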
Handling of after-white-space
ISSUE: To be supplied.