Per-Character CSS3 Line Break Semantics
The following table lists the line breaking semantics of the characters explicitly enumerated by CSS3 Text.
The columns of the table are defined as follows:
- Code - Unicode code point (in hexadecimal), except that - stands for all other characters not explicitly listed
- UAX14 - Line breaking class assigned by UAX14
- ICU() - Behavior implemented by ICU when primary language subtag of locale is not specified or not 'ja' or 'zh' (or equivalent)
- ICU(ja) - Behavior implemented by ICU when primary language subtag of locale is 'ja' or 'zh' (or equivalent)
- Loose() - Behavior prescribed by CSS3 when line-break is loose and content language is not Japanese or Chinese
- Loose(ja) - Behavior prescribed by CSS3 when line-break is loose and content language is Japanese or Chinese
- Normal() - Behavior prescribed by CSS3 when line-break is normal and content language is not Japanese or Chinese
- Normal(ja) - Behavior prescribed by CSS3 when line-break is normal and content language is Japanese or Chinese
- Strict() - Behavior prescribed by CSS3 when line-break is strict and content language is not Japanese or Chinese
- Strict(ja) - Behavior prescribed by CSS3 when line-break is strict and content language is Japanese or Chinese
- Character Name - Unicode character name
The values of the UAX14 column designate (a subset of the) line breaking classes defined by UAX14 as follows:
- BA - Break After
- CJ - Conditional Japanese Starter
- EX - Exclamation/Interrogation
- IN - Inseparable
- IS - Infix Numeric Separator
- NS - Nonstarters
- PO - Postfix Numeric
- PR - Prefix Numeric
The values of the ICU() through Strict(ja) columns designate the following breaking behavior:
- B/A - break permitted before or after
- XA - break excluded after
- XB - break excluded before
- XP - break excluded between any pair in class
- - - no specific behavior; the default line breaking rules apply
Code | UAX14 | ICU() | ICU(ja) | Loose() | Loose(ja) | Normal() | Normal(ja) | Strict() | Strict(ja) | Character Name |
0021 | EX | XB | XB | - | B/A | - | XB | - | XB | exclamation mark |
0024 | PR | XA | XA | - | B/A | - | XA | - | XA | dollar sign |
0025 | PO | XB | XB | - | B/A | - | XB | - | XB | percent sign |
003A | IS | XB | XB | - | B/A | - | XB | - | XB | colon |
003B | IS | XB | XB | - | B/A | - | XB | - | XB | semicolon |
003F | EX | XB | XB | - | B/A | - | XB | - | XB | question mark |
00A2 | PO | XB | XB | - | B/A | - | XB | - | XB | cent sign |
00A3 | PR | XA | XA | - | B/A | - | XA | - | XA | pound sign |
00A5 | PR | XA | XA | - | B/A | - | XA | - | XA | yen sign |
00B0 | PO | XB | XB | - | B/A | - | XB | - | XB | degree sign |
2010 | BA | B/A | B/A | - | B/A | - | B/A | - | XB | hyphen |
2013 | BA | B/A | B/A | - | B/A | - | B/A | - | XB | en dash |
2025 | IN | XP | XP | B/A | B/A | XP | XP | XP | XP | two dot leader |
2026 | IN | XP | XP | B/A | B/A | XP | XP | XP | XP | ellipsis |
2030 | PO | XB | XB | - | B/A | - | XB | - | XB | per mille sign |
2032 | PO | XB | XB | - | B/A | - | XB | - | XB | prime |
2033 | PO | XB | XB | - | B/A | - | XB | - | XB | double prime |
203C | NS | XB | XB | - | B/A | - | XB | - | XB | double exclamation mark |
2047 | NS | XB | XB | - | B/A | - | XB | - | XB | double question mark |
2048 | NS | XB | XB | - | B/A | - | XB | - | XB | question exclamation mark |
2049 | NS | XB | XB | - | B/A | - | XB | - | XB | exclamation question mark |
20AC | PR | XA | XA | - | B/A | - | XA | - | XA | euro sign |
2103 | PO | XB | XB | - | B/A | - | XB | - | XB | degree celsius |
2116 | PR | XA | XA | - | B/A | - | XA | - | XA | numero sign |
3005 | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | ideographic iteration mark |
301C | NS | XB | XB | - | B/A | - | B/A | - | XB | wave dash |
303B | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | vertical ideographic iteration mark |
30A0 | NS | XB | XB | - | B/A | - | B/A | - | XB | katakana-hiragana double hyphen |
3041 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small a |
3043 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small i |
3045 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small u |
3047 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small e |
3049 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small o |
3063 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small tu |
3083 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small ya |
3085 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small yu |
3087 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small yo |
308E | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small wa |
3095 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small ka |
3096 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | hiragana letter small ke |
309D | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | hiragana iteration mark |
309E | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | hiragana voiced iteration mark |
30A1 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small a |
30A3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small i |
30A5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small u |
30A7 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small e |
30A9 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small o |
30C3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small tu |
30E3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ya |
30E5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small yu |
30E7 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small yo |
30EE | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small wa |
30F5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ka |
30F6 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ke |
30FB | NS | XB | XB | - | B/A | - | XB | - | XB | katakana middle dot |
30FC | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana-hiragana prolonged sound mark |
30FD | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | katakana iteration mark |
30FE | NS | XB | XB | B/A | B/A | XB | XB | XB | XB | katakana voiced iteration mark |
31F0 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ku |
31F1 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small si |
31F2 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small su |
31F3 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small to |
31F4 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small nu |
31F5 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ha |
31F6 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small hi |
31F7 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small hu |
31F8 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small he |
31F9 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ho |
31FA | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small mu |
31FB | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ra |
31FC | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ri |
31FD | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ru |
31FE | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small re |
31FF | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | katakana letter small ro |
FF01 | EX | XB | XB | - | B/A | - | XB | - | XB | fullwidth exclamation mark |
FF04 | PR | XA | XA | - | B/A | - | XA | - | XA | fullwidth dollar sign |
FF05 | PO | XB | XB | - | B/A | - | XB | - | XB | fullwidth percent sign |
FF1A | NS | XB | XB | - | B/A | - | XB | - | XB | fullwidth colon |
FF1B | NS | XB | XB | - | B/A | - | XB | - | XB | fullwidth semicolon |
FF1F | EX | XB | XB | - | B/A | - | XB | - | XB | fullwidth question mark |
FF65 | NS | XB | XB | - | B/A | - | XB | - | XB | halfwidth katakana middle dot |
FF67 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small a |
FF68 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small i |
FF69 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small u |
FF6A | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small e |
FF6B | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small o |
FF6C | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small ya |
FF6D | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small yu |
FF6E | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small yo |
FF6F | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana letter small tu |
FF70 | CJ | XB | B/A | B/A | B/A | B/A | B/A | XB | XB | halfwidth katakana-hiragana prolonged sound mark |
FFE0 | PO | XB | XB | - | B/A | - | XB | - | XB | fullwidth cent sign |
FFE1 | PR | XA | XA | - | B/A | - | XA | - | XA | fullwidth pound sign |
FFE5 | PR | XA | XA | - | B/A | - | XA | - | XA | fullwidth yen sign |
- | - | - | - | - | - | - | - | - | - | all other characters |
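As a rough illustration of how the table might be consulted, the following Python sketch stores a handful of rows copied from the table and looks up the behavior for a character in a given column. The names COLUMNS, ROWS, and break_behavior are hypothetical and not part of CSS3 Text, UAX14, or ICU. Only five rows are included here, so characters outside this sample fall through to the "all other characters" row even if the full table lists them.

```python
# Column names in the same order as the table header.
COLUMNS = ["ICU()", "ICU(ja)", "Loose()", "Loose(ja)",
           "Normal()", "Normal(ja)", "Strict()", "Strict(ja)"]

# A small sample of rows copied from the table (code point -> UAX14 class, behaviors).
# A complete implementation would carry every row of the table.
ROWS = {
    0x0021: ("EX", ["XB", "XB", "-", "B/A", "-", "XB", "-", "XB"]),        # exclamation mark
    0x2010: ("BA", ["B/A", "B/A", "-", "B/A", "-", "B/A", "-", "XB"]),     # hyphen
    0x2026: ("IN", ["XP", "XP", "B/A", "B/A", "XP", "XP", "XP", "XP"]),    # ellipsis
    0x301C: ("NS", ["XB", "XB", "-", "B/A", "-", "B/A", "-", "XB"]),       # wave dash
    0x3041: ("CJ", ["XB", "B/A", "B/A", "B/A", "B/A", "B/A", "XB", "XB"]), # hiragana letter small a
}


def break_behavior(ch: str, column: str) -> str:
    """Return the table entry for a single character in the named column.

    Characters not present in ROWS are treated like the final table row
    ("all other characters"), whose entry is "-" in every column.
    """
    index = COLUMNS.index(column)  # raises ValueError for an unknown column name
    row = ROWS.get(ord(ch))
    if row is None:
        return "-"
    return row[1][index]


if __name__ == "__main__":
    print(break_behavior("\u3041", "Normal(ja)"))
    print(break_behavior("\u3041", "Strict(ja)"))
```

A dictionary keyed by code point keeps the lookup simple; a production implementation would more likely fold these entries into its existing line breaking class data.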
Implementation Details
- If line-break is auto, content language is neither Japanese nor Chinese, and ICU does not provide a tailored set of rules that applies to the content language, then use the ICU() column's behavior;
- If line-break is auto, content language is neither Japanese nor Chinese, and ICU does provide a tailored set of rules that applies to the content language, then use the ICU() column's behavior modulo application of tailored rules;
- If line-break is auto and content language is Japanese or Chinese, then use the Normal(ja) column's behavior;
- If line-break is loose and content language is neither Japanese nor Chinese, then use the Loose() column's behavior;
- If line-break is loose and content language is Japanese or Chinese, then use the Loose(ja) column's behavior;
- If line-break is normal and content language is neither Japanese nor Chinese, then use the Normal() column's behavior;
- If line-break is normal and content language is Japanese or Chinese, then use the Normal(ja) column's behavior;
- If line-break is strict and content language is neither Japanese nor Chinese, then use the Strict() column's behavior;
- If line-break is strict and content language is Japanese or Chinese, then use the Strict(ja) column's behavior.
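The selection rules above can be sketched as a small Python function that maps a line-break value and a content language tag to a column name. This is a minimal sketch under stated assumptions: is_ja_or_zh and select_column are hypothetical names, the language check only inspects the primary subtag (it does not handle the "or equivalent" cases), and ICU tailoring for other languages is not modeled, so both auto cases for non-Japanese, non-Chinese content return ICU().

```python
def is_ja_or_zh(lang: str) -> bool:
    """Return True if the primary language subtag is 'ja' or 'zh'."""
    # Accept both 'ja-JP' and 'ja_JP' style tags; an empty tag matches nothing.
    primary = lang.replace("_", "-").split("-")[0].lower()
    return primary in ("ja", "zh")


def select_column(line_break: str, lang: str) -> str:
    """Pick the table column for a line-break value and content language."""
    cjk = is_ja_or_zh(lang)
    if line_break == "auto":
        # Japanese or Chinese content uses Normal(ja); everything else uses ICU().
        # Any tailored ICU rules for the language would be applied on top (not modeled).
        return "Normal(ja)" if cjk else "ICU()"
    prefixes = {"loose": "Loose", "normal": "Normal", "strict": "Strict"}
    if line_break not in prefixes:
        raise ValueError(f"unsupported line-break value: {line_break!r}")
    return prefixes[line_break] + ("(ja)" if cjk else "()")


if __name__ == "__main__":
    for value, lang in [("auto", "en"), ("auto", "ja-JP"), ("strict", "zh_Hant")]:
        print(value, lang, "->", select_column(value, lang))
```

Returning the column name rather than the behavior itself keeps this step separate from the per-character lookup in the table.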