Changeset 247421 in webkit


Ignore:
Timestamp:
Jul 13, 2019 8:36:19 PM (5 years ago)
Author:
Wenson Hsieh
Message:

[Text autosizing] [iPadOS] Further adjust our heuristics to determine text autosizing candidates
https://bugs.webkit.org/show_bug.cgi?id=199780
<rdar://problem/52289088>

Reviewed by Simon Fraser.

Source/WebCore:

Our current idempotent text autosizing candidate heuristic makes the right judgment call most of the time, but
there is still a large batch of text autosizing bugs left unfixed by the first iteration of the heuristic added
in r246781. This patch attempts to address most of these bugs by adjusting the decision-tree-based heuristic
once again, mostly with improvements to the model generation pipeline.

During the first iteration, I placed emphasis on tuning the max tree depth and min leaf size hyperparameters
when coming up with my decision tree, and didn't consider the inclusion or exclusion of each feature as a
hyperparameters. As such, the trees generated using the pipeline tended to use too many features, and as a
result, tended to have cross-validation overall accuracy scores hovering around 73%.

In this revised model generation pipeline, I now consider the inclusion of each feature (along with max depth
and min leaf size, as before) as a hyperparameter. Since this increases the number of hyperparameters by many
orders of magnitude, a naive grid search (as described in the prior ChangeLog entry) is no longer a tractible
procedure for tuning hyperparameters to the training algorithm.

Instead, I now use a stochastic greedy algorithm to search for good sets of hyperparameters; this process begins
with seeding some number (usually 20-24) of "searchers" with completely randomized sets of hyperparameters (i.e.
random max depth, random leaf size, and random subsets of features). I then evaluate the average performance of
each set of hyperparameters by using them to generate 2000 decision trees over 90% of the training data, and
then cross-validating these trees against the remaining 10%. These cross-validation scores are aggregated into a
single confusion matrix, which is then passed into a loss function that computes a single value indicating how
well training with the set of hyperparameters generalized to cross-validation data. After experimenting with
various loss functions, I settled on the following:

k(false positive rate)^2 + (false negative rate)^2

...where a constant k is chosen to penalize false positives (i.e. broken layout) more harshly than false
negatives (small text). Additionally, squaring the false negative and false positive rates seems to help avoid
converging on solutions that heavily favor reducing only false positives or false negatives, or vice versa.

The stochastic algorithm starts by computing a loss value for the randomly generated configuration. Then, for
an indefinite number of iterations, it randomly mutates the configuration (e.g. by adding or removing features,
or changing min leaf size or max tree depth) and computes a new loss value for the mutated configuration. If the
mutated configuration performs better (i.e. achieves lower loss) than the current configuration, I set the
current configuration to be the mutated configuration. Otherwise, I keep the current (non-mutated) configuration
as-is. The stochastic algorithm then proceeds, ad-infinitum, with this current configuration.

Of course, since each mutation is small, this strategy so far is prone to leaving each searcher stuck in local
optima. To mitigate this, for each searcher, I keep track of a side-table of configurations that have already
been tested; when random mutations would normally lead to testing a configuration that has already been tested,
each searcher instead increases the chance of applying additional mutations. This has the effect of searchers
initially exhausting similar configurations, and expanding to test more and more dissimilar configurations as
the local alternatives all turn out to be worse. This allows searchers to effectively jump out of local optima
after being stuck for a long time.

So, using these strategies, I simultaneously ran a handful of searchers until they all appeared to converge
(a process that takes 8-12 hours on my current dataset). Many of the searchers achieved configurations with
cross-validation scores of 81% and above, up from the 73% of the previous attempt. These additionally have the
added bonus of reducing the number of features, often making the final trees themselves shallower and simpler to
understand than before.

This patch introduces one such decision tree generated using a set of hyperparameters acquired via this
stochasic search algorithm; it appears to simultaneously use fewer features, and achieve better cross-validation
performance.

Test: fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates.html

  • css/StyleResolver.cpp:

(WebCore::StyleResolver::adjustRenderStyleForTextAutosizing):

Adjust the early return to bail if either (1) the element is a candidate and the computed size is already equal
to the boosted size, or (2) the element is not a candidate and the computed size is already equal to the
specified size. Since the autosizing candidate heuristic depends on styles specified on the element itself (as
opposed to styles on any element in the ancestor chain), a parent may be an autosizing candidate, but a child of
it may not.

  • rendering/style/RenderStyle.cpp:

(WebCore::RenderStyle::isIdempotentTextAutosizingCandidate const):

Revamp the idempotent text autosizing candidate heuristic. See the explanation above for more details.

  • rendering/style/RenderStyle.h:

Remove some bits from RenderStyle's autosizeStatus, now that we care about fewer bits of information from the
inherited flags.

  • rendering/style/TextSizeAdjustment.cpp:

(WebCore::AutosizeStatus::updateStatus):

  • rendering/style/TextSizeAdjustment.h:

LayoutTests:

Rebaseline an existing idempotent text autosizing test, and add an additional test case.

  • fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates-expected.txt:
  • fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates.html:
Location:
trunk
Files:
9 edited

Legend:

Unmodified
Added
Removed
  • trunk/LayoutTests/ChangeLog

    r247420 r247421  
     12019-07-13  Wenson Hsieh  <wenson_hsieh@apple.com>
     2
     3        [Text autosizing] [iPadOS] Further adjust our heuristics to determine text autosizing candidates
     4        https://bugs.webkit.org/show_bug.cgi?id=199780
     5        <rdar://problem/52289088>
     6
     7        Reviewed by Simon Fraser.
     8
     9        Rebaseline an existing idempotent text autosizing test, and add an additional test case.
     10
     11        * fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates-expected.txt:
     12        * fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates.html:
     13
    1142019-07-13  Simon Fraser  <simon.fraser@apple.com>
    215
  • trunk/LayoutTests/fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates-expected.txt

    r246781 r247421  
    1414PASS result is >= result2
    1515Checking target8:
    16 PASS result is >= 13
     16PASS result is 12
    1717Checking target9:
    1818PASS result is >= 13
     
    2222PASS result is >= 13
    2323Checking target12:
     24PASS result is >= 13
     25Checking target13:
    2426PASS result is 12
    25 Checking target13:
    26 PASS result is >= 13
    2727Checking target14:
    2828PASS result is 12
     
    3131Checking target16:
    3232PASS result is >= 13
     33Checking target17:
     34PASS result is 12
    3335PASS successfullyParsed is true
    3436
     
    4951Test
    5052Test
     53Test
  • trunk/LayoutTests/fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates.html

    r246781 r247421  
    2626<div style="background: green;"><span id="target13" style="font-size: 12px; height: 20px; position: fixed; float: right; overflow-x: hidden; width: 100px;">Test</span></div>
    2727<div style="background: green;"><span id="target14" style="font-size: 12px; height: 20px; width: 100px; float: right;">Test</span></div>
    28 <div style="background: green;"><span id="target15" style="overflow-y: hidden; float: right;">Test</span></div>
    29 <div style="background: green;"><span id="target16" style="float: right;">Test</span></div>
     28<div style="background: green;"><span id="target15" style="font-size: 12px; overflow-y: hidden; float: right;">Test</span></div>
     29<div style="background: green;"><span id="target16" style="font-size: 12px; float: right;">Test</span></div>
     30<div style="background: green;"><span id="target17" style="font-size: 12px; -webkit-text-size-adjust: 100%;">Test</span></div>
    3031<script>
    3132let result;
     
    5657shouldBeGreaterThanOrEqual("result", "result2");
    5758
    58 check("target8", true);
     59check("target8", false);
    5960check("target9", true);
    6061check("target10", true);
     
    6364// common patterns in real webpages.
    6465check("target11", true);
    65 check("target12", false);
    66 check("target13", true);
     66check("target12", true);
     67check("target13", false);
    6768check("target14", false);
    6869check("target15", true);
    6970check("target16", true);
     71check("target17", false);
    7072</script>
    7173<script src="../../../../resources/js-test-post.js"></script>
  • trunk/Source/WebCore/ChangeLog

    r247420 r247421  
     12019-07-13  Wenson Hsieh  <wenson_hsieh@apple.com>
     2
     3        [Text autosizing] [iPadOS] Further adjust our heuristics to determine text autosizing candidates
     4        https://bugs.webkit.org/show_bug.cgi?id=199780
     5        <rdar://problem/52289088>
     6
     7        Reviewed by Simon Fraser.
     8
     9        Our current idempotent text autosizing candidate heuristic makes the right judgment call most of the time, but
     10        there is still a large batch of text autosizing bugs left unfixed by the first iteration of the heuristic added
     11        in r246781. This patch attempts to address most of these bugs by adjusting the decision-tree-based heuristic
     12        once again, mostly with improvements to the model generation pipeline.
     13
     14        During the first iteration, I placed emphasis on tuning the max tree depth and min leaf size hyperparameters
     15        when coming up with my decision tree, and didn't consider the inclusion or exclusion of each feature as a
     16        hyperparameters. As such, the trees generated using the pipeline tended to use too many features, and as a
     17        result, tended to have cross-validation overall accuracy scores hovering around 73%.
     18
     19        In this revised model generation pipeline, I now consider the inclusion of each feature (along with max depth
     20        and min leaf size, as before) as a hyperparameter. Since this increases the number of hyperparameters by many
     21        orders of magnitude, a naive grid search (as described in the prior ChangeLog entry) is no longer a tractible
     22        procedure for tuning hyperparameters to the training algorithm.
     23
     24        Instead, I now use a stochastic greedy algorithm to search for good sets of hyperparameters; this process begins
     25        with seeding some number (usually 20-24) of "searchers" with completely randomized sets of hyperparameters (i.e.
     26        random max depth, random leaf size, and random subsets of features). I then evaluate the average performance of
     27        each set of hyperparameters by using them to generate 2000 decision trees over 90% of the training data, and
     28        then cross-validating these trees against the remaining 10%. These cross-validation scores are aggregated into a
     29        single confusion matrix, which is then passed into a loss function that computes a single value indicating how
     30        well training with the set of hyperparameters generalized to cross-validation data. After experimenting with
     31        various loss functions, I settled on the following:
     32
     33        `k(false positive rate)^2 + (false negative rate)^2`
     34
     35        ...where a constant k is chosen to penalize false positives (i.e. broken layout) more harshly than false
     36        negatives (small text). Additionally, squaring the false negative and false positive rates seems to help avoid
     37        converging on solutions that heavily favor reducing only false positives or false negatives, or vice versa.
     38
     39        The stochastic algorithm starts by computing a loss value for the randomly generated configuration. Then, for
     40        an indefinite number of iterations, it randomly mutates the configuration (e.g. by adding or removing features,
     41        or changing min leaf size or max tree depth) and computes a new loss value for the mutated configuration. If the
     42        mutated configuration performs better (i.e. achieves lower loss) than the current configuration, I set the
     43        current configuration to be the mutated configuration. Otherwise, I keep the current (non-mutated) configuration
     44        as-is. The stochastic algorithm then proceeds, ad-infinitum, with this current configuration.
     45
     46        Of course, since each mutation is small, this strategy so far is prone to leaving each searcher stuck in local
     47        optima. To mitigate this, for each searcher, I keep track of a side-table of configurations that have already
     48        been tested; when random mutations would normally lead to testing a configuration that has already been tested,
     49        each searcher instead increases the chance of applying additional mutations. This has the effect of searchers
     50        initially exhausting similar configurations, and expanding to test more and more dissimilar configurations as
     51        the local alternatives all turn out to be worse. This allows searchers to effectively jump out of local optima
     52        after being stuck for a long time.
     53
     54        So, using these strategies, I simultaneously ran a handful of searchers until they all appeared to converge
     55        (a process that takes 8-12 hours on my current dataset). Many of the searchers achieved configurations with
     56        cross-validation scores of 81% and above, up from the 73% of the previous attempt. These additionally have the
     57        added bonus of reducing the number of features, often making the final trees themselves shallower and simpler to
     58        understand than before.
     59
     60        This patch introduces one such decision tree generated using a set of hyperparameters acquired via this
     61        stochasic search algorithm; it appears to simultaneously use fewer features, and achieve better cross-validation
     62        performance.
     63
     64        Test: fast/text-autosizing/ios/idempotentmode/idempotent-autosizing-candidates.html
     65
     66        * css/StyleResolver.cpp:
     67        (WebCore::StyleResolver::adjustRenderStyleForTextAutosizing):
     68
     69        Adjust the early return to bail if either (1) the element is a candidate and the computed size is already equal
     70        to the boosted size, or (2) the element is not a candidate and the computed size is already equal to the
     71        specified size. Since the autosizing candidate heuristic depends on styles specified on the element itself (as
     72        opposed to styles on any element in the ancestor chain), a parent may be an autosizing candidate, but a child of
     73        it may not.
     74
     75        * rendering/style/RenderStyle.cpp:
     76        (WebCore::RenderStyle::isIdempotentTextAutosizingCandidate const):
     77
     78        Revamp the idempotent text autosizing candidate heuristic. See the explanation above for more details.
     79
     80        * rendering/style/RenderStyle.h:
     81
     82        Remove some bits from RenderStyle's autosizeStatus, now that we care about fewer bits of information from the
     83        inherited flags.
     84
     85        * rendering/style/TextSizeAdjustment.cpp:
     86        (WebCore::AutosizeStatus::updateStatus):
     87        * rendering/style/TextSizeAdjustment.h:
     88
    1892019-07-13  Simon Fraser  <simon.fraser@apple.com>
    290
  • trunk/Source/WebCore/css/StyleResolver.cpp

    r247377 r247421  
    908908    };
    909909
    910     if (!style.isIdempotentTextAutosizingCandidate())
    911         return adjustLineHeightIfNeeded(style.computedFontSize());
    912 
    913910    auto fontDescription = style.fontDescription();
    914     auto initialComputedFontSize = fontDescription.computedSize();
     911    auto initialComputedFontSize = fontDescription.computedSize();
     912    auto specifiedFontSize = fontDescription.specifiedSize();
     913    bool isCandidate = style.isIdempotentTextAutosizingCandidate();
     914    if (!isCandidate && WTF::areEssentiallyEqual(initialComputedFontSize, specifiedFontSize))
     915        return;
     916
    915917    auto adjustedFontSize = AutosizeStatus::idempotentTextSize(fontDescription.specifiedSize(), initialScale);
    916     if (initialComputedFontSize == adjustedFontSize)
    917         return;
    918 
    919     fontDescription.setComputedSize(adjustedFontSize);
     918    if (isCandidate && WTF::areEssentiallyEqual(initialComputedFontSize, adjustedFontSize))
     919        return;
     920
     921    fontDescription.setComputedSize(isCandidate ? adjustedFontSize : specifiedFontSize);
    920922    style.setFontDescription(WTFMove(fontDescription));
    921923    style.fontCascade().update(&document().fontSelector());
  • trunk/Source/WebCore/rendering/style/RenderStyle.cpp

    r246781 r247421  
    501501
    502502    if (fields.contains(AutosizeStatus::Fields::FixedHeight)) {
    503         if (whiteSpace() == WhiteSpace::NoWrap)
     503        if (fields.contains(AutosizeStatus::Fields::FixedWidth)) {
     504            if (whiteSpace() == WhiteSpace::NoWrap) {
     505                if (width().isFixed())
     506                    return false;
     507
     508                return true;
     509            }
     510
     511            if (fields.contains(AutosizeStatus::Fields::Floating))
     512                return false;
     513
     514            if (fields.contains(AutosizeStatus::Fields::OverflowXHidden))
     515                return false;
     516
    504517            return true;
    505 
     518        }
     519
     520        if (fields.contains(AutosizeStatus::Fields::OverflowXHidden)) {
     521            if (fields.contains(AutosizeStatus::Fields::Floating))
     522                return false;
     523
     524            return true;
     525        }
     526
     527        return true;
     528    }
     529
     530    if (width().isFixed())
     531        return false;
     532
     533    if (textSizeAdjust().isPercentage() && textSizeAdjust().percentage() == 100) {
    506534        if (fields.contains(AutosizeStatus::Fields::Floating))
    507             return fields.contains(AutosizeStatus::Fields::OutOfFlowPosition) && fields.contains(AutosizeStatus::Fields::OverflowXHidden);
     535            return true;
    508536
    509537        if (fields.contains(AutosizeStatus::Fields::FixedWidth))
    510             return !fields.contains(AutosizeStatus::Fields::OutOfFlowPosition);
    511     }
    512 
    513     if (fields.contains(AutosizeStatus::Fields::Floating))
    514         return true;
    515 
    516     if (fields.contains(AutosizeStatus::Fields::FixedWidth))
    517         return fields.contains(AutosizeStatus::Fields::OverflowYHidden);
    518 
    519     return !fields.contains(AutosizeStatus::Fields::OverflowYHidden) && !fields.contains(AutosizeStatus::Fields::FixedMaxWidth);
     538            return true;
     539
     540        return false;
     541    }
     542
     543    return true;
    520544}
    521545
  • trunk/Source/WebCore/rendering/style/RenderStyle.h

    r246781 r247421  
    18361836
    18371837#if ENABLE(TEXT_AUTOSIZING)
    1838         unsigned autosizeStatus : 8;
    1839 #endif
    1840         // 56 bits
     1838        unsigned autosizeStatus : 5;
     1839#endif
     1840        // 53 bits
    18411841    };
    18421842
  • trunk/Source/WebCore/rendering/style/TextSizeAdjustment.cpp

    r246781 r247421  
    5050        result.add(Fields::DisplayNone);
    5151
    52     if (style.hasOutOfFlowPosition())
    53         result.add(Fields::OutOfFlowPosition);
    54 
    5552    if (style.height().isFixed())
    5653        result.add(Fields::FixedHeight);
     
    5956        result.add(Fields::FixedWidth);
    6057
    61     if (style.maxWidth().isFixed())
    62         result.add(Fields::FixedMaxWidth);
    63 
    6458    if (style.overflowX() == Overflow::Hidden)
    6559        result.add(Fields::OverflowXHidden);
    66 
    67     if (style.overflowY() == Overflow::Hidden)
    68         result.add(Fields::OverflowYHidden);
    6960
    7061    if (style.isFloating())
  • trunk/Source/WebCore/rendering/style/TextSizeAdjustment.h

    r246781 r247421  
    5858        Floating = 1 << 3,
    5959        OverflowXHidden = 1 << 4,
    60         OverflowYHidden = 1 << 5,
    61         OutOfFlowPosition = 1 << 6,
    62         FixedMaxWidth = 1 << 7
    6360        // Adding new values requires giving RenderStyle::InheritedFlags::autosizeStatus additional bits.
    6461    };
Note: See TracChangeset for help on using the changeset viewer.