Changeset 177000 in webkit

Timestamp:

Dec 8, 2014, 5:31:37 PM

Author:

dino@apple.com

Message:

[Apple] Use Accelerate framework to speed-up FEGaussianBlur
https://bugs.webkit.org/show_bug.cgi?id=139310
<rdar://problem/18434594>

PerformanceTests:

Reviewed by Simon Fraser.

Add an interactive performance test that measures the speed of a set
of blur operations on a generated image.

Interactive/blur-filter-timing.html: Added.

Source/WebCore:

<rdar://problem/18434594>

Reviewed by Simon Fraser.

Using Apple's Accelerate framework provides faster blurs
than the parallel jobs approach, especially since r168577
which started performing retina-accurate filters.

Using Accelerate.framework to replace the existing box blur (which
we use to approximate Gaussian blurs) gives about a 20% speedup on
desktop-class machines and a 2x to 6x speedup on iOS hardware.
The gain depends on the size of the content being blurred,
but it is still worthwhile.

The change is to intercept the platformApply function on
FEGaussianBlur and send it off to Accelerate.

There is an interactive performance test: PerformanceTests/Interactive/blur-filter-timing.html

platform/graphics/filters/FEGaussianBlur.cpp:

(WebCore::kernelPosition): Move this to a file static function from the .h.
(WebCore::accelerateBoxBlur): The Accelerate implementation.
(WebCore::standardBoxBlur): The default generic/standard implementation.
(WebCore::FEGaussianBlur::platformApplyGeneric): Use accelerate or the default form.
(WebCore::FEGaussianBlur::platformApply): Don't try the parallelJobs approach if Accelerate is available.
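
The path selection described for platformApplyGeneric can be sketched on its own. The condition mirrors the one in the diff (equal kernel sizes, and an edge mode of none or duplicate); the enum names, the BlurPath type, and the haveAccelerate parameter are hypothetical stand-ins for EdgeModeType and the HAVE(ACCELERATE) compile-time check.

```cpp
// Hypothetical mirror of WebCore's EdgeModeType, for illustration only.
enum class EdgeMode { Unknown, Duplicate, Wrap, None };

// Hypothetical result type naming the two implementations in the change.
enum class BlurPath { Accelerate, Standard };

// Chooses the Accelerate path only when the kernel is the same size in
// both directions and the edge mode is one the accelerated call is used
// for in the diff; everything else falls back to the standard box blur.
BlurPath chooseBlurPath(bool haveAccelerate, unsigned kernelSizeX, unsigned kernelSizeY, EdgeMode edgeMode)
{
    bool supportedEdgeMode = edgeMode == EdgeMode::None || edgeMode == EdgeMode::Duplicate;
    if (haveAccelerate && kernelSizeX == kernelSizeY && supportedEdgeMode)
        return BlurPath::Accelerate;
    return BlurPath::Standard;
}
```

In the real change the first condition is decided by the preprocessor rather than at run time, so non-Apple builds never compile the accelerated branch.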

platform/graphics/filters/FEGaussianBlur.h:

(WebCore::FEGaussianBlur::kernelPosition): Deleted. Move into the .cpp.
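
To make the effect of kernelPosition concrete, the sketch below copies the function from this changeset and runs it across the three blur iterations. The KernelOffsets struct and the offsetsForThreePasses helper are hypothetical, and the assumption that the offsets persist from one iteration to the next is inferred from the reference parameters rather than from the elided loop body.

```cpp
#include <array>

// Copied from the changeset's FEGaussianBlur.cpp so the sketch compiles alone.
inline void kernelPosition(int blurIteration, unsigned& radius, int& deltaLeft, int& deltaRight)
{
    switch (blurIteration) {
    case 0:
        if (!(radius % 2)) {
            deltaLeft = radius / 2 - 1;
            deltaRight = radius - deltaLeft;
        } else {
            deltaLeft = radius / 2;
            deltaRight = radius - deltaLeft;
        }
        break;
    case 1:
        if (!(radius % 2)) {
            deltaLeft++;
            deltaRight--;
        }
        break;
    case 2:
        if (!(radius % 2)) {
            deltaRight++;
            radius++;
        }
        break;
    }
}

// Hypothetical record of the state after each iteration.
struct KernelOffsets {
    int left;
    int right;
    unsigned radius;
};

// Hypothetical helper: runs iterations 0, 1 and 2 with shared state and
// records the offsets and radius after each call.
std::array<KernelOffsets, 3> offsetsForThreePasses(unsigned radius)
{
    std::array<KernelOffsets, 3> result {};
    int deltaLeft = 0;
    int deltaRight = 0;
    for (int i = 0; i < 3; ++i) {
        kernelPosition(i, radius, deltaLeft, deltaRight);
        result[i] = { deltaLeft, deltaRight, radius };
    }
    return result;
}
```

For an even radius of 4 the three passes use offsets (1, 3), (2, 2) and (2, 3), with the radius bumped to 5 on the last pass; for an odd radius of 5 every pass uses (2, 3). The even case shifts the box left, then right, then widens it, so the combined result stays centered.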

Source/WTF:

<rdar://problem/18434594>

Reviewed by Simon Fraser.

Add a HAVE_ACCELERATE flag, true on Apple platforms.

wtf/Platform.h:

Location:

trunk

Files:

1 added
6 edited

PerformanceTests/ChangeLog (modified) (1 diff)
PerformanceTests/Interactive/blur-filter-timing.html (added)
Source/WTF/ChangeLog (modified) (1 diff)
Source/WTF/wtf/Platform.h (modified) (1 diff)
Source/WebCore/ChangeLog (modified) (1 diff)
Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp (modified) (8 diffs)
Source/WebCore/platform/graphics/filters/FEGaussianBlur.h (modified) (2 diffs)

Legend:

(space): Unmodified
+: Added
-: Removed

trunk/PerformanceTests/ChangeLog

-              r176077
+              r177000
+2014-12-08  Dean Jackson  <dino@apple.com>
+        [Apple] Use Accelerate framework to speed-up FEGaussianBlur
+        https://bugs.webkit.org/show_bug.cgi?id=139310
+        Reviewed by Simon Fraser.
+        Add an interactive performance test that measures the speed of a set
+        of blur operations on a generated images.
+        * Interactive/blur-filter-timing.html: Added.
 2014-11-13  Zalan Bujtas  <zalan@apple.com>

trunk/Source/WTF/ChangeLog

-              r176982
+              r177000
+2014-12-08  Dean Jackson  <dino@apple.com>
+        [Apple] Use Accelerate framework to speed-up FEGaussianBlur
+        https://bugs.webkit.org/show_bug.cgi?id=139310
+        <rdar://problem/18434594>
+        Reviewed by Simon Fraser.
+        Add a HAVE_ACCELERATE flag, true on Apple platforms.
+        * wtf/Platform.h:
 2014-12-08  Myles C. Maxfield  <mmaxfield@apple.com>

trunk/Source/WTF/wtf/Platform.h

-              r176031
+              r177000
 #endif
+#if PLATFORM(COCOA)
+#define HAVE_ACCELERATE 1
+#endif
 #endif /* WTF_Platform_h */

trunk/Source/WebCore/ChangeLog

-              r176999
+              r177000
+2014-12-08  Dean Jackson  <dino@apple.com>
+        [Apple] Use Accelerate framework to speed-up FEGaussianBlur
+        https://bugs.webkit.org/show_bug.cgi?id=139310
+        <rdar://problem/18434594>
+        Reviewed by Simon Fraser.
+        Using Apple's Accelerate framework provides faster blurs
+        than the parallel jobs approach, especially since r168577
+        which started performing retina-accurate filters.
+        Using Accelerate.framework to replace the existing box blur (what
+        we use to approximate Gaussian blurs) gets about a 20% speedup on
+        desktop class machines, but between a 2x-6x speedup on iOS hardware.
+        Obviously this depends on the size of the content being blurred,
+        but it is still good.
+        The change is to intercept the platformApply function on
+        FEGaussianBlur and send it off to Accelerate.
+        There is an interactive performance test: PerformanceTests/Interactive/blur-filter-timing.html
+        * platform/graphics/filters/FEGaussianBlur.cpp:
+        (WebCore::kernelPosition): Move this to a file static function from the .h.
+        (WebCore::accelerateBoxBlur): The Accelerate implementation.
+        (WebCore::standardBoxBlur): The default generic/standard implementation.
+        (WebCore::FEGaussianBlur::platformApplyGeneric): Use accelerate or the default form.
+        (WebCore::FEGaussianBlur::platformApply): Don't try the parallelJobs approach if Accelerate is available.
+        * platform/graphics/filters/FEGaussianBlur.h:
+        (WebCore::FEGaussianBlur::kernelPosition): Deleted. Move into the .cpp.
 2014-12-08  Beth Dakin  <bdakin@apple.com>

trunk/Source/WebCore/platform/graphics/filters/FEGaussianBlur.cpp

-              r173397
+              r177000
 #include "TextStream.h"
+#if HAVE(ACCELERATE)
+#include <Accelerate/Accelerate.h>
+#endif
 #include <runtime/JSCInlines.h>
 #include <runtime/TypedArrayInlines.h>
 …
 namespace WebCore {
+inline void kernelPosition(int blurIteration, unsigned& radius, int& deltaLeft, int& deltaRight)
+{
+    // Check http://www.w3.org/TR/SVG/filters.html#feGaussianBlurElement for details.
+    switch (blurIteration) {
+    case 0:
+        if (!(radius % 2)) {
+            deltaLeft = radius / 2 - 1;
+            deltaRight = radius - deltaLeft;
+        } else {
+            deltaLeft = radius / 2;
+            deltaRight = radius - deltaLeft;
+        }
+        break;
+    case 1:
+        if (!(radius % 2)) {
+            deltaLeft++;
+            deltaRight--;
+        }
+        break;
+    case 2:
+        if (!(radius % 2)) {
+            deltaRight++;
+            radius++;
+        }
+        break;
+    }
+}
 FEGaussianBlur::FEGaussianBlur(Filter* filter, float x, float y, EdgeModeType edgeMode)
 …
 }
-inline void FEGaussianBlur::platformApplyGeneric(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* tmpPixelArray, unsigned kernelSizeX, unsigned kernelSizeY, IntSize& paintSize)
-{
-    int stride = 4 * paintSize.width();
+#if HAVE(ACCELERATE)
+inline void accelerateBoxBlur(const Uint8ClampedArray* src, Uint8ClampedArray* dst, unsigned kernelSize, int stride, int effectWidth, int effectHeight)
+{
+    // We must always use an odd radius.
+    if (kernelSize % 2 != 1)
+        kernelSize += 1;
+    vImage_Buffer effectInBuffer;
+    effectInBuffer.data = src->data();
+    effectInBuffer.width = effectWidth;
+    effectInBuffer.height = effectHeight;
+    effectInBuffer.rowBytes = stride;
+    vImage_Buffer effectOutBuffer;
+    effectOutBuffer.data = dst->data();
+    effectOutBuffer.width = effectWidth;
+    effectOutBuffer.height = effectHeight;
+    effectOutBuffer.rowBytes = stride;
+    // Determine the size of a temporary buffer by calling the function first with a special flag. vImage will return
+    // the size needed, or an error (which are all negative).
+    size_t tmpBufferSize = vImageBoxConvolve_ARGB8888(&effectInBuffer, &effectOutBuffer, 0, 0, 0, kernelSize, kernelSize, 0, kvImageEdgeExtend | kvImageGetTempBufferSize);
+    if (tmpBufferSize <= 0)
+        return;
+    void* tmpBuffer = fastMalloc(tmpBufferSize);
+    vImageBoxConvolve_ARGB8888(&effectInBuffer, &effectOutBuffer, tmpBuffer, 0, 0, kernelSize, kernelSize, 0, kvImageEdgeExtend);
+    vImageBoxConvolve_ARGB8888(&effectOutBuffer, &effectInBuffer, tmpBuffer, 0, 0, kernelSize, kernelSize, 0, kvImageEdgeExtend);
+    vImageBoxConvolve_ARGB8888(&effectInBuffer, &effectOutBuffer, tmpBuffer, 0, 0, kernelSize, kernelSize, 0, kvImageEdgeExtend);
+    WTF::fastFree(tmpBuffer);
+    // The final result should be stored in src.
+    if (dst == src) {
+        ASSERT(src->length() == dst->length());
+        memcpy(dst->data(), src->data(), src->length());
+    }
+}
+#endif
+inline void standardBoxBlur(Uint8ClampedArray* src, Uint8ClampedArray* dst, unsigned kernelSizeX, unsigned kernelSizeY, int stride, IntSize& paintSize, bool isAlphaImage, EdgeModeType edgeMode)
+{
     int dxLeft = 0;
     int dxRight = 0;
     int dyLeft = 0;
     int dyRight = 0;
-    Uint8ClampedArray* src = srcPixelArray;
-    Uint8ClampedArray* dst = tmpPixelArray;
     for (int i = 0; i < 3; ++i) {
 …
                 boxBlurNEON(src, dst, kernelSizeX, dxLeft, dxRight, 4, stride, paintSize.width(), paintSize.height());
             else
-                boxBlur(src, dst, kernelSizeX, dxLeft, dxRight, 4, stride, paintSize.width(), paintSize.height(), true, m_edgeMode);
+                boxBlur(src, dst, kernelSizeX, dxLeft, dxRight, 4, stride, paintSize.width(), paintSize.height(), true, edgeMode);
 #else
-            boxBlur(src, dst, kernelSizeX, dxLeft, dxRight, 4, stride, paintSize.width(), paintSize.height(), isAlphaImage(), m_edgeMode);
+            boxBlur(src, dst, kernelSizeX, dxLeft, dxRight, 4, stride, paintSize.width(), paintSize.height(), isAlphaImage, edgeMode);
 #endif
             std::swap(src, dst);
 …
                 boxBlurNEON(src, dst, kernelSizeY, dyLeft, dyRight, stride, 4, paintSize.height(), paintSize.width());
             else
-                boxBlur(src, dst, kernelSizeY, dyLeft, dyRight, stride, 4, paintSize.height(), paintSize.width(), true, m_edgeMode);
+                boxBlur(src, dst, kernelSizeY, dyLeft, dyRight, stride, 4, paintSize.height(), paintSize.width(), true, edgeMode);
 #else
-            boxBlur(src, dst, kernelSizeY, dyLeft, dyRight, stride, 4, paintSize.height(), paintSize.width(), isAlphaImage(), m_edgeMode);
+            boxBlur(src, dst, kernelSizeY, dyLeft, dyRight, stride, 4, paintSize.height(), paintSize.width(), isAlphaImage, edgeMode);
 #endif
             std::swap(src, dst);
 …
+    }
-    // The final result should be stored in srcPixelArray.
-    if (dst == srcPixelArray) {
+    // The final result should be stored in src.
+    if (dst == src) {
         ASSERT(src->length() == dst->length());
         memcpy(dst->data(), src->data(), src->length());
+    }
+}
+inline void FEGaussianBlur::platformApplyGeneric(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* tmpPixelArray, unsigned kernelSizeX, unsigned kernelSizeY, IntSize& paintSize)
+{
+    int stride = 4 * paintSize.width();
+#if HAVE(ACCELERATE)
+    if (kernelSizeX == kernelSizeY && (m_edgeMode == EDGEMODE_NONE || m_edgeMode == EDGEMODE_DUPLICATE)) {
+        accelerateBoxBlur(srcPixelArray, tmpPixelArray, kernelSizeX, stride, paintSize.width(), paintSize.height());
+        return;
+    }
+#endif
+    standardBoxBlur(srcPixelArray, tmpPixelArray, kernelSizeX, kernelSizeY, stride, paintSize, isAlphaImage(), m_edgeMode);
+}
 …
 inline void FEGaussianBlur::platformApply(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* tmpPixelArray, unsigned kernelSizeX, unsigned kernelSizeY, IntSize& paintSize)
+{
+#if !HAVE(ACCELERATE)
     int scanline = 4 * paintSize.width();
     int extraHeight = 3 * kernelSizeY * 0.5f;
 …
         // Fallback to single threaded mode.
+    }
+#endif
     // The selection here eventually should happen dynamically on some platforms.
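
One detail worth checking when adapting the temporary-buffer query in accelerateBoxBlur: vImage functions return vImage_Error, a signed type, and the comment in the diff notes that errors are negative, yet the result is stored in a size_t before the `<= 0` test. The portable sketch below illustrates only that signed-to-unsigned conversion. querySizeOrError is a hypothetical stand-in, not a vImage call, and the error value -1 is arbitrary.

```cpp
#include <cstddef>

// Hypothetical stand-in for an API that returns a byte count on success
// and a negative error code on failure.
std::ptrdiff_t querySizeOrError(bool fail)
{
    return fail ? -1 : 4096;
}

// Stores the result in an unsigned type first. A negative error wraps to
// a very large positive value, so the "<= 0" test only catches zero.
bool looksLikeFailureUnsigned(bool fail)
{
    std::size_t size = static_cast<std::size_t>(querySizeOrError(fail));
    return size <= 0;
}

// Keeps the result signed until after the check, so negative error codes
// are detected before the value is used as an allocation size.
bool looksLikeFailureSigned(bool fail)
{
    std::ptrdiff_t size = querySizeOrError(fail);
    return size <= 0;
}
```

With the unsigned variant a failing query is not reported as a failure, and the wrapped value would then be passed on as an allocation size; the signed variant reports it.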

trunk/Source/WebCore/platform/graphics/filters/FEGaussianBlur.h

-              r173397
+              r177000
     FEGaussianBlur(Filter*, float, float, EdgeModeType);
-    static inline void kernelPosition(int boxBlur, unsigned& std, int& dLeft, int& dRight);
     inline void platformApply(Uint8ClampedArray* srcPixelArray, Uint8ClampedArray* tmpPixelArray, unsigned kernelSizeX, unsigned kernelSizeY, IntSize& paintSize);
 …
 };
-inline void FEGaussianBlur::kernelPosition(int boxBlur, unsigned& std, int& dLeft, int& dRight)
-{
-    // check http://www.w3.org/TR/SVG/filters.html#feGaussianBlurElement for details
-    switch (boxBlur) {
-    case 0:
-        if (!(std % 2)) {
-            dLeft = std / 2 - 1;
-            dRight = std - dLeft;
-        } else {
-            dLeft = std / 2;
-            dRight = std - dLeft;
-        }
-        break;
-    case 1:
-        if (!(std % 2)) {
-            dLeft++;
-            dRight--;
-        }
-        break;
-    case 2:
-        if (!(std % 2)) {
-            dRight++;
-            std++;
-        }
-        break;
-    }
-}
 } // namespace WebCore
