WebGL Extension Proposal

Byungseon Shin, LG


  • YouTube on TV HTML5
    • YouTube 360 does not play at a smooth 60 fps on embedded devices
  • WebGL for 360-degree video.
  • Must support VideoElement source
  • Should render smoothly (~60 fps)
    • Most of the GPU resources are consumed by Color Space Conversion (CSC) processing

VR Technology

  • 360 VR Tech: spherical mapping of equirectangular textures
  • Mapping a flat surface onto a 3D model (in this case, a sphere).
  • Equirectangular format: a bit like a map projection.
  • Native OpenGL ES implementation ran at 60 fps. Video -> libvt - { texture } -> Sphere Mapping (OpenGL) -> Frame Buffer.
  • Initially tried WebGL, but it was slow (~30 fps). Video -> libvt - { texture } -> WebGL (YUV -> RGBA) -> Sphere Mapping (OpenGL) -> Frame Buffer.
    • There's an extra step of converting the libvt output to RGBA
    • In the case of 360 VR, the final on-screen area is far smaller than the input video frame, but the full frame has to be converted to use WebGL
    • Three bottlenecks:
      • Color Space Conversion (YUV -> RGB): 90%
      • Texture Copy inside Video Decoder: 8-9%
      • Compositing of WebGL canvas: 1-2%
  • Currently, WebGL only supports TEXTURE_2D (RGBA) as an import target, not YUV.
  • YUV -> RGB conversion is very slow.
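
The two per-pixel steps named in the list above (sphere mapping of an equirectangular texture, and YUV -> RGB conversion) can be sketched on the CPU as follows. This is only an illustration, not the code from the talk: the function names are hypothetical, the axis convention (-Z forward, +Y up) is an assumption, and the conversion uses full-range BT.601 coefficients, whereas real video is often limited-range BT.601 or BT.709.

```typescript
// Map a view direction to equirectangular texture coordinates (u, v in [0, 1]).
// Assumed convention: -Z is forward, +Y is up, v = 0 is the top row of the image.
function directionToEquirectUV(x: number, y: number, z: number): [number, number] {
  const len = Math.sqrt(x * x + y * y + z * z);
  const lon = Math.atan2(x, -z);   // longitude, 0 when looking forward
  const lat = Math.asin(y / len);  // latitude in [-PI/2, PI/2]
  const u = lon / (2 * Math.PI) + 0.5;
  const v = 0.5 - lat / Math.PI;
  return [u, v];
}

// Clamp and round a channel value to the 0..255 range.
function clampToByte(value: number): number {
  return Math.min(255, Math.max(0, Math.round(value)));
}

// Convert one full-range BT.601 YUV sample (0..255 each) to RGB.
// This is the per-pixel work that the WebGL path performs for the whole frame.
function yuvToRgb(yv: number, u: number, v: number): [number, number, number] {
  const cb = u - 128;
  const cr = v - 128;
  const r = yv + 1.402 * cr;
  const g = yv - 0.344136 * cb - 0.714136 * cr;
  const b = yv + 1.772 * cb;
  return [clampToByte(r), clampToByte(g), clampToByte(b)];
}
```

Looking straight ahead, `directionToEquirectUV(0, 0, -1)` lands in the centre of the image, and a neutral-chroma sample such as `yuvToRgb(128, 128, 128)` stays grey. The point of the notes is that the second function is applied to every pixel of the input frame, even though only a small part of the sphere ends up on screen.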



  • Add an OES_EGL_image_external extension to WebGL. This allows YUV textures to be handled without color space conversion.
  • Solution: currently, WebGL only supports TEXTURE_2D (RGBA). We should make it support TEXTURE_EXTERNAL_OES as well
  • Expose OES_EGL_image_external functionality to WebGL (define a new texture target TEXTURE_EXTERNAL_OES)
  • Advantage over existing proposals: focuses on extending the texture formats of WebGL, which makes it easy to port and use
  • 2 issues pointed out in the WG:
    • EGL image sync issue (syncing EGL image between video decoder)
    • Audio sync (need exact timestamp of the frame currently being rendered)
  • Also need other per-frame metadata like dimensions of the texture
  • This approach avoids converting the entire video image from YUV -> RGBA. Instead, only the region presented in the frame buffer needs to be converted.
  • This extension already exists in OpenGL ES. It is just a matter of exposing it through WebGL.
  • Some competing proposals work on a similar problem, but are not as good a fit
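
A minimal sketch of how an application might use the proposed extension, assuming WebGL exposes it under the name `OES_EGL_image_external` and accepts the same GLSL directive and sampler type as OpenGL ES. That binding is the proposal itself, not a shipped API. `GL_TEXTURE_EXTERNAL_OES` (0x8D65) and `samplerExternalOES` come from the OpenGL ES extension. The `ExtensionSource` interface and `pickVideoTexturePath` helper are hypothetical names used only for illustration.

```typescript
// Enum value of GL_TEXTURE_EXTERNAL_OES in the OpenGL ES extension.
// Assumption: a WebGL binding would reuse the same value.
const TEXTURE_EXTERNAL_OES = 0x8d65;

// Fragment shader that samples the video frame through an external sampler,
// so any YUV handling happens only for the fragments actually drawn.
const EXTERNAL_FRAGMENT_SHADER = `
#extension GL_OES_EGL_image_external : require
precision mediump float;
uniform samplerExternalOES uVideo;
varying vec2 vUV;
void main() {
  gl_FragColor = texture2D(uVideo, vUV);
}
`;

// Minimal structural type so the helper works with a real WebGL context
// or with a stub in tests.
interface ExtensionSource {
  getExtension(name: string): unknown;
}

// Choose the external-texture path when the (proposed) extension is present,
// otherwise fall back to the existing TEXTURE_2D (RGBA) upload path.
function pickVideoTexturePath(gl: ExtensionSource): "external" | "rgba" {
  return gl.getExtension("OES_EGL_image_external") ? "external" : "rgba";
}
```

With a real context, `pickVideoTexturePath(canvas.getContext("webgl")!)` would return `"rgba"` on any implementation that does not expose the extension, so existing content keeps working.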


  • Audio sync is a key challenge to overcome.
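
To illustrate why an exact per-frame timestamp matters, here is a rough sketch of the comparison a renderer could make once it knows the presentation time of the frame held in the texture. How that timestamp would be obtained is the open issue noted above. The function name, the action names, and the default tolerance of one 60 fps frame are all assumptions.

```typescript
type SyncAction = "present" | "wait" | "drop";

// Compare the timestamp of the frame currently in the texture with the
// audio clock and decide what to do with the frame.
function chooseSyncAction(
  framePtsSec: number,
  audioClockSec: number,
  toleranceSec: number = 1 / 60
): SyncAction {
  const drift = framePtsSec - audioClockSec; // > 0: video is ahead of audio
  if (drift > toleranceSec) return "wait";   // hold the frame until audio catches up
  if (drift < -toleranceSec) return "drop";  // frame is stale, skip to a newer one
  return "present";
}
```

Without the frame timestamp, `framePtsSec` can only be estimated, and the estimate may be off by more than the tolerance.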


  • Why not just decode the portion of the image that will be displayed?
    • Might work if the viewport is fixed, but this won't have optimal performance when changing the viewport
Last modified on Oct 27, 2016 1:27:57 PM