
WebGL Extension Proposal

Byungseon Shin, LG

YouTube on TV HTML5
- YouTube 360 does not play at a smooth 60 fps on embedded devices
WebGL for 360-degree video:
Must support a VideoElement source.
Should render smoothly (~60 fps).
- Most of the GPU resources are consumed by Color Space Conversion (CSC) processing

360 VR Tech: spherical mapping of equirectangular textures
Mapping a flat surface onto a 3D model (in this case, a sphere).
Equirectangular format: a bit like a map projection.
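
To make the mapping concrete, here is a minimal sketch of how a view direction on the sphere could be turned into coordinates in the equirectangular frame. The axis convention (+Y up, -Z at the centre of the image) and the names `directionToEquirectUV`, `Vec3` and `UV` are assumptions for illustration; they are not taken from the presentation.

```typescript
// Hypothetical helper types, introduced only for this sketch.
interface Vec3 { x: number; y: number; z: number; }
interface UV { u: number; v: number; }

// Map a view direction to (u, v) in an equirectangular frame.
// Assumed convention: +Y is up, -Z points at the centre of the image,
// u runs left to right in [0, 1], v runs top to bottom in [0, 1].
function directionToEquirectUV(d: Vec3): UV {
  const len = Math.sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
  const x = d.x / len;
  const y = d.y / len;
  const z = d.z / len;

  // Longitude: angle around the vertical axis, 0 at -Z, range -PI..PI.
  const longitude = Math.atan2(x, -z);
  // Latitude: angle above the horizon, range -PI/2..PI/2.
  const latitude = Math.asin(Math.max(-1, Math.min(1, y)));

  return {
    u: longitude / (2 * Math.PI) + 0.5,
    v: 0.5 - latitude / Math.PI,
  };
}
```

In a real renderer this lookup would normally happen in the fragment shader, or be baked into the texture coordinates of the sphere mesh, rather than run per pixel on the CPU.
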
Native OpenGL ES implementation ran at 60 fps. Video -> libvt - { texture } -> Sphere Mapping (OpenGL) -> Frame Buffer.
Initially tried WebGL, but it was slow (~30 fps). Video -> libvt - { texture } -> WebGL (YUV -> RGBA) -> Sphere Mapping (OpenGL) -> Frame Buffer.
- There's an extra step of converting the libvt output from YUV to RGBA
- In the case of 360VR, the region shown on screen is far smaller than the input frame, but the full frame has to be converted before WebGL can use it
- Three bottlenecks:
  - Color Space Conversion (YUV -> RGB): 90%
  - Texture Copy inside Video Decoder: 8-9%
  - Compositing of WebGL canvas: 1-2%
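
To illustrate why the conversion step dominates, the sketch below shows the per-pixel arithmetic of a YUV -> RGB conversion and a helper for counting how many pixels go through it each second. The BT.601 limited-range coefficients are an assumption (the notes do not say which matrix the pipeline uses), and `yuvToRgb`, `clampToByte` and `pixelsConvertedPerSecond` are hypothetical names.

```typescript
interface RGB { r: number; g: number; b: number; }

// Round and clamp a channel value to the 0..255 byte range.
function clampToByte(value: number): number {
  return Math.max(0, Math.min(255, Math.round(value)));
}

// Convert one 8-bit YUV sample to RGB.
// Assumption: BT.601 limited-range ("video range") coefficients.
function yuvToRgb(y: number, u: number, v: number): RGB {
  const luma = 1.164 * (y - 16);
  const cb = u - 128;
  const cr = v - 128;
  return {
    r: clampToByte(luma + 1.596 * cr),
    g: clampToByte(luma - 0.392 * cb - 0.813 * cr),
    b: clampToByte(luma + 2.017 * cb),
  };
}

// Number of pixels that pass through the conversion each second
// when a width x height frame is converted at the given frame rate.
function pixelsConvertedPerSecond(width: number, height: number, fps: number): number {
  return width * height * fps;
}
```

For instance, with a hypothetical 3840x1920 source and a 1920x1080 output, converting the full frame touches roughly 3.6 times as many pixels as the display shows, and that work is repeated for every frame.
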

Add an OES_EGL_image_external extension to WebGL. This allows YUV textures to be handled without color space conversion.
Solution: currently, WebGL only supports TEXTURE_2D (RGBA). We should make it support TEXTURE_EXTERNAL_OES as well.
Expose OES_EGL_image_external functionality to WebGL (define a new texture target TEXTURE_EXTERNAL_OES).
Advantage over existing proposals: focuses on extending the texture formats of WebGL, which makes it easy to port and use.
2 issues pointed out in the WG:
- EGL image sync issue (syncing the EGL image with the video decoder)
- Audio sync (need exact timestamp of the frame currently being rendered)
Also need other per-frame metadata like dimensions of the texture
This approach avoids converting the entire video image from YUV -> RGBA. Instead, only the region presented in the frame buffer needs to be decoded.
This extension already exists in OpenGL ES. It is just a matter of exposing it through WebGL.
Some competing proposals work on a similar problem, but are not as good a fit
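
As a rough sketch of what the shader side of this could look like, the helper below builds a fragment shader that samples the video either through a regular `sampler2D` or through `samplerExternalOES`. The `#extension GL_OES_EGL_image_external` directive and the `samplerExternalOES` type come from the OpenGL ES extension; whether WebGL would expose them under the same names is an assumption, and `videoFragmentShader`, `uVideo` and `vUV` are hypothetical names.

```typescript
// Build GLSL ES 1.00 fragment shader source for drawing a video frame.
// With useExternalTexture = true the shader samples an external (e.g. YUV)
// texture directly; the driver handles the color conversion at sampling
// time, so only the texels that are actually sampled get converted.
function videoFragmentShader(useExternalTexture: boolean): string {
  const header = useExternalTexture
    ? "#extension GL_OES_EGL_image_external : require\n"
    : "";
  const samplerType = useExternalTexture ? "samplerExternalOES" : "sampler2D";
  return (
    header +
    "precision mediump float;\n" +
    "uniform " + samplerType + " uVideo;\n" +
    "varying vec2 vUV;\n" +
    "void main() {\n" +
    "  gl_FragColor = texture2D(uVideo, vUV);\n" +
    "}\n"
  );
}
```

The point of the design is that the application code and the rest of the shader stay the same; only the sampler type and the texture target the video is bound to would change.
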

Why not just decode the portion of the image that will be displayed?
- Might work if the viewport is fixed, but this won't have optimal performance when changing the viewport

Last modified on Oct 27, 2016, 1:27:57 PM
