Details of the QtWebKit HTML5 media element implementation
QtMultimediaKit (Qt Mobility)
We're currently working on the QtMultimediaKit integration. The QtMultimediaKit API from Qt Mobility can fill many of the holes in the existing Phonon integration.
Files: MediaPlayerPrivateQt.cpp, MediaPlayerPrivateQt.h
Background: QtMultimedia
QtMultimedia provides a collection of cross-platform media playback APIs. These APIs are wrapped around various media services. For example, on Windows the default is a DirectShow service, on Mac the default is a QuickTime service, and on Linux the default is a GStreamer service. Enterprising users can also write their own media services.
Important Note: Not all of these media services support all of the features specified in the API. We continue to work hard to support as many features as possible, but ultimately some features cannot be supported on all services/platforms. In these cases we generally attempt to degrade the service gracefully.
APIs used in the WebKit integration with less than universal support:
- QMediaPlayer::supportedMimeTypes() (Will often return an empty list)
- QMediaPlayerControl::availablePlaybackRanges() (Will often return a single interval, 0..duration, rather than buffered ranges)
- QtMultimedia::Size meta data and "bytes-loaded" extended meta data (Will often not be emitted)
- QMediaPlayer::setMedia() with a QNetworkRequest (Will often take the URL and ignore the rest of the information)
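To illustrate what the degraded playback-range case means for a caller, here is a minimal sketch. It models the ranges with plain standard C++ types instead of the real QtMultimedia classes, so =TimeInterval=, =latestAvailable()= and =coversWholeMedia()= are hypothetical names, not part of any Qt API. The point is only that a single 0..duration interval cannot be told apart from a fully buffered file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one interval of a playback range (milliseconds).
struct TimeInterval {
    std::int64_t start;
    std::int64_t end;
};

// Latest position covered by any interval, or 0 if the service reports nothing.
std::int64_t latestAvailable(const std::vector<TimeInterval>& ranges)
{
    std::int64_t latest = 0;
    for (const TimeInterval& r : ranges)
        latest = std::max(latest, r.end);
    return latest;
}

// True when the service reported exactly one interval, 0..duration. The caller
// cannot tell whether the media is fully buffered or the service simply does
// not track buffering, so this should not be shown as real load progress.
bool coversWholeMedia(const std::vector<TimeInterval>& ranges, std::int64_t duration)
{
    assert(duration >= 0);
    return ranges.size() == 1 && ranges[0].start == 0 && ranges[0].end == duration;
}
```

A caller could use =coversWholeMedia()= to decide whether to draw a buffering indicator at all, rather than drawing one that is always full.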
Rendering
For the most part, the QtMultimedia integration is a thin wrapper around QMediaPlayer and QGraphicsVideoItem, which are high level classes within QtMultimedia. QGraphicsVideoItem uses software rendering, using the same QPainter which renders the web view. This means that WebKit can safely draw on top of the video (such as controls, overlapping elements, etc).
This said, QGraphicsVideoItem is not entirely unaccelerated. For the GL paint engine we do scaling, color space conversion and blit in hardware. On the N900, we display XVideo in a window, then composite the web view above using a hardware chroma key overlay. On the desktop, unaccelerated performance is reasonable. Your mileage will vary depending on graphics engine and operating system. OpenGL is fast where it is supported. The raster graphics system is also pretty reasonable. X11 is unfortunately quite poor. There are also plans to introduce other acceleration techniques into QGraphicsVideoItem in the future.
Given the use of QGraphicsVideoItem, it was also very easy to support the QGraphicsView-based accelerated compositing code path, so that's supported too.
Loading
This backend attempts to pass the media service all of the information it requires to retrieve a media source. For http[s] sources, this includes cookies and a referer header. As previously stated, this is not yet universally supported. The best support for HTTP loading is currently with the QuickTime media service.
This allows us to support services such as youtube/html5 and vimeo/html5.
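As a rough sketch of the kind of information involved, the following models a media request as a URL plus raw HTTP headers. It uses only the standard library; =MediaRequest= and =buildMediaRequest()= are hypothetical names for illustration and are not the QNetworkRequest-based code in the actual backend.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified request description: a URL plus raw HTTP headers.
struct MediaRequest {
    std::string url;
    std::map<std::string, std::string> headers;
};

// Collects what a media service would need to fetch a protected resource.
// pageUrl is the document embedding the media element; cookies holds the
// name/value pairs that apply to mediaUrl.
MediaRequest buildMediaRequest(
    const std::string& mediaUrl,
    const std::string& pageUrl,
    const std::vector<std::pair<std::string, std::string>>& cookies)
{
    assert(!mediaUrl.empty());

    MediaRequest request;
    request.url = mediaUrl;

    // "Referer" is the (misspelled) header name defined by HTTP.
    if (!pageUrl.empty())
        request.headers["Referer"] = pageUrl;

    // Cookies are sent as a single header: "name=value; name2=value2".
    std::string cookieHeader;
    for (const auto& cookie : cookies) {
        if (!cookieHeader.empty())
            cookieHeader += "; ";
        cookieHeader += cookie.first + "=" + cookie.second;
    }
    if (!cookieHeader.empty())
        request.headers["Cookie"] = cookieHeader;

    return request;
}
```

A media service that only takes the URL from such a request drops the =headers= map entirely, which is why sites that check cookies or the referer fail on those services.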
Repository
You can get my clone of WebKit, with the latest QtMultimedia integration at: http://gitorious.org/~nickyoung/webkit/nickyoungs-webkit There is also an experimental branch on this repository, which contains support for fullscreen mode, as well as performance measuring tools.
Phonon (<Qt4.7)
Structure
DOM elements:
- =WebCore/html/HTMLMediaElement.cpp= and =WebCore/html/HTMLMediaElement.h=
- =WebCore/html/HTMLVideoElement.cpp= and =WebCore/html/HTMLVideoElement.h=
- =WebCore/html/HTMLAudioElement.cpp= and =WebCore/html/HTMLAudioElement.h=
Renderers:
- =WebCore/rendering/RenderMedia.cpp= and =WebCore/rendering/RenderMedia.h=
- =WebCore/rendering/RenderVideo.cpp= and =WebCore/rendering/RenderVideo.h=
=MediaPlayerPrivate= implementation:
- =WebCore/platform/graphics/qt/MediaPlayerPrivatePhonon.cpp= and =WebCore/platform/graphics/qt/MediaPlayerPrivatePhonon.h=
Limitations
The current implementation has some limitations, partly due to lack of support in the Phonon API and backends.
Software rendering
The implementation is based on a Phonon::VideoWidget, which is rendered in MediaPlayerPrivate::paint() by a call to QWidget::render(), passing the active WebKit QPainter. Update notifications for new video frames are triggered by an event filter on the VideoWidget.
For this technique to work we have to make sure the Phonon back-end either uses a software rendering path, using the QPainter to draw the video, or that any accelerated drawing does not change the context state of the QPainter and allows WebKit to continue drawing on top after the video has been rendered. In other words, no XVideo or ad hoc accelerated drawing.
We set the widget attribute Qt::WA_DontShowOnScreen on the video widget, as a hint to the back-end to use this software path. The reason for the software path requirement is that we want HTML content overlapping with the video to not be clipped by the video.
This works fine on X11 and Mac (a little slow), and the Windows bugs will get ironed out soon.
See =$QTDIR/src/3rdparty/kdebase/runtime/phonon/gstreamer/videowidget.cpp=, =$QTDIR/src/3rdparty/kdebase/runtime/phonon/ds9/videowidget.cpp=, and =$QTDIR/src/3rdparty/kdebase/runtime/phonon/qt7/videowidget.cpp= for examples.
Media loading
Media loading is currently handled by Phonon, since we call MediaObject::setCurrentSource() passing in a =Phonon::MediaSource= constructed from a URL. The alternative would be to handle loading using WebKit's normal resource loading mechanisms, and then use a =MediaSource= constructed from a Phonon::AbstractMediaStream to feed the loaded data from WebKit to Phonon.
The problem is that on Mac OS X there is no way to continuously feed QuickTime the data on the back-end side (only in one big chunk, which is hardly optimal for streaming data), so a solution based on =AbstractMediaStream= would not be cross platform. Another problem is that none of the Phonon back-ends support the =AbstractMediaStream= push model, where we feed data to Phonon as the data arrives in the WebKit resource loader. The model they do support (again, only on Linux and Windows) is the =AbstractMediaStream= pull model, where Phonon asks us to provide more data, and we are expected to at least give it something. That leaves us with having to do a busy wait using =QApplication::processEvents()= to ensure that we have loaded enough data to feed back to Phonon when returning, which is pretty ugly. So, in conclusion, =AbstractMediaStream= does not have the back-end support yet to be a basis for this implementation. (I do have an experimental branch where I tried this out though, available at http://code.staikos.net/cgi-bin/gitweb.cgi?p=webkit;a=shortlog;h=torarne/mediaelement/medialoader).
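The busy wait forced by the pull model can be sketched as follows. This is a standard-library-only model, not Phonon code: =PullStream= is a hypothetical class, and the =pumpEvents= callback merely stands in for =QApplication::processEvents()= giving the loader a chance to deliver data.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical model of a pull-style stream: the consumer calls read() and
// expects bytes back before the call returns.
class PullStream {
public:
    // pumpEvents lets pending loader work run once. It returns false when the
    // loader has finished and no more data will ever arrive.
    explicit PullStream(std::function<bool(PullStream&)> pumpEvents)
        : m_pumpEvents(std::move(pumpEvents)) {}

    // Called by the loader whenever network data arrives.
    void appendData(const std::string& bytes) { m_buffer.append(bytes); }

    // Called by the media back-end. Spins on the event pump until at least
    // one byte is buffered or the loader is done; an empty result means
    // end of stream.
    std::string read(std::size_t maxBytes)
    {
        assert(maxBytes > 0);
        while (m_buffer.empty() && m_pumpEvents(*this)) {
            // Busy wait: nothing to do here but pump again.
        }
        std::string chunk = m_buffer.substr(0, maxBytes);
        m_buffer.erase(0, chunk.size());
        return chunk;
    }

private:
    std::function<bool(PullStream&)> m_pumpEvents;
    std::string m_buffer;
};
```

The loop in =read()= is the ugly part: the caller is blocked while the event loop is re-entered from inside it. A push model would not need it, because the loader would hand data over as it arrives.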
So what are the downsides of letting Phonon handle media loading? For one, the Phonon API does not have a way for the client to query how much of the media has been loaded. This is not only useful for displaying a visualization to the user of how much of the file has been loaded, but also required by the platform independent WebKit media element code to know whether or not it can seek to a given position. Confusingly enough, Phonon does provide a MediaObject::bufferStatus() signal, but that signal refers to how much of the internal _pre-buffering_ buffer has been filled (for example when you seek to a new position, Phonon goes into a buffering state where it fills this buffer, and then goes back to playing state), not to how much of the media file has been loaded. In effect, there is no way to provide a seek slider without lying to the WebKit platform independent code about how much data we've loaded (by claiming that we've got all the data, even though we don't know that).
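The effect of this workaround can be shown with a small model. All names here (=canSeekTo()=, =TrackingBackend=, =OptimisticBackend=) are hypothetical and do not mirror the real WebCore or Phonon interfaces; the sketch only assumes that the platform independent code allows a seek when the target lies within the loaded part of the media.

```cpp
#include <cassert>

// Hypothetical check made by platform independent code before seeking:
// only positions inside the loaded part of the media are allowed.
bool canSeekTo(double targetSeconds, double maxTimeLoadedSeconds)
{
    return targetSeconds >= 0.0 && targetSeconds <= maxTimeLoadedSeconds;
}

// Hypothetical back-end that knows how much of the media has been loaded.
struct TrackingBackend {
    double durationSeconds;
    double loadedSeconds;
    double maxTimeLoaded() const { return loadedSeconds; }
};

// Hypothetical back-end in the situation described above: load progress is
// unknown, so it claims the whole duration is loaded to keep seeking usable.
struct OptimisticBackend {
    double durationSeconds;
    double maxTimeLoaded() const
    {
        assert(durationSeconds >= 0.0);
        return durationSeconds;
    }
};
```

With the optimistic back-end every position up to the duration is reported as seekable, so the slider works, but a seek into data that has not arrived yet is left for the media back-end to deal with.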
Another problem is that user credentials are not forwarded to Phonon. Normally when you are logged into a site, for example a news site, the browser knows you are authorized, so when you request more resources (images, videos, documents) it uses that authorization for those resources. In the case of Phonon, there is no way to tell it that you are authorized, so if you for example log into a site, and want to watch a video, Phonon will try to load that video but will get permission denied. That would not happen if we loaded resources through WebKit, because then the credentials would be passed along with the request.