[Live-devel] live555MediaServer and trick mode

Thu Aug 6 07:06:06 PDT 2020

On Aug 5, 2020, at 4:45 PM, Ross Finlayson <finlayson at live555.com> wrote:
> 
> you should make sure that the resulting “.mkv” (or “.webm”) file contains frequent ‘key frames’ (at least one per second; preferably more frequent than this)

A large part of the reason H.264 and newer are able to get higher rates of compression is that they’re less prone to error build-up: greater allowed spans for P and B frame lookup, B frames can now be used as references for P frames, quarter-pel instead of half-pel motion search, etc.

Therefore, newer codecs allow intra (I) frames to be farther apart, since they’re not needed as often to reset the errors built up over prior P and B frames.

Your advice appears to be based on MPEG-2, where an I frame distance of half a second (15 frames at 30 fps) was the common rule, but with newer codecs, it is quite common for I frames to be farther apart.  The default maximum keyframe interval in x264 is 250 frames, according to docs found online, and I’ve seen 300 used with modern codecs when optimizing for low-bandwidth VoD.
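To put those GOP lengths in terms of time, here is a minimal sketch of the arithmetic.  It assumes a fixed GOP length and a constant 30 fps; the function name is made up for illustration and is not part of any encoder or Live555 API.

```python
def keyframe_interval_seconds(gop_frames: int, fps: float) -> float:
    """Seconds between consecutive I frames, assuming a fixed GOP length."""
    return gop_frames / fps


# GOP lengths mentioned in this thread, all at an assumed 30 fps:
#   15  = the old MPEG-2 rule of thumb
#   30  = "at least one key frame per second"
#   300 = a long GOP tuned for low-bandwidth VoD
for gop in (15, 30, 300):
    interval = keyframe_interval_seconds(gop, 30)
    print(f"{gop:>3} frames at 30 fps: one I frame every {interval:.1f} s")
```

Since trick play seeks to key frames, that interval is also roughly the granularity you get when seeking.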

The point is that if you take this advice to reduce I frame distance, you’re necessarily going to have either higher encoding bit rates or lower perceptual quality.

> the trick play mechanism works by seeking to a key frame.

Also beware of “open” GOPs, which make for lower bit rates at the expense of I frames not being interpretable without context from prior frames.  Seeking to an open GOP’s I frame will produce bad decodes until you get to just before the *next* I frame, where context for *that* frame begins to appear.

> Another thing you could try is outputting to a Transport Stream file

…which then means you must use null-stuffing to achieve constant bit rate, because the rate limiter in Live555’s streaming code still can’t cope with VBR, even though there’s no technical reason it couldn’t be made to do so.
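A rough sketch of what null-stuffing costs, under simplifying assumptions: the stream is measured in one-second intervals, the CBR rate is chosen to cover the busiest interval, and PSI/PCR overhead is ignored.  The function name and the per-second payload figures are hypothetical, not taken from Live555 or from any real file.

```python
TS_PACKET_BYTES = 188  # fixed size of an MPEG transport stream packet


def null_packets_needed(payload_bytes_per_interval, cbr_bits_per_sec,
                        interval_sec=1.0):
    """Null (stuffing) packets required in each interval to hold a constant rate.

    payload_bytes_per_interval: bytes of real stream data in each interval
    (hypothetical measurements). Raises ValueError if an interval's payload
    does not fit in the CBR budget.
    """
    budget = int(cbr_bits_per_sec * interval_sec / 8) // TS_PACKET_BYTES
    stuffing = []
    for payload in payload_bytes_per_interval:
        used = -(-payload // TS_PACKET_BYTES)  # ceiling division
        if used > budget:
            raise ValueError("CBR rate is too low for the peak interval")
        stuffing.append(budget - used)
    return stuffing


# Hypothetical VBR profile: one I-frame-heavy second, then two quiet ones.
payloads = [400_000, 90_000, 120_000]
nulls = null_packets_needed(payloads, cbr_bits_per_sec=3_384_000)
total_packets = len(payloads) * (3_384_000 // 8 // TS_PACKET_BYTES)
print(nulls, f"{100 * sum(nulls) / total_packets:.0f}% of packets are stuffing")
```

The peakier the stream, the more of the fixed budget goes to stuffing in the quiet intervals.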

CBR throws away a lot of the advantage of modern codecs.  The expected size ratio between VBR MPEG-2 and H.264 files is roughly 2:1 to 4:1 for equivalent perceptual quality, but I just did a test here:

VBR MPEG-2 original: 84 MB
VBR H.264 re-encode: 37 MB
CBR null-stuffed H.264: 65 MB

In other words, we took a 56% codec savings for MPEG-2 to H.264 VBR to VBR and turned it into a 23% savings by requiring CBR.
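For anyone who wants to check the percentages, the arithmetic is just this (file sizes are the three from my test above; the function name is only for illustration):

```python
def savings_pct(original_mb: float, encoded_mb: float) -> float:
    """Size reduction relative to the original, as a percentage."""
    return 100.0 * (1.0 - encoded_mb / original_mb)


MPEG2_VBR_MB = 84  # VBR MPEG-2 original
H264_VBR_MB = 37   # VBR H.264 re-encode
H264_CBR_MB = 65   # CBR null-stuffed H.264

print(f"VBR to VBR: {savings_pct(MPEG2_VBR_MB, H264_VBR_MB):.0f}% smaller")
print(f"VBR to CBR: {savings_pct(MPEG2_VBR_MB, H264_CBR_MB):.0f}% smaller")
```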

This is because H.264 is very nearly as “peaky” as MPEG-2, since the improvements in I frame encoding aren’t the biggest source of savings in H.264 relative to MPEG-2.  Much more of the savings is in the ability to put off I frames as I said above, but CBR null-stuffing covers over those savings.

It doesn’t matter how deep the valley is if you flood it for a reservoir.