[Live-devel] H264VideoFramer truncating frames

Wed Mar 11 05:54:42 PDT 2015

I've got this working now. I think my problem was conceptually treating
the byte stream as a series of frames instead of as a continuous stream.

Having looked at the ByteStreamMemoryBufferSource class, it makes a lot
more sense now.
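
For anyone hitting the same truncation, here is a minimal sketch of the
"continuous stream" idea, assuming the source simply hands over as many
bytes as the caller has room for and keeps the rest for the next request.
ChunkedByteSource, pushEncodedFrame(), deliver() and bytesPending() are
hypothetical names for illustration and are not part of live555. In a real
FramedSource subclass the copy would presumably go into fTo, bounded by
fMaxSize, with the result stored in fFrameSize.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an encoder-fed source; not a live555 class.
class ChunkedByteSource {
public:
    // Append one encoder output buffer (start codes included) to the stream.
    void pushEncodedFrame(const uint8_t* data, size_t size) {
        assert(data != nullptr || size == 0);
        pending_.insert(pending_.end(), data, data + size);
    }

    // Copy at most maxSize bytes into `to`. Bytes that do not fit stay
    // queued for the next call, so nothing is ever truncated.
    size_t deliver(uint8_t* to, size_t maxSize) {
        assert(to != nullptr);
        const size_t n = std::min(maxSize, bytesPending());
        if (n == 0) return 0;
        std::memcpy(to, pending_.data() + readPos_, n);
        readPos_ += n;
        if (readPos_ == pending_.size()) {  // everything consumed: reset
            pending_.clear();
            readPos_ = 0;
        }
        return n;
    }

    size_t bytesPending() const { return pending_.size() - readPos_; }

private:
    std::vector<uint8_t> pending_;  // bytes received from the encoder
    size_t readPos_ = 0;            // index of the next undelivered byte
};
```

The point of the design is that the encoder's buffer boundaries no longer
matter to the consumer: a 10-byte buffer read with a 4-byte limit comes out
as 4, 4 and 2 bytes across three calls.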

On 03/11/2015 10:10 AM, Robert Smith wrote:
> Firstly apologies for the multiple posts.
>
> The platform is a TI DaVinci DM8148 using TI's HDVICP2 video encoder
> hardware. As far as I can tell, the encoder is capable of outputting
> discrete NALUs, but we are only able to use the encoder via an OpenMAX
> API which doesn't expose this behaviour. It's frustrating!
>
> The encoder provides discrete 'frames', each of which is a buffer of
> concatenated NALUs, each prepended with a start code.
>
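
A rough sketch of what splitting such a buffer looks like, assuming the
usual Annex B layout where each NAL unit is preceded by a 00 00 01 or
00 00 00 01 start code. NalSpan and splitAnnexB() are hypothetical names
for illustration, not live555 API, and this is not the parser that
H264VideoStreamFramer uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Location of one NAL unit inside the buffer (start code excluded).
struct NalSpan {
    size_t offset;
    size_t size;
};

// Split an Annex B buffer on 00 00 01 start codes. A 4-byte start code
// (00 00 00 01) is handled by trimming trailing zero bytes from the
// previous NAL unit, which never legitimately ends in 0x00.
std::vector<NalSpan> splitAnnexB(const uint8_t* buf, size_t len) {
    assert(buf != nullptr || len == 0);
    std::vector<NalSpan> nals;
    size_t nalStart = 0;
    bool inNal = false;
    size_t i = 0;
    while (i + 3 <= len) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            if (inNal) {
                size_t end = i;
                while (end > nalStart && buf[end - 1] == 0) --end;
                nals.push_back({nalStart, end - nalStart});
            }
            nalStart = i + 3;  // first byte after the start code
            inNal = true;
            i += 3;
        } else {
            ++i;
        }
    }
    // The last NAL unit runs to the end of the buffer.
    if (inNal && nalStart < len) nals.push_back({nalStart, len - nalStart});
    return nals;
}
```

Each resulting span could then be handed over as one discrete NAL unit; the
low five bits of its first byte give the NAL unit type (7 for SPS, 8 for
PPS, 5 for an IDR slice).
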
> Now that I think about it, if the H264VideoFramer is expecting to read
> X bytes from a continuous stream, I should be able to give it only 
> part of the 'frame' that I receive from the encoder? I'll try this and 
> see how it works.
>
> Btw, on our other systems we use the Intel IPP H264 encoder and the
> Intel Media SDK encoder. I'm not as familiar with them, but I
> understand that they also don't output discrete NALUs, so I thought
> this was common amongst encoders.
>
>
> Regarding slices, I can configure the encoder to encode a frame as
> multiple slices, based either on a maximum slice size or on the number
> of macroblocks per slice, but we're using GStreamer as one of our
> clients and I get a lot of image corruption with multiple slices
> enabled.
>
> I haven't had time to look into the problem more deeply, but using a
> single slice was a quick and easy solution.
>
> Thanks,
>
> Robert Smith.
>
> On 03/10/2015 01:49 AM, Ross Finlayson wrote:
>>> The encoder unfortunately only supplies frames in Annex B byte 
>>> stream format requiring the frames to be parsed.
>>
>> Are you sure about this?  (Often, hardware encoders have firmware 
>> upgrades available.)
>>
>>
>>> Previously I was using my own class to identify the NAL units in
>>> conjunction with the H264VideoDiscreteFramer, which worked fine but
>>> is heavy on the CPU. So I've been trying to use the
>>> H264VideoFramer and just pass the full frames in, which works OK and
>>> is faster than my solution, except that I'm seeing a lot of truncated
>>> frames.
>>>
>>> Having looked into the code, it appears to be caused by the behaviour
>>> of the StreamParser class; specifically the ensureValidBytes1()
>>> method, which calls getNextFrame() on my source with maxSize =
>>> BANK_SIZE - fTotNumValidBytes. The method switches banks to ensure
>>> that the larger of numBytesNeeded or the input source's
>>> maxFrameSize() will fit.
>>>
>>> I can 'fix' the problem by increasing BANK_SIZE and implementing
>>> maxFrameSize() on my source, but I'm not totally happy with this
>>> solution because I would prefer not to modify the library source and
>>> I'm just guessing for the maxFrameSize() value.
>>>
>>> I was wondering whether it's possible to return a partial frame from 
>>> my video source?
>>
>> Yes, but not in the way that you might think :-)  An H.264 encoder
>> actually delivers "NAL units".  "NAL units" are what actually get 
>> parsed by our code, and packed into RTP packets.
>>
>> Often, a "NAL unit" is a complete frame.  It is possible, however, 
>> for a 'key frame' to be split up - by your encoder - into multiple 
>> 'slice' NAL units.  For datagram streaming (e.g., over RTP), it is 
>> *much* better to have your key frames broken up into multiple 'slice' 
>> NAL units, than to have the key frame be a single, large NAL unit - 
>> which is what you have now.  This is especially true if your key 
>> frames are exceptionally large: ~150000 bytes or larger, which 
>> appears to be the case for you, because you are hitting the BANK_SIZE 
>> limit (which was deliberately set to be larger than realistically 
>> needed).
>>
>> Note that a 150000 byte key frame NAL unit will get transmitted as
>> more than 100 RTP packets (datagrams).  (Our code automatically
>> handles the required fragmentation.)  If *any* of these packets
>> gets lost in transit, then the entire key frame will be undeliverable.
>>
>> If, instead, your encoder delivers each key frame as multiple 'slice' 
>> NAL units, then your streaming will be much more resilient to network 
>> packet loss.
>>
>> So, your first task should be to check whether your encoder:
>> 1/ can be reconfigured to deliver discrete NAL units, rather than a
>> stream with each NAL unit prepended by a 0x00 0x00 0x00 0x01 'start
>> code', and
>> 2/ can be reconfigured to deliver key frames as multiple 'slice' NAL 
>> units, rather than as a single (ridiculously large) NAL unit.
>>
>>
>> Ross Finlayson
>> Live Networks, Inc.
>> http://www.live555.com/
>>
>>
>>
>> _______________________________________________
>> live-devel mailing list
>> live-devel at lists.live555.com
>> http://lists.live555.com/mailman/listinfo/live-devel

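To put rough numbers on the packet-loss point quoted above, here is a small
sketch of the arithmetic. The 1400-byte payload per packet and the 1%
independent per-packet loss rate are assumptions chosen for illustration,
not measured values, and packetsNeeded() and nalLossProbability() are
hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Number of RTP packets needed to carry one NAL unit when each packet
// holds at most payloadPerPacket bytes of it (ceiling division).
size_t packetsNeeded(size_t nalSize, size_t payloadPerPacket) {
    assert(payloadPerPacket > 0);
    return (nalSize + payloadPerPacket - 1) / payloadPerPacket;
}

// Probability that at least one of `packets` packets is lost, assuming
// each is lost independently with probability packetLossRate. Losing any
// fragment makes the whole NAL unit undeliverable.
double nalLossProbability(size_t packets, double packetLossRate) {
    assert(packetLossRate >= 0.0 && packetLossRate <= 1.0);
    return 1.0 - std::pow(1.0 - packetLossRate,
                          static_cast<double>(packets));
}
```

Under those assumptions a 150000-byte NAL unit needs
packetsNeeded(150000, 1400) = 108 packets, and
nalLossProbability(108, 0.01) comes out at roughly two in three, whereas a
slice that fits in a single packet is lost only 1% of the time.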