[Live-devel] Synchro problem and isCurrentlyAwaitingData()
Cristiano Belloni
belloni at imavis.com
Thu Apr 21 08:32:48 PDT 2011
On 06/04/2011 18:49, Cristiano Belloni wrote:
> On 06/04/2011 12:04, Cristiano Belloni wrote:
>> Hi to all,
>> I wrote a custom shared memory source. It inherits from FramedSource.
>> The shared memory is synchronized via Linux semaphores (simple
>> producer-consumer algorithm), but since I didn't want to subclass
>> TaskScheduler, I still use a "dummy" file descriptor-based
>> communication with live555. In pseudocode:
>>
>>
>> ~~~Client (without live555):
>>
>> wait on semaphore_empty (blocking)
>> copy frame in shared memory
>> write one byte in a dedicated FIFO (this should wake up live555's
>> TaskScheduler select())
>> post on semaphore_fill
>>
>> ~~~Server (with live555, in SharedMemSource::incomingPacketHandler1())
>> [turnOnBackgroundReadHandling is called in doGetNextFrame]
>>
>> wait on semaphore_fill (blocking)
>> read one byte from the dedicated FIFO (to flush the FIFO buffer)
>> copy frame from shared memory
>> post on semaphore_empty
>>
>> This works. Although the blocking wait on semaphore_fill might look
>> suspicious, the client wakes up my source with the write() to the
>> dedicated FIFO and immediately posts on semaphore_fill, so the server
>> almost never waits, and when it does, it blocks only for a very
>> short time.
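
A minimal sketch of the handshake described above, for illustration only. It assumes POSIX unnamed semaphores (the post only says "Linux semaphores"), a pipe standing in for the dedicated FIFO, and a plain buffer standing in for the shared memory region; `Channel`, `openChannel`, `produceFrame` and `consumeFrame` are hypothetical names, and both sides run in one process here.

```cpp
#include <cstring>
#include <semaphore.h>
#include <unistd.h>

const size_t kMaxFrame = 64;  // illustrative frame size, not from the post

// Hypothetical stand-in for the shared memory region and its sync objects.
struct Channel {
  sem_t empty;            // semaphore_empty: starts at 1 (one free slot)
  sem_t fill;             // semaphore_fill: starts at 0 (no frame ready)
  int fifo[2];            // fifo[0] is the end select() would watch
  char frame[kMaxFrame];  // stands in for the shared frame buffer
  size_t length;          // number of valid bytes in frame
};

bool openChannel(Channel* ch) {
  if (pipe(ch->fifo) != 0) return false;
  ch->length = 0;
  // pshared = 0: same-process only. Across processes the semaphores would
  // have to live in the shared mapping and be created with pshared = 1.
  return sem_init(&ch->empty, 0, 1) == 0 && sem_init(&ch->fill, 0, 0) == 0;
}

// Client side: wait for a free slot, copy, wake the reader, post.
bool produceFrame(Channel* ch, const char* data, size_t len) {
  if (len > kMaxFrame) return false;
  if (sem_wait(&ch->empty) != 0) return false;           // wait on semaphore_empty
  std::memcpy(ch->frame, data, len);                     // copy frame in
  ch->length = len;
  char token = 1;
  if (write(ch->fifo[1], &token, 1) != 1) return false;  // one byte to wake select()
  return sem_post(&ch->fill) == 0;                       // post on semaphore_fill
}

// Server side: returns the number of bytes copied, or -1 on error.
long consumeFrame(Channel* ch, char* out, size_t outSize) {
  if (sem_wait(&ch->fill) != 0) return -1;               // wait on semaphore_fill
  char token;
  if (read(ch->fifo[0], &token, 1) != 1) return -1;      // flush the wake-up byte
  size_t n = ch->length < outSize ? ch->length : outSize;
  std::memcpy(out, ch->frame, n);                        // copy frame out
  if (sem_post(&ch->empty) != 0) return -1;              // post on semaphore_empty
  return static_cast<long>(n);
}
```

Error handling is deliberately thin: a real version would retry sem_wait() on EINTR and would not leave semaphore_empty taken if the write() fails.
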
>>
>> The problem is that, after a while (1 or 2 hours usually), the client
>> does its cycle and the server never wakes up. It *doesn't* get stuck
>> on the wait, I checked: it simply never wakes up, as if the client
>> write() was lost (but it *always* succeeds on the client side) or the
>> select() didn't wake up even if the write succeeded.
>>
>> I would like to emphasize this: the server *never* gets stuck forever
>> on its wait. When it gets stuck, the client is one frame ahead of the
>> server, incomingPacketHandler1() simply is never called anymore and
>> the wait is not even reached.
>>
>> At this point, I have two questions:
>>
>> 1) To your knowledge, can the select() not wake up even if a write()
>> on the other side succeeded? If it can, how is it possible? Note that
>> the system is an embedded ARM processor, and it could get quite busy
>> while acquiring and streaming video.
>>
>> 2) First thing I do in SharedMemSource::incomingPacketHandler1() is
>> to check isCurrentlyAwaitingData(). If it's false, I simply
>> return before doing the whole cycle, and this happens quite often.
>> What's the meaning of isCurrentlyAwaitingData()? I mean, if the
>> select() in TaskScheduler returned, some data must be present on the
>> file/fifo/socket. How is it possible that the select() did return but
>> still there's no data available? I'm getting really confused about this.
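
A toy model of the distinction raised in question 2, offered as a sketch rather than as live555 code. It assumes that the flag behind isCurrentlyAwaitingData() is set when the downstream object requests a frame and cleared once a frame has been delivered, which would make it independent of whether the FIFO holds data. `ToySource`, `onReadable` and `framesDelivered` are hypothetical names.

```cpp
#include <unistd.h>

// Hypothetical stand-in for a FramedSource-like object.
class ToySource {
public:
  explicit ToySource(int fifoFd)
      : fFd(fifoFd), fAwaiting(false), fDelivered(0) {}

  // Downstream asks for exactly one frame.
  void getNextFrame() { fAwaiting = true; }

  bool isCurrentlyAwaitingData() const { return fAwaiting; }

  // What the read handler would do when the event loop reports fFd readable.
  void onReadable() {
    if (!fAwaiting) return;  // data is there, but nobody has asked for it yet
    char token;
    if (read(fFd, &token, 1) == 1) {  // consume the wake-up byte
      ++fDelivered;
      fAwaiting = false;  // the request has been satisfied
    }
  }

  int framesDelivered() const { return fDelivered; }

private:
  int fFd;
  bool fAwaiting;
  int fDelivered;
};
```

In this model an early return leaves the byte in the FIFO, so a select() on that descriptor would keep reporting it readable until a frame is requested and the byte is read.
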
>>
>> Thanks and regards,
>> Cristiano Belloni.
>>
>>
>> --
>> Belloni Cristiano
>> Imavis Srl.
>> www.imavis.com <http://www.imavis.com>
>> belloni at imavis.com <mailto://belloni@imavis.com>
>>
>>
>> _______________________________________________
>> live-devel mailing list
>> live-devel at lists.live555.com
>> http://lists.live555.com/mailman/listinfo/live-devel
>
> Update: I put some logs in BasicTaskScheduler to see what happens.
>
> one before the select():
>
> printf ("[SYNCHROBUG] About to do the select, timeout %d.%d\n",
> tv_timeToDelay.tv_sec, tv_timeToDelay.tv_usec);
> int selectResult = select(fMaxNumSockets, &readSet, &writeSet,
> &exceptionSet, &tv_timeToDelay);
> if (selectResult < 0) {
> [...]
>
> two after the select() (one reports an EINTR or EAGAIN error value,
> should either occur):
>
> #else
> if (errno != EINTR && errno != EAGAIN) {
> #endif
> // Unexpected error - treat this as fatal:
> #if !defined(_WIN32_WCE)
> perror("BasicTaskScheduler::SingleStep(): select() fails");
> #endif
> internalError();
> }
> }
> if (errno == EINTR || errno == EAGAIN) {
> perror ("[SYNCHROBUG] error is");
> }
>
> printf ("[SYNCHROBUG] Select done, getting sockets\n");
> [...]
>
> two in the first and second passes of the readable-socket check:
>
> int resultConditionSet = 0;
> if (FD_ISSET(sock, &readSet) && FD_ISSET(sock, &fReadSet)/*sanity check*/) {
> printf ("[SYNCHROBUG] Socket %d found readable on first pass\n", sock);
> resultConditionSet |= SOCKET_READABLE;
> }
>
> [...]
>
> if (FD_ISSET(sock, &readSet) && FD_ISSET(sock, &fReadSet)/*sanity check*/) {
> printf ("[SYNCHROBUG] Socket %d found readable on second pass\n", sock);
> resultConditionSet |= SOCKET_READABLE;
> }
>
> And one to check if we found some readable/writable/excepting socket
> at all:
>
>
> if (handler == NULL) {
> fLastHandledSocketNum = -1;//because we didn't call a handler
> printf ("[SYNCHROBUG] No socket found at all\n");
> }
>
>
> at first everything is ok:
>
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] No socket found at all
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on first pass
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 0.0
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
>
> (socket 5 must be the FIFO, I guess)
>
>
> But then, select() keeps randomly returning errno=11, aka EAGAIN or
> "Resource temporarily unavailable":
>
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 1.480311
> [SYNCHROBUG] error is: Resource temporarily unavailable
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 6 found readable on first pass
> [SYNCHROBUG] About to do the select, timeout 1.479309
> [SYNCHROBUG] error is: Resource temporarily unavailable
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 1.478362
> [SYNCHROBUG] error is: Resource temporarily unavailable
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 6 found readable on first pass
> [SYNCHROBUG] About to do the select, timeout 1.477358
> [SYNCHROBUG] error is: Resource temporarily unavailable
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 5 found readable on second pass
> [SYNCHROBUG] About to do the select, timeout 1.476431
> [SYNCHROBUG] error is: Resource temporarily unavailable
> [SYNCHROBUG] Select done, getting sockets
> [SYNCHROBUG] Socket 6 found readable on first pass
>
> Now, I don't even know why a select() could return EAGAIN
> (a lot of people say it shouldn't at all, and even my "man 3 select"
> agrees:
> http://stackoverflow.com/questions/4193043/select-on-a-pipe-in-blocking-mode-returns-eagain
> ), but I see this case is handled in your code and ignored, just like
> the EINTR case:
>
> if (errno != EINTR && errno != EAGAIN) {
> #endif
> // Unexpected error - treat this as fatal:
> #if !defined(_WIN32_WCE)
> perror("BasicTaskScheduler::SingleStep(): select() fails");
> #endif
> internalError();
> }
>
> [if errno is EINTR or EAGAIN, then the scheduler goes on inspecting
> the select()'s returned sets].
>
> Could that be the origin of my problems?
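
A side note on the logging above: the [SYNCHROBUG] check reads errno whether or not select() failed, and errno is only meaningful right after a call has returned -1. The EAGAIN lines may therefore be left over from an earlier non-blocking call rather than coming from select() itself (an assumption; nothing in this thread confirms it). A minimal sketch of a wrapper that records errno only on failure, where `waitReadable` is a hypothetical helper and not part of BasicTaskScheduler:

```cpp
#include <cerrno>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: waits up to timeoutUsec for fd to become readable.
// Returns select()'s result. *failure receives errno only when select()
// returned -1, and 0 otherwise, so a stale errno is never reported.
int waitReadable(int fd, long timeoutUsec, int* failure) {
  fd_set readSet;
  FD_ZERO(&readSet);
  FD_SET(fd, &readSet);

  struct timeval tv;
  tv.tv_sec = timeoutUsec / 1000000;
  tv.tv_usec = timeoutUsec % 1000000;

  int result = select(fd + 1, &readSet, nullptr, nullptr, &tv);
  *failure = (result < 0) ? errno : 0;
  return result;
}
```

With a wrapper like this, a log line would be printed only when `failure` is non-zero, which separates real select() errors from values left in errno by unrelated calls.
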
>
> As you obviously can't try my executables on my hardware, please tell
> me what else I could log. BTW the rtsp/rtp client in this picture is
> openRTSP. Here's an ASCII schema :)
>
> client program generating frames ----FIFO---> rtsp server based on
> live555 ----RTSP/RTP/TCP----> openRTSP
>
> Thank you and best regards,
>
> Cristiano Belloni.
>
>
Ok, it was a problem related to the timestamp calculations, which you
fixed in the last version. Disregard this.
Cristiano.