[Live-devel] Synchro problem and isCurrentlyAwaitingData()

Cristiano Belloni belloni at imavis.com
Wed Apr 6 09:49:00 PDT 2011


On 06/04/2011 12:04, Cristiano Belloni wrote:
> Hi to all,
> I wrote a custom shared memory source. It inherits from FramedSource.
> The shared memory is synchronized via Linux semaphores (simple 
> producer-consumer algorithm), but since I didn't want to subclass 
> TaskScheduler, I still use a "dummy" file descriptor-based 
> communication with live555. In pseudocode:
>
>
> ~~~Client (without live555):
>
> wait on semaphore_empty (blocking)
> copy frame in shared memory
> write one byte in a dedicated FIFO (this should wake up live555's 
> TaskScheduler select())
> post on semaphore_fill
>
> ~~~Server (with live555, in SharedMemSource::incomingPacketHandler1())
> [turnOnBackgroundReadHandling is called in doGetNextFrame]
>
> wait on semaphore_fill (blocking)
> read one byte from the dedicated FIFO (to flush the FIFO buffer)
> copy frame from shared memory
> post on semaphore_empty
>
> This works. Although the blocking wait on semaphore_fill might make 
> you wonder, the client wakes up my source with the write() to the 
> dedicated FIFO and immediately posts on semaphore_fill, so the server 
> almost never waits, and if it does, it blocks only for a very short 
> time.
>
> The problem is that, after a while (1 or 2 hours usually), the client 
> does its cycle and the server never wakes up. It *doesn't* get stuck 
> on the wait, I checked: it simply never wakes up, as if the client 
> write() was lost (but it *always* succeeds on the client side) or the 
> select() didn't wake up even if the write succeeded.
>
> I would like to emphasize this: the server *never* gets stuck forever 
> on its wait. When it gets stuck, the client is one frame ahead of the 
> server, incomingPacketHandler1() simply is never called anymore and 
> the wait is not even reached.
>
> At this point, I have two questions:
>
> 1) To your knowledge, can the select() not wake up even if a write() 
> on the other side succeeded? If it can, how is it possible? Note that 
> the system is an embedded ARM processor, and it could get quite busy 
> while acquiring and streaming video.
>
> 2) First thing I do in SharedMemSource::incomingPacketHandler1() is to 
> check for isCurrentlyAwaitingData(). If it's false, I simply return 
> before doing all the cycle, and this happens quite often. What's the 
> meaning of isCurrentlyAwaitingData()? I mean, if the select() in 
> TaskScheduler returned, some data must be present on the 
> file/fifo/socket. How is it possible that the select() did return but 
> still there's no data available? I'm getting really confused on this.
>
> Thanks and regards,
> Cristiano Belloni.
>
>
> -- 
> Belloni Cristiano
> Imavis Srl.
> www.imavis.com <http://www.imavis.com>
> belloni at imavis.com <mailto://belloni@imavis.com>
>
>
> _______________________________________________
> live-devel mailing list
> live-devel at lists.live555.com
> http://lists.live555.com/mailman/listinfo/live-devel
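
On question 2 in the quoted message, it may help to separate two things. In FramedSource, isCurrentlyAwaitingData() appears to reflect only whether a downstream reader has asked for a frame via getNextFrame(), not whether the descriptor holds data. select(), for its part, is level-triggered: a descriptor stays readable until the pending byte is actually read, so a handler that returns early without reading the FIFO will be called again on the next pass. The sketch below shows only that select() behaviour, with a plain pipe standing in for the FIFO; is_readable_now and consume_wakeup_byte are hypothetical helper names, not live555 functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is readable right now,
 * 0 if it is not, and -1 if select() itself failed. */
static int is_readable_now(int fd)
{
    fd_set rd;
    struct timeval zero = {0, 0};   /* zero timeout: poll, do not block */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rd);
    FD_SET(fd, &rd);

    int r = select(fd + 1, &rd, NULL, NULL, &zero);
    if (r < 0)
        return -1;
    return FD_ISSET(fd, &rd) ? 1 : 0;
}

/* Hypothetical helper: consumes the single wake-up byte.
 * Returns 1 if exactly one byte was read, 0 otherwise. */
static int consume_wakeup_byte(int fd)
{
    char b;
    return read(fd, &b, 1) == 1;
}
```

Calling is_readable_now() twice in a row after one write() reports the descriptor readable both times; only after consume_wakeup_byte() does it go back to not readable.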

Update: I put some logs in BasicTaskScheduler to see what happens.

One before the select():

printf ("[SYNCHROBUG] About to do the select, timeout %d.%d\n", tv_timeToDelay.tv_sec, tv_timeToDelay.tv_usec);
     int selectResult = select(fMaxNumSockets, &readSet, &writeSet, &exceptionSet, &tv_timeToDelay);
     if (selectResult < 0) {
[...]
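
A side note on the timeout log: "%d.%d" prints the microsecond field without zero padding, so a delay of 5000 microseconds would read as "0.5000". A minimal sketch of a padded formatter follows; format_timeout is a hypothetical helper name, not part of live555, and the casts to long are an assumption that covers the usual types of tv_sec and tv_usec.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper: writes "seconds.microseconds" into buf with the
 * microsecond field zero-padded to six digits. Returns the value of
 * snprintf(), i.e. the number of characters the full text needs. */
static int format_timeout(const struct timeval *tv, char *buf, size_t len)
{
    assert(tv != NULL && buf != NULL);
    /* tv_sec and tv_usec are not guaranteed to be int, so cast to long
     * and use the matching %ld conversions. */
    return snprintf(buf, len, "%ld.%06ld",
                    (long)tv->tv_sec, (long)tv->tv_usec);
}
```

With this, a delay of 0 s and 5000 microseconds is logged as 0.005000.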

Two after the select() (one reports an EINTR or EAGAIN error value, 
should either occur):

#else
     if (errno != EINTR && errno != EAGAIN) {
#endif
         // Unexpected error - treat this as fatal:
#if !defined(_WIN32_WCE)
         perror("BasicTaskScheduler::SingleStep(): select() fails");
#endif
         internalError();
       }
   }
     if (errno == EINTR || errno == EAGAIN) {
        perror ("[SYNCHROBUG] error is");
     }

   printf ("[SYNCHROBUG] Select done, getting sockets\n");
[...]

Two more, after the first and second passes of the readable-socket check:

  int resultConditionSet = 0;
     if (FD_ISSET(sock, &readSet) && FD_ISSET(sock, &fReadSet)/*sanity check*/) {
        printf ("[SYNCHROBUG] Socket %d found readable on first pass\n", sock);
        resultConditionSet |= SOCKET_READABLE;
     }

     [...]

       if (FD_ISSET(sock, &readSet) && FD_ISSET(sock, &fReadSet)/*sanity check*/) {
          printf ("[SYNCHROBUG] Socket %d found readable on second pass\n", sock);
          resultConditionSet |= SOCKET_READABLE;
       }

And one to check if we found some readable/writable/excepting socket at all:


if (handler == NULL) {
        fLastHandledSocketNum = -1;//because we didn't call a handler
        printf ("[SYNCHROBUG] No socket found at all\n");
     }


At first everything is OK:

[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] No socket found at all
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on first pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 0.0
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass

(socket 5 must be the FIFO, I guess)
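
Rather than guessing which descriptor is the FIFO, the type can be checked with fstat(). This is a minimal sketch; classify_fd, assert_is_fifo and the fd_kind enum are hypothetical names introduced for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

enum fd_kind { KIND_INVALID, KIND_FIFO, KIND_SOCKET, KIND_OTHER };

/* Hypothetical helper: reports what kind of object is behind fd.
 * Pipes and named FIFOs both show up as KIND_FIFO. */
static enum fd_kind classify_fd(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return KIND_INVALID;        /* e.g. EBADF for a closed descriptor */
    if (S_ISFIFO(st.st_mode))
        return KIND_FIFO;
    if (S_ISSOCK(st.st_mode))
        return KIND_SOCKET;
    return KIND_OTHER;
}

/* Debug aid: aborts if fd is not the FIFO it is believed to be. */
static void assert_is_fifo(int fd)
{
    assert(classify_fd(fd) == KIND_FIFO);
}
```

Logging classify_fd(sock) next to the "[SYNCHROBUG] Socket %d found readable" lines would show whether 5 is the FIFO and 6 a socket.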


But then select() randomly starts reporting errno=11, aka EAGAIN or 
"Resource temporarily unavailable":

[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 1.480311
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 6 found readable on first pass
[SYNCHROBUG] About to do the select, timeout 1.479309
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 1.478362
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 6 found readable on first pass
[SYNCHROBUG] About to do the select, timeout 1.477358
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 5 found readable on second pass
[SYNCHROBUG] About to do the select, timeout 1.476431
[SYNCHROBUG] error is: Resource temporarily unavailable
[SYNCHROBUG] Select done, getting sockets
[SYNCHROBUG] Socket 6 found readable on first pass

Now, I don't even know why a select() could return EAGAIN (a lot of 
people say it shouldn't at all, and even my "man 3 select" agrees: 
http://stackoverflow.com/questions/4193043/select-on-a-pipe-in-blocking-mode-returns-eagain 
), but I see this case is handled in your code and ignored, just like 
the EINTR case:

     if (errno != EINTR && errno != EAGAIN) {
#endif
         // Unexpected error - treat this as fatal:
#if !defined(_WIN32_WCE)
         perror("BasicTaskScheduler::SingleStep(): select() fails");
#endif
         internalError();
       }

[If errno is EINTR or EAGAIN, the scheduler goes on to inspect the 
sets returned by select().]
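
One possibility worth ruling out first: errno is only meaningful when the call actually returned -1. The extra "[SYNCHROBUG] error is" check above runs whatever selectResult was, so the EAGAIN it prints could be a value left behind by some earlier call (a non-blocking read, for example) rather than something select() reported. A minimal sketch of a guarded check follows; select_errno_or_zero and poll_readable are hypothetical helper names, not part of live555.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: returns 0 if select() succeeded (result >= 0),
 * otherwise the errno value captured right after the call. */
static int select_errno_or_zero(int select_result, int errno_after_call)
{
    return (select_result < 0) ? errno_after_call : 0;
}

/* Hypothetical helper: polls fd for readability with a zero timeout.
 * Returns select()'s result and stores the error code (0 on success)
 * in *err, ignoring any stale errno from before the call. */
static int poll_readable(int fd, int *err)
{
    fd_set rd;
    struct timeval zero = {0, 0};

    assert(err != NULL && fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rd);
    FD_SET(fd, &rd);

    int r = select(fd + 1, &rd, NULL, NULL, &zero);
    int saved = errno;              /* capture before any other call */
    *err = select_errno_or_zero(r, saved);
    return r;
}
```

Logging selectResult together with the guarded error code would show whether select() itself ever fails with EAGAIN on this system.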

Could that be the origin of my problems?

Since you obviously can't try my executables on my hardware, please tell 
me what else I could log. BTW the RTSP/RTP client in this picture is 
openRTSP. Here's an ASCII diagram :)

client program generating frames ----FIFO---> rtsp server based on live555 ----RTSP/RTP/TCP----> openRTSP

Thank you and best regards,

Cristiano Belloni.


-- 
Belloni Cristiano
Imavis Srl.
www.imavis.com <http://www.imavis.com>
belloni at imavis.com <mailto://belloni@imavis.com>