[Live-devel] Strange problem when spawning multiple instances of Live555 programs

Morgan Tørvolt morgan.torvolt at gmail.com
Mon Oct 30 01:19:00 PST 2006


Hi Ross.

Yes, I am setting reuseFirstSource to true.

I do not believe that this problem is caused by an internal limit. The
reason for my opinion is that if I run 40 or 50 instances, it is
usually only one of them that fails. Even when running 100, no more
than a few of the processes fail. What really has me wondering is the
fact that the failed process usually continues as soon as the other
processes die.

Now, after some more testing, I may have found a pattern. It seems
everything works OK up until the 31st pair of server/openRTSP. It is
always the 32nd pair that fails. Sometimes when spawning a lot of
processes at once, it is not the last pid, but always one of the last
ones. Spawning them one at a time always makes it the 32nd
that fails. If I add another server/client pair, it is random whether
the new one fails and the old failed one continues, or vice versa, but
one starts/continues OK, and the other one fails. If I remove one of
the pairs, then the hanging one starts, and everything runs fine.

Since this happens in both my RTSPServer-lookalike server and
openRTSP, I have been digging through the live555 code to see
what I can find. In particular I have searched for uses of the bind
call. I could not really understand why one would call bind on a
client. In GroupsockHelper.cpp around line 520 there is a comment
saying "// Hack - ....". I suspect that this is where the bind call
in openRTSP comes from.
The output:

getsockname(4, {sa.........
bind(4, {sa_family............
getsockname(4, {sa........

that you can see in my first mail (from strace), seems to concur with
this thought, since getsockname is called before and after. After some
searching I see that you use this function in RTSPServer.cpp and
OnDemandServerMediaSubsession.cpp. It must be the latter of those two
that hangs, since that is the only one where an infinite loop is
possible (while(1) vs. while(0)), and that sort of makes sense the
way I read the code.
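
To make the getsockname/bind/getsockname sequence concrete, here is a minimal sketch using plain POSIX calls. It is not the live555 code, only an illustration of the pattern the strace output suggests: ask for the local port, and if there is none yet, bind to port 0 so the kernel picks one, then ask again. The names localPort and ensureLocalPort are made up for this example.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: local port of 'fd' in host byte order,
// or 0 if the socket has no port yet (or getsockname() fails).
uint16_t localPort(int fd) {
  sockaddr_in addr{};
  socklen_t len = sizeof addr;
  if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) return 0;
  return ntohs(addr.sin_port);
}

// Hypothetical helper: make sure 'fd' has a local port. If it has none,
// bind to port 0, which asks the kernel to choose a free port.
uint16_t ensureLocalPort(int fd) {
  uint16_t port = localPort(fd);  // first getsockname()
  if (port != 0) return port;

  sockaddr_in any{};
  any.sin_family = AF_INET;
  any.sin_addr.s_addr = htonl(INADDR_ANY);
  any.sin_port = htons(0);        // 0 means "any free port"
  if (bind(fd, reinterpret_cast<sockaddr*>(&any), sizeof any) < 0) return 0;

  return localPort(fd);           // second getsockname()
}
```

Calling ensureLocalPort on a freshly created UDP socket should show the same three system calls under strace. The caller has no say in whether the port it gets back is odd or even.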

In OnDemandServerMediaSubsession.cpp you have a while (1) loop,
grabbing a port with an even number (why you need an even number, I
cannot guess). I expect there is some grace period on the odd ports
that you will have received and discarded, which made the first 31
processes work nicely. What I suspect happens is that the OS figures
"Wow, 31 ports taken but not in use. Free them now" when the 32nd pair
starts, so that it receives an odd port over and over.
When a new pair arrives, they will interfere with each other and make
one of them (the random part I talked about) able to get an even
numbered port. This seems to continue until I go above 63 clients,
at which point the "crash" and CPU hogging happens to two processes.
With a dual-core CPU, I am unable to take this any further, but I
would be surprised if it does not happen at 95 and 127 also.
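
The suspected failure mode can be shown without touching any sockets. The sketch below is a simulation under stated assumptions, not the live555 loop: PortAllocator is a hypothetical stand-in for "create a socket, let the OS pick a port, read the port back", and acquireEvenPort retries until the port is even. The attempt limit is an addition for the example. A bare while (1) has no such limit, so an allocator that keeps handing back the same odd port would make it spin forever, which would look like the CPU hogging described above.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for "create a socket, bind to port 0, read the
// chosen port back". Returns a port number, or a negative value on error.
using PortAllocator = std::function<int()>;

// Keep asking for a port until an even one turns up. Gives up after
// maxAttempts tries and returns -1; returns the even port otherwise.
int acquireEvenPort(const PortAllocator& allocate, int maxAttempts) {
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    int port = allocate();
    if (port < 0) return -1;           // the allocation itself failed
    if ((port & 1) == 0) return port;  // even port: done
    // Odd port: discard it and ask again.
  }
  return -1;  // only odd ports seen; a while (1) loop would still be spinning
}
```

With an allocator that hands out 5001, 5002, ... in turn, the second attempt succeeds. With one that returns 5001 every time, the function only terminates because of the attempt limit.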

What needs to be done to fix this is to address the port number issue.
One suggestion would be to try random ports with even numbers until
you get one that is "vacant". Another would be to check /proc/net/tcp
and find a port that is free. That would be quite ugly though, and not
very portable. It is not certain that either of my suggestions is any
good of course, but something should be done about this issue as it
will happen on all Linux systems, and quite possibly on others as well.
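
The first suggestion could look roughly like the sketch below. It is untested against live555 and uses only plain POSIX calls; the name bindRandomEvenUdpPort, the port range and the attempt limit are assumptions made for the example. It picks a random even port, tries to bind to it, and picks another if the bind fails. RTP conventionally uses an even port with RTCP on the next odd port, so a real fix would probably also have to check that the odd neighbour is free; that part is left out here.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <random>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: bind a new UDP socket to a randomly chosen even port
// in [lo, hi]. On success returns the socket and stores the port in portOut.
// Returns -1 if the range holds no even port or every attempt failed.
int bindRandomEvenUdpPort(uint16_t lo, uint16_t hi, int maxAttempts,
                          uint16_t& portOut) {
  int first = (lo + 1) / 2;     // smallest k with 2*k >= lo
  int last = hi / 2;            // largest k with 2*k <= hi
  if (first == 0) first = 1;    // skip port 0, which means "any port"
  if (first > last) return -1;  // no even port in the range

  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) return -1;

  std::mt19937 rng(std::random_device{}());
  std::uniform_int_distribution<int> pick(first, last);

  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    uint16_t port = static_cast<uint16_t>(2 * pick(rng));
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) == 0) {
      portOut = port;
      return fd;
    }
    // Port taken (or not permitted): pick another one.
  }
  close(fd);
  return -1;
}
```

Because every attempt names a specific port, the result no longer depends on which port the OS happens to hand out, and the attempt limit keeps the loop from spinning.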

If you want me to make a fix for it, I can do so, but I unfortunately
have more urgent matters that need to be addressed first and I do not
have the time at the moment to do a lot of debugging and stability
testing. It could therefore take some time before I get around to
taking care of it.

It could of course be me that is totally wrong here, but I have been
unable to find anything in the kernel that suggests that I can set a
limit on these things. It seems to be an OS thing as you suggested
though.

Sorry for the long e-mail. I thought that giving a lot of information
would probably make it easier for you.

Regards

-Morgan Tørvolt-


Btw, I had to think hard to understand the do{ ... } while(0) loop
calling getSourcePort in RTSPServer.cpp. It seems to work like goto,
only using break and a do-while loop instead. Below I have written
something that is a bit more readable in my eyes. I threw it together
in a hurry, and don't have the time to test it today unfortunately.
Your decision though. It is obvious that your code works well.

  ourSocket = setupStreamSocket(env, ourPort);
  if (ourSocket >= 0)
  {
      // Make sure we have a big send buffer:
      if (increaseSendBufferTo(env, ourSocket, 50*1024))
      {
          // Allow multiple simultaneous connections:
          if (listen(ourSocket, LISTEN_BACKLOG_SIZE) >= 0)
          {
              // If no port was given, bind() will have chosen a port
              // for us; return it also:
              if (ourPort.num() != 0
                  || getSourcePort(env, ourSocket, ourPort))
              {
                  return ourSocket;
              }
          }
          else
          {
              env.setResultErrMsg("listen() failed: ");
          }
      }
  }
  // Any failure falls through to here, where the code that follows the
  // original do{ ... } while(0) loop takes over. No break is needed,
  // since there is no loop to break out of.

On 30/10/06, Ross Finlayson <finlayson at live555.com> wrote:
> >I have made one program that accepts data on a socket, and
> >retransmits it on a RTSP socket (sort of like
> >testOnDemandRTSPServer, but with socket input instead of a file)
>
> Be sure to set the "reuseFirstSource" parameter to "True" when
> creating each "ServerMediaSubsession" object.
>
> >Everything works fine as long as I spawn no more than 25
> >server/client pairs. If I go above that, say 40, one of the programs
> >(sometimes openRTSP, other times the server) starts hogging CPU
> >power.
>
> I suspect you're running into some internal OS limit here - e.g.,
> maximum number of sockets or open files.  If so, then this may be
> something that you can reconfigure.
>
> Apart from this, I don't have anything else to suggest right now,
> unfortunately.
> --
>
> Ross Finlayson
> Live Networks, Inc.
> http://www.live555.com/
> _______________________________________________
> live-devel mailing list
> live-devel at lists.live555.com
> http://lists.live555.com/mailman/listinfo/live-devel
>


