<html><head><base href="x-msg://5043/"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><blockquote type="cite"><span class="Apple-style-span" style="border-collapse: separate; font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><div lang="EN-CA" link="blue" vlink="purple"><div class="WordSection1" style="page: WordSection1; "><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; ">We are using Live555 to stream (unicast) short bursts of video (currently MJPG, soon to be H.264) and metadata as objects are tracked throughout a scene.  For a given tracked object, we have information about the object (id, location, etc.), which we send as XML metadata in a MediaSubsession.  Each video/metadata sequence has a start and an end, and multiple (up to 25) of these sequences may be active at one time (think of each as coming from a separate camera trained on a separate subject).</div><p class="MsoNormal" style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; "> </p><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; ">Currently the way we are doing this (just as a proof of concept) is to use a single ServerMediaSession (SMS) containing one MJPG stream and one XML stream, with each video sequence multiplexed into this single image/metadata channel.  
In this model, each MJPG frame is tagged with a serial number and matched to its XML metadata in a buffer on the receiving end. </div><p class="MsoNormal" style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; "> </p><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; ">This is very clunky, and will not scale to better video codecs due to the uncorrelated (multiplexed) sequence of frames.  We would like to move toward a model where we have a single session open, but each video/metadata sequence is dynamically added to or removed from that session as objects appear and disappear.</div><p class="MsoNormal" style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; "> </p><div style="margin-top: 0cm; margin-right: 0cm; margin-left: 0cm; margin-bottom: 0.0001pt; font-size: 11pt; font-family: Calibri, sans-serif; ">Is this possible using Live555?  Is this something that can’t be done due to limitations in RTSP/RTCP? </div></div></div></span></blockquote><div><br></div>Hmm...  I'm not 100% sure that I understand what you're trying to do, but I think there are 3 separate issues here:</div><div>1/ Does RTP/RTCP support what you're trying to do?</div><div>2/ Does RTSP support what you're trying to do?</div><div>3/ Does the "LIVE555 Streaming Media" code support what you're trying to do?</div><div><br></div><div>For 1/, I think the answer is yes.  RTP/RTCP supports multiple (logical or physical) sources within a session, with each source having its own "SSRC" (basically, an 'RTP source identifier').  Receiver(s) can demultiplex the different media sources based on "SSRC", on the media type (RTP payload type code), and of course on multiple port numbers.  
Normally, this is done within a multicast session, but in your case you are (from what I can gather) using unicast (with multiple logical sources effectively sharing a single unicast stream).  That should be OK as well.</div><div><br></div><div>For 2/, the answer is yes (I think), but perhaps not in the way that you think.  Because you have effectively just two media types (JPEG video and XML text) in your session, there should be just two "m=" lines in your server's SDP description, and thus (in a LIVE555 implementation) just two "ServerMediaSubsession" objects within your "ServerMediaSession".  The fact that many different logical sources may appear/disappear within each (sub)stream is irrelevant, as far as RTSP is concerned.  This is something that affects the contents of the media stream(s), but not the way that they are described/set up via RTSP.  So, your LIVE555-based server implementation should continue to have just two "ServerMediaSubsession" objects within your "ServerMediaSession".</div><div><br></div><div>Now the bad news: For 3/, the answer is no (for now, at least).  The LIVE555 code currently does not support demultiplexing based on SSRC.  This means that a LIVE555-based receiver will have to do its own demultiplexing on the single stream of RTP data that it receives.  For your XML substream, that's not a problem, because you can have a 'source id' there that you can demultiplex on.  However, as you noted, for H.264, it's going to be a problem.  I'm not sure what you can do about this, but perhaps there's some 'user data' field defined for an H.264 NAL unit (an SEI 'user data unregistered' message might be one candidate) that you could use to add a tag that your receiver could use for demultiplexing?</div><br><br><div apple-content-edited="true">
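<div>To make the receiver-side demultiplexing discussed above more concrete, here is a minimal standalone sketch in Python. It is not LIVE555 code and uses no LIVE555 API. It reads the SSRC and payload type from the fixed 12-byte RTP header (as laid out in RFC 3550) and groups packets by that pair. The names <code>parse_rtp_header</code> and <code>demultiplex</code> are hypothetical, and the sketch assumes RTP version 2 packets with no header extension and no padding.</div>
<pre>
```python
import struct
from collections import defaultdict

RTP_HEADER_LEN = 12  # size in bytes of the fixed RTP header (RFC 3550)


def parse_rtp_header(packet):
    """Return (payload_type, sequence_number, timestamp, ssrc, payload).

    Assumption for this sketch: no header extension (X bit) and no
    padding (P bit); a real receiver would have to handle both.
    """
    if RTP_HEADER_LEN > len(packet):
        raise ValueError("packet shorter than the fixed RTP header")
    # Network byte order: two single bytes, a 16-bit sequence number,
    # a 32-bit timestamp and the 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:RTP_HEADER_LEN])
    if b0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    csrc_count = b0 & 0x0F           # number of 32-bit CSRC entries
    payload_type = b1 & 0x7F         # drop the marker bit
    payload_start = RTP_HEADER_LEN + 4 * csrc_count
    return payload_type, seq, timestamp, ssrc, packet[payload_start:]


def demultiplex(packets):
    """Group (sequence_number, payload) pairs by (ssrc, payload_type)."""
    streams = defaultdict(list)
    for packet in packets:
        payload_type, seq, _timestamp, ssrc, payload = parse_rtp_header(packet)
        streams[(ssrc, payload_type)].append((seq, payload))
    return dict(streams)
```
</pre>
<div>Each key of the returned dictionary would then correspond to one logical source, so a receiver could hand each list of payloads to its own decoder. Reordering by sequence number and handling of lost packets are left out of this sketch.</div>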

<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; "><span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: -webkit-auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; font-size: medium; ">Ross Finlayson<br>Live Networks, Inc.<br><a href="http://www.live555.com/">http://www.live555.com/</a></span></span>

</div>

<br></body></html>