[Live-devel] Kasenna MediaBase support (long)

Thu Apr 29 16:47:27 PDT 2004

Ross,

I need to talk a few things through. I'll understand if you don't have time
for this, but if you can point me in the right direction on a few things I
would really appreciate it.

It goes back to this email.

On Sat, 27 Mar 2004 20:25:04 -0800, Ross Finlayson <finlayson at live.com> wrote:

> Derk-Jan,
>
>> Thought it was better to discuss this here.
>> I really want to try to get this working no matter how much it's their
>> fault for not keeping to the latest specs (and actually creating broken
>> implementations).
>
> OK, fair enough.
>
>> 4: VLC needs to be able to call a demux from another demux (TS streams
>> need to be demuxed)
>
> Yes, this is something that's needed for playing 'proper' MPEG-2 TS/RTP 
> streams also.

This is what I'm trying to achieve, but with mplayer. I went with mplayer
because it seemed to have the best RTSP/MPEG-2 TS support. Maybe that
was wrong, though, as people are saying that MPEG-2 UDP multicast is working
well with VLC. I would need unicast, though, as the Kasenna supports fast
forward and rewind, which is the functionality that I'm looking for.

>> I know how I would go about doing the RAW/RAW/UDP request...
>> I've actually already hacked up liveMedia a bit to use RAW/RAW/UDP, and
>> i edited the SETUP to use the aggregate uri instead of the
>> non-aggregate, because there is only 1 TS Stream and otherwise the
>> server reports 460 (i know, it's a mess :)

I've made similar changes and more. I'll summarise them below.

>>
>> Anyway, these are the results:
>> (Notice the completely ridiculous SDP the server responds with... )
>
> The SDP actually seems OK, except for the fact that it apparently lies 
> about the nature of the stream that it's supposed to be 'describing'.
>
> Now, just to make things clear (in my own mind): Do I understand this 
> correctly? Despite advertising that the stream consists of (i) a MPEG 
> video RTP stream and (ii) a MPEG audio stream, the stream actually 
> consists of a single MPEG Transport Stream over UDP.  Is this correct??  
> If so, how incredibly brain damaged!

This is correct. It's a single MPEG-2 Transport Stream over UDP, with
two PES streams, one for audio and one for video.

> I'd prefer that any changes to the LIVE.COM libraries necessary to 
> support this Kasenna crap be kept as small as possible.  One possible 
> way to proceed would be for the RTSP client code to (when a Kasenna 
> stream is detected) throw away the SDP description that comes from the 
> Kasenna server, and instead generate a different SDP description that 
> *really* describes the stream.  This new, proper SDP description would 
> be passed to "MediaSession::initializeWithSDP()", which then wouldn't 
> need to change at all, except to add support for creating a source for 
> raw UDP streams - using the "BasicUDPSource" class.

In fact, for MPEG-2 from a Kasenna there isn't any SDP at all, only for
MPEG-1 and MPEG-4. For MPEG-2 they use application/x-rtsp-mh. I think
there are reasons for that. They have extensions for ff/rw, for clips
and sequences, and for other things. Plus, they have been around for a
few years, so maybe they went with something that worked and now won't
change.

Anyway, I'm reading the x-rtsp-mh and generating an SDP. But, I'm not
100% sure that it's correct.

I go through DESCRIBE/SETUP/PLAY and the server starts streaming.

Now, it's about here that I'm stuck. With demux_rtp.cpp as the demuxer,
the TS packets are all out of order, so the stream fails the MPEG-2 TS
continuity checks. I'm attempting to use the mplayer TS demuxer
(demux_ts.c) in libmpdemux, copying the new_demuxers_demuxer hack used
for interleaved RTP.
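For reference, the continuity check mentioned above can be sketched as follows. This is a minimal illustration of the 4-byte TS packet header layout from ISO/IEC 13818-1, not the code in demux_ts.c; the names TsHeader, parseTsHeader and ContinuityChecker are hypothetical, and the sketch ignores the discontinuity indicator and the rule that permits one duplicate packet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Sync byte that starts every 188-byte MPEG-2 TS packet.
static const std::uint8_t kTsSyncByte = 0x47;

struct TsHeader {
  unsigned pid;         // 13-bit packet identifier
  unsigned continuity;  // 4-bit continuity counter
  bool hasPayload;      // payload bit of adaptation_field_control
};

// Parse the 4-byte TS header. Returns false if the sync byte is missing.
bool parseTsHeader(const std::uint8_t* pkt, TsHeader& out) {
  assert(pkt != NULL);
  if (pkt[0] != kTsSyncByte) return false;
  out.pid = ((pkt[1] & 0x1Fu) << 8) | pkt[2];
  out.hasPayload = (pkt[3] & 0x10u) != 0;
  out.continuity = pkt[3] & 0x0Fu;
  return true;
}

// Hypothetical helper: remembers the last counter seen for each PID.
class ContinuityChecker {
 public:
  // Returns true if the packet is in sequence for its PID, or is the
  // first packet seen on that PID.
  bool check(const TsHeader& h) {
    if (h.pid == 0x1FFF) return true;  // null packets carry no sequence
    std::map<unsigned, unsigned>::iterator it = last_.find(h.pid);
    if (it == last_.end()) {
      last_[h.pid] = h.continuity;
      return true;
    }
    // The counter only advances on packets that carry a payload.
    unsigned expected = h.hasPayload ? ((it->second + 1) & 0x0Fu) : it->second;
    bool ok = (h.continuity == expected);
    it->second = h.continuity;
    return ok;
  }

 private:
  std::map<unsigned, unsigned> last_;
};
```

Feeding each 188-byte packet of a received datagram through parseTsHeader() and check() would show whether packets are being reordered or dropped before they reach the TS demuxer.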

Maybe I should show some changes. Please understand that I'm trying
to understand all the issues first and get a working prototype. I
am committed to going over all this for as long as it takes to turn
the hacks into proper solutions.

Groupsock::handleRead()
-----------------------

#ifdef SUPPORT_KASENNA_RTSP
   int maxBytesToRead = bufferMaxSize;
#else
   int maxBytesToRead = bufferMaxSize - TunnelEncapsulationTrailerMaxSize;
#endif

RTSPClient::describeURL()
-------------------------

     // (Later implement more, as specified in the RTSP spec, sec D.1 #####)
     char* const cmdFmt =
       "DESCRIBE %s RTSP/1.0\r\n"
       "CSeq: %d\r\n"
#ifdef SUPPORT_KASENNA_RTSP
       "Accept: application/x-rtsp-mh, application/sdp\r\n"
#else
       "Accept: application/sdp\r\n"
#endif

and towards the end of this method:

       if (from != to && fVerbosityLevel >= 1) {
         envir() << "Warning: " << from-to << " invalid 'NULL' bytes were found in (and removed from) the SDP description.\n";
       }
       bodyStart[to] = '\0'; // trims any extra data

#ifdef SUPPORT_KASENNA_RTSP
      // Translate from x-rtsp-mh to sdp

      int  videoPid, audioPid;
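      // currentWord starts empty because the loop condition reads it
      // before the first sscanf() fills it in.
      char sdpBuf[300], currentWord[20] = "", ipAddressBuf[20];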

      char * currentPos = bodyStart;

      while (strcmp(currentWord, "</MediaDescription>") != 0)
      {
        sscanf(currentPos, "%s", currentWord);

        if (strcmp(currentWord, "VideoPid") == 0) {
          currentPos += strlen(currentWord) + 1;
          sscanf(currentPos, "%s", currentWord);
          currentPos += strlen(currentWord) + 1;
          sscanf(currentPos, "%d", &videoPid);
          currentPos += 3;
        }

        if (strcmp(currentWord, "AudioPid") == 0) {
          currentPos += strlen(currentWord) + 1;
          sscanf(currentPos, "%s", currentWord);
          currentPos += strlen(currentWord) + 1;
          sscanf(currentPos, "%d", &audioPid);
          currentPos += 3;
        }

        currentPos += strlen(currentWord) + 1;
      }

      unsigned char byte1 = fServerAddress & 0x000000ff;
      unsigned char byte2 = (fServerAddress & 0x0000ff00) >>  8;
      unsigned char byte3 = (fServerAddress & 0x00ff0000) >> 16;
      unsigned char byte4 = (fServerAddress & 0xff000000) >> 24;

      sprintf(ipAddressBuf, "%u%s%u%s%u%s%u%s%c",
                byte1, ".",
                byte2, ".",
                byte3, ".",
                byte4, " ",
                '\0');

      sprintf(sdpBuf, "%s\n%s%s\n%s%s\n%s%s\n%s\n%s\n%s\n%s\n%s%d\n",
              "v=0",
              "o=NoSpacesAllowed 1 1 IN IP4 ", ipAddressBuf,
              "s=", url,
              "c=IN IP4 ", ipAddressBuf,
              "t=0 0",
              "a=control:*",
              "a=range:npt=0-",
              "m=video 554 RAW/RAW/UDP 33",
              "a=control:trackID=", videoPid);
              //"m=audio 0 RTP/AVP 14",
              //"a=control:trackID=", audioPid);

      return strDup(sdpBuf);

#endif
     }
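The translation step above could also be written without fixed-size buffers. The following is only a sketch under assumptions: the "key, separator token, integer" layout of the x-rtsp-mh body is inferred from the scanning code above rather than from any Kasenna documentation, and findIntField() and buildKasennaSdp() are hypothetical helper names. The SDP lines are the same ones the sprintf above produces.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: look for "<key> <separator> <integer>" in a
// whitespace-separated body. Returns false if the key is absent or is
// not followed by a separator token and an integer.
bool findIntField(const std::string& body, const std::string& key, int& value) {
  assert(!key.empty());
  std::istringstream in(body);
  std::string word;
  while (in >> word) {
    if (word != key) continue;
    std::string separator;  // e.g. "=" (assumed layout)
    int parsed = 0;
    if ((in >> separator) && (in >> parsed)) {
      value = parsed;
      return true;
    }
    return false;
  }
  return false;
}

// Build a one-track SDP description for a raw-UDP transport stream,
// mirroring the fields written by the sprintf in the patch.
std::string buildKasennaSdp(const std::string& url,
                            const std::string& serverIp, int videoPid) {
  std::ostringstream sdp;
  sdp << "v=0\n"
      << "o=NoSpacesAllowed 1 1 IN IP4 " << serverIp << "\n"
      << "s=" << url << "\n"
      << "c=IN IP4 " << serverIp << "\n"
      << "t=0 0\n"
      << "a=control:*\n"
      << "a=range:npt=0-\n"
      << "m=video 554 RAW/RAW/UDP 33\n"
      << "a=control:trackID=" << videoPid << "\n";
  return sdp.str();
}
```

Because the reader stops at the end of the body, a missing closing tag or a missing field ends the search instead of scanning past the buffer.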

RTSPClient::setupMediaSubsession()
----------------------------------

     // (Later implement more, as specified in the RTSP spec, sec D.1 #####)
     char* const cmdFmt =
#ifdef SUPPORT_KASENNA_RTSP
       "SETUP %s%s RTSP/1.0\r\n"
#else
       "SETUP %s%s%s RTSP/1.0\r\n"
#endif
       "CSeq: %d\r\n"
#ifdef SUPPORT_REAL_RTSP
       "Transport: x-pn-tng/tcp;mode=play,rtp/avp/unicast;mode=play\r\n"
#else
#ifdef SUPPORT_KASENNA_RTSP
       "Transport: RAW/RAW/UDP%s%s%s=%d-%d\r\n"
#else
       "Transport: RTP/AVP%s%s%s=%d-%d\r\n"
#endif
#endif
       "%s"
       "%s"
       "%s\r\n";

and further on in the same method:

     unsigned cmdSize = strlen(cmdFmt)
       + strlen(prefix) + strlen(separator)
#ifndef SUPPORT_KASENNA_RTSP
       + strlen(suffix)
#endif
       + 20 /* max int len */
       + strlen(transportTypeString) + strlen(modeString)
           + strlen(portTypeString) + 2*5 /* max port len */
       + strlen(sessionStr)
       + strlen(authenticatorStr)
       + fUserAgentHeaderStrSize;
     cmd = new char[cmdSize];
     sprintf(cmd, cmdFmt,
             prefix, separator,
#ifndef SUPPORT_KASENNA_RTSP
             suffix,
#endif
             ++fCSeq,
#ifndef SUPPORT_REAL_RTSP
             transportTypeString, modeString, portTypeString,
                 rtpNumber, rtcpNumber,
#endif
             sessionStr,
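The hand-counted cmdSize above has to be kept in step with the format string whenever an #ifdef adds or removes an argument. One alternative, sketched here under assumptions, is to let snprintf() measure the output first. formatSetup() is a hypothetical helper, not part of liveMedia, and the ";unicast;client_port" pieces stand in for what transportTypeString and portTypeString are assumed to contain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format a simplified SETUP request, sizing the
// buffer with a first snprintf() pass instead of summing strlen() calls.
std::string formatSetup(const char* url, int cseq, const char* transport,
                        unsigned rtpPort, unsigned rtcpPort) {
  assert(url != NULL && transport != NULL);
  static const char fmt[] =
      "SETUP %s RTSP/1.0\r\n"
      "CSeq: %d\r\n"
      "Transport: %s;unicast;client_port=%u-%u\r\n"
      "\r\n";
  // First pass: a NULL buffer of size 0 only reports the length needed.
  int needed = std::snprintf(NULL, 0, fmt, url, cseq, transport,
                             rtpPort, rtcpPort);
  if (needed < 0) return std::string();
  // Second pass: write into a buffer with room for the terminating NUL.
  std::string out(static_cast<std::size_t>(needed) + 1, '\0');
  std::snprintf(&out[0], out.size(), fmt, url, cseq, transport,
                rtpPort, rtcpPort);
  out.resize(static_cast<std::size_t>(needed));
  return out;
}
```

For example, formatSetup("rtsp://example/clip", 3, "RAW/RAW/UDP", 5000, 5001) yields a request whose Transport line reads "RAW/RAW/UDP;unicast;client_port=5000-5001".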

MediaSession::initializeWithSDP()
---------------------------------

#ifdef SUPPORT_KASENNA_RTSP
     if (sscanf(sdpLine, "m=%s %hu RAW/RAW/UDP %u",
                mediumName, &subsession->fClientPortNum, &payloadFormat) != 3
#else
     if (sscanf(sdpLine, "m=%s %hu RTP/AVP %u",
                mediumName, &subsession->fClientPortNum, &payloadFormat) != 3
#endif
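A possible way to avoid the compile-time switch here is to read the transport token from the "m=" line and branch on it at run time. This is a sketch only: MediaLine, parseMediaLine() and isRawUdp() are hypothetical names, and the field widths are arbitrary choices made to keep sscanf() from overrunning the buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical holder for the fields of an SDP "m=" line.
struct MediaLine {
  char mediumName[32];
  unsigned short port;
  char protocol[32];
  unsigned payloadFormat;
};

// Parse "m=<medium> <port> <protocol> <payload format>".
// Returns false unless all four fields are present.
bool parseMediaLine(const char* sdpLine, MediaLine& out) {
  assert(sdpLine != NULL);
  return std::sscanf(sdpLine, "m=%31s %hu %31s %u",
                     out.mediumName, &out.port,
                     out.protocol, &out.payloadFormat) == 4;
}

// True for the raw-UDP transport token used in the generated SDP.
bool isRawUdp(const MediaLine& m) {
  return std::strcmp(m.protocol, "RAW/RAW/UDP") == 0;
}
```

With this, a line such as "m=video 554 RAW/RAW/UDP 33" and a line such as "m=audio 0 RTP/AVP 14" both parse, and the caller decides per subsession whether to create a raw UDP source or an RTP source.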

MediaSubsession::initiate()
---------------------------

#ifdef SUPPORT_KASENNA_RTSP
       success = True;
       break;
#else
       // Get the client port number, to make sure that it's even (for RTP):
       Port clientPort(0);
       if (!getSourcePort(env(), fRTPSocket->socketNum(), clientPort)) {
         break;
       }
       fClientPortNum = ntohs(clientPort.num());

       // If the port number's not even, try again:
       if ((fClientPortNum&1) == 0) {
         success = True;
         break;
       }
       // Try again:
       delete oldGroupsock;
       oldGroupsock = fRTPSocket;
       fClientPortNum = 0;
#endif

and a little on:

#ifndef SUPPORT_KASENNA_RTSP
     // Set our RTCP port to be the RTP port +1
     unsigned short const rtcpPortNum = fClientPortNum|1;
     if (isSSM()) {
       fRTCPSocket = new Groupsock(env(), tempAddr, fSourceFilterAddr,
                                   rtcpPortNum);
       // Also, send RTCP packets back to the source via unicast:
       if (fRTCPSocket != NULL) {
         fRTCPSocket->changeDestinationParameters(fSourceFilterAddr,0,~0);
       }
     } else {
       fRTCPSocket = new Groupsock(env(), tempAddr, rtcpPortNum, 255);
     }
     if (fRTCPSocket == NULL) {
       char tmpBuf[100];
       sprintf(tmpBuf, "Failed to create RTCP socket (port %d)",
               rtcpPortNum);
       env().setResultMsg(tmpBuf);
       break;
     }
#endif

and a good bit further on:

     } else if (strcmp(fCodecName, "X-MCT-TEXT") == 0) {
       // A UDP-packetized text stream (*not* a RTP stream)
       fReadSource = BasicUDPSource::createNew(env(), fRTPSocket);
       fRTPSource = NULL; // Note!
     } else if (  strcmp(fCodecName, "PCMU") == 0 // PCM u-law audio
                || strcmp(fCodecName, "GSM") == 0 // GSM audio
                || strcmp(fCodecName, "PCMA") == 0 // PCM a-law audio
                || strcmp(fCodecName, "L16") == 0 // 16-bit linear audio
                || strcmp(fCodecName, "MP1S") == 0 // MPEG-1 System Stream
                || strcmp(fCodecName, "MP2T") == 0 // MPEG-2 Transport Str
                || strcmp(fCodecName, "MP2P") == 0 // MPEG-2 Program Stream
                || strcmp(fCodecName, "L8") == 0 // 8-bit linear audio
                || strcmp(fCodecName, "SPEEX") == 0 // SPEEX audio
               ) {
#ifdef SUPPORT_KASENNA_RTSP
       createBasicUDPSource = True;
#else
       createSimpleRTPSource = True;
       useSpecialRTPoffset = 0;
#endif
     } else if (useSpecialRTPoffset >= 0) {
       // We don't know this RTP payload format, but try to receive
       // it using a 'SimpleRTPSource' with the specified header offset:
       createSimpleRTPSource = True;
     } else {
       env().setResultMsg("RTP payload format unknown or not supported");
       break;
     }

     if (createBasicUDPSource) {
       fReadSource = BasicUDPSource::createNew(env(), fRTPSocket);
     }

demux_rtp.cpp - demux_open_rtp()
--------------------------------

         // Set the OS's socket receive buffer sufficiently large to avoid
         // incoming packets getting dropped between successive reads from
         // this subsession's demuxer.  Depending on the bitrate(s) that you
         // expect, you may wish to tweak the "desiredReceiveBufferSize"
         // values above.
#ifdef SUPPORT_KASENNA_RTSP
         BasicUDPSource* udpSource = dynamic_cast<BasicUDPSource*>(subsession->readSource());
         int rtpSocketNum = udpSource->groupSock()->socketNum();
#else
         int rtpSocketNum = subsession->rtpSource()->RTPgs()->socketNum();
#endif

further on:

     success = True;
   } while (0);
   if (!success) return NULL; // an error occurred

#ifdef SUPPORT_KASENNA_RTSP
   stream_t* s = new_ds_stream(demuxer->video);

   demuxer->type = DEMUXER_TYPE_MPEG_TS;
   demuxer_t* new_demux = demux_open(s, DEMUXER_TYPE_MPEG_TS, -1, -1, -1, NULL);

   demuxer = new_demuxers_demuxer(new_demux, new_demux, new_demux);
#else
   // Hack: If audio and video are demuxed together on a single RTP stream,
   // then create a new "demuxer_t" structure to allow the higher-level
   // code to recognize this:
   if (demux_is_multiplexed_rtp_stream(demuxer)) {
     stream_t* s = new_ds_stream(demuxer->video);
     demuxer_t* od = demux_open(s, DEMUXER_TYPE_UNKNOWN, -1, -1, -1, NULL);
     demuxer = new_demuxers_demuxer(od, od, od);
   }
#endif

   return demuxer;
}

demux_rtp.cpp - afterReading()
------------------------------

#ifdef SUPPORT_KASENNA_RTSP
   if (frameSize > MPEG2_TS_FRAME_SIZE) {
     fprintf(stderr, "Saw an input frame too large (>=%d).  Increase MPEG2_TS_FRAME_SIZE in \"demux_rtp.cpp\".\n",
             MPEG2_TS_FRAME_SIZE);
   }
#else
   if (frameSize >= MAX_RTP_FRAME_SIZE) {
     fprintf(stderr, "Saw an input frame too large (>=%d).  Increase MAX_RTP_FRAME_SIZE in \"demux_rtp.cpp\".\n",
             MAX_RTP_FRAME_SIZE);
   }
#endif

and #ifdef out the RTCP checks:

#ifndef SUPPORT_KASENNA_RTSP
   RTPState* rtpState = (RTPState*)(demuxer->priv);

   // Set the packet's presentation time stamp, depending on whether or
   // not our RTP source's timestamps have been synchronized yet:
   Boolean hasBeenSynchronized
     = bufferQueue->rtpSource()->hasBeenSynchronizedUsingRTCP();

I would really appreciate some guidance. Is this completely the wrong
approach? Should I be doing something with MPEG1or2Demux, adding in
TS support? Or is it BasicUDPSource that I should be extending?

Dermot.
--