[Live-devel] Implementing MRCP

Thu May 6 22:10:19 PDT 2004

>         I'm looking into implementing MRCP using this code base, but 
> certain parts of the protocol seem difficult to work out.  I'm also just 
> trying to learn the code.  As a little background, MRCP only uses the 
> SETUP, TEARDOWN, DESCRIBE, and ANNOUNCE commands.

David,

I wasn't familiar with MRCP before, so I looked it up.  FYI (for the rest 
of the list), I found a description at 
<http://www.scansoft.com/network/standards/mrcp/>, and an Internet-Draft 
(for version 1) at 
<http://www.ietf.org/internet-drafts/draft-shanmugham-mrcp-05.txt>.  While 
version 1 uses RTSP as a base, there are apparently plans to 
upgrade this to version 2 
<http://www.ietf.org/internet-drafts/draft-ietf-speechsc-mrcpv2-02.txt>, 
which will instead use SIP as a base.  I assume that you're planning to 
implement version 1.

>  A typical conversation to do recognition would go:
>
>DESCRIBE - to get the server SDP on the main url
>SETUP with unicast and a  client port on the recognizer url -> the 
>response has the server ports and the config.
>--setup outgoing RTP to stream audio data TO the server
>ANNOUNCE - for commands like define grammar, or begin recognizing
>
>Then the server sends ANNOUNCE events to the client asynchronously to 
>state when things finish.
>
> From what I can tell, the software assumes that MediaSubsessions that are 
> created as a result of a SETUP command are for receiving data.

A small clarification: The "MediaSession" and "MediaSubsession" objects are 
actually created from the SDP description that's returned by the RTSP 
"DESCRIBE" operation - not the RTSP "SETUP" operation.  Also, the objects 
(RTPSource's etc.) for receiving data aren't actually created until 
"initiate()" is called on the MediaSubsession object.  If you don't call 
"initiate()", then you won't be creating any of these objects.  So, you can 
probably continue to use the "MediaSession" and "MediaSubsession" objects 
(perhaps with some modifications), even though they're not being used to 
receive data.
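
To illustrate the point about deferred creation, here is a minimal, 
self-contained sketch.  It is NOT the actual library code: "ToySubsession", 
"ToyReceiver" and "parseSdp()" are hypothetical names invented for this 
example, and the SDP handling is reduced to spotting "m=" lines.  It only 
shows the pattern: per-stream objects come from the SDP description, and 
nothing for receiving exists until "initiate()" is called.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a receiving object (not the library's RTPSource).
struct ToyReceiver {
  int clientPort;
};

// Hypothetical per-stream object, built from one "m=" line of an SDP description.
class ToySubsession {
public:
  explicit ToySubsession(std::string mediumName) : medium_(std::move(mediumName)) {}

  const std::string& medium() const { return medium_; }
  bool hasReceiver() const { return receiver_ != nullptr; }

  // Receiving objects are created only here, on demand.
  void initiate(int clientPort) {
    assert(clientPort > 0);
    if (!receiver_) receiver_ = std::make_unique<ToyReceiver>(ToyReceiver{clientPort});
  }

private:
  std::string medium_;
  std::unique_ptr<ToyReceiver> receiver_;
};

// Builds one ToySubsession per "m=<medium> ..." line; creates no receivers.
std::vector<ToySubsession> parseSdp(const std::string& sdp) {
  std::vector<ToySubsession> result;
  std::istringstream in(sdp);
  std::string line;
  while (std::getline(in, line)) {
    if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
    if (line.rfind("m=", 0) == 0) {                             // line starts with "m="
      std::size_t end = line.find(' ');
      std::size_t len = (end == std::string::npos) ? std::string::npos : end - 2;
      result.emplace_back(line.substr(2, len));
    }
  }
  return result;
}
```

A client that only sends audio to the server could keep such per-stream 
objects around for bookkeeping and simply never call "initiate()" on them.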

>   Also, the MRCP servers don't seem to return a control URL, and there is 
> no way to override the control URL on a subsession.

That's true, unless you modify the code to support this.

>  The third thing is that the event loop does not appear to have support 
> for receiving asynchronous events from the RTSP server.

Although the "RTSPClient" implementation reads synchronously from its 
socket (to get the responses from each command that it sends), there's no 
reason why it couldn't (be modified to) also asynchronously read and handle 
commands that come from the RTSP server.  This would be done using the 
usual asynchronous socket read handling mechanism: 
"TaskScheduler::turnOnBackgroundReadHandling()".
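
As a rough illustration of that idea, here is a self-contained sketch of 
a per-socket read handler driven by an event loop.  It does not use the 
library at all: "ToyScheduler", "turnOnReadHandling()", "singleStep()" and 
"readRequestMethod()" are hypothetical names made up for this example, and 
it assumes a POSIX system with poll().  The real scheduler's interface may 
well differ; this only shows the general shape of the mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>
#include <poll.h>
#include <unistd.h>

// Hypothetical miniature scheduler (not the library's TaskScheduler).
class ToyScheduler {
public:
  using ReadHandler = std::function<void(int fd)>;

  // Register a handler to be called whenever 'fd' becomes readable.
  void turnOnReadHandling(int fd, ReadHandler handler) {
    assert(fd >= 0);
    handlers_[fd] = std::move(handler);
  }

  void turnOffReadHandling(int fd) { handlers_.erase(fd); }

  // Waits up to 'timeoutMs' for registered descriptors to become readable,
  // calls the handler of each ready one, and returns how many were called.
  int singleStep(int timeoutMs) {
    std::vector<pollfd> fds;
    for (const auto& entry : handlers_) {
      pollfd p{};
      p.fd = entry.first;
      p.events = POLLIN;
      fds.push_back(p);
    }
    if (fds.empty()) return 0;
    int ready = poll(fds.data(), static_cast<nfds_t>(fds.size()), timeoutMs);
    if (ready <= 0) return 0;  // timeout or error: nothing handled
    int called = 0;
    for (const pollfd& p : fds) {
      if ((p.revents & (POLLIN | POLLHUP)) == 0) continue;
      auto it = handlers_.find(p.fd);
      if (it == handlers_.end()) continue;  // removed by an earlier handler
      ReadHandler handler = it->second;     // copy, in case the handler unregisters itself
      handler(p.fd);
      ++called;
    }
    return called;
  }

private:
  std::map<int, ReadHandler> handlers_;
};

// Example handler helper: reads what arrived and returns the first token,
// e.g. "ANNOUNCE" for a server-initiated request.  Returns "" on EOF/error.
std::string readRequestMethod(int fd) {
  char buf[1024];
  ssize_t n = read(fd, buf, sizeof buf);
  if (n <= 0) return "";
  std::string text(buf, static_cast<std::size_t>(n));
  return text.substr(0, text.find(' '));
}
```

In a client, the handler registered on the RTSP socket would parse the 
incoming request (for MRCP, an ANNOUNCE carrying an event) and dispatch it, 
while the rest of the event loop keeps running.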

Anyway, the bottom line is that the code probably contains enough of a 
framework to let you do what you want to do, but you're going to have to 
make some fairly significant modifications to it first.

	Ross Finlayson
	LIVE.COM
	<http://www.live.com/>