Capturing DTMF with Asterisk and UniMRCP
A common question new developers have is how to capture DTMF key presses when using speech recognition with Asterisk. There are two ways to do this when working with the res_unimrcp.so module and the MRCPRecog() and SynthAndRecog() applications.
Using DTMF Grammars
Our recommended way of handling DTMF is to allow LumenVox to parse DTMF key presses against DTMF grammars. This allows the dialplan code to handle speech and DTMF in the same way. A common requirement is to support menus like "Say yes or press 1, otherwise say no or press 2." In this case, we would like the application to treat the spoken word "yes" as equivalent to the key press 1. This is accomplished by writing separate speech and DTMF grammars that return the same output, loading them both, and then performing a recognition with i=none. Setting i=none tells UniMRCP and Asterisk to send the DTMF digits to LumenVox for processing.
Here are the two grammars that would be required for the application:
YesNoSpeech.gram
    #ABNF 1.0 UTF-8;
    language en-US;
    mode voice;
    tag-format <semantics/1.0>;
    root $rootrule;

    $yes = yes {out = "yes"};
    $no = no {out = "no"};
    $rootrule = $yes | $no;
YesNoDTMF.gram
    #ABNF 1.0 UTF-8;
    language en-US;
    mode dtmf;
    tag-format <semantics/1.0>;
    root $rootrule;

    $yes = 1 {out = "yes"};
    $no = 2 {out = "no"};
    $rootrule = $yes | $no;
Regardless of whether a user says "Yes" or presses 1, the output from each grammar will be the word "yes." So the dialplan can look like the following:
DTMF and Speech Dialplan
    exten => s,n,MRCPRecog(YesNoSpeech.gram,YesNoDTMF.gram,p=default&i=none)
    exten => s,n,GotoIf($[ "${RECOG_INSTANCE(0/0)}" = "yes"]?Yes:No)
    exten => s,n(Yes),Verbose(1,The user said yes.)
    exten => s,n,Hangup
    exten => s,n(No),Verbose(1,The user said no.)
    exten => s,n,Hangup
Now if the user says "Yes" or presses 1, they will hit the line labeled Yes. This same basic pattern can be used for any sort of mixed DTMF/speech dialogs.
Using Builtin Dialplan DTMF Handling
Both MRCPRecog() and SynthAndRecog() support a parameter called i that specifies which DTMF digits, if any, the UniMRCP module will process. If i is set to either a string of digits or the word any, those digits are processed by the dialplan as normal, and LumenVox is not involved. Consider the following code:
    exten => s,n,MRCPRecog(builtin:grammar/digits,p=default&i=any)
    exten => s,n,Verbose(1,The user said ${RECOG_RESULT})
    exten => 1,1,Verbose(1,The user pressed 1)
That code allows a user to speak digits using the built-in LumenVox digits grammar, but if the user presses 1, the normal Asterisk dialplan functionality takes over and advances the user to extension 1. However, the current UniMRCP/Asterisk implementation can collect only one digit this way. To collect more than one digit, use the DTMF grammar method described above.
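As a rough illustration of the multi-digit case, a DTMF grammar along the following lines could collect a four-digit PIN. This is a sketch, not a tested LumenVox grammar. The file name PinDTMF.gram, the rule names, and the four-digit length are all hypothetical, and it assumes the recognizer supports the SRGS repeat operator and the SISR meta.<rulename>.text property.

PinDTMF.gram (hypothetical)

```
#ABNF 1.0 UTF-8;
language en-US;
mode dtmf;
tag-format <semantics/1.0>;
root $pin;

// A single DTMF key press, 0 through 9.
$digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9;

// Exactly four digits. Start with an empty string, then append the
// text matched by each $digit so the result is one string such as "1234".
$pin = {out = "";} ($digit {out += meta.digit.text;})<4>;
```

Loaded in place of YesNoDTMF.gram with i=none, a grammar like this should return the whole digit string in ${RECOG_INSTANCE(0/0)}, so the dialplan can branch on it the same way it branches on "yes" and "no".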