Advanced SISR results
If you have not yet done so, please read our Intro to Semantic Interpretation for some background information about semantic interpretation. That page covers the basic concepts and terms used by SISR.
Using ECMAScript Within Tags
As mentioned in our other SISR articles, Tags are how SISR is put into grammars and these Tags usually contain ECMAScript. Any script inside of a tag is executed by the Engine when the part of a rule to its left is matched.
ECMAScript (sometimes called JavaScript) can be used to create fairly complex structures in the results returned by a grammar. Perhaps more importantly, since it is a scripting language, it allows variables and functions to be defined and used. This flexibility may not always be needed, and its usefulness may not always be apparent, so let's look at a complex example that uses some moderately advanced ECMAScript within a grammar to process results in a particular way.
Important Reminders
Using ECMAScript within grammars, as with any language, requires special syntax to be correctly interpreted.
Whitespace
Whitespace characters, such as carriage returns, line feeds, spaces and tabs, are typically ignored (outside string literals) when ECMAScript is processed. Tags spread over several lines with consistent indentation and spacing are therefore usually easier to read and maintain than the same Tags written with no spacing at all. For example, this rule:
$my_rule=test{out="some string";my_counter++;out.counter=my_counter};
is exactly the same as:
$my_rule = test { out = "some string"; my_counter++; out.counter = my_counter };
The second version is much easier to read, understand and maintain, but when the ASR processes the grammar, both are treated in exactly the same way.
Curly Brackets
Because ECMAScript also uses curly brackets, for example to delimit the bodies of conditional statements and functions, it is sometimes necessary to have curly braces within an SISR tag. In this case, you can denote an SISR tag with the following three-character sequences: {!{ will open a tag and }!} will close it.
In other words, if you have a simple tag, such as the following, you can simply use one set of curly brackets as shown here:
$my_rule = test { out = "some string"; };
If you need to use a curly brace within your ECMAScript code, then you should surround your tag with the three character sequences, as shown here:
$my_rule = test {!{
    if(out == "") {
        out = "some string";
    } else {
        out = "other string";
    }
}!};
If you attempted to use only single curly braces instead of the three-character sequences, this would result in an SISR parsing error. These errors are often difficult to detect, since Tags are only processed after the decode has taken place, so there is no indication of a problem while the grammar is being loaded or compiled.
Variable Declaration
Another common misconception relates to variables used within Tags. The "out" variable is pre-defined, so it does not need to be declared by the grammar. This often leads users to assume that they do not need to declare other variables either. That assumption is incorrect and leads to undefined variable errors, which are also difficult to detect.
ECMAScript is also a very loosely typed language, meaning that the same variable can hold a string, an integer, a floating point value, an array, an object or a number of other types, as shown here:
$my_rule = test {!{
    out = "string";                        // out is assigned a 'string' type
    out = 123;                             // out is assigned an 'integer' type
    out = 123.45;                          // out is assigned a 'floating point' type
    out = new Array('one', 'two', 'three'); // an 'array' type
}!};
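The loose typing described above can be tried outside a grammar. The following is a minimal standalone sketch (plain ECMAScript run in Node.js or a browser, not a grammar Tag); describeType is a hypothetical helper introduced only for this illustration. Note that standard ECMAScript reports both integers and floating point values as the single type "number".

```javascript
// Standalone sketch, not a grammar Tag.
// describeType is a hypothetical helper used only for this illustration.
function describeType(value) {
  // typeof reports "object" for arrays, so arrays are checked separately
  return Array.isArray(value) ? "array" : typeof value;
}

var out;        // local stand-in for the pre-defined SISR "out" variable
var seen = [];  // records the type held by out after each assignment

out = "string";
seen.push(describeType(out));

out = 123;
seen.push(describeType(out));

out = 123.45;
seen.push(describeType(out));

out = new Array('one', 'two', 'three');
seen.push(describeType(out));

console.log(seen.join(", ")); // string, number, number, array
```

The same variable holds four different kinds of value in turn, and no declaration of a type is ever required.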
Apart from the "out" variable, if you would like to use other variables within your Tags, it is important to declare them first, and also assign a type to them, where appropriate, as shown in this example:
$my_rule = test {!{
    var res = "string"; // a new variable named res is declared
                        // and assigned a 'string' type.
    out = res;          // assigned the value of the res variable
}!};
Here, note the use of the var keyword, which declares the res variable before it is used. This is very important, since omitting the var keyword would result in a processing error.
Also note that when the res variable is declared, it is assigned the value "string". This not only assigns a value but, more importantly, initializes the variable with a string type. From that point onward, the ECMAScript processor will treat it as a string (as opposed to some other type).
If a variable does not have a clearly defined type, you can get unexpected behavior when using it. For example, if you did not assign the value "string" to res above, then assigning res to out would result in an undefined value being returned by the SISR interpretation.
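The difference between an undeclared variable, a declared but uninitialized variable, and an initialized one can be seen in a short standalone sketch. This is plain ECMAScript run outside a grammar; the three function names are hypothetical, and how a particular SISR engine reports the error case is not shown here.

```javascript
// Standalone sketch, not a grammar Tag. All function names are hypothetical.

function readUndeclared() {
  try {
    // neverDeclared is intentionally not declared anywhere
    return String(neverDeclared);
  } catch (err) {
    return err.name; // reading an undeclared variable throws
  }
}

function readUninitialized() {
  var res;           // declared, but no value (and so no type) assigned
  return typeof res;
}

function readInitialized() {
  var res = "string"; // declared and initialized with a string
  return typeof res + ":" + res;
}

var results = [readUndeclared(), readUninitialized(), readInitialized()];
console.log(results.join(" | ")); // ReferenceError | undefined | string:string
```

Only the third form gives a value that is safe to assign to out.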
Functions
An often overlooked capability of ECMAScript is the ability to define and call functions when needed. These could be used to process variables, perform some on-the-fly checking, or perhaps just to save repetitive coding that would otherwise need to appear in multiple Tags.
This simple example shows how to define a function and call it from within a Tag. In this case it is called once for every instance of the $option rule being matched:
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
tag-format <semantics/1.0.2006>;
root $rootrule;

{!{
    // This function appends items to the specified
    // array object, then returns the array object
    function add_to_array(array_to_fill, what_to_add)
    {
        array_to_fill.push(what_to_add);
        return array_to_fill;
    };
}!};

$option = cat | mouse | dog;

$rootrule = {out = new Array();}
    ($option { out = add_to_array(out, rules.latest()); } )<1-3>;
Here, the keyword function is used to declare a function called add_to_array, which simply adds the specified what_to_add value to the specified array_to_fill, which is then returned to the caller.
Note the Tag at the beginning of the $rootrule definition, which initializes the out variable, making it an Array type. This is an important initialization step, since the Tag that follows the $option reference treats the out variable as an array. Without this initialization step, you would encounter an error.
Below is an equivalent grammar in GrXML format, which shows how the same Tags would be defined in that format.
<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/06/grammar
                             http://www.w3.org/TR/speech-grammar/grammar.xsd"
         xml:lang="en-US" version="1.0"
         root="rootrule"
         mode="voice"
         tag-format="semantics/1.0.2006">

    <tag>
        // This function appends items to the specified
        // array object, then returns the array object
        function add_to_array(array_to_fill, what_to_add)
        {
            array_to_fill.push(what_to_add);
            return array_to_fill;
        };
    </tag>

    <rule id="option">
        <one-of>
            <item>cat</item>
            <item>mouse</item>
            <item>dog</item>
        </one-of>
    </rule>

    <rule id="rootrule">
        <tag>out = new Array();</tag>
        <item repeat="1-3">
            <ruleref uri="#option" />
            <tag> out = add_to_array(out, rules.latest()); </tag>
        </item>
    </rule>

</grammar>
An example result for someone saying "cat dog" using either of these grammars would look like this (an array containing the animals spoken):
<item index="0">cat</item>
<item index="1">dog</item>
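The behavior of add_to_array, and the reason the initialization Tag matters, can be checked outside a grammar. The sketch below is plain ECMAScript rather than a grammar: the matched array is a hypothetical stand-in for the successive values of rules.latest(), and the empty object passed in the second call is an assumption about what an uninitialized out would look like to the function.

```javascript
// Same function as in the grammars above.
function add_to_array(array_to_fill, what_to_add) {
  array_to_fill.push(what_to_add);
  return array_to_fill;
}

// Simulates the Tag at the start of $rootrule, followed by one call
// per matched $option. "matched" is a hypothetical stand-in for the
// values that rules.latest() would return for the utterance "cat dog".
var out = new Array();
var matched = ["cat", "dog"];
for (var i = 0; i < matched.length; i++) {
  out = add_to_array(out, matched[i]);
}

// Simulates skipping the initialization step (assumption: out is then
// a plain object). A plain object has no push() method, so the call fails.
var errorName = "none";
try {
  add_to_array({}, "cat");
} catch (err) {
  errorName = err.name;
}

console.log(out.join(","), errorName); // cat,dog TypeError
```

The first part reproduces the two-item array shown in the result above; the second part shows the kind of error the initialization Tag prevents.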
An Example Grammar
After taking some of the above ideas and reminders into consideration, let's review a specific use-case requiring some advanced logic that relies on ECMAScript processing with some of the above techniques and ideas.
Consider a situation where you would like the user to be able to say any combination of up to 3 of "cat", "dog" or "mouse" in any order. This may seem reasonably straight-forward using optional words within a rule and a repeat-operator.
Now, consider how to handle the situation where you don't want to allow the user to repeat any of those words. This takes a little more thought. You could enumerate every permitted sequence (with three animals there are already 15 ordered sequences of one to three distinct words, and the number grows quickly as words are added), or you could use ECMAScript to make the task significantly less complex.
ABNF:
#ABNF 1.0 UTF-8;
language en-US;
mode voice;
tag-format <semantics/1.0.2006>;
root $rootrule;

$animal = cat | mouse | dog;

$animals =
    {
        // This is processed before the animals rule and is used
        // to declare and initialize the variables
        out.selections = '';
        out.num_animals = 0;
        out.res_array = new Array();
        out.skipped = 0;
    }
    ($animal
        // This next section is processed once per result from $animal
        {!{
            if(out.selections.indexOf(rules.latest()) == -1)
            {
                if(out.selections != '')
                    out.selections += ' '; // add spacing between results
                out.selections += rules.latest();
                out.res_array.push(rules.latest());
                out.num_animals++;
            }
            else
            {
                out.skipped++;
            }
        }!}
    )<1-3>;

$rootrule = $animals;
GrXML:
<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/06/grammar
                             http://www.w3.org/TR/speech-grammar/grammar.xsd"
         xml:lang="en-US" version="1.0"
         root="rootrule"
         mode="voice"
         tag-format="semantics/1.0.2006">

    <rule id="animal">
        <one-of>
            <item>cat</item>
            <item>mouse</item>
            <item>dog</item>
        </one-of>
    </rule>

    <rule id="rootrule">
        <tag>
            // This is processed before the rootrule and is used
            // to declare and initialize the variables
            out.selections = '';
            out.num_animals = 0;
            out.res_array = new Array();
            out.skipped = 0;
        </tag>
        <item repeat="1-3">
            <ruleref uri="#animal" />
            <tag>
                // This next section is processed once per result from $animal
                if(out.selections.indexOf(rules.latest()) == -1)
                {
                    if(out.selections != '')
                        out.selections += ' '; // add spacing between results
                    out.selections += rules.latest();
                    out.res_array.push(rules.latest());
                    out.num_animals++;
                }
                else
                {
                    out.skipped++;
                }
            </tag>
        </item>
    </rule>

</grammar>
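Note the use of the built-in function indexOf(), which is used here to determine whether the animal has already been encountered, to prevent duplication. Because indexOf() is called on the out.selections string, it performs a substring search; that is sufficient here because none of the three animal names occurs inside another. The if statement is then used to selectively perform different operations based on whether the animal was a duplicate or not.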
If not a duplicate, a space is added to the out.selections variable if it is not empty (when at least one animal was already spoken), then the new animal is appended to the end. Also, the out.res_array has a new item created to contain the new animal, and the out.num_animals counter is incremented.
If the animal is a duplicate, the code simply increments the out.skipped counter.
All of these properties of out will be included in the interpretation result. Note the Tag before the repeated section of the rule which, as the comments suggest, is used to declare and initialize the variables (which, again, is a very important step to remember).
In both ABNF and GrXML example grammars above, the result of someone speaking "cat dog" would be:
<num_animals>2</num_animals>
<res_array length="2">
    <item index="0">cat</item>
    <item index="1">dog</item>
</res_array>
<selections>cat dog</selections>
<skipped>0</skipped>
As you can see in the results, there are a number of returned parameters, including:
- num_animals - The number of unique animals spoken
- res_array - An array containing an item for each unique animal spoken
- selections - A string of unique animal names, with spaces between
- skipped - The number of skipped (duplicate) animals
Using the same grammars, but with the user saying a repeated animal, such as "cat dog cat" would give a similar result, except that the skipped value would be 1 instead of 0 to indicate that the second cat utterance was a duplicate:
<num_animals>2</num_animals>
<res_array length="2">
    <item index="0">cat</item>
    <item index="1">dog</item>
</res_array>
<selections>cat dog</selections>
<skipped>1</skipped>
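The duplicate-skipping logic from the Tags above can also be exercised as ordinary ECMAScript, which is a convenient way to check it before putting it into a grammar. In this sketch, interpretAnimals is a hypothetical wrapper function, and its words argument stands in for the successive values that rules.latest() would return; neither is part of SISR.

```javascript
// Standalone sketch, not a grammar. interpretAnimals is a hypothetical
// wrapper; "words" stands in for successive rules.latest() values.
function interpretAnimals(words) {
  // Same initialization as the first Tag in the grammars above
  var out = {
    selections: '',
    num_animals: 0,
    res_array: new Array(),
    skipped: 0
  };

  // Same body as the per-animal Tag, run once per matched word
  for (var i = 0; i < words.length; i++) {
    var latest = words[i];
    if (out.selections.indexOf(latest) == -1) {
      if (out.selections != '')
        out.selections += ' '; // add spacing between results
      out.selections += latest;
      out.res_array.push(latest);
      out.num_animals++;
    } else {
      out.skipped++;
    }
  }
  return out;
}

var result = interpretAnimals(["cat", "dog", "cat"]);
console.log(JSON.stringify(result));
// {"selections":"cat dog","num_animals":2,"res_array":["cat","dog"],"skipped":1}
```

The printed object carries the same four values as the "cat dog cat" interpretation result shown above.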
Conclusion
As you can see, by using a number of different techniques and understanding how to leverage the power of ECMAScript within your grammars, you can programmatically achieve complex results with relative ease.