SI script by example

You will probably find it helpful to have read Rule Variables and the previous pages in the Semantic Interpretation section before following along with this example.

On this page, we use the numbers grammar from the introduction to show how to get a semantic interpretation from a grammar.


Tag Format Declaration

#ABNF 1.0;
language en-US;
mode voice;
tag-format <semantics/1.0>;

In our grammar's header, we specify semantics/1.0 as our tag format. This tells the Engine that we are using the final recommendation of SISR 1.0 with SI Script tags (as opposed to string literals). Note that if you have disabled STRICT_SISR_COMPLIANCE, you need to specify semantics/1.0.2006 instead. In that case, to maintain backwards compatibility with older drafts, LumenVox requires the use of "1.0.2006" to access the final version of 1.0. See Tag Formats for details on how to specify tag formats.


Setting Semantic Interpretation

As we build our grammar rules, we need to set the rule variable for any rule that is matched:

$base = (one {out = 1} |
two {out = 2} |
three {out = 3} |
four {out = 4} |
five {out = 5} |
six {out = 6} |
seven {out = 7} |
eight {out = 8} |
nine {out = 9});

$teen = ten {out = 10} |
eleven {out = 11} |
twelve {out = 12} |
thirteen {out = 13} |
fourteen {out = 14} |
fifteen {out = 15} |
sixteen {out = 16} |
seventeen {out = 17} |
eighteen {out = 18} |
nineteen {out = 19};

The identifier out serves as the variable name for the current rule's variable. We are setting the values of the variables for $base and $teen to the numerical representation of the word that was spoken. We are specifying the values as integers.
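Because SI Script tags are ECMAScript, the effect of choosing integers over quoted strings can be tried in plain JavaScript. The following is only a sketch: the variable names are made up for illustration, and the quoted variant is a hypothetical alternative that the grammar above does not use. It shows why numeric values matter once later rules start adding them together.

```javascript
// Tag bodies such as {out = 1} are ECMAScript, so plain JavaScript can show
// why the values are written as numbers rather than quoted strings.
const tensAsNumber = 20;
const baseAsNumber = 1;    // like {out = 1}
const baseAsString = "1";  // like {out = "1"}, a hypothetical variant

console.log(tensAsNumber + baseAsNumber); // 21
console.log(tensAsNumber + baseAsString); // 201, but as a string (concatenation)
```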


Modifying Rule Variables

We can perform operations on the variable identified by out the same way we can on any other variable. We can also access the rule variables of other rules by using rules.rulename, where rulename is the name of the rule whose variable we want to access.

In the $twenty_to_ninetynine rule, we need to add the value of the variable from the $base rule to the rule variable for $twenty_to_ninetynine:

$twenty_to_ninetynine = (
twenty {out = 20} |
thirty {out = 30} |
forty {out = 40} |
fifty {out = 50} |
sixty {out = 60} |
seventy {out = 70} |
eighty {out = 80} |
ninety {out = 90}
) [$base { out += rules.base }];

First, we set out equal to 20, 30, 40, etc., up to 90. Then, if the optional $base rule is matched, the value of $base is added to out.
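The order in which these two tags run can be traced in plain JavaScript. This is a minimal sketch, not LumenVox code: twentyToNinetyNine is a hypothetical helper, and the local rules object merely stands in for the one the Engine provides.

```javascript
// Hypothetical helper: emulates the tags of $twenty_to_ninetynine for a matched
// tens word and an optional $base value. Not part of any LumenVox API.
function twentyToNinetyNine(tensValue, baseValue) {
  const rules = {};          // stands in for the SISR `rules` object
  let out = tensValue;       // first tag, e.g. {out = 40}
  if (baseValue !== undefined) {
    rules.base = baseValue;  // $base matched, so rules.base is now defined
    out += rules.base;       // second tag: { out += rules.base }
  }
  return out;
}

console.log(twentyToNinetyNine(40, 2)); // "forty two" gives 42
console.log(twentyToNinetyNine(70));    // "seventy" gives 70
```

When the optional $base is not spoken, the second tag never runs, so out keeps the value set by the first tag.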


The rules.latest() Object

So far we have seen that a rule's variable can be referenced by rules.rulename after that rule has been matched. Sometimes, when a rule contains many alternative rule references, it is cumbersome to reference each one by name. Other times, a matched rule can't be referenced at all. For these reasons, the rules.latest() object exists. The rules.latest() object always holds the value of the most recently matched rule. Using rules.latest(), we can write the $tens, $hundred, and $small_number rules like this:

$tens = (
$base |
$teen |
$twenty_to_ninetynine
) { out = rules.latest() };

$hundred = (
[a] hundred {out = 100} |
$base hundred {out = 100 * rules.base}
);

$small_number = $hundred {out = rules.latest()}
[[and] $tens {out += rules.latest()}] |
$tens { out = rules.latest() };
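To see how rules.latest() changes as matching proceeds, the phrase "two hundred and five" can be traced by hand in plain JavaScript. The sketch below makes assumptions: makeRules and record are hypothetical helpers that imitate the bookkeeping the Engine does for us, and the values 200 and 5 are what $hundred and $tens would produce for this phrase.

```javascript
// Hypothetical stand-in for the SISR `rules` object: it records each matched
// rule's value by name and remembers which one was matched most recently.
function makeRules() {
  let latestValue;
  const rules = {
    latest: () => latestValue,
  };
  // record() is an illustrative helper, not part of SISR.
  rules.record = (name, value) => {
    rules[name] = value;
    latestValue = value;
  };
  return rules;
}

// "two hundred and five", following $small_number step by step.
const rules = makeRules();
rules.record("hundred", 200); // $hundred matched: 100 * rules.base
let out = rules.latest();     // {out = rules.latest()}
rules.record("tens", 5);      // $tens matched through $base
out += rules.latest();        // {out += rules.latest()}
console.log(out);             // 205
```

The same call returns 200 the first time and 5 the second time, because a different rule was the most recent match at each point.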


Composite Return Types

Our small numbers grammar now returns an integer from the $small_number rule. Sometimes, however, we want the result to carry more than one piece of information. By default, grammar rules return an object type, and object types can have additional properties.

For example, we may also want to know the exact phrase that was spoken, possibly for transcription or reading the text back to the speaker. Each rule reference also has a corresponding meta variable, with a property called "text."

We access meta variables similarly to how we access variables using rules.rulename. To get the text for a specific rule, use meta.rulename.text. To get the text for the current rule, use meta.current().text.

The following change to our grammar creates a composite return type containing the text that was spoken, and the numeric representation of that text.

root $small_number_and_text;

$small_number_and_text = $small_number {
out.number = rules.latest();
out.text = meta.current().text
};

Now a successful grammar match returns an object with two member properties, number and text.
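As a rough picture of that result, the two assignments can be replayed in plain JavaScript. This is a sketch under stated assumptions: smallNumberValue and spokenText are hypothetical stand-ins for rules.latest() and meta.current().text, and the phrase "one hundred and twenty one" is just an example utterance.

```javascript
// Hypothetical inputs: the value $small_number produced and the words matched.
const smallNumberValue = 121;
const spokenText = "one hundred and twenty one";

const out = {};                // the rule variable is assumed to start as an empty object
out.number = smallNumberValue; // out.number = rules.latest();
out.text = spokenText;         // out.text = meta.current().text
console.log(JSON.stringify(out));
// {"number":121,"text":"one hundred and twenty one"}
```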

Here is the complete grammar:

#ABNF 1.0;
language en-US;
mode voice;
tag-format <semantics/1.0>;

root $small_number_and_text;

$base = (one {out = 1} | two {out = 2} | three {out = 3} | four {out = 4} | five {out = 5} | six {out = 6} | seven {out = 7} | eight {out = 8} | nine {out = 9});

$teen = ten {out = 10} | eleven {out = 11} | twelve {out = 12} | thirteen {out = 13} | fourteen {out = 14} | fifteen {out = 15} | sixteen {out = 16} | seventeen {out = 17} | eighteen {out = 18} | nineteen {out = 19};

$twenty_to_ninetynine = (twenty {out = 20}|thirty {out = 30}|forty {out = 40}|fifty {out = 50}|sixty {out = 60}|seventy {out = 70}|eighty {out = 80}|ninety {out = 90}) [$base { out += rules.base }];

$tens = ($base|$teen|$twenty_to_ninetynine) { out = rules.latest() };

$hundred = ([a] hundred {out = 100} | $base hundred {out = 100 * rules.base});

$small_number = $hundred {out = rules.latest()} [[and] $tens {out += rules.latest()}] | $tens { out = rules.latest() };

$small_number_and_text = $small_number {out.number = rules.latest(); out.text = meta.current().text};



Copyright (C) 2001-2024, Ai Software, LLC d/b/a LumenVox