Parser grammar power to the power of two!
In my last blog post, I described how we have to do a bit of grammar analysis with the parser in order to properly interpret more complex sentences. If you have not read the article, you should do so now, as the following will build upon what I described last time.
The solution I offered last time detects and processes subjects and objects that may be decorated with an adjective. The problem is that not all adjectives are inherently adjectives. In many instances, we will detect them as nouns, and only the context reveals that they are being used as adjectives. The term “lamp oil” is a perfect example of this because it describes one subject consisting of two nouns.
In a minimalistic parser, when entering “look at the lamp oil,” the software would interpret it as “look at the lamp,” followed by the word “oil” without further reference. The parser could theoretically infer that we wanted to enumerate the nouns and interpret it as “look at the lamp, look at the oil,” but neither actually reflects the original intention.
In my last installment I solved this problem by inserting a stage in the parser that was looking for specific word combinations to identify such decorated nouns by checking if the first noun is “lamp” and the second noun is “oil” and then turning it into a single noun named “lamp_oil.” The solution is perfectly feasible and valid. But is it good?
No, not really, for a number of reasons. First of all, because the substitution happens after the entire sentence has been processed, the slot for theNoun2 has already been used up, which means a third noun would not be processed. (Note: In its current implementation, my parser only processes two nouns, something that can be easily rectified, but I’m just trying to make a point here.) As a result, a more complex sentence like
get the lamp oil from under the bed
would not be correctly processed because it contains three nouns. That’s not good.
Secondly, and perhaps even more importantly, the command the player entered could have been
dip the lamp in the oil
If the parser did the methodical substitution described above, we would end up with an interpreted command that says
dip the lamp_oil with
Clearly, grammar is not as simple as just that. A bit more work is needed to really nail down the inherent meaning of sentences.
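To make both failure modes concrete, here is a minimal sketch of the after-the-fact substitution. It is not the parser from the previous article: the `NOUNS` set and the `naive_parse` function are hypothetical stand-ins, and the two-noun limit is simulated by simply cutting the list off after two entries.

```python
# Hypothetical noun list; the real vocabulary is much larger.
NOUNS = {"lamp", "oil", "bed"}

def naive_parse(command):
    """Collect at most two nouns, then merge lamp + oil after the fact."""
    # Only two noun slots exist, so a third noun is silently dropped.
    nouns = [word for word in command.lower().split() if word in NOUNS][:2]
    # The merge only looks at which nouns were found, not where they stood.
    if nouns == ["lamp", "oil"]:
        nouns = ["lamp_oil"]
    return nouns

print(naive_parse("get the lamp oil from under the bed"))  # ['lamp_oil'], the bed is lost
print(naive_parse("dip the lamp in the oil"))              # ['lamp_oil'], wrongly merged
```

Both commands collapse into the same result, even though only the first one actually talks about lamp oil.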
The sequence of words in a sentence is important. We know that in the term “lamp oil” the “lamp” part always comes directly before the “oil” part. Always. There’s no separating them. If the two words do not follow each other immediately, the meaning changes and the phrase no longer refers to “lamp oil.”
This brings us to our first improvement. If we keep track of the order in which words are being processed, we can then check if two words follow each other. Voilà, the solution.
In order to do this, we create a counter that is incremented every time the parser processes a new word. We then assign that serial number to the respective verb, noun, preposition, adjective, etc. as they are being tokenized.
    if Vocab [ _cleanWord ] [ "type" ] == WordType.Verb:
        if not globals.theVerb:                                     # If no verb found yet
            globals.theVerb = Vocab [ _cleanWord ] [ "meaning" ]
            globals.theVerbString = self.TokenLookup ( globals.theVerb )
            globals.theVerbSerial = _serial
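The snippet above only shows the verb branch, so here is a small, self-contained sketch of the numbering idea on its own. The `VOCAB` table, the `Token` class and the `tokenize` function are hypothetical and not part of my parser. I am also assuming here that the counter advances for every word in the input, including ones the vocabulary does not know, such as “the.”

```python
from dataclasses import dataclass

# Hypothetical miniature vocabulary; the real Vocab table is not shown here.
VOCAB = {
    "get": "verb", "dip": "verb",
    "lamp": "noun", "oil": "noun", "bed": "noun",
    "in": "prep", "from": "prep", "under": "prep",
}

@dataclass
class Token:
    word: str
    kind: str
    serial: int   # position of the word in the player's input, starting at 1

def tokenize(command):
    """Return the known words of a command, each tagged with its serial number."""
    tokens = []
    serial = 0
    for raw in command.lower().split():
        serial += 1                  # assumption: every word advances the counter
        kind = VOCAB.get(raw)
        if kind is not None:         # unknown words are skipped but still counted
            tokens.append(Token(raw, kind, serial))
    return tokens

for token in tokenize("dip the lamp in the oil"):
    print(token.serial, token.kind, token.word)
```

In this sketch “lamp” receives serial 3 and “oil” receives serial 6, so the two nouns are clearly not neighbors. In “get the lamp oil” they would receive 3 and 4.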
With this in place, we can now perform specific checks to identify subjects that are decorated by adjectives or another noun, as I do in the following example with the lamp oil.
    if Tokens.Lamp == globals.theNoun and Tokens.Oil == globals.theNoun2:
        if globals.theNounSerial == globals.theNoun2Serial - 1:
            globals.theNoun = Tokens.LampOil
            globals.theNounString = self.TokenLookup ( Tokens.LampOil )
            globals.theNoun2 = None
            globals.theNoun2String = None
            globals.theNoun2Serial = 0
This approach prevents a lot of misunderstandings and addresses both problems I mentioned earlier. Because we are analyzing the sequence of words, we can now move this check out of the Grammar() function, which runs after all words have been processed, and into the InstaCheck() function instead. The immediate benefit is that upon encountering such a compound word, the parser immediately frees up the second noun, making room for more words to be processed. This makes it possible to correctly interpret a command like
get the lamp oil from under the bed
Because these kinds of checks follow the same structure over and over again, I decided to create a general, parametrized method for it that can be easily called from within the parser.
    def UnifyTwoNouns ( self, token1, token2, token3 ):
        if token1 == globals.theNoun and token2 == globals.theNoun2:
            if globals.theNounSerial == globals.theNoun2Serial - 1:
                globals.theNoun = token3
                globals.theNounString = self.TokenLookup ( token3 )
                globals.theNoun2 = None
                globals.theNoun2String = None
                globals.theNoun2Serial = 0
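If you want to try the adjacency check in isolation, the following sketch mirrors the method above without the rest of the parser. The `ParseState` class is a hypothetical stand-in for my globals module, plain strings replace the token constants, and the serial numbers are filled in by hand for the two example sentences.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ParseState:
    """Hypothetical stand-in for the globals module used in the parser."""
    noun: Optional[str] = None
    noun_serial: int = 0
    noun2: Optional[str] = None
    noun2_serial: int = 0

def unify_two_nouns(state, first, second, merged):
    """Merge first + second into merged only if they were adjacent in the input."""
    if state.noun == first and state.noun2 == second:
        if state.noun_serial == state.noun2_serial - 1:
            state.noun = merged
            state.noun2 = None        # free the second slot for another noun
            state.noun2_serial = 0
            return True
    return False

adjacent = ParseState("lamp", 3, "oil", 4)    # "get the lamp oil"
separated = ParseState("lamp", 3, "oil", 6)   # "dip the lamp in the oil"

unify_two_nouns(adjacent, "lamp", "oil", "lamp_oil")
unify_two_nouns(separated, "lamp", "oil", "lamp_oil")

print(adjacent.noun, adjacent.noun2)      # lamp_oil None
print(separated.noun, separated.noun2)    # lamp oil
```

Only the first state is merged. In the second one, the serial numbers are not consecutive, so both nouns stay untouched.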
To show you what it looks like, here is a snippet from my InstaCheck() function. See how neat and clean this is? Easy to maintain, easy to add new compound words to, and easy to extend with even more logic.
    def InstaCheck ( self ):
        """
        Check for word combinations that can be instantly replaced, while still parsing the input
        """
        if Tokens.In == globals.thePrep and Tokens.Front == globals.theAdjective:    # In front
            globals.thePrep = Tokens.Before
            globals.thePrepString = self.TokenLookup ( Tokens.Before )
            globals.theAdjective = None
            globals.theAdjectiveString = None

        if Tokens.It == globals.theNoun:                                             # Handle IT
            globals.theNoun = globals.theLastNoun
            globals.theNounString = globals.theLastNounString

        self.UnifyTwoNouns ( Tokens.Lamp, Tokens.Oil, Tokens.LampOil )               # Lamp oil
        self.UnifyTwoNouns ( Tokens.Jewelry, Tokens.Box, Tokens.JewelryBox )         # Jewelry box
        self.UnifyAdjNoun( Tokens.Brass, Tokens.Key, Tokens.BrassKey )               # Brass Key
        self.UnifyAdjNoun( Tokens.Small, Tokens.Key, Tokens.SmallKey )               # Small Key
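Since every compound boils down to the same three values, one possible extension would be to keep them in a table and loop over it. This is not how my parser does it. The sketch below is a simplified variation that works on a plain list of (word, serial) pairs instead of the noun and adjective slots, and the `COMPOUNDS` table and `merge_compounds` function are hypothetical names.

```python
# Hypothetical rule table: (first word, second word) -> merged token name.
COMPOUNDS = {
    ("lamp", "oil"): "lamp_oil",
    ("jewelry", "box"): "jewelry_box",
    ("brass", "key"): "brass_key",
    ("small", "key"): "small_key",
}

def merge_compounds(tokens):
    """Merge known word pairs that directly follow each other.

    tokens is a list of (word, serial) pairs in input order.
    """
    merged = []
    for word, serial in tokens:
        if merged:
            prev_word, prev_serial = merged[-1]
            compound = COMPOUNDS.get((prev_word, word))
            # Same rule as before: the serial numbers must be consecutive.
            if compound is not None and prev_serial == serial - 1:
                merged[-1] = (compound, prev_serial)
                continue
        merged.append((word, serial))
    return merged

print(merge_compounds([("get", 1), ("lamp", 3), ("oil", 4), ("bed", 8)]))
print(merge_compounds([("dip", 1), ("lamp", 3), ("oil", 6)]))
```

Adding a new compound then becomes a one-line change to the table. The sketch only handles pairs, so a compound made of three words would need additional logic.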
There you go… our parser just got a whole lot smarter yet again. It’s not some abstract, behind-the-scenes improvement but something that directly affects the player’s experience, because the parser’s grammar functions will misinterpret player input far less frequently. At the same time, they automate a lot of the logic for the game designer, who doesn’t have to think about grammar pitfalls and can instead focus on simple meaning.
This way you can easily parse even the most complex commands, such as “Use the trowel to plant the pot plant in the plant pot,” an example that text adventure developer Magnetic Scrolls reportedly used to show off its parser’s prowess.
2 Replies to “Parser grammar power to the power of two!”
Hi …came across your site when searching for text adventure parsers.
I am prototyping a Traveller RPG game. What I am trying to do is create a system where the game master can create adventures using text files to define the environment, places, people, objectives, and how to achieve the objectives.
That sounds very interesting. Something like that, I would probably approach using YAML, which allows you to have an easy, deterministic approach to your data management.