A view from the Treetops: Creating parser grammars on the fly

Posted by Matt Williams

Treetop is, as they say on their website, "... a language for describing languages. Combining the elegance of Ruby with cutting-edge parsing expression grammars, it helps you analyze syntax with revolutionary ease."

In working on my game library, I've been experimenting with Treetop for parsing user input. At some point I want to be able to say: create Militia of size 100. To do this, I'd use a grammar like this (contrived):

grammar CreateUnit
  rule create_unit
     "create " unit_type " of size " number
  end
  rule number
    ([1-9] [0-9]* / '0')
  end
  rule unit_type
    [A-Z] [a-z]+
  end
end

Pretty simple rule, eh? There's one problem: my rule for unit_type accepts unit types which don't actually exist. And since the unit types are defined at runtime by a YAML file (for information on how we're doing this see Classes on the Fly), we can't say ahead of time what our unit types will be. Nor is Treetop set up very well for defining parsers on the fly: it can load grammars from files at runtime, but it can't compile one without going through the filesystem. Hence this addition:

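require 'treetop'

# Extension: compile a Treetop grammar from a string instead of a file.
module Treetop
  module Compiler
    class GrammarCompiler
      # Translate grammar source into the Ruby source of its parser.
      def ruby_string(s)
        parser = MetagrammarParser.new
        result = parser.parse(s)
        # A nil result means the grammar source itself failed to parse.
        raise RuntimeError, parser.failure_reason unless result
        result.compile
      end
    end
  end

  # Compile the grammar and define the resulting parser at the top level.
  def self.compile_string(s)
    compiler = Treetop::Compiler::GrammarCompiler.new
    Object.class_eval(compiler.ruby_string(s))
  end
end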
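
The last step of compile_string is plain Ruby: the compiler hands back Ruby source as a string, and Object.class_eval evaluates it so that the generated constants land at the top level. Here is a minimal sketch of just that mechanism, with no Treetop involved. GreeterParser and its source are made up for illustration; they stand in for whatever the grammar compiler would actually generate.

```ruby
# Hypothetical stand-in for generated parser source. With Treetop, this
# string would come from the grammar compiler rather than a heredoc.
generated_source = <<~RUBY
  class GreeterParser
    # Mimics the parser convention: return a result on success, nil on failure.
    def parse(input)
      input == "hello" ? input : nil
    end
  end
RUBY

# Evaluating the string in the context of Object defines GreeterParser
# as a top-level constant, just like a class written in a file.
Object.class_eval(generated_source)

parser = GreeterParser.new
p parser.parse("hello")
p parser.parse("goodbye")
```

Evaluating generated source this way runs arbitrary Ruby, so it only makes sense when the grammar string comes from a trusted place, such as your own config files.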

To use it, we'd do something like this:

$ irb
>> require 'rubygems'
=> false
>> require 'war'
=> true
>> require 'treetop'
=> false
>> War::Utils::Loader.load_classes("../config/units.yml",War::Unit)
militia
rifleman
scout
landship
walker
=> [{"name"=>"militia", "template"=>"soldier", "attributes"=>{"strength"=>50, "symbol"=>"m", "cost"=>50, "defence"=>10, "attack"=>-10}}, {"name"=>"rifleman", "template"=>"soldier", "attributes"=>{"strength"=>100, "symbol"=>"r", "cost"=>125, "defence"=>30, "attack"=>30}}, {"name"=>"scout", "template"=>"scout", "attributes"=>{"strength"=>20, "symbol"=>"s", "cost"=>200, "defence"=>10, "attack"=>10}}, {"name"=>"landship", "template"=>"tank", "attributes"=>{"strength"=>1500, "symbol"=>"L", "cost"=>2000, "defence"=>200, "attack"=>200}}, {"name"=>"walker", "template"=>"tank", "attributes"=>{"strength"=>500, "cost"=>1250, "symbol"=>"w", "movement"=>2, "defence"=>100, "vision"=>2, "attack"=>100}}]
>> s = <<EOF
grammar UnitType
rule unit_type
#{War::Unit.subclasses.map{|c|"'#{c.to_s}'"}.join(" / ")}
end
end
EOF
=> "grammar UnitType\nrule unit_type\n'Walker' / 'Landship' / 'Scout' / 'Rifleman' / 'Militia'\nend\nend\n"
>> require 'treetop_extensions'
=> true
>> Treetop.compile_string(s)
=> UnitTypeParser
>> parser = UnitTypeParser.new
=> #<UnitTypeParser:0xb76ee2c4 @consume_all_input=true>
>> parser.parse("Foo")
=> nil
>> parser.parse("Militia")
=> SyntaxNode offset=0, "Militia"
>> 
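
If you build the grammar string in more than one place, the heredoc is worth pulling into a helper. The sketch below is one way to do that. unit_type_grammar and its keyword arguments are my own invention for illustration, not part of Treetop or the game library. It also sorts longer names first: choice in a parsing expression grammar is ordered, so if one name is a prefix of another (say "Scout" and "Scoutmaster"), the shorter one listed first would match and the longer one could never be recognised.

```ruby
# Hypothetical helper: builds Treetop grammar source for a rule that
# matches any one of the given names.
def unit_type_grammar(names, grammar_name: "UnitType", rule_name: "unit_type")
  names = names.map(&:to_s).uniq
  raise ArgumentError, "need at least one unit type" if names.empty?

  # Keep to simple identifiers so nothing needs escaping inside the quotes.
  bad = names.reject { |n| n.match?(/\A[A-Za-z]\w*\z/) }
  raise ArgumentError, "unsupported names: #{bad.inspect}" unless bad.empty?

  # Ordered choice: longest names first, ties broken alphabetically.
  alternatives = names.sort_by { |n| [-n.length, n] }
                      .map { |n| "'#{n}'" }
                      .join(" / ")

  <<~GRAMMAR
    grammar #{grammar_name}
      rule #{rule_name}
        #{alternatives}
      end
    end
  GRAMMAR
end

puts unit_type_grammar(%w[Militia Scout Scoutmaster])
```

In the session above, the names would come from War::Unit.subclasses, and the returned string would be passed to Treetop.compile_string.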

So, we can now parse unit types. Let's change our grammar to reflect this:

grammar CreateUnit
  include UnitType
  rule create_unit
     "create " unit_type " of size " number
  end
  rule number
    ([1-9] [0-9]* / '0')
  end
end

Let's see it in practice (assume everything's been loaded...):

>> p = CreateUnitParser.new
=> #<CreateUnitParser:0xb76c845c @consume_all_input=true>
>> p.parse("create Militia of size 100")
=> SyntaxNode+CreateUnit0 offset=0, "... Militia of size 100" (number,unit_type):
  SyntaxNode offset=0, "create "
  SyntaxNode offset=7, "Militia"
  SyntaxNode offset=14, " of size "
  SyntaxNode+Number0 offset=23, "100":
    SyntaxNode offset=23, "1"
    SyntaxNode offset=24, "00":
      SyntaxNode offset=24, "0"
      SyntaxNode offset=25, "0"
>> p.parse("create Foo of size -1")
=> nil
>> 
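
For a quick sanity check of which inputs ought to be accepted, the same command shape can be written as a plain Regexp built at runtime. To be clear, this is a different technique from the Treetop grammar, shown only as a cross-check. UNIT_NAMES is a hard-coded stand-in for the names that War::Unit.subclasses would supply, and parse_create is a made-up helper.

```ruby
# Hypothetical stand-in for the unit types loaded from the YAML file.
UNIT_NAMES = %w[Walker Landship Scout Rifleman Militia].freeze

# Regexp.union escapes each name and joins them into one alternation.
# The number part mirrors the grammar rule: no leading zeros, or a lone "0".
CREATE_UNIT = /\Acreate (?<unit_type>#{Regexp.union(UNIT_NAMES)}) of size (?<number>[1-9][0-9]*|0)\z/

# Returns a hash of the captured values, or nil if the command is rejected.
def parse_create(command)
  m = CREATE_UNIT.match(command)
  m && { unit_type: m[:unit_type], size: Integer(m[:number]) }
end

p parse_create("create Militia of size 100")
p parse_create("create Foo of size -1")
```

A regular expression is fine for one flat command like this. The grammar earns its keep once commands nest or share rules, which is where include UnitType comes in.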

So, there you have it: we've created a parser grammar on the fly, one that only accepts the unit types loaded at runtime.