Pattern Engines
Jul. 5th, 2005 11:05 pm
Python doesn't allow assignment inside a condition, which rules out the "if ( $x = "yes" ) { }" type errors that are oh so easy in Perl, C, whatever. Unfortunately, it makes the task of writing a text chomper a bit of a chore, since it's not possible to write:
    for line in open("input").readlines():
        if m = re.match("foo(\\d+)", line):
            print("Foo: " + m.group(1))
        elif m = re.match("bar([a-z]+)", line):
            print("Bar: " + m.group(1))
Instead, you've got to go all:
    import re

    for line in open("input").readlines():
        m = re.match("foo(\\d+)", line)
        if m:
            print("Foo: " + m.group(1))
        m = re.match("bar([a-z]+)", line)
        if m:
            print("Bar: " + m.group(1))
My lazy solution - a trivial pattern matching class. Write once, use many:

    import re

    class Matcher:
        def __init__(self):
            self.patterns = {}

        def add(self, key, pattern):
            # Compile each pattern once, when it is registered
            if key not in self.patterns:
                self.patterns[key] = re.compile(pattern)
            else:
                raise KeyError("Key %s already in use" % key)

        def __call__(self, line):
            # Remember the first key whose pattern matches, plus its match object
            self.key, self.match = "", None
            for p in self.patterns.keys():
                m = self.patterns[p].search(line)
                if m:
                    self.key, self.match = p, m
                    return ( p, m )
            return None
Then to actually use the thing, all that is required is:
    from mymatch import Matcher

    x = Matcher()
    x.add("foo", "foo(\\d+)")
    x.add("bar", "bar([a-z]+)")

    for line in open("input").readlines():
        x(line)
        if x.key == "foo":
            print("Foo: " + x.match.group(1))
        elif x.key == "bar":
            print("Bar: " + x.match.group(1))
Ok, so in my rather contrived example, it's not actually all that much simpler than the original, but you get the general idea. The process of tuning the code to get half-decent performance out of it is left as an exercise for the reader.
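For what it's worth, one possible direction for that tuning is to fold all the patterns into a single alternation of named groups, so each line gets one search instead of one per pattern. What follows is a rough sketch and not something I've benchmarked, so whether it is actually faster is an open question. The names CombinedMatcher, group_counts and groups are made up for this sketch. It assumes the keys are valid group names (identifier-like, such as "foo") and that the patterns don't use numbered backreferences or named groups of their own. It also behaves slightly differently from the class above: the leftmost match in the line wins, rather than the first pattern that matches anywhere.

```python
import re


class CombinedMatcher:
    """Hypothetical variant of Matcher that searches with one combined regex."""

    def __init__(self):
        self.patterns = {}      # key -> pattern source string
        self.group_counts = {}  # key -> number of capture groups in that pattern
        self.regex = None       # combined regex, built lazily on first call

    def add(self, key, pattern):
        if key in self.patterns:
            raise KeyError("Key %s already in use" % key)
        self.patterns[key] = pattern
        self.group_counts[key] = re.compile(pattern).groups
        self.regex = None       # force a rebuild on the next call

    def __call__(self, line):
        self.key, self.groups = "", ()
        if not self.patterns:
            return None
        if self.regex is None:
            # Wrap each pattern in a named group: (?P<foo>foo(\d+))|(?P<bar>...)
            self.regex = re.compile("|".join(
                "(?P<%s>%s)" % (k, p) for k, p in self.patterns.items()))
        m = self.regex.search(line)
        if m is None:
            return None
        # lastgroup names the wrapper group of the alternative that matched
        key = m.lastgroup
        # The pattern's own capture groups sit right after its wrapper group
        start = self.regex.groupindex[key]
        self.key = key
        self.groups = m.groups()[start:start + self.group_counts[key]]
        return (self.key, self.groups)


x = CombinedMatcher()
x.add("foo", "foo(\\d+)")
x.add("bar", "bar([a-z]+)")

for line in ["foo42", "a barxyz", "neither"]:
    if x(line):
        print(x.key + ": " + x.groups[0])
```

Since the group numbers shift once the patterns are glued together, this version hands back a tuple of the matched pattern's own groups (x.groups) rather than the raw match object, so x.groups[0] plays the role of x.match.group(1).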