Introduction


I've always been lex/yacc-challenged, but several times in my career I have been thrust into the position of having to parse text of varying form and style. I've had to re-implement an algebraic infix notation parser on three or four separate occasions, and I've had to process listing and equipment control input files and output reports in a number of different environments and implementation languages.

I've had much the same feelings about regexps as about lex/yacc: I was reluctant to introduce yet another technology and syntax for representing data formats in program code.

I finally decided to implement a class library that I could use to easily and directly configure syntax definitions for any number of text parsing applications. Coincidentally, at about the same time, an old friend of mine, Steve Metsker, was just finishing his book, Building Parsers with Java, which takes a very similar approach to this problem.

I recently took up Python and, as an exercise, decided to try implementing such a class library in it. This turned out to be a much more pleasant project overall, especially since the Python language provides some native features that lend themselves well to this kind of work (a native dictionary type, operator overloading, and dynamic object attributes). Please note: pyparsing is NOT a Python implementation or port of the Java class framework in Steve Metsker's book.
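
To give a feel for why operator overloading suits this kind of library, here is a minimal sketch of a grammar assembled from ordinary Python expressions. The classes below (Element, Literal, Sequence, Choice) are toy stand-ins invented for this illustration. They are not pyparsing's actual classes or API, and a real library would handle much more (whitespace, error reporting, repetition).

```python
class Element:
    """Hypothetical base class: overloads + and | to combine matchers."""

    def __add__(self, other):
        # a + b means "match a, then match b"
        return Sequence(self, other)

    def __or__(self, other):
        # a | b means "match a, or else match b"
        return Choice(self, other)


class Literal(Element):
    """Matches one fixed string at the current position."""

    def __init__(self, text):
        self.text = text

    def parse(self, s, pos=0):
        # Return (tokens, new_position) on success, None on failure.
        if s.startswith(self.text, pos):
            return [self.text], pos + len(self.text)
        return None


class Sequence(Element):
    """Matches two elements one after the other."""

    def __init__(self, first, second):
        self.first, self.second = first, second

    def parse(self, s, pos=0):
        result = self.first.parse(s, pos)
        if result is None:
            return None
        tokens1, pos = result
        result = self.second.parse(s, pos)
        if result is None:
            return None
        tokens2, pos = result
        return tokens1 + tokens2, pos


class Choice(Element):
    """Tries the first element, falling back to the second."""

    def __init__(self, first, second):
        self.first, self.second = first, second

    def parse(self, s, pos=0):
        result = self.first.parse(s, pos)
        return result if result is not None else self.second.parse(s, pos)


# The grammar reads almost like the syntax it describes.
greeting = (Literal("Hello") | Literal("Goodbye")) + Literal(", ") + Literal("World")

print(greeting.parse("Goodbye, World"))
print(greeting.parse("Hi, World"))
```

The parentheses around the alternation are needed because Python's + binds more tightly than |. The point of the sketch is only that the syntax definition lives directly in program code, with no separate grammar file or generator step.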