Latest Pyparsing News:

August 15, 2016 - Pyparsing 2.1.8 released

  • Fixed issue in the optimization to _trim_arity, when the full stacktrace is retrieved to determine if a TypeError is raised in pyparsing or in the caller's parse action. Code was traversing the full stacktrace, and potentially encountering UnicodeDecodeError.
  • Fixed bug in ParserElement.inlineLiteralsUsing, causing infinite loop with Suppress.
  • Fixed bug in Each, when merging named results from multiple expressions in a ZeroOrMore or OneOrMore. Also fixed bug when ZeroOrMore expressions were erroneously treated as required expressions in an Each expression.
  • Added a few more inline doc examples. Improved use of runTests in several example scripts.

August 11, 2016 - Pyparsing 2.1.7 released

  • Fixed regression introduced in 2.1.6, reported by Andrea Censi (surfaced in PyContracts tests) when using ParseSyntaxExceptions (raised when using operator '-') with packrat parsing.
  • Minor fix to oneOf, to accept all iterables, not just space-delimited strings and lists. (If you have a list or set of strings, it is not necessary to concat them using ' '.join to pass them to oneOf, oneOf will accept the list or set or generator directly.)
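
A minimal sketch of the oneOf change described above, assuming a pyparsing release that still provides the camelCase names used in these notes (the color names are only illustrative):

```python
from pyparsing import oneOf

# A list or a generator can be passed directly, no ' '.join needed.
from_list = oneOf(["red", "green", "blue"])
from_gen = oneOf(name.lower() for name in ("RED", "GREEN", "BLUE"))

# The older space-delimited string form still works.
from_str = oneOf("red green blue")

for expr in (from_list, from_gen, from_str):
    print(expr.parseString("green").asList())
```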

August 9, 2016 - Pyparsing 2.1.6 released

  • *Major packrat upgrade*, inspired by patch provided by Tal Einat - many, many, thanks to Tal for working on this! Tal's tests show faster parsing performance (2X in some tests), *and* memory reduction from 3GB down to ~100MB! Requires no changes to existing code using packratting.
  • Minor API change - to better distinguish between the flexible numeric types defined in pyparsing_common, I've changed "numeric" (which parsed numbers of different types and returned int for ints, float for floats, etc.) and "number" (which parsed numbers of int or float type, and returned all floats) to "number" and "fnumber" respectively. Also fixed a bug in pyparsing_common.numeric (now renamed to pyparsing_common.number), integers were parsed and returned as floats instead of being retained as ints.
  • Fixed bug in upcaseTokens and downcaseTokens introduced in 2.1.5, when the parse action was used in conjunction with results names.
  • Major change to docs! Substantial expansion of inline docs, including example code.
  • Deprecated ParseResults.asXML, to be removed in a future release
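
The difference between the renamed "number" and "fnumber" expressions can be checked with a short sketch like the following (assuming the post-rename names in pyparsing_common; the sample inputs are arbitrary):

```python
from pyparsing import pyparsing_common as ppc

samples = ("42", "3.14")

# number keeps ints as ints and floats as floats.
as_number = [ppc.number.parseString(s)[0] for s in samples]

# fnumber converts everything it parses to float.
as_fnumber = [ppc.fnumber.parseString(s)[0] for s in samples]

print([type(v).__name__ for v in as_number])
print([type(v).__name__ for v in as_fnumber])
```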

June 13, 2016 - Pyparsing 2.1.5 released

  • Added ParserElement.split() generator method, similar to re.split().
  • Added a new parse action construction helper tokenMap, which will apply a function and optional arguments to each element in a ParseResults. So this parse action:
      def lowercase_all(tokens):
          return [str(t).lower() for t in tokens]

    can now be written:

      lowercase_all = tokenMap(str.lower)

  • Added more expressions to pyparsing_common:
    • IPv4 and IPv6 addresses (including long, short, and mixed forms of IPv6)
    • MAC address
    • ISO8601 date and date time strings (with named fields for year, month, etc.)
    • UUID (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
    • hex integer (returned as int)
    • fraction (integer '/' integer, returned as float)
    • mixed integer (integer '-' fraction, or just fraction, returned as float)
    • stripHTMLTags (parse action to remove tags from HTML source)
    • parse action helpers convertToDate and convertToDatetime to do custom parse time conversions of parsed ISO8601 strings

  • New example showing samples of parsing integer and real numbers using locale-dependent formats:
    4 294 967 295,000  
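
As a rough illustration of the 2.1.5 additions listed above, here is a small sketch exercising split(), tokenMap, and a few pyparsing_common expressions. It assumes a release where these names are available under the spellings shown; the input strings are made up for the example:

```python
from datetime import date
from pyparsing import Literal, OneOrMore, Word, alphas, hexnums, tokenMap
from pyparsing import pyparsing_common as ppc

# split() yields the text between matches, much like re.split().
pieces = list(Literal(",").split("a,b,c"))

# tokenMap builds a parse action that applies a function to every token.
lower_words = OneOrMore(Word(alphas)).setParseAction(tokenMap(str.lower))
words = lower_words.parseString("Hello WORLD").asList()

# Extra arguments are passed on to the function, here int(token, 16).
hex_value = Word(hexnums).setParseAction(tokenMap(int, 16)).parseString("ff")[0]

# A few of the pyparsing_common expressions.
half = ppc.fraction.parseString("1/2")[0]
mixed = ppc.mixed_integer.parseString("1-3/4")[0]

# convertToDate turns a parsed ISO8601 date string into a datetime.date.
iso_date = ppc.iso8601_date.copy().setParseAction(ppc.convertToDate())
day = iso_date.parseString("1999-12-31")[0]

print(pieces, words, hex_value, half, mixed, day)
```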

May 13, 2016 - Pyparsing 2.1.4 released

  • Split out the '==' behavior in ParserElement, now implemented as the ParserElement.matches() method. Using '==' for string test purposes will be removed in a future release.
  • Expanded capabilities of runTests(). Will now accept embedded comments (default is Python style, leading '#' character, but customizable). Comments will be emitted along with the tests and test output. Useful during test development, to create a test string consisting only of test case description comments separated by blank lines, and then fill in the test cases. Will also highlight ParseFatalExceptions with "(FATAL)".
  • Added a 'pyparsing_common' class containing common/helpful little expressions such as integer, float, identifier, etc. I used this class as a sort of embedded namespace, to contain these helpers without further adding to pyparsing's namespace bloat.
  • Minor enhancement to traceParseAction decorator, to retain the parse action's name for the trace output.
  • Added optional 'fatal' keyword arg to addCondition, to indicate that a condition failure should halt parsing immediately.
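
The following sketch touches three of the items above: matches(), comments in runTests(), and the 'fatal' argument to addCondition. It assumes a pyparsing release in which runTests returns a (success, results) pair; the test strings and the "year" rule are invented for illustration:

```python
from pyparsing import ParseFatalException, Word, nums

integer = Word(nums).setParseAction(lambda t: int(t[0]))

# matches() replaces the old "expr == string" style of test.
ok = integer.matches("100")
not_ok = integer.matches("abc")

# Lines starting with '#' are treated as comments by runTests().
success, results = integer.runTests("""
    # a plain integer
    100
    # leading zeros are still digits
    007
    """, printResults=False)

# With fatal=True, a failed condition raises ParseFatalException
# instead of an ordinary ParseException.
year = integer.copy().addCondition(
    lambda t: t[0] >= 2000, message="year must be 2000 or later", fatal=True)
try:
    year.parseString("1999")
    fatal_raised = False
except ParseFatalException:
    fatal_raised = True

print(ok, not_ok, success, fatal_raised)
```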

May 11, 2016 - Pyparsing 2.1.3 released

  • _trim_arity fix in 2.1.2 was very version-dependent on Py 3.5.0. Now works for Python 2.x, 3.3, 3.4, 3.5.0, and 3.5.1 (and hopefully beyond).

May 10, 2016 - Pyparsing 2.1.2 released

  • Fixed bug in _trim_arity when pyparsing code is included in a PyInstaller, reported by maluwa.
  • Fixed catastrophic regex backtracking in implementation of the quoted string expressions (dblQuotedString, sglQuotedString, and quotedString). Reported on the pyparsing wiki by webpentest, good catch! (Also tuned up some other expressions susceptible to the same backtracking problem, such as cStyleComment, cppStyleComment, etc.)

March, 2016 - Pyparsing 2.1.1 released

  • Added support for assigning to ParseResults using slices.
  • Fixed bug in ParseResults.toDict(), in which dict values were always converted to dicts, even if they were just unkeyed lists of tokens. Reported on SO by Gerald Thibault, thanks Gerald!
  • Fixed bug in SkipTo when using failOn, reported by robyschek, thanks!
  • Fixed bug in Each introduced in 2.1.0, reported by AND patch and unit test submitted by robyschek, well done!
  • Removed use of functools.partial in replaceWith, as this creates an ambiguous signature for the generated parse action, which fails in PyPy. Reported by Evan Hubinger, thanks Evan!
  • Added default behavior to QuotedString to convert embedded '\t', '\n', etc. characters to their whitespace counterparts. Found during Q&A exchange on SO with Maxim.

February, 2016 - Pyparsing 2.1.0 released

  • Modified the internal _trim_arity method to distinguish between TypeError's raised while trying to determine parse action arity and those raised within the parse action itself. This will clear up those confusing "<lambda>() takes exactly 1 argument (0 given)" error messages when there is an actual TypeError in the body of the parse action. Thanks to all who have raised this issue in the past, and most recently to Michael Cohen, who sent in a proposed patch, and got me to finally tackle this problem.
  • Added compatibility for pickle protocols 2-4 when pickling ParseResults. In Python 2.x, protocol 0 was the default, and protocol 2 did not work. In Python 3.x, protocol 3 is the default, so explicitly naming protocol 0 or 1 was required to pickle ParseResults. With this release, all protocols 0-4 are supported. Thanks for reporting this on StackOverflow, Arne Wolframm, and for providing a nice simple test case!
  • Added optional 'stopOn' argument to ZeroOrMore and OneOrMore, to simplify breaking on stop tokens that would match the repetition expression.

It is a common problem to fail to look ahead when matching repetitive tokens if the sentinel at the end also matches the repetition expression, as when parsing "BEGIN aaa bbb ccc END" with:
    "BEGIN" + OneOrMore(Word(alphas)) + "END"

Since "END" matches the repetition expression "Word(alphas)", it will never get parsed as the terminating sentinel. Up until now, this had to be resolved by the user inserting their own negative lookahead:
    "BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"

Using stopOn, they can more easily write:
    "BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"

The stopOn argument can be a literal string or a pyparsing expression. Inspired by a question by Lamakaha on StackOverflow (and many previous questions with the same negative-lookahead resolution).
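
The three forms above can be compared side by side. This is a minimal sketch, assuming the stopOn keyword is available under that spelling in the installed release:

```python
from pyparsing import Literal, OneOrMore, ParseException, Word, alphas

text = "BEGIN aaa bbb ccc END"

# Without lookahead, Word(alphas) swallows "END" and the final literal fails.
greedy = "BEGIN" + OneOrMore(Word(alphas)) + "END"
try:
    greedy.parseString(text)
    greedy_failed = False
except ParseException:
    greedy_failed = True

# Explicit negative lookahead, the pre-2.1.0 workaround.
lookahead = "BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"
lookahead_tokens = lookahead.parseString(text).asList()

# The same grammar written with stopOn.
bounded = "BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"
stop_tokens = bounded.parseString(text).asList()

print(greedy_failed, lookahead_tokens, stop_tokens)
```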

  • Added expression names for many internal and builtin expressions, to reduce name and error message overhead during parsing.
  • Converted helper lambdas to functions to refactor and add docstring support.
  • Fixed ParseResults.asDict() to correctly convert nested ParseResults values to dicts.
  • Cleaned up some examples, fixed a typo identified by aristotle2600 on reddit.
  • Removed keepOriginalText helper method, which was deprecated ages ago. Superseded by originalTextFor.
  • Same for the Upcase class, which was long ago deprecated and replaced with the upcaseTokens method.
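
The pickle protocol support mentioned in this release can be checked with a simple round trip. The sketch below is only illustrative (the grammar and input are invented) and covers protocols 0 through 4, as named in the notes:

```python
import pickle
from pyparsing import Word, alphas, nums

expr = Word(alphas)("name") + Word(nums)("count")
result = expr.parseString("widget 12")

# Pickle and restore the ParseResults with each protocol in turn.
round_trips = {}
for protocol in range(0, 5):
    restored = pickle.loads(pickle.dumps(result, protocol))
    round_trips[protocol] = (
        restored.asList(), restored["name"], restored["count"])

for protocol, values in sorted(round_trips.items()):
    print(protocol, values)
```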

December, 2015 - Pyparsing 2.0.7 released

  • Simplified string representation of Forward class, to avoid memory and performance errors while building ParseException messages. Thanks, Will McGugan, Andrea Censi, and Martijn Vermaat for the bug reports and test code.
  • Cleaned up additional issues from enhancing the error messages for Or and MatchFirst, handling Unicode values in expressions. Fixes Unicode encoding issues in Python 2, thanks to Evan Hubinger for the bug report.
  • Fixed implementation of dir() for ParseResults - was leaving out all the defined methods and just adding the custom results names.
  • Fixed bug in ignore() that was introduced in pyparsing 1.5.3, that would not accept a string literal as the ignore expression.
  • Added new example to illustrate parsing of data formatted in columns, with detection of empty cells.
  • Updated a number of examples to more current Python and pyparsing forms.

November 29, 2015 - Pyparsing 2.0.6 released

  • Fixed a bug in Each when multiple Optional elements are present. Thanks for reporting this, whereswalden on SO.
  • Fixed another bug in Each, when Optional elements have results names or parse actions, reported by Max Rothman - thank you, Max!
  • Added optional parseAll argument to runTests, whether tests should require the entire input string to be parsed or not (similar to parseAll argument to parseString). Plus a little neaten-up of the output on Python 2 (no stray ()'s).
  • Modified exception messages from MatchFirst and Or expressions. These were formerly misleading as they would only give the first or longest exception mismatch error message. Now the error message includes all the alternatives that were possible matches. Originally proposed by a pyparsing user, but I've lost the email thread - finally figured out a fairly clean way to do this.
  • Fixed a bug in Or, when a parse action on an alternative raises an exception, other potentially matching alternatives were not always tried. Reported by TheVeryOmni on the pyparsing wiki, thanks!
  • Fixed a bug to dump() introduced in 2.0.4, where list values were shown in duplicate.

October 29, 2015 - Pyparsing 2.0.5 released :(

  • (&$(@#&$(@!!!! Some "print" statements snuck into pyparsing v2.0.4, breaking Python 3 compatibility! Fixed. Reported by jenshn, thanks!

October 28, 2015 - Pyparsing 2.0.4 released!

Mostly minor bugfixes, but a few new features and conveniences:

  • Added ParserElement.addCondition, to simplify adding parse actions that act primarily as filters. If the given condition evaluates False, pyparsing will raise a ParseException. The condition should be a method with the same method signature as a parse action, but should return a boolean. Suggested by Victor Porton, nice idea Victor, thanks!
  • Added ParserElement.runTests, a little test bench for quickly running an expression against a list of sample input strings. Basically, I got tired of writing the same test code over and over, and finally added it as a test point method on ParserElement.
  • Added withClass helper method, a simplified version of withAttribute for the common but annoying case when defining a filter on a div's class - made difficult because 'class' is a Python reserved word.
  • Slight mod to srange to accept unicode literals for the input string, such as "[а-яА-Я]" instead of "[\u0430-\u044f\u0410-\u042f]". Thanks to Alexandr Suchkov for the patch!
  • Fixed bug in ParseResults.dump() method when the results consists only of an unnamed array of sub-structure results. Reported by Robin Siebler, thanks for your patience and persistence, Robin!
  • Fixed bug in example code, where pi and e were defined using CaselessLiteral instead of CaselessKeyword. This was not a problem until adding a new function 'exp', and the leading 'e' of 'exp' was accidentally parsed as the mathematical constant 'e'. Nice catch, Tom Grydeland - thanks!
  • Adopt new-fangled Python features, like decorators and ternary expressions, per suggestions from Williamzjc - thanks William! (Oh yeah, I'm not supporting Python 2.3 with this code any more...) Plus, some additional code fixes/cleanup - thanks again!
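
Two of the items above lend themselves to a quick demonstration: addCondition used as a filter, and the CaselessLiteral versus CaselessKeyword distinction behind the 'e' / 'exp' bug. This is a sketch only; the helper function parses() is hypothetical and exists just for this example:

```python
from pyparsing import (CaselessKeyword, CaselessLiteral, ParseException,
                       Word, nums)

def parses(expr, text):
    """Hypothetical helper: True if expr matches at the start of text."""
    try:
        expr.parseString(text)
        return True
    except ParseException:
        return False

# addCondition raises ParseException when the condition returns False.
integer = Word(nums).setParseAction(lambda t: int(t[0]))
even = integer.copy().addCondition(
    lambda t: t[0] % 2 == 0, message="expected an even number")
even_ok = parses(even, "12")
odd_ok = parses(even, "13")

# A literal 'e' matches the first letter of "exp"; a keyword 'e' does not,
# because it must not be followed by another identifier character.
literal_hits_exp = parses(CaselessLiteral("e"), "exp")
keyword_hits_exp = parses(CaselessKeyword("e"), "exp")

print(even_ok, odd_ok, literal_hits_exp, keyword_hits_exp)
```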

October 10, 2014 - Pyparsing 2.0.3 released!

"Improvements" that I made to the ParseResults.pop() method introduced some serious regressions, such that it pretty much stopped working at all. This has been fixed.

I've also enhanced the ParseResults.dump() method, to better show those results that are returned as lists of unnamed elements. This output is a big improvement over the previous version.

Other notes and bugfixes:

  • Fixed escaping behavior in QuotedString. Formerly, only quotation marks (or characters designated as quotation marks in the QuotedString constructor) would be escaped. Now all escaped characters will be escaped, and the escaping backslashes will be removed.
  • Fixed bug in And class when initializing using a generator.
  • Fixed UnboundLocalError under Python 3.4 in oneOf method, reported on Sourceforge by aldanor, thanks!
  • Fixed bug in ParseResults init method, when returning non-ParseResults types from parse actions that implement __eq__. Raised during discussion on the pyparsing wiki with cyrfer.

April 13, 2014 - Pyparsing 2.0.2 released!

Bugfixes and some new features:

  • Extended "expr(name)" shortcut (same as "expr.setResultsName(name)") to accept "expr()" as a shortcut for "expr.copy()".
  • Added "locatedExpr(expr)" helper, to decorate any returned tokens with their location within the input string. Adds the results names locn_start and locn_end to the output parse results.
  • Added "pprint()" method to ParseResults, to simplify troubleshooting and prettified output. Now instead of importing the pprint module and then writing "pprint.pprint(result)", you can just write "result.pprint()". This method also accepts additional positional and keyword arguments (such as indent, width, etc.), which get passed through directly to the pprint method.
  • Removed deprecation warnings when using '<<' for Forward expression assignment. '<<=' is still preferred, but '<<' will be retained for cases where '<<=' operator is not suitable (such as in defining lambda expressions).
  • Expanded argument compatibility for classes and functions that take list arguments, to now accept generators as well.
  • Extended list-like behavior of ParseResults, adding support for append and extend. NOTE: if you have existing applications using these names as results names, you will have to access them using dict-style syntax: res["append"] and res["extend"]
  • ParseResults emulates the change in list vs. iterator semantics for methods like keys(), values(), and items(). Under Python 2.x, these methods will return lists, under Python 3.x, these methods will return iterators.
  • ParseResults now has a method haskeys() which returns True or False depending on whether any results names have been defined. This simplifies testing for the existence of results names under Python 3.x, which returns keys() as an iterator, not a list.
  • ParseResults now supports both list and dict semantics for pop(). If passed no argument or an integer argument, it will use list semantics and pop tokens from the list of parsed tokens. If passed a non-integer argument (most likely a string), it will use dict semantics and pop the corresponding value from any defined results names. A second default return value argument is supported, just as in dict.pop().
  • Fixed bug in markInputline, thanks for reporting this, Matt Grant!
  • Cleaned up my unit test environment, now runs with Python 2.6 and 3.3.
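
The list-like and dict-like ParseResults behavior described above might be exercised as in this sketch (the grammar and input string are made up):

```python
from pyparsing import Word, alphas, nums

expr = Word(alphas)("first") + Word(nums)("num") + Word(alphas)
result = expr.parseString("abc 123 def")

# haskeys() reports whether any results names are defined.
has_names = result.haskeys()

# A string argument selects dict semantics and pops a results name.
num = result.pop("num")
num_still_named = "num" in result

# No argument selects list semantics and pops the last token.
last = result.pop()

# A default is returned for a missing name, as with dict.pop().
fallback = result.pop("missing", "n/a")

# append() works as it does on a list.
result.append("xyz")

print(has_names, num, last, fallback, result.asList())
```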

July 19, 2013 - Pyparsing 2.0.1 released!

Pyparsing's 2.0.0 release caused some issues for a number of users using Python 2.6 or 2.7, especially using modules like matplotlib or Celery that are dependent on pyparsing. With release 2.0.1, I've removed the code that was specific to Python 3.x, so that it can be installed on any Python version 2.6 or later. Users of Python 2.5 and earlier still need to install pyparsing version 1.5.7.

Pyparsing 2.0.1 also fixes a bug in the implementation of the '<<=' operator, so that it can be used in place of '<<' without problems in recursive grammars.

December 16, 2012 - Pyparsing 1.5.7/2.0.0 released!

Well, it looks like Python 3 is starting to catch on, so I've decided to split the versions of pyparsing such that version 1.5.x maintains compatibility with Python 2, and versions 2.x and beyond are Python 3-compatible. If you are using Python 2.x and installing with easy_install, use:
easy_install pyparsing==1.5.7

I am deprecating two popular API features, in the interests of renaming them better, and/or having them behave better:
  • Added new operator '<<=', which will eventually replace '<<' for storing the contents of a Forward(). '<<=' does not have the same operator precedence problems that '<<' does.
  • 'operatorPrecedence' is being renamed 'infixNotation' as a better description of what this helper function creates. 'operatorPrecedence' is deprecated, and will be dropped entirely in a future release.
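
A short sketch of both renamed features, using an invented arithmetic grammar. The precedence issue with '<<' arises because Python evaluates "expr << a | b" as "(expr << a) | b"; the augmented assignment '<<=' takes the whole right-hand side:

```python
from pyparsing import (Forward, Group, Suppress, Word, ZeroOrMore,
                       infixNotation, nums, oneOf, opAssoc)

# Recursive grammar: an atom is a number or a parenthesized expression.
expr = Forward()
atom = Word(nums) | Group(Suppress("(") + expr + Suppress(")"))
expr <<= atom + ZeroOrMore(oneOf("+ -") + atom)
nested = expr.parseString("1+(2-3)").asList()

# infixNotation (formerly operatorPrecedence) builds the precedence
# levels from a list, highest-precedence operators first.
arith = infixNotation(Word(nums), [
    (oneOf("* /"), 2, opAssoc.LEFT),
    (oneOf("+ -"), 2, opAssoc.LEFT),
])
tree = arith.parseString("1+2*3").asList()

print(nested)
print(tree)
```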

Some other changes:
  • An awesome new example is included in this release, submitted by Luca DellOlio, for parsing ANTLR grammar definitions, nice work Luca!
  • Fixed implementation of ParseResults.__str__ to use Pythonic ''.join() instead of repeated string concatenation. This purportedly has been a performance issue under PyPy.
  • Fixed bug in ParseResults.__dir__ under Python 3, reported by Thomas Kluyver, thank you Thomas!
  • Added ParserElement.inlineLiteralsUsing static method, to override pyparsing's default behavior of converting string literals to Literal instances, to use other classes (such as Suppress or CaselessLiteral).
  • Added optional arguments lpar and rpar to operatorPrecedence, so that expressions that use it can override the default suppression of the grouping characters.
  • Added support for using single argument builtin functions as parse actions. Now you can write 'expr.setParseAction(len)' and get back the length of the list of matched tokens. Supported builtins are: sum, len, sorted, reversed, list, tuple, set, any, all, min, and max. A script demonstrating this feature is included in the examples directory.
  • Improved linking in generated docs, proposed on the pyparsing wiki by techtonik, thanks!
  • Fixed a bug in the definition of 'alphas', which was based on the string.uppercase and string.lowercase "constants", which in fact *aren't* constant, but vary with locale settings. This could make parsers locale-sensitive in a subtle way. Thanks to Kef Schecter for his diligence in following through on reporting and monitoring this bugfix!
  • Fixed a bug in the Py3 version of pyparsing, during exception handling with packrat parsing enabled, reported by Catherine Devlin - thanks Catherine!
  • Fixed typo in ParseBaseException.__dir__, reported anonymously on the SourceForge bug tracker, thank you Pyparsing User With No Name.
  • Fixed bug in srange when using '\x###' hex character codes.
  • Added optional 'intExpr' argument to countedArray, so that you can define your own expression that will evaluate to an integer, to be used as the count for the following elements. Allows you to define a countedArray with the count given in hex, for example, by defining intExpr as "Word(hexnums).setParseAction(lambda t: int(t[0],16))".
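
Using a single-argument builtin as a parse action, as described in the list above, could look like this small sketch (input strings are arbitrary):

```python
from pyparsing import OneOrMore, Word, alphas, nums

integer = Word(nums).setParseAction(lambda t: int(t[0]))

# sum receives the list of matched tokens and returns their total.
total = OneOrMore(integer).setParseAction(sum).parseString("1 2 3")[0]

# len returns the number of matched tokens.
word_count = OneOrMore(Word(alphas)).setParseAction(len).parseString("a bb ccc")[0]

print(total, word_count)
```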

September, 2011 - Writing DSL's with Pyparsing talk, presented at PyCon India/2011

Siddharta Govindaraj presented "Creating Domain Specific Languages in Python" at PyCon India/2011. He based his DSL parser examples on pyparsing, and did a very nice progression of his DSL parser from BNF to a pyparsing parser with named results. Siddharta's sample DSL was a form parser that would render to HTML:
name:CharField -> label:Username size:25
email:EmailField -> size:32
<form id='userform'>
<input type='text' name='name' size='25'/><br/>
<input type='text' name='email' size='32'/><br/>
<input type='password' name='password'/><br/>

Well done, and thanks for using pyparsing for your DSL example!

June 30, 2011 - Pyparsing 1.5.6 released!

After about 10 months, there is a new release of pyparsing, version 1.5.6. This release contains some small enhancements, some bugfixes, and some new examples.

Most notably, this release includes the first public release of the Verilog parser. I have tired of restricting this parser for commercial use, and so I am distributing it under the same license as pyparsing, with the request that if you use it for commercial use, please make a commensurate donation to your local Red Cross.

This release also contains a rewrite of the adaptive parse action arguments, based on a submission from none other than Raymond Hettinger - thanks Raymond for contributing to pyparsing!

New features:
  • Added 'ungroup' helper method, to address token grouping done implicitly by And expressions, even if only one expression in the And actually returns any text - also inspired by stackoverflow discussion with Frankie Ribery!
  • Enhancement to countedArray, accepting an optional expression to be used for matching the leading integer count - proposed by Mathias on the pyparsing mailing list, good idea!
  • Added the excludeChars argument to the Word class, to simplify defining a word composed of all characters in a large range except for one or two. Suggested by JesterEE on the pyparsing wiki.
  • Added optional overlap parameter to scanString, to return overlapping matches found in the source text.
  • Enhanced form of using the "expr('name')" style of results naming, in lieu of calling setResultsName. If name ends with an '*', then this is equivalent to expr.setResultsName('name',listAllMatches=True).
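
Two of the features above, the trailing '*' in results names and the excludeChars argument to Word, can be sketched as follows (the results names and sample text are illustrative only):

```python
from pyparsing import OneOrMore, Word, alphas, nums, printables

# A trailing '*' is shorthand for setResultsName(name, listAllMatches=True),
# so every match is kept under the name rather than only the last one.
item = Word(nums)("numbers*") | Word(alphas)("words*")
result = OneOrMore(item).parseString("1 a 2 b")
numbers = list(result["numbers"])
words = list(result["words"])

# excludeChars: any printable character except the comma.
field = Word(printables, excludeChars=",")
first_field = field.parseString("abc,def")[0]

print(numbers, words, first_field)
```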

Other new examples:
  • protobuf parser - parses Google's protobuf language
  • btpyparse - a BibTex parser contributed by Matthew Brett, with test suite (thanks, Matthew!)
  • a demo using trailing '*' for results names

August 12, 2010 - Typo in Pyparsing 1.5.4, Pyparsing 1.5.5 released!

To my dismay, my Python 3 installation for Pyparsing 1.5.4 *still* had a typo in it, and so to keep things clear for the 100 or so people/organizations who jumped right on it and downloaded version 1.5.4, that version has been replaced with version 1.5.5. And I have confirmation from users in both Python 2 and Python 3 worlds that 1.5.5 installs without any problems!

I think this release will probably mark the end of any substantial work on the Python 2.x branch of pyparsing. There have been no substantive changes or bugfixes in about 16 months, so I think it is at a solid plateau. Moving forward, I'll probably make changes only in the Python 3.0 side of the house. I guess that might be a decent time to start numbering the releases 2.x. (Leaves me room for 1.5.n+1 in case I really need/want to do something in the Py2 world still.) Or maybe jump to pyparsing 3.x, to tie in with Python 3.

August 11, 2010 - Pyparsing 1.5.4 released!

With this release of pyparsing, all of the Python 3 incompatibilities should be resolved! This version will install on Python 3 with no complaints about __builtin__ or file usages, which were removed in Python 3.

In addition, I've added 2 more example programs:
  • a demo of a syntax scanner of a Tcl-like syntax for verifying proper parameter passing to API functions. Based on a wiki inquiry by Peter Lom, thanks Peter!
  • a parser to convert informal time references such as "in 3 minutes", "a couple of days ago", or "next Sunday at 2pm" into Python datetime objects.

June 24, 2010 - Pyparsing 1.5.3 released!

The latest version of pyparsing is now available from SourceForge. This version is mostly a bug-fix and compatibility release. Most notably, this release resolves the installation problems created in version 1.5.2, with a cleaner installation approach for Python 2.x vs. Python 3.x.

IMPORTANT API CHANGE for PYTHON 3 USERS! - This release also clears up the import discrepancy between the two versions of Python, that was introduced in version 1.5.2 - now regardless of Python version, users can just write import pyparsing in their code, there is no longer a separate pyparsing_py3 module.

Other changes and fixes:

  • Fixed bug on Python3 when using parseFile, getting bytes instead of a str from the input file.
  • Fixed subtle bug in originalTextFor, if followed by significant whitespace (like a newline) - discovered by Francis Vidal, thanks!
  • Fixed related bug in originalTextFor in which trailing comments or otherwise ignored text got slurped in with the matched expression. Thanks to michael_ramirez44 on the pyparsing wiki for reporting this just in time to get into this release!
  • Fixed very sneaky bug in Each, in which Optional elements were not completely recognized as optional - found by Tal Weiss, thanks for your patience.
  • Fixed off-by-1 bug in line() method when the first line of the input text was an empty line. Thanks to John Krukoff for submitting a patch!
  • Fixed bug in transformString if grammar contains Group expressions, thanks to patch submitted by barnabas79, nice work!
  • Added better support for summing ParseResults, see the new example.
  • Added support for composing a Regex using a compiled RE object; thanks to my new colleague, Mike Thornton!
  • In version 1.5.2, I changed the way exceptions are raised in order to simplify the stacktraces reported during parsing. An anonymous user posted a bug report on SF that this behavior makes it difficult to debug some complex parsers, or parsers nested within parsers. In this release I've added a class attribute ParserElement.verbose_stacktrace, with a default value of False. If you set this to True, pyparsing will report stacktraces using the pre-1.5.2 behavior.
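
Summing ParseResults and building a Regex from a compiled RE object, both mentioned in the list above, might be used like this (a sketch with invented patterns and input lines):

```python
import re
from pyparsing import OneOrMore, Regex, Word, nums

# A Regex expression built from an already-compiled pattern object.
word_re = Regex(re.compile(r"[a-z]+", re.IGNORECASE))
word = word_re.parseString("Hello world")[0]

# sum() can merge the ParseResults from several parses into one.
numbers = OneOrMore(Word(nums))
combined = sum(numbers.parseString(line) for line in ("1 2", "3", "4 5 6"))
all_tokens = combined.asList()

print(word, all_tokens)
```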

This release also includes several new examples (soon to be added to the wiki Examples page):

  • a MicroC compiler submitted by Zarko Zivanov. (Note: this example is separately licensed under the GPLv3, and requires Python 2.6 or higher.) Thank you, Zarko!
  • a subset C parser, using the BNF from the 1996 Obfuscated C Contest.
  • a parser for reading SQLite SELECT statements; this goes into much more detail than the simple SQL parser included in pyparsing's source code.
  • a modified version of an existing example, submitted by Matt Anderson, that is compatible with Python versions 2.7 and above - thanks so much, Matt!
  • a *simplistic* first-cut at a parser for Excel expressions, which I originally posted on comp.lang.python in January, 2010; beware, this parser omits many common Excel cases (addition of numbers represented as strings, references to named ranges).
  • a nice little parser posted by Mark Tolonen on comp.lang.python in August, 2009 (redistributed here with Mark's permission). Thanks a bunch, Mark!
  • a sample I posted, implementing a special variation on Literal that does "close" matching, up to a given number of allowed mismatches. The application was to find matching gene sequences, with allowance for one or two mismatches.
  • a sample showing how to use a Forward placeholder to enforce matching of text parsed in a previous expression.
  • a simple demo showing how the matchPreviousLiteral helper method is used to match a previously parsed token.

March 22, 2010 - PyCon presentation on PLY and Pyparsing

Andrew Dalke's PyCon presentation, comparing PLY and PyParsing, is now online. Andrew confesses to a preference for PLY, but his presentation is very even-handed - great job, Andrew!

September 10, 2009 - Pyparsing Compatibility Testing

Great news for Jythoners and IronPythoners!

I just finished running my unit test suite with pyparsing 1.5.2 against Jython 2.5.0 and IronPython 2.0.2. Happily, pyparsing is nearly 100% compatible with both! IronPython fails to run the keepOriginalText parse action, but this has been deprecated in favor of the new helper originalTextFor.

May 1, 2009 - Python Magazine article on developing DSL's with pyparsing

The April issue of Python Magazine includes a feature article, "Create Your Own Domain Specific Language in Python With Imputil and Pyparsing," on using pyparsing to define your own domain-specific language (DSL) mixed right in with your Python code. Here is an example of Python, augmented using the imputil hook to support an inline state machine definition, in file trafficLight.pystate:
# define state machine
statemachine TrafficLight:
    Red -> Green
    Green -> Yellow
    Yellow -> Red
# define some class level constants
Red.carsCanGo = False
Yellow.carsCanGo = True
Green.carsCanGo = True
Red.delay = wait(20)
Yellow.delay = wait(3)
Green.delay = wait(15)
And here is how that module would look in a script that imports it:
import statemachine
import trafficLight
tl = trafficLight.Red()
for i in range(6):
    print tl, "GO" if tl.carsCanGo else "STOP"
    tl = tl.next_state()
Use this technique to enhance Python with your own domain syntax!

April 18, 2009 - Pyparsing 1.5.2 released!

The latest version of pyparsing is now available from SourceForge. This version is mostly a bug-fix and compatibility release:
  • Added pyparsing_py3 module, so that Python 3 users can use pyparsing by changing their pyparsing import statement to:
      import pyparsing_py3
  • Removed __slots__ declaration on ParseBaseException, for compatibility with IronPython 2.0.1.
  • Fixed bugs in SkipTo/failOn and ignore handling.
  • Simplified exception stack traces when reporting parse exceptions back to caller of parseString or parseFile.
  • Changed behavior of scanString to avoid infinitely looping on expressions that match zero-length strings.
  • Added new example, which extends an existing example to actually evaluate the parsed expressions.

January 12, 2009 - Check out pyparsing_helper 0.1.0!

Catherine Devlin writes in her blog about developing pyparsing_helper, a GUI workbench for working on pyparsing grammars and testing them against sample source strings. Even though it is at version 0.1.0, it is very helpful at quickly trying and tweaking simple grammars.
To install pyparsing_helper, just use "easy_install pyparsing_helper".

October 18, 2008 - Pyparsing 1.5.1 Released!

I've just uploaded to SourceForge the latest update to pyparsing, version 1.5.1. This version includes a few new features, and some bug-fixes:
  • Added __dir__() methods to ParseBaseException and ParseResults, to support new dir() behavior in Py2.6 and Py3.0. If dir() is called on a ParseResults object, the returned list will include the base set of attribute names, plus any results names that are defined.

  • Added new helper method originalTextFor, to replace the use of the current keepOriginalText parse action. The implementation of originalTextFor is simpler and faster than keepOriginalText, and does not depend on using the inspect or imp modules.

  • Added failOn argument to SkipTo, so that grammars can define literal strings or pyparsing expressions which, if found in the skipped text, will cause SkipTo to fail. Useful to prevent SkipTo from reading past a terminating expression.

  • Fixed bug in nestedExpr if multi-character expressions are given for nesting delimiters.

  • Removed dependency on xml.sax.saxutils.escape, and included internal implementation instead.

  • Fixed typo in ParseResults.insert.

  • Fixed bug in '-' error stop, when '-' operator is used inside a Combine expression.

  • Fixed bug in parseString(parseAll=True), when the input string ends with a comment or whitespace.

  • Fixed bug in LineStart and LineEnd that did not recognize any special whitespace chars defined using ParserElement.setDefaultWhitespaceChars.

  • Forward class is now more tolerant of subclassing.

All are described in more detail in the CHANGES file.
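As a rough illustration of two of the additions above, here is a minimal sketch of originalTextFor and of the failOn argument to SkipTo. The names date_range and statement are made up for this example, and it assumes a pyparsing release that still accepts the camelCase method names used on this page.

```python
from pyparsing import ParseException, SkipTo, Word, alphas, nums, originalTextFor

# originalTextFor returns the matched slice of the input as a single string,
# including the whitespace between the individual tokens.
date_range = originalTextFor(Word(nums) + "-" + Word(nums))
print(date_range.parseString("2007 - 2008").asList())   # ['2007 - 2008']

# failOn stops SkipTo from reading past "END" while looking for the ";" target.
statement = Word(alphas) + SkipTo(";", failOn="END") + ";"
ok_tokens = statement.parseString("set x 1 ;").asList()
print(ok_tokens)

try:
    statement.parseString("set x END ; y ;")
    skip_failed = False
except ParseException as err:
    skip_failed = True
    print("no match:", err)
```

Without failOn, the second input would be accepted, with the skipped text running straight through "END" up to the first ";".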

(I'm still having to maintain separate code for Python 3 compatibility, due to the change in syntax when catching exceptions. For those wishing to use pyparsing with Python 3.0, please get from the SVN repository.)

August 29, 2008 - Python Magazine article on advanced Pyparsing methods

The August issue of Python Magazine includes an article on using pyparsing to parse text, with dynamic results names. The article steps through a series of examples, including an interesting example to parse this table of data:
| samp |  min |  max |  ave | sdev |
|  A1  |    7 |   11 |    9 |    1 |
|  B1  |   43 |   52 |   47 |    3 |
|  C1  |    7 |   10 |    8 |    1 |
|  A2  |   82 |   85 |   84 |    1 |
|  B2  |   98 |  112 |  106 |    3 |
|  C2  |    1 |    4 |    3 |    1 |
Using results names, we then modify the table format, and the same parser works with no changes at all! The article finishes with an example of parsing the JSON data format, using dynamic results names to construct parse results that look like a demarshaled JSON object. That is, from this JSON object description:
{ "reference" : 
  { "article" : 
    { "title" : "Writing a Simple 
            Interpreter/Compiler with Pyparsing",
      "author": "Paul McGuire",
      "magazine" : "Python Magazine",
      "issue" : "May, 2008",
      "pages" : 6      } } }
the parsed results allow you to write code like this:
entry = parser.parseString(jsontext)
print(entry["reference"]["article"]["pages"])
print(entry.reference.article.issue)

July 20, 2008 - Pyparsing helps track down international arms dealer!

Alexander Harrowell chronicles his efforts in this slideshow to uncover the illegal cargo flights of arms dealer Viktor Bout. You can view Alexander's continuing efforts by pasting this link into a Google Maps search field.

June 18, 2008 - Pyparsing 1.5.0 tests show compatibility with Python 2.6b1

I just downloaded the release of Python 2.6b1 and reran the unit test suite for pyparsing 1.5.0. There were no changes or regressions.

Here are the performance results using the Verilog parser, with various optimizations (performance is shown in lines parsed per second - larger number is faster):

Python V2.5.1
Python V2.6b1

So it would appear that Python 2.6 is 15-50% faster in parsing! (Couldn't test with psyco, since there is no psyco download available for Python 2.6 yet.)

June 1, 2008 - Pyparsing 1.5.0 Released!

I've just uploaded to SourceForge the latest update to pyparsing, version 1.5.0. This version includes a number of long-awaited features, so I thought it was time to bump the minor rev version:
  • parsing a complete string without having to add StringEnd() to the pyparsing grammar, by adding parseAll argument to the parseString method (default value is False to maintain compatibility with prior versions, set to True to force parsing of the full input string)
  • support for indentation-based grammars (like Python's), using a new helper method, indentedBlock.
  • improved syntax error detection and reporting, based on the ErrStop class submitted by Eike Welk on the pyparsing forum, and Thomas/Poldy's proposal on the pyparsing wiki; the CHANGES file includes a detailed example showing how syntax errors can be designated by using '-' instead of '+' operators

Pyparsing 1.5.0 also includes a number of bug-fixes, described in more detail in the CHANGES file.
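To make the parseAll argument and the '-' operator a little more concrete, here is a small sketch. The toy "set" grammar and the names lenient and strict are invented for this example; the CHANGES file has the fuller, authoritative example.

```python
from pyparsing import Literal, ParseException, ParseSyntaxException, Word, alphas, nums

integer = Word(nums)

# Default behavior: parsing stops quietly after the first match.
print(integer.parseString("123 abc").asList())   # ['123']

# parseAll=True demands that the whole input string be consumed.
try:
    integer.parseString("123 abc", parseAll=True)
    full_parse_ok = True
except ParseException:
    full_parse_ok = False

# With '+', a failed "set" command falls through to the next alternative.
# With '-', the parser commits once "set" is seen and reports a syntax error.
lenient = (Literal("set") + integer) | Word(alphas)
strict = (Literal("set") - integer) | Word(alphas)

print(lenient.parseString("set x").asList())     # ['set']
try:
    strict.parseString("set x")
    syntax_error = None
except ParseSyntaxException as err:
    syntax_error = err
    print("syntax error:", err)
```

The lenient version silently reinterprets "set" as an ordinary word, which is usually not what the grammar author intended; the strict version points at the place where the integer was expected.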

(Despite my best efforts, I have *not* been able to include support for Python 3.0 using a common source code base. For those who wish to try out pyparsing with Python 3.0, there is a file in the SourceForge Subversion repository. I have done some testing of this code, but many of my unit tests still need to be converted to Python 3.)

May 29, 2008 - Python Magazine Features Pyparsing

The May issue of Python Magazine includes the article "Writing a Simple Interpreter/Compiler with Pyparsing," in which I step through the development of a BrainF*ck interpreter using pyparsing. For those unfamiliar with BF, here is "Hello, World!" in that scatonymous (a word I made up myself for the article) language:
The plot twist comes at the end when I convert the interpreter to a compiler. I had a lot of fun writing the article, and according to Doug Hellmann in his blog, he had fun editing it!

April 10, 2008 - Pyparsing innovations abound!

Check out the "Who's Using Pyparsing" page to see the latest in pyparsing creativity!
  • asDox - Actionscript class extractor
  • svg2imagemap - SVG -> HTML image map converter
  • Quameon - Quantum Monte Carlo algorithms implemented in Python
  • Pybtex - BibTeX parser
  • Tunnelhack - text adventure
  • madlib - fiction generator web service
  • poetrygen - poetry generator
  • PyMLNs - Markov Logic Networks
  • dsniff - network monitor
  • Bauble - biodiversity database
  • Firebird PowerTool
  • Numbler - spreadsheet web service

February 10, 2008 - Pyparsing version 1.4.11 released

This version of pyparsing is updated to be compatible with Python 3.0a3, thanks to help from Robert A. Clark!

There are also some interesting new features in this release:
  • Added WordStart and WordEnd positional classes, to support expressions that must occur at the start or end of a word. Proposed by piranha on the pyparsing wiki, good idea!
  • Added matchOnlyAtCol helper parse action, to simplify parsing log or data files that have optional fields that are column dependent. Inspired by a discussion thread with hubritic on comp.lang.python.
  • Added withAttribute.ANY_VALUE as a match-all value when using withAttribute. Used to ensure that an attribute is present, without having to match on the actual attribute value.
  • Added get() method to ParseResults, similar to dict.get(). Suggested by new pyparsing user, Alejandro Dubrovksy, thanks!
  • Added '==' short-cut to see if a given string matches a pyparsing expression. For instance, you can now write:
    integer = Word(nums)
    if "123" == integer:
        pass  # do something
    print([x for x in "123 234 asld".split() if x == integer])
    # prints ['123', '234']
  • Changed '<<' operator on Forward to return None, since this is really used as a pseudo-assignment operator, not as a left-shift operator. By returning None, it is easier to catch faulty statements such as a << b | c, where precedence of operations causes the '|' operation to be performed *after* inserting b into a, so no alternation is actually implemented. The correct form is a << (b | c). With this change, an error will be reported instead of silently clipping the alternative term. (Note: this may break some existing code, but if it does, the code had a silent bug in it anyway.) Proposed by wcbarksdale on the pyparsing wiki, thanks!
And finally, several unit tests were added to pyparsing's regression suite, courtesy of the Google Highly-Open Participation Contest. Thanks to all who administered and took part in this event!
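Here is a short sketch of two of the features listed above, ParseResults.get() and the WordStart/WordEnd classes. The person and whole_cat grammars are invented for illustration only.

```python
from pyparsing import Optional, Word, WordEnd, WordStart, alphas, nums

# ParseResults.get() works like dict.get() for results names.
person = Word(alphas)("name") + Optional(Word(nums)("age"))
result = person.parseString("bob")
print(result.get("name"))             # bob
print(result.get("age", "unknown"))   # unknown

# WordStart/WordEnd restrict "cat" to whole-word matches only, so the
# occurrences inside "concat" and "catalog" are skipped.
whole_cat = WordStart(alphas) + "cat" + WordEnd(alphas)
matches = whole_cat.searchString("cat concat catalog cat")
print(len(matches))                   # 2
```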

December 10, 2007 - Pyparsing version 1.4.10 released

Mostly bug-fixes in this release, but a few new features as well - here are the high points:
  • Expression multiplication - now you can specify an exact number of replicated elements, or a range of replicated elements using the '*' multiplication operator. If you multiply an expression 'expr' by an integer 'n', then you will get the equivalent of expr + expr + ..., or 'expr' repeated 'n' times. You can also multiply an expression by a two-integer tuple (m,n), meaning 'at least m copies of expr, up to n copies total'. Of course, 'n' must be greater than 'm'; 'm' may be 0.
  • pop() method added to ParseResults - now you can call pop() on the ParseResults passed to a parse action, or returned from calling parseString, to extract the -1'th element of the ParseResults, or call pop(n) to extract the n'th element of the ParseResults (just like with a Python list). If you call pop(resultsName), where resultsName is a key created as an expression results name, then that key-value will be extracted from the dict portion of the ParseResults.
  • Fixed bug in nestedExpr - the original version of nestedExpr was too greedy, in that it would consume all text after the last closing delimiter as content at a 0'th level of nesting. This has been fixed.
(I had released version 1.4.9 but quickly withdrew it when it was reported that a serious bug remained in the operatorPrecedence code - this bug is fixed in 1.4.10.)
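The multiplication operator and pop() described above can be tried out with a sketch like this one (the variable names are only for illustration):

```python
from pyparsing import Word, nums

integer = Word(nums)

# Exactly three integers.
triple = integer * 3
print(triple.parseString("1 2 3").asList())            # ['1', '2', '3']

# Between one and three integers; a fourth is left unparsed.
one_to_three = integer * (1, 3)
print(one_to_three.parseString("1 2 3 4").asList())    # ['1', '2', '3']

# pop() removes and returns the last token, just like a Python list.
tokens = triple.parseString("1 2 3")
last = tokens.pop()
print(last, tokens.asList())                           # 3 ['1', '2']
```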

December 5, 2007 - Pyparsing participates in Google's Highly Open Participation Contest

The Google Highly Open Participation Contest (GHOP) is an effort by Google to involve pre-university students in open source projects, by offering a set of short 2-3 day tasks.

The GHOP includes several tasks for improving pyparsing unit tests and example code. Look for these contributions in future versions of pyparsing.

Learn more about the GHOP at

November 9, 2007 - UtilityMill includes Pyparsing among available modules

UtilityMill is a web site with common utilities implemented as HTML forms, using Python code. Pyparsing has been included in the list of available modules for implementing a handy utility. See this example, a repackaging of .

November 5, 2007 - Blogging about Pyparsing (a la the Daily Python URL)

Here are some interesting recent blog entries related to Pyparsing:
  • Andrew Dalke does several implementations of the chemical formula parser and molecular weight calculator, and includes a mention of the pyparsing example. Pyparsing's version comes in last in raw performance, but Andrew is not eager to discard pyparsing even so.
  • Hairy Trollstomper describes a web attack on their system database application, and posts the code he used to automate blacklist/whitelist management by parsing the server log file.
  • Michel Hollands documents his early experiences with pyparsing (parsing SQL source and parsing SQL INSERT statements), and his thoughts on the new e-book.

October 7, 2007 - Pyparsing version 1.4.8 released

The latest release of pyparsing includes some new helper methods:
  • nestedExpr - a helper expression for easily creating grammars for expressions with nesting within ()'s, []'s, {}'s, etc. There is a new example included in this release that demonstrates the use of this expression.
  • withAttribute - a helper parse action to be used with the starting tag expression returned by makeHTMLTags. Many times, when scraping a web page, you must extract a <TD> or <DIV> tag, but really only want a special form of this tag with a particular class or other attribute. Use withAttribute to set up an additional filter for such parsers. There is a new example in the examples directory showing how to use this new parse action.
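As a quick taste of both helpers (the examples shipped with the release are more complete), here is a minimal sketch. The HTML snippet and the names grid_div and grid_block are made up for this illustration.

```python
from pyparsing import SkipTo, makeHTMLTags, nestedExpr, withAttribute

# nestedExpr defaults to "(" and ")" delimiters and returns nested lists.
nested = nestedExpr().parseString("(a (b c) d)").asList()
print(nested)   # [['a', ['b', 'c'], 'd']]

# Keep only <div> tags whose "type" attribute is "grid".
div, div_end = makeHTMLTags("div")
grid_div = div.copy().setParseAction(withAttribute(type="grid"))
grid_block = grid_div + SkipTo(div_end)("body") + div_end

html = '<div type="grid">1 4 0</div><div type="graph">1,3</div><div>other</div>'
bodies = [match.body for match in grid_block.searchString(html)]
print(bodies)   # ['1 4 0']
```

The parse action is attached to a copy of the opening tag, so the plain div expression stays available for matching unfiltered tags elsewhere in the grammar.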

This release also includes some improvements and bug-fixes:
  • added performance speedup to grammars using operatorPrecedence
  • fixed bug/typo when deleting an element from a ParseResults by using the element's results name.
  • fixed whitespace-skipping bug in wrapper classes (such as Group, Suppress, Combine, etc.)
  • fixed bug in makeHTMLTags that did not detect HTML tag attributes with no '= value' portion (such as '<td nowrap>')
  • fixed minor bug in makeHTMLTags and makeXMLTags, which did not accept whitespace in closing tags.

October 5, 2007 - Pyparsing E-book Available from O'Reilly

O'Reilly Publishing has released a 65 page e-book, Getting Started With Pyparsing by Paul McGuire, on its catalog of Short Cuts for the modest sum of US$9.99. Some of the topics covered are:
- Basic structure of a Pyparsing program
- The Zen of Pyparsing (FREE sample chapter online)
- "Hello, World!" revisited, with more elaborate grammar and results processing than has been previously published
- Parser for S-Expressions
- Extracting complex table data from a web page
- Parsing search strings, and writing a search engine in under 100 lines of code
All examples are accompanied by complete source code listings. Throughout the book, helpful tips and notes are covered in separate sidebar discussions, finishing with a short list of additional helpful resources. Lastly, an index adds to the reference value of this book. Download a copy today!

September 9, 2007 - Pyparsing Recipe in the Python Cookbook

Kevin Atkinson has submitted this recipe to the Python Cookbook. It uses a new feature of the Python eval and exec commands to implement custom behavior when a symbol is not found. So instead of this:
B = Forward()
C = Forward()
A = B + C
B << Literal('b')
C << Literal('c')
D_list = Forward()
D = Forward()
D_list << (D | (D + D_list))
D << Literal('d')

you can just write this:
A = B + C
B = Literal('b') 
C = Literal('c')
D_list = D | (D + D_list)
D = Literal('d')

This is not a feature of pyparsing itself, but it is a handy add-on for auto-defining Forwards.

August 12, 2007 - New Pyparsing Applications

Check out the latest "Who's Who" of pyparsing users, demonstrating some novel applications of integrating pyparsing into higher-level applications:
  • Zhpy - Chinese->English Python pre-processor
  • Robotic instruction assembler - assembler for Khepera robot programming
  • ZMAS - Agent Communication Language message parser
  • pymon - Network Administration and Monitoring command parser

July 22, 2007 - Pyparsing 1.4.7 released

A minor upgrade release, but with a major notation shortcut. You can now invoke setResultsName using function call notation. That is, instead of this:
    stats = ("AVE:" + realNum.setResultsName("average") +
             "MIN:" + realNum.setResultsName("min") +
             "MAX:" + realNum.setResultsName("max"))
you can write this:
    stats = ("AVE:" + realNum("average") +
             "MIN:" + realNum("min") +
             "MAX:" + realNum("max"))

Also added the new method setBreak() on ParserElement, so that you can break out to the Python PDB debugger when about to parse an expression.

Other bugfixes:
  • fixed a packrat parsing bug with cached ParseResults
  • fixed a bug in operatorPrecedence with unary operators
  • corrected the AND/OR precedence in the example
  • fixed a bug in the Dict class, in which non-string keys were converted to strings
  • fixed a bug in StringEnd and LineEnd when reading at the end of the input string

July 1, 2007 - Pyparsing Now Supports easy_install

Thanks to help from Jochen Kupperschmidt, pyparsing is now able to be installed using easy_install. Please post any questions or suggestions to the discussion tab on the pyparsing wiki home page.

June 14, 2007 - Pyparsing Subversion Repository rehosted to SourceForge

I just finished going through the process of exporting my local SVN repository, and uploading it to be hosted on SourceForge. The repository address is:
This will allow you to download the latest version of pyparsing direct from SVN, in advance of any formal releases. You can get more info on the pyparsing SourceForge SVN page.

May 23, 2007 - New Pyparsing-based success stories

New success stories on the Who's Using Pyparsing page. Check out:

April 29, 2007 - New online tutorial article

Sam writes in his blog a step-by-step tutorial, cataloging his efforts at building up a search-string parser. This tutorial starts with basic word recognition, and builds up to include special search prefixes and parenthesized lists.

April 11, 2007 - Pyparsing 1.4.6 released

This version of pyparsing has a few more bug-fixes and enhancements:
  • tuple as named result now reports entire tuple, not just first element
  • SkipTo with include=True now returns the skipped-to tokens properly
  • makeHTMLTags/makeXMLTags and anyOpenTag and anyCloseTag helpers now recognize attributes and tags with namespaces
  • countedArray now matches when defined within an Or expression
  • keepOriginalText now preserves named results fields
  • fixed Unicode bug in upcase and downcase methods
  • corrected typo in OnlyOnce reset() method
  • enhanced documentation to describe behavior when parsing input strings containing tabs
  • cleaned up internal decorators to preserve function names, docstrings, etc.

New examples included in this release:
  • - Spanish translation of HelloWorld
  • - S-exp parser
  • (update) - minor bug-fixes to previously released example for parsing JSON (JavaScript Object Notation) object serialization strings
  • - macro preprocessor to substitute defined macros

December 22, 2006 - Pyparsing 1.4.5 released

This latest version of pyparsing has a few minor bug-fixes and enhancements, and improves parsing speed by up to 100%.

This release also includes some new examples:
  • - parses strings representing lists, dicts, and tuples, with nesting support
  • - SQL diagram generator, parsed from schema table definitions
  • - parses JSON object serializations into hierarchical ParseResults, accessible using list, dict, or object attribute methods
  • - strips HTML tags from HTML pages, leaving only body text

November 3, 2006 - Secret Meaning of P.Y.P.A.R.S.I.N.G. revealed!

On comp.lang.python, Tim Chase posted this summary of the SE, Pyparsing, and re modules, in response to a comment by the effbot:

  > nah, if you've spent more than five minutes on c.l.python lately, you'd
  > noticed that it's the Solution to Everything (up there with pyparsing, I
  > think).

  SE would be the Solution to Everything.

  pyparsing Provides Your Perfect Alternative where Regexp Syntax Is No Good.

  The "re" module is just Routinely Expected to solve all problems.

Nice job, Tim!

October 19, 2006 - Pyparsing 1.4.4 released

The latest release of pyparsing includes mostly minor bug-fixes and enhancements. Here are some of the more interesting new features:
- Added new helper operatorPrecedence (based on e-mail list discussion with Ralph Corderoy and Paolo Losi), to facilitate definition of grammars for expressions with unary and binary operators. For instance, this grammar defines a 6-function arithmetic expression grammar, with unary plus and minus, proper operator precedence, and right- and left-associativity:
 expr = operatorPrecedence( operand,
     [("!", 1, opAssoc.LEFT),
      ("^", 2, opAssoc.RIGHT),
      (oneOf("+ -"), 1, opAssoc.RIGHT),
      (oneOf("* /"), 2, opAssoc.LEFT),
      (oneOf("+ -"), 2, opAssoc.LEFT),] )
- Added new helpers matchPreviousLiteral and matchPreviousExpr, for creating adaptive parsing expressions that match the same content as was parsed in a previous part of the parsing grammar.
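For the adaptive helpers, a minimal sketch of matchPreviousLiteral might look like this. The "number:number" format and the names first, second and pair are invented for illustration.

```python
from pyparsing import ParseException, Word, matchPreviousLiteral, nums

first = Word(nums)
second = matchPreviousLiteral(first)   # must repeat whatever `first` matched
pair = first + ":" + second

matched = pair.parseString("7:7").asList()
print(matched)   # ['7', ':', '7']

try:
    pair.parseString("7:8")
    mismatch_rejected = False
except ParseException:
    mismatch_rejected = True
print(mismatch_rejected)   # True
```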

October 4, 2006 - Pyparsing listed in Wikipedia

Well, this really happened on January 26 of this year, but I only just noticed it today. Pyparsing was included as a reference for recursive descent parsing in the Wikipedia article "Recursive descent parser". World domination can only be around the corner!

September 19, 2006 - Pyparsing 1.4.3 tests show compatibility with Python 2.5

I just downloaded the final production release of Python 2.5 and reran the unit test suite for pyparsing 1.4.3. There were no changes or regressions.

Here are the performance results using the Verilog parser, with various optimizations (performance is shown in lines parsed per second - larger number is faster):

Python V2.4.1 vs. Python V2.5 (psyco and packrat + psyco: n/a for Python V2.5)

So it would appear that Python 2.5 is about 5-10% slower in parsing. (Couldn't test with psyco, since there is no psyco download available for Python 2.5 yet.)

August 6, 2006 - Wilson Fowlie Knocks 'Em Dead at Vancouver Python Workshop

On Sunday, Wilson Fowlie presented his talk on "Pyparsing: A User's Perspective". The response was quite good - here are some excerpts from his report:
It went well! I got some good questions, showed off some well-received features (the access to named results by attribute or dict index was a big hit) and the feedback afterwards - both about pyparsing itself and the presentation - was very positive. A lot of people agreed that it would be very useful.
One guy even came up to me afterwards and told me that the whole reason he *came* to the conference was to find a tool like pyparsing. When I said that I was glad that I'd been there even for one person, he said that he'd heard a lot of murmuring around him to the effect that people would be able to find a use for it.
There was even an existing pyparsing user in the audience, which both added even more legitimacy to the module (if it were needed) and was a bit unnerving (what if he knew more than I did? :).
I had a really good time doing this presentation, Paul... I learned more about pyparsing in the process (since I didn't only want to talk about the particular features I'd used myself), so it was valuable in that way, too!
Here are the slides from Wilson's talk.

July 1, 2006 - Pyparsing 1.4.3 Released

The latest release of pyparsing includes some long-awaited bug-fixes, and a number of enhancements to parse actions.
  • simplified parse action interface; parse actions no longer must take all three arguments consisting of the original parsed string, the parsed location, and the parsed tokens; parse actions can now be defined with simplified argument interfaces:
    • no arguments
    • just the parsed tokens
    • just the parse location and parsed tokens
  • REMOVED SUPPORT FOR PARSE ACTIONS THAT RETURN LOCATION AND TOKENS; or looking at this another way, added support for parse actions to return tuples; parse actions that previously returned loc,tokens will now be interpreted to return the tuple (loc, tokens); this impending change was announced over 2 years ago, with explicit deprecation warnings in the previous release
  • new troubleshooting helper decorator, traceParseAction
  • new parse action helper class OnlyOnce, for parse actions that should only be called one time; subsequent invocations of an OnlyOnce-wrapped parse action will raise a ParseException
  • new setFailAction, to attach a method to an expression to be called when the expression is tried and fails (sort of an anti-parse action)
  • fixed the attachment of multiple parse actions, by breaking out the attempt at mind-reading in setParseAction; setParseAction now reverts to its previous behavior, and addParseAction appends new functions to the expression's list of parse actions
  • some new examples:
    • list string parser (reconstitutes a Python list from a string representation), including lists that contain elements that are lists, tuples, ints, reals, or quoted strings
    • line number demonstration, using the pyparsing line, lineno and col built-ins
    • listAllMatches example
    • line break remover, for removing hard line breaks in word-wrapped paragraphs with blank lines between paragraphs
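The simplified parse action signatures listed above can be sketched as follows. The function names and the seen list are made up for this example; it also shows addParseAction appending to the action set by setParseAction.

```python
from pyparsing import Word, nums

seen = []

def count_match():                  # no arguments
    seen.append("matched")

def to_int(tokens):                 # just the parsed tokens
    return int(tokens[0])

def record_location(loc, tokens):   # parse location and parsed tokens
    seen.append(loc)

# setParseAction installs the first action; addParseAction appends more,
# and they run in the order in which they were attached.
integer = Word(nums).setParseAction(to_int)
integer.addParseAction(count_match)
integer.addParseAction(record_location)

result = integer.parseString("42").asList()
print(result)   # [42]
print(seen)     # ['matched', 0]
```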

Download it now from SourceForge!

June 28, 2006 - Wilson Fowlie to Give Pyparsing Presentation at VPW

Wilson Fowlie has been bit by the pyparsing bug, and will present "Pyparsing, A User's Perspective" at the Vancouver Python Workshop in August. Good luck, Wilson!