Latest Pyparsing News:


July 19, 2013 - Pyparsing 2.0.1 released!


Pyparsing's 2.0.0 release caused some issues for a number of users using Python 2.6 or 2.7, especially using modules like matplotlib or Celery that are dependent on pyparsing. With release 2.0.1, I've removed the code that was specific to Python 3.x, so that it can be installed on any Python version 2.6 or later. Users of Python 2.5 and earlier still need to install pyparsing version 1.5.7.

Pyparsing 2.0.1 also fixes a bug in the implementation of the '<<=' operator, so that it can be used in place of '<<' in recursive grammars without problems.

December 16, 2012 - Pyparsing 1.5.7/2.0.0 released!


Well, it looks like Python 3 is starting to catch on, so I've decided to split the versions of pyparsing such that version 1.5.x maintains compatibility with Python 2, and versions 2.x and beyond are Python 3-compatible. If you are using Python 2.x and installing with easy_install, use:
easy_install pyparsing==1.5.7

I am deprecating two popular API features, in the interests of renaming them better, and/or having them behave better:
  • Added new operator '<<=', which will eventually replace '<<' for storing the contents of a Forward(). '<<=' does not have the same operator precedence problems that '<<' does.
  • 'operatorPrecedence' is being renamed 'infixNotation' as a better description of what this helper function creates. 'operatorPrecedence' is deprecated, and will be dropped entirely in a future release.
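
As a rough illustration of both renamed features, the sketch below defines a small recursive grammar with '<<=' and an arithmetic parser with infixNotation. The grammar and the names item, arith, tree and parsed are made up for this example and are not taken from the pyparsing distribution.

```python
from pyparsing import (Forward, Group, Suppress, Word, ZeroOrMore,
                       alphas, infixNotation, nums, opAssoc)

LPAR, RPAR = map(Suppress, "()")

# Illustrative recursive grammar: an item is a word or a parenthesized list of items.
item = Forward()
# The whole right-hand side is evaluated before '<<=' stores it, so the
# alternation needs no extra parentheses (unlike 'item << a | b').
item <<= Word(alphas) | Group(LPAR + ZeroOrMore(item) + RPAR)

tree = item.parseString("(a (b c) d)").asList()

# infixNotation takes an operand expression and a list of operator levels,
# listed from highest to lowest precedence.
integer = Word(nums).setParseAction(lambda t: int(t[0]))
arith = infixNotation(integer, [
    ("*", 2, opAssoc.LEFT),
    ("+", 2, opAssoc.LEFT),
])

parsed = arith.parseString("1 + 2 * 3").asList()
print(tree)
print(parsed)
```

The nesting of the second result reflects the operator levels: the multiplication is grouped before the addition.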

Some other changes:
  • An awesome new example is included in this release, submitted by Luca DellOlio, for parsing ANTLR grammar definitions, nice work Luca!
  • Fixed implementation of ParseResults.__str__ to use Pythonic ''.join() instead of repeated string concatenation. This purportedly has been a performance issue under PyPy.
  • Fixed bug in ParseResults.__dir__ under Python 3, reported by Thomas Kluyver, thank you Thomas!
  • Added ParserElement.inlineLiteralsUsing static method, to override pyparsing's default behavior of converting string literals to Literal instances, to use other classes (such as Suppress or CaselessLiteral).
  • Added optional arguments lpar and rpar to operatorPrecedence, so that expressions that use it can override the default suppression of the grouping characters.
  • Added support for using single argument builtin functions as parse actions. Now you can write 'expr.setParseAction(len)' and get back the length of the list of matched tokens. Supported builtins are: sum, len, sorted, reversed, list, tuple, set, any, all, min, and max. A script demonstrating this feature is included in the examples directory.
  • Improved linking in generated docs, proposed on the pyparsing wiki by techtonik, thanks!
  • Fixed a bug in the definition of 'alphas', which was based on the string.uppercase and string.lowercase "constants", which in fact *aren't* constant, but vary with locale settings. This could make parsers locale-sensitive in a subtle way. Thanks to Kef Schecter for his diligence in following through on reporting and monitoring this bugfix!
  • Fixed a bug in the Py3 version of pyparsing, during exception handling with packrat parsing enabled, reported by Catherine Devlin - thanks Catherine!
  • Fixed typo in ParseBaseException.__dir__, reported anonymously on the SourceForge bug tracker, thank you Pyparsing User With No Name.
  • Fixed bug in srange when using '\x###' hex character codes.
  • Added optional 'intExpr' argument to countedArray, so that you can define your own expression that will evaluate to an integer, to be used as the count for the following elements. Allows you to define a countedArray with the count given in hex, for example, by defining intExpr as "Word(hexnums).setParseAction(lambda t: int(t[0],16))".
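
The builtin-as-parse-action feature can be tried with a couple of lines. This is only a sketch with invented expression names (word_count, integer); the script shipped in the examples directory may look different.

```python
from pyparsing import OneOrMore, Word, alphas, nums

# len receives the matched tokens, so the parse result becomes the token count.
word_count = OneOrMore(Word(alphas)).setParseAction(len)
count = word_count.parseString("the quick brown fox")[0]

# Convert each integer first, then let sum add up the converted tokens.
integer = Word(nums).setParseAction(lambda t: int(t[0]))
total = OneOrMore(integer).setParseAction(sum).parseString("1 2 3 4")[0]

print(count, total)
```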


September, 2011 - Writing DSL's with Pyparsing talk, presented at PyCon India/2011


Siddharta Govindaraj presented "Creating Domain Specific Languages in Python" at PyCon India/2011. He based his DSL parser examples on pyparsing, and did a very nice progression of his DSL parser from BNF to a pyparsing parser with named results. Siddharta's sample DSL was a form parser that would render to HTML:
INPUT:
UserForm
name:CharField -> label:Username size:25
email:EmailField -> size:32
password:PasswordField
 
OUTPUT:
<form id='userform'>
<label>Username</label>
<input type='text' name='name' size='25'/><br/>
<label>email</label>
<input type='text' name='email' size='32'/><br/>
<label>password</label>
<input type='password' name='password'/><br/>
</form>

Well done, and thanks for using pyparsing for your DSL example!


June 30, 2011 - Pyparsing 1.5.6 released!


After about 10 months, there is a new release of pyparsing, version 1.5.6. This release contains some small enhancements, some bugfixes, and some new examples.

Most notably, this release includes the first public release of the Verilog parser. I have tired of restricting this parser for commercial use, and so I am distributing it under the same license as pyparsing, with the request that if you use it commercially, please make a commensurate donation to your local Red Cross.

This release also contains a rewrite of the adaptive parse action arguments, based on a submission from none other than Raymond Hettinger - thanks Raymond for contributing to pyparsing!

New features:
  • Added 'ungroup' helper method, to address token grouping done implicitly by And expressions, even if only one expression in the And actually returns any text - also inspired by stackoverflow discussion with Frankie Ribery!
  • Enhancement to countedArray, accepting an optional expression to be used for matching the leading integer count - proposed by Mathias on the pyparsing mailing list, good idea!
  • Added the excludeChars argument to the Word class, to simplify defining a word composed of all characters in a large range except for one or two. Suggested by JesterEE on the pyparsing wiki.
  • Added optional overlap parameter to scanString, to return overlapping matches found in the source text.
  • Enhanced form of using the "expr('name')" style of results naming, in lieu of calling setResultsName. If name ends with an '*', then this is equivalent to expr.setResultsName('name',listAllMatches=True).
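
Three of these additions can be exercised together in a short sketch. The sample strings and the names field, row, number and word are assumptions chosen for illustration only.

```python
from pyparsing import (Literal, OneOrMore, Suppress, Word, ZeroOrMore,
                       alphas, nums, printables)

# excludeChars: any run of printable characters except the comma.
field = Word(printables, excludeChars=",")
row = field + ZeroOrMore(Suppress(",") + field)
fields = row.parseString("a-1,b:2,c/3").asList()

# overlap=True also reports matches that start inside an earlier match.
pair = Literal("aa")
plain_starts = [start for _, start, _ in pair.scanString("aaaa")]
overlap_starts = [start for _, start, _ in pair.scanString("aaaa", overlap=True)]

# A trailing '*' on the name keeps every match instead of only the last one.
number = Word(nums)("numbers*")
word = Word(alphas)("words*")
result = OneOrMore(number | word).parseString("a 1 b 2")
numbers = result["numbers"].asList()
words = result["words"].asList()

print(fields, plain_starts, overlap_starts, numbers, words)
```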

Other new examples:
  • protobuf parser - parses Google's protobuf language
  • btpyparse - a BibTeX parser contributed by Matthew Brett, with test suite test_bibparse.py (thanks, Matthew!)
  • groupUsingListAllMatches.py - demo using trailing '*' for results names


August 12, 2010 - Typo in Pyparsing 1.5.4, Pyparsing 1.5.5 released!


To my dismay, my Python 3 installation for Pyparsing 1.5.4 *still* had a typo in it, and so to keep things clear for the 100 or so people/organizations who jumped right on it and downloaded version 1.5.4, that version has been replaced with version 1.5.5. And I have confirmation from users in both Python 2 and Python 3 worlds that 1.5.5 installs without any problems!

I think this release will probably mark the end of any substantial work on the Python 2.x branch of pyparsing. There have been no substantive changes or bugfixes in about 16 months, so I think it is at a solid plateau. Moving forward, I'll probably make changes only in the Python 3.0 side of the house. I guess that might be a decent time to start numbering the releases 2.x. (Leaves me room for 1.5.n+1 in case I really need/want to do something in the Py2 world still.) Or maybe jump to pyparsing 3.x, to tie in with Python 3.


August 11, 2010 - Pyparsing 1.5.4 released!


With this release of pyparsing, all of the Python 3 incompatibilities should be resolved! This version will install on Python 3 with no complaints about __builtin__ or file usages, which were removed in Python 3.

In addition, I've added 2 more example programs:
  • apiCheck.py - a demo of a syntax scanner of a Tcl-like syntax for verifying proper parameter passing to API functions. Based on a wiki inquiry by Peter Lom, thanks Peter!
  • deltaTime.py - a parser to convert informal time references such as "in 3 minutes", "a couple of days ago", or "next Sunday at 2pm" into Python datetime objects.


June 24, 2010 - Pyparsing 1.5.3 released!


The latest version of pyparsing is now available from SourceForge. This version is mostly a bug-fix and compatibility release. Most notably, this release resolves the installation problems created in version 1.5.2, with a cleaner installation approach for Python 2.x vs. Python 3.x.

IMPORTANT API CHANGE for PYTHON 3 USERS! - This release also clears up the import discrepancy between the two versions of Python, that was introduced in version 1.5.2 - now regardless of Python version, users can just write import pyparsing in their code, there is no longer a separate pyparsing_py3 module.

Other changes and fixes:

  • Fixed bug on Python3 when using parseFile, getting bytes instead of a str from the input file.
  • Fixed subtle bug in originalTextFor, if followed by significant whitespace (like a newline) - discovered by Francis Vidal, thanks!
  • Fixed related bug in originalTextFor in which trailing comments or otherwise ignored text got slurped in with the matched expression. Thanks to michael_ramirez44 on the pyparsing wiki for reporting this just in time to get into this release!
  • Fixed very sneaky bug in Each, in which Optional elements were not completely recognized as optional - found by Tal Weiss, thanks for your patience.
  • Fixed off-by-1 bug in line() method when the first line of the input text was an empty line. Thanks to John Krukoff for submitting a patch!
  • Fixed bug in transformString if grammar contains Group expressions, thanks to patch submitted by barnabas79, nice work!
  • Added better support for summing ParseResults, see the new example, parseResultsSumExample.py.
  • Added support for composing a Regex using a compiled RE object; thanks to my new colleague, Mike Thornton!
  • In version 1.5.2, I changed the way exceptions are raised in order to simplify the stacktraces reported during parsing. An anonymous user posted a bug report on SF that this behavior makes it difficult to debug some complex parsers, or parsers nested within parsers. In this release I've added a class attribute ParserElement.verbose_stacktrace, with a default value of False. If you set this to True, pyparsing will report stacktraces using the pre-1.5.2 behavior.
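
For the compiled-RE support mentioned above, a minimal sketch might look like the following. The date pattern and the name iso_date are hypothetical; the point is only that an already compiled pattern object is passed to Regex, and that its named groups come back as results names.

```python
import re

from pyparsing import Regex

# A pattern compiled ahead of time, with named groups.
iso_date_re = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# Regex accepts the compiled object as well as a pattern string.
iso_date = Regex(iso_date_re)

result = iso_date.parseString("2010-06-24")
matched = result[0]
year = result["year"]
month = result["month"]
print(matched, year, month)
```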

This release also includes several new examples (soon to be added to the wiki Examples page):

  • A MicroC compiler submitted by Zarko Zivanov. (Note: this example is separately licensed under the GPLv3, and requires Python 2.6 or higher.) Thank you, Zarko!
  • A subset C parser, using the BNF from the 1996 Obfuscated C Contest.
  • A parser for reading SQLite SELECT statements, as specified at http://www.sqlite.org/lang_select.html; this goes into much more detail than the simple SQL parser included in pyparsing's source code
  • A modified version of stateMachine.py submitted by Matt Anderson, that is compatible with Python versions 2.7 and above - thanks so much, Matt!
  • excelExpr.py, a *simplistic* first-cut at a parser for Excel expressions, which I originally posted on comp.lang.python in January, 2010; beware, this parser omits many common Excel cases (addition of numbers represented as strings, references to named ranges)
  • cpp_enum_parser.py, a nice little parser posted by Mark Tolonen on comp.lang.python in August, 2009 (redistributed here with Mark's permission). Thanks a bunch, Mark!
  • partial_gene_match.py, a sample I posted to Stackoverflow.com, implementing a special variation on Literal that does "close" matching, up to a given number of allowed mismatches. The application was to find matching gene sequences, with allowance for one or two mismatches.
  • tagCapture.py, a sample showing how to use a Forward placeholder to enforce matching of text parsed in a previous expression.
  • matchPreviousDemo.py, simple demo showing how the matchPreviousLiteral helper method is used to match a previously parsed token.


March 22, 2010 - PyCon presentation on PLY and Pyparsing


Andrew Dalke's PyCon presentation, comparing PLY and PyParsing, is now online here. Andrew confesses to a preference for PLY, but his presentation is very even-handed - great job, Andrew!

September 10, 2009 - Pyparsing Compatibility Testing


Great news for Jythoners and IronPythoners!

I just finished running my unit test suite with pyparsing 1.5.2 against Jython 2.5.0 and IronPython 2.0.2. Happily, pyparsing is nearly 100% compatible with both! IronPython fails to run the keepOriginalText parse action, but this has been deprecated in favor of the new helper originalTextFor.


May 1, 2009 - Python Magazine article on developing DSL's with pyparsing


The April issue of Python Magazine includes a feature article, "Create Your Own Domain Specific Language in Python With Imputil and Pyparsing," on using pyparsing to define your own domain-specific language (DSL) mixed right in with your Python code. Here is an example of Python, augmented using the imputil hook to support an inline state machine definition, in file trafficLight.pystate:
# define state machine
statemachine TrafficLight:
    Red -> Green
    Green -> Yellow
    Yellow -> Red
 
# define some class level constants
Red.carsCanGo = False
Yellow.carsCanGo = True
Green.carsCanGo = True
 
Red.delay = wait(20)
Yellow.delay = wait(3)
Green.delay = wait(15)
And here is how that module would look in a script that imports it:
import statemachine
import trafficLight
 
tl = trafficLight.Red()
for i in range(6):
    print tl, "GO" if tl.carsCanGo else "STOP"
    tl.delay()
    tl = tl.next_state()
Use this technique to enhance Python with your own domain syntax!

April 18, 2009 - Pyparsing 1.5.2 released!


The latest version of pyparsing is now available from SourceForge. This version is mostly a bug-fix and compatibility release:
  • Added pyparsing_py3.py module, so that Python 3 users can use pyparsing by changing their pyparsing import statement to:
      import pyparsing_py3
  • Removed __slots__ declaration on ParseBaseException, for compatibility with IronPython 2.0.1.
  • Fixed bugs in SkipTo/failOn and ignore handling.
  • Simplified exception stack traces when reporting parse exceptions back to caller of parseString or parseFile.
  • Changed behavior of scanString to avoid infinitely looping on expressions that match zero-length strings.
  • Added new example eval_arith.py, which extends the example simpleArith.py to actually evaluate the parsed expressions.

January 12, 2009 - Check out pyparsing_helper 0.1.0!

Catherine Devlin writes in her blog about developing pyparsing_helper, a GUI workbench for working on pyparsing grammars and testing them against sample source strings. Even though it is at version 0.1.0, it is very helpful at quickly trying and tweaking simple grammars. Here is a screenshot from Catherine's blog:
[Image: pyparsing_helper_screenshot.png]
Download pyparsing_helper here! (or just use "easy_install pyparsing_helper")

October 18, 2008 - Pyparsing 1.5.1 Released!

I've just uploaded to SourceForge the latest update to pyparsing, version 1.5.1. This version includes a few new features, and some bug-fixes:
  • Added __dir__() methods to ParseBaseException and ParseResults, to support new dir() behavior in Py2.6 and Py3.0. If dir() is called on a ParseResults object, the returned list will include the base set of attribute names, plus any results names that are defined.

  • Added new helper method originalTextFor, to replace the use of the current keepOriginalText parse action. The implementation of originalTextFor is simpler and faster than keepOriginalText, and does not depend on using the inspect or imp modules.

  • Added failOn argument to SkipTo, so that grammars can define literal strings or pyparsing expressions which, if found in the skipped text, will cause SkipTo to fail. Useful to prevent SkipTo from reading past a terminating expression.

  • Fixed bug in nestedExpr if multi-character expressions are given for nesting delimiters.

  • Removed dependency on xml.sax.saxutils.escape, and included internal implementation instead.

  • Fixed typo in ParseResults.insert.

  • Fixed bug in '-' error stop, when '-' operator is used inside a Combine expression.

  • Fixed bug in parseString(parseAll=True), when the input string ends with a comment or whitespace.

  • Fixed bug in LineStart and LineEnd that did not recognize any special whitespace chars defined using ParserElement.setDefaultWhitespaceChars.

  • Forward class is now more tolerant of subclassing.
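
To give a feel for originalTextFor, here is a small sketch comparing the normal token list with the original text slice. The expression name words and the sample string are invented for the example.

```python
from pyparsing import OneOrMore, Word, alphas, originalTextFor

words = OneOrMore(Word(alphas))
sample = "hello    big   world"

# Normal parsing returns the individual tokens; the spacing is lost.
tokens = words.parseString(sample).asList()

# originalTextFor returns the matched slice of the input as one string,
# with the original spacing intact.
text = originalTextFor(words).parseString(sample)[0]

print(tokens)
print(repr(text))
```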

All are described in more detail in the CHANGES file.

(I'm still having to maintain separate code for Python 3 compatibility, due to the change in syntax when catching exceptions. For those wishing to use pyparsing with Python 3.0, please get pyparsing_py3.py from the SVN repository.)

August 29, 2008 - Python Magazine article on advanced Pyparsing methods

The August issue of Python Magazine includes an article on using pyparsing to parse text, with dynamic results names. The article steps through a series of examples, including an interesting example to parse this table of data:
+------+------+------+------+------+
| samp |  min |  max |  ave | sdev |
+------+------+------+------+------+
|  A1  |    7 |   11 |    9 |    1 |
|  B1  |   43 |   52 |   47 |    3 |
|  C1  |    7 |   10 |    8 |    1 |
|  A2  |   82 |   85 |   84 |    1 |
|  B2  |   98 |  112 |  106 |    3 |
|  C2  |    1 |    4 |    3 |    1 |
+------+------+------+------+------+
Using results names, we then modify the table format, and the same parser works with no changes at all! The article finishes with an example of parsing the JSON data format, using dynamic results names to construct parse results that look like a demarshaled JSON object. That is, from this JSON object description:
{ "reference" :
  { "article" :
    { "title" : "Writing a Simple Interpreter/Compiler with Pyparsing",
      "author": "Paul McGuire",
      "magazine" : "Python Magazine",
      "issue" : "May, 2008",
      "pages" : 6      } } }
the parsed results allow you to write code like this:
entry = parser.parseString(jsontext)
print entry["reference"]["article"]["pages"]
print entry.reference.article.issue
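
The article's own parser is not reproduced here. As a much smaller sketch of the same idea, pyparsing's Dict class can assign results names at parse time from the parsed keys; the key/value format and the names entry and record below are assumptions made up for this example.

```python
from pyparsing import Dict, Group, OneOrMore, Suppress, Word, alphas, nums

# Each entry is a key/value pair; Group keeps the pair together.
entry = Group(Word(alphas) + Suppress(":") + Word(nums))

# Dict uses the first token of every group as a results name for the rest.
record = Dict(OneOrMore(entry))

parsed = record.parseString("pages: 6 year: 2008")

# The dynamically named values can be read by key or by attribute.
pages_by_key = parsed["pages"]
year_by_attr = parsed.year
print(pages_by_key, year_by_attr)
```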

July 20, 2008 - Pyparsing helps track down international arms dealer!

Alexander Harrowell chronicles his efforts in this slideshow to uncover the illegal cargo flights of arms dealer Viktor Bout. You can view Alexander's continuing efforts by pasting this link (http://seagrass.goatchurch.org.uk/~yorksranter/viktorfeed.xml) into a Google Maps search field.


June 18, 2008 - Pyparsing 1.5.0 tests show compatibility with Python 2.6b1

I just downloaded the release of Python 2.6b1 and reran the unit test suite for pyparsing 1.5.0. There were no changes or regressions.

Here are the performance results using the Verilog parser, with various optimizations (performance is shown in lines parsed per second - larger number is faster):


            Python V2.5.1    Python V2.6b1
base            209.2            307.0
packrat         349.8            408.0

So it would appear that Python 2.6 is 15-50% faster in parsing! (Couldn't test with psyco, since there is no psyco download available for Python 2.6 yet.)

June 1, 2008 - Pyparsing 1.5.0 Released!

I've just uploaded to SourceForge the latest update to pyparsing, version 1.5.0. This version includes a number of long-awaited features, so I thought it was time to bump the minor rev version:
  • parsing a complete string without having to add StringEnd() to the pyparsing grammar, by adding parseAll argument to the parseString method (default value is False to maintain compatibility with prior versions, set to True to force parsing of the full input string)
  • support for indentation-based grammars (like Python's), using a new helper method, indentedBlock.
  • improved syntax error detection and reporting, based on the ErrStop class submitted by Eike Welk on the pyparsing forum, and Thomas/Poldy's proposal on the pyparsing wiki; the CHANGES file includes a detailed example showing how syntax errors can be designated by using '-' instead of '+' operators
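
A short sketch of the first and third features follows. The tiny assignment grammar is invented for illustration; the detailed example in the CHANGES file remains the reference.

```python
from pyparsing import ParseException, ParseSyntaxException, Word, alphas, nums

integer = Word(nums)

# By default, parsing stops quietly at the first text that does not match.
partial = integer.parseString("123 abc").asList()

# parseAll=True requires the expression to consume the whole input.
try:
    integer.parseString("123 abc", parseAll=True)
    full_parse_failed = False
except ParseException:
    full_parse_failed = True

# After '-', a mismatch is reported as a syntax error right away instead of
# letting the parser backtrack and try other alternatives.
assignment = Word(alphas) + "=" - integer
try:
    assignment.parseString("x = oops")
    syntax_error = None
except ParseSyntaxException as err:
    syntax_error = err

print(partial, full_parse_failed, syntax_error)
```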

Pyparsing 1.5.0 also includes a number of bug-fixes, described in more detail in the CHANGES file.

(Despite my best efforts, I have *not* been able to include support for Python 3.0 using a common source code base. For those who wish to try out pyparsing with Python 3.0, there is a file pyparsing_py3.py in the SourceForge Subversion repository. I have done some testing of this code, but many of my unit tests still need to be converted to Python 3.)

May 29, 2008 - Python Magazine Features Pyparsing

The May issue of Python Magazine includes the article "Writing a Simple Interpreter/Compiler with Pyparsing," in which I step through the development of a BrainF*ck interpreter using pyparsing. For those unfamiliar with BF, here is "Hello, World!" in that scatonymous (a word I made up myself for the article) language:
++++++++++[>+++++++>++++++++++>+++<<<-]>++.>+.+++++
++..+++.>++.<<+++++++++++++++.>.+++.------.--------
.>+.
The plot twist comes at the end when I convert the interpreter to a compiler. I had a lot of fun writing the article, and according to Doug Hellmann in his blog, he had fun editing it!

April 10, 2008 - Pyparsing innovations abound!

Check out the "Who's Using Pyparsing" page to see the latest in pyparsing creativity!
  • asDox - Actionscript class extractor
  • svg2imagemap - SVG -> HTML image map converter
  • Quameon - Quantum Monte Carlo algorithms implemented in Python
  • Pybtex - BibTeX parser
  • Tunnelhack - text adventure
  • madlib - fiction generator web service
  • poetrygen - poetry generator
  • PyMLNs - Markov Logic Networks
  • dsniff - network monitor
  • Bauble - biodiversity database
  • Firebird PowerTool
  • Numbler - spreadsheet web service

February 10, 2008 - Pyparsing version 1.4.11 released

This version of pyparsing is updated to be compatible with Python 3.0a3, thanks to help from Robert A. Clark!

There are also some interesting new features in this release:
  • Added WordStart and WordEnd positional classes, to support expressions that must occur at the start or end of a word. Proposed by piranha on the pyparsing wiki, good idea!
  • Added matchOnlyAtCol helper parse action, to simplify parsing log or data files that have optional fields that are column dependent. Inspired by a discussion thread with hubritic on comp.lang.python.
  • Added withAttribute.ANY_VALUE as a match-all value when using withAttribute. Used to ensure that an attribute is present, without having to match on the actual attribute value.
  • Added get() method to ParseResults, similar to dict.get(). Suggested by new pyparsing user, Alejandro Dubrovksy, thanks!
  • Added '==' short-cut to see if a given string matches a pyparsing expression. For instance, you can now write:
    integer = Word(nums)
    if "123" == integer:
       # do something
 
    print [ x for x in "123 234 asld".split() if x==integer ]
    # prints ['123', '234']
  • Changed '<<' operator on Forward to return None, since this is really used as a pseudo-assignment operator, not as a left-shift operator. By returning None, it is easier to catch faulty statements such as a << b | c, where precedence of operations causes the '|' operation to be performed *after* inserting b into a, so no alternation is actually implemented. The correct form is a << (b | c). With this change, an error will be reported instead of silently clipping the alternative term. (Note: this may break some existing code, but if it does, the code had a silent bug in it anyway.) Proposed by wcbarksdale on the pyparsing wiki, thanks!
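
Two of the smaller additions in this list, get() and the WordStart/WordEnd classes, can be sketched as follows. The expressions entry and whole_cat and the sample inputs are hypothetical.

```python
from pyparsing import Optional, Word, WordEnd, WordStart, alphas, nums

# get() returns a default when an optional results name was not matched,
# in the same way as dict.get().
entry = Word(alphas)("name") + Optional(Word(nums)("age"))
result = entry.parseString("alice")
name = result.get("name")
age = result.get("age", "unknown")

# WordStart/WordEnd restrict a match to whole words: "cat" inside
# "concat" or at the front of "catalog" is skipped.
whole_cat = WordStart(alphas) + "cat" + WordEnd(alphas)
hits = whole_cat.searchString("cat concat catalog cat")

print(name, age, len(hits))
```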
And finally, several unit tests were added to pyparsing's regression suite, courtesy of the Google Highly-Open Participation Contest. Thanks to all who administered and took part in this event!

December 10, 2007 - Pyparsing version 1.4.10 released

Mostly bug-fixes in this release, but a few new features as well - here are the high points:
  • Expression multiplication - now you can specify an exact number of replicated elements, or a range of replicated elements using the '*' multiplication operator. If you multiply an expression 'expr' by an integer 'n', then you will get the equivalent of expr + expr + ..., or 'expr' repeated 'n' times. You can also multiply an expression by a two-integer tuple (m,n), meaning 'at least m copies of expr, up to n copies total'. Of course, 'n' must be greater than 'm'; 'm' may be 0.
  • pop() method added to ParseResults - now you can call pop() on the ParseResults passed to a parse action, or returned from calling parseString, to extract the -1'th element of the ParseResults, or call pop(n) to extract the n'th element of the ParseResults (just like with a Python list). If you call pop(resultsName), where resultsName is a key created as an expression results name, then that key-value will be extracted from the dict portion of the ParseResults.
  • Fixed bug in nestedExpr - the original version of nestedExpr was too greedy, in that it would consume all text after the last closing delimiter as content at a 0'th level of nesting. This has been fixed.
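
The multiplication operator and pop() described above might be used like this. It is a sketch only; the names triple, one_to_three and pair are invented for the example.

```python
from pyparsing import Word, alphas, nums

# Multiplying by an integer repeats the expression exactly that many times.
triple = Word(nums) * 3
exact = triple.parseString("1 2 3").asList()

# Multiplying by a (min, max) tuple allows a range of repetitions;
# extra input beyond the maximum is left unparsed.
one_to_three = Word(alphas) * (1, 3)
ranged = one_to_three.parseString("a b c d").asList()

# pop() with no argument or an integer behaves as it does for a list.
tokens = triple.parseString("1 2 3")
last = tokens.pop()
first = tokens.pop(0)
remaining = tokens.asList()

# pop() with a results name removes that name from the dict portion.
pair = (Word(alphas)("key") + Word(nums)("val")).parseString("x 1")
key = pair.pop("key")
key_still_named = "key" in pair

print(exact, ranged, last, first, remaining, key, key_still_named)
```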
(I had released version 1.4.9 but quickly withdrew it when it was reported that a serious bug remained in the operatorPrecedence code - this bug is fixed in 1.4.10.)

December 5, 2007 - Pyparsing participates in Google's Highly Open Participation Contest

The Google Highly Open Participation Contest (GHOP) is an effort by Google to involve pre-university students in open source projects, by offering a set of short 2-3 day tasks.

The GHOP includes several tasks for improving pyparsing unit tests and example code. Look for these contributions in future versions of pyparsing.

Learn more about the GHOP at http://code.google.com/opensource/ghop/2007-8/.

November 9, 2007 - UtilityMill includes Pyparsing among available modules

UtilityMill is a web site with common utilities implemented as HTML forms, using Python code. Pyparsing has been included in the list of available modules for implementing a handy utility. See this example, a repackaging of .

November 5, 2007 - Blogging about Pyparsing (a la the Daily Python URL)

Here are some interesting recent blog entries related to Pyparsing:
  • Andrew Dalke does several implementations of the chemical formula parser and molecular weight calculator, and includes a mention of the pyparsing example. Pyparsing's version comes in last in raw performance, but Andrew is not eager to discard pyparsing even so.
  • Hairy Trollstomper describes a web attack on their system database application, and posts the code he used to automate blacklist/whitelist management by parsing the server log file.
  • Michel Hollands documents his early experiences with pyparsing (parsing SQL source and parsing SQL INSERT statements), and his thoughts on the new e-book.

October 7, 2007 - Pyparsing version 1.4.8 released

The latest release of pyparsing includes some new helper methods:
  • nestedExpr - a helper expression for easily creating grammars for expressions with nesting within ()'s, []'s, {}'s, etc. There is a new example included in this release that demonstrates the use of this expression.
  • withAttribute - a helper parse action to be used with the starting tag expression returned by makeHTMLTags. Many times, when scraping a web page, you must extract a <TD> or <DIV> tag, but really only want a special form of this tag with a particular class or other attribute. Use withAttribute to set up an additional filter for such parsers. There is a new example in the examples directory showing how to use this new parse action.
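
Both helpers can be tried in a few lines. This sketch uses made-up input and hypothetical names (grid_div, grid); the examples shipped with the release are the authoritative demonstrations.

```python
from pyparsing import SkipTo, makeHTMLTags, nestedExpr, withAttribute

# nestedExpr defaults to parentheses and returns nested lists of the content.
nested = nestedExpr().parseString("(a (b c) d)").asList()

html = '<div type="grid">1 4 0 1 0</div><div type="graph">1,3 2,3 1,1</div>'

div, div_end = makeHTMLTags("div")

# Work on a copy of the opening tag, and accept it only when its
# type attribute has the value "grid".
grid_div = div.copy().addParseAction(withAttribute(type="grid"))
grid = grid_div + SkipTo(div_end)("body") + div_end

matches = grid.searchString(html)
body = matches[0].body

print(nested)
print(len(matches), body)
```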

This release also includes some improvements and bug-fixes:
  • added performance speedup to grammars using operatorPrecedence
  • fixed bug/typo when deleting an element from a ParseResults by using the element's results name.
  • fixed whitespace-skipping bug in wrapper classes (such as Group, Suppress, Combine, etc.)
  • fixed bug in makeHTMLTags that did not detect HTML tag attributes with no '= value' portion (such as '<td nowrap>')
  • fixed minor bug in makeHTMLTags and makeXMLTags, which did not accept whitespace in closing tags.

October 5, 2007 - Pyparsing E-book Available from O'Reilly

O'Reilly Publishing has released a 65 page e-book, Getting Started With Pyparsing by Paul McGuire, on its catalog of Short Cuts for the modest sum of US$9.99. Some of the topics covered are:
- Basic structure of a Pyparsing program
- The Zen of Pyparsing (FREE sample chapter online)
- "Hello, World!" revisited, with more elaborate grammar and results processing than has been previously published
- Parser for S-Expressions
- Extracting complex table data from a web page
- Parsing search strings, and writing a search engine in under 100 lines of code
All examples are accompanied by complete source code listings. Throughout the book, helpful tips and notes are covered in separate sidebar discussions, finishing with a short list of additional helpful resources. Lastly, an index adds to the reference value of this book. Download a copy today!

September 9, 2007 - Pyparsing Recipe in the Python Cookbook

Kevin Atkinson has submitted this recipe to the Python Cookbook. It uses a new feature of the Python eval and exec commands to implement custom behavior when a symbol is not found. So instead of this:
B = Forward()
C = Forward()
A = B + C
B << Literal('b')
C << Literal('c')
D_list = Forward()
D = Forward()
D_list << (D | (D + D_list))
D << Literal('d')

you can just write this:
A = B + C
B = Literal('b') 
C = Literal('c')
D_list = D | (D + D_list)
D = Literal('d')

This is not a feature of pyparsing itself, but it is a handy add-on for auto-defining Forwards.

August 12, 2007 - New Pyparsing Applications

Check out the latest "Who's Who" of pyparsing users, demonstrating some novel applications of integrating pyparsing into higher-level applications:
  • Zhpy - Chinese->English Python pre-processor
  • Robotic instruction assembler - assembler for Khepera robot programming
  • ZMAS - Agent Communication Language message parser
  • pymon - Network Administration and Monitoring command parser

July 22, 2007 - Pyparsing 1.4.7 released

A minor upgrade release, but with a major notation shortcut. You can now invoke setResultsName using function call notation. That is, instead of this:
    stats = ("AVE:" + realNum.setResultsName("average") +
             "MIN:" + realNum.setResultsName("min") +
             "MAX:" + realNum.setResultsName("max"))
you can write this:
    stats = ("AVE:" + realNum("average") +
             "MIN:" + realNum("min") +
             "MAX:" + realNum("max"))
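
For a complete, runnable version of the shortcut, the sketch below supplies a simple realNum definition and a sample input line. Both are assumptions, since the original snippet does not show them.

```python
from pyparsing import Combine, Word, nums

# Assumed definition of realNum: digits, a decimal point, digits.
realNum = Combine(Word(nums) + "." + Word(nums))

# Calling an expression with a string is shorthand for setResultsName.
stats = ("AVE:" + realNum("average") +
         "MIN:" + realNum("min") +
         "MAX:" + realNum("max"))

result = stats.parseString("AVE: 12.5 MIN: 3.0 MAX: 20.25")
average = result["average"]
lowest = result["min"]
highest = result["max"]
print(average, lowest, highest)
```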

Also added the new method setBreak() on ParserElement, so that you can break out to the Python PDB debugger when about to parse an expression.

Other bugfixes:
  • fixed a packrat parsing bug with cached ParseResults
  • fixed a bug in operatorPrecedence with unary operators
  • corrected the AND/OR precedence in the simpleBool.py example
  • fixed a bug in the Dict class, in which non-string keys were converted to strings
  • fixed a bug in StringEnd and LineEnd when reading at the end of the input string

July 1, 2007 - Pyparsing Now Supports easy_install

Thanks to help from Jochen Kupperschmidt, pyparsing is now able to be installed using easy_install. Please post any questions or suggestions to the discussion tab on the pyparsing wiki home page.

June 14, 2007 - Pyparsing Subversion Repository rehosted to SourceForge

I just finished going through the process of exporting my local SVN repository, and uploading it to be hosted on SourceForge. The repository address is:
https://pyparsing.svn.sourceforge.net/svnroot/pyparsing
This will allow you to download the latest version of pyparsing direct from SVN, in advance of any formal releases. You can get more info on the pyparsing SourceForge SVN page.

May 23, 2007 - New Pyparsing-based success stories

New success stories on the Who's Using Pyparsing page. Check out:

April 29, 2007 - New online tutorial article

Sam writes in his blog a step-by-step tutorial, cataloging his efforts at building up a search-string parser. This tutorial starts with basic word recognition, and builds up to include special search prefixes and parenthesized lists.

April 11, 2007 - Pyparsing 1.4.6 released

This version of pyparsing has a few more bug-fixes and enhancements:
  • tuple as named result now reports entire tuple, not just first element
  • SkipTo with include=True now returns the skipped-to tokens properly
  • makeHTMLTags/makeXMLTags and anyOpenTag and anyCloseTag helpers now recognize attributes and tags with namespaces
  • countedArray now matches when defined within an Or expression
  • keepOriginalText now preserves named results fields
  • fixed Unicode bug in upcase and downcase methods
  • corrected typo in OnlyOnce reset() method
  • enhanced documentation to describe behavior when parsing input strings containing tabs
  • cleaned up internal decorators to preserve function names, docstrings, etc.

New examples included in this release:
  • holaMundo.py - Spanish translation of HelloWorld
  • sexpParser.py - S-exp parser
  • jsonParser.py (update) - minor bug-fixes to previously released example for parsing JSON (JavaScript Object Notation) object serialization strings
  • macroExpander.py - macro preprocessor to substitute defined macros

December 22, 2006 - Pyparsing 1.4.5 released

This latest version of pyparsing has a few minor bug-fixes and enhancements, and improves parsing speed by up to 100%.

This release also includes some new examples:
  • parsePythonValue.py - parses strings representing lists, dicts, and tuples, with nesting support
  • sql2dot.py - SQL diagram generator, parsed from schema table definitions
  • jsonParser.py - parses JSON object serializations into hierarchical ParseResults, accessible using list, dict, or object attribute methods
  • htmlStripper.py - strips HTML tags from HTML pages, leaving only body text

November 3, 2006 - Secret Meaning of P.Y.P.A.R.S.I.N.G. revealed!

On comp.lang.python, Tim Chase posted this summary of the SE, Pyparsing, and re modules, in response to a comment by the effbot:

  • > nah, if you've spent more than five minutes on c.l.python lately, you'd noticed that it's the Solution to Everything (up there with pyparsing, I think).

  • SE would be the Solution to Everything.

  • pyparsing Provides Your Perfect Alternative where Regexp Syntax Is No Good.

  • The "re" module is just Routinely Expected to solve all problems.

Nice job, Tim!


October 19, 2006 - Pyparsing 1.4.4 released

The latest release of pyparsing includes mostly minor bug-fixes and enhancements. Here are some of the more interesting new features:
- Added new helper operatorPrecedence (based on e-mail list discussion with Ralph Corderoy and Paolo Losi), to facilitate definition of grammars for expressions with unary and binary operators. For instance, this grammar defines a 6-function arithmetic expression grammar, with unary plus and minus, proper operator precedence, and right- and left-associativity:
 expr = operatorPrecedence( operand,
     [("!", 1, opAssoc.LEFT),
      ("^", 2, opAssoc.RIGHT),
      (oneOf("+ -"), 1, opAssoc.RIGHT),
      (oneOf("* /"), 2, opAssoc.LEFT),
      (oneOf("+ -"), 2, opAssoc.LEFT),]
     )
- Added new helpers matchPreviousLiteral and matchPreviousExpr, for creating adaptive parsing expressions that match the same content as was parsed in a previous part of the parsing grammar.
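
To show how the precedence table in the operatorPrecedence item shapes the parsed output, here is a minimal runnable sketch. It makes two assumptions: the operand is a plain unsigned integer, and the helper is imported under its later name infixNotation (operatorPrecedence was the name at the time of this release, and infixNotation was introduced as its replacement in later versions).

```python
from pyparsing import Word, infixNotation, nums, oneOf, opAssoc

# Assumed operand for illustration: an unsigned integer.
operand = Word(nums)

# Precedence levels are listed from tightest binding to loosest.
expr = infixNotation(operand, [
    ("!", 1, opAssoc.LEFT),            # postfix factorial
    ("^", 2, opAssoc.RIGHT),           # exponentiation, right-associative
    (oneOf("+ -"), 1, opAssoc.RIGHT),  # unary sign
    (oneOf("* /"), 2, opAssoc.LEFT),   # multiplication and division
    (oneOf("+ -"), 2, opAssoc.LEFT),   # addition and subtraction
])

# Multiplication binds tighter than addition, so "2 * 3" is grouped first.
product_first = expr.parseString("1 + 2 * 3").asList()

# Exponentiation groups from the right.
right_assoc = expr.parseString("2 ^ 3 ^ 2").asList()

print(product_first)
print(right_assoc)
```

Each operator level that actually matches produces a nested group, so the nesting of the resulting lists mirrors the precedence and associativity declared in the table.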

October 4, 2006 - Pyparsing listed in Wikipedia

Well, this really happened on January 26 of this year, but I only just noticed it today. Pyparsing was included as a reference for recursive descent parsing in the Wikipedia article "Recursive descent parser". World domination can only be around the corner!

September 19, 2006 - Pyparsing 1.4.3 tests show compatibility with Python 2.5

I just downloaded the final production release of Python 2.5 and reran the unit test suite for pyparsing 1.4.3. There were no changes or regressions.

Here are the performance results using the Verilog parser, with various optimizations (performance is shown in lines parsed per second - larger number is faster):


Optimization        Python V2.4.1    Python V2.5
base                160.6            146.5
packrat             428.7            395.8
psyco               365.7            - n/a -
packrat + psyco     614.4            - n/a -

So it would appear that Python 2.5 is about 5-10% slower in parsing. (Couldn't test with psyco, since there is no psyco download available for Python 2.5 yet.)
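
The 5-10% estimate can be checked from the two configurations that were measured on both Python versions. The short calculation below is only a sketch of that arithmetic, using the base and packrat figures reported above; the variable and function names are made up for the example.

```python
# Lines parsed per second, copied from the figures reported above.
rates = {
    "base":    {"2.4.1": 160.6, "2.5": 146.5},
    "packrat": {"2.4.1": 428.7, "2.5": 395.8},
}

def percent_slower(old, new):
    """Return how much slower `new` is than `old`, as a percentage."""
    return (old - new) / old * 100.0

# Slowdown of Python 2.5 relative to Python 2.4.1, rounded to one decimal.
slowdowns = {
    name: round(percent_slower(r["2.4.1"], r["2.5"]), 1)
    for name, r in rates.items()
}
print(slowdowns)
```

Both values come out between 7% and 9%, which is consistent with the range quoted.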

August 6, 2006 - Wilson Fowlie Knocks 'Em Dead at Vancouver Python Workshop

On Sunday, Wilson Fowlie presented his talk on "Pyparsing: A User's Perspective". The response was quite good - here are some excerpts from his report:
It went well! I got some good questions, showed off some well-received features (the access to named results by attribute or dict index was a big hit) and the feedback afterwards - both about pyparsing itself and the presentation - was very positive. A lot of people agreed that it would be very useful.
One guy even came up to me afterwards and told me that the whole reason he *came* to the conference was to find a tool like pyparsing. When I said that I was glad that I'd been there even for one person, he said that he'd heard a lot of murmuring around him to the effect that people would be able to find a use for it.
There was even an existing pyparsing user in the audience, which both added even more legitimacy to the module (if it were needed) and was a bit unnerving (what if he knew more than I did? :).
I had a really good time doing this presentation, Paul... I learned more about pyparsing in the process (since I didn't only want to talk about the particular features I'd used myself), so it was valuable in that way, too!
Here are the slides from Wilson's talk.

July 1, 2006 - Pyparsing 1.4.3 Released

The latest release of pyparsing includes some long-awaited bug-fixes, and a number of enhancements to parse actions.
  • simplified parse action interface; parse actions no longer must take all three arguments consisting of the original parsed string, the parsed location, and the parsed tokens; parse actions can now be defined with simplified argument interfaces:
    • no arguments
    • just the parsed tokens
    • just the parse location and parsed tokens
  • REMOVED SUPPORT FOR PARSE ACTIONS THAT RETURN LOCATION AND TOKENS; or looking at this another way, added support for parse actions to return tuples; parse actions that previously returned loc,tokens will now be interpreted to return the tuple (loc, tokens); this impending change was announced over 2 years ago, with explicit deprecation warnings in the previous release
  • new troubleshooting helper decorator, traceParseAction
  • new parse action helper class OnlyOnce, for parse actions that should only be called one time; subsequent invocations of an OnlyOnce-wrapped parse action will raise a ParseException
  • new setFailAction, to attach a method to an expression to be called when the expression is tried and fails (sort of an anti-parse action)
  • fixed the attachment of multiple parse actions, by breaking out the attempt at mind-reading in setParseAction; setParseAction now reverts to its previous behavior, and addParseAction appends new functions to the expression's list of parse actions
  • some new examples:
    • list string parser (reconstitutes a Python list from a string representation), including lists that contain elements that are lists, tuples, ints, reals, or quoted strings
    • line number demonstration, using the pyparsing line, lineno and col built-ins
    • listAllMatches example
    • line break remover, for removing hard line breaks in word-wrapped paragraphs with blank lines between paragraphs
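
The simplified parse action signatures and the setParseAction/addParseAction split described above might be exercised as in the following sketch. The function names, the calls list, and the sample input are hypothetical, chosen only to make each signature visible.

```python
from pyparsing import Word, nums

calls = []  # hypothetical log, so each action's invocation can be inspected

def no_args():
    # Parse action that takes no arguments at all.
    calls.append("no_args")

def tokens_only(tokens):
    # Parse action that takes just the parsed tokens; converts the match to int.
    return int(tokens[0])

def loc_and_tokens(loc, tokens):
    # Parse action that takes the parse location and the parsed tokens.
    calls.append(("loc", loc))

def full_signature(s, loc, tokens):
    # Original three-argument form: source string, location, tokens.
    calls.append(("source", s))

integer = Word(nums)
integer.setParseAction(tokens_only)     # sets the expression's parse action
integer.addParseAction(no_args)         # appends to the existing list
integer.addParseAction(loc_and_tokens)
integer.addParseAction(full_signature)

result = integer.parseString("42")
print(result[0])
print(calls)
```

Actions run in the order they were attached, and an action that returns nothing leaves the tokens unchanged, so the integer conversion done by the first action survives the later ones.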

Download it now from SourceForge!

June 28, 2006 - Wilson Fowlie to Give Pyparsing Presentation at VPW

Wilson Fowlie has been bit by the pyparsing bug, and will present "Pyparsing, A User's Perspective" at the Vancouver Python Workshop in August. Good luck, Wilson!