ASPN ActiveState Programmer Network
  ActiveState
/ Home / Perl / PHP / Python / Tcl / XSLT /
/ Safari / My ASPN /
Cookbooks | Documentation | Mailing Lists | Modules | News Feeds | Products | User Groups | Web Services
SEARCH
advanced | search help

Reference
ActivePython 2.3
Python Documentation
Library Reference
4. String Services
4.2 re -- Regular expression operations
4.2.6 Examples

MyASPN >> Reference >> ActivePython 2.3 >> Python Documentation >> Library Reference >> 4. String Services >> 4.2 re -- Regular expression operations

4.2.6 Examples

Simulating scanf()

Python does not currently have an equivalent to scanf().   Regular expressions are generally more powerful, though also more verbose, than scanf() format strings. The table below offers some more-or-less equivalent mappings between scanf() format tokens and regular expressions.

scanf() Token  Regular Expression 
%c .
%5c .{5}
%d [-+]?\d+
%e, %E, %f, %g [-+]?(\d+(\.\d*)?|\d*\.\d+)([eE][-+]?\d+)?
%i [-+]?(0[xX][\dA-Fa-f]+|0[0-7]*|\d+)
%o 0[0-7]*
%s \S+
%u \d+
%x, %X 0[xX][\dA-Fa-f]+

To extract the filename and numbers from a string like

    /usr/sbin/sendmail - 0 errors, 4 warnings

you would use a scanf() format like

    %s - %d errors, %d warnings

The equivalent regular expression would be

    (\S+) - (\d+) errors, (\d+) warnings

Avoiding recursion

If you create regular expressions that require the engine to perform a lot of recursion, you may encounter a RuntimeError exception with the message maximum recursion limit exceeded. For example,

>>> import re
>>> s = 'Begin ' + 1000*'a very long string ' + 'end'
>>> re.match('Begin (\w| )*? end', s).end()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.3/sre.py", line 132, in match
    return _compile(pattern, flags).match(string)
RuntimeError: maximum recursion limit exceeded

You can often restructure your regular expression to avoid recursion.

Starting with Python 2.3, simple uses of the *? pattern are special-cased to avoid recursion. Thus, the above regular expression can avoid recursion by being recast as Begin [a-zA-Z0-9_ ]*?end. As a further benefit, such regular expressions will run faster than their recursive equivalents.

See About this document... for information on suggesting changes.

Privacy Policy | Email Opt-out | Feedback | Syndication
© ActiveState 2004 All rights reserved