The iterator feature introduced in Python 2.2 and the
itertools module make it easier to write programs that loop
through large data sets without having the entire data set in memory
at one time. List comprehensions don't fit into this picture very
well because they produce a Python list object containing all of the
items. This unavoidably pulls all of the objects into memory, which
can be a problem if your data set is very large. When trying to write
a functionally-styled program, it would be natural to write something
like:

    links = [link for link in get_all_links() if not link.followed]
    for link in links:
        ...

instead of

    for link in get_all_links():
        if link.followed:
            continue
        ...

The first form is more concise and perhaps more readable, but if
you're dealing with a large number of link objects you'd have to write
the second form to avoid having all link objects in memory at the same
time.
Generator expressions work similarly to list comprehensions but don't
materialize the entire list; instead they create a generator that will
return elements one by one. The above example could be written as:

    links = (link for link in get_all_links() if not link.followed)
    for link in links:
        ...

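The difference in evaluation can be made visible with a small sketch. Everything in it is hypothetical scaffolding: `Link`, the `produced` log, and this stand-in `get_all_links()` are invented for illustration, and the built-in `next()` is assumed to be available (older interpreters would call the iterator's own `next` method instead).

```python
# Hypothetical stand-ins: Link, produced, and this get_all_links() are
# illustration only. The log records each item as it is produced.
produced = []

class Link(object):
    def __init__(self, url, followed):
        self.url = url
        self.followed = followed

def get_all_links():
    # Even-numbered links are marked as already followed.
    for i in range(5):
        produced.append(i)
        yield Link("http://example.com/%d" % i, followed=(i % 2 == 0))

# List comprehension: all five links are produced before the loop could start.
eager = [link for link in get_all_links() if not link.followed]
produced_by_list = len(produced)

# Generator expression: creating it produces nothing yet.
del produced[:]
lazy = (link for link in get_all_links() if not link.followed)
produced_before_iteration = len(produced)

# Asking for one element pulls only as many links as are needed to find it.
first = next(lazy)
produced_after_first = len(produced)
```

Here the list comprehension consumes the whole source up front, while the generator expression stops as soon as it has found the first unfollowed link.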
Generator expressions always have to be written inside parentheses, as
in the above example. The parentheses signalling a function call also
count, so if you want to create an iterator that will be immediately
passed to a function you could write:

    print sum(obj.count for obj in list_all_objects())

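As a rough illustration of the parenthesization rule, the sketch below contrasts a generator expression passed as the only argument with one passed alongside a second argument. The `counts` list is made-up sample data, and `compile()` is used only so that the rejected form can be shown without stopping the script.

```python
counts = [3, 4, 5]  # made-up sample data

# Sole argument: the call's own parentheses are enough.
total = sum(n for n in counts)

# With a second argument, the generator expression needs its own parentheses.
total_from_ten = sum((n for n in counts), 10)

# Leaving them out is rejected when the code is compiled.
try:
    compile("sum(n for n in counts, 10)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
```

The extra parentheses are needed only when the generator expression shares the call with other arguments.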
Generator expressions differ from list comprehensions in various small
ways. Most notably, the loop variable (obj in the above
example) is not accessible outside of the generator expression. List
comprehensions leave the variable assigned to its last value; future
versions of Python will change this, making list comprehensions match
generator expressions in this respect.
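The scoping of the loop variable can be checked with a short sketch. The function and variable names are hypothetical, and only the generator expression case is shown, since what a list comprehension does here depends on the Python version in use.

```python
def genexp_variable_visible_afterwards():
    # The loop variable lives only inside the generator expression.
    total = sum(item_in_genexp for item_in_genexp in range(5))
    try:
        item_in_genexp
    except NameError:
        # The name was never bound in the enclosing function.
        return total, False
    return total, True

total, visible = genexp_variable_visible_afterwards()
```

Looking up the loop variable after the expression has been consumed raises `NameError`, so `visible` ends up false.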