| 1 |
==================================== |
|---|
| 2 |
Chapter 18: Iterators and Generators |
|---|
| 3 |
==================================== |
|---|
| 4 |
|
|---|
| 5 |
|
|---|
| 6 |
Iterators |
|---|
| 7 |
========= |
|---|
| 8 |
|
|---|
| 9 |
An iterator is a general object that provides values one by one |
|---|
| 10 |
in a series. In this sense a simple list or tuple may act as an |
|---|
| 11 |
iterator. |
|---|
| 12 |
|
|---|
| 13 |
>>> for element in ('a', 'b', 'c'): |
|---|
| 14 |
... print element |
|---|
| 15 |
a |
|---|
| 16 |
b |
|---|
| 17 |
c |
|---|
| 18 |
|
|---|
| 19 |
But the iterator represents a more general case of sequential |
|---|
| 20 |
access to objects - without the need toload all the values into |
|---|
| 21 |
memory. |
|---|
| 22 |
|
|---|
| 23 |
All objects that basically provide sequential access may be |
|---|
| 24 |
converted to an iterator using the ``iter`` function. |
|---|
| 25 |
|
|---|
| 26 |
>>> iterator = iter(('a', 'b', 'c')) |
|---|
| 27 |
>>> for element in iterator: |
|---|
| 28 |
... print element |
|---|
| 29 |
a |
|---|
| 30 |
b |
|---|
| 31 |
c |
|---|
| 32 |
|
|---|
| 33 |
The characteristic element of an iterator is the method ''next()''. |
|---|
| 34 |
|
|---|
| 35 |
>>> dir(iterator) |
|---|
| 36 |
[..., 'next'] |
|---|
| 37 |
>>> iterator.next() |
|---|
| 38 |
Traceback (most recent call last): |
|---|
| 39 |
... |
|---|
| 40 |
StopIteration |
|---|
| 41 |
|
|---|
| 42 |
What's that? |
|---|
| 43 |
|
|---|
| 44 |
With the above ``for`` loop we have already exhausted the |
|---|
| 45 |
iterator, so we have to create a new one. |
|---|
| 46 |
|
|---|
| 47 |
>>> iterator = iter(('a', 'b', 'c')) |
|---|
| 48 |
>>> iterator.next() |
|---|
| 49 |
'a' |
|---|
| 50 |
>>> iterator.next(), iterator.next() |
|---|
| 51 |
('b', 'c') |
|---|
| 52 |
>>> iterator.next() |
|---|
| 53 |
Traceback (most recent call last): |
|---|
| 54 |
... |
|---|
| 55 |
StopIteration |
|---|
| 56 |
|
|---|
| 57 |
So the ``for`` loop just calls ``next()`` repeatedly until |
|---|
| 58 |
the ``StopIteration`` exception is raised. |
|---|
| 59 |
|
|---|
| 60 |
Iterable and iterator classes |
|---|
| 61 |
----------------------------- |
|---|
| 62 |
|
|---|
| 63 |
We can create an iterable object by providing the special |
|---|
| 64 |
method ``__iter__()`` that will be called when the ``iter()`` |
|---|
| 65 |
function is called and should return an iterator. |
|---|
| 66 |
|
|---|
| 67 |
>>> class Iter1(object): |
|---|
| 68 |
... data = ('a', 'b', 'c') |
|---|
| 69 |
... def __iter__(self): |
|---|
| 70 |
... return iter(self.data) |
|---|
| 71 |
|
|---|
| 72 |
A ``for`` loop checks if its target is iterable, i.e. |
|---|
| 73 |
if an iterator can be derived from it. |
|---|
| 74 |
|
|---|
| 75 |
>>> for x in Iter1(): |
|---|
| 76 |
... print x |
|---|
| 77 |
a |
|---|
| 78 |
b |
|---|
| 79 |
c |
|---|
| 80 |
|
|---|
| 81 |
By defining the ``next()`` method we can directly create |
|---|
| 82 |
an iterator. |
|---|
| 83 |
|
|---|
| 84 |
>>> class Iter2(object): |
|---|
| 85 |
... char = 'a' |
|---|
| 86 |
... def next(self): |
|---|
| 87 |
... char = self.char |
|---|
| 88 |
... if char > 'c': |
|---|
| 89 |
... raise StopIteration |
|---|
| 90 |
... self.char = chr(ord(char) + 1) |
|---|
| 91 |
... return char |
|---|
| 92 |
|
|---|
| 93 |
The ``for`` loop now just calls the ``next()`` method |
|---|
| 94 |
repeatedly. |
|---|
| 95 |
|
|---|
| 96 |
>>> for x in Iter1(): |
|---|
| 97 |
... print x |
|---|
| 98 |
a |
|---|
| 99 |
b |
|---|
| 100 |
c |
|---|
| 101 |
|
|---|
| 102 |
The ``Iter2`` class is especially interesting because it does |
|---|
| 103 |
never create the full list of values that it provides but delivers |
|---|
| 104 |
them one by one upon request. Thus the number of values it provides |
|---|
| 105 |
may be arbitrarily large. |
|---|
| 106 |
|
|---|
| 107 |
With Python 2.3 iterators can be created using a special type |
|---|
| 108 |
of function - a generator... |
|---|
| 109 |
|
|---|
| 110 |
|
|---|
| 111 |
Generators |
|---|
| 112 |
========== |
|---|
| 113 |
|
|---|
| 114 |
A generator is a function that does not ``return`` a value but |
|---|
| 115 |
instead ``yield``s one or more values. |
|---|
| 116 |
|
|---|
| 117 |
>>> def greeting(): |
|---|
| 118 |
... yield 'Hello' |
|---|
| 119 |
... yield 'World' |
|---|
| 120 |
|
|---|
| 121 |
>>> greeting |
|---|
| 122 |
<function greeting ...> |
|---|
| 123 |
|
|---|
| 124 |
>>> greeting() |
|---|
| 125 |
<generator object ...> |
|---|
| 126 |
|
|---|
| 127 |
>>> for x in greeting(): |
|---|
| 128 |
... print x |
|---|
| 129 |
Hello |
|---|
| 130 |
World |
|---|
| 131 |
|
|---|
| 132 |
Typically the ``yield`` statement is executed in a loop thus |
|---|
| 133 |
showing the usual behaviour of an iterator. |
|---|
| 134 |
|
|---|
| 135 |
>>> def characters1(): |
|---|
| 136 |
... char = 'a' |
|---|
| 137 |
... while char <= 'c': |
|---|
| 138 |
... yield char |
|---|
| 139 |
... char = chr(ord(char) + 1) |
|---|
| 140 |
|
|---|
| 141 |
>>> list(characters1()) |
|---|
| 142 |
['a', 'b', 'c'] |
|---|
| 143 |
|
|---|
| 144 |
A ``for`` loop may be even more compact than a ``while`` loop. |
|---|
| 145 |
|
|---|
| 146 |
>>> def characters2(): |
|---|
| 147 |
... for i in xrange(ord('a'), ord('d')): |
|---|
| 148 |
... yield chr(i) |
|---|
| 149 |
|
|---|
| 150 |
>>> list(characters2()) |
|---|
| 151 |
['a', 'b', 'c'] |
|---|
| 152 |
|
|---|
| 153 |
Such a generator is just a special case of an iterator - it just |
|---|
| 154 |
provides the ``next()`` method. |
|---|
| 155 |
|
|---|
| 156 |
>>> generator = characters2() |
|---|
| 157 |
>>> generator.next() |
|---|
| 158 |
'a' |
|---|
| 159 |
>>> generator.next(), generator.next() |
|---|
| 160 |
('b', 'c') |
|---|
| 161 |
>>> generator.next() |
|---|
| 162 |
Traceback (most recent call last): |
|---|
| 163 |
... |
|---|
| 164 |
StopIteration |
|---|
| 165 |
|
|---|
| 166 |
Excercise |
|---|
| 167 |
--------- |
|---|
| 168 |
|
|---|
| 169 |
Write a generator that provides the Fibonacci sequence. |
|---|
| 170 |
|
|---|
| 171 |
>>> from cybertrain.work.work181 import fibogen |
|---|
| 172 |
>>> for i, fibo in enumerate(fibogen()): |
|---|
| 173 |
... if i > 6: |
|---|
| 174 |
... break |
|---|
| 175 |
... print fibo, |
|---|
| 176 |
0 1 1 2 3 5 8 |
|---|
| 177 |
|
|---|
| 178 |
|
|---|
| 179 |
Generator Expressions |
|---|
| 180 |
===================== |
|---|
| 181 |
|
|---|
| 182 |
In Chapter 9 we already met list comprehensions, a simple but |
|---|
| 183 |
powerful construct to transform and filter a sequence of values. |
|---|
| 184 |
|
|---|
| 185 |
We can use a list comprehension to calculate the sum of the |
|---|
| 186 |
squares of the numbers up to 100. |
|---|
| 187 |
|
|---|
| 188 |
>>> squares = [i * i for i in xrange(101)] |
|---|
| 189 |
>>> sum(squares) |
|---|
| 190 |
338350 |
|---|
| 191 |
|
|---|
| 192 |
Note that this way the full list of all the numbers is created |
|---|
| 193 |
in memory. |
|---|
| 194 |
|
|---|
| 195 |
>>> len(squares) |
|---|
| 196 |
101 |
|---|
| 197 |
>>> squares[0] |
|---|
| 198 |
0 |
|---|
| 199 |
>>> squares[-1] |
|---|
| 200 |
10000 |
|---|
| 201 |
|
|---|
| 202 |
This is also the case if we put everything into one expression. |
|---|
| 203 |
|
|---|
| 204 |
>>> sum([i * i for i in xrange(101)]) |
|---|
| 205 |
338350 |
|---|
| 206 |
|
|---|
| 207 |
We can avoid the allocation of memory for all the numbers by |
|---|
| 208 |
using a generator expression instead of a list comprehension. |
|---|
| 209 |
A generator expression looks like a list comprehension but |
|---|
| 210 |
is enclosed in parentheses. |
|---|
| 211 |
|
|---|
| 212 |
>>> squares = (i * i for i in xrange(101)) |
|---|
| 213 |
>>> sum(squares) |
|---|
| 214 |
338350 |
|---|
| 215 |
|
|---|
| 216 |
>>> squares |
|---|
| 217 |
<generator object ...> |
|---|
| 218 |
|
|---|
| 219 |
This way one number after the other is requested from the |
|---|
| 220 |
generator by the ``sum()`` function using the generator's |
|---|
| 221 |
``next()`` method, thus the whole list of numbers is not |
|---|
| 222 |
created. |
|---|
| 223 |
|
|---|
| 224 |
When using a generator expression as the only argument in |
|---|
| 225 |
a function call we can omit the extra parentheses. |
|---|
| 226 |
|
|---|
| 227 |
>>> sum(i * i for i in xrange(101)) |
|---|
| 228 |
338350 |
|---|
| 229 |
|
|---|
| 230 |
Caveat! |
|---|
| 231 |
------- |
|---|
| 232 |
|
|---|
| 233 |
As any iterator a generator expression will be exhausted when |
|---|
| 234 |
once iterated over so we have to recreate it. |
|---|
| 235 |
|
|---|
| 236 |
>>> len(list(squares)) |
|---|
| 237 |
0 |
|---|
| 238 |
>>> squares = (i * i for i in xrange(101)) |
|---|
| 239 |
>>> len(list(squares)) |
|---|
| 240 |
101 |
|---|
| 241 |
|
|---|
| 242 |
If a certain iterator is to be used multiple times we can |
|---|
| 243 |
use the ``tee()`` function from the ``itertools`` module. |
|---|
| 244 |
|
|---|
| 245 |
|
|---|
| 246 |
The ``itertools`` Module |
|---|
| 247 |
======================== |
|---|
| 248 |
|
|---|
| 249 |
See section 6.5 of the Python library reference. |
|---|
| 250 |
|
|---|
| 251 |
The ``itertools`` module provides a set of handy functions |
|---|
| 252 |
for creating, processing, and filtering iterators without |
|---|
| 253 |
having to create full lists. |
|---|
| 254 |
|
|---|
| 255 |
These functions are especially useful when you have to deal |
|---|
| 256 |
with a large number of object, e.g. like database records |
|---|
| 257 |
or even infinite sequences. |
|---|
| 258 |
|
|---|
| 259 |
Let's just look at two of them. |
|---|
| 260 |
|
|---|
| 261 |
tee - duplicate an iterator |
|---|
| 262 |
--------------------------- |
|---|
| 263 |
|
|---|
| 264 |
>>> from itertools import tee |
|---|
| 265 |
|
|---|
| 266 |
>>> sq1, sq2 = tee(i * i for i in xrange(101)) |
|---|
| 267 |
>>> sq1, sq2 |
|---|
| 268 |
(<itertools.tee object ...>, <itertools.tee object ...>) |
|---|
| 269 |
|
|---|
| 270 |
>>> sum(sq1) |
|---|
| 271 |
338350 |
|---|
| 272 |
>>> len(list(sq2)) |
|---|
| 273 |
101 |
|---|
| 274 |
|
|---|
| 275 |
islice - take a slice from an iterator |
|---|
| 276 |
-------------------------------------- |
|---|
| 277 |
|
|---|
| 278 |
>>> from itertools import islice |
|---|
| 279 |
|
|---|
| 280 |
>>> fibo30to35 = islice(fibogen(), 30, 36) |
|---|
| 281 |
>>> fibo30to35 |
|---|
| 282 |
<itertools.islice object ...> |
|---|
| 283 |
|
|---|
| 284 |
>>> list(fibo30to35) |
|---|
| 285 |
[832040, 1346269, 2178309, 3524578, 5702887, 9227465] |
|---|
| 286 |
|
|---|
| 287 |
Excercise |
|---|
| 288 |
--------- |
|---|
| 289 |
|
|---|
| 290 |
Write a function that returns the nth Fibonacci number |
|---|
| 291 |
using the fibogen and islice functions. |
|---|
| 292 |
|
|---|
| 293 |
>>> from cybertrain.work.work181 import fibonacci |
|---|
| 294 |
>>> fibonacci(7) |
|---|
| 295 |
13 |
|---|