Welcome to the next pikoTutorial!
yield is a well-known Python keyword that lets you optimize code by producing a stream of data on the fly instead of building all of it at once. To get started, let’s look at a simple example where we need a stream of the squares of the integers from 0 up to (but not including) 100 million. The naive approach would be to just create a list of these numbers:
    data = [i**2 for i in range(100_000_000)]
However, this approach requires the whole list to be allocated in memory at once, which is a waste of resources: we may end up not needing certain values, and we almost certainly won’t need them all at the same time. A better approach is to write a generator function using the yield keyword:
    def generate_data():
        for i in range(100_000_000):
            yield i**2
This function generates values on the fly: execution pauses at the yield keyword each time a value is handed to the caller and resumes only when the next value is requested. Thanks to this mechanism, there’s no need to allocate space for all 100 million values. We get each value one by one, and only when it is really needed.
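To make the difference tangible, here is a minimal sketch that compares the size of a list object with the size of a generator object and then pulls only a few values. The function name generate_squares and its limit parameter are illustrative assumptions, and the limit is kept small so the example runs quickly. Note that sys.getsizeof reports only the container itself, not the integers it refers to.

```python
import sys
from itertools import islice


def generate_squares(limit):
    """Yield squares of 0..limit-1 one at a time (hypothetical helper)."""
    for i in range(limit):
        yield i**2


# Eager version: every square is computed and stored up front.
squares_list = [i**2 for i in range(100_000)]

# Lazy version: nothing is computed until a value is requested.
squares_gen = generate_squares(100_000)

list_size = sys.getsizeof(squares_list)  # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)    # small and independent of the limit

# Take only the first five values; the rest are never computed.
first_five = list(islice(squares_gen, 5))
print(list_size, gen_size, first_five)
```

The exact byte counts depend on the Python version and platform, so treat them as an order-of-magnitude comparison rather than a benchmark.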
That’s the well-known application of the yield keyword, but what more can we do with it?
Yield from
Let’s look at an example in which we have 2 generator functions and we want to use the first one in the body of the second one:
    def sub_generator():
        yield 1
        yield 2

    def main_generator():
        for value in sub_generator():
            yield value
        print("Finally able to proceed to main generator logic!")
        yield 3

    for output in main_generator():
        print(output)
The output of such code is:
    1
    2
    Finally able to proceed to main generator logic!
    3
In situations like this, where we first want to pass along all the values from the sub-generator, we can use yield from to avoid writing an explicit loop over it:
    def sub_generator():
        yield 1
        yield 2

    def main_generator():
        yield from sub_generator()
        print("Finally able to proceed to main generator logic!")
        yield 3

    for output in main_generator():
        print(output)
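Beyond replacing the loop, yield from is itself an expression: it evaluates to whatever the sub-generator returns when it finishes. The sketch below illustrates this under the assumption that the sub-generator ends with a return statement, which the original example does not have, and the returned string is made up for illustration.

```python
def sub_generator():
    yield 1
    yield 2
    # Assumption for illustration: the sub-generator reports a final result.
    return "sub-generator finished"


def main_generator():
    # The value returned by sub_generator() becomes the value of `yield from`.
    result = yield from sub_generator()
    yield result
    yield 3


outputs = list(main_generator())
print(outputs)
```

With a plain for loop over the sub-generator, that return value would be silently discarded, so this is one case where yield from is more than a shortcut.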
Coroutines
Until now, all the examples showed yield in the role of a slightly different return statement: we write yield on the left and the value to be returned on the right. This can, however, be reversed: yield is an expression, so its result can be assigned to a variable inside the function:
    def echo_coroutine():
        while True:
            value = yield
            print(f"Received value = {value}")

    coroutine = echo_coroutine()
    next(coroutine)
    coroutine.send(12)
    coroutine.send(24)
The output of such code is:
    Received value = 12
    Received value = 24
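You can read the value = yield line as “assign to value whatever is later sent here with the send function”. The initial next(coroutine) call is needed to advance the coroutine to its first yield, because a freshly created generator cannot receive a value yet.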
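The two directions can also be combined: a single yield can hand a value out and receive the next one in. As a minimal sketch (the running_average name and the numbers are illustrative assumptions, not part of the original example), the coroutine below keeps a running average of everything sent to it.

```python
def running_average():
    """Coroutine that returns the average of all values sent so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand the current average to the caller, then wait for the next value.
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)               # advance to the first yield before sending

first = averager.send(10)    # average of [10]
second = averager.send(20)   # average of [10, 20]
third = averager.send(30)    # average of [10, 20, 30]
print(first, second, third)
```

Each send() call resumes the coroutine, and its return value is whatever the coroutine yields next, so the caller gets the updated average immediately.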
State machines
Another interesting application of generator functions is implementing an FSM (finite state machine). If the states follow one another in a circular order, such an implementation, together with the next() function, gives a friendly, readable interface for stepping through the repeating states:
    def state_machine():
        while True:
            # perform steps to transition to State 1
            yield "Reached State 1"
            # perform steps to transition to State 2
            yield "Reached State 2"
            # perform steps to transition to State 3
            yield "Reached State 3"

    state = state_machine()
    for i in range(6):
        print(next(state))
The output of such code is:
    Reached State 1
    Reached State 2
    Reached State 3
    Reached State 1
    Reached State 2
    Reached State 3
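The circular machine above always advances on every next() call. If transitions should depend on an input event, one possible variation is to combine it with send() from the coroutine section. This is only a sketch: the traffic-light states, the "advance" event name, and the transition table are all assumptions made up for illustration.

```python
def traffic_light():
    """Event-driven FSM: the state changes only when "advance" is sent."""
    # Hypothetical transition table: current state -> next state.
    transitions = {"red": "green", "green": "yellow", "yellow": "red"}
    state = "red"
    while True:
        # Report the current state and wait for the next event.
        event = yield state
        if event == "advance":
            state = transitions[state]


light = traffic_light()
initial = next(light)                 # start the machine; initial state
after_first = light.send("advance")   # red -> green
after_hold = light.send("hold")       # unknown event, state is unchanged
after_second = light.send("advance")  # green -> yellow
print(initial, after_first, after_hold, after_second)
```

Keeping the transitions in a dictionary makes it easy to add states without touching the loop, at the cost of moving the per-state logic out of the generator body.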
Flattening nested structures
Earlier in this article I described how yield from can be used to delegate to sub-generators within generator functions. It’s worth noting that the role of the sub-generator can be played by… the generator function itself! This allows recursive generator invocation, which can be used, for example, to flatten nested lists:
    def flatten_list(nested_list):  # parameter renamed so it no longer shadows the built-in input()
        for element in nested_list:
            if isinstance(element, list):
                # delegate to a recursive call for every nested list
                yield from flatten_list(element)
            else:
                yield element

    data = [[1, 2, [3, 4]], [5, 6], 7]
    flattened_data = list(flatten_list(data))
    print(flattened_data)  # Output: [1, 2, 3, 4, 5, 6, 7]
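The version above only descends into list objects. As a hedged sketch of how the same recursion could cover other containers, the variant below checks for any iterable using collections.abc.Iterable while treating strings and bytes as single values, since iterating over them would recurse forever. The function name flatten and the sample data are assumptions for illustration.

```python
from collections.abc import Iterable


def flatten(items):
    """Recursively flatten nested iterables, keeping strings intact."""
    for element in items:
        # Strings and bytes are iterable too, but we treat them as leaves.
        if isinstance(element, Iterable) and not isinstance(element, (str, bytes)):
            yield from flatten(element)
        else:
            yield element


mixed = [1, (2, [3, "ab"]), [[4]], "cd"]
flattened = list(flatten(mixed))
print(flattened)
```

Because each nesting level adds a recursive call, extremely deep structures can still hit Python’s recursion limit; for typical data this is not a concern.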