Generators in Python
Imagine you want to work with an infinite sequence in Python, like generating the digits of Pi or an endless stream of Fibonacci numbers. You might think the best approach is to store all the values in a list or some other data structure. But with infinite sequences your program would quickly run out of memory.
This is where generators come in.
Generators are a powerful feature in Python that allow you to handle iterables efficiently by generating values one at a time.
Instead of computing and storing all values at once, a generator calculates each value only when you need it. This approach, called lazy evaluation, makes working with large or infinite datasets memory-efficient.
This blog explores what generators are, how they work, and why they’re so useful.
What is a generator
A generator is a type of iterable in Python that produces a sequence of values over time, rather than holding them all in memory at once. It is created by a function that uses the yield keyword to pause its execution and hand back a value, resuming where it left off the next time a value is requested.
This makes generators useful for handling large datasets or streams of data where loading everything into memory would be inefficient or impossible.
Basic Example
Here is an example of a generator that counts up to a given value n.
The function takes a value and yields the numbers from 1 to n; requesting another value after that raises a StopIteration error.
def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1
To use a generator, you first call the function and assign the result to a variable. You can then step through the values one at a time with the next function.
generator = count_up_to(2)
print(next(generator))
print(next(generator))
print(next(generator))
"""Out
1
2
Traceback (most recent call last):
File "...", line 10, in <module>
print(next(generator))
^^^^^^^^^^^^^^^
StopIteration
"""
Key Characteristics
Lazy Evaluation
Generators produce items only when requested. This is known as "lazy evaluation" and can save significant memory and computation time.
def infinite_numbers():
    n = 0
    while True:
        yield n
        n += 1
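Because nothing is computed until it is requested, an infinite generator like this is safe to create as long as you only take a finite number of values from it. A minimal sketch, assuming the standard library's itertools.islice is an acceptable way to do that (the name first_five is only for illustration):

```python
from itertools import islice

def infinite_numbers():
    n = 0
    while True:
        yield n
        n += 1

# islice pulls only the first five values; the loop inside the
# generator never runs further than that.
first_five = list(islice(infinite_numbers(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Calling list(infinite_numbers()) directly would never finish, which is why the slice is needed.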
State Preservation
When a generator yields a value, it pauses execution and retains its current state. The next call to next resumes execution from where it paused.
def countdown(n):
    while n > 0:
        yield n
        n -= 1
gen = countdown(3)
print(next(gen)) # Outputs: 3
print(next(gen)) # Outputs: 2
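To make the pausing visible, here is a small sketch with a hypothetical generator, two_steps, that records in a list when each part of its body runs. The function and the log list are illustrative assumptions, not part of any library:

```python
def two_steps(log):
    # Hypothetical generator: records when each part of its body runs.
    log.append("started")
    yield 1
    log.append("resumed")
    yield 2

log = []
gen = two_steps(log)
before_first = list(log)   # calling the function runs none of its body

first = next(gen)
after_first = list(log)    # the body ran up to the first yield

second = next(gen)
after_second = list(log)   # execution picked up right after the first yield

print(before_first)  # []
print(after_first)   # ['started']
print(after_second)  # ['started', 'resumed']
```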
One Time Use
Generators can be iterated only once. Once all items have been generated, the generator is exhausted, and further calls will raise a StopIteration exception.
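A short sketch of what exhaustion looks like in practice, reusing the countdown generator from the previous section. Tools such as list and for loops catch StopIteration for you, so a second pass simply produces nothing, and next accepts an optional default that is returned instead of raising:

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first_pass = list(gen)    # consumes every value
second_pass = list(gen)   # the generator is already exhausted

# next() with a default returns it instead of raising StopIteration.
leftover = next(gen, None)

# Calling the function again gives a fresh generator.
fresh_pass = list(countdown(3))

print(first_pass)   # [3, 2, 1]
print(second_pass)  # []
print(leftover)     # None
print(fresh_pass)   # [3, 2, 1]
```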
Memory Efficiency
Generators do not store their entire output in memory. Instead, they generate each item on-the-fly, making them suitable for handling streams or very large data.
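One rough way to see the difference is to compare the size of a list with the size of a generator object that would produce the same values. This is only a sketch: numbers is a hypothetical generator, and sys.getsizeof measures the container itself, not the integers it refers to, so the real gap is larger than the figures suggest.

```python
import sys

def numbers(limit):
    # Hypothetical generator yielding 0, 1, ..., limit - 1.
    n = 0
    while n < limit:
        yield n
        n += 1

as_list = list(range(100_000))    # every value exists up front
as_generator = numbers(100_000)   # no value has been produced yet

list_size = sys.getsizeof(as_list)
generator_size = sys.getsizeof(as_generator)

# Exact numbers vary by Python version and platform, so none are shown.
print(list_size, generator_size)
print(list_size > generator_size)  # True
```

The generator's size stays the same no matter how large limit is, because it only holds its current state.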
How to Write Generators
To create a generator, the process is similar to writing any other function.
- Define a function.
- Replace where you would normally return something with yield.
- Wrap that yield statement inside some kind of loop, such as a while loop.
- Call the function and assign the generator it returns to a variable.
- Step through it with the next function.
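The steps above can be sketched as follows. The generator even_numbers and its limit parameter are made up for illustration:

```python
def even_numbers(limit):     # 1. define a function
    n = 0
    while n <= limit:        # 3. the yield sits inside a loop
        yield n              # 2. yield where you would have returned
        n += 2

evens = even_numbers(6)      # 4. assign the generator to a variable
first = next(evens)          # 5. step through it with next
second = next(evens)
rest = list(evens)           # or collect whatever is left in one go

print(first, second, rest)   # 0 2 [4, 6]
```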
Generator Single Line
You can write a generator in a single line with a syntax that looks similar to a list comprehension. This is called a generator expression. The only difference is that you replace the square brackets with parentheses.
Here is an example for squares:
squares = (x * x for x in range(1, 10000))
print(next(squares))
print(next(squares))
print(next(squares))
print(next(squares))
print(next(squares))
"""Out
1
4
9
16
25
"""
Advanced Examples
Example 1 - File Processing with Generators
Generators are great for processing large files without loading the full content into memory.
Here is an example that yields a file line by line, so only one line needs to be held in memory at a time.
def read_large_file(file_name):
    with open(file_name, 'r') as file:
        for line in file:
            yield line.strip()

# Using the generator
for line in read_large_file('large_file.txt'):
    print(line)
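Generators can also be chained, with each stage pulling one line at a time from the stage before it. The sketch below assumes a hypothetical helper, non_empty, and writes a small temporary file so that it runs without needing large_file.txt to exist:

```python
import os
import tempfile

def read_large_file(file_name):
    with open(file_name, 'r') as file:
        for line in file:
            yield line.strip()

def non_empty(lines):
    # Hypothetical helper: skips blank lines, still one line at a time.
    for line in lines:
        if line:
            yield line

# Create a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("alpha\n\nbeta\n")
    path = tmp.name

# Exhausting the pipeline also closes the file inside read_large_file.
result = list(non_empty(read_large_file(path)))
os.remove(path)

print(result)  # ['alpha', 'beta']
```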
Example 2 - Fibonacci Numbers
Here is an example generating the Fibonacci sequence.
The generator keeps yielding Fibonacci numbers indefinitely. It only calculates values as requested, avoiding unnecessary computations.
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Using the generator
fib = fibonacci()
for _ in range(10):
    print(next(fib))
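Counting calls to next is one way to stop an infinite generator. Another is to stop based on the values themselves. A possible sketch using itertools.takewhile from the standard library, with the cutoff of 50 chosen arbitrarily:

```python
from itertools import takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# takewhile stops pulling values as soon as the condition fails,
# so the infinite generator is never run past the first value >= 50.
below_fifty = list(takewhile(lambda value: value < 50, fibonacci()))
print(below_fifty)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```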
Conclusion
Generators are a powerful feature of Python that enable memory-efficient and scalable data processing. By yielding values one at a time, they allow developers to handle large datasets, infinite sequences, and streams seamlessly.