Daily Python - Generators

By Maximus Meadowcroft | December 30, 2024


Generators in Python

Imagine you want to work with an infinite sequence in Python, like generating the digits of Pi or an endless stream of Fibonacci numbers. You might think the best approach is to store all the values in a list or some other data structure. But an infinite sequence never ends, so your program would run out of memory long before the list was complete.

This is where generators come in.

Generators are a powerful feature in Python that allow you to handle iterables efficiently by generating values one at a time.

Instead of computing and storing all values at once, a generator calculates each value only when you need it. This approach, called lazy evaluation, makes working with large or infinite datasets memory-efficient.

This post explores what generators are, how they work, and why they’re so useful.

What Is a Generator

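A generator is a type of iterable in Python that produces a sequence of values over time, rather than building them all in memory at once. You create one by writing a function that uses the yield keyword: calling the function returns a generator object, and each call to next() runs the function until the next yield, hands back that value, and pauses. The following call to next() resumes right where it left off.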

This makes generators useful for handling large datasets or streams of data where loading everything into memory would be inefficient or impossible.

Basic Example

Here is an example of a generator that counts from 1 up to a given value n.

The function takes a value n and yields the numbers 1 through n, one per request. Asking for another value after that raises a StopIteration exception.

def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

To use a generator, call the function and assign the result to a variable. You can then step through the values one at a time with the built-in next function.

generator = count_up_to(2)  
print(next(generator))  
print(next(generator))  
print(next(generator))


"""Out
1
2
Traceback (most recent call last):
  File "...", line 10, in <module>
    print(next(generator))
          ^^^^^^^^^^^^^^^
StopIteration
"""

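In everyday code you rarely hit that traceback, because a for loop handles StopIteration for you, and next accepts an optional default value. The following is a minimal sketch of both, with count_up_to redefined so the snippet runs on its own; the variable names are only illustrative.

```python
def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

# A for loop stops cleanly once the generator is exhausted.
collected = []
for value in count_up_to(3):
    collected.append(value)
print(collected)  # [1, 2, 3]

# next() takes an optional default that is returned instead of raising.
generator = count_up_to(1)
first = next(generator, None)   # 1
second = next(generator, None)  # None, because the generator is exhausted
print(first, second)
```

The default form is handy when "no more values" is an expected outcome rather than an error.
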
Key Characteristics

Lazy Evaluation

Generators produce items only when requested. This is known as "lazy evaluation" and can save significant memory and computation time.

def infinite_numbers():
    n = 0
    while True:
        yield n
        n += 1
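
Because infinite_numbers never finishes on its own, the caller decides how many values to take. One way to do that, sketched here with itertools.islice from the standard library (the name first_five is just for illustration):

```python
from itertools import islice

def infinite_numbers():
    n = 0
    while True:
        yield n
        n += 1

# islice pulls only the first five values; the rest are never computed.
first_five = list(islice(infinite_numbers(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Calling list(infinite_numbers()) directly would never return, so some form of limit like this is needed.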

State Preservation

When a generator yields a value, it pauses execution and retains its current state, including its local variables. The next call to next() resumes execution from where it paused.

def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
print(next(gen))  # Outputs: 3
print(next(gen))  # Outputs: 2

One Time Use

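Generators can be iterated only once. Once all items have been generated, the generator is exhausted: further calls to next() raise a StopIteration exception, and a for loop over it produces nothing. To iterate again, call the generator function again to get a new generator.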
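
A short sketch of what exhaustion looks like in practice, using a countdown generator redefined here so the snippet is self-contained (the variable names are illustrative):

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first_pass = list(gen)   # consumes every value
second_pass = list(gen)  # the same generator has nothing left to give

# Calling the generator function again returns a fresh generator.
fresh_pass = list(countdown(3))

print(first_pass)   # [3, 2, 1]
print(second_pass)  # []
print(fresh_pass)   # [3, 2, 1]
```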

Memory Efficiency

Generators do not store their entire output in memory. Instead, they generate each item on the fly, making them suitable for handling streams or very large data.
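
One rough way to see the difference is to compare the size of a list with the size of an equivalent generator object. This is only a sketch: sys.getsizeof measures the container itself, not the integers it refers to, and the exact byte counts vary by Python version and platform, so no specific numbers are shown.

```python
import sys

# The list holds references to all 100,000 values at once.
numbers_list = [x for x in range(100_000)]

# The generator holds only the state needed to produce the next value.
numbers_gen = (x for x in range(100_000))

list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)

print(f"list: {list_size} bytes")
print(f"generator: {gen_size} bytes")
```

The generator's size stays about the same no matter how large the range is, while the list grows with the number of items.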

How to Write Generators

Writing a generator is much like writing any other function.

  1. Define a function.
  2. Replace where you would normally return something with yield.
  3. Wrap that yield statement inside some kind of loop, such as a while loop.
  4. Call the function and assign the generator it returns to a variable.
  5. Step through it with the next function.
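
The steps above can be sketched as follows. The function even_numbers_up_to is a made-up example for illustration, not something defined elsewhere in this post.

```python
# Step 1: define a function.
def even_numbers_up_to(limit):
    n = 0
    # Step 3: the yield sits inside a loop.
    while n <= limit:
        # Step 2: yield where you would otherwise return.
        yield n
        n += 2

# Step 4: call the function and assign the generator to a variable.
evens = even_numbers_up_to(6)

# Step 5: step through it with next().
first = next(evens)   # 0
second = next(evens)  # 2
rest = list(evens)    # the remaining values: [4, 6]
print(first, second, rest)
```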

Generator Single Line

You can write a generator in a single line with a syntax that looks similar to a list comprehension. This is called a generator expression. The only difference is that you replace the square brackets with parentheses.

Here is an example for squares:

squares = (x * x for x in range(1, 10000))  

print(next(squares))  
print(next(squares))  
print(next(squares))  
print(next(squares))  
print(next(squares))


"""Out
1
4
9
16
25
"""

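A single-line generator like this can also be passed straight to a function that consumes an iterable, such as the built-in sum. A small sketch, with a shorter range than the squares example so the result is easy to check by hand:

```python
# When the generator is the only argument, the extra parentheses can be dropped.
total = sum(x * x for x in range(1, 11))
print(total)  # 385

# any() stops pulling values as soon as it finds a match.
has_big_square = any(x * x > 50 for x in range(1, 11))
print(has_big_square)  # True
```

No intermediate list of squares is built in either call.
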
Advanced Examples

Example 1 - File Processing with Generators

Generators are great for processing large files without loading the full content into memory.

Here is an example that yields a file line by line, so only one line is held in memory at a time.

def read_large_file(file_name):
    with open(file_name, 'r') as file:
        for line in file:
            yield line.strip()

# Using the generator
for line in read_large_file('large_file.txt'):
    print(line)
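
Generators also chain together, so each processing step can stay lazy. The sketch below feeds read_large_file into a second generator that skips blank lines. The non_empty helper and the temporary sample file are assumptions made for illustration so the snippet can run anywhere.

```python
import os
import tempfile

def read_large_file(file_name):
    with open(file_name, 'r') as file:
        for line in file:
            yield line.strip()

def non_empty(lines):
    # Hypothetical helper: pass along only lines that still have content.
    for line in lines:
        if line:
            yield line

# Write a small sample file to stand in for a large one.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("alpha\n\nbeta\n  \ngamma\n")
    sample_path = tmp.name

# Each line flows through both generators one at a time.
result = list(non_empty(read_large_file(sample_path)))
print(result)  # ['alpha', 'beta', 'gamma']

os.remove(sample_path)
```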

Example 2 - Fibonacci Numbers

Here is an example generating the Fibonacci sequence.

The generator keeps yielding Fibonacci numbers indefinitely. It only calculates values as requested, avoiding unnecessary computations.

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Using the generator
fib = fibonacci()
for _ in range(10):
    print(next(fib))
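
Instead of counting calls to next, you can also stop on a condition. A minimal sketch using itertools.takewhile from the standard library, with fibonacci redefined so the snippet runs on its own (the name below_100 is illustrative):

```python
from itertools import takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take values until the first one that is not below 100.
below_100 = list(takewhile(lambda value: value < 100, fibonacci()))
print(below_100)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```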

Conclusion

Generators are a powerful feature of Python that enable memory-efficient and scalable data processing. By yielding values one at a time, they allow developers to handle large datasets, infinite sequences, and streams seamlessly.