Asynchronous Iterators and Iterables in Python

When you write asynchronous code in Python, you’ll likely need to create asynchronous iterators and iterables at some point. Asynchronous iterators are what Python uses to control async for loops, while asynchronous iterables are objects that you can iterate over using async for loops.

Both tools allow you to iterate over awaitable objects without blocking your code. This way, you can perform different tasks asynchronously.

In this tutorial, you’ll:

  • Learn what async iterators and iterables are in Python
  • Create async generator expressions and generator iterators
  • Code async iterators and iterables with the .__aiter__() and .__anext__() methods
  • Use async iterators in async loops and comprehensions

To get the most out of this tutorial, you should know the basics of Python’s iterators and iterables. You should also know about Python’s asynchronous features and tools.

Getting to Know Async Iterators and Iterables in Python

Iterators and iterables are fundamental components in Python. You’ll use them in almost all your programs where you iterate over data streams using a for loop. Iterators power and control the iteration process, while iterables typically hold data that you want to iterate over.

Python iterators implement the iterator design pattern, which allows you to traverse a container and access its elements. To implement this pattern, iterators need the .__iter__() and .__next__() special methods. Similarly, iterables are typically data containers that implement the .__iter__() method.

Python has extended the concept of iterators and iterables to asynchronous programming with the asyncio module and the async and await keywords. In this scenario, asynchronous iterators drive the asynchronous iteration process, mainly powered by async for loops and comprehensions.

In the following sections, you’ll briefly examine the concepts of asynchronous iterators and iterables in Python.

Async Iterators

Python’s documentation defines asynchronous iterators, or async iterators for short, as the following:

An object that implements the .__aiter__() and .__anext__() [special] methods. .__anext__() must return an awaitable object. [An] async for [loop] resolves the awaitables returned by an asynchronous iterator’s .__anext__() method until it raises a StopAsyncIteration exception. (Source)

Similar to regular iterators that must implement .__iter__() and .__next__(), async iterators must implement .__aiter__() and .__anext__(). In regular iterators, the .__iter__() method usually returns the iterator itself. This is also true for async iterators.

To continue with this parallelism, in regular iterators, the .__next__() method must return the next object for the iteration. In async iterators, the .__anext__() method must return an awaitable object that produces the next object when awaited.

Python defines awaitable objects as described in the quote below:

An object that can be used in an await expression. [It] can be a coroutine or an object with an .__await__() method. (Source)

In practice, a quick way to make an awaitable object in Python is to call an asynchronous function. You define this type of function with the async def keyword construct. This call creates a coroutine object.

When the data stream runs out of data, the method must raise a StopAsyncIteration exception to end the asynchronous iteration process.

Here’s an example of an async iterator that allows iterating over a range of numbers asynchronously:
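A minimal sketch of such an iterator might look like the following. The AsyncRange name and the .start and .end attributes come from the surrounding text, while the main() function and the range from 0 to 5 are assumptions added for illustration:

```python
import asyncio


class AsyncRange:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __aiter__(self):
        # An async iterator returns itself from .__aiter__()
        return self

    async def __anext__(self):
        if self.start < self.end:
            # Simulate a non-blocking operation that takes half a second
            await asyncio.sleep(0.5)
            value = self.start
            self.start += 1
            return value
        # The range is covered, so signal the end of the iteration
        raise StopAsyncIteration


async def main():
    # Hypothetical driver: iterate over the numbers 0 to 4
    async for number in AsyncRange(0, 5):
        print(number)


if __name__ == "__main__":
    asyncio.run(main())
```

Run as a script, this sketch prints the numbers 0 through 4, one every half second.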

In the .__aiter__() method, you return self, which is the current object—the iterator itself. In the .__anext__() method, you generate a number from the range between .start and .end.

To simulate the required awaitable object, you use the asyncio.sleep() function with a delay of 0.5 seconds in an await statement. When the range is covered, you raise a StopAsyncIteration exception to finish the iteration.

When you run the script, you get each number after waiting half a second, which is consistent with asynchronous iteration.

The above example is a quick first look at async iterators and how to define them. You’ll learn more about the .__aiter__() and .__anext__() methods and related concepts when you get to the sections on creating async iterators. Now, it’s time to learn the basics of async iterables.

Async Iterables

When it comes to async iterables, the Python documentation says the following:

An object, that can be used in an async for statement. Must return an asynchronous iterator from its .__aiter__() method. (Source)

In practice, an object only needs an .__aiter__() method that returns an async iterator to be an async iterable. It doesn’t need the .__anext__() method.

Here’s how you’d modify the AsyncRange to be an async iterable rather than an iterator:
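One way to sketch this version is to write .__aiter__() as an async generator method, so that calling it returns an async iterator. The half-second delay mirrors the earlier example, and main() is a hypothetical driver:

```python
import asyncio


class AsyncRange:
    def __init__(self, start, end):
        self.start = start
        self.end = end

    async def __aiter__(self):
        # Because of the yield, calling this method returns an
        # async generator iterator instead of a coroutine
        for number in range(self.start, self.end):
            await asyncio.sleep(0.5)
            yield number


async def main():
    # Hypothetical driver: iterate over the numbers 0 to 4
    async for number in AsyncRange(0, 5):
        print(number)


if __name__ == "__main__":
    asyncio.run(main())
```

Because each async for loop calls .__aiter__() again, the same instance can be traversed more than once in this sketch.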

This new implementation of your AsyncRange class is more concise than the previous one. It just has the .__aiter__() method, which yields numbers on demand. Again, you use the asyncio.sleep() function to simulate the awaitable object.

When you use an instance of this class in an async for loop, you get the same result as in the previous section, but instead of using an async iterator, you use an iterable.

With this quick background on async iterators and iterables, you can now dive deeper into how asynchronous iteration works and why you’d want to use it in your code.

Async Iteration

In Python, asynchronous iteration refers to traversing asynchronous iterables using async for loops. Under the hood, async for loops rely on async iterators to control the iteration process. Asynchronous iteration allows you to perform non-blocking operations within the loop.

Async iteration enables you to handle I/O-bound tasks efficiently and makes it possible to run tasks concurrently. Common I/O-bound tasks include the following:

  • File system operations, such as reading and writing files and accessing a file’s metadata like its size, creation date, and modification date.
  • Network operations, such as HTTP requests and socket communication.
  • Database operations, such as running SQL queries for CRUD (create, read, update, and delete) operations.
  • User input and output operations, such as reading the user input from the keyboard, mouse, or other input device and displaying output to the screen. These operations are critical in GUI (graphical user interface) applications, where rendering the interface can be resource-intensive.
  • External device communication, such as interacting with external sensors, printers, or other peripherals connected to serial or parallel ports.

In Python, asynchronous code runs in an event loop, which you typically start with the asyncio.run() function.

When you iterate over an async iterable using an async for loop, the loop gives control back to the event loop after each cycle so that other asynchronous tasks can run. This type of iteration is non-blocking because it doesn’t block the app’s execution while the loop is running.

Asynchronous iterators and iterables allow for asynchronous iteration, which lets you perform tasks concurrently.

Concurrency allows multiple tasks to progress by sharing time on the same CPU core or to run in parallel using multiple CPU cores. This programming technique can help you make your code more efficient. It also allows you to prevent blocking your program’s execution with time-consuming tasks like the ones listed above.

Asynchronous programming is a specific type of concurrency based on non-blocking operations and event-driven execution. That’s why async code runs in a main event loop, which takes care of handling asynchronous events.

In your asynchronous programming adventure in Python, you’ll probably be required to create your own asynchronous iterators and iterables. In practice, the preferred way to do this is using async generator iterators, which is the topic of the following section.

Creating Async Generator Functions

In Python’s documentation, an asynchronous generator is defined as shown below:

A function which returns an asynchronous generator iterator. It looks like a coroutine function defined with async def except that it contains yield expressions for producing a series of values usable in an async for loop. (Source)

For a quick illustration of a generator iterator, consider the following modification of your async range iterator:
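A minimal sketch of such a function follows. The async_range() name matches the one used later in the text, and the half-second delay is an assumption carried over from the earlier examples:

```python
import asyncio


async def async_range(start, end):
    for number in range(start, end):
        # Simulate a non-blocking operation before producing each item
        await asyncio.sleep(0.5)
        yield number


async def main():
    # Hypothetical driver: calling async_range() returns an async
    # generator iterator that you can use in an async for loop
    async for number in async_range(0, 5):
        print(number)


if __name__ == "__main__":
    asyncio.run(main())
```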

An asynchronous generator function looks like a coroutine function that you define using the async def keyword construct, but it must contain a yield statement, which produces items on demand. In this example, you simulate an asynchronous operation with asyncio.sleep(), as you’ve done so far.

Asynchronous generator functions can contain await expressions, async for loops, and async with statements. This type of function returns an asynchronous generator iterator that yields items on demand.

For a more elaborate example, say that you want to create a script to back up the files in a given directory. You want the script to process the files asynchronously and generate a ZIP file with the content.

Below is a possible implementation of your backup script. First, note that for the script to work, you need to install the aiofiles and aiozipstream packages from PyPI using pip, for example with the command python -m pip install aiofiles aiozipstream.

Now that you have the external dependencies installed, you can take a look at the code:

In this example, you first import the required modules and classes. Then, on line 7, you define an async generator function. In this function, you take a list of files as an argument. The items in this list must be dictionaries with a "file" key that maps to the file path. On line 8, you create an AioZipStream using the list of files as an argument.

On line 9, you start an async for loop over the stream of zipped data. By default, the .stream() method returns the zipped data as chunks of at most 1024 bytes. On line 10, you yield chunks of data on demand with the yield statement. This statement turns the function into an async generator.

On line 22, you create the directory variable to hold the target directory. In this example, you use Path.cwd() which gives you the current working directory. In other words, the directory defaults to the folder where your script is running. Finally, you run the event loop. If you run this script from your command line, then you’ll get a ZIP archive with the files in the script’s directory.

In practice, using async generator functions like the ones in this section is the quicker and preferred approach to creating async iterators in Python. However, if you need your iterators to maintain some internal state, then you can use class-based async iterators.

Creating Class-Based Async Iterators and Iterables

If you need to create async iterators that maintain some internal state, then you can create the iterator using a class. In this situation, your class must implement the .__aiter__() and .__anext__() special methods.

In the following sections, you’ll study .__aiter__() and .__anext__() in more detail. To kick things off, you’ll start by learning about the .__aiter__() method, which is part of the async iterators protocol and is the only method required for implementing async iterables. Then, you’ll learn about the .__anext__() method.

The .__aiter__() Method

When you create async iterators, the .__aiter__() method must be a regular method that immediately returns an async iterator object. The typical implementation of this method in an async iterator looks something like this:
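As a minimal sketch, assuming a hypothetical EmptyAsyncIterator class that never produces any items:

```python
import asyncio


class EmptyAsyncIterator:
    def __aiter__(self):
        # A regular (non-async) method that returns the iterator itself
        return self

    async def __anext__(self):
        # This hypothetical iterator has no data, so it stops immediately
        raise StopAsyncIteration


async def main():
    iterator = EmptyAsyncIterator()
    # The loop body never runs because the iterator is empty
    async for item in iterator:
        print(item)
    print(iterator.__aiter__() is iterator)


if __name__ == "__main__":
    asyncio.run(main())
```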

There isn’t much to this implementation. You define the method as a regular instance method and return self, the current object, which is the iterator itself.

When creating async iterables, you only need to implement the .__aiter__() method for the iterable to work. However, in this case, the method will have a more elaborate implementation that returns a proper async iterator object that yields items on demand.

In practice, you’ll often code .__aiter__() as an async generator function with the yield statement. For example, say that you want to create an async iterable to process large files. In this situation, you can end up with the following code:
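The sketch below shows one way such a class could look. It relies only on the standard library and uses asyncio.to_thread() as a stand-in for an async file library, which is an assumption of this sketch. The chunk_size parameter and the simulated processing delay are also illustrative:

```python
import asyncio


class AsyncFileIterable:
    def __init__(self, path, chunk_size=1024):
        self.path = path
        self.chunk_size = chunk_size

    async def __aiter__(self):
        # Open and read the file in worker threads so that the
        # event loop isn't blocked by file I/O
        file = await asyncio.to_thread(open, self.path, "rb")
        try:
            while chunk := await asyncio.to_thread(file.read, self.chunk_size):
                yield chunk
        finally:
            file.close()


async def main(path):
    async for chunk in AsyncFileIterable(path):
        # Simulate some non-blocking processing of each chunk
        await asyncio.sleep(0.1)
        print(f"Processed a chunk of {len(chunk)} bytes")
```

To try the sketch, call asyncio.run(main(path)) with the path to one of your own files.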

In this example, the AsyncFileIterable class implements only the .__aiter__() method as an async generator function. The method opens the input file and reads it in chunks. Then, it yields file chunks on demand. With this iterable, you can process large files in chunks without blocking the script’s execution. That’s what you simulate in the script’s main() function.

Go ahead and run the script against one of your large files. To do this, update the path to your large file when you instantiate AsyncFileIterable in the main() function.

Another way to write the .__aiter__() method is to make it return an existing async iterator:
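A minimal sketch of this arrangement follows. The AsyncIterable and AsyncIterator names come from the surrounding text, while the items argument and the short delay are assumptions made for illustration:

```python
import asyncio


class AsyncIterator:
    def __init__(self, items):
        self._items = iter(items)

    def __aiter__(self):
        return self

    async def __anext__(self):
        # Simulate a short non-blocking operation
        await asyncio.sleep(0.1)
        try:
            return next(self._items)
        except StopIteration:
            raise StopAsyncIteration from None


class AsyncIterable:
    def __init__(self, items):
        self.items = items

    def __aiter__(self):
        # Hand out a fresh async iterator on every call
        return AsyncIterator(self.items)


async def main():
    async for item in AsyncIterable(["a", "b", "c"]):
        print(item)


if __name__ == "__main__":
    asyncio.run(main())
```

Since every call to .__aiter__() creates a new AsyncIterator, you can loop over the same AsyncIterable instance several times.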

In this example, the AsyncIterable class returns an instance of AsyncIterator from its .__aiter__() method.

The .__anext__() Method

Only async iterators need the .__anext__() method. This method should be async def because it needs to perform asynchronous operations to fetch the next item during iteration. So, the method should generally look something like this:
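As an illustrative sketch, consider a hypothetical QueueIterator class that fetches items from an asyncio.Queue. The class name and the use of None as an end-of-data marker are assumptions of this sketch:

```python
import asyncio


class QueueIterator:
    def __init__(self, queue):
        self.queue = queue

    def __aiter__(self):
        return self

    async def __anext__(self):
        # Wait for the next item without blocking the event loop
        item = await self.queue.get()
        if item is None:
            # None is this sketch's end-of-data marker
            raise StopAsyncIteration
        return item


async def main():
    queue = asyncio.Queue()
    for item in ("first", "second", None):
        queue.put_nowait(item)
    async for item in QueueIterator(queue):
        print(item)


if __name__ == "__main__":
    asyncio.run(main())
```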

The .__anext__() method must return an awaitable object. It can be a coroutine object or an object with an .__await__() method.

Another characteristic of .__anext__() is that it has to raise a StopAsyncIteration exception when the data stream is exhausted or consumed. This exception will tell Python to terminate the iteration process.

Here’s an example of creating an async iterator to process a large file in chunks. It works the same as the example in the previous section, but instead of using an async iterable, it uses an async iterator that implements both the .__aiter__() and .__anext__() methods:
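The following sketch shows one possible shape for such a class. As an assumption, it uses asyncio.to_thread() from the standard library as a stand-in for an async file library, and the chunk_size parameter is illustrative:

```python
import asyncio


class AsyncFileIterator:
    def __init__(self, path, chunk_size=1024):
        self.path = path
        self.chunk_size = chunk_size
        self.file = None

    def __aiter__(self):
        return self

    async def __anext__(self):
        # First conditional: open the file only once, on the first call
        if self.file is None:
            self.file = await asyncio.to_thread(open, self.path, "rb")
        chunk = await asyncio.to_thread(self.file.read, self.chunk_size)
        # Second conditional: an empty chunk means the data is exhausted
        if not chunk:
            self.file.close()
            raise StopAsyncIteration
        return chunk


async def main(path):
    async for chunk in AsyncFileIterator(path):
        # Simulate some non-blocking processing of each chunk
        await asyncio.sleep(0.1)
        print(f"Processed a chunk of {len(chunk)} bytes")
```

To try the sketch, call asyncio.run(main(path)) with the path to one of your own files.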

In this example, the .__aiter__() method provides the minimal required implementation of an async iterator. It just returns self, which is the current object—the iterator itself.

Then, you define the .__anext__() method. First, you open the file asynchronously. Note that you can’t use an async with statement here because, if you did, you’d be opening and closing the file in every call to .__anext__(), and your code wouldn’t work.

Next, you await the read of a chunk from the target file. The second conditional statement checks whether the chunk holds data. If not, then you close the file and raise the StopAsyncIteration exception to signal that the data is exhausted. Finally, you return chunk.

In main(), you iterate over the file’s chunks and process them. Go ahead and run the script. You’ll get the same result as in the previous section.

Up to this point, you’ve coded several examples of async iterators. In most cases, you’ve used the iterators in async for loops. However, there are other constructs where you can use these iterators. You can also traverse them with the built-in anext() function or a comprehension.

In the following sections, you’ll learn how to use iterators with these alternative tools.

The Built-in anext() Function

You can use the built-in anext() function to traverse an async iterator one item at a time in a controlled way. It’s particularly useful when you need more granular control over the iteration process. For example, you may need to skip a few items from the iterator before getting to the data that you want to process.

Consider the following code example where you create an async iterator to process CSV files:

In this example, the AsyncCSVIterator reads a CSV file’s content once in the .__anext__() method. The reading task runs asynchronously. Then, it returns a single line at a time.

In main(), you use anext() to skip the first row of the CSV file. This line typically contains the headers for your data. Then, you start a loop over the rest of the rows, which hold the actual data.

The anext() function can also help when you must iterate over potentially infinite async iterators. In this situation, using an async for loop may be inappropriate. Alternatively, you can use anext() in a while loop.

Consider the following async generator that yields potentially infinite integer numbers on demand:
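A minimal sketch of such a generator follows. The async_inf_integers() name, the start parameter, and the half-second delay are assumptions made for illustration:

```python
import asyncio


async def async_inf_integers(start=0):
    current = start
    while True:
        yield current
        current += 1
        # Simulate an asynchronous operation between items
        await asyncio.sleep(0.5)
```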

This async generator function yields a potentially infinite stream of integer numbers. The call to asyncio.sleep() simulates an asynchronous operation here.

To process this iterator, you can use a while loop along with the anext() function instead of using an async for loop:

In this code snippet, you have a main() function that implements a potentially infinite while loop. The code explicitly communicates that you’re running a potentially infinite loop, which would be harder to communicate with an async for loop.

The anext() function lets you retrieve numbers from the async iterator on demand. Then, you can process the current number. Finally, you use the stop argument in a conditional to stop the loop.

Asynchronous Comprehensions and Generator Expressions

You can also use async iterators and iterables in asynchronous comprehensions. To create an async comprehension, you can use the following syntax:

  • List comprehensions: [item async for item in async_iter]
  • Set comprehensions: {item async for item in async_iter}
  • Dictionary comprehensions: {key: value async for key, value in async_iter}

These comprehensions look like regular comprehensions. The only difference is that you need to use the async for construct, and the async_iter object should be an asynchronous iterator, iterable, or generator.

To illustrate how async comprehensions work, consider the following example:
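A sketch of such an example follows. The async_range() name comes from the surrounding text, while the short delay and the variable names are assumptions:

```python
import asyncio


async def async_range(start, end):
    for number in range(start, end):
        # Simulate a non-blocking operation before producing each item
        await asyncio.sleep(0.2)
        yield number


async def main():
    # Async list comprehension
    number_list = [i async for i in async_range(0, 5)]
    # Async dictionary comprehension
    number_dict = {i: str(i) async for i in async_range(0, 5)}
    print(number_list)
    print(number_dict)


if __name__ == "__main__":
    asyncio.run(main())
```

This sketch prints [0, 1, 2, 3, 4] followed by {0: '0', 1: '1', 2: '2', 3: '3', 4: '4'}.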

In this example, you first use a list comprehension to generate five integer numbers using the async_range() generator function. Then, you create a dictionary comprehension using the numbers as keys and their string representation as values.

When you run the example, you’ll have to wait for the code to complete before the resulting list and dictionary appear on your screen.

Both comprehensions work as expected. You can play around with other examples and generate a set of numbers, a list of squares, and so on.

Finally, you can also create asynchronous generator expressions with the following syntax:
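The general form is (item async for item in async_iter). The sketch below applies it to a hypothetical async_range() generator function, so the names and the delay are assumptions:

```python
import asyncio


async def async_range(start, end):
    for number in range(start, end):
        await asyncio.sleep(0.2)
        yield number


async def main():
    # The parentheses create an async generator iterator, not a list
    squares = (i**2 async for i in async_range(0, 5))
    async for square in squares:
        print(square)


if __name__ == "__main__":
    asyncio.run(main())
```

Nothing is computed when the expression is created. The squares are produced one at a time as the async for loop asks for them.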

Async generator expressions are similar to async comprehensions, but the difference is that they use parentheses instead of other brackets. In this case, instead of a list, set, or dictionary, you get an async generator iterator. Then, you can use this iterator as you’d use a regular one.

Async Iterators in Concurrent Code

Asynchronous iterators shine when used in asynchronous apps that perform several other asynchronous tasks apart from just async iteration. In these situations, your async for loops can give control back to the app’s event loop so that it can run other tasks while waiting for time-consuming tasks to complete.

In the end, the purpose of asynchronous code is to allow you to execute multiple operations concurrently instead of sequentially, making your code more efficient and preventing unresponsive programs.

Up to this point, your code examples only show apps that loop over async iterables or iterators and don’t run other async tasks. Used this way, async iteration offers little benefit because an async for loop runs its iterations sequentially, not concurrently. In other words, the loop processes one item, and only when that iteration finishes does it start the next one, and so on, until it consumes the data.

The real benefit of an async loop in terms of efficiency comes when you run other asynchronous tasks while the loop is running a long-lasting iteration.

To illustrate this situation with an example, say that you have an AsyncCounterIterator that increments a count asynchronously. Here’s the code for this class:
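A sketch of such a class might look like the following. The AsyncCounterIterator name comes from the surrounding text, while the name and end parameters and the range of the random delay are assumptions:

```python
import asyncio
from random import randint


class AsyncCounterIterator:
    def __init__(self, name="", end=5):
        self.counter = 0
        self.name = name
        self.end = end

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.counter >= self.end:
            raise StopAsyncIteration
        self.counter += 1
        # Simulate an operation with a random execution time
        await asyncio.sleep(randint(1, 3) / 10)
        return self.counter
```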

This counter increments the count using asyncio.sleep() to simulate awaitable objects with a random execution time.

In the code below, you create a task() function that iterates over an input async iterator and prints a message to the screen. The main() function calls task() twice. Each time, you pass a new instance of your iterator with a different name. Finally, you run the event loop as usual:

In this example, the await statements run sequentially, which means that the second statement runs only after the first one has finished. As a result, your program can’t run a task from the second loop while the first loop is running. The ideal behavior would be that the first loop’s execution doesn’t block the execution of the second loop.

To fix this issue and make the code work concurrently, you can do something like the following:
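The sketch below shows one possible version of the updated script. It repeats a hypothetical AsyncCounterIterator so that the example is self-contained, and the iterator names and the message format are assumptions:

```python
import asyncio
from random import randint


class AsyncCounterIterator:
    def __init__(self, name="", end=5):
        self.counter = 0
        self.name = name
        self.end = end

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.counter >= self.end:
            raise StopAsyncIteration
        self.counter += 1
        # Simulate an operation with a random execution time
        await asyncio.sleep(randint(1, 3) / 10)
        return self.counter


async def task(iterator):
    async for item in iterator:
        print(item, f"from iterator {iterator.name}")


async def main():
    # gather() schedules both coroutines so that they run concurrently.
    # Awaiting each task() call on its own line would run them sequentially.
    await asyncio.gather(
        task(AsyncCounterIterator("#1")),
        task(AsyncCounterIterator("#2")),
    )


if __name__ == "__main__":
    asyncio.run(main())
```

Because the delays are random, the order in which lines from the two iterators interleave changes from run to run, but each iterator still counts from 1 to 5 in order.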

In this update of your counter.py script, you use the asyncio.gather() function to run awaitable objects concurrently.

Now, when you run your script, it produces items from each task concurrently. This means that the first task doesn’t block the second task’s execution. This behavior can make your code more efficient in terms of execution time if the running tasks are I/O-bound and non-blocking operations.

Conclusion

Now you know how to write asynchronous iterators and iterables in Python. Asynchronous iterators are what Python uses to control async for loops, while asynchronous iterables are objects that you can iterate over using an async for loop or an async comprehension. You can also step through an async iterator one item at a time with the built-in anext() function.

With async iterables and iterators, you can write non-blocking loops in your asynchronous code. This way, you can perform different tasks asynchronously.

In this tutorial, you’ve learned how to:

  • Differentiate async iterators and iterables in Python
  • Create async generator expressions and generator iterators
  • Write async iterators and iterables using .__aiter__() and .__anext__()
  • Use async iterators in async loops and comprehensions

With this knowledge, you can start creating and using asynchronous iterators and iterables in your code, making it faster and more efficient.

Happy Pythoning!

Source: realpython.com
