One skill a day: the perfect combination of coroutines and multiprocessing

We know that coroutines run in a single thread of a single process and achieve high concurrency by making full use of IO wait time. The code outside the IO waits still runs serially. So when there are many coroutines, the serial code inside each of them adds up; once that accumulated time exceeds the IO wait time it could have hidden behind, concurrency hits an upper limit.
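To make the single-process picture concrete, here is a minimal sketch (the URLs and the request count are placeholders I made up): many requests share one process and one thread, only the IO waits overlap, and whatever Python code runs between the awaits still executes serially.

import asyncio
import httpx

async def fetch(client, url):
    resp = await client.get(url)  # the only place this coroutine actually waits
    return resp.status_code

async def main():
    urls = ['https://example.com'] * 100  # placeholder URLs
    async with httpx.AsyncClient() as client:
        # All requests run in one thread of one process; only the IO waits overlap,
        # the Python code around the awaits still runs serially
        print(await asyncio.gather(*(fetch(client, url) for url in urls)))

if __name__ == '__main__':
    asyncio.run(main())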

For example, the rice cooker cooks rice, the washing machine does the laundry, and the kettle boils water. Each of them runs on its own once you start it, so by using their running time we can make these three things appear to happen almost at the same time. But suppose that besides these three there are dozens more things to do, such as turning on the TV, turning on the air conditioner, and sending WeChat messages. Each task on its own only needs a moment to start, and the rest is waiting; but because even starting takes time, kicking all of them off takes a while, and your efficiency is still capped.

Now, if there are two people doing these things together, it’s a different story. One cooks and boils water, while the other runs the washing machine, TV and air conditioner. Efficiency is further improved.

This is the combination of coroutines and multiple processes: many coroutines run inside each process, making full use of every CPU core while also making full use of IO wait time, so both the CPU and the network bandwidth are kept busy. Joining forces makes it faster.

There is a third-party library aiomultiprocess that allows you to combine multiprocessing and coroutines with a few lines of code.

First install using pip:

python3 -m pip install aiomultiprocess

Its syntax is very simple:

from aiomultiprocess import Pool
async with Pool() as pool:
    results = await pool.map(coroutine, parameter_list)

In just 3 lines of code, it starts one worker process per CPU core, and each process keeps launching coroutines.
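If the defaults don't match your machine or your workload, you can also size the pool yourself. The sketch below assumes the `processes` and `childconcurrency` keyword arguments described in the aiomultiprocess documentation (how many worker processes to start, and how many coroutines each worker runs at once); the function and the numbers are just placeholders:

import asyncio
from aiomultiprocess import Pool

async def double(x):
    # stand-in for real work that mixes IO waits with some CPU time
    return x * 2

async def main():
    # 4 worker processes, each running up to 16 coroutines at a time
    async with Pool(processes=4, childconcurrency=16) as pool:
        results = await pool.map(double, range(10))
        print(results)

if __name__ == '__main__':
    asyncio.run(main())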

Let’s write a piece of actual code:

import asyncio
import httpx
from aiomultiprocess import Pool

async def get(url):
    async with httpx.AsyncClient() as client:
        resp = await client.get(url)
        return resp.text


async def main():
    urls = [url1, url2, url3]  # replace with the URLs you want to request
    async with Pool() as pool:
        async for result in pool.map(get, urls):
            print(result)  # the content returned by each URL


if __name__ == '__main__':
    asyncio.run(main())

When I wrote an article on asynchronous coroutines before, some readers asked me: does a crawler's speed really matter that much? Isn't getting past anti-crawler measures the most important thing?

My answer is: don't assume that requesting URLs with aiohttp means you are writing a crawler. In a microservice architecture, when you call your own HTTP interfaces you also use httpx or aiohttp, and in that scenario speed matters a lot; sometimes you need it to be as fast as possible.

For more usage of aiomultiprocess, you can refer to its official documentation.

This article is reprinted from: https://www.kingname.info/2022/04/22/aiomultiprocess/
