Reading Notes: “Learning Pod Memory Usage Using StressChaos Experience”

Original link: https://www.hwchiu.com/read-notes-58.html

Title: “Learning Pod Memory Usage Using the StressChaos Experience”
Category: others
Link: https://chaos-mesh.org/blog/how-to-efficiently-stress-test-pod-memory/

This is an official article from Chaos Mesh. It discusses why the memory usage actually observed when stress-testing with Chaos Mesh does not match the configured values, and it works through the problems step by step. It starts with the memory-related mechanisms in Kubernetes.

At the beginning of the article, the author deploys a simple Pod with a single container, setting the memory request to 200Mi and the memory limit to 500Mi.
The author then runs the free and top commands inside the container and sees a reported memory usage as high as 4G. This clearly conflicts with the 500Mi limit, and it is the first point worth paying attention to.

Kubernetes uses cgroups to account for and control the memory usage of Pods. However, commands such as free and top are not cgroup-aware, so even when you run them inside the container, the output actually reflects the host. To find the container's real usage, you need to read the cgroup files directly, for example:
cat /sys/fs/cgroup/memory/memory.usage_in_bytes
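
The same lookup can be sketched in Python. This is only a minimal sketch: the function name read_cgroup_memory_usage is made up for illustration, and the fallback to memory.current assumes the container may be running on a cgroup v2 host, which the original article does not cover.

```python
from pathlib import Path
from typing import Optional

# Files to try, relative to the cgroup mount point.
# The first is the cgroup v1 path used in the article; the second is the
# cgroup v2 equivalent (an assumption, not mentioned in the article).
CANDIDATES = ("memory/memory.usage_in_bytes", "memory.current")


def read_cgroup_memory_usage(root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the container's memory usage in bytes, or None if not found."""
    for relative in CANDIDATES:
        path = Path(root) / relative
        try:
            return int(path.read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not a plain integer: try the next one.
            continue
    return None


if __name__ == "__main__":
    usage = read_cgroup_memory_usage()
    if usage is None:
        print("no cgroup memory usage file found")
    else:
        print(f"cgroup memory usage: {usage / (1024 * 1024):.1f} MiB")
```

Run inside the container, this should report a value bounded by the Pod's limit rather than the host-wide numbers shown by free and top.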

The article also points out that Kubernetes classifies Pods into three QoS classes according to how their requests and limits are set: BestEffort, Burstable, and Guaranteed.
When the node runs out of memory and the OOM killer starts looking for victims, Guaranteed Pods have the lowest priority for being killed. They are only touched when no other victim can be found.
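
The classification rule can be sketched as a small Python function. This is a simplified illustration, not the kubelet's implementation: the name qos_class and the dictionary layout are hypothetical, and quantities are compared as plain strings, so "1Gi" and "1024Mi" would be treated as different values.

```python
from typing import Dict, List

# One container's resources: {"requests": {...}, "limits": {...}} (hypothetical layout).
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Classify a Pod as BestEffort, Burstable, or Guaranteed."""
    anything_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            anything_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # When only a limit is given, Kubernetes defaults the request to it.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    # The Pod from the article: memory request 200Mi, memory limit 500Mi.
    pod = [{"requests": {"memory": "200Mi"}, "limits": {"memory": "500Mi"}}]
    print(qos_class(pod))
```

Under these rules the Pod from the article is Burstable, because its memory request and limit differ and it sets no CPU values.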

Finally, the article looks more closely at how Kubernetes uses and manages memory. When memory on a node runs low, Kubernetes may trigger eviction and remove some running Pods from the node. As mentioned above, Kubernetes relies on cgroups for memory accounting, so /sys/fs/cgroup/memory/memory.usage_in_bytes naturally becomes an important input to that decision.

Note that /sys/fs/cgroup/memory/memory.usage_in_bytes does not simply mean "the amount of memory currently in use on the system". Its value combines three components: resident set, cache, and total_inactive_file. Kubernetes therefore reads both /sys/fs/cgroup/memory/memory.usage_in_bytes and /sys/fs/cgroup/memory/memory.stat, takes total_inactive_file from the latter, and applies the following formula:
working_set = usage_in_bytes - total_inactive_file
The resulting working_set value is what kubectl top reports, and it is the main indicator Kubernetes uses to decide whether to evict.
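
The working_set formula can be sketched in Python as follows. The helper names parse_memory_stat and working_set are made up for illustration, and clamping the result at zero is an assumption about how a negative value should be handled, not something the article states.

```python
from typing import Dict


def parse_memory_stat(text: str) -> Dict[str, int]:
    """Parse the 'key value' lines of a cgroup memory.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def working_set(usage_in_bytes: int, memory_stat_text: str) -> int:
    """working_set = usage_in_bytes - total_inactive_file, never below zero."""
    inactive_file = parse_memory_stat(memory_stat_text).get("total_inactive_file", 0)
    return max(usage_in_bytes - inactive_file, 0)


if __name__ == "__main__":
    # Made-up sample values, in bytes, purely for illustration.
    sample_stat = "cache 104857600\nrss 209715200\ntotal_inactive_file 52428800\n"
    print(working_set(314572800, sample_stat))
```

In a real container the two inputs would come from memory.usage_in_bytes and memory.stat under /sys/fs/cgroup/memory/.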

The amount of memory available on a node is given by
memory.available = node.status.capacity[memory] - working_set
In other words, the node's total capacity minus working_set is the currently available amount. Once that amount drops below the threshold, Kubernetes starts evicting. The official documentation describes this behavior in great detail. If you are interested, it is worth taking the time to read it in full:
https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/
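
The eviction check described above can be sketched like this. The function names are hypothetical, and the 100Mi default is the hard eviction threshold for memory.available that the Kubernetes documentation lists for Linux nodes. Treat it as an assumption, since a cluster may configure a different value.

```python
MIB = 1024 * 1024

# Assumed default hard eviction threshold for memory.available (configurable per node).
DEFAULT_THRESHOLD_BYTES = 100 * MIB


def memory_available(capacity_bytes: int, working_set_bytes: int) -> int:
    """memory.available = node capacity - working_set."""
    return capacity_bytes - working_set_bytes


def should_evict(capacity_bytes: int, working_set_bytes: int,
                 threshold_bytes: int = DEFAULT_THRESHOLD_BYTES) -> bool:
    """True when available memory has dropped below the eviction threshold."""
    return memory_available(capacity_bytes, working_set_bytes) < threshold_bytes


if __name__ == "__main__":
    capacity = 4096 * MIB  # a hypothetical node with 4Gi of memory
    for used in (2048 * MIB, 4000 * MIB):
        print(used // MIB, "MiB working set ->", should_evict(capacity, used))
```

Here working_set_bytes stands for the node-level working set, computed with the same usage_in_bytes - total_inactive_file formula.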

Personal information

I currently offer Kubernetes-related courses on the Hiskio platform. If you are interested, you are welcome to take a look and share them. They cover my thinking on Kubernetes, from the underlying fundamentals to hands-on practice.

For details, please refer to the online course details: https://course.hwchiu.com/

You are also welcome to like and follow my personal fan page, where I regularly share articles, some translated and some original, mainly focused on the CNCF ecosystem:
https://www.facebook.com/technologynoteniu

If you use Telegram, you can also subscribe to the following channel, where I regularly push notifications about new articles:
https://t.me/technologynote

Your donations give me the motivation to keep writing.

