Troubleshooting the problem of a rising TCP connection count

Original link: https://jasonkayzk.github.io/2022/08/08/%E6%8E%92%E6%9F%A5tcp%E8%BF%9E%E6%8E%A5%E6%95%B0%E4%B8%8D%E6%96%AD%E5%8D%87%E9%AB%98%E7%9A%84%E9%97%AE%E9%A2%98/

A “weird thing” happened recently: the number of TCP connections on my server has been climbing steadily every day. Most likely a long-lived connection was being created somewhere and never released.

I have been quite busy lately, but I took some time tonight to look into it and solved it.

Troubleshooting the problem of a rising TCP connection count

Originally, the API was deployed as a cloud function. Since the blog doesn't get much traffic, the pod was killed after a short idle period, so even if there was a TCP connection leak, it went unnoticed…

But the cloud functions recently started charging, with no more free quota, so I moved the service back to my own server.

And so the TCP leak bug revealed itself.

The problem looked like this: the number of TCP connections just kept rising.

At first I thought it was the Redis connections that were never released, but it turned out to be the MongoDB connections!

Configure Redis connection

At first I thought it was a Redis connection problem, so I modified the Redis configuration:

```
# Close the connection after a client is idle for N seconds (0 to disable)
- timeout 0
+ timeout 3600

# TCP keepalive.
#
# If non-zero, use SO_KEEPALIVE to send TCP ACKs to clients in absence
# of communication. This is useful for two reasons:
#
# 1) Detect dead peers.
# 2) Force network equipment in the middle to consider the connection to be
#    alive.
#
# On Linux, the specified value (in seconds) is the period used to send ACKs.
# Note that to close the connection the double of the time is needed.
# On other kernels the period depends on the kernel configuration.
#
# A reasonable value for this option is 300 seconds, which is the new
# Redis default starting with Redis 3.2.1.
- # tcp-keepalive 60
+ tcp-keepalive 60
```
  • Set timeout to 3600: connections may sit idle for up to an hour before Redis closes them;
  • Set tcp-keepalive to 60: keep long-lived connections, but probe their liveness every 60 seconds;

After the configuration is complete, restart the Redis service;

At first I thought the problem was solved, but after a few days the TCP connection count was still climbing.

View TCP connections

This time I investigated more carefully.

First, check which peer IPs have the most simultaneous connections:

```shell
netstat -an | awk -F: '{print $2}' | sort | uniq -c | sort -nr | head
```
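To make the intent of this pipeline concrete, here is a small sketch in Node.js (the same runtime as the leaking service): tally how many connections exist per foreign IP. The function name and the sample lines are my own illustrations, not real output from the server.

```javascript
// Sketch of what the netstat | awk | sort | uniq -c | sort -nr pipeline is
// after: count how many connections exist per foreign (remote) IP address.
function tallyForeignIps(netstatLines) {
  const counts = new Map();
  for (const line of netstatLines) {
    const fields = line.trim().split(/\s+/);
    // `netstat -an` TCP fields: proto, recv-q, send-q, local addr, foreign addr, state
    if (fields.length < 5 || !fields[0].startsWith('tcp')) continue;
    // Drop the port from "127.0.0.1:27017" (IPv4 only; IPv6 would need different splitting)
    const ip = fields[4].split(':')[0];
    counts.set(ip, (counts.get(ip) || 0) + 1);
  }
  // Descending by count, like `sort -nr | head`
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Illustrative sample lines (not real output):
const sampleLines = [
  'tcp        0      0 127.0.0.1:41000         127.0.0.1:27017         ESTABLISHED',
  'tcp        0      0 127.0.0.1:41002         127.0.0.1:27017         ESTABLISHED',
  'tcp        0      0 10.0.0.5:443            203.0.113.7:52100       ESTABLISHED',
];
console.log(tallyForeignIps(sampleLines));
// [ [ '127.0.0.1', 2 ], [ '203.0.113.7', 1 ] ]
```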

This showed a large number of 127.0.0.1 entries, indicating that the leak was indeed in local TCP connections.

Then, see how many TCP connections are in the ESTABLISHED state:

```shell
netstat -npt | grep ESTABLISHED | wc -l
```

There were thousands of them, confirming that TCP connections were not being released.

Then check the connections on the MongoDB port:

```shell
netstat -an | grep :<mongodb-port> | sort
```
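This check can be sketched the same way in Node.js: count ESTABLISHED connections involving a given port. Port 27017 and the sample lines below are illustrative placeholders, since the real port is elided above.

```javascript
// Sketch of the `netstat -an | grep :<port>` check: count connections in the
// ESTABLISHED state that involve a given port. Sample data is illustrative.
function countEstablishedOnPort(lines, port) {
  return lines.filter((line) => {
    const f = line.trim().split(/\s+/);
    // `netstat -an` TCP fields: proto, recv-q, send-q, local addr, foreign addr, state
    return f.length >= 6 && f[0].startsWith('tcp') && f[5] === 'ESTABLISHED' &&
      (f[3].endsWith(':' + port) || f[4].endsWith(':' + port));
  }).length;
}

const mongoSampleLines = [
  'tcp        0      0 127.0.0.1:41000         127.0.0.1:27017         ESTABLISHED',
  'tcp        0      0 127.0.0.1:41002         127.0.0.1:27017         ESTABLISHED',
  'tcp        0      0 127.0.0.1:41004         127.0.0.1:27017         TIME_WAIT',
];
console.log(countEstablishedOnPort(mongoSampleLines, 27017)); // 2
```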

The output was flooded with connections in the ESTABLISHED state, confirming that the MongoDB connections were leaking.

Checking the code, I found that the MongoDB client created in the Node service never released its connections.

So I modified the code and added client.close();
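For illustration, a minimal sketch of the pattern behind that fix, assuming the Node.js MongoDB driver's connect()/close() lifecycle (withClient and the stub client are my own names, not the service's actual code): the point is that close() runs in a finally block, so the TCP connection is released even when the query throws.

```javascript
// Guarantee that a client acquired for a piece of work is always closed.
// `makeClient` is a placeholder for something like `() => new MongoClient(url)`.
async function withClient(makeClient, work) {
  const client = makeClient();
  await client.connect();
  try {
    return await work(client);
  } finally {
    await client.close(); // without this, every request leaks one connection
  }
}

// Stub standing in for a real MongoDB client, so the sketch runs without a database.
function makeStubClient(log) {
  return {
    connect: async () => log.push('connect'),
    close: async () => log.push('close'),
    query: async () => 42,
  };
}

(async () => {
  const log = [];
  const result = await withClient(() => makeStubClient(log), (c) => c.query());
  console.log(result, log); // 42 [ 'connect', 'close' ]
})();
```

With the real driver, an alternative is a single long-lived client shared across requests, relying on the driver's built-in connection pool instead of connecting and closing per request.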

Then, using:

```shell
lsof -i:<service-port>
```

find the PID of the process listening on the service port, kill it, and restart the service.

After observing for a while, the TCP connection count was back to normal!
