Detecting and identifying network proxy behavior from traffic and network connection characteristics
Foreword
2022 Children’s Day is coming, let’s celebrate╰(○’◡’○)╮
This article looks at the topic from both sides: proxy technology, and the offensive and defensive techniques of network security.
I have already told you how people get caught; do I still need to teach you how to defend against it? (That said, current products do not seem to hold up very well.)
I hope you are still standing by Children’s Day ╰(○’◡’○)╮ Fix what needs fixing, move what needs moving, replace what needs replacing, and spare what should be spared.
Introduction to related technologies
The following references cover the background:
- Analysis and understanding of TLS handshake protocol – analysis of an HTTPS request traffic packet
- SSL/TLS protocol parsing! What is SNI? SNI Identification?
You can read up on the relevant technologies yourself.
Put simply, for a typical user on an HTTPS connection today,
the following are encrypted and confidential:
- URL path
- Page content
- Cookies and other session data
- User-Agent header
- HTTP request method [not entirely sure, this is from memory]
The following are not encrypted or confidential:
- IP and port of the visitor
- IP and port of the website
- Hostname of the site (exposed through SNI; ESNI, which tried to counter this, has been thoroughly buried)
- DNS lookups for the site’s domain name (unless encrypted DNS is used)
- Time of the visit
- The site’s certificate (without the private key; sent in plaintext up to TLS 1.2, and under TLS 1.3 still obtainable by connecting to the server directly)
- Volume of data transferred
- Bandwidth of the transfer
The certificate itself also exposes:
- Issuer (the CA)
- Subject (who the certificate was issued to)
- Issue date and expiration date
- Domain names covered by the certificate (especially multi-domain and wildcard shared certificates)
- The certificate’s public key
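As a rough illustration of how little work it takes to read those fields, here is a minimal Python sketch. It assumes the certificate has already been obtained and decoded into the dictionary layout that the standard library’s `ssl.SSLSocket.getpeercert()` returns; the `SAMPLE_CERT` values and the `summarize_cert` helper are made up for the example, and the public key is not part of that dictionary, so it is left out.

```python
import ssl
from datetime import datetime, timezone


def summarize_cert(cert: dict) -> dict:
    """Flatten a getpeercert()-style dict into the exposed fields."""

    def flatten(rdns):
        # Each RDN is a tuple of (key, value) pairs; merge them into one dict.
        return {key: value for rdn in rdns for key, value in rdn}

    def to_utc(text):
        # cert_time_to_seconds() parses the "Mon DD HH:MM:SS YYYY GMT" format.
        return datetime.fromtimestamp(ssl.cert_time_to_seconds(text), tz=timezone.utc)

    names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    return {
        "issuer": flatten(cert.get("issuer", ())).get("organizationName"),
        "subject": flatten(cert.get("subject", ())).get("commonName"),
        "not_before": to_utc(cert["notBefore"]),
        "not_after": to_utc(cert["notAfter"]),
        "dns_names": names,
        "has_wildcard": any(name.startswith("*.") for name in names),
    }


# Hypothetical sample, hand-written in the getpeercert() layout.
SAMPLE_CERT = {
    "subject": ((("commonName", "example.com"),),),
    "issuer": (
        (("countryName", "US"),),
        (("organizationName", "Example CA"),),
        (("commonName", "Example CA R1"),),
    ),
    "notBefore": "May  1 00:00:00 2022 GMT",
    "notAfter": "Jul 30 00:00:00 2022 GMT",
    "subjectAltName": (("DNS", "example.com"), ("DNS", "*.example.com")),
}

summary = summarize_cert(SAMPLE_CERT)
print(summary["issuer"], summary["dns_names"], summary["has_wildcard"])
```

The point is only that issuer, subject, validity window and covered names are all plain metadata once the certificate is in hand.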
Proxy behavior characteristics in traffic and network connections
To identify something you have to know both yourself and the enemy. Among today’s mainstream proxy tools, leaving aside dated products such as IPsec, OpenV*N and L2TP, the common ones are S*s, S*R, V*y, V*s, T*n and X*y. You know what these are; don’t ask me, I don’t know.
(Below, “domestic” means mainland China under the People’s Republic of China, excluding Hong Kong, Macao, Taiwan and some disputed territories; “overseas” means everywhere else on Earth, including Hong Kong, Macao and Taiwan.)
Path examples 1-4 are the ordinary ways of accessing a website:
1. Path for an ordinary domestic user accessing an unrestricted domestic website
- Zhang San –> Domestic ISP –> Domestic website server
2. Path for an ordinary domestic user accessing a restricted domestic website
- Zhang San –> Domestic ISP –> Security gateway (DNS hijacking/pollution, HTTP hijacking/blocking, IP:port/TCP/UDP/IP blocking or resets, BGP hijacking, man-in-the-middle attack) –> Connection fails
- Zhang San –> Dumb browser, dumb operating system –> Connection fails
3. Path for an ordinary domestic user accessing an unrestricted overseas website
- Zhang San –> Domestic ISP –> Cross-border and other security gateways (allowed through) –> Overseas ISP –> Overseas website server
4. Path for an ordinary domestic user accessing a restricted overseas website
- Zhang San –> Domestic ISP –> Cross-border and other security gateways (DNS hijacking/pollution, HTTP hijacking/blocking, IP:port/TCP/UDP/IP blocking or resets, BGP hijacking, man-in-the-middle attack) –> Connection fails
- Zhang San –> Dumb browser, dumb operating system –> Connection fails
Path examples 5-8 are unconventional ways of reaching restricted overseas websites:
5. Using an overseas server directly as a proxy to reach a restricted website
- Zhang San –> Domestic ISP –> Cross-border and other security gateways (allowed through) –> Overseas ISP –> Overseas proxy server –> Overseas ISP –> Overseas website server
6. Using a domestic server to relay to an overseas proxy server to reach a restricted website
- Zhang San –> Domestic ISP –> Domestic server –> Domestic ISP –> Cross-border and other security gateways (allowed through) –> Overseas ISP –> Overseas proxy server –> Overseas ISP –> Overseas website server
7. Using a cross-border dedicated line to reach a restricted website
- Zhang San –> Domestic ISP –> Domestic transit server –> Cross-border dedicated line segment –> Overseas server –> Overseas ISP –> Overseas website server
8. Using VNC or another remote desktop technology to reach a restricted website remotely
- Zhang San –> Domestic ISP –> Domestic transit server –> Cross-border and other security gateways (allowed through) –> Overseas remote desktop server –> Overseas ISP –> Overseas website server
Of course, besides the cross-border and other security gateways, domestic ISPs themselves also do some censorship and blocking, such as DNS hijacking.
From path examples 1-8, any reasonable reader should now see how the traffic travels.
From the earlier section, it should also be clear what is and is not encrypted in an HTTPS/TLS setting.
Detection and identification then come down to targeting the key nodes on the path, the circumvention methods, and the unencrypted content.
Algorithm ideas and example methods
Traffic patterns
With any proxy, downloading or streaming large media files means long-lived, high-volume, high-bandwidth transfers.
There are several angles of attack:
- Whether the domain is a common, heavily used one (for example, Bilibili’s overseas digital media nodes, or Steam’s international download domains)
- Whether the IP is a common, heavily used one (for example, Cloudflare CDN IPs or Akamai CDN IPs)
- Average bandwidth between the two endpoints over a period (for example, 1080P video runs at 5-10M, 4K video at 20-50M, and an IDM download saturates the link)
- Traffic symmetry (requires visibility at the server or its upstream network management; for example, a proxy server’s inbound and outbound traffic and bandwidth are usually roughly equal)
- Who visits the domain + IP, how it is used, and when
- Active probing to judge whether the connection is ordinary, normal, compliant use (for example: check the content the site responds with; check whether the domain and its associated domains/IPs/ASNs are indexed, appear on blacklists or whitelists, contain keywords, or have real-name registration; check where the registrar is and the risk level of the domain suffix)
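The bandwidth and symmetry points above can be sketched in a few lines of Python. This is only an illustration under assumptions: the samples are taken to be per-period averages in Mbps, and the thresholds (`min_mean_mbps`, `max_cv`) and both function names are invented for the example rather than taken from any real product.

```python
from statistics import mean, pstdev


def sustained_flat_bandwidth(samples_mbps, min_mean_mbps=5.0, max_cv=0.15):
    """Return True when per-period averages are both high and nearly constant."""
    if len(samples_mbps) < 3:
        return False  # too few periods to call anything "sustained"
    avg = mean(samples_mbps)
    if avg < min_mean_mbps:
        return False  # low-bandwidth connections are not interesting here
    cv = pstdev(samples_mbps) / avg  # coefficient of variation
    return cv <= max_cv


def symmetry_ratio(bytes_in, bytes_out):
    """1.0 means perfectly symmetric; close to 0 means one-directional."""
    larger = max(bytes_in, bytes_out)
    return min(bytes_in, bytes_out) / larger if larger else 0.0


print(sustained_flat_bandwidth([8.1, 7.9, 8.0, 8.2]))  # steady video-like stream
print(sustained_flat_bandwidth([1.0, 30.0, 2.0, 25.0]))  # bursty browsing
print(symmetry_ratio(1000, 950))
```

A relay that forwards everything it receives tends to score high on both checks, while an ordinary web server sends far more than it receives.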
Practical measures include, but are not limited to:
- Blocking unusual domain suffixes (the five free ones: cf, ga, gq, tk, ml; cheap ones such as xyz and de; ones with a long history of abuse such as me, cc and top)
- Blocking unusual keywords in the main domain (e.g. airport, v*n, FQ, v*y)
- Blocking unusual keywords in subdomains (such as HK, TW, SG, US, Azure, AWS, Hinet, IPLC, IEPL, v*y, lv, yun, cu, cm, ct, ddns, az, cn2, gia, 9929, dmit, do, vu, vir, rn, pr, cloud, emby, drive, cdn, gd)
- Blocking connections whose average bandwidth is nearly identical across multiple time periods (for example, a sustained 5-50M, or a link that stays saturated)
- Blocking IPs and ASNs with a bad history (such as the routinely abused CF CDN, AWS, Linode, DO, Vultr and Oracle)
- Flagging a small number of individuals who persistently visit unusual websites
- Flagging sites whose content is trivial but whose traffic is abnormal (for example a periodic table page, SpeedTest, the Pagoda (BT Panel) default page, WordPress, WHMCS, and other sites that should not have sustained or heavy traffic)
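The suffix and keyword rules in the list above are simple string checks. Here is a minimal Python sketch, with several assumptions spelled out: the keyword sets are a small subset of the examples in the list, the last label is treated as the suffix (which ignores multi-label public suffixes), subdomain keywords are matched against hyphen-separated tokens so that “us” does not fire on “status”, and `domain_flags` is a name invented for this example.

```python
# Subsets of the examples listed above; tune to taste.
RISKY_SUFFIXES = {"cf", "ga", "gq", "tk", "ml", "xyz", "de", "me", "cc", "top"}
MAIN_KEYWORDS = {"airport", "fq"}
SUB_KEYWORDS = {"hk", "tw", "sg", "us", "azure", "aws", "iplc", "iepl", "cn2", "gia", "ddns"}


def domain_flags(hostname):
    """Return a list of reasons this hostname looks suspicious (may be empty)."""
    labels = hostname.lower().rstrip(".").split(".")
    if len(labels) < 2:
        return []
    # Simplification: last label is the suffix, the one before it is the main domain.
    suffix, main, subs = labels[-1], labels[-2], labels[:-2]
    flags = []
    if suffix in RISKY_SUFFIXES:
        flags.append("suffix:" + suffix)
    # Main-domain keywords are matched as substrings.
    flags += ["main:" + kw for kw in sorted(MAIN_KEYWORDS) if kw in main]
    # Subdomain keywords are matched against whole hyphen-separated tokens.
    tokens = {token for label in subs for token in label.split("-")}
    flags += ["sub:" + kw for kw in sorted(SUB_KEYWORDS & tokens)]
    return flags


print(domain_flags("hk-iplc.fastairport.xyz"))
print(domain_flags("status.example.com"))
```

Each flag on its own is weak evidence; they are meant to feed a combined score, not to trigger a block alone.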
For machines that can be monitored (including but not limited to Tencent Cloud’s domestic and overseas servers, which fall under Chinese regulation, and the domestic and overseas servers of China’s three major carriers), traffic can be identified by deploying network management, network security and firewall equipment. Checking whether your upstream and downstream traffic is symmetric is not hard at all; anyone with hands can do it.
Network connection patterns
First, going by the path examples, the “domestic transit server” that may be in use can be probed.
Whether it is a dedicated line or an ordinary data center, there is always network management, network security or firewall equipment, and you can analyze traffic from there. For example, look at these careless users:
This is about as careless as it gets. Some put the path straight into plaintext HTTP/WS, and it even contains keywords. Truly a master move. And as for those on HTTPS, who do they think they are fooling with that domain name?
With HTTP/WS nothing is confidential. Use it in mainland China and, cross-border or not, everything is in plaintext and on full display. That does not necessarily mean you are finished, of course; after all, people are still running free-data tricks entirely over WS.
With HTTPS/WSS some content is encrypted, but plenty is still unencrypted, for example:
- Certificate (whether it is a junk or high-risk certificate: CF, self-signed, test, free certificates such as LE/TA, expired, hostname mismatch, or shared with blacklisted domains)
- Domain name and its content (same angles as for domains above: analyze public data, blacklist and whitelist history, negative records, active probing, main-domain and subdomain keywords, manual tagging, etc.)
- Server IP and ASN (same angles as for IP/ASN above: analyze the IP/ASN’s blacklist and whitelist history, abuse, connection patterns, active probing, port scanning, etc.)
- Visitor IP (for example, whether the IP’s recent DNS queries and HTTP requests touched sensitive words or restricted content such as Google, VPN or YouTube, and whether the destinations it connects to fit a proxy access model*)
Note: the typical proxy access models* are:
- Global proxy: almost no domestic access; sustained access to a single (or a few) overseas IP or domain; there may be plaintext DNS queries for both domestic and overseas websites
- Bypass mainland: normal domestic access; sustained access to a single (or a few) overseas IP or domain; there may be plaintext DNS queries for both domestic and overseas websites
- Proxy restricted sites only: normal domestic and overseas access; occasional access to a single (or a few) overseas IP or domain; there may be plaintext DNS queries for both domestic and overseas websites
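To make the three models concrete, here is a rough Python sketch that labels one visitor’s flow log. Everything in it is an assumption for illustration: the input is a list of `(destination, is_domestic)` pairs, “a few” means the top two overseas destinations, and the cutoffs `concentrated_share` and `min_domestic_share` are arbitrary. Flow counts alone cannot separate the third model from a user with no proxy at all, so the sketch returns a combined label for that case.

```python
from collections import Counter


def classify_access_model(flows, few=2, concentrated_share=0.9, min_domestic_share=0.05):
    """Label one visitor's flows; flows is an iterable of (destination, is_domestic)."""
    flows = list(flows)
    if not flows:
        return "no data"
    overseas = Counter(dest for dest, is_domestic in flows if not is_domestic)
    overseas_total = sum(overseas.values())
    if overseas_total == 0:
        return "no overseas traffic"
    domestic_share = (len(flows) - overseas_total) / len(flows)
    # Share of overseas flows that go to the top few destinations.
    top_share = sum(count for _, count in overseas.most_common(few)) / overseas_total
    if top_share < concentrated_share:
        return "restricted-only or no proxy"
    if domestic_share < min_domestic_share:
        return "global proxy"
    return "bypass mainland"


# Hypothetical logs using documentation-range addresses and example domains.
global_user = [("203.0.113.7", False)] * 50
bypass_user = [("a.example.cn", True)] * 30 + [("203.0.113.7", False)] * 40
mixed_user = (
    [("a.example.cn", True)] * 30
    + [(f"site{i}.example.com", False) for i in range(20)]
    + [("203.0.113.7", False)] * 3
)
for log in (global_user, bypass_user, mixed_user):
    print(classify_access_model(log))
```

Telling the third model apart would need the other signals in this article, such as DNS queries and the reputation of the occasional destination.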
Unknown (encrypted) traffic is dancing on a knife’s edge: in risk and ease of identification it is second only to plaintext HTTP.
Algorithm summary
You can take Stripe Radar’s risk-control multiplier scoring rules as a reference and assign weights to fit your own situation.
Candidate scoring items:
- Server IP
- Server ASN
- Other websites hosted on the same server IP
- CDN provider identification
- Domain suffix
- Search engine indexing of the domain
- Real-name registration of the domain
- Filing record of the domain
- Registrar of the domain
- Registration date and validity period of the domain
- DNS resolution provider of the domain
- The domain’s associated DNS record set
- CNAME identification in the domain’s DNS resolution
- Main-domain keywords
- Subdomain keywords
- Multi-period bandwidth analysis
- Response status code from active probing
- Keyword recognition in the response content from active probing
- Feature recognition of the response content from active probing
- Issuer of the certificate
- Issue date and validity period of the certificate
- Subject of the certificate
- Domains sharing the certificate
- Visitor’s IP
- Visitor’s DNS queries
- Visitor’s historical behavior
- Visitor’s proxy access model
- Unusually heavy traffic
Traffic symmetry, response content identification, the proxy access model and similar items need further research by whoever uses this.
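As a sketch of what multiplicative weighting over such items could look like: each signal that fires multiplies a base score by its weight, with weights above 1 raising risk and weights below 1 lowering it. The signal names and every weight below are made up for illustration, and this is not a description of how Stripe Radar works internally.

```python
from math import prod

# Hypothetical weights: > 1 raises risk, < 1 lowers it.
WEIGHTS = {
    "risky_suffix": 1.5,
    "subdomain_keyword": 1.3,
    "free_or_self_signed_cert": 1.2,
    "abused_asn": 1.4,
    "flat_high_bandwidth": 2.0,
    "indexed_by_search_engine": 0.6,
    "filing_record_present": 0.5,
}


def risk_score(signals, weights=WEIGHTS, base=1.0):
    """Multiply the base score by the weight of every signal that fired."""
    unknown = set(signals) - set(weights)
    if unknown:
        raise KeyError(f"no weight defined for: {sorted(unknown)}")
    return base * prod(weights[name] for name in signals)


print(risk_score(["risky_suffix", "flat_high_bandwidth"]))
print(risk_score(["filing_record_present", "indexed_by_search_engine"]))
```

Multiplying rather than adding means several weak signals compound quickly, while a strong trust signal can pull the whole score back down.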
Single-point attack on cross-border dedicated lines
Many people tout dedicated lines as stable, risk-proof, and immune to peak hours and periods of heightened blocking. In fact, sellers understand the core problem far better than buyers do.
As one cross-border dedicated line seller’s announcement put it:
Over the past few days we have received frequent user reports of IPLC disconnections. Here is one consolidated explanation.
**Cloud IPLC dedicated lines (including forwarding): the intranet and public network are managed by professional switching and routing equipment, and the intranet is stable and not subject to attack. The IPLC runs stably from entrance to exit. We have said repeatedly that protocol blocking can occur between the user’s local network and the entrance’s public address (the symptom is that the entrance cannot be reached). If changing your local address restores service, then in 90% of cases your local carrier is blocking you (this includes cases where you and the entrance are in the same location). You can also test over a mobile data connection. In such cases the fix is to change the protocol and change your local IP (encrypt, encrypt, encrypt the segment from your local network to the entrance’s public address; do not assume the carrier will leave you alone because you are on a dedicated line, and do not forget that you still cross the public network to get from your local network to the entrance).
As it happens, a provider called Nathosts was being advertised recently. I don’t know how the advertising was done, but the upshot was that the line they had been provided was pulled and became unusable. That is the main risk of a cross-border dedicated line: the line itself simply gets terminated.
The single point is fairly obvious. A cross-border dedicated line must have a domestic entrance, and that entrance very likely has a real-name registration, even if the real name is very likely fake, purchased or otherwise not genuine (though there are also sales managers and companies who take the money and provide the line anyway; after all, as long as you do not push things too far, it is no big deal for them to shut your machine off in time). Being inside China means it is in all probability within reach of enforcement, such as data center inspections or seizure by the cyber police. Even with N layers of forwarding there still has to be a domestic entrance for users to connect to, and the user’s traffic has to cross the public network to reach that entrance, where it can be inspected. The classic case is the users making plaintext HTTP requests in the figure above.
On wrongful blocking
Once a risk-control system raises its strictness, misjudgments are inevitable, such as the cancelled orders, rejections and refused payments commonly seen with Stripe and other overseas merchants.
In general, restriction can be applied through a variety of methods, and stepwise escalation is the GFW’s main behavior at present (outside sensitive periods).
For example, it starts with intermittent targeted blocking, DNS pollution, IP:port blocking, domain blocking and so on, and ends with blocking both the IP and the domain outright.
As long as the impact is not major, they go ahead. These days they are not afraid at all: GitHub’s raw content, Cloudflare’s CDN/Pages/Workers and jsDelivr’s main domain have been hit, some blocked for a while and then released, others shut off whenever they please, on top of DNS pollution, domain blocking, and the frequent “anti-fraud” blocking in Jiangsu, Zhejiang and Quanzhou. Clearly they do not give foreign forces a second thought (˵¯͒ བ¯͒˵)
End
Have a happy Children’s Day, and give some thought to how to guard against all this; don’t wait until you are walled outright.
This article is reprinted from: https://www.blueskyxn.com/202205/6060.html
This site is for inclusion only, and the copyright belongs to the original author.