How to Secure Cloud Services with Cloudflare – A Primer on “WAF” and Other Tricks

Original link: https://blog.besscroft.com/articles/2023/cloudflare-triple-waf/

Foreword

Around the May Day holiday, I took some time to maintain my personal cloud services and rebuilt part of the cluster. Friends who visited during those days would have seen a simple “maintenance page”. I won’t go into the new personal cloud architecture here, since it is not the subject of this article; interested readers can refer to my previous article (on the old architecture). This round fixed many things that were unreasonable before and, along the way, replaced all IPs and certificates.

Dynamic k8s cluster implementation scheme

Target audience for this article

This article is mainly aimed at two types of readers. The first is heavy Cloudflare users; it shows you how to do “targeted optimization and protection” on that basis. The second is readers who have never used Cloudflare or are new to it. Either way, before reading this article you should at least have used a cloud server and be able to understand the related cloud service concepts.

If you are not familiar with Linux and cloud services, or are a little weak on networking topics such as CDN, HTTPS and DNS, I recommend buying a cheap cloud server and learning the basics first; $4 per month is enough.

Overview

Since this is a primer on Cloudflare and WAF, let’s introduce these two first.

Cloudflare is a global network designed to make everything you connect to the internet safe, private, fast and reliable.

This is the official introduction. Put simply, it provides many services, of which the best known are its CDN and DDNS services.

A WAF (Web Application Firewall) helps protect web applications by filtering and monitoring HTTP traffic between the web application and the Internet. It typically protects web applications from attacks such as cross-site request forgery (CSRF), cross-site scripting (XSS), file inclusion, and SQL injection.

Note: WAF in this article mainly refers to Cloudflare WAF.

Cloud service attack surface

When it comes to the attack surface, I will take personal use of cloud services as the example (I have little exposure to large-scale cloud application scenarios, sorry).

Network security: the attacker obtains your server IP, or exploits insecure transport protocols (such as HTTP), and then attacks.
Legitimacy of access: if every user has unimpeded access to your service, security risk inevitably increases. How to deal with this is covered below; after all, we still need to guarantee access for normal users.
Network attack: DDoS attacks, which usually take cloud services down. After all, a simple operation on a personal laptop can “instantly” push a student-tier server into a black hole. How to “mitigate” DDoS as far as possible is covered below.

You cannot fully stop a determined attacker, but a personal cloud service can still set some rules to soften the impact. The real experts have no reason to spend time and money attacking us. In practice, most of what we defend against is “indiscriminate attack” scripts.

Of course, the often-discussed topics of software security, data security and supply chain security are beyond the scope of this article. If you are interested, feel free to talk to me about them!

Encrypted communication

Everyone is familiar with HTTPS, but fewer people pay attention to certificate chain “trust”, and to whether every hop from the client to the server is encrypted with a trusted certificate. After all, once the origin site is exposed, all kinds of annoying scripts will come after you. Encryption is also there to prevent data leakage and data tampering.

Here is an official picture, in which we can see 2 locks. The lock between the browser and Cloudflare is the Edge certificate, which Cloudflare presents to clients visiting your website or application. The lock between Cloudflare and the origin server is the origin server certificate (Origin certificate), which encrypts the traffic between the origin server and Cloudflare.

The article “Cleverly using Alibaba Cloud OSS to play with free distributed storage” relies on the same principle: make sure all traffic passes through Cloudflare so that no outbound traffic is billed.

Same Origin Policy

The same-origin policy here may differ slightly from the same-origin concept everyone knows from development, but the principle is similar.

Put simply, it means combining the CDN with a same-origin policy for static resources, which improves the cache hit rate while reducing the risk of the “origin site” being discovered.

Why should you care about cloud security?

Cloud security (which can of course be extended to data security and so on) hardly needs much explanation. For enterprises, it is no exaggeration to say that it is a matter of “life and death”.

For individuals, I have summarized the following points:

To protect personal privacy.
To save time: once the configuration is done, the service can simply be left running, which significantly reduces maintenance time so that you do not waste every day dealing with assorted trouble.
To learn: in the process you not only pick up the theory but also get hands-on practice. Why not?

Common means of attack

I will only explain these briefly. I cannot teach others how to carry them out, and with my limited skill I could not teach it anyway.

SOP-based attack and cross-domain resource detection

SOP, the Same Origin Policy, is one of the most important security mechanisms on the Web. Generally speaking, attacks against SOP are browser-based, that is, techniques for “bypassing SOP”.

We know that two URLs with the same protocol, hostname and port are regarded as the same origin. So what do SOP-based attacks look like in “cloud services”?

Attacking resources directly (static resources are usually placed on a CDN, which is why we so often hear about “sky-high bills”).
Probing cross-origin resources to find the origin site, then attacking it.

The premise of these attacks is that the rules are misconfigured. For example, resources allow cross-origin requests, so other origins can request them at will. Or the cache policy is misconfigured, resulting in a low cache hit rate. Or resources are cached but the origin site is not properly hidden. With proper configuration, such attacks can be greatly mitigated.
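
The same-origin rule mentioned above (same protocol, hostname and port) can be sketched in a few lines of Python. This is only an illustration of the comparison itself, not of how any browser or Cloudflare implements it; the function names and the example.com URLs are made up for the example.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Ports implied by the scheme when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, hostname, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs share an origin only if scheme, hostname and port all match."""
    return origin_of(url_a) == origin_of(url_b)
```

For instance, https://blog.example.com/a and https://blog.example.com:443/b count as the same origin, while switching to http or to a cdn subdomain does not.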

Downgrade HTTPS to HTTP

This is also called bypassing HTTPS. It rests on the premise that the certificate cannot be trusted.

In theory, content encrypted with HTTPS will not leak “in transit”. If the key itself is leaked, then the problem is on your side.

One point many people overlook is that the server does not enforce HTTPS, and even keeps plain HTTP access open, which is very risky. If users can reach the service directly over HTTP, what “useful” information they can obtain hardly needs spelling out.

Many years ago I made this mistake without realizing it, and corrected it later. Once you are on HTTPS, there is no reason to keep serving HTTP. Serving HTTP is as good as telling everyone: hey, my origin site is here, with no mitigation in front of it, come and attack me (doge).
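
As a rough sketch of what “enforcing HTTPS” means on the server side, the function below decides how to answer a request based on its scheme. It is a simplified model under my own assumptions (the function name and the one-year max-age are illustrative), not the configuration used in this article; in practice this is usually a web server or Cloudflare setting rather than application code.

```python
def force_https(scheme: str, host: str, path: str):
    """Return (status, headers): redirect plain HTTP, pin HTTPS otherwise."""
    if scheme.lower() != "https":
        # Permanent redirect to the same host and path over HTTPS.
        return 301, {"Location": f"https://{host}{path}"}
    # Already on HTTPS: the HSTS header asks browsers to keep using HTTPS
    # for this host for the next year (value chosen for illustration).
    return 200, {"Strict-Transport-Security": "max-age=31536000"}
```

A request for http://example.com/login would be answered with a 301 pointing at https://example.com/login, so nothing is ever served over plain HTTP.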

Distributed Denial of Service Attack (DDoS)

A distributed denial of service (DDoS) attack is a malicious attempt to disrupt the normal traffic of a target server, service or network by flooding the target or its surrounding infrastructure with large volumes of Internet traffic.

Put simply, a large number of “users” send you traffic like crazy until you can no longer handle it and go down (which also affects normal access to the service).

Of course, we cannot treat every burst of visits as an attack, so the service’s traffic needs to be monitored and analyzed. For example, check whether the surge affects only one API or page, whether the visitors’ IP addresses or IP ranges look suspicious, or whether the same device type, geographic location or browser headers repeat in large numbers.
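
The kind of analysis described above can be sketched as a simple tally over request records. This is a minimal illustration, assuming the records have already been parsed into (client IP, path, user agent) tuples; the function names, the 50% threshold and the sample data (documentation-range IPs) are all hypothetical.

```python
from collections import Counter


def summarize(requests, top=3):
    """Count the most frequent IPs, paths and user agents in a list of
    (client_ip, path, user_agent) tuples."""
    ips = Counter(ip for ip, _path, _agent in requests)
    paths = Counter(path for _ip, path, _agent in requests)
    agents = Counter(agent for _ip, _path, agent in requests)
    return {
        "ips": ips.most_common(top),
        "paths": paths.most_common(top),
        "agents": agents.most_common(top),
    }


def dominant_sources(requests, share=0.5):
    """Return the IPs that account for more than `share` of all requests."""
    counts = Counter(ip for ip, _path, _agent in requests)
    total = sum(counts.values())
    return sorted(ip for ip, n in counts.items() if total and n / total > share)


# Hypothetical sample: one client hammering a single API endpoint.
sample = [("203.0.113.7", "/api/login", "curl/8.0")] * 8 + [
    ("198.51.100.2", "/", "Mozilla/5.0"),
    ("198.51.100.3", "/about", "Mozilla/5.0"),
]
```

Here dominant_sources(sample) singles out 203.0.113.7, and summarize(sample) shows that the surge is concentrated on /api/login, which is the pattern worth a closer look. A real decision would of course need more context than a single ratio.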

“Discovering” server IP addresses and enumerating domain names

Many people may not understand what this means. Let’s look at a picture first.

The service used here is censys.io. Enter a domain name and it can look up all the origin site information under that domain (including subdomains). It scans in bulk, correlates certificates (a certificate contains domain names), and scans all your ports at the same time. There are of course other ways to do this. If you really want to defend against it, you can block its scanning subnets from reaching your server in the firewall.

Although the site can no longer find you once it is blocked, this does not eliminate the technique: other platforms and individuals can still scan you.

In the picture, the information for one of my servers has been scanned (I kept only this one for the demonstration, because downtime on this server does no harm). Some friends may ask: I have already put everything behind Cloudflare, so why is the origin site still leaked?

According to the platform’s analysis, this too is a certificate leak. Comparing against the settings of my other servers, I am sure that the “monitoring tower” leaked my certificate, though I do not care whether it was my misconfiguration or something else. (Taking a step back: if the mistake is mine, I can just stop using it, right?) The platform then matched the same Body Hash against Aria2’s HTTP port and flagged an Nginx vulnerability, which is hard to guard against!

Challenge Collapsar attack (CC)

A Challenge Collapsar (CC) attack is a type of cyber attack similar to DDoS, but it targets a specific URL or page of the target website. The attacker uses a large number of computers or devices to send requests to that URL or page until it is overloaded and stops working properly, resulting in service interruption or unavailability.

What is the point of this attack? For example, if you use Cloudflare to proxy OSS traffic, there are no outbound traffic charges, but requests are still billed by count. In other words, protection against DDoS is not protection against CC.

IP spoofing and more

IP spoofing is the creation of Internet Protocol (IP) packets with source addresses modified either to hide the sender’s identity, to impersonate another computer system, or both. Malicious users often use this technique to launch DDoS attacks on targeted devices or surrounding infrastructure.

Responses

Finally, time for some hands-on practice. Let’s look at how to use Cloudflare’s various settings to mitigate these attacks and protect cloud services.

DDoS mitigation

First, what should we do first when facing a DDoS attack? The answer is: detection. That’s right: we must first judge whether it is an “attack” or whether the website really is “too popular”, and we need to be able to tell attack traffic from normal traffic at scale.

On platforms that provide DDoS mitigation there are generally two approaches. The first is the platform’s own automatic mitigation, that is, a fully managed mode in which the platform decides whether you are under attack and protects you. The second is to customize the managed DDoS rules. For example, Cloudflare lets you customize rules that match attack vectors at layer 7 (the application layer).

As shown in the figure, first we find Deploy a DDoS override.

Then fill in the fields, make your selections, and save. Set “Ruleset action” and “Ruleset sensitivity” according to your own needs.

Note that HTTP DDoS attack protection supports customization, while network-layer and SSL/TLS DDoS attacks are mitigated automatically.

You say you don’t know your own needs? Then you are like me: just stick with the platform’s managed defaults.

Web Application Firewall (WAF) and API Protection

A WAF is used to mitigate malicious request traffic. Following the location shown in the figure, go to Security -> WAF -> Create rule.

Then start configuring. For example, here I match on the full URI, skip every request under api.besscroft.com, and log it. Because my API sits behind a reverse proxy, enabling a “challenge” directly would produce a 403 error, meaning the “challenge session” cannot be loaded on the page, as shown in the figure below:

The matches we just chose to log will then show up in the Events list.
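
To make the matching condition more concrete, here is a small sketch of what such a rule expression might look like, together with a local simulation of its behavior. The fields http.request.full_uri and http.host are fields of Cloudflare’s rules language as I understand it, but the exact expression the dashboard produced for this site is not shown in the article, so the strings below are an assumption; the helper names are made up.

```python
from urllib.parse import urlsplit


def full_uri_expression(host: str) -> str:
    """Expression text for 'full URI contains <host>' (assumed form)."""
    return f'(http.request.full_uri contains "{host}")'


def host_expression(host: str) -> str:
    """Stricter alternative: match on the hostname only (assumed form)."""
    return f'(http.host eq "{host}")'


def uri_contains(full_uri: str, host: str) -> bool:
    """Local stand-in for the 'contains' match on the full URI."""
    return host in full_uri


def host_equals(full_uri: str, host: str) -> bool:
    """Local stand-in for an exact hostname match."""
    return (urlsplit(full_uri).hostname or "") == host
```

One thing the simulation makes visible: a “contains” match on the full URI would also match a URL on another host that merely carries api.besscroft.com in its query string, while a hostname match would not. Whether that difference matters depends on what the skipped rule is protecting.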

I enabled it directly for the whole site because it saves trouble. You can think of it this way: custom rules then work as a whitelist. And if you don’t know which rules you need, you can enable it site-wide first, browse the website yourself, and then debug and configure in a targeted way based on the request details in the events! Site-wide protection is enabled under Overview -> Under Attack Mode; just turn it on!

WAF rate limiting rules are discussed below. As for managed rules, free users need not bother with them. If you want more comprehensive protection, the full set of Cloudflare managed rules and firewall analytics, you will have to pay!

With the WAF Tools configuration you can define custom rules that state under which conditions users may access the site. Take IP Access Rules: ChatGPT presumably uses this kind of configuration to block IP access from certain places, and many friends who have used it will have seen the “5-second shield” (of course there are ways through the shield). In some places, access to ChatGPT is arguably walled off from both sides; I won’t say where.

However, I haven’t added any Tools configuration, because I have no such need for now and don’t need to block crawlers and the like.

Encryption certificate management (leave it to Cloudflare, reduce risk (controversial))

Why did I put “controversial” in this title? Because if you consider Cloudflare untrustworthy, then the risk is indeed considerable, especially for enterprises. For us individual users it is still usable, and Cloudflare has no reason to ruin its own brand (probably?).

First, go to SSL/TLS -> Overview and choose an encryption mode, either Full or Full (strict). If you cannot trust the platform’s certificate, you can use your own, depending on your needs. I also recommend turning on the SSL/TLS Recommender; you really will get a reminder email now and then (in case you missed something in the configuration, right?).

Then there is Configuration Rules in the figure. For the same reason as before, I also configured the api subdomain to be skipped.

DNSSEC

DNSSEC protects against forged DNS responses, that is, it keeps your DNS resolution from being maliciously tampered with.

After handing the domain’s resolution over to Cloudflare, enable the setting under DNS -> Settings, and configure the DS record at your domain registrar as instructed!

Server firewall management (only allow traffic from Cloudflare)

The title says it all. I recommend configuring this. First, visit Cloudflare IP Ranges, then add every listed IP range to the firewall’s allow list.

This measure is mainly for API protection. As mentioned above, my API has to be reverse-proxied, so I must add this configuration. Admittedly, I took a shortcut here: I use CloudPanel as the server management panel, which has a feature called “Allow traffic from Cloudflare only”; just enable it!
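
The idea behind “only allow traffic from Cloudflare” can be sketched with Python’s ipaddress module. The ranges below are a small hand-copied sample that I believe appear in Cloudflare’s published IPv4 list; they are not the full list and may change, so the Cloudflare IP Ranges page remains the authoritative source. The function names are hypothetical, and a real setup would enforce this in the firewall rather than in application code.

```python
import ipaddress

# Partial sample only; refresh from the Cloudflare IP Ranges page before use.
SAMPLE_CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "104.16.0.0/13",
    "172.64.0.0/13",
]


def load_networks(cidrs):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_allowed(peer_ip: str, networks) -> bool:
    """True if the connecting address falls inside any allowed range."""
    address = ipaddress.ip_address(peer_ip)
    return any(address in network for network in networks)


ALLOWED = load_networks(SAMPLE_CLOUDFLARE_RANGES)
```

With this sample, is_allowed("104.16.1.1", ALLOWED) is true while is_allowed("8.8.8.8", ALLOWED) is false, which is exactly the allow-list behavior the firewall should have for the ports your site listens on.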

Use Rate Limiting to Prevent Brute Force and Layer 7 DDoS Attacks

Sometimes we need to prevent brute-force attempts, or simply want to fend off attacks. But some requests are disguised to look perfectly normal and cannot be told apart at all. In that case we can limit the request rate; an ordinary user browsing normally will not click that fast…

As shown in the figure, we can limit requests matching any URI so that one IP may make only so many requests within a few seconds. I configured 50 requests per 10 seconds here, and stricter would actually be fine.
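
To illustrate what “50 requests per 10 seconds per IP” means, here is a generic sliding-window limiter. It is only a sketch of the concept: the article does not describe how Cloudflare counts requests internally, and the class name and the explicit `now` parameter (used instead of a real clock so the behavior is easy to follow) are my own choices.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int = 50, window_seconds: float = 10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # client IP -> timestamps of allowed requests

    def allow(self, client_ip: str, now: float) -> bool:
        timestamps = self.hits[client_ip]
        # Forget requests that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False  # over the limit: block (or challenge) this request
        timestamps.append(now)
        return True
```

A client sending one request every 0.1 seconds gets its first 50 requests through and is then refused until the oldest ones age out of the 10-second window; other clients are counted separately and are unaffected.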

The Under Attack Mode mentioned above can also be enabled under Security -> Settings. Challenge Passage means that a visitor must pass the challenge again once the configured period has elapsed; otherwise further requests are not let through. When I used ChatGPT in the second half of last year (2022), I could tell this feature was turned on: after the Chat page had been open for a while, asking a question on the current page got no answer from ChatGPT, only a network error message. It was easy to solve at the time: open a new browser tab, pass the verification, close it again, and you could continue using Chat on the original page. By now ChatGPT has clearly added a lot more rules, including paid ones; some I can try out myself, and some I am seeing for the first time.

Q&A

What is the difference between HTTP and HTTPS?

HTTP transmits data in plain text, which is not secure; HTTPS uses encryption to protect data in transit, which is more secure. HTTP is only suitable for non-sensitive information, while HTTPS is suitable for sensitive information.

When a browser communicates using HTTPS, can a proxy such as Cloudflare see the communication?

Generally speaking, a proxy in the path cannot. With end-to-end encryption, only the client and the server can see the content. Request metadata is still visible, though, which is the familiar situation: others cannot see what you read on a website, but they know which website you visited.

However, Cloudflare acts as a man-in-the-middle (MITM) that terminates TLS and generally uses certificates it issued itself, so… you know what I mean?

What are reverse proxy and origin server?

The origin site can be understood as the “source” site, that is, the original server we hide behind the CDN. When a user makes a request, the CDN first checks whether it has the resource cached and, if so, responds to the user directly. If not, it requests the resource from the origin site, applies the caching rules to decide whether and how to cache it, and then returns the resource to the user.

A reverse proxy is a server that sits in front of one or more web servers and intercepts requests from clients. It accepts the user’s request, forwards it to the origin site, and sends the response to the user “on behalf of” the origin site.
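
The cache-then-origin flow described above can be modeled in a few lines. This is a toy model under my own assumptions (an in-memory dictionary, a caller-supplied caching rule, made-up names), not a description of how Cloudflare’s cache actually works.

```python
class EdgeCache:
    """Toy CDN edge: serve from cache when possible, else ask the origin."""

    def __init__(self, fetch_from_origin, is_cacheable):
        self.fetch_from_origin = fetch_from_origin  # callable: path -> body
        self.is_cacheable = is_cacheable            # callable: path -> bool
        self.store = {}
        self.origin_requests = 0

    def get(self, path: str):
        if path in self.store:
            return self.store[path], "HIT"   # answered without touching the origin
        body = self.fetch_from_origin(path)  # cache miss: go back to the origin
        self.origin_requests += 1
        if self.is_cacheable(path):          # caching rules decide what is kept
            self.store[path] = body
        return body, "MISS"


# Hypothetical setup: only stylesheets are cacheable.
edge = EdgeCache(lambda path: f"body of {path}", lambda path: path.endswith(".css"))
```

Requesting /site.css twice reaches the origin only once, while every request for an uncacheable path such as /api/me goes back to the origin. This is also why a low cache hit rate translates directly into load (and exposure) on the origin site.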

Why is there no real visitor IP in the log after using WAF?

Because Cloudflare takes over the traffic, all you see is Cloudflare’s IPs. We can read the client IP address from the HTTP request headers, or restore the original visitor IP directly in the origin server’s logs.
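
As a sketch of reading the visitor IP from the request headers: Cloudflare adds a CF-Connecting-IP header to proxied requests, and the function below uses it only when the connection really comes from a trusted proxy range, since anyone connecting directly could forge the header. The two ranges are a partial sample I believe to be Cloudflare’s, the function name is made up, and the header lookup assumes a plain dict with that exact key casing.

```python
import ipaddress

# Partial sample of proxy ranges; the published Cloudflare list is authoritative.
TRUSTED_PROXIES = [
    ipaddress.ip_network("104.16.0.0/13"),
    ipaddress.ip_network("172.64.0.0/13"),
]


def visitor_ip(peer_ip: str, headers: dict) -> str:
    """Return the real visitor IP, trusting the header only from known proxies."""
    peer = ipaddress.ip_address(peer_ip)
    forwarded = headers.get("CF-Connecting-IP")
    if forwarded and any(peer in network for network in TRUSTED_PROXIES):
        return forwarded
    # Direct connection (or missing header): fall back to the socket address.
    return peer_ip
```

Combined with a firewall that only admits Cloudflare’s ranges, the fallback branch should rarely be reached, but keeping it avoids logging a forged address if the firewall is ever misconfigured.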

Final words

In fact, this article covers only a small part, drawn from the bit of experience I have accumulated while using Cloudflare; I hope it helps readers. The Cloudflare platform has many more settings and features, can do far more than this, and is remarkably powerful. My only annual cloud service cost is the domain name fee. It used to be tens of dollars a year, but I bought a new domain this year, so I now hold one .com and one .dev domain and spend only a little over 100 yuan a year. My entire access layer and part of my infrastructure are managed on Cloudflare, which is really easy and convenient to use.

That’s it for this article. Friends are welcome to get in touch!

Three passions, simple but overwhelmingly strong, have governed my life: the longing for love, the search for knowledge, and unbearable pity for the suffering of humanity.

From “What I Have Lived For”, Bertrand Russell (philosopher, mathematician, thinker)
