Original link: https://jasonkayzk.github.io/2023/06/27/Haproxy%E7%AE%80%E4%BB%8B/

This article briefly introduces HAProxy and how to use it.

Introduction to HAProxy

HAProxy overview

HAProxy is free load-balancing software that runs on most mainstream Linux distributions.

HAProxy provides feature-rich L4 (TCP) and L7 (HTTP) load balancing. Its community is very active and releases are frequent. Most importantly, its performance and stability are comparable to those of commercial load balancers.

These advantages make HAProxy a leading choice among free load balancers.

In short, HAProxy is load-balancing software similar to Nginx.

HAProxy core functions

The core functions are as follows:

Load balancing: L4 and L7 modes, with a rich set of algorithms including round robin (RR), static RR, least connections (LC), source IP hash, URI hash, URL parameter (url_param) hash, and HTTP header hash;
Health checks: both TCP and HTTP health checks are supported;
Session persistence: for application clusters that do not share sessions, persistence can be achieved with cookies (insert, rewrite, or prefix mode) or with the hash methods above;
SSL: HAProxy can terminate HTTPS, decrypting requests and forwarding them to the backend as plain HTTP;
HTTP request rewriting and redirection;
Monitoring and statistics: HAProxy provides a web-based statistics page showing health status and traffic data, on which users can build their own monitoring programs.
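
To make the load-balancing algorithms listed above more concrete, here is a minimal Python sketch of two of them: weighted round robin and source-IP hashing. It only illustrates the idea and is not HAProxy's implementation (HAProxy interleaves weighted servers more smoothly and uses its own hash functions). The server names srv1 and srv2 and the use of CRC32 are assumptions made for this example.

```python
import itertools
import zlib


def weighted_round_robin(servers):
    """Return an endless iterator over server names, repeated by weight.

    servers: list of (name, weight) pairs; a weight of 0 is never selected.
    This naive version simply repeats each name `weight` times per cycle.
    """
    expanded = [name for name, weight in servers for _ in range(weight)]
    return itertools.cycle(expanded)


def pick_by_source(client_ip, servers):
    """Map a client IP to one server with a stable hash (CRC32 here).

    The same IP always lands on the same server as long as the
    server list does not change.
    """
    names = [name for name, weight in servers if weight > 0]
    return names[zlib.crc32(client_ip.encode()) % len(names)]


# Hypothetical servers: srv1 has twice the weight of srv2.
servers = [("srv1", 2), ("srv2", 1)]
rr = weighted_round_robin(servers)
first_six = [next(rr) for _ in range(6)]
```

In this sketch, srv1 receives two out of every three requests under round robin, while pick_by_source keeps returning the same server for a given client IP, which is why hashing can double as a simple form of session persistence.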

HAProxy key features

Performance:

HAProxy uses an event-driven, non-blocking model (single-threaded in older versions) that minimizes context-switching overhead and can process hundreds of requests within 1 ms, while each session takes only a few KB of memory.
It includes many fine-grained performance optimizations, such as an O(1) event checker, delayed updates, single buffering, and zero-copy forwarding, so HAProxy uses very little CPU under moderate load.
HAProxy makes extensive use of operating system features, which lets it reach very high performance when processing requests. Typically HAProxy itself accounts for only about 15% of the processing time; the remaining 85% is spent in the kernel.
The author of HAProxy ran a test with version 1.4 in 2009: a single HAProxy process handled more than 100,000 requests/s and easily saturated a 10 Gbps link.

Stability:

As a program that is recommended to run in single-process mode, HAProxy has to meet very strict stability requirements. According to its author, HAProxy has not had a single crash-causing bug in 13 years: once started successfully, it will not crash unless the operating system or hardware fails (I suspect this is somewhat exaggerated).

As mentioned above, most of HAProxy's work is done in the operating system kernel, so its stability depends mainly on the operating system. With carefully tuned sysctl parameters and enough memory on the host, HAProxy can run stably at full load for years.

Comparison between Nginx and HAProxy

Both support HTTP and TCP load balancing (Nginx additionally handles UDP through its stream module; in open-source HAProxy, UDP support is limited). Nginx uses a configuration syntax that resembles a programming language, with a nested document structure that expresses how settings relate to each other, so it reads fairly clearly. HAProxy's configuration is more like that of a network device: things are defined in one place and referenced in another, so working out a piece of logic sometimes means scrolling up and down.

Nginx uses a multi-process master/worker model in which each worker process is single-threaded, so it can make full use of multi-core CPUs. HAProxy is multi-threaded, and a single process can deliver very high performance. HAProxy can also run multiple processes, but many online sources say that this does not improve performance and recommend against it.

Although Nginx is slightly slower than HAProxy as a reverse proxy, both are very fast. I tested Nginx on an Alibaba Cloud host with 1 core and 1 GB of memory: HTTP throughput reached at least 2,000 QPS, and with HTTPS enabled it handled about 550 handshakes/s. Performance is rarely something to worry about.

HAProxy is dedicated load-balancing software. Although Nginx has also supported layer-4 load balancing since version 1.9, HAProxy is more widely accepted in terms of performance and stability. Nginx's excellent support for the web layer makes it better suited to layer-7 load balancing, while HAProxy's stability and reliability are comparable to hardware load balancers such as F5.

Their respective characteristics are as follows:

Nginx:

Hierarchical configuration file structure using curly braces
Simple logic can be implemented with the built-in map and if directives; JS and Perl scripting are natively supported, and Lua is supported unofficially
Besides load balancing, it can also serve as a static web server and a cache server (HAProxy cannot)
Modular and compiled on demand; thanks to this modularity, many third-party extension modules are available
The open source version has only basic features; for more, you have to tinker with third-party modules or pay for the official commercial version, NGINX Plus

HAProxy:

Define-and-reference configuration structure, imperative in style
Supports ACLs but not other scripting languages (a commenter noted that scripting is now supported)
Better load-balancing performance than Nginx
Built-in statistics (stats) page
Official support for session persistence, health checks, and so on (not included in the open source version of Nginx)
Basic feature coverage is better than that of open source Nginx, but it is harder to extend and has fewer third-party resources

Install

Install using a package manager:

 apt install haproxy

Docker deployment:

 docker run -d --restart=always --name haproxy -p 18888:8888 -v /docker-data/haproxy:/usr/local/etc/haproxy haproxy:latest

Here, 8888 is the port of the HAProxy statistics web page; it can be changed in the configuration.

Configuration instructions

Detailed key configuration

The HAProxy configuration file has 5 sections:

global: global parameters;
defaults: default properties for all frontends and backends;
frontend: a front-end service instance (that is, a service exposed by HAProxy itself);
backend: a group of backend service instances (that is, the services behind HAProxy);
listen: a combined frontend + backend definition, which can be seen as a more concise way to configure both.

global section

Key settings in the global section:

daemon: run HAProxy in the background; this should normally be used;
user [username]: the user that the HAProxy process runs as;
group [groupname]: the group that the HAProxy process runs as;
log [address] [facility] [maxlevel] [minlevel]: log output configuration. For example, log 127.0.0.1 local0 info warning sends logs from info to warning level to the local0 facility of rsyslog or syslog. [minlevel] can be omitted. HAProxy logs have 8 levels, from highest to lowest severity: emerg/alert/crit/err/warning/notice/info/debug;
pidfile: the absolute path of the file that records the HAProxy process ID, mainly used to stop and restart HAProxy;
maxconn: the maximum number of connections the HAProxy process handles simultaneously. Once this limit is reached, HAProxy stops accepting new connections.

frontend section

Key settings in the frontend section:

acl [name] [criterion] [flags] [operator] [value]: defines an ACL, a true/false value computed by applying the given expression to the given attribute of the request. For example, "acl url_ms1 path_beg -i /ms1/" defines an ACL named url_ms1 that is true when the request path starts with /ms1/ (case-insensitive);
bind [ip]:[port]: the address and port the frontend listens on;
default_backend [name]: the default backend for this frontend;
disabled: disables this frontend;
http-request [operation] [condition]: a rule applied to every HTTP request arriving at this frontend, such as rejecting it, requiring authentication, adding or replacing headers, or defining ACLs;
http-response [operation] [condition]: a rule applied to every HTTP response returned through this frontend; otherwise much the same as above;
log: same as the log setting in the global section, but applied only to this frontend. To reuse the global log configuration, write log global here;
maxconn: same as maxconn in the global section, but applied only to this frontend;
mode: the working mode of this frontend, mainly http or tcp, corresponding to L7 and L4 load balancing;
option forwardfor: adds an X-Forwarded-For header carrying the client IP to the request;
option http-keep-alive: serves requests in keep-alive mode;
option httpclose: the opposite of http-keep-alive; turns keep-alive off. If HAProxy mainly serves API-style requests, httpclose can be considered to save connection resources, but API callers will then be unable to use an HTTP connection pool;
option httplog: enables HTTP logging; HAProxy records requests in a format similar to Apache HTTP Server or Nginx logs;
option tcplog: enables TCP logging; HAProxy records more transport-layer attributes of the connection in the log;
stats uri [uri]: enables the monitoring page on this frontend, served at [uri];
stats refresh [time]: refresh interval of the monitoring data;
stats auth [user]:[password]: username and password for the monitoring page;
timeout client [time]: the maximum time the client may stay silent (send no data) after the connection is established;
timeout http-request [time]: the maximum time allowed for the client to send a complete HTTP request after the connection is established. It mainly protects against slow-request DoS attacks, in which request data is sent very slowly so that HAProxy connections stay occupied for a long time;
use_backend [backend] if|unless [acl]: used with ACLs; forwards the request to the given backend when the ACL is (if) or is not (unless) satisfied.
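
The way acl, use_backend, and default_backend work together can be pictured with a small Python sketch. It is a simplified model, not HAProxy code: path_beg is imitated with a string prefix test, rules are tried in order with the first match winning, and default_backend is the fallback. The backend names ms1 and ms2 are hypothetical.

```python
def path_beg(path, prefix, ignore_case=False):
    """Imitate the idea of 'path_beg' (ignore_case=True plays the role of -i)."""
    if ignore_case:
        path, prefix = path.lower(), prefix.lower()
    return path.startswith(prefix)


def choose_backend(path, rules, default_backend):
    """Return the backend of the first rule whose ACL matches the path.

    rules: ordered list of (acl_function, backend_name) pairs, in the same
    spirit as a list of 'use_backend <backend> if <acl>' lines.
    """
    for acl, backend in rules:
        if acl(path):
            return backend
    return default_backend


# Hypothetical routing table: /ms1/... goes to ms1, /ms2/... goes to ms2.
rules = [
    (lambda p: path_beg(p, "/ms1/", ignore_case=True), "ms1"),
    (lambda p: path_beg(p, "/ms2/", ignore_case=True), "ms2"),
]
```

With these rules, a request for /MS1/users is routed to ms1 because the match ignores case, while a request for /other matches nothing and falls through to whatever default backend is passed in.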

backend section

Key settings in the backend section:

acl: same as in the frontend section;
balance [algorithm]: the load-balancing algorithm used among the servers of this backend. The most common are roundrobin and source; the full list is described in the official documentation, configuration.html#4.2-balance;
cookie: enables cookie-based session persistence across the backend servers. The most common mode is insert, for example cookie HA_STICKY_ms1 insert indirect nocache: HAProxy inserts a cookie named HA_STICKY_ms1 into the response, with the value given in the matching server definition, and uses the value of this cookie in later requests to decide which server to forward to. indirect means that if the request already carries a valid HA_STICKY_ms1 cookie, HAProxy does not insert it again in the response; nocache means that gateways and cache servers along the path are forbidden from caching responses that carry the Set-Cookie header;
default-server: default settings for all servers in this backend; see the server settings below for details;
disabled: disables this backend;
http-request/http-response: same as in the frontend section;
log: same as in the frontend section;
mode: same as in the frontend section;
option forwardfor: same as in the frontend section;
option http-keep-alive: same as in the frontend section;
option httpclose: same as in the frontend section;
option httpchk [METHOD] [URL] [VERSION]: defines the health check used in http mode, for example option httpchk GET /healthCheck.html HTTP/1.1;
option httplog: same as in the frontend section;
option tcplog: same as in the frontend section;
server [name] [ip]:[port] [params]: defines a server in this backend. [params] sets the parameters of the server; the common ones are:
- check: makes HAProxy run health checks against this server, using the method configured with option httpchk. It can be followed by inter, rise, and fall, which set the check interval, the number of consecutive successes after which the server is considered UP, and the number of consecutive failures after which it is considered DOWN. The defaults are inter 2000ms rise 2 fall 3;
- cookie [value]: used with cookie-based session persistence. For example, cookie ms1.srv1 means that responses to requests handled by this server carry a cookie with the value ms1.srv1 (the cookie name is the one given in the cookie setting of the backend section);
- maxconn: the maximum number of concurrent connections HAProxy opens to this server. Once the limit is reached, new connections to this server are placed in a waiting queue. The default is 0, meaning unlimited;
- maxqueue: the length of the waiting queue. When the queue is full, further requests are sent to other servers in this backend. The default is 0, meaning unlimited;
- weight: the weight of the server, from 0 to 256. The higher the weight, the more requests the server receives; a server with weight 0 receives no new connections. The default weight is 1;
timeout connect [time]: the timeout for HAProxy to establish a connection to a backend server;
timeout check [time]: by default, the combined connect + response timeout of a health check is the inter value given on the server line. If timeout check is set, HAProxy uses inter as the connect timeout of the health check and the timeout check value as its response timeout;
timeout server [time]: the timeout for a backend server to respond to a request from HAProxy.
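
The rise and fall parameters of check describe a small state machine: a server that is UP goes DOWN only after fall consecutive failed checks, and a server that is DOWN comes back UP only after rise consecutive successful checks. The following Python sketch models that behavior under the defaults mentioned above (rise 2, fall 3). The class name HealthState is made up for this illustration, and real HAProxy health checking has more states and options than shown here.

```python
class HealthState:
    """Track UP/DOWN for one server using rise/fall counters (simplified)."""

    def __init__(self, rise=2, fall=3, up=True):
        self.rise = rise      # consecutive successes needed to go UP
        self.fall = fall      # consecutive failures needed to go DOWN
        self.up = up          # current state
        self.streak = 0       # consecutive results contradicting the state

    def record(self, check_passed):
        """Feed one health-check result and return the resulting state."""
        if check_passed == self.up:
            # The result agrees with the current state: reset the streak.
            self.streak = 0
        else:
            self.streak += 1
            threshold = self.fall if self.up else self.rise
            if self.streak >= threshold:
                self.up = not self.up
                self.streak = 0
        return self.up


state = HealthState()
# Three failures in a row take the server DOWN; two successes bring it back.
history = [state.record(ok) for ok in [False, False, False, True, True]]
```

The counters exist to avoid flapping: one lost check does not remove a healthy server from rotation, and one lucky check does not put a failing server back.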

defaults section

All of the key frontend and backend settings above, except acl, bind, http-request, http-response, and use_backend, can also be set in the defaults section.

A setting in the defaults section applies whenever the frontend or backend section does not set it itself.
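
This fallback rule can be modeled as a dictionary merge in which section-level values win over defaults. The Python sketch below is only an analogy for how the effective settings are resolved; the per-frontend maxconn of 1000 is a hypothetical value chosen to show an override.

```python
def effective_settings(defaults, section):
    """Combine defaults with a section; values set in the section win."""
    merged = dict(defaults)   # start from the defaults
    merged.update(section)    # section-level settings override them
    return merged


defaults = {"mode": "http", "maxconn": 4000, "retries": 3}
# Hypothetical frontend that overrides maxconn but not mode or retries.
frontend_test = {"bind": ":80", "maxconn": 1000}

effective = effective_settings(defaults, frontend_test)
```

Here the frontend keeps mode http and retries 3 from the defaults, while its own maxconn replaces the default of 4000.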

listen section

The listen section combines the frontend and backend sections.

Therefore, every setting available in the frontend and backend sections can also be used in a listen section.

Configuration case

Take the following configuration as an example:

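 global
     log 127.0.0.1 local2 info    # log output configuration
     maxconn 4000                 # connections handled at the same time; beyond this, HAProxy stops accepting new ones
     daemon                       # run HAProxy in the background

 defaults
     mode http                    # working mode: tcp is layer 4, http is layer 7
     log global                   # reuse the log configuration of the global section
     retries 3                    # retry a failed server connection up to 3 times; server health is checked via "check" below
     option redispatch            # redirect to another healthy server when one is unavailable
     maxconn 4000

 # Monitoring page
 listen stats
     bind *:8888
     stats enable                 # enable the monitoring page
     stats uri /                  # URI at which the monitoring page is served
     stats refresh 10s            # refresh interval of the monitoring data
     mode http
     stats realm Global\ statistics   # authentication realm shown in the login prompt
     stats auth admin:123456      # login credentials

 # Frontend entry point
 frontend test
     bind :80
     default_backend webservers

 # Backend service definition
 backend webservers
     balance roundrobin
     server ng nginx:80 check fall 2 rise 2 weight 1
     server httpd httpd:80 check fall 2 rise 2 weight 1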

The configuration above defines two entry points:

<ha-ip>:8888: the monitoring interface provided by HAProxy;
<ha-ip>:80: forwards requests to nginx and httpd using the roundrobin load-balancing algorithm.

As the example shows, using HAProxy mainly comes down to writing listen sections, or frontend + backend sections.

Test Case

Here we run the configuration above using Docker.

Create network:

 docker network create --subnet 172.40.0.0/24 --gateway 172.40.0.1 my-net

Create nginx and httpd services:

 docker run -d --restart=always --name nginx --ip 172.40.0.10 --network my-net nginx:latest
 docker run -d --restart=always --name httpd --ip 172.40.0.12 --network my-net httpd:latest

Create the HAProxy configuration file:

 vi /docker-data/haproxy/haproxy.cfg

 global
     log 127.0.0.1 local2 info
     maxconn 4000                 # lower priority
     daemon                       # run HAProxy in the background

 defaults
     mode http                    # working mode: tcp is layer 4, http is layer 7
     log global
     retries 3                    # retry a failed server connection up to 3 times; server health is checked via "check" below
     option redispatch            # redirect to another healthy server when one is unavailable
     maxconn 4000                 # medium priority

 listen stats
     bind *:8888
     stats enable
     stats uri /
     stats refresh 10s
     mode http
     stats realm Global\ statistics
     stats auth admin:123456

 frontend test
     bind :80
     default_backend webservers

 backend webservers
     balance roundrobin
     server ng nginx:80 check fall 2 rise 2 weight 1
     server httpd httpd:80 check fall 2 rise 2 weight 1

Note: the server lines above use container names instead of IP addresses.

Create an HAProxy container and mount the configuration:

 docker run -d --restart=always --name haproxy -p 18888:8888 -p 8849:80 -v /docker-data/haproxy:/usr/local/etc/haproxy --network my-net haproxy:latest

Then open the monitoring interface (port 18888), where the configuration above is visible.

Access the backend service (port 8849); the response alternates between the nginx and httpd pages on successive requests.
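
The alternation can also be checked from a script instead of a browser. The Python sketch below counts which backend answered each request. It assumes that HAProxy is reachable at http://localhost:8849/ and that both containers still serve their stock welcome pages ("Welcome to nginx!" and "It works!"); the function names are made up for this example.

```python
from collections import Counter
from urllib.request import urlopen


def classify(body):
    """Guess which backend produced a page from its stock welcome text."""
    if "Welcome to nginx" in body:
        return "nginx"
    if "It works!" in body:
        return "httpd"
    return "unknown"


def count_backends(fetch, attempts=10):
    """Call fetch() repeatedly and count which backend answered each time."""
    return Counter(classify(fetch()) for _ in range(attempts))


def fetch_from_haproxy(url="http://localhost:8849/"):
    """Fetch one page through HAProxy (the URL is an assumption, see above)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling count_backends(fetch_from_haproxy) against the running containers should report a roughly even split between nginx and httpd if round robin is working and both servers pass their health checks.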
