Original link: https://colobu.com/2022/05/22/use-ebpf-to-trace-rpcx-microservices/
ebpf is an innovative and revolutionary technology that can run sandboxed programs in the kernel without modifying the kernel source code or loading kernel modules. Making the Linux kernel programmable enables smarter, more feature-rich infrastructure software to build on existing (rather than adding new) abstraction layers without increasing system complexity or sacrificing Execution efficiency and security.
The first version of BPF came out in 1994. We actually use it when writing rules using the tcpdump tool, which is used to view or “sniff” network packets.
Using ebpf technology, you can provide new ideas and technologies in security, tracking & performance analysis, network, observation & monitoring, etc.:
- Security: Security checks can be performed from the system call level, packet layer, and socket layer, such as developing DDOS protection systems and writing firewall programs.
- Network: High-performance packet processing programs at the kernel layer can be developed. For example, Cilium provides load balancing at the kernel layer, pushing the service mesh to a deeper level and solving sidecar performance problems.
- Trace & Performance Analysis: Linux provides various types of probe points, such as Kernel probes, perf events, Tracepoints, User-space probes, User statically defined tracepoints, XDP, etc. We can write probe programs to collect these probes Pinpoint information, so we can track the program in this way and analyze performance.
- Observation & Monitoring: Continuous observation and monitoring of these probe points, we can enrich our trace program. The point is that we do not need to change the existing program, but observe from other programs through the ebpf method. In 2014, the well-known kernel hacker Alexei Starovoitov extended the capabilities of BPF. He increased the number of registers and the allowable size of the program, increased JIT compilation, and created a program that checks whether the program is safe. Most impressively, however, the new BPF programs are not only able to run while processing packets, but also to respond to other kernel events and pass information back and forth between the kernel and user space. Alexei Starovoitov’s new version of BPF is called eBPF (e stands for extended). But now, it has replaced all legacy BPF usages and has become so popular that it is still called BPF for simplicity.
You can write your own bpf program for customized logic processing and analysis, or you can use the tools written by the great gods to perform general performance analysis and tracking of the program. This article mainly introduces the use of some tools for general analysis of rpcx microservice programs. Since it is general, you can analyze other Go programs, and it is not limited to Go programs. You can analyze and trace other applications and even the kernel.
I am going to open a new article to introduce the bpf program by myself.
This time I mainly introduce the related tools and bpftrace provided by bcc.
bcc is a toolkit for creating efficient eBPF-based kernel tracing and manipulation programs, including some useful command-line tools and examples. BCC simplifies writing eBPF programs for kernel instrumentation in C, including wrappers for LLVM and front-ends for Python and Lua. It also provides high-level libraries for direct integration into applications.
bpftrace is a high-level tracing language for Linux eBPF. Its language is inspired by awk and C and previous tracers like DTrace and SystemTap. bpftrace uses LLVM as a backend to compile scripts to eBPF bytecode, and leverages BCC as a library for interacting with the Linux eBPF subsystem and existing Linux trace functions and join points.
Simple rpcx microservice program
Since we want to use the ebpf analysis program, first we have a program. Here I have chosen the simplest example of rpcx to implement a multiplicative minimal microservice.
The code for this program can be downloaded at rpcx-examples-102basic .
The server program is as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
twenty one
twenty two
twenty three
twenty four
25
26
27
28
29
30
31
32
33
34
|
package main
import (
“context”
“flag”
“fmt”
example “github.com/rpcxio/rpcx-examples”
“github.com/smallnest/rpcx/server”
)
var (
addr = flag.String( “addr” , “localhost:8972” , “server address” )
)
type Arith struct {}
// Trace this service call using ebpf
func (t *Arith) Mul(ctx context.Context, args example.Args, reply *example.Reply) error {
reply.C = args.A * args.B
fmt.Println( “C=” , reply.C)
return nil
}
func main() {
flag.Parse()
s := server.NewServer()
s.RegisterName( “Arith” , new (Arith), “” )
err := s.Serve( “tcp” , *addr)
if err != nil {
panic (err)
}
}
|
Use go build server.go
compile the server
program and run it ( ./server
).
The client program is as follows:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
twenty one
twenty two
twenty three
twenty four
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
|
package main
import (
“context”
“flag”
“log”
“time”
“github.com/smallnest/rpcx/protocol”
example “github.com/rpcxio/rpcx-examples”
“github.com/smallnest/rpcx/client”
)
var (
addr = flag.String( “addr” , “localhost:8972” , “server address” )
)
func main() {
flag.Parse()
d, _ := client.NewPeer2PeerDiscovery( “tcp@” +*addr, “” )
opt := client.DefaultOption
opt.SerializeType = protocol.JSON
xclient := client.NewXClient( “Arith” , client.Failtry, client.RandomSelect, d, opt)
defer xclient.Close()
args := example.Args{
A: 10 ,
B: 20 ,
}
for {
reply := &example.Reply{}
err := xclient.Call(context.Background(), “Mul” , args, reply)
if err != nil {
log. Fatalf( “failed to call: %v” , err)
}
log.Printf( “%d * %d = %d” , args.A, args.B, reply.C)
time.Sleep(time.Second)
}
}
|
The client will call Arith.Mul
once every second. The logic of the microservice is also very simple, which is to perform multiplication and return the result to the client.
Track and analyze microservices
As a demonstration, this article only tracks Arith.Mul
calls on the server side.
bcc provides a lot of bpf-based analysis programs, as shown in the following figure (the classic diagram organized by the great god Brendan Gregg)
Here we will select a few related tools to demonstrate how to use these tools to analyze running programs. Note that the program is running, we did not add some additional buried points to the program.
bcc kit
First of all, you have to install the bcc package, and your Linux kernel must be new enough. In the computer rooms of some big factories, there are some servers with kernel version 2.6.x, and these old kernel servers cannot support ebpf or new features of ebpf.
I tested it on a virtual machine of my Alibaba Cloud, and its version is:
- Linux lab 4.18.0-348.2.1.el8_5.x86_64
- CentOS Stream release 8
These tools can be installed directly by yum install bcc-tools
.
If you have another version of the operating system, you can refer to the bcc installation documentation for installation: bcc/INSTALL .
Before using the tool to analyze, you first need to know the name of your Arith.Mul
in the symbol table, you can use objdump to query:
1
2
|
[root@lab server] # objdump -t server|grep Mul|grep main
000000000075 a5e0 g F .text 00000000000000 d0 main.(*Arith).Mul
|
Its name is main.(*Arith).Mul
, and we will use this name to analyze this microservice below.
Make sure the server you just downloaded is always running.
funccount
funccount is used to count the number of calls of a function over a period of time.
Execute the following command in the directory where the server is located (if it is in a different path, you need to change the path of the program in the command parameter):
1
2
3
4
5
6
7
|
[root@lab server] # funccount -d 10 ‘./server:main.*.Mul’
Tracing 1 functions for “b’./server:main.*.Mul’” … Hit Ctrl-C to end.
FUNC COUNT
b ‘main.(*Arith).Mul’ 10
Detaching…
[root@lab server] #
|
Here we set the observation time to 10 seconds, and you can see that within these 10 seconds, this function was called 10 times.
It contains several parameters, such as you can watch continuously and output results every 5 seconds:
1
2
3
4
5
6
|
[root@lab server] # funccount -Ti 5 ‘./server:main.*.Mul’
Tracing 1 functions for “b’./server:main.*.Mul’” … Hit Ctrl-C to end.
18:08:29 _ _
FUNC COUNT
b ‘main.(*Arith).Mul’ 5
|
We can even use it to trace Go GC related functions:
1
2
3
4
5
6
|
[root@lab server] # funccount -d 10 ‘./server:runtime.*.gc*’
Tracing 21 functions for “b’./server:runtime.*.gc*’” … Hit Ctrl-C to end.
FUNC COUNT
b ‘runtime.(*gcControllerState).update’ 2
b ‘runtime.mallocgc’ 250
|
Or to trace the scheduling of the Go runtime:
1
2
3
4
5
6
|
[root@lab server] # funccount -d 10 ‘./server:runtime.schedule’
Tracing 1 functions for “b’./server:runtime.schedule’” … Hit Ctrl-C to end.
FUNC COUNT
b ‘runtime.schedule’ 20
Detaching…
|
funclatency
funclatency counts the time-consuming execution of the function.
If we want to analyze the execution of the Arith.Mul
method, we can use the following command, which will display the time-consuming distribution of this function call in the form of a histogram:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
twenty one
twenty two
twenty three
twenty four
|
[root @lab server] # funclatency -d 10 ‘./server:main.*.Mul’
Tracing 1 functions for “./server:main.*.Mul” … Hit Ctrl-C to end .
Function = b ‘main.(*Arith).Mul’ [ 359284 ]
nsecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | |
16384 -> 32767 : 7 |****************************************|
32768 -> 65535 : 3 |**************** |
avg = 31978 nsecs, total: 319783 nsecs, count: 10
|
We counted 10 seconds of data. You can see that this function is called 10 times during the period. It takes an average of 31 microseconds.
If we want to check whether there is a long tail in an online program, it is easy to analyze statistics with this tool.
funcslower
funcslower This tool can trace slow functions of the kernel and program, for example, use the following command:
1
2
3
4
5
6
7
|
[root@lab server] # funcslower -u 10 ‘./server:main.(*Arith).Mul’
Tracing function calls slower than 10 us… Ctrl+C to quit.
COMM PID LAT(us) RVAL FUNC
server 359284 44.75 0 ./server:main.(*Arith).Mul
server 359284 30.97 0 ./server:main.(*Arith).Mul
server 359284 33.38 0 ./server:main.(*Arith).Mul
server 359284 31.28 0 ./server:main.(*Arith).Mul
|
You can even print out stack information:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
[root@lab server]funcslower -UK -u 10 ‘./server:main.(*Arith).Mul’
Tracing function calls slower than 10 us… Ctrl+C to quit.
COMM PID LAT(us) RVAL FUNC
server 359284 31.20 0 ./server:main.(*Arith).Mul
b ‘runtime.call64.abi0’
b ‘runtime.reflectcall’
b ‘reflect.Value.call’
b ‘reflect.Value.Call’
b ‘github.com/smallnest/rpcx/server.(*service).call’
b ‘github.com/smallnest/rpcx/server.(*Server).handleRequest’
b ‘github.com/smallnest/rpcx/server.(*Server).serveConn.func2’
b ‘runtime.goexit.abi0’
server 359284 32.23 0 ./server:main.(*Arith).Mul
b ‘runtime.call64.abi0’
b ‘runtime.reflectcall’
b ‘reflect.Value.call’
b ‘reflect.Value.Call’
b ‘github.com/smallnest/rpcx/server.(*service).call’
|
tcp series tools
bcc provides a bunch of tcp tracking situations, and we can choose to use corresponding tools for different scenarios.
- tools/tcpaccept: Track TCP passive connections (accept()).
- tools/tcpconnect: Track TCP active connections (connect()).
- tools/tcpconnlat: Track the latency of active TCP connections (connect()).
- tools/tcpdrop: Trace the drop details of the kernel’s TCP packets.
- tools/tcplife: Track TCP sessions (summary of lifecycle metrics).
- tools/tcpretrans: Track TCP retransmissions.
- tools/tcprtt: Track TCP round-trip time.
- tools/tcpstates: Track TCP session state changes.
- tools/tcpsubnet: Aggregates and aggregates TCP sends by subnet.
- tools/tcpsynbl: Displays the status of the TCP SYN backlog.
- tools/tcptop: Aggregate TCP send/recv throughput by host.
- tools/tcptracer: Trace TCP connection establishment/closing (connect(), accept(), close()).
- tools/tcpcong: Track TCP socket congestion control state durations.
For example, if we pay attention to the establishment of the connection, we can use tcptracer
:
1
2
3
4
5
6
7
8
9
10
11
|
[root@lab lib] # tcptracer
Tracing TCP established connections. Ctrl-C to end.
T PID COMM IP SADDR DADDR SPORT DPORT
C 360005 client 4 127.0 . 0.1 127.0 . 0.1 43126 8972
X 360005 client 6 [:: 1 ] [:: 1 ] 43010 8972
A 359284 server 4 127.0 . 0.1 127.0 . 0.1 8972 43126
X 360005 client 4 127.0 . 0.1 127.0 . 0.1 43126 8972
X 359284 server 4 127.0 . 0.1 127.0 . 0.1 8972 43126
C 360009 client 4 127.0 . 0.1 127.0 . 0.1 43130 8972
X 360009 client 6 [:: 1 ] [:: 1 ] 43014 8972
A 359284 server 4 127.0 . 0.1 127.0 . 0.1 8972 43130
|
There are also a bunch of xxxxsnoop
programs that can track specific system calls.
bpftrace
Sometimes, we want to use scripts to implement some custom tracking, such as tools like awk, which can provide simple scripting.
One such tool is bpftrace, which uses LLVM as a backend to compile scripts to eBPF bytecode, and leverages BCC as a library for interacting with the Linux eBPF subsystem and existing Linux trace functions and join points.
The bpftrace reference manual can be found at bpftrace reference_guide .
Taking our Arith.Mul
as an example, we can use the following command to add a probe to the function call and print the input parameters:
1
2
3
4
5
|
[root@lab server]# bpftrace -e ‘uprobe:./server:main.*.Mul {printf(“%s – %s: arg1: %d, arg2: %d\n”, comm, func, arg0 , arg1)}’
Attaching 1 probe…
server – main.(*Arith).Mul: arg1: 10 , arg2: 20
server – main.(*Arith).Mul: arg1: 10 , arg2: 20
server – main.(*Arith).Mul: arg1: 10 , arg2: 20
|
Why can arg0 and arg1 print the parameters? Simply put, our microservice parameters are exactly two int64 integers, which correspond to arg0 and arg1.
The service return value of rpcx is also passed in as a parameter, and it has not been set when the function is called, so if you print arg3, it is not the return value of reply.
At this time, we need to move the probe, add an offset, and how much cheaper? Through disassembly, we can see that the return value has been assigned when adding 92, so the return value can be printed by using the following command (the first parameter is overwritten at this time):
1
2
3
4
5
|
[root@lab server ] # bpftrace -e ‘uprobe:./server:main.*.Mul+92 {printf(“%s – %s: reply: %d\n”, comm, func, arg0)}’
Attaching 1 probe…
server – main.(*Arith).Mul: reply: 200
server – main.(*Arith).Mul: reply: 200
server – main.(*Arith).Mul: reply: 200
|
Reference documentation
- https://www.brendangregg.com/blog/2017-01-31/golang-bcc-bpf-function-tracing.html
- https://golangexample.com/library-to-work-with-ebpf-programs-from-golang/
- https://tonybai.com/2020/12/25/bpf-and-go-modern-forms-of-introspection-in-linux/
- https://networkop.co.uk/post/2021-03-ebpf-intro/
- https://medium.com/bumble-tech/bpf-and-go-modern-forms-of-introspection-in-linux-6b9802682223#db17
- https://blog.px.dev/ebpf-http-tracing/
- https://www.ebpf.top/post/ebpf_and_go/
- https://www.ebpf.top/
This article is reprinted from: https://colobu.com/2022/05/22/use-ebpf-to-trace-rpcx-microservices/
This site is for inclusion only, and the copyright belongs to the original author.