“The Road to Go Language Improvement: Programming Ideas, Methods and Skills from Novice to Master 1”

Original link: https://blog.frytea.com/archives/680/


The Road to Go Language Improvement: Programming Ideas, Methods and Skills from Novice to Master 1
Bai Ming
330 notes

◆ Part 1: Getting to know everything about the Go language

The name golang is used only for the official Go website. golang.org was chosen as the official domain name at the time because go.com was already taken by Disney.

The Go language project was officially open-sourced on November 10, 2009, and the Go team officially designated this day as the birthday of the Go language.

Go programmers are nicknamed Gophers (Gopher is used below to refer to Go developers), and the official Go technical conference is called GopherCon.

The most prestigious Go technology conference in China is also named after Gopher and is called GopherChina.


Figure 2-1 The ancestors of the Go language (image from the book “The Go Programming Language”)

The basic syntax of Go follows C, which makes Go a member of the “C family” of languages; its declaration syntax and package concept are inspired by Pascal, Modula, and Oberon; and some of its concurrency ideas come from languages influenced by Professor Tony Hoare’s CSP theory [1], such as Newsqueak and Limbo.

Today, the Go team has settled into a steady rhythm of two major releases a year, usually in February and August. The Go team commits to supporting the latest two major stable releases of Go.

The most intuitive compositional syntax element provided by the Go language is type embedding.

The interface is the real “magic” of the Go language and one of its innovative designs. An interface is just a collection of methods, and its relationship with the implementer is implicit. This minimizes the coupling between the parts of a program, while the interface also serves as the “link” that connects those parts.

In summary, applying the principle of composition shapes the skeleton of a Go program. Type embedding gives types the ability to extend vertically, while interfaces are the key to horizontal composition. An interface is like a “joint” in the program’s body: the two parts it connects can each “move freely” while together they still accomplish a given function.
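As a minimal sketch of the two kinds of composition described above (the names Logger, Service, Describer, and render are hypothetical and not taken from the book), the following shows a method promoted through type embedding and a type used through an implicitly satisfied interface:

```go
package main

import "fmt"

// Logger is a hypothetical helper type used only for this illustration.
type Logger struct {
	prefix string
}

// Log returns the message with the logger's prefix.
func (l Logger) Log(msg string) string {
	return l.prefix + msg
}

// Service embeds Logger, so Logger's methods are promoted to Service
// ("vertical" composition through type embedding).
type Service struct {
	Logger
	name string
}

// Describer is a small interface; any type with a Describe method
// satisfies it implicitly ("horizontal" composition).
type Describer interface {
	Describe() string
}

// Describe makes Service satisfy Describer without any explicit declaration.
func (s Service) Describe() string {
	return s.Log("service " + s.name)
}

// render depends only on the interface, not on the concrete type.
func render(d Describer) string {
	return d.Describe()
}

func main() {
	s := Service{Logger: Logger{prefix: "[demo] "}, name: "api"}
	fmt.Println(s.Log("started")) // promoted method from the embedded Logger
	fmt.Println(render(s))        // Service used through the Describer interface
}
```

Because render only knows about Describer, any other type with a Describe method could be passed to it without changing either side.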

The designers of Go keenly grasped the trend of CPUs moving toward multiple cores. When they decided to create a new language instead of using C++, they decisively made multi-core support and native built-in concurrency one of the new language’s design principles.

The operating system scheduler will schedule multiple threads in the system to run on the physical CPU according to a certain algorithm

The concurrency implementation of traditional programming languages (such as C and C++) is actually based on operating system scheduling: the program is responsible for creating threads (usually through library calls such as pthread), and the operating system is responsible for scheduling them. This traditional way of supporting concurrency has two major shortcomings: complexity and difficulty in scaling.

Although threads are much cheaper than processes, we still cannot create them in large numbers: each thread occupies a non-trivial amount of resources, and the cost of the operating system scheduling and switching threads is not small either.

Because many network service programs cannot create a large number of threads, they have to multiplex network I/O over a small number of threads, that is, use mechanisms such as epoll/kqueue/IoCompletionPort.

Even with the help of third-party libraries such as libevent and libev, such programs are not easy to write. They are full of callbacks, which place a heavy mental burden on the programmer.

Goroutines occupy very few resources, and the Go runtime allocates only 2KB of stack space for each goroutine by default.

A goroutine context switch does not need to trap into the operating system kernel, so its cost is very low.

The program that places these goroutines on the CPU for execution according to a certain algorithm is called the goroutine scheduler.

The structure of a concurrent program should not be designed only around processing power on a single core; the ultimate goal is to make full use of multiple cores and gain a natural performance increase when more cores are available.

Go designers condense all engineering issues into one word: scale (I have always felt that no Chinese translation of the word “scale” conveys its meaning vividly, so a provisional translation is used here).

Production scale: the scale of concurrency of software systems built with Go, such as how much concurrency these systems handle, the volume of data they process, and the number of services they interact with concurrently.

Development scale: including the size of the code base of the development team, the number of engineers participating in the development and collaborating with each other, etc.

When dealing with dependencies, it is sometimes possible to avoid introducing more dependencies by allowing a part of the code to be duplicated

Package paths are unique, while package names don’t have to be unique

• Build and run: go build/go run
• View and obtain dependent packages: go list/go get/go mod xx
• Editor-assisted formatting: go fmt/gofmt
• Document viewing: go doc/godoc
• Unit testing/benchmarking/test coverage: go test
• Code static analysis: go vet
• View performance analysis and trace results: go tool pprof/go tool trace
• Auxiliary tool for upgrading to new Go version API: go tool fix
• Report Go language bugs: go bug

The German architect Ludwig Mies van der Rohe achieved extraordinary success after applying the philosophy of “less is more” to architectural design, and Go is one of the few practitioners of this philosophy in the field of programming languages.

“Less” is by no means the goal, “more” is its connotation

“High cohesion, low coupling” is an eternal criterion for managing complexity in the field of software development

There is a very famous hypothesis in the field of natural linguistics, the “Sapir-Whorf hypothesis”. Its content is as follows: “Language affects or determines the way of thinking of human beings.”

Alan J. Perlis, the first Turing Award winner and a famous computer scientist, made the point from another angle: “A programming language that cannot affect your programming thinking is not worth learning and using.”

To solve this problem, we can use the Sieve of Eratosthenes.

First use the smallest prime, 2, to sift out the multiples of 2; the next number that has not been sifted out is a prime (3 here). Then use 3 to sift out the multiples of 3, and so on, until the sieve is finished.

The Go version of the program implements a concurrent prime sieve, which uses a concurrent composition of goroutines.

The program starts from the prime number 2 and creates a goroutine for each prime in turn, which filters out the multiples of that prime.

ch points to the channel fed by the sieve goroutine of the most recently output prime. This code comes from a talk on concurrency by Rob Pike.

The execution process of the Go version of the program is visualized in Figure 4-2.

Figure 4-2 Schematic diagram of the Go version prime number sieve operation
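The notes do not reproduce the program itself. The following is a sketch in the style of the well-known concurrent prime sieve; the firstPrimes wrapper and its parameter n are assumptions added here so that the result can be collected and checked, and the code is not claimed to be identical to the book’s listing:

```go
package main

import "fmt"

// generate sends the sequence 2, 3, 4, ... to ch.
func generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// filter copies values from in to out, dropping multiples of prime.
func filter(in <-chan int, out chan<- int, prime int) {
	for {
		i := <-in
		if i%prime != 0 {
			out <- i
		}
	}
}

// firstPrimes returns the first n primes by chaining one filter goroutine
// per prime. The goroutines are never stopped; this is acceptable for a
// short demonstration but would leak in a long-running program.
func firstPrimes(n int) []int {
	primes := make([]int, 0, n)
	ch := make(chan int)
	go generate(ch)
	for i := 0; i < n; i++ {
		prime := <-ch // the first value to pass every filter so far is prime
		primes = append(primes, prime)
		ch1 := make(chan int)
		go filter(ch, ch1, prime)
		ch = ch1 // ch now points to the output of the newest filter
	}
	return primes
}

func main() {
	fmt.Println(firstPrimes(10)) // [2 3 5 7 11 13 17 19 23 29]
}
```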

C’s imperative thinking, Haskell’s functional thinking and Go’s concurrent thinking

Programming languages affect programming thinking; in other words, each programming language has its own native programming thinking.

Any code regarded as high quality in a given programming language must be code written with that language’s native thinking.

The language usage, supporting libraries, and established idioms that center on a programming language and aim at solving engineering problems are together called the native programming thinking of that language.

◆ Part 2: Project structure, code style, and identifier naming

As of the Go project commit 1e3ffb0c (2019.5.14), the Go project structure is as follows:
$ tree -LF 1 ~/go/src/github.com/golang/go
./go
├── api/
├── AUTHORS
├── CONTRIBUTING.md
├── CONTRIBUTORS
├── doc/
├── favicon.ico
├── lib/
├── LICENSE
├── misc/
├── PATENTS
├── README.md
├── robots.txt
├── src/
└── test/

1) The script source files for code construction are placed in the top-level directory under src.

2) The secondary directory cmd under src holds the main directories of the Go toolchain executables (such as go and gofmt) and the source files of their main packages.

3) The secondary directory pkg under src holds the packages that the cmd toolchain programs depend on, the Go runtime, and the source files of the Go standard library.

Russ Cox gave his thoughts on a minimal standard layout for a Go project structure [1] in an open source project issue. He believes that the minimum standard layout of a Go project should look like this:
// Under the root path of the Go project repository
- go.mod
- LICENSE
- xx.go
- yy.go
…

or

- go.mod
- LICENSE
- package1
  - package1.go
- package2
  - package2.go
…


Figure 5-1 Typical project structure of Go language (Go project for the purpose of building binary executable files)


Figure 5-2 Go language library project structure

The Go core team sums up this kind of problem in one word, scale, which is also the problem that the much-discussed Go2 evolution proposals of recent years mainly aim to solve.

Go officially provides the -s option in gofmt. gofmt -s automatically converts non-simplified code in legacy code to the simplified form without side effects, so -s is generally used as a default option when running gofmt.

We can perform expression-level replacement of the code through the -r command line option to achieve the purpose of refactoring.

The principle of gofmt -r is to search the source code for an expression that can match the pattern before reformatting the source code, and if so, replace all matched results with the replacement expression.

gofmt provides the -l option, which lists the files whose formatting does not meet gofmt’s requirements.

goimports adds a maintenance function to the package import list based on the gofmt function, which can automatically add or delete packages from the import package list according to the latest changes in the source code.

Go and Vim are linked together via the vim-go plugin

There are only two hard things in computer science: cache invalidation and naming.
—Phil Karlton, Architect, Netscape

But simplicity does not mean choosing short names for identifiers, but choosing names that keep their purpose clear and unambiguous in the context in which the identifier is used

To do a good job of naming Go identifiers (including naming packages), at least two principles must be followed: simple and consistent; use context to assist naming.

For packages in Go, it is generally recommended to name them as single words in lowercase.

Variables in Go are divided into package-level variables and local variables (variables within functions or methods)

The Go language officially requires that identifiers be named using camel case (CamelCase)

When naming variables, types, functions, and methods, the primary principle should still be simple and short.

• Loops and conditional variables are mostly named with a single letter (see statistics above for details);
• The parameters and return value variables of functions/methods are mainly single words or single letters;
• Because the type information is bound when the method is called, the naming of the method is mainly based on a single word;
• Functions are mostly named with multi-word compound words;
• Types are often named after multi-word compounds.

Do not include type information in variable names

Keep variable declaration and use as close as possible, or declare the variable before its first use

Maintain consistency in the meaning of short-named variables

Constants are often named using a combination of multiple words

The interface in the Go language is an innovation of Go at the programming language level, which provides a powerful decoupling capability for Go code

In the Go language, single-word names are preferred for interface types.

For an interface with a single method, or an interface composed of several single-method interfaces, the Go convention is to name it “method name + er”.

The Go language recommends defining small interfaces as much as possible, and building programs through interface combinations
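The naming convention and the advice about small interfaces can be sketched as follows. Shouter, Whisperer, ShoutWhisperer, and voice are hypothetical names invented for this illustration; the standard library follows the same pattern with io.Reader, io.Writer, and io.ReadWriter.

```go
package main

import (
	"fmt"
	"strings"
)

// Shouter has a single method, so by convention it is named "method name + er".
type Shouter interface {
	Shout(s string) string
}

// Whisperer is another hypothetical single-method interface.
type Whisperer interface {
	Whisper(s string) string
}

// ShoutWhisperer is built by combining the two small interfaces.
type ShoutWhisperer interface {
	Shouter
	Whisperer
}

// voice satisfies all three interfaces implicitly.
type voice struct{}

func (voice) Shout(s string) string   { return strings.ToUpper(s) + "!" }
func (voice) Whisper(s string) string { return strings.ToLower(s) + "..." }

func main() {
	var v ShoutWhisperer = voice{}
	fmt.Println(v.Shout("hello"), v.Whisper("HELLO"))
}
```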

Go also has the convention of taking context into account when naming identifiers: as long as readability is not harmed and consistency is preserved, name identifiers as briefly as possible.

◆ Part 3: Declarations, types, statements, and control structures

Go is known for being engineering-oriented and for aiming at problems of scale, so Gophers should keep the choice of variable declaration form consistent within a project.

Package variable: a variable visible at the package level. If it is exported, the package-level variable can also be considered a global variable.

Local variable: a variable declared in a function or method body, visible only within that function or method body.

Package-level variables can only use the variable declaration form with the var keyword, but there is still a certain degree of flexibility in the details of the form.

We generally put variable declarations of the same kind in one var block and declarations of different kinds in different var blocks; or we put declarations of variables with delayed initialization in one var block, and variables that are declared and explicitly initialized in another. I call this “declaration clustering”.

Another best practice for variable declaration is the principle of proximity: declare a variable as close as possible to where it is first used. The principle of proximity is in effect a way of minimizing variable scope.

If a package-level variable is used in multiple places within the package, it is more appropriate to declare this variable at the head of the source file.

For local variables with delayed initialization, use the declaration form with the var keyword.

Another variable commonly declared with the var keyword is err, of type error (naming a variable of type error err is itself a Go idiom), especially when a closure following defer needs to use err to determine the exit status of the function or method.

For local variables that are declared and explicitly initialized, it is recommended to use the short variable declaration form

Try to apply short variable declaration form in branch control


Figure 8-1 Flowchart of variable declaration form usage decision-making
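The declaration guidelines above can be sketched in one small program. All names here (defaultHost, callCount, describe) are hypothetical; the example only illustrates grouped package-level var blocks, a var declaration for a delayed-initialization local, and short declarations for explicitly initialized locals and in branch control.

```go
package main

import (
	"fmt"
	"strconv"
)

// Package-level variables must use the var form.
// This block groups variables that are declared and explicitly initialized.
var (
	defaultHost = "localhost"
)

// This block groups variables with delayed initialization (zero value first).
var (
	callCount int
)

// describe classifies s and reports it together with the default host.
func describe(s string) string {
	callCount++

	var kind string // delayed initialization: assigned in the branches below

	// Short declaration in branch control keeps n and err scoped to the if chain.
	if n, err := strconv.Atoi(s); err != nil {
		kind = "not an integer"
	} else if n < 0 {
		kind = "negative integer"
	} else {
		kind = "non-negative integer"
	}

	host := defaultHost // declared and explicitly initialized: short form
	return host + ": " + kind
}

func main() {
	fmt.Println(describe("42"))
	fmt.Println(describe("-1"))
	fmt.Println(describe("x"))
	fmt.Println("calls:", callCount)
}
```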

In the C language, literals take on the role of constants (for integer values, enumeration constants can also be used).

To keep these magic numbers from flooding the source code, the common practice in early C was to use macro definitions to name the literal values.

Constants defined by macros have many shortcomings, such as:
• A macro is only a literal substituted in the precompilation stage, so it inherits the complexity and error-proneness of macro replacement;
• It is not type-safe;
• The constant value cannot be printed by macro name during debugging.

const in Go unifies the three C forms of macro-defined constants, const read-only variables, and enumeration constants, and eliminates the shortcomings of each, making Go constants a type-safe and compiler-friendly syntax element.

In most cases, Go constants are declared without an explicit type; that is, they are untyped constants.

Go requires that even two types with the same underlying type are still different data types, and they cannot be compared with each other or mixed in an expression.

The designers of Go decided that the convenience brought by implicit conversions was not enough to offset the many problems it brought [1].
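A minimal sketch of these two points, using hypothetical types Celsius and Fahrenheit: an untyped constant adapts to whichever type the context requires, while two named types with the same underlying type need an explicit conversion before they can be mixed.

```go
package main

import "fmt"

// Celsius and Fahrenheit share the underlying type float64 but are distinct types.
type Celsius float64
type Fahrenheit float64

// boiling is an untyped constant: it takes the type required by its context.
const boiling = 100

// toFahrenheit needs an explicit conversion; Go performs no implicit one.
func toFahrenheit(c Celsius) Fahrenheit {
	return Fahrenheit(c*9/5 + 32)
}

func main() {
	var c Celsius = boiling    // the untyped constant is assignable to Celsius
	var f Fahrenheit = boiling // and to Fahrenheit
	// fmt.Println(c == f)     // would not compile: mismatched types
	fmt.Println(Fahrenheit(c) == f) // an explicit conversion makes the comparison legal
	fmt.Println(toFahrenheit(c))
}
```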

Go’s const syntax provides a mechanism for “implicitly repeating the previous non-empty expression”.

iota is a predefined identifier of the Go language, which represents the offset value (starting from zero) of the position of each constant in the const declaration block (including single-line declaration) in the block

// $GOROOT/src/sync/mutex.go (go 1.12.7)
const (
    mutexLocked = 1 << iota
    mutexWoken
    mutexStarving
    mutexWaiterShift = iota
    starvationThresholdNs = 1e6
)

The predefined identifier iota can assign initial values to enumeration constants in more flexible forms.

Go’s enumeration constants are not limited to integer values, you can also define floating-point enumeration constants

iota makes it easier to maintain lists of enum constants

Use typed enum constants for type safety

Make the zero value useful.
- Go proverb

Although newer versions of some compilers provide some command-line parameter options for zero-value initialization of variables on the stack, for example, GCC provides the following command-line options:

But this does not change the fact that the C language does not natively support zero-value initialization of local variables that are not explicitly initialized

Because the zero value of a Go slice is usable, we can append to it directly without a nil-reference error.

The designers of the Go standard library thoughtfully made the zero value of the sync.Mutex struct a usable state, so that callers can skip initialization and use a Mutex directly.

The availability of zero values also has limits. For example, a zero-value slice works with append, but its elements cannot be read or written by index.

Built-in types such as map do not provide a usable zero value either.

In addition, for types whose zero value is usable, take care to avoid copying values wherever possible.
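A short sketch of these points, with a hypothetical counter type: a nil slice accepts append, a zero-value sync.Mutex and bytes.Buffer can be used directly, and writing to a nil map panics.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// counter's zero value is ready to use: the Mutex field needs no initialization.
type counter struct {
	mu sync.Mutex
	n  int
}

// Inc uses a pointer receiver so that the Mutex is never copied.
func (c *counter) Inc() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
	return c.n
}

// writeToNilMap reports whether writing to a nil map panics.
func writeToNilMap() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	var m map[string]int // the zero value of a map is nil
	m["a"] = 1           // panics: assignment to entry in nil map
	return false
}

func main() {
	var s []int         // nil slice
	s = append(s, 1, 2) // append works on the zero value
	fmt.Println(s, len(s))

	var c counter // zero value, no constructor needed
	c.Inc()
	fmt.Println(c.Inc())

	var buf bytes.Buffer // another zero-value-ready standard library type
	buf.WriteString("ready")
	fmt.Println(buf.String())

	fmt.Println(writeToNilMap())
}
```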

Sometimes, the zero value is not the best choice. It is necessary for us to assign an appropriate initial value to the variable to ensure that it will participate in the business process calculation in the correct state, especially some composite type variables in the Go language.

Composite types in Go include structures, arrays, slices, and maps

The composite literal syntax provided by Go can be used as the initial value constructor of composite type variables

A composite literal consists of two parts: one part is the type, such as myStruct, [5]int, []int, and map[int]string on the right side of the assignment operator in the above sample code; the other is the literal value wrapped in curly braces {}.

If the source code uses a struct type imported from another package, but does not use an initial value constructor of the field:value form, this rule considers such a composite literal value to be fragile

It is not allowed to use unexported fields in structs imported from other packages as fields in composite literals, which will cause compilation errors.

Unlike struct types, arrays and slices use the index as the field in the field:value form, which allows more advanced construction of their initial element values.

But for the map type (this syntactic sugar was only introduced in Go 1.5), when the type of key or value is a composite type, we can omit the type in the composite literal in key or value
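The composite literal forms discussed above can be sketched as follows; the point type and the build function are hypothetical names used only for illustration.

```go
package main

import "fmt"

// point is a hypothetical struct type for this illustration.
type point struct {
	x, y int
}

// build constructs one value of each composite type with a composite literal.
func build() (point, [5]int, []string, map[string]point) {
	// Struct: the field:value form does not depend on field order.
	p := point{y: 2, x: 1}

	// Array and slice: index:value sets selected elements; the rest are zero.
	arr := [5]int{1: 10, 3: 30}
	sl := []string{2: "c"} // length 3; elements 0 and 1 are ""

	// Map: the value type point can be omitted inside the literal.
	m := map[string]point{
		"origin": {x: 0, y: 0},
		"unit":   {x: 1, y: 1},
	}
	return p, arr, sl, m
}

func main() {
	p, arr, sl, m := build()
	fmt.Println(p, arr, len(sl), m["unit"])
}
```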

For scenarios where the zero value is not applicable, we need to assign a certain initial value to the variable

Whenever you spend a lot of time with a particular tool, it pays to dig into it and understand how to use it effectively.

The slice is an important abstract data type that Go provides on top of arrays. In Go, slices can replace arrays perfectly on most occasions where an array would be needed, and compared with arrays they provide a more flexible and efficient interface for accessing a sequence of data.

A Go array is a fixed-length, contiguous sequence of elements of the same type, so a Go array type has two properties: element type and array length. Array types whose two properties are the same are equivalent.

Go arrays have value semantics, which means that an array variable represents the entire array, which is completely different from the C language.

In the Go language, a more idiomatic way is to use slices.
Slices are to arrays what file descriptors are to files

In the Go language, the array is more “retired behind the scenes” and assumes the role of the underlying storage space; while the slice moves to the “foreground”, opening a “window” for access to the underlying storage (array) (see Figure 13-1).

We can call a slice the “descriptor” of an array

// $GOROOT/src/runtime/slice.go
type slice struct {
    array unsafe.Pointer
    len   int
    cap   int
}

We can create slices that operate on existing arrays through the syntax u[low:high], which is called array slicing.

You can also create new slices based on existing slices through the syntax s[low:high], which is called slice reslicing.

The newly created slice shares the underlying array with the original slice, and the modification of the array through the new slice will also be reflected in the original slice.

Go slices also support an important advanced feature: dynamic expansion

Such append operations sometimes confuse Gophers. For example, take a slice created from an array with the syntax u[low:high]: once the slice’s cap reaches the upper bound of the array, a further append unbinds the slice from the original array.
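A small sketch of this behavior (the function name sharingDemo is hypothetical): writes and appends that fit within the capacity are visible through the array, and the first append beyond the capacity moves the slice to a new underlying array.

```go
package main

import "fmt"

// sharingDemo returns the array and the slice after a series of writes and appends.
func sharingDemo() ([5]int, []int) {
	u := [5]int{1, 2, 3, 4, 5}
	s := u[1:3] // len 2, cap 4: a window onto u[1] through u[4]

	s[0] = 20         // visible through u: u[1] becomes 20
	s = append(s, 40) // fits in cap: overwrites u[3]
	s = append(s, 50) // fits in cap: overwrites u[4]; now len == cap == 4
	s = append(s, 60) // exceeds cap: s gets a new underlying array
	s[0] = 99         // no longer visible through u
	return u, s
}

func main() {
	u, s := sharingDemo()
	fmt.Println(u) // [1 20 3 40 50]
	fmt.Println(s) // [99 3 40 50 60]
}
```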

The append operation is a powerful tool, which allows the slice type to partially satisfy the concept of “zero value is available”

But from the principle of append, we can also see that the operation cost of reallocating the underlying array and copying elements is still quite high, especially when there are many elements. So how can you reduce or avoid the cost of excessive memory allocation and copying? An effective method is to estimate the capacity scale of the slice according to the usage scenario of the slice, and pass the estimated slice capacity data to the built-in function make in the form of cap parameter when creating a new slice

The results show that append on a slice created with the cap parameter runs about 4 times as fast on average (9,250 ns) as on a slice created without it (36,484 ns), and each operation needs only one memory allocation on average.

The key type must have well-defined behavior when used as an operand of the “==” and “!=” operators, which is why function, map, and slice types cannot be used as map keys.
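The benchmark itself is not included in these notes. As a rough illustration of why pre-sizing helps, the sketch below counts how often the capacity changes while appending; countReallocs is a hypothetical helper, and the exact count for the unsized slice depends on the Go version’s growth policy.

```go
package main

import "fmt"

// countReallocs appends n elements to s and counts how often the capacity
// changes, which indicates that a new underlying array was allocated.
func countReallocs(s []int, n int) int {
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	const n = 10000
	fmt.Println("without cap:", countReallocs(nil, n))
	fmt.Println("with cap:   ", countReallocs(make([]int, 0, n), n))
}
```

With the capacity passed to make, no reallocation happens during the appends; without it, the underlying array is reallocated and copied several times as the slice grows.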

There are two ways to create a map type variable: one is to use a composite literal value, and the other is to use the pre-declared built-in function make.

Like a slice, a map is also a reference type. Passing a map type variable as a function parameter will not cause a large performance loss, and the modification of the map variable inside the function is also visible outside the function.

A best practice in Go is to always use the “comma ok” idiom for reading values from maps.

Even if the data to be deleted does not exist in the map, delete will not cause panic

We see that the same map is traversed multiple times, and the order of elements traversed is not the same. This is because the Go runtime randomizes the starting position when initializing the map iterator

Never rely on the order of elements obtained by traversing the map.

If you need a stable traversal order, then a more general approach is to use another data structure to store the keys in the desired order, such as slices
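The map idioms above can be sketched as follows; sortedKeys is a hypothetical helper that stores the keys in a slice and sorts them to obtain a stable traversal order.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"b": 2, "a": 1, "c": 0}

	// "comma ok" distinguishes a stored zero value from a missing key.
	if v, ok := m["c"]; ok {
		fmt.Println("c is present with value", v)
	}
	if _, ok := m["z"]; !ok {
		fmt.Println("z is absent")
	}

	delete(m, "z") // deleting a missing key is a no-op, not a panic

	// Iterate in a stable order by ranging over the sorted keys.
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```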

The Go runtime uses a hash table to implement the abstract map type. The runtime implements all the functions of the map operation, including search, insertion, deletion, traversal, etc.

Considering that the map can be automatically expanded, the value position of the data element in the map may change during this process, so Go does not allow obtaining the address of the value in the map, and this constraint takes effect during compilation.

If possible, we should make a rough estimate of how large the map will grow and pass that estimate to make as the cap parameter when creating the map instance.

The average write performance of a map instance created with the cap parameter is twice that of one created without it.

When the final string length can be estimated, building the string with a pre-sized strings.Builder is the most efficient approach.

Building strings with strings.Join has the most stable average performance. If the input strings are already held in a []string, strings.Join is also a good choice.

Concatenation with the + operator is the most intuitive and natural approach, and when the compiler knows how many strings are being concatenated it can optimize this form.

Although fmt.Sprintf is not efficient, it is not useless. If a string of a specific format is constructed by a variety of variables of different types, then this method is the most suitable
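A brief sketch of the four approaches side by side. joinWithBuilder is a hypothetical helper; it pre-sizes the strings.Builder with Grow, which is one way to apply the “estimate the final length” advice.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWithBuilder concatenates parts using a strings.Builder whose buffer is
// pre-sized to the final length.
func joinWithBuilder(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}
	var b strings.Builder
	b.Grow(total) // reserve the full length up front
	for _, p := range parts {
		b.WriteString(p)
	}
	return b.String()
}

func main() {
	parts := []string{"Go", "pher", "Con"}

	fmt.Println(joinWithBuilder(parts))         // pre-sized strings.Builder
	fmt.Println(strings.Join(parts, ""))        // input already in a []string
	fmt.Println(parts[0] + parts[1] + parts[2]) // fixed number of operands

	// fmt.Sprintf suits values of mixed types in a specific format.
	msg := fmt.Sprintf("%s has %d parts", "GopherCon", len(parts))
	fmt.Println(msg)
}
```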

Whether string is converted to slice or slice is converted to string, the conversion is costly. The root of these costs is that string is immutable, and new memory must be allocated for the converted type at runtime.

Go requires no circular dependencies between packages, so that the dependencies of a package form a directed acyclic graph. Since there are no loops, packages can be compiled individually or in parallel.

Like mainstream statically compiled languages, the construction of Go programs is simply composed of two stages: compile and link.

What the compiler must use in the compilation process is the source code of the package on which the compilation unit (a package) depends.

When the package name differs from the last directory name in the package import path, it is better to state the package name explicitly in the import statement.

Support for declaring and initializing multiple variables on the same line (different types are also possible)

Support for assigning multiple variables on the same line

The order of evaluation of expressions is “difficult” in any programming language

Inside a Go package, the order of expression evaluation in package-level variable declaration statements is determined by the rules for initialization dependencies.

In a Go package, package-level variables are initialized in the order in which they are declared

If the initialization expression of a variable (such as variable a) directly or indirectly depends on other variables (such as variable b), then the initialization sequence of variable a comes after variable b.

A variable that is not yet initialized and either has no initialization expression or has one that does not depend on any uninitialized variable is called a “ready for initialization” variable.

The initialization of package-level variables is carried out step by step, and each step is the process of finding the next “ready for initialization” variable in the order of variable declaration and initializing it. Repeat this step over and over until there are no “ready for initialization” variables left.

The order in which variables are declared within the same package but in different files depends on the order in which the compiler processes the files: variables in files processed first are declared before all variables in files processed later.
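The step-by-step search for the next “ready for initialization” variable can be traced with a small program. It has the same shape as the example in the Go language specification’s section on package initialization; the function name next is a hypothetical choice.

```go
package main

import "fmt"

// Declaration order is a, b, c, d, but initialization follows dependencies:
// a needs b and c; b and c call next, which reads d.
// Step 1: d is the first ready variable, so d = 3.
// Step 2: b is now ready, so b = next() = 4.
// Step 3: c is ready, so c = next() = 5.
// Step 4: a is ready, so a = c + b = 9.
var (
	a = c + b
	b = next()
	c = next()
	d = 3
)

// next increments d and returns the new value.
func next() int {
	d++
	return d
}

func main() {
	fmt.Println(a, b, c, d) // 9 4 5 5
}
```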

Let’s first look at the expression evaluation in the switch-case statement, which belongs to the category of “lazy evaluation”. Lazy evaluation means that the expression value is evaluated only when it needs to be evaluated. The purpose of this is to let the computer do less work, thereby reducing the consumption of the program, which is helpful for performance improvement
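A minimal sketch of lazy evaluation in switch-case: evalOrder is a hypothetical function that records which expressions are actually evaluated. Case expressions are evaluated left to right and top to bottom, and evaluation stops at the first match.

```go
package main

import "fmt"

// evalOrder records which expressions a switch statement evaluates for x.
func evalOrder(x int) []string {
	var log []string
	expr := func(name string, v int) int {
		log = append(log, name) // note that this expression was evaluated
		return v
	}
	switch expr("switch", x) {
	case expr("case1", 1), expr("case2", 2), expr("case3", 3):
		log = append(log, "matched first clause")
	case expr("case4", 4):
		log = append(log, "matched second clause")
	}
	return log
}

func main() {
	fmt.Println(evalOrder(2)) // [switch case1 case2 matched first clause]
	fmt.Println(evalOrder(4)) // [switch case1 case2 case3 case4 matched second clause]
}
```

For x == 2, case3 and case4 are never evaluated, because the match is found at case2.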

When the select execution starts, first all case expressions will be evaluated in the order in which they appear

With one exception, the expression receiving data from the channel (RecvStmt) on the left side of the case equal sign will not be evaluated

If the selection to be executed is a case that receives data from a channel, the expression to the left of the equal sign in the case will not be evaluated until it is received

An expression is essentially a value, and the order of expression evaluation affects the calculation result of the program.

The expression evaluation order in the package-level variable declaration statement is determined by the variable declaration order and initialization dependencies, and the package-level variable expression evaluation order has the highest priority.

Functions, methods, and channel operations in expression operands are evaluated in the normal order of evaluation, that is, from left to right.

The evaluation of an assignment statement is divided into two phases: first, the operands of index expressions and pointer dereference expressions on the left of the equal sign, together with the expressions on the right of the equal sign, are evaluated according to the ordinary evaluation rules; then the variables are assigned in order from left to right.
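The two phases can be observed in a short sketch (assignDemo is a hypothetical name). In the first assignment, the index operand i is evaluated while it is still 0, so the element written is x[0] even though i itself becomes 1.

```go
package main

import "fmt"

// assignDemo demonstrates the two-phase evaluation of assignment statements.
func assignDemo() (int, []int, int, int) {
	i := 0
	x := []int{0, 0}
	// Phase 1: the index operand i (still 0) and the right-hand values 1 and 2
	// are evaluated. Phase 2: the assignments are carried out, so i becomes 1
	// and x[0] becomes 2.
	i, x[i] = 1, 2

	a, b := 1, 2
	a, b = b, a // swap: both right-hand values are read before any assignment
	return i, x, a, b
}

func main() {
	fmt.Println(assignDemo()) // 1 [2 0] 2 1
}
```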

Focus on the “lazy evaluation” rules for expressions in switch-case and select-case statements

Only with a deep understanding of Go code blocks and scope rules can we understand the real reason why this code outputs “1 2 3”.

A code block is the basic unit of the code execution flow, and the execution flow always jumps from one code block to another.

There are two kinds of code blocks in Go. One is the explicit code block, wrapped in a pair of curly braces that we can see directly in the code, such as a function body, the body of a for loop, or a branch of an if statement; the other is the implicit code block, which has no visible curly braces.

• Universe code block
• Package code block
• File code block
• Each if, for, and switch statement is considered to be in its own implicit code block;
• Each clause in a switch or select statement is considered an implicit code block.

Unlike switch-case, which cannot declare variables in the case clause, select-case can define new variables through short variable declarations in the case clause
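The effect of implicit code blocks on scope can be sketched with a chain of if statements (chain is a hypothetical function name). Each else-if statement sits inside the implicit block of the if statement before it, so variables declared in earlier conditions remain visible in later branches:

```go
package main

import "fmt"

// chain collects variables declared in the simple statements of an if chain.
func chain() []int {
	var seen []int
	if a := 1; false {
		// not executed
	} else if b := 2; false {
		// not executed; a is visible here as well
	} else if c := 3; false {
		// not executed; a and b are visible here as well
	} else {
		// a, b, and c are all in scope in the final branch
		seen = append(seen, a, b, c)
	}
	return seen
}

func main() {
	fmt.Println(chain()) // [1 2 3]
}
```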

Add the optional ability to be followed by label for break and continue;

Add type switch, so that type information can also be used as a condition for branch selection

Add a switch-case statement for channel communication — select-case

The so-called “happy path” is the code execution path of the success logic

• Return quickly when an error occurs;
• Do not embed success logic in an if-else statement;
• The execution logic of the “Happy Path” is always on the left in the code layout, so that readers can see the normal logic flow of the function at a glance;
• The return value of the “happy path” is usually in the last line of the function, just like in the pseudo code segment 1 above.
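The four rules above can be sketched with a hypothetical function parseAge: every error returns immediately, the success logic stays at the left margin without else branches, and the success value is returned on the last line.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// parseAge is a hypothetical function laid out in the "happy path" style.
func parseAge(s string) (int, error) {
	if s == "" {
		return 0, errors.New("empty input") // return quickly on error
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse age %q: %w", s, err)
	}
	if n < 0 {
		return 0, errors.New("age must not be negative")
	}
	return n, nil // the success value is on the last line
}

func main() {
	fmt.Println(parseAge("42"))
	fmt.Println(parseAge("abc"))
}
```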

Be careful with reuse of iteration variables

What participates in the loop is a copy of the range expression

A slice is represented internally as a structure consisting of (*T, len, cap) triples

The *T in the structure representing the copy of the slice still points to the underlying array corresponding to the original slice, so modifications to the copy of the slice will also be reflected on the underlying array a
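A small sketch contrasting the two cases; the function names are illustrative. Ranging over an array iterates over a copy of the whole array, while ranging over a slice copies only the (*T, len, cap) header, which still points to the same underlying array.

```go
package main

import "fmt"

// rangeArray ranges over an array: the loop iterates over a copy, so writes to
// the original array made during the loop are not seen by v.
func rangeArray() (seen [5]int) {
	a := [5]int{1, 2, 3, 4, 5}
	for i, v := range a {
		if i == 0 {
			a[1], a[2] = 12, 13
		}
		seen[i] = v
	}
	return seen
}

// rangeSlice ranges over a slice of the same array: the copied slice header
// still points to the array, so the writes are seen by v.
func rangeSlice() (seen [5]int) {
	a := [5]int{1, 2, 3, 4, 5}
	for i, v := range a[:] {
		if i == 0 {
			a[1], a[2] = 12, 13
		}
		seen[i] = v
	}
	return seen
}

func main() {
	fmt.Println(rangeArray()) // [1 2 3 4 5]
	fmt.Println(rangeSlice()) // [1 12 13 4 5]
}
```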

When for range iterates over a string, each iteration yields a rune, not a byte, and the first value returned is the byte offset of the first byte of that rune's code point
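A short sketch of this behaviour, assuming a UTF-8 string in which the two Chinese characters take three bytes each; runeOffsets is a made-up helper.

```go
package main

import "fmt"

// runeOffsets returns the byte offset at which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	s := "Go中国" // 'G' and 'o' take 1 byte each; '中' and '国' take 3 bytes each
	for i, r := range s {
		fmt.Printf("%d:%c ", i, r)
	}
	fmt.Println()
	fmt.Println(len(s), runeOffsets(s)) // 8 [0 1 2 5]
}
```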

When ranging over a map, for range does not guarantee that the elements are visited in the same order on every run. In addition, if the map is modified during the loop, it is uncertain whether the modification will affect the rest of the iteration.

A channel is represented internally by the Go runtime as a pointer to a channel descriptor (the internal representation of channels is described in detail later), so the copy of that pointer still refers to the original channel

When a channel is the range expression, for range ultimately performs blocking reads on it. This holds even for a buffered channel: when the channel holds no data, for range blocks on it, and the loop ends only when the channel is closed.

The Go language specification clearly stipulates that a break statement without a label terminates the innermost for, switch or select statement within the same function.

In the improved example, we define a label named loop, attached just before the for loop, that refers to that loop. When execution reaches “break loop”, the program terminates the for loop the label refers to

Labeled continue and break improve Go's expressiveness: a program can easily terminate an outer loop from deep inside nested loops, or jump to an outer loop and continue from there, so Gophers need not design complex program structures or resort to goto for such logic
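A minimal sketch of a labeled break. The label name outer and the function find are chosen here for illustration and are not the book's example.

```go
package main

import "fmt"

// find returns the position of target in grid, or (-1, -1) if it is absent.
func find(grid [][]int, target int) (row, col int) {
	row, col = -1, -1
outer:
	for i, line := range grid {
		for j, v := range line {
			if v == target {
				row, col = i, j
				break outer // leaves both loops; a plain break would leave only the inner one
			}
		}
	}
	return row, col
}

func main() {
	grid := [][]int{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
	fmt.Println(find(grid, 5)) // 1 1
}
```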

In C, case clauses fall through by default, which is why every case there has to end with a break

Before using the fallthrough keyword in a program, think about whether you can use a more concise and clear list of case expressions instead
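For instance, a case with an expression list often removes the need for fallthrough altogether. The function isWeekend below is a hypothetical illustration.

```go
package main

import "fmt"

// isWeekend uses one case with an expression list instead of chaining
// several cases together with fallthrough.
func isWeekend(day string) bool {
	switch day {
	case "Sat", "Sun":
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(isWeekend("Sat"), isWeekend("Mon")) // true false
}
```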

◆ Part IV Functions and Methods

From the perspective of program logic structure, the package is the basic unit of logic encapsulation in a Go program. Each package can be understood as an “autonomous”, well-encapsulated unit that exposes only a limited surface to the outside

Constants, package-level variables, functions, types and type methods, interfaces, etc. are distributed in the basic unit of Go package

There are two special functions in the Go language: one is the main function in the main package, which is the entry function of all Go executable programs; the other is the init function of the package.
The init function is a function with no parameters and no return value

If a package defines an init function, the Go runtime takes care of calling its init function when the package is initialized. We cannot explicitly call init in a Go program, otherwise an error will be reported during compilation

The Go runtime does not call init functions concurrently: it waits for one init function to finish and return before running the next, and each init function runs only once in the lifetime of a Go program

Don’t rely on the execution order of the init functions.

When initializing packages, the Go runtime follows the “depth-first” principle: a package's dependencies are initialized before the package itself

The init function is like the only “quality inspector” before the Go package is actually put into use. It is responsible for checking the initial state of the package-level data (mainly package-level variables) inside the package and exposed to the outside. In the Go runtime and standard library, we can find many examples of init checking the initial state of package-level variables.

The Go language was born to “become a new generation of system-level language”, but in the process of evolution, it has gradually evolved into a general-purpose programming language that is oriented to concurrency and fits the development trend of modern hardware.

A method in Go is essentially a variant of a function

Essentially, we can say that a Go program is a collection of functions

If a programming language places no restrictions on the creation and use of a certain language element, so that we can treat this syntax element like a value, then we call it a “first-class citizen” of the programming language.

Syntax elements with “first class citizen” treatment can be stored in variables, passed as arguments to functions, created inside functions and returned from functions as return values. In dynamically typed languages, the language runtime also supports checking for “first-class citizen” types

Functions in Go match Ward Cunningham's description of “first-class citizens”: they can be created and used like ordinary integer values

In computer science, currying is the technique of transforming a function that accepts multiple arguments into a function that accepts a single argument (the first argument of the original function) and returns a new function that accepts the remaining arguments and returns a result
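A minimal sketch of this idea in Go; times and partialTimes are illustrative names.

```go
package main

import "fmt"

// times is an ordinary two-argument function.
func times(x, y int) int { return x * y }

// partialTimes fixes the first argument of times and returns a new function
// that accepts the remaining argument and returns the result.
func partialTimes(x int) func(int) int {
	return func(y int) int {
		return times(x, y)
	}
}

func main() {
	timesTwo := partialTimes(2)
	fmt.Println(timesTwo(5)) // 10
}
```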

A closure is an anonymous function defined inside another function that can access the scope of the outer function in which it was defined

Essentially, closures are the bridge that connects the inside of a function with the outside of a function.
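A small sketch of a closure, with the made-up helper newCounter: the returned anonymous function keeps access to a local variable of the outer function after that function has returned.

```go
package main

import "fmt"

// newCounter returns a closure. The anonymous function keeps access to count,
// a local variable of newCounter, even after newCounter has returned.
func newCounter() func() int {
	count := 0
	return func() int {
		count++
		return count
	}
}

func main() {
	next := newCounter()
	fmt.Println(next(), next()) // 1 2
	other := newCounter()       // a separate closure with its own count
	fmt.Println(other())        // 1
}
```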

A functor needs to satisfy two conditions:
• The functor itself is a container type, taking Go language as an example, this container can be slice, map or even channel;
• The container type needs to implement a method that takes a function type parameter and applies that function to each element of the container, resulting in a new functor, leaving the element values inside the original functor container unaffected.
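The two conditions can be sketched with a slice-backed container. IntSliceFunctor and Fmap are names assumed here for illustration; the book's own listing may differ.

```go
package main

import "fmt"

// IntSliceFunctor is a hypothetical container type wrapping a slice of ints.
type IntSliceFunctor struct {
	ints []int
}

// Fmap applies fn to every element and returns a new functor; the elements of
// the receiver are left untouched.
func (f IntSliceFunctor) Fmap(fn func(int) int) IntSliceFunctor {
	out := make([]int, len(f.ints))
	for i, v := range f.ints {
		out[i] = fn(v)
	}
	return IntSliceFunctor{ints: out}
}

func main() {
	f := IntSliceFunctor{ints: []int{1, 2, 3}}
	g := f.Fmap(func(i int) int { return i * 10 })
	fmt.Println(f.ints, g.ints) // [1 2 3] [10 20 30]
}
```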

Although functions as “first-class citizens” bring powerful expressive power to Go, choosing an inappropriate style, or doing functional programming purely for its own sake, makes code hard to understand and slow to execute. (CPS requires the language to support tail-call optimization, which Go does not currently provide.)

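The second important use of defer is intercepting a panic and handling it as needed; a deferred function can try to recover from the panic (this is also the only way to recover from a panic in Go)

defer supports user-defined functions and methods unconditionally, but if such a function or method has return values, they are automatically discarded when the deferred function is scheduled to run

The expression that follows the defer keyword is evaluated when the deferred function is registered on the deferred function stack.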
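A minimal sketch of both points; safeDivide and deferredArg are illustrative names. The first function recovers from a run-time panic inside a deferred function, and the second shows that a deferred call's argument is evaluated when the defer statement runs.

```go
package main

import "fmt"

// safeDivide converts a run-time panic (integer division by zero) into an
// error by calling recover inside a deferred function.
func safeDivide(a, b int) (q int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a / b, nil
}

// deferredArg shows that the argument of a deferred call is evaluated when the
// defer statement runs, not when the deferred function finally executes.
func deferredArg() (seen int) {
	x := 1
	defer func(v int) { seen = v }(x) // v is fixed to 1 here
	x = 2
	return 0
}

func main() {
	fmt.Println(safeDivide(6, 3))
	fmt.Println(safeDivide(1, 0))
	fmt.Println(deferredArg()) // 1
}
```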

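defer makes releasing resources (such as file descriptors and locks) far more elegant and less error-prone. In performance-sensitive programs, however, the overhead defer introduces is something a Gopher must know about and weigh

In the benchmark, the function that uses defer took about 7 times as long to execute as the one without defer

In Go 1.14, defer performance improved enormously, and the gap with code that does not use defer is now very small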

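Compared with a function, a method declaration in Go has just one extra parameter, which Go calls the receiver parameter. The receiver parameter is the link between a method and its type

Calling a method directly through the type name T like this is called a method expression. A type T can only call the methods in T's method set

Method sets determine interface implementation

We first need to identify the method set of the user-defined type and the method set of the interface type

For a user-defined non-interface type T, the method set consists of all methods whose receiver is of type T

The method set of type *T, in turn, contains all methods whose receiver is of type T or *T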
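A small sketch of how method sets decide interface implementation. The types T and I and the helper methodCount are made up for this example, and reflection is used only to print the sizes of the method sets.

```go
package main

import (
	"fmt"
	"reflect"
)

type T struct{}

func (T) M1()  {} // value receiver: in the method sets of both T and *T
func (*T) M2() {} // pointer receiver: only in the method set of *T

// I requires both methods, so only *T satisfies it.
type I interface {
	M1()
	M2()
}

// methodCount reports how many exported methods are in the method set of v's type.
func methodCount(v interface{}) int {
	return reflect.TypeOf(v).NumMethod()
}

func main() {
	var i I = &T{} // compiles: the method set of *T holds M1 and M2
	// var j I = T{} // would not compile: the method set of T lacks M2
	_ = i
	fmt.Println(methodCount(T{}), methodCount(&T{})) // 1 2
}
```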

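One of Go's design philosophies is a preference for composition. Go supports using composition to implement some classic object-oriented mechanisms, such as inheritance, and the concrete way to do so is type embedding.

In versions before Go 1.14, however, this approach had one constraint: the method sets of the embedded interface types could not overlap

After an interface type is embedded in a struct type, the struct type's method set includes the method set of the embedded interface type

Methods implemented by the struct itself take priority.

If the struct itself does not implement the method, Go looks for it in the method sets of the embedded interface types; if it is found there, it is promoted to a method of the struct.

If a struct embeds several interface types whose method sets overlap, the Go compiler reports an error, unless the struct itself implements every method in the overlap.

By embedding an interface type, a struct type also implements that interface

Embedding a struct type in a struct type gives Gophers a way to implement “inheritance”

For a struct type T that embeds T1 and *T2:
• the method set of type T = the method set of T1 + the method set of *T2;
• the method set of type *T = the method set of *T1 + the method set of *T2.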

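An existing type (such as I and T above) is called the underlying type, and the new type is called a defined type.

A defined type created from an interface type has the same method set as the original interface type

A defined type created from a user-defined non-interface type, however, does not “inherit” the original type's method set; the new defined type's method set is empty.

Method sets determine interface implementation. Because the method set of a defined type based on a user-defined non-interface type is empty, the defined type does not “inherit” this implicit association even if the original type implements certain interfaces. For the new defined type to implement those interfaces, it must implement all of their methods again.

Go's predeclared identifiers rune and byte are defined with the type alias syntax:
// $GOROOT/src/builtin/builtin.go
type byte = uint8
type rune = int32

A type alias has exactly the same method set as the original type, whether the original type is an interface type or not.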
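The difference between a defined type and a type alias can be checked with a short sketch. The names T, Defined, Alias and methodCount are illustrative, and reflection is used only to count the methods.

```go
package main

import (
	"fmt"
	"reflect"
)

type T struct{}

func (T) M1() {}
func (T) M2() {}

type Defined T // defined type based on T: its method set starts out empty
type Alias = T // type alias: just another name for T

// methodCount reports how many exported methods are in the method set of v's type.
func methodCount(v interface{}) int {
	return reflect.TypeOf(v).NumMethod()
}

func main() {
	fmt.Println(methodCount(T{}), methodCount(Defined{}), methodCount(Alias{})) // 2 0 2
}
```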

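Although a variable of type string can be assigned directly to a variable of type interface{}, a variable of type []string cannot be assigned directly to a variable of type []interface{}

Go does not allow functions with the same name but different prototypes to be defined in the same scope

If the parameters of the function to be overloaded all have the same type and only their number varies, a variadic function handles the case easily; if the parameter types differ and the number varies, we also need to draw on the properties of the interface{} type.

If the arguments are passed in an implicitly required fixed order (which the caller must guarantee), we can also use a variadic function to simulate optional parameters and default parameters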
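Both techniques can be sketched as follows. concat, record and newRecord are hypothetical names, and the default values are arbitrary choices made for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// concat simulates "overloading" on argument count and type: it accepts any
// number of ints and strings and ignores values of other types.
func concat(sep string, args ...interface{}) string {
	parts := make([]string, 0, len(args))
	for _, a := range args {
		switch v := a.(type) {
		case int:
			parts = append(parts, fmt.Sprintf("%d", v))
		case string:
			parts = append(parts, v)
		}
	}
	return strings.Join(parts, sep)
}

type record struct {
	name string
	age  int
	city string
}

// newRecord simulates optional parameters with defaults. The caller must pass
// the optional values in the fixed order: age first, then city.
func newRecord(name string, opts ...interface{}) record {
	r := record{name: name, age: 18, city: "unknown"} // default values
	if len(opts) > 0 {
		if age, ok := opts[0].(int); ok {
			r.age = age
		}
	}
	if len(opts) > 1 {
		if city, ok := opts[1].(string); ok {
			r.city = city
		}
	}
	return r
}

func main() {
	fmt.Println(concat("-", 1, "a", 2))      // 1-a-2
	fmt.Println(newRecord("tony"))           // {tony 18 unknown}
	fmt.Println(newRecord("tom", 30, "NYC")) // {tom 30 NYC}
}
```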

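◆ Part V Interfaces

Go advocates composition-oriented programming, and interfaces are an important means of practicing composition in Go.

Interfaces are the only language feature in Go, a statically typed language, that is both “static and dynamic”.

An interface-typed variable has a static type

This supports type checking at compile time: when an interface-typed variable is assigned a value, the compiler checks whether the type of the right-hand value implements every method in the interface's method set.

An interface-typed variable also has a dynamic type, namely the real type of the value stored in it at run time.

At run time an interface-typed variable can be assigned values of different dynamic types, which supports run-time polymorphism.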

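Boxing is a basic concept in programming languages; it generally means converting a value type into a reference type

In Go, assigning a value of any type to an interface-typed variable is a boxing operation

Given the earlier look at the internal representation of interface-typed variables, we know that boxing into an interface type is really the process of creating an eface or an iface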

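The bigger the interface, the weaker the abstraction.
-- Rob Pike, father of the Go language

An interface is a contract formed by abstracting an object's behavior. Contracts can be elaborate or simple, and Go chose simple

Automatic compliance with the contract: the relationship between an interface and its implementers in Go is implicit

An implementer only needs to implement every method in the interface's method set to comply with the contract automatically and thereby implement the interface

Small contracts: an elaborate contract ties your hands, reduces flexibility and suppresses expressiveness. Go chooses small contracts, which in code means defining small interfaces wherever possible

The smaller the interface, the higher the level of abstraction and the more widely it is accepted

A computer program is itself an abstraction and reconstruction of the real world. Abstraction is the method of removing the individual, secondary aspects of similar things and extracting their common, primary aspects

The higher the level of abstraction, the larger the corresponding set; the lower the level of abstraction (the more concrete and the closer to the thing's real appearance), the smaller the corresponding set

The smaller the interface (the fewer its methods), the higher the level of abstraction and the larger the corresponding set of things, that is, the more things can accept it. The limit of this is the method-less empty interface interface{}, whose abstraction corresponds to a set containing everything in the Go world.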

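Go's design principles favor building programs through composition

Small interfaces fit Go's composition idea better and make it easier to unleash the power of composition

Focusing on interfaces is the key to writing powerful and flexible Go code

The closer to the business layer, the harder abstraction becomes

Once we have an interface, we will see it used all over the code. After a while, we analyze which situations use which of its methods and whether those methods can be extracted into a new, smaller interface, as in Figure 27-6.
In Figure 27-6, big interface 1 defines six methods. After a while, we find that method 1 and method 2 are often used in situation 1, method 3 and method 4 in situation 2, and method 5 and method 6 in situation 3. This shows that the methods of big interface 1 group naturally by business logic.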

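The empty interface says nothing.
-- Rob Pike, father of the Go language

Unlike Java with its strict constraints and compile-time checks, dynamic languages go to the other “extreme”: an implementer of an interface makes no explicit declaration that it implements the interface, and the Ruby interpreter performs no checks either

Using the empty interface type for a function or method parameter means giving the compiler no information about the data passed in as arguments. You therefore lose the protection of the type-safety checks of a statically typed language; you have to check for such errors yourself, and they are not discovered until run time.

Gophers are advised to abstract interfaces that carry some behavioral contract wherever possible and use them as function parameter types, and to avoid the empty interface type (interface{}), which escapes the compiler's type-safety checks.

Use the empty interface type only when handling data of unknown type;

In all other cases, abstract the behavior you need into an interface with methods wherever possible, and use such a non-empty interface type as the parameter of functions or methods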

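If C++ and Java are about type hierarchies and the taxonomy of types, Go is about composition.
-- Rob Pike, father of the Go language

“Prefer composition, decouple orthogonally” is one of Go's important design philosophies.

Orthogonality provides the precondition for putting the “composition” philosophy into practice, and composition, as in Rob Pike's view quoted at the start of this item, is the main way components within a Go program are coupled and the main way a Go program's static structure is built.

The ways of composing come down to the following three.
(1) Building an interface by embedding interfaces

(2) Building a struct by embedding interfaces

(3) Building a new struct by embedding structs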

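A common pattern for horizontal composition through interfaces is to use functions or methods that accept interface-typed parameters.


Figure 29-1 The basic form of horizontal composition with interfaces as the connection points

A wrapper function takes this form: it accepts an interface-typed parameter and returns a value of the same type as its parameter

Wrapper functions can filter, decorate or transform the input data and return the result to the caller.

Because a wrapper function's return type is the same as its parameter type, several wrapper functions that accept the same interface-typed parameter can be combined into a chain of calls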

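An adapter function type is a “utility” type that assists in implementing horizontal composition

It can explicitly convert an ordinary function with a particular signature into an instance of itself, and the converted instance is also an implementer of some single-method interface type

In the example above, the adapter function type http.HandlerFunc quickly converts the ordinary function greetings into a type that implements the http.Handler interface. After the conversion, its instance can be passed as an argument, achieving interface-based composition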

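Middleware is the product of combining wrapper functions with an adapter function type

So-called middleware (such as logHandler and authHandler) is essentially a wrapper function (supporting chained calls), but internally it uses the adapter function type (http.HandlerFunc) to convert an ordinary function (such as the anonymous functions in the example) into an instance of a type that implements http.Handler, and returns that instance as its return value.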

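Unit tests are self-contained and self-running; they generally do not depend on external resources at run time (such as an external database or an external mail server) and are repeatable across environments (for example, they can run both on a developer's local machine and in a continuous integration environment).

An interface is by nature a contract and naturally reduces coupling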

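◆ Part VI Concurrent Programming

Concurrency is not parallelism. Concurrency is about structure; parallelism is about execution.
-- Rob Pike, father of the Go language

Compared with traditional operating system threads, goroutines have the following advantages.
1) Low resource usage: the initial stack size of each goroutine is only 2KB.
// $GOROOT/src/runtime/stack.go
const (
    …
    // minimum stack size used by Go code
    _StackMin = 2048
)

2) They are scheduled by the Go runtime rather than the operating system, so a goroutine context switch is cheap.
3) Native language support: a goroutine is created by the go keyword followed by a function or method, and the goroutine exits when that function or method returns, which makes for a better development experience.
4) The language has built-in channels as the communication primitive between goroutines, providing strong support for concurrent design.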

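An obvious way to improve this is to make these stages run “at the same time”, like an assembly line. This is concurrency (see Figure 31-5).

Concurrency belongs to the design and implementation phase of a program; parallelism belongs to its execution phase.

After discovering the shortcomings of the GM model, Dmitry Vyukov personally reworked the goroutine scheduler and implemented the GPM scheduling model and the work-stealing algorithm [1] in Go 1.1; this model is still in use today.

Any problem in computer science can be solved by adding another layer of indirection

Do not communicate by sharing memory; instead, share memory by communicating.
-- Rob Pike, father of the Go language

Go has always recommended building concurrent programs in the style of the CSP model

• goroutine: corresponds to P in the CSP model. It encapsulates the data-processing logic and is the basic unit of execution scheduled by the Go runtime.
• channel: corresponds to the input/output primitives in the CSP model; used for communication and synchronization between goroutines.
• select: handles multiplexed input/output, letting a goroutine coordinate several channel operations at once.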

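When a goroutine's function returns, the goroutine exits

Some long-running background service programs, however, may require their goroutines to exit gracefully

The sync.WaitGroup provided by Go can be used to implement the pattern of waiting for multiple goroutines to exit

One property of Go channels is that when a channel is closed with the close function, every goroutine blocked on that channel is notified

In the code above, the remaining time (left) is passed each time to the Shutdown method of the next goroutine to be run. select likewise uses left as the timeout value (the timer's period is reset with timer.Reset). Compared with ConcurrentShutdown, SequentialShutdown is simpler, so it is not described in detail here.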

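  1. Pipeline pattern
    When Go beginners first see channel, the concurrency primitive Go provides, they readily think of the pipe mechanism on Unix/Linux. Below is a command that uses a pipe to filter the list of files ending in ".go" under the current path:
    $ls -l|grep ".go"
    The Unix/Linux pipe mechanism passes the output of one program as input to the next; in the command above, for example, the output of ls -l is passed through the pipe to grep.
    The pipe is a typical concurrent program design pattern on Unix/Linux and a concrete expression of Unix's “composition” design philosophy. Go does not define pipes, but its creators, with their deep Unix background, clearly borrowed from Unix's design philosophy and introduced the channel concurrency primitive, which makes building the pipeline concurrency pattern easy and natural, as shown in Figure 33-3.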
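A minimal sketch of the pipeline pattern built from channels. The stage names generate, filterEven, square and sum are chosen for this illustration: each stage runs in its own goroutine, reads from an inbound channel and writes to an outbound channel, much like programs joined by a Unix pipe.

```go
package main

import "fmt"

// generate emits the given numbers on a channel and then closes it.
func generate(nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, n := range nums {
			out <- n
		}
	}()
	return out
}

// filterEven passes on only the even numbers it receives.
func filterEven(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			if n%2 == 0 {
				out <- n
			}
		}
	}()
	return out
}

// square passes on the square of each number it receives.
func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			out <- n * n
		}
	}()
	return out
}

// sum drains the channel and returns the total.
func sum(in <-chan int) int {
	total := 0
	for n := range in {
		total += n
	}
	return total
}

func main() {
	// Comparable to: generate | filterEven | square | sum
	fmt.Println(sum(square(filterEven(generate(1, 2, 3, 4, 5, 6))))) // 56
}
```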

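case y, ok := <-c2: // receive from channel c2 and use ok to tell whether c2 has been closed

In the example above, the main goroutine creates a group of 5 worker goroutines, which block on an unbuffered channel named groupSignal once started. The main goroutine broadcasts the “start working” signal to all worker goroutines with close(groupSignal)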
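A sketch of this one-to-many notification, assuming a hypothetical function runWorkers; the channel name groupSignal follows the text, and a sync.WaitGroup waits for all workers to exit.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts n workers that block on an unbuffered signal channel,
// releases them all with a single close, and waits until every worker is done.
// It returns the number of workers that finished.
func runWorkers(n int) int {
	groupSignal := make(chan struct{})
	var wg sync.WaitGroup
	var mu sync.Mutex
	finished := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-groupSignal // blocks until the channel is closed
			mu.Lock()
			finished++
			mu.Unlock()
		}()
	}

	close(groupSignal) // one close wakes every blocked receiver
	wg.Wait()
	return finished
}

func main() {
	fmt.Println(runWorkers(5)) // 5
}
```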

func Increase() int {
    cter.Lock()
    defer cter.Unlock()
    cter.i++
    return cter.i
}

func main() {
    for i := 0; i < 10; i++ {
        go func(i int) {
            v := Increase()
            fmt.Printf("goroutine-%d: current counter value is %d\n", i, v)
        }(i)
    }
    time.Sleep(5 * time.Second)
}
Below is the implementation after replacing the lock with an unbuffered channel:
// chapter6/sources/go-channel-case-6.go
type counter struct {
    c chan int
    i int
}

var cter counter

func InitCounter() {
    cter = counter{
        c: make(chan int),

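If s is of type chan T, len(s) has one of two meanings depending on the kind of channel:
◦ when s is an unbuffered channel, len(s) always returns 0;
◦ when s is a buffered channel, len(s) returns the number of elements currently in channel s that have not yet been read.

The default branch of a select statement runs when none of the other branches can be chosen because their communications are not ready, which gives the default branch the property of “avoiding blocking”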
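A small sketch of a non-blocking send built on the default branch; trySend is a made-up helper. The example also shows len on a buffered channel.

```go
package main

import "fmt"

// trySend attempts to send v on c without blocking. It reports whether the
// value was sent; the default branch runs when the send is not ready.
func trySend(c chan<- int, v int) bool {
	select {
	case c <- v:
		return true
	default:
		return false
	}
}

func main() {
	c := make(chan int, 1)
	fmt.Println(trySend(c, 1)) // true: the buffer has room
	fmt.Println(trySend(c, 2)) // false: the buffer is full, so default runs
	fmt.Println(len(c))        // 1: one element has not been read yet
}
```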

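The channel primitive for the CSP concurrency model and the primitives in the sync package for the traditional shared-memory concurrency model already cover 99.9% of the synchronization needs in the concurrent design of Go applications; the remaining 0.1% can be met with the atomic package from the Go standard library.

Atomic operations are supported directly by the underlying hardware; they are instruction-level “transactions” implemented in hardware

The atomic package is better suited to scenarios that are highly performance-sensitive, highly concurrent, and read-heavy with few writes.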
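A minimal sketch of a counter updated with the sync/atomic package instead of a lock; countConcurrently is an illustrative name.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countConcurrently increments a shared counter from n goroutines using
// atomic operations, then returns the final value.
func countConcurrently(n int) int64 {
	var total int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&total, 1) // atomic read-modify-write, no mutex needed
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&total)
}

func main() {
	fmt.Println(countConcurrently(1000)) // 1000
}
```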

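◆ Part VII Error Handling

Bjarne Stroustrup, the creator of C++, once said: “There are only two kinds of languages: the ones people complain about and the ones nobody uses.”

The designers of Go chose the classic error mechanism of the C language family: errors are values, and error handling is decision-making based on comparing values

Errors are values, simply presented uniformly as variables of the error interface type (by convention, a function or method usually places its error return value at the end of the return value list)

In Go 1.13 and later, when %w is used in the format string, the underlying type of the error value returned by fmt.Errorf is *fmt.wrapError

Compared with errorString, wrapError additionally implements the Unwrap method, which allows the error value wrapped by a wrapError to be inspected within the wrapped error chain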

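The net package in the standard library defines an error type that carries additional error context

The lower the stack frame the code is in (the closer to the main function's frame), the less common if err != nil is; conversely, the higher the code sits in the stack (the closer to network I/O operations or operating system API calls), the more common if err != nil is

panic and recover reduced function call performance by about 90%

Go provides panic specifically for handling exceptions, and we recommend not using panic for normal error handling

  1. Acting as an assertion to flag potential bugs

For each connection, the http package starts a separate goroutine to run the handler function passed in by the user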

39.3 Understanding panic output
From the earlier description we know that in the Go standard library most panics act like assertions. Every time a panic crashes a program, the program prints a large amount of information that can help the programmer locate the bug quickly. How should this information be read? Here we use the panic output from a real incident to explain.
Below is an excerpt of the actual output when a certain program panicked:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8ca449]

goroutine 266900 [running]:
pkg.tonybai.com/smspush/vendor/github.com/bigwhite/gocmpp.(*Client).Connect(0xc42040c7f0, 0xc4203d29c0, 0x11, 0xc420423256, 0x6, 0xc420423260, 0x8, 0x37e11d600, 0x0, 0x0)
    /root/.go/src/pkg.tonybai.com/smspush/vendor/github.com/bigwhite/gocmpp/client.go:79 +0x239
pkg.tonybai.com/smspush/pkg/pushd/pusher.cmpp2Login(0xc4203d29c0, 0x11, 0xc420423256, 0x6, 0xc420423260, 0x8, 0x37e11d600, 0xc4203d29c0, 0x11, 0x73)
    /root/.go/src/pkg.tonybai.com/smspush/pkg/pushd/pusher/cmpp2_handler.go:25 +0x9a
pkg.tonybai.com/smspush/pkg/pushd/pusher.newCMPP2Loop(0xc42071f800, 0x4, 0xaaecd8)
    /root/.go/src/pkg.tonybai.com/smspush/pkg/pushd/pusher/cmpp2_handler.go:65 +0x226

This article is reposted from: https://blog.frytea.com/archives/680/