Maybe I Won’t Delete You: Duplicate Code Doesn’t Equal Bad Smell

January 04 00:00~01:10

Foreword

The Lunar New Year falls in January this year, leaving only a few working days in the month. When I planned Teddy Software’s 2023 class schedule at the end of last year, I simply left all of January open. I had originally hoped to sneak in another trip abroad, but I was too busy at the end of the year to arrange one, so I am using the time to study at home instead.

I have been reading several books at once over the past few days, and one of them, “Five Lines of Code: How and when to refactor”, is very interesting. It is a book about refactoring. Unlike traditional refactoring books, it uses Rule-Based Refactoring: it gives developers several very specific rules that tell them when and how to refactor. For example, the first rule is the title of the book: a method should not contain more than five lines of code (excluding brackets).

Today Teddy wants to talk about another problem from the book that comes up often in refactoring: Duplicated Code. Traditionally, duplicated code is considered a Bad Smell that should be eradicated, and this is basically uncontroversial. If a program contains a lot of duplicated code and a requirement change touches it, the developer has to modify every copy, or the system will behave incorrectly. That leads to another bad smell: Shotgun Surgery.

***

Duplicate code revisited

Over the past few years, while learning Clean Architecture and microservices, Teddy found that duplicating code is a very common way to avoid dependencies between modules. For example, Clean Architecture’s rule for crossing layer boundaries requires that objects of the Entities Layer be converted into another data structure when they leave the Use Cases Layer: a DTO (Data Transfer Object) if they are passed to the front end, or a PO (Persistent Object) if they are passed to the database. DTOs and POs are not exactly the same as their corresponding Entities Layer objects, and the only duplicated part is the data, but in a broad sense this can also be regarded as a kind of duplicated code.
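The entity, DTO, and PO relationship described above can be sketched roughly as follows. This is only a minimal illustration: `Card`, `CardDto`, `CardPo`, and the two mapping functions are hypothetical names, not taken from ezKanban or from any Clean Architecture reference implementation.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """Entities Layer object (hypothetical): holds data and behavior."""
    card_id: str
    title: str
    done: bool = False

    def complete(self) -> None:
        self.done = True


@dataclass(frozen=True)
class CardDto:
    """Shape sent to the front end: data only, no behavior."""
    card_id: str
    title: str
    status: str


@dataclass(frozen=True)
class CardPo:
    """Shape stored in the database: data only, column-friendly types."""
    id: str
    title: str
    is_done: int


def to_dto(card: Card) -> CardDto:
    # The fields repeat the entity's data, but the DTO can change
    # without touching the entity (and vice versa).
    return CardDto(card.card_id, card.title, "done" if card.done else "open")


def to_po(card: Card) -> CardPo:
    return CardPo(card.card_id, card.title, 1 if card.done else 0)
```

The three classes repeat the same data on purpose: a change to the database columns or to the front-end payload stays inside its own structure and its own mapping function.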

The book “Five Lines of Code: How and when to refactor” explains this very well:

Sharing code increases global behavior-change velocity, while duplicating code increases local behavior-change velocity.

Take a given piece of functionality. If all of its code is shared (no duplication), then when the system’s behavior changes, only one place needs to change, so global behavior-change is fast. Conversely, if the code is copied each time it is used, each copy is independent of the original and the coupling is gone. In that case, local behavior-change is fast.
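A small sketch of the trade-off, using made-up function names (`format_price`, `invoice_line`, `receipt_line`, and `report_line` are all assumptions for illustration):

```python
def format_price(amount: float) -> str:
    """Shared helper: one edit here changes every caller (global change)."""
    return f"${amount:.2f}"


def invoice_line(amount: float) -> str:
    return "Total: " + format_price(amount)


def receipt_line(amount: float) -> str:
    return "Paid: " + format_price(amount)


def report_line(amount: float) -> str:
    """Started life as a copy of format_price, then diverged.

    Because it is a copy, the report could switch to rounded amounts with
    thousands separators without affecting invoices or receipts (local change).
    """
    return f"Revenue: ${amount:,.0f}"
```

Changing the currency symbol everywhere is a one-line edit for the two shared callers, but `report_line` has to be found and edited separately. That is the cost paid for its freedom to change on its own.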

A biological analogy helps. Isolated regions such as islands, deserts, or mountains often develop endemic species. These organisms share a common origin, but they gradually evolve different traits to meet local needs. Code often behaves the same way, because a piece of shared code may have to satisfy different local requirements (the book calls them local invariants) at the same time. If you insist on sharing at that point, you may have to go back and modify the shared code so that it can adapt to each local requirement (through configuration or dependency injection). That makes the shared code more complex, and it may eventually become complex enough to hurt readability, which in turn hurts modifiability. Sharing was supposed to speed up global behavior-change, but in this situation it may slow it down instead.
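To make this concrete, here is a hedged sketch of one shared function that has accumulated configuration flags to serve two callers, next to two small specialized copies. All names and flags are hypothetical.

```python
def shared_summary(name: str, days: int, *, uppercase: bool = False,
                   unit: str = "days", hide_zero: bool = False) -> str:
    """Shared version: every local requirement became another parameter."""
    if hide_zero and days == 0:
        return ""
    label = name.upper() if uppercase else name
    return f"{label}: {days} {unit}"


def board_summary(name: str, days: int) -> str:
    """Duplicated version for the board view: no flags needed."""
    return f"{name}: {days} days"


def email_summary(name: str, days: int) -> str:
    """Duplicated version for e-mail: its own local rules, stated directly."""
    if days == 0:
        return ""
    return f"{name.upper()}: {days} d"
```

With only two callers, the shared function already needs three flags. Each additional caller with its own rule tends to add another one, which is the growth in complexity the paragraph above warns about.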

***

What about Shotgun Surgery?

To be clear, Teddy is not claiming that blindly using “copy and paste” to remove dependencies between modules is a great idea. Whether to remove a dependency through duplication still has to be judged from two perspectives: software architecture and boundaries. If the duplication occurs within a single method, it is almost certainly a bad smell, and you can remove it with Extract Method. If the duplication occurs within a single package, it is also very likely a bad smell. However, as Teddy mentioned above, under Clean Architecture’s rule for crossing layers, the duplication sits on a boundary between architectural layers. There, using duplication to remove dependencies is much less controversial.
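For the first case, duplication inside a single method, Extract Method looks roughly like this. The functions are invented for illustration.

```python
from typing import List


def report_before(orders: List[int], refunds: List[int]) -> List[str]:
    """Before: the same summing loop appears twice in one method."""
    lines = []
    total = 0
    for amount in orders:
        total += amount
    lines.append(f"orders: {total}")
    total = 0
    for amount in refunds:
        total += amount
    lines.append(f"refunds: {total}")
    return lines


def summarize(label: str, amounts: List[int]) -> str:
    """The extracted method: the repeated block, parameterized."""
    return f"{label}: {sum(amounts)}"


def report_after(orders: List[int], refunds: List[int]) -> List[str]:
    """After: both copies are replaced by calls to the extracted method."""
    return [summarize("orders", orders), summarize("refunds", refunds)]
```

Here the two copies live in the same method and always change together, so there is no boundary for the duplication to protect.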

Another common case arises in a microservice architecture: a downstream microservice listens to the events of an upstream microservice and builds a local Read Model, which removes the runtime dependency between the two services.
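A minimal in-memory sketch of such a read model is shown below. The event type `UserRenamed` and the class `UserNameReadModel` are assumptions made for this example; a real system would receive events through a message broker and persist the read model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class UserRenamed:
    """Event published by the upstream service (hypothetical)."""
    user_id: str
    new_name: str


@dataclass
class UserNameReadModel:
    """Local copy of upstream data, kept by the downstream service."""
    names: Dict[str, str] = field(default_factory=dict)

    def on_event(self, event: UserRenamed) -> None:
        # Each event updates the local copy.
        self.names[event.user_id] = event.new_name

    def name_of(self, user_id: str) -> str:
        # Answered from local data: no call to the upstream service.
        return self.names.get(user_id, "(unknown)")
```

The user names are stored twice, once upstream and once in the read model, but the downstream service can keep answering queries even when the upstream service is unavailable.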

***

Conclusion

While developing ezKanban, Teddy also faced many design decisions of the form “share, or duplicate to remove the dependency?”. For example, ezKanban supports two ways of storing state, Event Sourcing and State Sourcing. ezKanban’s reports supported State Sourcing first, and Event Sourcing was added later. To support both storage methods, the team applied the Pluggable Adapter design pattern in the Repository implementation. The whole design was simple and easy to understand.

After development was finished, the team found that for many reports, apart from how the data was obtained (one used SQL, the other processed event streams), the calculation logic that produced the report was roughly the same. Removing this duplication took a lot of refactoring time. After the refactoring, the duplicated code was gone, but the price was a more indirect design (because of an additional layer of abstract interface) that is less intuitive.
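The shape of that refactoring might look something like the sketch below. It is not ezKanban’s actual code: the `CycleTimeSource` interface, the two source classes, and the cycle-time report are all invented to show where the extra layer of indirection comes from.

```python
from typing import Dict, List, Protocol, Tuple


class CycleTimeSource(Protocol):
    """The added abstraction: reports depend on this, not on storage."""
    def cycle_times(self) -> List[int]: ...


class StateSource:
    """Stands in for the State Sourcing side; rows mimic a SQL result."""
    def __init__(self, rows: List[Dict[str, int]]) -> None:
        self.rows = rows

    def cycle_times(self) -> List[int]:
        return [r["finished_day"] - r["started_day"] for r in self.rows]


class EventSource:
    """Stands in for the Event Sourcing side; replays (kind, card, day)."""
    def __init__(self, events: List[Tuple[str, str, int]]) -> None:
        self.events = events

    def cycle_times(self) -> List[int]:
        started: Dict[str, int] = {}
        result: List[int] = []
        for kind, card_id, day in self.events:
            if kind == "started":
                started[card_id] = day
            elif kind == "finished" and card_id in started:
                result.append(day - started.pop(card_id))
        return result


def average_cycle_time(source: CycleTimeSource) -> float:
    """The shared calculation, written once for both storage methods."""
    times = source.cycle_times()
    return sum(times) / len(times) if times else 0.0
```

The calculation now exists only once, but a reader has to follow the interface to find out where the numbers come from, which is the loss of directness described above.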

Teddy finds this an interesting topic, and it is a good reason to re-examine “code sharing”. Sharing is not necessarily good; just think about your middle platform… XD.

***

Yuzo’s inner monologue: To share, or not to share, that is the question.

This article is transferred from https://teddy-chen-tw.blogspot.com/2023/01/blog-post.html