World-class programming master Uncle Bob's defense of "clean code" questioned: times have changed, stop telling us to write Clean Code!

Compilation | Chu Xingjuan, Nuka-Cola

I don't think my "Clean Code" series of tutorials is really that bad.

Not long ago, Casey Muratori, a veteran game engine developer, published an article on "clean" code and poor performance, which sparked widespread discussion among developers. Casey also posted a 20-plus-minute video: https://www.youtube.com/watch?v=tD5NrevFtbU

Subsequently, Robert C. Martin, author of the classic "Clean Code" (a professional programmer since the early 1970s, a world-class programming master, a pioneer of design patterns and agile development, affectionately known to younger programmers as "Uncle Bob"), joined the debate between "clean code" and performance. He tweeted:

Someone recently equated Clean Code with over-engineering. That is, of course, an oxymoron. Over-engineered code is, by definition, not clean. It makes me wonder whether those who complain so loudly have actually studied what they are complaining about.

In response, one netizen asked: is it over-engineering to decompose a 150-line function into a bunch of small methods called only by that function? Uncle Bob replied, "It all depends on the engineer's goals. If it is for readability and expressiveness, such a decomposition is optional. But if it is for performance, the decomposition may not be optimal."
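To make the decomposition being debated concrete, here is a hypothetical sketch (the function, field names, and helpers are invented for illustration, not taken from the exchange): one function that does everything inline, and the same logic split into small helpers used only by it.

```python
# Hypothetical example of the decomposition under debate.

def summarize(orders):
    """Original style: validation, totaling, and formatting all inline."""
    valid = [o for o in orders if o["qty"] > 0 and o["price"] >= 0]
    total = sum(o["qty"] * o["price"] for o in valid)
    return f"{len(valid)} orders, total ${total:.2f}"

# Decomposed style: each step extracted into a named helper
# that exists only to serve summarize_decomposed.
def _valid_orders(orders):
    return [o for o in orders if o["qty"] > 0 and o["price"] >= 0]

def _order_total(orders):
    return sum(o["qty"] * o["price"] for o in orders)

def summarize_decomposed(orders):
    valid = _valid_orders(orders)
    return f"{len(valid)} orders, total ${_order_total(valid):.2f}"
```

Both versions compute the same result; the disagreement is only about whether the extra names aid readability or merely add call overhead and indirection.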

Uncle Bob then posted a series of comments, adding that he had forgotten to @ Casey, and the two began a back-and-forth "conversation." Netizens who followed their exchange remarked: "Almost every topic involving Casey Muratori and Jonathan Blow (note: well-known independent game developers) can be summarized as follows:

    • Casey and Jon Blow work in the games specialty, and they are both very good at what they do.

    • They consider the truths of their own domain to be universal.

    • Programming techniques in the HTTP-server world differ from those in game engines and rendering, so comparing them head-to-head is meaningless, yet each side considers the other's techniques "wrong." Most HTTP server programming is "stateless," and when game-development techniques are pushed in front of those programmers, everyone starts to lose their minds. Both are right in a sense, but each believes the other is wrong. Both sides think they are discussing the same topic when in fact they are discussing different topics.

    • Repeat infinitely.

Finally, the netizen added something that must have saddened Bob: "That said, none of this is a defense of Clean Code. That book is very bad."

Later, other netizens began to weigh in on Uncle Bob's "Clean Code": "I'm not from my parents' generation, but I have read this book and have one suggestion: avoid it. Strictly speaking, the book isn't all bad. Most of it is fairly reasonable, and as far as I recall some of the advice is actually good. The problem is that the only readers who can tell the good advice from the bad are the ones who don't need the book at all. The rest of us are doomed to take it at face value, and we end up as steadfast zealots blindly following principles that make our programs 3-5x bigger than they should be (no exaggeration)."

What follows is the "dialogue" between the two around Clean Code, each laying out what they think reasonable "clean code" should look like, and each plainly dissatisfied with the other.

First round

Casey: We're just not on the same page.

Uncle Bob: I don't feel that way. Some of what you said was inaccurate, but it doesn't matter anymore.

Casey: Before we go further, let me clarify something. Most of the clean-code practices you mention I also covered in the video, such as preferring inheritance hierarchies to if/switch statements, and not exposing object internals (the "Law of Demeter"). Yet you seem surprised by what I said, so before we formally discuss type design, can you explain that first? Only then can I understand why we keep talking past each other.
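For readers unfamiliar with the practice Casey names here, a hedged sketch (shapes and names invented for illustration; Casey's own video uses C++) of the two styles under debate: a class hierarchy dispatching through a polymorphic method, versus plain data with one function that switches on a type tag.

```python
import math

# Style 1: the "clean code" way, an inheritance hierarchy with polymorphism.
class Shape:
    def area(self):
        raise NotImplementedError

class Rectangle(Shape):
    def __init__(self, w, h):
        self.w, self.h = w, h

    def area(self):
        return self.w * self.h

class Circle(Shape):
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r ** 2

# Style 2: the switch-style alternative, plain tuples plus one dispatch function.
def area_of(shape):
    kind = shape[0]
    if kind == "rect":
        return shape[1] * shape[2]
    elif kind == "circle":
        return math.pi * shape[1] ** 2
    raise ValueError(kind)
```

Both compute the same areas; the debate is about which style is more maintainable and which is faster.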

Uncle Bob: Not on the same page? I don't feel that way. I only watched the first half of your video, and by then I felt I had the gist. I sent a reply saying your analysis was basically correct. I also said some of your wording about "clean code" was inaccurate, but I don't remember where, and it doesn't matter anyway.

What I'm trying to say is that the structure you showed is not the kind of design that squeezes out every last nanosecond. In fact, such structures may waste many nanoseconds, or were never meant to operate at the nanosecond level at all. Back when every bit of headroom counted, we budgeted function-call overhead very carefully. We would even unroll loops where we could, especially in embedded real-time environments.

Today, however, such environments are rare. Most software systems use less than 1% of a modern processor's capacity. Moreover, processors are cheap and plentiful. These facts have changed the basic calculus of development: the focus has shifted from program performance to developer efficiency, to how quickly developers can build systems and keep them running reliably. It is these needs that made the idea of "clean code" take off.

For most organizations, saving programmer time is more economically valuable than saving CPU cycles. So if we really "aren't on the same page," it is probably about priorities. Clean code is useless if you want to squeeze out every nanosecond, but if you want to maximize your team's productivity per man-hour, clean code is often an effective strategy.

Casey, obviously not yet convinced: Can you be more specific, so we avoid misunderstandings? Can you give concrete software examples? Say we're both familiar with Visual Studio and CLANG/LLVM: do those fit your earlier claim that the vast majority of software uses less than 1% of a modern processor's resources?

Uncle Bob: No, I think an IDE is a very specialized software system, a rare case.

IDEs are interesting because they span such a wide range of situations. Some parts must be trimmed to the nanosecond, while others don't care about performance fluctuations at all. A modern IDE must parse huge amounts of code as the user types; that parsing is performance-critical so it can keep up with the developer's typing speed. The code behind a configuration dialog, on the other hand, doesn't need to be efficient at all.

Incidentally, the efficiency an IDE's parsing engine needs is more about algorithmic efficiency than tight loops. Tightening a loop might buy you an order of magnitude, but choosing the right algorithm can buy several orders of magnitude.
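A toy illustration of Bob's point (the example is mine, not from the exchange): two functions that find values appearing more than once, one rescanning the whole list per element, one making a single pass with a hash table. No amount of loop tweaking makes the first approach competitive with the second on large inputs.

```python
def duplicates_quadratic(values):
    """O(n^2): list.count rescans the whole list for every element."""
    return sorted({v for v in values if values.count(v) > 1})

def duplicates_linear(values):
    """O(n): one pass, remembering what we've seen in a hash set."""
    seen, dups = set(), set()
    for v in values:
        if v in seen:
            dups.add(v)
        seen.add(v)
    return sorted(dups)
```

For a list of a million elements, the quadratic version does on the order of 10^12 comparisons while the linear one does about 10^6 set operations: the "several orders of magnitude" Bob describes.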

Also, the kind of software I said uses less than 1% of a modern processor is the everyday software programmers write in bulk: a website, a calendar app, a process-control dashboard (for a simple process), and so on. In fact, almost any Rails application, any Python or Ruby application, and even most Java applications fall into this category. None of them needs performance pushed to the limit.

My current language of choice is Clojure, which runs at perhaps 1/30 the speed of equivalent Java and maybe 1/60 the speed of equivalent C. But I don't care; I can always drop down to Java if necessary. And for many applications, switching to a faster processor is the cheapest and easiest solution. In short, I think saving programmers' time is where the real cost savings are today.

But don't get me wrong. I'm also an old assembly and C hand who grew up in the 1970s and 1980s. When necessary, I will still carefully count differences at the microsecond level (nanoseconds are too fine for humans to really grasp). So I know the value of tight loops. But today's processors are tens of thousands of times faster than the machines we used then, so for most software systems today, we would rather "waste" a few CPU cycles in exchange for a programmer's happier life.

Second round

Casey: So you mean XXX?

Uncle Bob: I don’t quite agree.

Casey: If I understand correctly, you're saying software can be divided into two categories, and each case must be classified before it can be analyzed. From that perspective, most of the software I use daily actually falls into the "every nanosecond counts" category: Visual Studio, LLVM, GCC, Microsoft Word, PowerPoint, Excel, Firefox, Chrome, FFmpeg, TensorFlow, Linux, Windows, macOS, OpenSSL, and so on. Do you agree that these programs require a strong focus on performance?

Uncle Bob: Not entirely. On the contrary, my experience is that most software needs to be subdivided further. Some modules must execute in nanosecond cycles, while others can tolerate microseconds, milliseconds, or even longer response times. Yes, some modules are fine even with response times of a second or so.

Most applications are composed of multiple modules serving different concerns. Chrome, for example, must render quickly; when laying out a complex web page, every microsecond counts. The Chrome preferences dialog, on the other hand, is hardly performance-sensitive at all and can happily respond in milliseconds.

If we plotted a histogram of response-time requirements for the individual modules of a given application, we would see some kind of non-normal distribution. Some applications may contain many nanosecond-level modules and a few millisecond-level ones, while others may have mostly millisecond-level modules and only a few at the nanosecond level.

For example, in the application I'm working on now, the vast majority of modules are fine with millisecond responses, but a few need roughly 20 times that performance. My strategy is to write the millisecond modules in Clojure, because while it's slow, it's a very convenient language. The faster modules I write in Java, which is quicker but less convenient.

Some languages and structures abstract away the bare metal, helping programmers focus on the problem itself. Freed from worrying about L2 cache hit rates, a programmer can write millisecond-level code far more efficiently and pay more attention to business requirements, and to whether someone else will be able to understand and maintain the code when they take over the project years later.

Other languages and structures map more directly onto the bare metal, making it easier for programmers to squeeze out the last bits of performance at the algorithmic limit. Such structures tend to be harder to write, read, and maintain, but if you're building something that truly needs nanosecond performance, you live with that.

Of course, these are the two extremes; most software and languages sit somewhere in between. So as developers, we should understand these environments and know which one best fits the problem at hand.

About ten years ago, I wrote a book called "Clean Code." It focuses more on millisecond-level issues than nanosecond-level ones, because in my view programmer productivity was, and still is, the more pressing concern. The book devotes a whole chapter to the trade-offs between polymorphism and switch statements. Let me quote its concluding line again: "Mature programmers know that the idea that everything is an object is a myth. Sometimes you really do want simple data structures with procedures operating on them."

Casey: Well, then let me adjust my statement again. Visual Studio, LLVM, GCC, Microsoft Word, PowerPoint, Excel, Firefox, Chrome, FFmpeg, TensorFlow, Linux, Windows, macOS, and OpenSSL: for each of these, "millisecond performance is enough" for at least some modules, right?

Uncle Bob: Milliseconds? Of course. I'd also grant that these programs contain many microsecond and even nanosecond modules. But much of the time, milliseconds suffice.

Third round

Casey: Okay, let’s talk about something else.

Uncle Bob: Let me offer you a little "friendly reminder."

Casey: Great, it looks like we agree on the software categories. Now I'd like to pin down the coding practices you've been describing. Since the practices you discuss also apply to software like LLVM, I'll use it as a stand-in. LLVM happens to be open source, so we know exactly how it works and how it's built (unlike Visual Studio).

I think your replies, your book, and your lectures all emphasize the same point: when programming large software like LLVM, programmers needn't care much about performance; they should focus instead on development productivity. If the scenario is limited to something like the simple calendar app you mentioned, fine. But in LLVM there really are places where "nanoseconds/microseconds/milliseconds matter," so sooner or later programmers must think hard about performance optimization, or they will certainly find the program runs too slowly.

Suppose someone uses LLVM to build a really large program, like Unreal Engine or Chrome. In that case, if performance problems show up in some isolated parts of the code (the "modules" you mentioned earlier), those parts would surely have to be rewritten with performance in mind.

That's my reading of your earlier statements, including "if my Clojure code is too slow, I can always switch to Java," meaning that if a particular part needs more performance, you rewrite it in Java.

Is my understanding correct?

The next day, before answering the question, Uncle Bob said:

By the way, I went back and watched your full video the other day. I figured that since we had decided to have this discussion, I should really take a hard look at your argument. The conclusion first: I disagree with some of your remarks, and I'll add a few "friendly reminders."

Uncle Bob: First of all, it's really cool that the areas of these geometric shapes can all be computed by the same basic formula (K x L x W). Its beauty is the kind only programmers and mathematicians can appreciate.

Overall, I think your video does a good job of showing how programmers should find their way in resource-constrained environments. Obviously, in a resource-rich environment, one wouldn't specifically choose the K x L x W solution, not least because you can't be sure other shapes won't be introduced later. Moreover, even if the problem stays within the cases K x L x W handles, the more conventional formulas lower the barrier to understanding for other programmers. This is, after all, an unusual trick, and people tend to get confused by it and even re-derive it to convince themselves. I admit that verification can be a delightful, even "eureka" moment, but offices are busy places, and it's best not to consume precious time needlessly.
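The "K x L x W" trick the two keep referring to can be sketched as follows (Casey's original is in C++; this Python version and its coefficient values are my reconstruction of the idea, not his code): each shape's area is a per-shape coefficient times width times height, with the coefficient looked up in a table, versus the conventional per-shape formulas.

```python
import math

# Conventional per-shape formulas, the "readable" version.
def area_conventional(shape, width, height):
    if shape == "rectangle":
        return width * height
    if shape == "triangle":
        return 0.5 * width * height
    if shape == "circle":
        # width and height are the bounding-box sides, i.e. the diameter
        return math.pi * (width / 2) ** 2
    raise ValueError(shape)

# The "K x L x W" style: one formula, shape-specific coefficients in a table.
# For a circle, area = pi*r^2 = (pi/4) * diameter * diameter, hence K = pi/4.
K = {"rectangle": 1.0, "triangle": 0.5, "circle": math.pi / 4}

def area_table(shape, width, height):
    return K[shape] * width * height
```

The table version replaces branching with a lookup and one multiply, which is what makes it attractive in performance-sensitive code; Bob's objection is that the coefficient table is harder for a newcomer to verify than the familiar formulas.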

I don't know if you've read Don Norman. Long ago he wrote a book called "The Design of Everyday Things," which is well worth reading. In it he offers this rule of thumb: "If you think something is clever and sophisticated, beware: it is probably self-indulgence." In a resource-rich environment, I'd say the K x L x W solution falls into that category.

Uncle Bob: I'm one of the signatories of the Agile Manifesto, and agile folks firmly believe in the importance of up-front architecture and design.

Given the case you describe, I'd have to think it through again, including looking for likely performance problems and paying closer attention to those modules. I might, for example, build a stripped-down module and put it through a stress test to see how it holds up. Of course, my biggest worry is always spending a great deal of time and energy only to discover that, having chosen the wrong approach, the result doesn't meet the customer's needs (that has happened to me too).

All in all, for problems this complex, analyzing a single factor is never enough. There is no one right way, and I have made that point many times in Clean Code.

Fourth round

Casey: After all that, you still didn't answer my question.

Uncle Bob: You have a point; as it happens, I covered this in class yesterday. Thank you.

Casey: Many questions popped into my head while reading your reply. But you ended it really well, so let's start there.

In this conversation you've identified several performance tiers in software architecture: the IDE parser cares about "nanoseconds," so "modules" should be sorted into response-requirement levels such as nanoseconds/microseconds/milliseconds. You've also suggested that a programmer can build a "lite" module first, analyze how it behaves under stress testing, and then write the software so performance stays within an acceptable range. You even choose different languages by performance need: Clojure, Java, and C in your example. In short, your position is that "as developers, we should understand these environments and know which one best fits the problem at hand."

With all that said, I want to return to my original question: why did you seem surprised that I framed "clean code" as the opposite of pursuing performance? You've talked at length, but never addressed that point.

Of course, your book and blog posts do acknowledge the importance of performance. But by volume, everything you've just said makes up a tiny fraction of your prior output. Take, for example, your six-part, multi-hour Clean Code lecture series. In nine hours of content, you never mention anything like your previous reply:

If you really value performance as much as your replies suggest, why not spend even one of those nine hours explaining to the audience the practical significance of performance optimization and up-front design? For example, that code can have performance consequences, so programmers should avoid constructs that harm performance, including the up-front performance tests you mentioned in your reply.

Or, from another angle: do you treat the importance of performance as self-evident, needing no elaboration, and so never stressed it? But won't your audience, unfamiliar with performance, be kept by that omission from thinking about the right question at the right time? After all, you're willing to generalize with "as developers, we should...", so surely it deserves at least a mention?

Uncle Bob: Frankly, I think your criticism is valid. And as it happens, in class yesterday I spent a good deal of time discussing performance costs and productivity gains in the discipline of software development. Thank you for keeping me honest.

Though calling it "as it happens" may not be quite accurate. But it's just a turn of phrase, so there's no need to dwell on it.

You asked whether I feel performance is so obviously important that it needs no emphasis. On reflection, that may indeed be the case. I'm not a performance expert; my specialty is helping software teams efficiently build and maintain large, complex systems, giving them practices, principles, design ideas, and architectural patterns. As the saying goes, "when you're holding a hammer, everything looks like a nail." We all tend to see problems through our own professional lens.

As for why I seldom stress the significance of performance: looked at from the other side, isn't that just another hammer-and-nail story? You're an expert in performance tuning, so you naturally view problems through that lens. It's much the same.

But I will admit this discussion has been more helpful than I expected, and it has shifted my perspective, though not dramatically: I still don't think my "Clean Code" series of tutorials is really that bad. If you watch all nine hours from start to finish, you'll find I mention performance issues many times, and at least two or three of those mentions reach a level of emphasis even you would recognize.

Because, as you guessed, I do think performance issues are important and require forecasting and planning.

Fifth round

Casey: And that's about all I have to say.

Uncle Bob: You've enlightened me about the performance problem caused by overly long single lines.

Casey: To be honest, performance tuning is everything to me 🙂 No kidding: while editing this reply on GitHub, I noticed the page starting to freeze because I'd typed too many lines into one paragraph. It's only a few hundred words, but the system has so many layers that operations which should complete instantly become slow enough to hurt usability. This is one reason I put so much emphasis on performance: right now, even very simple software features are often too slow to be usable. I'm not making this up; see this video I recorded of the page freezing as I type a reply:

I'm on a Zen 2 chip, and it's very fast! So I take every opportunity to preach performance, because it holds the possibility of a better experience. Many organizations would never think in performance tiers like "nanoseconds/microseconds/milliseconds/seconds," but I say: please do. Just getting performance into people's heads and helping them acquire the skills to address it would improve the world considerably.

So I think we've just about covered the topic. If you'd like to keep chatting, we could extend into architecture. That's an even broader space than performance, and if you want to go there, I'm happy to.

Uncle Bob: Your video is pretty extreme; the typing rate can't even be 25 cps. May I ask what browser you're using? I'm on Vivaldi (a Chromium fork), and while it isn't as awful as yours, the input lag is still absurd. So I ran some experiments, and it turns out the delay has nothing to do with file size, only paragraph length: the more I type into the same paragraph, the longer it gets and the worse the lag.

So why does this happen? First, I assume we're both exercising the same JavaScript code; nobody writes in-browser tooling in anything else these days. Second, I doubt the author of that code ever imagined someone putting an entire paragraph on a single line (note the line numbers on the left). Even so, at 25 cps the lag becomes noticeable at around 200-300 characters. What's going on?

Could it be that the programmer used a poor data structure, allocating a new block of memory on every extension and copying the data into it? I recall that's how the old Rogue Wave C++ library handled ever-growing strings. In any case, the delay is absurd.

Of course, this is more an algorithm question than a pure performance question. After all, if software runs too slowly, the first thing to check is the algorithm. But your point stands: the programmers who wrote this code had no idea how their feature would be used, and so handled unexpected loads poorly.
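Bob's guess about the buffer can be made concrete with a small cost model (a hypothetical sketch of the general technique, not Rogue Wave's or GitHub's actual code): count how many bytes get copied when a buffer is reallocated to exact size on every append, versus grown geometrically by doubling.

```python
def naive_append_cost(n_appends):
    """New exact-size block on every append: copy all old contents each time.
    Total copies grow as O(n^2)."""
    size, copied = 0, 0
    for _ in range(n_appends):
        copied += size  # copy the old contents into the new block
        size += 1
    return copied

def geometric_append_cost(n_appends):
    """Double the capacity when full: each byte is copied O(1) times on
    average, so total copies grow as O(n)."""
    size, capacity, copied = 0, 1, 0
    for _ in range(n_appends):
        if size == capacity:
            copied += size  # copy into the new, doubled block
            capacity *= 2
        size += 1
    return copied
```

For 1,000 appended characters the naive scheme copies 499,500 bytes while doubling copies only 1,023, and the gap widens quadratically; a paragraph a few hundred characters long is already enough to feel the difference on every keystroke.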

So maybe I should just hit a carriage return at the end of every line from now on.

In short, your reply made me realize the performance problem caused by overly long single lines. From now on I'll keep each line under 80 characters, so even the lowest-end chip shouldn't freeze.

The conversation between the two ended there. But in a tweet yesterday, Casey said:

I have bad news for you all: there will be... more videos. The article that caused the uproar was just a trimmed excerpt from the prologue of my course. If we're going to talk clean code and performance everywhere, get ready, because I could be doing this for the next month.

One netizen remarked: "Regardless of my personal feelings about Bob or Casey, I really like how two people who disagree can collaborate on a conversation by editing a shared dialogue file. I really want to try this myself now."

The text and pictures in this article are from AI Frontline


This article is transferred from https://www.techug.com/post/uncle-bob-a-world-class-programming-master-was-questioned-in-his-defense-of-clean-code-theb3e89d2ea3a07f7efbb1/
This site only collects and republishes the article; copyright belongs to the original author.