# Revisiting P and NP after 16 years

In 2006, I posted “What are P-Problems, NP-Problems, and NPC-Problems” on my blog (then MSN Space). This is something I jot down when I was doing an informatics competition in my second year of high school, and it was one of the first articles on my blog. Today, I accidentally discovered that this “popular science”, which I can’t wait to rewrite, still has a relatively large amount of reading. time flies. “StarCraft” (StarCraft) has a sequel, the German team beat the host Brazil 7-1, the guy in “The Apprentice” (The Apprentice) became the president, and there was an even bigger epidemic after SARS. It’s already 2022. During these 16 years, I went to college, published a book, married a wife, and raised a child. If I were to write a popular science article on the same topic now, what would I write? Coincidentally, there is some relevant content in my new book “The Magical Calculations: An Idle Book About Algorithms”. I took some excerpts from different chapters in the book, sorted them out, and came up with the following article, which may answer the question just now.

One day, my wife and I went to the supermarket for a big shopping. As always, after checking out, we need to carefully plan the order in which we put things into our shopping bags to prevent them from getting crushed. This is not an easy task, especially considering that there is no necessary connection between the weight of each object and the weight it can bear. Eggs and milk are very heavy, but they are also very afraid of pressure; towels and toilet paper are light, but they can withstand a lot of pressure. So, I suddenly thought of such a problem: given the respective weights of n objects and the maximum weight they can bear, determine whether they can be stacked in a pile, so that all objects will not be crushed (an object does not To be crushed means that the total weight of the objects on it is less than or equal to the maximum weight that it can bear).

It turns out that this is a very interesting question – after listening to this question, my wife doesn’t think about eating and drinking all day, and at night she dreams about the test data she constructed. Here, we might as well give a set of data for everyone to play with. Suppose there are four objects A, B, C, and D, among which: the self-weight of object A is 1, and the maximum load-bearing capacity is 9; the self-weight of object B is 5, and the maximum load-bearing capacity is 2; the self-weight of object C is 2, and the maximum load-bearing capacity is 4 ; Object D has a self-weight of 3 and a maximum load of 12. In this example, the safe stacking is the only way, can you figure it out?

Answer: C is at the top, followed by B, followed by A, and D at the bottom. Note that in this optimal solution, the topmost object is neither the smallest self-weight nor the smallest load-bearing limit, but the sum of its own weight and the load-bearing limit is the smallest. In fact, the four objects in the optimal solution are arranged according to the size of this sum! For an object in a certain stacking scheme, the result of subtracting the actual load from its maximum load may be called its safety factor. We can show that this stacking strategy, ordered by the sum of its own weight and carrying capacity, makes the most dangerous objects as safe as possible, that is, the smallest safety factor is maximized. If the safety factor of all objects is non-negative at this time, then we have a stacking scheme that satisfies the requirements; if there is still a negative safety factor at this time, then we will never find a way to make all objects safe the stacking scheme.

Assuming that in a certain stacking scheme, there are two adjacent objects, the weight and maximum load of the object above are W 1 and P 1 respectively, and the weight and maximum load of the object below are W 2 and P 2 respectively . Suppose again that the sum of the weights of all objects above them is W, which is the weight that both objects have to bear. If W 1 + P 1 is greater than W 2 + P 2 , then swap the positions of these two objects, what will happen? The original safety factor of the object below was P 2 − W − W 1 , and after moving to the top, the safety factor becomes P 2 − W, which undoubtedly makes it safer. The original safety factor of the object above was P 1 − W. After moving to the bottom, the safety factor becomes P 1 − W − W 2 . Although this value is reduced, it is still greater than the original safety factor P 2 of another object. − W − W 1 (the assumption of W 1 + P 1 > W 2 + P 2 is used here). Therefore, after exchanging two objects, instead of refreshing the lower limit of the safety factor, it can sometimes improve it upwards.

So, we can keep swapping objects with a smaller sum of their own weight and carrying capacity onto it, and it won’t make the situation worse anyway. The optimal solution finally obtained is consistent with the conclusions we gave earlier.

In order to solve a certain type of problem, we often need to give a strategy, or a plan, or a processing procedure, or a series of operating rules, or more appropriately, a set of calculation methods. This is called an “algorithm”.

Let’s wrap it up. Our problem is: given the respective weights of n objects, and the maximum weight each object can bear, determine whether they can be stacked in a stack so that all objects will not be crushed. Its algorithm is: sort by the sum of its own weight and the maximum load, and then check whether this can keep all objects from being crushed, and its answer determines the answer to the whole question. With the algorithm, the actual calculation process is generally handed over to the computer to complete. The job of a programmer is to write code and tell the computer about the algorithm.

It’s easy to have the computer add up each object’s own weight and its maximum weight. If there are n objects, then there are n additions. When sorting them, we do the following in sequence:

• Go through n numbers one by one and find the smallest number (n operations)
• Put the smallest number in the first position (1 operation)
• Go through the remaining n − 1 numbers one by one to find the smallest number (n − 1 operations)
• Put the smallest number in the 2nd position (1 operation)

This takes n + 1 + (n − 1) + 1 + … + 1 + 1 = (n 2 + 3n) / 2 operations. How an object swings from top to bottom is now known. In order to check whether something is crushed, it is necessary to accumulate the weight of the object from top to bottom, and compare it with the load-bearing of the next object while adding. Considering that the topmost object does not need to be tested, let’s say the number of operations is 2(n − 1). In this case, the entire algorithm requires (n 2 + 9n) / 2 − 2 operations. Of course, when writing a program, a lot of extra operations may be required in some details. However, for extremely fast computers, this is negligible.

Computers can easily process thousands of data, so one or two more operations will not fundamentally affect the overhead of the entire processing process. In order to evaluate the efficiency of processing data, we should pay more attention to the “level” of the total number of operations. We can even put (n 2 + 3n) / 2 operations, (n 2 + 9n) / 2 − 2 operations, and 2n 2 + n + 1 operations, 10n 2 − 13n + 500 operations, etc., all It’s called “the number of operations grows by the level of n 2 “, and they all mean the same thing: when n is large, if n becomes k times the original, then the total overhead will roughly become the original k 2 times. In 1894, the German mathematician Paul Bachmann proposed a very convenient “big O notation” (big O notation), which is really suitable here. If some function f(n) does not grow faster than the order of n 2 , then we can note that f(n) = O(n 2 ). Therefore, as n increases, the growth rates of (n 2 + 3n) / 2 , (n 2 + 9n) / 2 − 2 , 2n 2 + n + 1 , 10n 2 − 13n + 500 can be calculated in O(n 2 ) to indicate. Similarly, n 3 / 10, (n 3 + 3n) / 2, 10n 3 − 6n 2 + 100 are equivalent to O(n 3 ).

The number of operations required to complete the algorithm just now is O(n 2 ). Assuming that each operation takes the same time, the time required to run this algorithm should also be O(n 2 ) level. More professionally, the “time complexity” of the entire algorithm is O(n 2 ). The 80386 chip, introduced by Intel in 1985, can execute 2 million instructions per second, the 1999 Intel Pentium III processor can execute 2 billion instructions per second, and the 2012 Intel Core i7 processor can execute 100 billion+ instructions. Let’s assume that when n = 10, with the help of the above algorithm, the computer only needs 0.1 milliseconds to get the answer. The time complexity of the algorithm is O(n 2 ), which means that when n increases by 100 times, the time required to complete the operation will increase by 10,000 times. Therefore, if n becomes 1000, the computer only needs 1 second to get the answer. Even if n were increased to 100,000, the computer would only need 10,000 seconds to get the answer, which is roughly 2 hours and 47 minutes.

In fact, in order to judge whether these objects can be safely stacked, it seems that we don’t have to go to so much trouble at all. We also have a more basic approach: enumerate all possible stacking orders and see if any meet the requirements. There are a total of n! different stacking orders for n objects, and each inspection takes O(n) time. So, to get the answer, the worst case time complexity is O(n · n!). So why don’t we use this crude and bold algorithm? The main reason is probably that the time complexity of this algorithm is too high. However, since the computing speed of the computer is so fast, the time complexity of O(n · n!) must not be a problem, right? Let’s take a look. Still assuming n = 10, the computer only needs 0.1 milliseconds to get the answer. Surprisingly, if it really grows at the level of O(n · n!), when n = 15, the time required to complete the entire process of the algorithm has increased to 54 seconds. When n = 20, the whole process of the algorithm takes 134 million seconds, which is equivalent to 1551 days, or 4.25 years. When n = 30, the whole process of the algorithm takes 700 trillion years, and the current data shows that the Big Bang was only 13.7 billion years ago. If n = 100, the computer would need to work around the clock for 8.15 × 10 140 years to get the answer. According to current cosmological theories, by that time, the entire universe will be dead.

Why is the difference between O(n 2 ) and O(n · n!) so big? The reason is that the former belongs to polynomial growth after all, while the latter has exceeded exponential growth.

Exponential growth is really scary, and while it may seem normal when n is small, it can quickly go beyond your imagination and get completely out of control. A sheet of paper folded in half will become 2 layers, and folded again will become 4 layers… and so on, the number will double every time you fold it in half. So after folding a piece of paper in half n times, you get 2 n layers of paper. When n = 5, the number of paper layers is 2 n = 32; when n = 10, the number of paper layers instantly becomes 1024; when n = 30, there will be 2 30 = 1 073 741 824 layers of paper in front of you! The thickness of a piece of paper is about 0.1 millimeters, and these more than one billion pieces of paper are superimposed together, and there are more than 100,000 meters. The Kármán line, located at an altitude of 100 kilometers, is the boundary between outer space and the Earth’s atmosphere defined by international standards. This shows that after folding a piece of paper 30 times, its total height will exceed the Earth’s atmosphere and reach outer space.

The stories recorded in the Persian epic “Book of Kings” also vividly illustrate the ferocity of exponential growth. A wise man invented chess and the king wanted to reward him and asked him what he wanted. The wise man said: “Put one piece of rice in the first square of this chessboard, two grains of rice in the second square, four grains of rice in the third square, and so on. There are twice as many rice as the previous one. All 64 squares of rice are the reward I want.” The king thought it was easy, and readily agreed. As everyone knows, even if you only look at the rice in the 64th grid, there are already 2 63 ≈ 9.22 × 10 19 . If this rice was distributed to everyone in the world at that time, everyone would get thousands of tons of rice. Fortunately, there are only 64 squares on a chess board. If there were 300 squares on a chess board, there would be more rice grains than the total number of atoms in the universe.

Therefore, in the field of computer algorithms, we have a basic assumption: all practical, fast and efficient algorithms should have a polynomial time complexity. Therefore, when writing programs for computers to solve practical problems, we often hope that the time complexity of the algorithm is at the polynomial level. The word “problem” here is too broad and can cause a lot of trouble, so we stipulate that the problems mentioned in the following are all “decision problems” (decision problems), that is, after some data , asking for a “yes” or “no” answer.

In complexity theory, if a certain decision problem can be solved by a computer in polynomial time, we say that the problem is a P problem, or that it belongs to the set P. Here, P is the first letter of the English word polynomial for “polynomial”. The previous problem of stacking things is a P problem.

There have been at least two problems in history, they looked very difficult and very unlike the P problem, but under the unremitting efforts of people, they finally successfully joined the P problem family. One is linear programming, an operations research model that originated during World War II. In 1947, George Dantzig came up with a very nifty algorithm called the “simplex algorithm” that performed extremely well on random data, but required worst-case Takes exponential time. So, for a long time, people wondered whether there was a polynomial-level algorithm for linear programming. It was not until 1979 that people ushered in the first polynomial-level algorithm for linear programming, which was proposed by the former Soviet mathematician Leonid Khachiyan. Another problem is the primality test: to determine whether a positive integer is a prime number (prime), or whether a positive integer cannot be divided into the product of two smaller positive integers. Various polynomial-level algorithms for prime number determination have been proposed, but they are either based on probability, or based on certain assumptions, or have a certain scope of application. In 2002, M. Agrawal, N. Kayal and N. Saxena from IIT Kanpur published an important paper “PRIMES is in P” , gives the first deterministic prime number determination algorithm with polynomial time complexity, and the prime number determination problem is also classified into the set of P problems.

At the same time, we also have many problems outside the set P. Internet security is highly dependent on an encryption system called the RSA algorithm, which uses a very sharp point: in the integer world, the difficulty of composition and decomposition is not equal. Pick any two large prime numbers, such as 19 394 489 and 27 687 937. We can easily calculate the result of multiplying them together, and it is equal to 536 993 389 579 193; however, if you were asked the other way round, which two prime numbers can be decomposed into 536 993 389 579 193, it is very difficult to answer quickly . Decomposing a large integer into the product of several prime numbers is known as the “integer factorization problem”. At present, people have not found a fast and efficient integer factorization algorithm. That is, it is not yet known whether the integer factorization problem belongs to P. (Here we are also referring to the deterministic version of the integer factorization problem, that is, given a positive integer N and a positive integer M less than N, determine whether N is divisible by some number up to M.) One speculates that integers Decomposition is most likely not part of P. This is exactly why the RSA algorithm is currently sufficiently secure.

Another well-known problem is called the “subset sum problem”: given a set S of integers and a large integer M, determine whether you can choose some numbers in S such that their sum is exactly M ? For example, suppose the set S is

`{38, 71, 45, 86, 68, 65, 82, 89, 84, 85, 91, 8}`

And the big integer M = 277, then you need to judge whether you can select a number of numbers in the above row so that they add up to exactly 277. To solve this type of problem, one of the algorithms is to enumerate all possible alternatives and see if any meet the requirements. If we use n to represent the number of elements in the set, then there are O(2 n ) kinds of all possible selection schemes, and it takes O(n) time to check each scheme, so the time of the whole algorithm is complicated The degree is O(n · 2 n ). Although people have found algorithms with lower time complexity, none of the algorithms have a polynomial time complexity. It is conjectured that subsets and problems are likely not to belong to P either.

In a 1964 paper on a matrix problem, American computer scientist Jack Edmonds also mentioned a concept similar to the P problem: “Of course, given a matrix, consider all possible , we are sure to come across a subsection that meets the requirements, but the amount of work required for this approach is staggeringly large. We are looking for an algorithm that makes the workload just algebraic as the size of the matrix increases As with most combinatorial problems, it is easy to find a finite algorithm, but not so easy to find an algorithm that satisfies the above conditions and thus can be used in practice.” Next, Edmund This vaguely touches on a new concept: “Given a matrix, how many independent sets can its columns be divided into? We try to find a good way to describe it. As for what is called ‘good The characterization of ‘, we use the ‘absolute supervisor principle’. A good characterization should reveal some information about the matrix, so that a supervisor can easily verify that this is true after the assistant finds a minimal partitioning scheme. It’s the smallest possible subdivision. Having a good characterization doesn’t mean there is a good algorithm. The assistant will probably still have to work hard to find that subdivision.”

What Edmonds said later is not to design a polynomial-level algorithm to find the answer, but to design a polynomial-level algorithm to verify the correctness of the answer. For many problems, this thing is easy to handle. To show people that it is indeed possible to keep all objects from being crushed, we just need to give a stacking order of objects that satisfies the requirements, and let the computer verify that it does meet the requirements in O(n) time. . In order to show people that some numbers in the set add up to 277, we just need to provide a way to choose the numbers and let the computer sum it up in O(n) time.

For some questions, if the answer is yes, we may not have a very obvious and efficient way to test this. However, it is easy to see that finding a polynomial-level answer checking algorithm is by no means easier than finding a polynomial-level answer- obtaining algorithm. For many seemingly difficult problems, a polynomial-level answer verification algorithm is found first, and then a polynomial-level answer acquisition algorithm is found. A classic example is the prime number determination problem. If some number is indeed a prime, how can you prove it in polynomial time? In 1975, Vaughan Pratt gave one such method in his paper “Every Prime Has a Succinct Certificate”, which undoubtedly promoted the determination of prime numbers. Algorithm development.

Some problems are so difficult that not only have no polynomial-level answer acquisition algorithms been found, but there is no known polynomial-level answer checking algorithm. For example, the classic “Kth largest subset problem”: Given a set S containing n integers, a large integer M, and an integer K not exceeding 2 n , determine whether there is at least K different subsets such that the sum of the elements in each subset does not exceed M? If the answer is yes, an easy-to-think verification method is to list all the K subsets that meet the requirements and submit them to a computer for review. Unfortunately, the number of subsets is exponential, so reviewing will take exponentially longer. It has been speculated that there is probably no polynomial-level test for the K-th largest subset problem.

In complexity theory, whether a question has an efficient answer checking algorithm is also a very important research topic. For a decision problem, if there is a polynomial-level algorithm such that every time the answer should be “yes”, we can input an appropriate “evidence” to the algorithm, so that the algorithm can be confident after running The answer is indeed “yes”, and we say that the problem is an NP problem, or that it belongs to the set NP. To explain the origin of the name “NP problem”, we have to mention another equivalent definition of NP problem: a problem that can be solved in polynomial time on a machine with stochastic capabilities.

If the computer’s instructions are allowed to conflict, for example, in the instruction set, there are “if you encounter situation A, then perform operation B” and “if you encounter situation A, then perform operation C”, we think that such a computer has random function. This new type of computer is called a “nondeterministic” machine. Once the machine encounters a contradiction, it randomly selects an instruction to execute. You can think of every random choice the machine faces as a fork in the road to parallel worlds, so that the whole machine can try all branches at the same time, automatically exploring all possibilities. If you’ve seen the movie Next, starring Nicolas Cage, you’re probably familiar with this scene. As long as the machine answers “yes” in any branch, the whole machine also counts as answering “yes”.

On such a powerful machine, many problems are not a problem anymore. In order to determine whether all the objects can be kept from being crushed, we just need to let the machine choose one of the remaining objects at random each time, and see if there is any one of the resulting n! meet the requirements. In order to determine whether some numbers in the set add up to 277, we just need to let the machine randomly decide whether to choose each number or not, and finally see if any of the selected numbers add up to 277. In fact, the problems that can be answered in polynomial time on a nondeterministic computer are exactly those that can be checked in polynomial time on a deterministic computer, for the simple reason that if a problem can be solved in If the solution is obtained on a nondeterministic computer, the choices made along the way by the branch that found the solution become the strongest evidence of the correctness of the answer; conversely, if we can check on a deterministic computer that the answer is indeed “yes”, We can then randomly generate the evidence required for the check on a non-deterministic computer, and see if there is one piece of evidence that actually passes the check among all possible evidence. The English word for “nondeterministic” is nondeterministic, and its first letter is N; the English word for “polynomial” is polynomial, and its first letter is P. That’s how the NP problem gets its name.

It is easy to think that all P problems must be NP problems, but the reverse is not easy to say. For example, the subset sum problem is NP. However, as we have discussed before, not only have people not found a polynomial-level solution to the subset-sum problem, but they also believe that there is probably no polynomial-level solution to the subset-sum problem at all. Thus, the subset sum problem is likely to be of the type of problem that belongs to the set NP but not to the set P.

In 1971, Stephen Cook published one of the most important papers in computer science, “The Complexity of Theorem-Proving Procedures.” In this paper, Stephen Gook asks the famous question: Is P equal to NP? If I have to use the simplest and most intuitive words to describe this problem, that is: Does being able to efficiently test the correctness of a solution mean that a solution can be efficiently found? Numerous scholars have attacked this question countless times over the decades. According to Gerhard Woeginger’s statistics, since 1996 alone, more than 100 people claim to have solved this problem, of which 55 claim that P is equal to NP, and 45 claim that P is not equal to NP, and several others claim that the problem is theoretically impossible to solve. But unsurprisingly, all of these “proofs” were false. So far, no one has really proved that P = NP, no one has really proved that P ≠ NP, and no one has really proved the insolvability of this problem. This question is without a doubt the biggest unsolved mystery in computer science. In the Millennium Prize Problems published by the Clay Mathematics Institute in 2000, P and NP problems ranked first. The first person to solve the problem will receive a \$1 million prize.

Let’s take a look at how scientists think about P and NP problems. John Conway, the British mathematician and inventor of the Game of Life, argued that P is not equal to NP, and it will be proven in the 2030s. “I don’t think this should have been a problem, but the theory came too late, and we haven’t come up with any tools to solve the problem,” he said. American computer scientist Richard Karp, winner of the 1985 Turing Award (Richard Karp) also believes that P is not equal to NP. “I think traditional proof methods are not enough,” he said. “An absolutely novel approach is needed to solve this problem. My gut tells me that this problem will eventually be solved by a young scientist who is not bound by traditional thinking.” Jeffrey Ullman, an American computer scientist and author of Introduction to Automata Theory, Languages ​​and Computation, also believes that P does not equal NP. He said: “I think this problem is comparable to those mathematical problems that have plagued mankind for hundreds of years, such as the four color theorem. So I guess it will take at least 100 years to solve this problem. I am sure that solving it The tools needed for the problem have yet to emerge, not even the name of the tool. But don’t forget, that’s what happened in the first 30 years of the greatest mathematical problems being posed. .”

You might notice that everyone seems to tend to think that P ≠ NP . In fact, according to a survey by William Gasarch, more than 80% of scholars believe that P ≠ NP. There are at least two main reasons for this. First, it seems easier to prove that P = NP than to prove that P ≠ NP, but even then there is still no indication that P = NP. In order to prove that P = NP, we only need to construct a super-universal polynomial-level solution algorithm that can be applied to all NP problems. In that epoch-making paper, Stephen Gook proved a somewhat unexpected conclusion, making the constructive proof of P = NP even more readily available. Give you a bunch of positive integers and ask if you can divide them into two piles of equal sums. This problem is called the “partition problem”. It is easy to see that the partitioning problem can be transformed into a special case of the subset sum problem, you only need to set the objective in the subset sum problem to be half the sum of all the numbers. This shows that the subset sum problem is a more general “big problem” than the partition problem, and it can be used to solve many “small problems” including the partition problem. Stephen Gook proved that in the set of NP problems, there is at least one “largest” problem, which is so expressive that all NP problems can be transformed into this in polynomial time. A special case of the problem. It is easy to think that if such a “ultimate NP problem” has a polynomial-level solution algorithm, all NP problems will have a polynomial-level solution algorithm. Such problems are called “NP-complete problems”. In the paper, Stephen Gucker constructed a specific NP-complete problem, which involves a lot of low-level logical operations of the computer, and it is not very strange that it can contain all NP problems.

Later, many other NP-complete problems were found. In 1972, Richard Karp published “Reduction among Combinatorial Problems”. This is another landmark paper in complexity theory, where terms such as “P-problem”, “NP-problem”, “NP-complete problem” were born. In this paper, Richard Karp lists 21 NP-complete problems, some of which seem to be “normal” and “natural”. The subset sum problem just mentioned is one of them . In 1979, Michael Garey and David Johnson published the first textbook on the theory of NP-complete problems, Computers and Intractability: An Introduction to NP-Complete Theory ( Computers and Intractability: A Guide to the Theory of NP-Completeness). Over 300 NP-complete problems are listed in the appendix of the book, which together take up 100 pages, almost one-third of the book. If any of these NP-complete problems has a polynomial solution algorithm, all NP problems will automatically obtain a polynomial solution algorithm, and P is equal to NP. However, over the years, no one has found any kind of polynomial solution to any NP-complete problem. This leads people to believe that P is not equal to NP.

Another reason people believe that P ≠ NP is that this assumption has stood the test of practice. Many aspects of industry and life rely on the assumption that P ≠ NP. If one day scientists proved that P = NP, finding a solution was as easy as verifying a solution, the world would be very different. Russell Impagliazzo vividly described this in 1995.

First, all kinds of NP problems, especially the most difficult NP-complete problems, will all have polynomial-level solutions. Almost all optimization problems in industry and management have immediate and efficient solutions. In fact, we don’t even need to program to tell the computer how to solve the problem, we just need to program to tell the computer what kind of solution we want, and the compiler will automatically make an efficient solution system for us. Secondly, many artificial intelligence problems have also been solved. For example, in order to give a computer the ability to process Chinese, we can prepare a training set containing a large number of sentence samples, each with one of two marks: “grammatical” and “ungrammatical”. Next, we ask the computer to construct a program with the shortest code length so that when these statements are entered into the program, the program can correctly figure out whether they are grammatical. Obviously, this optimization problem itself is an NP problem (there is a premise here that such a program exists and is polynomial level), so the computer can find this minimal program in polynomial time. According to Occam’s razor, we have reason to believe that the algorithm behind this program is the same algorithm that is being used in the human mind, so it can be applied to other sentences than the given material, and has a self-learning Function. Much of the work of mathematicians can also be left entirely to computers. Finding a counterexample is as easy as verifying a counterexample, and all kinds of false conjectures will be quickly overturned. In fact, finding a mathematical proof and verifying the correctness of a proof become just as easy, so various correct propositions can quickly find a minimal proof.

Finally, don’t get too excited – a world of P = NP would also be a very insecure world. Electronic signature technology no longer works because forging a valid signature becomes as easy as verifying that it is valid. The RSA algorithm also no longer works, because finding a prime factor becomes as easy as judging divisibility. In fact, inventing any new cryptographic algorithm is futile—a computer can deduce an algorithm for generating ciphertext from a large sample of plaintext ciphertext (as long as the algorithm itself is polynomial). Further, on the web, you can no longer distinguish people from people, or even from computers. Not only can computers easily pass the Turing test, but they can also accurately imitate a particular person. If you can collect all the online chat records of a certain person, and submit all the conversations between this person and the netizens to the computer, the computer will quickly learn how to imitate this person. The identification of the network must rely on physical means. Even in a sci-fi story, such a setting is quite crazy, showing how unlikely P = NP is.

Despite all the evidence that P ≠ NP, we still cannot rule out the possibility of P = NP. In fact, if P is really equal to NP, but the time complexity is very large, very large, very large, the foundation of cryptography is still unlikely to be shaken, and our world is still unlikely to change greatly. Donald Knuth, a computer science master known as the “father of algorithm analysis”, also proposed such a possibility: In the future, someone will prove P = NP by means of proof by contradiction, so we know that all NP problems have polynomial-level problems. algorithm, but what is the time complexity of the algorithm, we still know nothing!

For many people, the inability to find any polynomial-level algorithm for an NP-complete problem should also not be a reason to question P = NP. Mitsunori Ogihara believes that eventually people may succeed in proving that P = NP, but this is by no means something that can be accomplished by one or two people entangled in an NP-complete problem. In this regard, he made a more detailed imagination.

In the late XX century, a very brief, never-published draft inadvertently opened the way to solving P and NP problems. This draft came from FOO, a 20th-century math genius who was trying to build a new theory of some kind of algebraic structure. Apparently, FOO quickly abandoned the idea and never picked up the subject again. The draft remained stuck in a gap in FOO’s bedroom floor until it was discovered by workers during a later renovation of the old house. Because the draft was too brief, it didn’t attract much attention at the time. Decades later, however, a group of mathematicians discovered that if a certain variation of one of the assumptions on the draft held true, a lot of interesting things could come out. Although this is not the assumption that FOO was supposed to make, it is still called the “FOO assumption” in honor of the mathematical genius.

About 100 years later, a group of computer scientists discovered that a particular algebraic problem Q would be an NP-complete problem if the FOO assumption held. A few years later, computer scientists discovered again that if the (then well-known) BAR conjecture held, then Q would have a polynomial-level algorithm. Combining the two, we get: If both FOO and BAR hold, then P = NP.

Both conjectures were fairly well studied over the next two centuries. People keep coming up with various partial results, and the remaining special cases are transformed into new forms, until each conjecture is reduced to a very concise, very specific proposition. In the end, two groups of people declared that both propositions were correct. The first group announced three weeks before FOO’s birthday that they had finally overcome the last link of the BAR conjecture. Three months later, the second group announced that the FOO conjecture was officially resolved.

What will happen in the future? let us wait and see.