Why doesn't Baidu do Sora? Exploring Baidu's distinctive AI development path and future direction. Is the real reason just sour grapes?

Why doesn't Baidu do Sora? Is it a case of sour grapes? Hello everyone, this is the YouTube channel of Old Fan Storytelling.

At the Baidu World Conference on November 12, it came out that Baidu has never wanted to build a world model or video model like Sora, and never even considered it. Baidu has always wanted to follow its own multimodal path, and it doesn't want to join the world-model race with companies like OpenAI. Although Sora itself still hasn't been released, plenty of companies are chasing it. Douyin and Kuaishou in particular have launched their own video models, and many international vendors are also racing ahead on video models.

But Baidu said: I'm not doing this, I have more important things to do. Robin Li said so himself. So where is Baidu directing its effort? At eliminating hallucinations. Large models hallucinate, and Baidu's position is: as a Chinese company, you can stay silent, but saying the wrong thing causes a lot of trouble, so we can't hallucinate and we must make sure what we say is right. And it has to be right from every perspective and by every standard of judgment, with no problems at all. Because sometimes you say something you believe is right, but others are unhappy with it, and that doesn't work either.

So Baidu, as a leading AI company with Chinese characteristics, is working toward eliminating hallucinations. How? At the Baidu World Conference on November 12 they launched something very interesting called IRAG. You should know that RAG is a technique we use a lot in AI agents, and it stands for retrieval-augmented generation. That is, we search first, then generate based on what the search returned. This keeps the output free of hallucination in the sense that it stays within the scope of the material you supplied. It doesn't guarantee that the output is correct, but it does guarantee that what gets generated comes from what you gave it.
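The search-first, generate-second flow described above can be sketched roughly as follows. This is only a toy illustration, not Baidu's implementation: the word-count `embed` function stands in for a real embedding model, the function names and the three sample documents are made up for the example, and the final call to a language model is left out, so the sketch stops at building the prompt.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Search step: return the k documents closest to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Generation step (prompt only): restrict the model to the retrieved text."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical mini knowledge base, for illustration only.
DOCS = [
    "Baidu World Conference was held on November 12",
    "RAG searches first and then generates from the search results",
    "Sora is a video model from OpenAI",
]

print(build_prompt("when was baidu world conference held", DOCS, k=1))
```

The point of the design is the same one made above: the model is only allowed to answer from what the search handed it, so if the retrieved material is wrong or off-target, the answer will be too.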

So what is IRAG? What does the I in front stand for? The I stands for image, so it is image-based RAG. What does image-based RAG mean? Normally, when we do RAG, we work with text or tables.

After the search is done, the text and tables are each turned into points in a vector database. Then the points closest to the question are pulled out and used to generate an answer related to it. That is the standard RAG process. With IRAG, Baidu says: I have a lot of pictures, so I'll take all of them, along with the information recognized from them, embed them directly, and build up a database. The search runs against that, and the picture is generated from what the search returns.

What does this mean? It means you train it: this person is called Zhang San, this is what Zhang San looks like, Zhang San sitting, standing, happy, crying, eating. All of that goes into the vector database. The next time you ask it to generate a picture, say of Zhang San wearing certain clothes, standing in a certain place, doing a certain action, with a certain expression, in a certain style, it can look up everything it needs in the vector database. What Zhang San looks like: I have that. What clothes he's wearing: I'll check the vector database, and yes, I have what those clothes look like too. So it can draw the requested action very accurately.

They've built quite an impressive piece of technology. After I watched the presentation I said, "Hey, this looks fun, I've got to try it." So I went and tried it. First I went to Baidu's Wenxin Yiyan website to test it, and found that Wenxin Yiyan 3.5 underperforms as usual: still talking nonsense, still going around in circles, so we have no expectations of it. Wenxin Yiyan 4.0 still costs money, so I skipped testing it. So let's draw pictures. The drawing process was a little scary. First, have it draw cars: whatever kind of car you ask for, it draws very accurately.
Say I ask for a particular Maybach model under the Arc de Triomphe in Paris: it comes out beautiful, a photo that could absolutely pass for real. A Volkswagen, apart from a license plate that isn't very clear, also looks very close. Unfortunately, when I asked it to draw the Xiaomi SU7, it couldn't do it. I guess it didn't have enough material on the Xiaomi SU7: either not many SU7 pictures were used when the model was trained, or there aren't that many SU7 pictures in IRAG's vector store. Every time it is asked to draw the Xiaomi SU7, what comes out is an M5, and there's nothing to be done about it. Then I had it draw a person, and asked for Guo Degang.

Oh my goodness, it's almost like it took a photo and posted it straight up. Whatever you want Guo Degang to be doing, it gives you a dead-on likeness in no time. But ask it to draw Yu Qian and it can't. What it draws is still Guo Degang. Have you figured out why?

Why did I ask for Yu Qian and get Guo Degang out of IRAG? The reason is very simple: in every photo of Yu Qian you find on Baidu image search, Guo Degang is standing next to him. Yu Qian, Guo Degang, Guo Degang, Yu Qian: Guo Degang appears in more of the photos, so it figures Yu Qian must look like that too. But what this shows us is that IRAG still can't avoid hallucination. You ask it to draw Yu Qian, and it draws Guo Degang.

In one case I told it, "Draw me a picture of Guo Degang and Yu Qian performing crosstalk at Deyunshe." What came out was two Guo Degangs, both very lifelike. Take either one on its own and it could pass for the real thing. So two Guo Degangs stood on stage performing crosstalk. When asked to draw other people, such as Guo Qilin or Musk, the likeness wasn't as good and they were harder to recognize. I didn't dare test anyone else; I'd probably get a warning if I did.

Even so, the whole IRAG system is still quite scary. If you want it to generate advertising images or fake pictures, say of Guo Degang out doing something embarrassing, they would absolutely pass for real. The likeness is so close that it is already good enough for some commercial use. For things like store decoration or generating e-commerce images, it is quite usable.

Besides IRAG, Baidu also released a no-code tool at this Baidu World Conference called Miaoda (秒哒): miao as in "one second, two seconds," and da written with the mouth radical plus the character for "reach." This no-code tool is really similar to ByteDance's Coze, right? It also lets people assemble agents and put them to work as an AI agent. But Miaoda isn't open for use yet; enterprises still have to sign up and wait in line. Reportedly a lot are already queuing. I don't know what these companies are thinking: Coze is free to use right now, so why would you want Miaoda? People like me who are a bit more hands-on can use Dify.

Let's not go further on that. Besides IRAG and Miaoda, what else was announced? The bragging still has to be done. What's the brag this time? The Wenxin Yiyan large model now has an average daily call volume of 1.5 billion. We've done the math: last year it was 50 million, so it's up 30 times. Note that this 1.5 billion comes with no unit. 1.5 billion calls? 1.5 billion people? It can't be people, China doesn't have that many. 1.5 billion calls is also hard to assess: what counts as one call? So let's be a little conservative and write the unit as tokens, that is, 1.5 billion tokens generated per day.

Many people say, wow, that number is huge, Wenxin Yiyan must be great, so many people use it and it generates so much content. But think about it: at Baidu's prices, how much money do 1.5 billion tokens bring in? Take Wenxin Yiyan 4.0 Turbo's price per thousand tokens and multiply it out over 1.5 billion tokens, and a day's income is probably under 100,000 dollars. For a company like Baidu, is a project like that worth bringing up on stage? If this is its AI future, Baidu earns 30 or 40 million dollars a year. What is that enough for?
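To make that multiplication concrete, here is a minimal sketch of the arithmetic. The 1.5 billion tokens per day is the figure from the talk. The price of 0.06 currency units per thousand tokens is purely an assumed placeholder, not a number from the talk or from Baidu's price list, so swap in the real rate before drawing any conclusions.

```python
def daily_revenue(tokens_per_day, price_per_thousand_tokens):
    """Revenue per day if every token is billed at a flat per-thousand rate."""
    return tokens_per_day / 1000 * price_per_thousand_tokens


TOKENS_PER_DAY = 1_500_000_000   # figure quoted in the talk
ASSUMED_PRICE_PER_1K = 0.06      # hypothetical placeholder price, NOT from the source

daily = daily_revenue(TOKENS_PER_DAY, ASSUMED_PRICE_PER_1K)
yearly = daily * 365

print(f"per day:  {daily:,.0f}")
print(f"per year: {yearly:,.0f}")
```

With the placeholder price, this works out to roughly 90,000 a day and a little under 33 million a year, which is the same ballpark as the figures above. It also assumes every token is billed, which is generous.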

So the figure is basically negligible. It's just playing word games: 1.5 billion a day, so big, so big. Multiply it by the price and see how much it is. Besides the bragging, of course, you also have to point out where AI is headed. Two major directions: one is agents, the AI agents we just talked about; the other is industrial applications, meaning the government has money or large enterprises have money, and whoever is willing to pay is the future direction. Those are the two directions Robin Li pointed out for AI in China.

He also promised that Baidu itself will not build a super app. In fact it doesn't have the ability to, so it's simply sour grapes: I'm not doing this. Instead, the plan is to build millions of super apps. I don't know what Robin Li was thinking. There can't be millions of super apps. Once you get to millions, they aren't super apps anymore: without that many users, what makes it a super app? What he means, though, is a dimensionality reduction strike, a phrase a lot of internet people like to use.

You're a two-dimensional creature, so I take you out from three dimensions; you're a three-dimensional creature, so I take you out from four. The term comes from The Three-Body Problem. What is this so-called dimensionality reduction strike here? It's: you all go fight over super apps, and I'll be your daddy. Every app built on Baidu is a super app; I'm one level above you.

Of course, some agents were shown too, including Baidu's own Wenxin agent platform. It claims 150,000 enterprises and 800,000 developers use it, but nothing on it has made a big splash. If a super app had been created, the general public should be able to notice it. We don't notice any now; never mind millions, we haven't seen a single one. Then some "super agents" were shown, legal Q&A and the like, basically to say that AI agents built on Baidu's Wenxin can solve a few practical problems. That was shown briefly as well.

In addition, trends still have to be chased. Which trend? Baidu smart glasses: Zuckerberg did it, so we have to do it too. That's what was put out at this Baidu World Conference. Now let's go back to the question: why doesn't Baidu do Sora? The most fundamental reason is that Baidu doesn't have a video platform of its own. Baidu does have video, and it has iQiyi and so on, but it has no platform like Douyin or Kuaishou. Look at Jimeng and Kling in China right now: the competition between them is brutal, and the two go at each other every day. Behind Jimeng is ByteDance's Douyin, and behind Kling is Kuaishou. Once a video is generated, it goes onto their own Douyin and Kuaishou platforms, where it can be promoted and used directly. Baidu has nothing like that, so it says: then I won't bother competing with you.

Baidu and Sora are in fact on two completely different paths. What is Sora's path? It's the scaling law: brute force works miracles. We don't study a lot of the things in between; we just pile up material, pile up data, pile up algorithms, add enough compute, burn money, and wait for emergence. We don't think in the traditional way; we think differently. We don't ask whether we want a faster carriage; we go straight to building airplanes, not even cars. That's what Sora is doing, and it's what a bunch of idealists do.

And what's more, it may not pay off. In fact, so far Sora has shown no sign that it is going to make it. And what does Baidu do? It works within existing technology to meet existing needs. This is typical Chinese-style innovation. What does it require? High certainty. Do we compete fiercely? Yes, but the competition has to come with high certainty. Certain in what ways? First, the technical route has to be certain. A group of senior scholars go back and forth to settle the technical route, and young people can't be put in charge. Young people have no experience. What if you take the wrong path? So don't touch this. Second, the cost has to be certain: how much money I put in and what result I get. Once the cost is certain, the return has to be certain too. I have to make something people will use and that I can sell before I'll do it. It's more pragmatic. That's the path Baidu has taken. Baidu requires a market where it can make money, so Baidu counts as a company that is even a bit more conservative than traditional Chinese-style innovation.

So a lot of people are now asking whether the scaling law still works. Many universities and organizations in the United States, and even some famous scientists, have come out to say the scaling law isn't working and there's a problem with it. Is there a problem? Can't we keep piling things up? The thing is, from the very first day the scaling law existed, the skepticism has never stopped. Why? Because what the scaling law counts on, the end result of its success, is called emergence. What does that word mean? It means you're not sure whether it will come, you're not sure which attempt will work and which won't, and you're not sure how much you actually need to add. That's because emergence has to be discontinuous. It's not that I put in 10 graphics cards and get something, put in 11 and get something more, put in 12 and get something more. It's discontinuous. You might get a result that works at 10 graphics cards, and then 11, 12, and 13 give you nothing new. Then you find that at the 100th graphics card there's another result, another leap, another step forward. So you say let's keep piling, up to 1,000 cards, and you find there seems to be a little improvement, but not that significant. Hmm, that doesn't seem right.

But does that mean it just doesn't work? Not necessarily, because where the next node lies is anyone's guess. That's what emergence means. Even if you know that this much data got you the last result, whether the next node is at 10 times, 20 times, 30 times, or 1,000 times that amount is unknown. This is the real scaling law: we just keep piling up data, and the future is unpredictable, uncertain, and discontinuous.

This is something people have questioned since day one. What is this process like? It's like a story we all read as kids, The Little Horse Crosses the River. A little horse was carrying a load across a river, and someone told him, "You can't cross this river, it's very deep, you'll drown." Different characters told him different things. In the same way, every senior scientist or cost accountant who looks at the scaling law will say: "Little horse, you can't cross, there's something wrong with you." What can you do? You have to wade forward yourself, and only by wading through do you find the next node. There's not much else to be done.

Is Baidu right to think this way? Baidu says: I won't go all in on Sora, I'll do IRAG, eliminate hallucinations, and do innovation with Chinese characteristics. Is that right? Actually there's nothing shameful in it. For a mature commercial enterprise, this is normal business logic. But if, as Baidu itself claims, it is the leader of China's AI industry, then thinking this way is a bit sad.

But what's the good news? The good news is that among Chinese AI products, at least the ones I've tested, Baidu basically still doesn't rank. Baidu's claim to be the leader of China's AI industry is something it can enjoy by itself, calling itself king behind closed doors. We can still watch how Baidu thinks about problems every day, and I think a lot of its thinking has meaning and value. However, as a national AI leader, it's better to have a bit of a dream, be willing to work hard, and take one more step forward. You may well end up somewhere different.

Well, that's it for this episode. Thanks for listening, and please help out by liking, clicking the little bell, and joining the Discord discussion group.

We also welcome those who are interested and able to join our paid channel. See you soon.