The Gaps and Challenges of “Domestic ChatGPT”: Expert Roundtable

Original link: https://www.52nlp.cn/%E5%9B%BD%E4%BA%A7%E7%B1%BB-chatgpt-%E6%89%80%E5%AD%98%E5%9C%A8%E7%9A%84%E5%B7%AE%E8%B7%9D%E4%B8%8E%E6%8C%91%E6%88%98-%E4%B8%93%E5%AE%B6%E5%9C%86%E6%A1%8C

Content source: ChatGPT and large model seminars

Reprinted from CSDN manuscript

After a cold winter and a haze in which hardly anyone could see hope, ChatGPT arrived like a spring rain, bringing new hope to those who do research in AI, and in NLP in particular.

On March 11, the “ChatGPT and Large Model Symposium” was officially held. It was hosted by the Chinese Association for Artificial Intelligence, co-organized by the NLP Special Committee of the Chinese Association for Artificial Intelligence, ZhenFund, and Daguan Data, and supported by Yunda and CAICT. In the roundtable dialogue, well-known experts and scholars from academia, industry, and investment circles discussed the new AI wave triggered by ChatGPT, the “basic model” theory of large models, and the gaps and challenges of “domestic ChatGPT”. The participants were:

Ma Shaoping, Vice Chairman of the Chinese Association for Artificial Intelligence and Professor of Tsinghua University
Zhou Ming, founder and CEO of Lanzhou Technology, Vice Chairman of the China Computer Federation (CCF)
Zong Chengqing, Researcher of Institute of Automation, Chinese Academy of Sciences, IEEE/ACL Fellow
Dai Yusen, Managing Partner of ZhenFund
Yang Hao, artificial intelligence scientist at Huawei, Ph.D. from Beijing University of Posts and Telecommunications
Cao Feng, Deputy Director of Artificial Intelligence Department, Institute of Cloud Computing and Big Data, CAICT (host)

1. The popularity of ChatGPT has brought new hope to AI

Moderator (Cao Feng): Why has ChatGPT attracted so much attention? What is the value and significance of its setting off another wave of artificial intelligence after AlphaGo?

Ma Shaoping: I personally feel that ChatGPT's success has to do with three things:

The first is the ability to understand intent; in short, a breakthrough in understanding the question;
The second is the ability to generate language;
The third is the ability to manage multi-turn dialogue.

AlphaGo showed that AI can do well on specialized tasks. Now large models also perform well on relatively general tasks, which may be why everyone is paying such close attention.

Zhou Ming: Over the past few years, AI has grown colder and colder. Last year, domestic investment in the AI field was basically zero. Just when everyone felt there was nothing but haze in sight, ChatGPT brought a light of hope and illuminated the path forward. The explosion of ChatGPT has given a great deal of confidence to those of us who do NLP, showing that there is definitely a way out along this road.

At an appraisal meeting at Harbin Institute of Technology, I once said: “Natural language is a jewel in the crown of artificial intelligence.” The experts and scholars present felt that this sentence summed up the status of NLP within AI very accurately. So the sentence did not originate with Bill Gates; but once Bill Gates had said it, it started to carry weight when we repeated it.

Zong Chengqing: I think there are two reasons why ChatGPT attracts attention. The first is from the perspective of everyday life: nowadays everyone has one or more mobile phones, and everyone likes to look at something new on the Internet. The second is from the perspective of natural language processing: when people try ChatGPT, they find that the sentences the dialogue system generates are very close to human language. Compared with earlier dialogue systems, the content ChatGPT generates is indeed very good, the effect is even astonishing, and the range of applications is very wide. Fields that genuinely depend on information consultation, including education, law, and academia, will be affected, and the social impact will be very large. In addition, ChatGPT understands users' intentions very accurately and can grasp most of what users want to ask.

Dai Yusen: I think there are three main points:

First, the threshold for trying it is extremely low, and it is general-purpose. In the past, it was hard for people to experience the magic of autonomous driving or AlphaGo unless they played Go or worked on autonomous driving. With ChatGPT, anyone who can speak can try it for themselves, and it can be applied in many fields, not just text continuation or poetry. It is highly general;

Second, it spreads easily. It can be shared through simple chat screenshots, and with screenshots circulating everywhere, everyone can discover that it has many remarkable abilities;

Third, it leaves a lot of room for imagination. Language is a carrier of human thinking, even the embodiment of thinking itself. After seeing ChatGPT, everyone thinks about its impact on their own industry and work and about how it could improve efficiency. The space for this kind of imagination is very large. Everyone comes up with wild ideas when they see ChatGPT, and those ideas spread and get discussed, which brings still more attention.

Yang Hao: Let me add a couple of points. First, ChatGPT has moved the connection with AI from the to-B side to the to-C side, so everyone can experience it, and that gives everyone a lot of confidence. Second, a large amount of human feedback data enters the system, which helps the system evolve, so this application scenario is all the more meaningful. For example, we recently looked at some ICT scenarios, such as network equipment operation and maintenance logs. Previously, if a question fell outside what the system covered, the answer seemed silly and completely irrelevant. Now the overall intent is understood without problems, and what remains may be a matter of supplementing some domain data. This has strengthened ordinary users' connection to artificial intelligence and raised the ceiling of AI by a large amount.

2. A variety of jobs may be replaced by ChatGPT, but there is no need to mythologize it

Moderator (Cao Feng): So far ChatGPT does not show many industry-specific characteristics or industry application trends. Can you tell us which industries are most likely to see wide adoption, or even disruption, driven by ChatGPT and large models?

Zhou Ming: Our company is currently building a large model called the “Mencius Large Model”, and we walk on two legs. The left leg is the large model I really want to train myself. The right leg is making good use of a large model obtained from anywhere: one picked up from the Internet, or an API that we buy, either is fine. Of course, in the end I hope to use my own large model. Until then, it is best to keep the two legs a little apart so that they do not trip over each other.

Training a large model takes wisdom, and using a large model also takes wisdom, and the two are not necessarily the same. People who use large models take the perspective of users and industries, and in turn place requirements on the models. Large-model people sometimes keep touting that models have to be big to be effective, but that comes at a price: a big model means a great many servers. The user may not need such a large model, and a smaller or weaker one may be enough.

First, the question is how to build a good model for a vertical field and reduce the model's size, without pursuing the all-round intelligence of something like ChatGPT. This has good applications in all walks of life. Take finance, an industry that pays great attention to reducing costs and increasing efficiency. From customer service, marketing, copywriting, and contract review to robo-investment research, robo-advisors, and search maps, everyone will conclude that a large model must be used. So a financial institution would do best to have a large model suited to each of its business scenarios. This model need not be 175B; it may be 10B or even 1B, but it must be easy to connect to the customer's data and business scenarios. The various business departments can then access the model easily, get answers and feedback quickly, and iterate continuously. New data may arrive every 3 to 5 days or every 1 to 2 months, and then the model is iterated again.

Second, ChatGPT's data is closed off: it has no data after 2021, and that does not work for the financial industry. The financial industry needs real-time capability: an interface that can dynamically access financial databases and the various marketing activities, and then make quick recommendations to users. When deploying, the large model has to be connected to all the business scenarios in a timely, fast, and secure way. If this is done well, many customers in the financial industry can use it.

The same is true for other industries, because they also require a great deal of cognitive intelligence: natural language understanding, problem solving, database access, dynamic tracking, and customer recommendation. Their requirements have a lot in common. The same technology can therefore be extended across them and have an influence on entire industries.

Zong Chengqing: Which industry will be hit first? That is actually hard to answer specifically, because the technology can be used in any field and any industry, and any of them may be affected. In fact, the people most likely to be hit are NLP researchers. When ChatGPT came out, many people asked me: ChatGPT does so well, so what is the use of NLP research? Of course, I am not worried about unemployment. On the one hand, ChatGPT is not so good that there are no problems left to study; on the other hand, the replacement of low-end repetitive jobs by AI technology is an irreversible trend.

Dai Yusen: I have a few short summaries:

First, it is a “super stitcher”. More than 95% of the work we do may really be stitching. For example, much of what designers do is stitch existing things together, programmers stitch together code components that have already been written, and writers stitch together a lot of existing material. As generative models become very powerful, everyone will pay more attention to the value of original work: truly creating things that AI does not already contain, because language models and diffusion models can instantly stitch together everything that humanity already has. So the first point is that the emergence of the “super stitcher” makes original thinking especially important.

Second, it is a super interface. In the past, people had to adapt to machines. We operated computers, PCs, and mobile phones, and humans had to follow the machine's paradigm of keyboards, mice, or touch screens. But the core of human interaction is language. Everyone can communicate in language, yet real natural-language communication with something like Siri was not possible before, because it got stuck at many points such as semantic understanding and multi-turn dialogue. The emergence of ChatGPT lets us see that humans and machines can truly communicate, so that it is no longer humans who must follow the machine's paradigm, but machines that follow the human one.

Third, it is super companionship. Much of the value we offer each other in life is expressed through language, and now there can be someone to play with and chat with, even someone we have never met. Over the past two years the concept of the “metaverse” was very popular, but people later found the metaverse meaningless, because there was no one in it; the metaverse was barren. Previously, everyone thought the point of a Meta human was to look like a human, but in fact the most important thing is that it can communicate like one. That is why some people were shocked when they saw ChatGPT's chat records: they saw the machine becoming more and more human-like, or harder and harder to tell apart from a human, which is the point of the Turing test.

In games, social interaction, or companionship and care for the elderly and children, the companionship value that people provide can be replaced, or partially replaced. This is a goal that technology had not achieved before, and now we see a possible trend. Of course, this may be a rather wild idea, but from the perspective of an investment institution, ChatGPT has at least taken us from the impossible to the possible.

Yang Hao: I think the next people to be replaced will certainly be those doing repetitive work. But from a positive point of view, keep learning new things and be willing to do the “dirty work”. Everyone sees what is good about ChatGPT. Have you found any cases where it does poorly? Where does it do poorly, and what might be the reason? Go and actually try it.

There is indeed a bottleneck in China right now. The computing-power barrier that ChatGPT brings is very high, and it is genuinely difficult even for those who are capable of reproducing this model and who want to examine its problems seriously. So finding resources around you, cooperating across industry, academia, and research institutes, building an environment, and analyzing the bad cases is a major point of breakthrough. If you simply say it is good because others say it is good, you will be eliminated. If others say it is good but you find where it is not, and then analyze why, you will achieve a greater breakthrough.

Ma Shaoping: I have always been in academia, so I know relatively little about applications. I will answer in terms of principles, namely the principles for applying artificial intelligence, which I think hold here as well:

First, even if it makes a major mistake, that mistake does no harm to my system. When ChatGPT first came out and people asked me what applications I could see, the first thing I thought of was chatting with the elderly. It does not matter if it gets things wrong there. It does not matter if it names the wrong lead actor for a film, or gets something about a game wrong.

Second, it acts as an assistant that offers decision support or several candidate solutions, while the final decision rests with the user. The example I gave at the time was the input method. You type a string of pinyin, it offers several candidates, and it is up to you to choose which word to use. Only an input method like that is usable. If the input method entered the sentence automatically and gave you no choice, it could not be used. So it is only an aid, and the final decision is made by a person.

Third, a certain error rate is allowed in the specific application. How much is up to you, whether one in a thousand or one in ten thousand, as long as it is within your tolerance. For example, high-quality publishing used to accept an error rate of one in ten thousand, and the same goes for product inspection on a production line: it is fine as long as the required error rate can be met.

For a specific application, ask first whether these principles are satisfied, and second, if they are not, whether there is some way, perhaps by bringing in other knowledge, to make them satisfied.
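The three principles above can be read as a simple checklist. The following is a minimal sketch of that reading, not something presented at the roundtable: the `Application` class, its field names, and every number in the examples are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Application:
    """Hypothetical description of a candidate application (all fields are illustrative)."""
    name: str
    mistakes_are_harmless: bool     # principle 1: a wrong answer does no real damage
    human_makes_final_choice: bool  # principle 2: the system only proposes options
    observed_error_rate: float      # principle 3: error rate measured on the task
    tolerated_error_rate: float     # principle 3: limit chosen by the application owner


def satisfied_principles(app: Application) -> list[str]:
    """Return the names of the principles that the application satisfies."""
    satisfied = []
    if app.mistakes_are_harmless:
        satisfied.append("harmless mistakes")
    if app.human_makes_final_choice:
        satisfied.append("human decides")
    if app.observed_error_rate <= app.tolerated_error_rate:
        satisfied.append("error within tolerance")
    return satisfied


# Made-up numbers: a 5% error rate for the chat-style uses, and a
# one-in-ten-thousand tolerance for the publishing example.
elderly_chat = Application("chat with the elderly", True, False, 0.05, 0.0)
input_method = Application("pinyin input method", False, True, 0.05, 0.0)
proofreading = Application("publishing proofreading", False, False, 0.00005, 0.0001)

for app in (elderly_chat, input_method, proofreading):
    print(app.name, "->", satisfied_principles(app))
```

An empty result corresponds to the second question above: if no principle is satisfied, the next step is to look for other knowledge or safeguards that would make one of them hold.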

3. The current large model has not yet reached the “basic model” status

Moderator (Cao Feng): We have also seen that Li Feifei and other scientists were among the first to call large models “basic models” (foundation models). First, how do you understand the conceptual shift from “large model” to “basic model”? Second, if large models really do become the foundation, what changes will that bring to technology research and development, industrial application, and industry promotion?

Yang Hao: I think there are two points:

The first point is that a basic model establishes a certain paradigm; put another way, nearly all artificial intelligence models are now based on the Transformer, which greatly lowers the development bottleneck for applications and promotes the development of the industry. It is like standardizing schooling into six years of primary school, three years of junior high, and three years of high school. This kind of standardization produces greater value.

The second point is that it promotes the upgrading of upstream and downstream industries. For example, everyone is quite concerned about Huawei's chips, and there has been some exploration there. For specific algorithms, the energy consumed when data moves between GPU and CPU is greatly reduced. There are two typical effects: one is that mobile phone battery life is longer, and the other is that computation is faster and the device does not get hot. So for algorithms, it is not the cheaper the better, but the easier to use the better. Many commercial products are not as refined as academic ones, and that is because they are kept simple and easy to use.

Dai Yusen: From our perspective as investors, our feeling is that once there is a base, we can build applications and middle layers on top of it. This is our intuitive feeling, and this is a very academic definition.

For example, an AI company used to have to train its own model from scratch and then do vertical integration on top of it. Today, OpenAI has made ChatGPT's capabilities available through APIs very well, and everyone can build applications very quickly without training a model themselves, as long as they can call on those capabilities. This turns technology into building blocks, like Lego. With this base, you can build applications on it, which is especially helpful for application scenarios.

Everyone was in the academic stage before, but now they have really entered the application and commercial stage. This is my understanding of the “basic”, from an academic point of view.

Zong Chengqing: The “basic model” here is trained on general, commonsense public data published on the Internet. It is like a general practitioner who can handle everything: if you have a headache or a cold, you can find such a doctor to prescribe medicine. But when it comes to real use, it takes a specialist to solve the problem, especially in fields that require deep professional knowledge.

Zhou Ming: I have a different view of the basic model. Li Feifei proposed it two years ago, when ChatGPT did not exist and GPT-3 had only just come out, so the thinking was that the world would have N basic models. The ideal was that once some big companies had built them, it would be like China Power: villages would no longer build their own hydropower plants, and everyone would build new applications on that base. The idea is actually quite good.

In fact, so far even ChatGPT would not dare call itself a “basic model”. Let me make the following points:

First, what should a basic model be? I think it must at least meet the following:

Capable: its functions are relatively powerful;
Stable: like electricity, the supply cannot keep cutting out, otherwise no one dares to use it;
Safe: anyone can use it without harming others;
Ethical: ChatGPT is unethical in many ways right now. It may conform to the ethics of the United States, but not to the ethics of China;
Strong in many other respects as well, such as speed, concurrency, and timely updates;
Comprehensive support for vertical fields;
Support for no-code programming by users in all respects.

I personally think that no model has yet reached the state of a basic model, so no one should be superstitious about ChatGPT. It is still far from the great ideal that Li Feifei proposed.

Second, the basic model does matter. Any country, and certainly one as large as China with its unique 5,000-year culture, must establish its own basic model system in order to achieve things such as security and concurrency. At present no one has handed over a conclusion on how to do this. You can only rely on yourself to work out a set of basic models suited to your own national conditions and market. This is the first step of a Long March. Perhaps 10 or 20 years from now, we will have roughly formed a set of basic models, not just one, that everyone can use stably and with confidence.

Ma Shaoping: I very much agree with Mr. Zhou that our country must at least have its own basic model. In the long run it may indeed develop into a kind of infrastructure. Such infrastructure is like electricity, and one cannot depend on other countries for electricity. It is like the time when China had no oil and had to find its own Daqing Oilfield.

4. From catching up to surpassing, you must first learn to look squarely at OpenAI

Moderator (Cao Feng): The last question. The birth of ChatGPT has also shown the gap between our country and others. I would like to invite the experts to talk about the current state of ChatGPT-like systems and large models in our country. What are the difficulties? Some of the people here work in technology, some in industry, and some may be students. Please share your suggestions for future development.

Zhou Ming: This is a very good time for young people in technology and in NLP. We have been working on natural language for more than 30 years. In the past there was nothing. Code had to be written line by line, our eyes were full of tears, and no one supported us.

But today, with the support of big data and computing power, ChatGPT has verified that the approach is feasible. And since we emphasize independent intellectual property rights, no matter how well the United States does, it has nothing to do with us. We still have a vast world to expand into.

So I would like to share a message with everyone here, including friends from investment, industry, and research: there is still a long way to go! Choose the right path and go forward bravely; that should be your plan!

Zong Chengqing: Some people ask quite bluntly: why didn't China make ChatGPT? I reply that the same question can be asked of any high technology. Why didn't China make its own high-end chips? Why doesn't China have its own operating system? Why doesn't China have its own database? We admit that there is a gap with the United States, but I personally think the gap in natural language processing is much smaller than in other fields, and the field has made great progress in China in recent years.

Of course, the original technologies belong to others, and we admit this. From the perspective of market application, however, natural language processing in the Chinese market is not behind, even compared with the United States. We have made considerable progress, so we are very confident that we can handle China's own affairs well.

The key question is how we do it well. Everyone is too impetuous now. After ChatGPT came out, all of China, from top to bottom, has been hyping it. Why is no one hyping blockchain and the metaverse any more? Everyone immediately switched to ChatGPT-like research. We should think calmly about what we can do now and what we can do better, and not hype concepts every day. What really counts is doing, in a down-to-earth way, what we ought to do to meet the needs of the country.

Dai Yusen: I would like to share a few views:

First, investment is a Bayesian process: our view of the world must change as we acquire information. Before the iPhone came out, there was no way to invest in mobile Internet and mobile development. When the iPhone came out, everyone invested. So a short period of impetuousness, bubbles, and loud voices is normal, because our understanding of the world and our estimates of the future can shift greatly. But we hope there is beer under the foam. We have seen that this technological change has brought very direct application value. In the United States, not only Amazon itself but many others have achieved clear commercial results, so I believe this trend will be long-lasting.

Second, to learn from and catch up with OpenAI, you must first look at OpenAI squarely. The opinions I have heard over the past few months fall into two schools. One deifies OpenAI and feels that OpenAI is far ahead: we have shortcomings in corpus, chips, and algorithms, and we may not be able to do it. The other is the theory of quick victory: we not only have this, we have a lot of research in the area, and this month we can match or even surpass ChatGPT.

In going from catching up to surpassing, we need to give domestic large models some time. First, the gap in our own natural language models is not large. Models with 100 points and 100 points are useful, and we now see that language models have crossed a threshold that lets them work in many places. We have also recently tried ChatGPT-like products from 4 or 5 teams. Although there is a gap, they are better than earlier attempts, and some applications have already been deployed. Emergent abilities, chain-of-thought reasoning, and so on are gradually starting to appear, and some products even do better than ChatGPT on Chinese tasks, which has to do with corpus and algorithms.

What we face now is a generation gap, not an insurmountable one. We look forward to the future, but it is certainly not going to happen overnight. Our angel investing works on a long-term cycle of more than 10 years; it is not like trading stocks.

Yang Hao: Let me share three of my own thoughts on ChatGPT:

First, comparisons in vertical fields such as machine translation show that ChatGPT is not as good as current specially trained machine translation models;

Second, we have been doing quality assessment, and to be honest, compared with professional translators there is still a lot of room for improvement. So we still have many opportunities here. ChatGPT has not solved this, so there is no need to mythologize it;

The third point is that technology is good. ChatGPT will drive thinking across the whole upstream and downstream chain, including chips. Our company has an “M+D” ecosystem. M is the MindSpore deep learning platform. People do not use TensorFlow much now and use PyTorch more, but there are many problems in real industrial deployment. D refers to what relates to the D chips. The price of some of our chips is 1/4 of the other party's, while the overall performance is 1/2 of theirs. At that point it is profitable to manufacture these chips. Then add large models on top. The Huawei Pangu large model was originally trained based on the 3D chip, not entirely on GPUs, so there is a lot of room here.
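As a rough worked version of the arithmetic in this point, assume the other party's chip has price P and performance S (symbols introduced here only for illustration). Taking the quoted fractions at face value, the performance obtained per unit of price is:

```latex
\frac{\text{performance}}{\text{price}}
  = \frac{\tfrac{1}{2}\,S}{\tfrac{1}{4}\,P}
  = 2 \cdot \frac{S}{P}
```

On these figures alone, the cheaper chip delivers about twice the performance per unit of price. This simple ratio ignores power consumption, software maturity, and the cost of running more chips to reach the same total performance.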

The road will become wider and wider. I believe that those who divide society and block technology are only a few. Through cooperation across industry, universities, and research, and through close cooperation with experts at home and abroad, this road will keep widening.

PS: Those who want to watch this seminar can go to the “Daguan Data Video Account” to view the replay of the live stream.
