The first batch of large models passes the filing, and more aggressive investment begins

Original link: https://www.latepost.com/news/dj_detail?id=1841

On the evening of August 30, Baidu’s public relations team worked overtime preparing promotional materials. The goal was to announce the full public opening of its large-model application Wenxin Yiyan the moment the clock passed midnight; the announcement went out at 0:02.

Baidu had also provisioned a large amount of computing power in advance for Wenxin Yiyan and its other large-model applications, to cope with a possible surge in usage after the full opening.

Soon afterwards, at 1:44 a.m. on the 31st, the large-model startup Zhipu AI announced that its large-model application “Zhipu Qingyan” had officially launched.

More news spread in the early hours: ByteDance, SenseTime, MiniMax, the Chinese Academy of Sciences, the Shanghai Artificial Intelligence Laboratory and other companies and institutions also announced that their large models had passed the filing and would officially begin serving the public.

“LatePost” learned that large models developed by iFlytek, Huawei, Tencent and Alibaba are also among the first batch to pass the filing. Alibaba’s Tongyi Qianwen will open to the public soon.

“For us, today’s milestone is more important than the release of the large model on March 16,” a Baidu source said. He and many colleagues remember that 167 days have passed since Baidu released the large model.

With the first batch of large-model applications going live after passing the filing, China’s large-model market has entered a new stage of competition. Products that technology companies and institutions have built on large models can now serve all users and be put to the test.

“The evolution of a large model depends heavily on user feedback. Once more people use it, more feedback data can go into improving the model,” said Yu Huan, director of the Baidu Technology and Social Research Center, adding that Baidu is trying to speed up the iteration of its large model. “We originally planned to release a new version of the model at the end of the year, but now we are speeding up and will release it as early as possible.”

Wang Xiaochuan, founder of Baichuan Intelligent, told LatePost that the company will release a 100-billion-parameter model in the fourth quarter of this year and launch a “super application” in the first quarter of next year. It is understood that iFlytek, which released a new version of its large model half a month ago, will also step up the promotion of its large-model applications now that it has passed the filing.

The new environment has turned the large-model competition into a test of all-round ability: the deciding factor will no longer be only a company’s technical strength in training large models, but also its ability to understand market needs, develop matching applications, and operate them well.

More aggressive investment in products, growth, and user and customer acquisition is about to begin. With the “speed limit” lifted, it will become clear who is fast and who is slow.

After the policy took effect, a batch of large models launched publicly

The “Interim Measures for the Management of Generative Artificial Intelligence Services” officially took effect on August 15, a key milestone on the way to filing for China’s large-model companies. One large-model practitioner said that after this date, the relevant departments began convening meetings with some large-model companies, running filing training sessions and issuing templates for the filing materials.

It is understood that during the filing process, regulators focus on data security and data provenance, such as whether the data infringes intellectual property rights or privacy. Regulators also advised that when each company’s large model handles chat tasks, “the refusal rate should not be too high.”

Around the time the filing work began, the large technology companies that passed the filing in this round each disclosed progress on their large models:

  • At the end of July, Tencent began testing its Hunyuan large model across multiple business lines, and it is expected to announce new progress next month. Two months earlier, Tencent CEO Ma Huateng had said he was in no hurry to show a half-finished product.
  • At the beginning of August, ByteDance began public testing of the large-model application “Doubao”, whose underlying model is the “Skylark” model that passed the filing this time.
  • On August 4, Huawei announced that it would integrate its Pangu large model into the Hongmeng system (HarmonyOS), offering functions such as drafting emails and automatically operating phone apps through the phone’s voice assistant.
  • On August 15, iFlytek released version 2.0 of its Xinghuo (Spark) large model, which adds the ability to generate and understand images and code. Together with Huawei, it also launched the Xinghuo all-in-one machine, a solution for government and enterprise customers to deploy large models on premises.
  • A few days ago, Baidu sent a mass text message inviting Wenxin Yiyan beta testers to obtain beta access to “Baidu Search AI Partner”, which offers functions similar to New Bing through the Baidu App and the Baidu search engine.

Among the first batch of startups to pass the large-model filing, Zhipu AI, Baichuan Intelligent and MiniMax have also been iterating rapidly on their own large models. In June, Zhipu AI launched the upgraded ChatGLM2 series, adding three models of different parameter sizes that can handle up to 32,000 tokens (the token count scales with the amount of text processed).

Baichuan Intelligent, founded in April this year, has launched three models in the past four months, two open-source and one closed-source, the largest with 53 billion parameters. MiniMax, founded at the end of 2021, completed a major version upgrade of its own model, ABAB, in July and has been improving its performance on a weekly basis.

Big companies also stand behind the startups. Meituan took part in Zhipu AI’s Series B-2 financing round this year, and Tencent invested in MiniMax in June. Zhipu AI and MiniMax have both become unicorns valued at more than US$1 billion.

Most of the large-model companies that have passed the filing have now announced that they are open to the public. Passing the filing, however, may not be a lasting advantage in the large-model competition.

Many industry insiders who took part in the filing expect more and more large-model companies to pass in succession. “There will not be only a first batch; there will also be a second and a third.”

Promotion is no longer restricted, and the commercialization of large models accelerates

Once a large model passes the generative artificial intelligence filing, the most direct change is that its products can serve the public directly.

Before this, most companies were relatively restrained in promoting large-model applications. Their products for individual users were all offered as internal or invitation-only tests, ordinary users could not register and use them directly, and companies did not actively advertise their large-model products, all of which held back product adoption.

The policy’s implementation will push companies to invest resources in promoting their large models, and will ultimately accelerate commercialization. At present, the large-model industry has four main ways of making money:

  • Build large-model chat applications and charge users a monthly or yearly fee, as with OpenAI’s ChatGPT Plus service.
  • Sell access to large-model APIs and charge companies or developers by the number of calls, as in the cooperation between MiniMax and Kingsoft Office’s WPS.
  • Sell large-model development services directly and earn money by delivering industry-specific large-model solutions to traditional enterprises, such as the industry solutions vigorously promoted by Baidu, Tencent, iFlytek and Huawei.
  • Use large models to transform existing businesses, improving product competitiveness and earning greater commercial returns. For example, Google and Baidu are using large models to improve their search products; DingTalk has integrated large models into its product features; and Alibaba has said it will use large models to transform its e-commerce business.

As large models pass the filing one after another, one of the most visible changes in the market will be a growing number of actively promoted products aimed directly at individual consumers.

It is reported that MiniMax will launch products for the public next, but no details have been disclosed yet.

Wang Xiaochuan said that Baichuan plans to launch its first “super application” for individuals in the first quarter of next year. At a media briefing in early August, he said that Baichuan Intelligent “will not have only one super app in the future; (more products) are already in research and development.”

Promoting products at scale can also bring in enterprise business. A Baidu official said that Baidu will not charge the public for Wenxin Yiyan for the time being, but that the product is a good way to demonstrate technical capability and “help attract enterprise users”.

The enterprise market was the direction the whole industry focused on before large models passed the filing. Tencent, iFlytek and Huawei have each said on different occasions that they have released dozens or even hundreds of large-model solutions covering more than a dozen industries. MiniMax has also announced that its open platform for enterprise customers has more than 100 paying customers.

The technology race over the large models themselves continues. A person at Baidu said that the company is doing everything it can to speed up development of a new version of its large model and hopes to release it ahead of schedule. Baichuan Intelligent said that it will release the 7-billion-parameter and 13-billion-parameter versions of Baichuan2 in succession, in line with its existing research and development plan, and that it plans to launch a 100-billion-parameter large model by the end of the year. iFlytek plans to launch, on October 24, a large model that surpasses ChatGPT in Chinese and matches it in English, and to benchmark against GPT-4 in the first half of next year.

Large-model development enters a new stage

To date, China has hundreds of large models with more than 1 billion parameters. The other side of policy implementation and faster commercialization is that participants will face fiercer and more comprehensive competition. Only with the “speed limit” lifted can the leaders’ limits be tested, and those who run slowly may face elimination.

Judging from overseas markets, where large models have developed faster and regulation came afterwards, the competitiveness of a large model shows mainly in three areas:

  • Computing infrastructure. When a large-model application acquires a large number of users, it consumes a great deal of computing power. OpenAI once suspended sign-ups for paying users and strictly limited how many times users could call GPT-4, mainly because its computing power could not keep up with user growth.
  • Proprietary data. Most pre-trained large models on the market are trained with the same architecture, public datasets and similar methods. What differentiates a large model is the data used to fine-tune it. The quantity and quality of that data directly determine the model’s capability.
  • Commercial application. Building applications on large models is not difficult, but building them on models with tens or even hundreds of billions of parameters requires a large number of GPUs for inference. One industry insider estimated that for a large model with hundreds of billions of parameters, the ratio of training cost to inference cost is about 1:9. This means that a scenario must have enough commercial value and profit for the application of a large model to be cost-effective. In larger application scenarios, large-model suppliers also receive feedback from more users and can keep improving the model.
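The 1:9 estimate above can be turned into a small worked calculation. The sketch below is illustrative only: the 1:9 ratio is the insider's estimate quoted in the article, while the function names and the training-cost figure of 100 units are hypothetical placeholders, not data from the article.

```python
from fractions import Fraction

# Ratio quoted in the article: training cost to inference cost is about 1:9.
TRAINING_PARTS = 1
INFERENCE_PARTS = 9


def cost_shares(training_parts: int, inference_parts: int) -> tuple[Fraction, Fraction]:
    """Return (training share, inference share) of total cost as exact fractions."""
    total = training_parts + inference_parts
    return Fraction(training_parts, total), Fraction(inference_parts, total)


def implied_inference_cost(training_cost: float,
                           training_parts: int = TRAINING_PARTS,
                           inference_parts: int = INFERENCE_PARTS) -> float:
    """Inference cost implied by a given training cost under the ratio."""
    return training_cost * inference_parts / training_parts


training_share, inference_share = cost_shares(TRAINING_PARTS, INFERENCE_PARTS)
print(f"training share:  {float(training_share):.0%}")   # 10%
print(f"inference share: {float(inference_share):.0%}")  # 90%

# Hypothetical training cost of 100 units (an assumption, not from the article).
HYPOTHETICAL_TRAINING_COST = 100.0
print(implied_inference_cost(HYPOTHETICAL_TRAINING_COST))  # 900.0
```

Under this ratio, inference accounts for roughly nine tenths of total spending, which is why the revenue from an application scenario has to cover ongoing serving costs, not just the one-off cost of training.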

The large-model competition will favor large companies with deep pockets and large user bases, such as Baidu, Tencent, Huawei, Alibaba, iFlytek and ByteDance.

However, an entrepreneur who built a generative writing application on another company’s large model told “LatePost” that he is not very worried that, after the filing, large companies will increase investment in the application layer and squeeze out small and medium-sized companies. “The craze has already receded a lot, and many applications have moved on to deep integration of AI. That is, AI itself is not the selling point; the key is to grasp user needs and scenarios.” On this front, he believes that both large and small companies have opportunities, with Notion and DingTalk as representative products.

Many startups are also forming partnerships with companies that have large user bases to strengthen their position. For example, MiniMax and Zhipu AI have both been integrated into Kingsoft Office’s WPS.

It is understood that before Meituan invested in Zhipu AI, it had already spent tens of millions of yuan licensing Zhipu AI’s large model, and it plans to explore related applications on that basis.

The next major test for every large-model company is how to find a business model for large models that is genuinely profitable and sustainable.

“We can’t just push AI and not have a business model to support it,” Frank Slootman, CEO of cloud database company Snowflake, said on an earnings call in August. “A lot of company executives describe their attempts to get into big models as experimental, exploratory, and they’re still trying to figure out how challenging it is,” he said.

So far, almost the only companies making money from the large-model wave are the “shovel sellers”, such as Nvidia. In its most recent second fiscal quarter, Nvidia’s GPU-related business revenue rose 171% year-on-year to US$10.3 billion, and the company’s net profit rose eightfold year-on-year to US$6.2 billion.

This round of policy implementation may also let Internet advertising platforms profit first. A large-model practitioner in Beijing said that his company is waiting for its filing to be completed, after which it will resume advertising its products on short-video and search platforms. In the preceding period, when products existed mainly as tests, the company judged that large-scale ad spending was not cost-effective. Before that, its monthly product advertising costs had reached one million yuan.

“It has not yet reached the level of a super application.” One large-model practitioner believes it may take two to three years, and that for now there are only early signs. “Once the technical capabilities are stronger, the application results are good enough and the cost is low enough, the real super application may appear.”

Zhu Likun also contributed to this article.

Title picture source: Chariots of Fire
