iFLYTEK cannot afford to miss large models

Original link: https://www.latepost.com/news/dj_detail?id=1809

This was the third large-model launch event iFLYTEK has held in the past 100 days. All of the more than 1,200 seats were filled, and people also crowded the aisles on both sides and the back of the venue. “More than 20 iFLYTEK senior executives gave up their seats to university student developers,” said Liu Qingfeng, iFLYTEK's founder and chairman.

On August 15, iFLYTEK released version 2.0 of its Spark cognitive large model, focusing the demonstration on coding and multimodal capabilities, that is, the ability to generate and understand code and images from text or voice instructions. iFLYTEK announced that these capabilities will be rolled out across its products and businesses, including the iFLYTEK AI learning machine and its solutions for education, automotive, finance, and office scenarios.

During the 140-minute launch event, the 50-year-old Liu Qingfeng could barely contain his excitement. He stayed in high spirits from beginning to end, kept waving his arms, and described the progress of iFLYTEK's large model in a hoarse, loud voice. During the live demonstration led by Liu Cong, dean of the iFLYTEK Research Institute, Liu Qingfeng, who had been watching from the side, looked both confident and a little nervous, and stepped in from time to time to explain technical details and product capabilities to the audience.

The last time iFLYTEK received this much attention was about six years ago. Riding that era's wave of artificial intelligence, its market value approached 100 billion yuan. That brought iFLYTEK popularity and the status of China's number one artificial intelligence stock, as well as controversy.

iFLYTEK is one of the few companies in China that earns money directly from artificial intelligence products. Some Internet companies with equally strong AI capabilities mostly do not have users pay for the technology directly; instead, they use capabilities such as recommendation algorithms and search to aggregate traffic and then make money through advertising. Over the past few years, iFLYTEK has launched AI hardware products such as translators, learning machines, and smart office notebooks, and has also sold AI solutions to customers in education, healthcare, and finance.

However, this model of making money directly from AI did not bring iFLYTEK large revenue or profit in the past, and none of the markets it entered was on a fast growth track. The technology hotspot of that AI wave was computer vision, not the speech and natural language fields that iFLYTEK was good at. One view in the market at the time was that iFLYTEK's fundamentals could hardly support a market value of 100 billion yuan.

The large-model boom that began at the end of last year is a genuinely new variable on the technology supply side. It may bring iFLYTEK a different kind of opportunity, a big one that its long-term technical accumulation may actually allow it to seize.

The capital market's attitude partly reflects this possibility: since the beginning of this year, iFLYTEK's market value has risen by more than 90%, reaching 136 billion yuan. Investors are working from a simple assumption: when a race starts, those who are already running may have the biggest advantage.

Before the large-model opportunity emerged, iFLYTEK, founded in 1999, had already been developing AI technology for more than 20 years, especially speech and language technology. “We have more than 20 years of accumulation in NLP (natural language processing), a lot of cross-modal work on text and speech, and a very high-quality corpus,” an iFLYTEK employee said, summarizing the company's advantages. iFLYTEK moved quickly once the large-model boom began:

It launched a dedicated large-model research effort half a month after ChatGPT was released. Half a year later, it released version 1.0 of the Spark large model and disclosed a goal with a specific timetable: by October 24 this year, to build a large model that surpasses ChatGPT in Chinese and is comparable to ChatGPT in English.

Version 2.0 of the Spark large model, with its enhanced coding and multimodal capabilities, is an important milestone in that plan. “Coding ability is the key support for connecting to the digital world, and multimodal ability will enable general artificial intelligence to empower specific scenarios across industries and, in the future, enter every home,” Liu Qingfeng said at the launch event.

On the same day that iFLYTEK held the launch event, the “Interim Measures for the Management of Generative Artificial Intelligence Services,” issued by the Cyberspace Administration of China and other departments, formally took effect. Competition among large models in China has entered a new stage: deployment will no longer be limited to small-scale testing but can be promoted at scale, and the competition will become fiercer and more complex. In this competition, model technology itself still matters, but how to stop large models from “talking nonsense” and how to turn them into products that solve problems in specific scenarios more efficiently have become prerequisites for promoting them.

Two major upgrades in three months, gradually approaching ChatGPT

On May 6 this year, iFLYTEK released version 1.0 of the Spark cognitive large model and demonstrated live that it could generate text (speeches, emails, press releases, and so on), understand text (check grammar, analyze sentiment, translate), and perform commonsense and scientific reasoning in different scenarios. Around that time, most Chinese companies releasing large models demonstrated them with screen recordings, while iFLYTEK ran its demonstration live.

At yesterday's launch of Spark 2.0, iFLYTEK again chose a live, networked demonstration. Liu Cong, dean of the iFLYTEK Research Institute, first showed the progress in the model's basic capabilities: long-text generation is no longer just a general introduction but can summarize events that have actually happened, and the model can now answer more complicated high school geometry problems.

When introducing the new code generation capability, Liu Cong used the Spark large model as a programming assistant. With fewer than 10 voice and text inputs, for example, “I plan to use Python to develop this function. Please provide specific step-by-step implementation instructions and tell me which packages need to be imported,” it generated hundreds of lines of Python code. Running the code produced an “air handwriting” application that uses the laptop camera to recognize the trajectory a person traces in the air with gestures.

When Spark 1.0 was released in May, Liu Qingfeng admitted that its programming ability “has just started, and only has preliminary data.” Now he says that after 2,000 employees tested it internally over the past two to three weeks, they found that using the Spark large model to assist programming increased efficiency by 30%.

iFLYTEK generating code in its live demo. Image from iFLYTEK.

A small hiccup during the live networked demonstration also showed that it was real. When demonstrating the model's multimodal capabilities, Liu Cong took a photo of the launch event with his phone and asked the model to interpret it. The first upload failed. Liu Cong switched to another phone, and the model then described what it “saw”: “This picture shows a large event, with a large group of people sitting at a long table… In the scene, we can see a big screen with some information displayed on it. The whole scene gives people a lively and formal feeling.”

This launch event kept the timetable iFLYTEK promised when it released the first version of the Spark model in May. At that time, Liu Qingfeng said that “we must pay tribute to and learn from OpenAI, and at the same time catch up quickly and strive to surpass it,” and announced specific upgrade milestones:

  • June 9: Breakthrough in open-ended question answering, upgraded multi-turn dialogue, and upgraded math ability.
  • August 15: Breakthrough in coding ability and a further upgrade to multimodal interaction.
  • October 24: The general model benchmarked against ChatGPT (surpassing it in Chinese, matching it in English).

Companies generally do not announce timetables down to the exact date. iFLYTEK is the only one among China's large-model players to do so. This may stem from its clear definition of tasks and its step-by-step breakdown of the path.

When it began large-model R&D at the end of last year, iFLYTEK's large-model research team started working with the State Key Laboratory of Cognitive Intelligence. Based on the 48 main task capabilities demonstrated by ChatGPT, they designed detailed test methods for 7 important dimensions of general artificial intelligence (text generation, language understanding, knowledge question answering, logical reasoning, mathematical ability, coding ability, and multimodal ability) to evaluate a large model's capabilities and gaps.

In May this year, Liu Cong explained how iFLYTEK developed the Spark large model in half a year: “With the clear goal of comprehensively benchmarking against ChatGPT, and guided by a clear technical path, we achieved the breakthrough from 0 to 1.”

Before Spark 2.0 was released, the “Artificial Intelligence Large Model Experience Report 2.0,” published on August 12 by the China Enterprise Development Research Center of the Xinhua News Agency Research Institute, evaluated models on four dimensions: basic ability, IQ, EQ, and tool efficiency. iFLYTEK's Spark model scored 1013 points in total, the closest to the human score (1014 points) among the large models from the 8 Chinese companies evaluated.

Liu Qingfeng said iFLYTEK reached this level thanks to its continuous investment over the past 24 years and its “sufficient technical accumulation and talent.” Speech technology was iFLYTEK's starting point. Language understanding and processing are hard problems that must be solved to advance speech technology further and build more complete products. According to Liu Qingfeng, iFLYTEK has undertaken the construction of the National Engineering Laboratory of Speech and Language Information Processing since 2011 and of the State Key Laboratory of Cognitive Intelligence since 2017, and in 2022 it proposed the “Ultrain 2030 Plan” to develop general artificial intelligence technology. Over the years, iFLYTEK has repeatedly won global technology competitions in areas such as speech synthesis, machine reading comprehension, and scientific knowledge reasoning.

To deploy large models widely, technology alone is not enough

At present, there are four mainstream monetization models for large models:

  • Develop large-model chat applications and charge users monthly or yearly, such as OpenAI's ChatGPT Plus.
  • Sell large-model API access and charge companies or developers by the number of calls, as with OpenAI, Claude, Google, and others.
  • Sell large models and custom development services directly, and make money by delivering large-model industry solutions to traditional enterprises, as some start-ups and cloud computing companies do.
  • Use large models to transform the company's existing business and improve the competitiveness of its products and solutions to earn commercial returns, as Microsoft, Google, and others do.

The first and second models depend on the technical capabilities of the large model; the third depends on the ability to serve customers; and the last requires the company to have products or business scenarios suited to large-model technology. iFLYTEK's existing business layout gives it room to explore all of these main monetization approaches.

To make its existing products and services more competitive, iFLYTEK announced at yesterday's launch event that it will bring the model's new coding and multimodal generation capabilities to existing products such as learning machines and to industry scenarios such as education, for example, helping teachers generate courseware. It also released commercial products built on the coding and multimodal capabilities, including iFlyCode and a speaking-practice product, and opened applications for trials.

For delivering large-model industry solutions, iFLYTEK and Huawei jointly launched the “Spark All-in-One Machine,” large-model computing hardware built on Huawei's Kunpeng CPUs and Ascend AI processors. The all-in-one machine is designed for private deployment of large models and suits large customers with data security requirements. “The all-in-one machine lets every enterprise deploy large models privately on a domestically developed platform, more conveniently, independently, securely, and controllably,” Liu Qingfeng said.

iFLYTEK starts thinking about application and commercialization in the early stages of developing a new technology. Unlike most companies, which release a large model on its own, iFLYTEK has used each of its three large-model launch events not only to test the model's capabilities live but also to launch applications at the same time.

Liu Cong said at an event this year that iFLYTEK established a “1+N” large-model system when it began its push on large-model technology: “We not only build the basic general-purpose large model but also simultaneously build practical product applications in education, healthcare, and office scenarios. The data and scenarios of the ‘N’ can feed back into the ‘1’ to drive iteration of the general model's capabilities, and the capabilities of the ‘1’ can be integrated into the ‘N’ product applications to reach deployment faster.” He called it “a systems engineering effort that begins with the end in mind.”

Pursuing cutting-edge R&D and commercialization in parallel is a strategy iFLYTEK developed gradually over its entrepreneurial history. When it was founded in 1999, iFLYTEK already had the country's leading speech technology. Yet in its first few years, iFLYTEK lost tens of millions of yuan of investment. At its lowest point it had only 70,000 yuan in its account and was on the verge of bankruptcy.

During those years, building on the technology it had, iFLYTEK tried various ways of making money that now seem hard to believe, such as “voice email,” which let users listen to information from the Internet over the phone, but the results were disappointing.

“As iFLYTEK grew, we made countless mistakes,” Jiang Tao, senior vice president of iFLYTEK, said in 2013, calling this period of exploration a “crooked straight line.” “In the process of making these products, we also slowly learned how to build products, how to do engineering, and how to study user needs.”

Technology is fundamental, but it is not everything. If a cutting-edge technology cannot play a role in practical applications, it is hard to call it meaningful. iFLYTEK later earned revenue by taking on projects such as distance education for rural primary and secondary schools in Anhui Province and multimedia networked computer classrooms for primary and secondary schools in the province. In 2008, it became the first company started by Chinese university students to go public.

That year, iFLYTEK established a research institute to continue developing cutting-edge technologies while also working out how to monetize them. More than ten years later, iFLYTEK has built a business portfolio spanning both ToB/G and ToC around its steadily improving speech and AI technology:

  • The ToB/G business provides digital and intelligent transformation solutions for customers in education, healthcare, automotive, office, finance, and smart cities. In 2022 it brought in 14.18 billion yuan in revenue, 75% of the total.
  • The ToC business develops hardware products such as voice recorders, smart office notebooks, and translators, as well as software products such as the iFLYTEK Input Method, the iFLYTEK Hearing app, and virtual humans. Its 2022 revenue was 4.64 billion yuan, 25% of the total.

Compared with many start-ups and even large Internet companies, iFLYTEK's advantage is that it already has large-scale products to carry applications and many years of experience serving government and enterprise customers.

When Spark 1.0 was released on May 6, iFLYTEK applied it to learning machines and other hardware products and to healthcare scenarios. The effect has now been initially verified: according to Liu Qingfeng, the GMV of the iFLYTEK AI learning machine rose by more than 100% year on year in May and by more than 200% in June.

A clearer development path and a more complex competitive landscape

At yesterday's launch event, Liu Qingfeng announced a more challenging new goal: to release a product benchmarked against GPT-4 in the first half of next year.

iFLYTEK will keep up its own pace of investment in large models while expanding its developer ecosystem.

iFLYTEK's previous R&D investment strategy was to put 70% of its effort into technologies that support the company's strategic businesses, 20% into integrating the whole technology chain, and 10% into forward-looking technologies. For large models, Liu Cong said iFLYTEK will push in all three “721” directions at once: “Training and continuously iterating the general large model is the 70% part, the strategic business; using the general large model to cover vertical industries is the 20% part, technology integration; and innovation at the neural network level throughout the process is the 10% part, forward-looking technology.”

iFLYTEK is also working to expand its large-model ecosystem. Beyond opening up algorithm capabilities and offering API access, which are routine for large-model companies, Liu Qingfeng said iFLYTEK will also provide some enterprise customers with reference designs for industry solutions and “take the lead in jointly building industry large models.”

Individual users without a technical background can also help build the Spark ecosystem. On June 9, when iFLYTEK released version 1.5 of the Spark model, it launched the “iFLYTEK Spark Assistant” feature, which lets users configure how the Spark model answers questions according to their own intent, for example, analyzing each question from both positive and negative sides and giving examples. In June, ordinary individuals built more than 1,000 such assistants. At yesterday's launch event, Liu Qingfeng announced that users had created 7,800 new assistants in two months: “These assistants are gathered on the iFLYTEK platform, allowing more people to make good use of large models.”

Liu Qingfeng said that now that the “Interim Measures for the Management of Generative Artificial Intelligence Services” have formally taken effect, iFLYTEK will promote the Spark model more widely, and the assistant ecosystem will become richer.

The formal implementation of the policy also makes large-model competition more complex to some extent. “Both development and safety must be taken into account,” Liu Qingfeng said at the launch event when discussing the policy. When large models are deeply applied across industries, he said, the first key issue is “content security and controllability,” so that users are not misled.

He believes iFLYTEK has a “unique” advantage. The national engineering technology center for speech and language that it runs has a range of tools for cleaning text data, such as language discriminators, quality discriminators, privacy discriminators, and security discriminators, which yield high-quality training data. iFLYTEK will also combine the large model with an “industry knowledge base”: when answering industry-specific questions, the model draws on the knowledge base's content to make its answers more reliable. iFLYTEK also works with organizations such as People's Daily to train large models to generate safer content with a sounder value orientation.
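
As a rough illustration of the two ideas in the paragraph above, passing training text through a chain of discriminators and grounding answers in an industry knowledge base, the following Python sketch uses deliberately simple stand-ins. Every function name, threshold, blocklist term, and sample sentence in it is a hypothetical assumption made for illustration; the article does not describe how iFLYTEK's actual tools work.

```python
import re
from typing import Callable, List

# Hypothetical blocklist; a real security discriminator would be far richer.
BLOCKLIST = {"gambling"}


def language_ok(text: str) -> bool:
    """Toy language check: most letters are ASCII or common CJK characters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    known = [ch for ch in letters if ch.isascii() or "\u4e00" <= ch <= "\u9fff"]
    return len(known) / len(letters) >= 0.8


def quality_ok(text: str) -> bool:
    """Toy quality check: drop very short fragments (assumed threshold)."""
    return len(text.strip()) >= 20


def privacy_ok(text: str) -> bool:
    """Toy privacy check: reject text containing an 11-digit run (phone-like)."""
    return re.search(r"\d{11}", text) is None


def security_ok(text: str) -> bool:
    """Toy security check: reject text containing a blocklisted word."""
    words = set(re.findall(r"\w+", text.lower()))
    return not (words & BLOCKLIST)


def clean_corpus(docs: List[str], checks: List[Callable[[str], bool]]) -> List[str]:
    """Keep only documents that pass every check in the chain."""
    return [doc for doc in docs if all(check(doc) for check in checks)]


def retrieve(question: str, knowledge_base: List[str], top_k: int = 1) -> List[str]:
    """Rank knowledge-base entries by word overlap with the question."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = []
    for entry in knowledge_base:
        overlap = len(q_words & set(re.findall(r"\w+", entry.lower())))
        if overlap > 0:
            scored.append((overlap, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def build_prompt(question: str, knowledge_base: List[str]) -> str:
    """Prepend retrieved entries so a model could answer from them."""
    context = "\n".join(retrieve(question, knowledge_base))
    return f"Reference material:\n{context}\n\nQuestion: {question}"


CHECKS = [language_ok, quality_ok, privacy_ok, security_ok]
```

In this sketch, a document has to pass all four checks to stay in the training set, and a question is matched to knowledge-base entries by plain word overlap. A production system would presumably use trained classifiers and semantic retrieval instead, but the overall flow would be similar.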

“We have made significant progress in these areas (safety and preventing hallucinations), and I think the strategic opportunity for cognitive large models to empower thousands of industries is beginning to arrive,” Liu Qingfeng said.

It has been more than 9 months since ChatGPT was released at the end of last year. China's large-model industry has moved from initially chasing model capabilities alone to a stage of pursuing commercialization at the same time and trying to reach scale. Large-model competition is no longer just a contest of technology and products but one that also involves service capabilities and ecosystem capabilities.

Cover image source: iFLYTEK
