The AI industry has once again reached a crossroads. The high cost of computing power and the razor-thin margins of the customized-project business model have pushed AI companies into a collective state of "negative profit."
Enterprises increasingly rely on AI to achieve digital transformation, but as a new generation of infrastructure, the AI industry has run into problems of its own: algorithms demand huge investment from research and development through to deployment, and a large share of algorithms never make it smoothly into real applications.
At the source of the problem, cloud computing may be the remedy: it offers lower-cost computing power and low-threshold development services, and companies lacking algorithm R&D capabilities can directly call the algorithms that cloud vendors provide on the cloud, with no need to reinvent the wheel.
Gartner, the well-known market analysis firm, picked up on this trend early, and in 2020 began publishing its "Critical Capabilities for Cloud AI Developer Services" report. In Gartner's view, AI and the cloud will grow ever more tightly coupled, and the capability of a vendor's cloud AI services will become an important indicator for the AI industry.
Encouragingly, Chinese companies have already caught this wave. In this year's report, Alibaba's language AI technology ranked second in the world, surpassing Amazon AWS, Microsoft, and others, and officially entered the global first tier.
Partly for this reason, the outlook for the AI industry remains broadly optimistic. The Gartner report notes that by 2025, 70% of new applications will integrate AI models, and cloud AI services can lower the development threshold for AI applications. In other words, cloud computing will be the biggest variable for AI during its growing pains.
Why is the cloud the answer for AI?
Two mountains blocking AI commercialization: computing costs and project costs
As early as 2017, academia and industry were already debating deep learning heatedly at CVPR, one of the most influential AI conferences.
The focus of that debate: deep learning's "big data + big compute" paradigm requires enormous spending to sustain, which would inevitably become the biggest obstacle to AI commercialization.
"Deep learning is indeed more accurate than traditional AI methods on data such as speech and images, which is key to why it leads the third wave of AI. But deep learning is a double-edged sword: its appetite for inputs (computing power, data, energy consumption), especially computing power, far exceeds traditional methods. It's as if, in the past, you could get by on two steamed buns a day; now, to live better but constrained by the menu, your only choice is expensive wagyu every day. More nutritious, yes, but clearly unsustainable," several AI experts told Leifeng.com.
Because of AI's high compute and energy costs, in the eyes of many results-oriented researchers, deep learning once became a byword for brute force.
In 2012, Google used 16,000 processors to have an AI watch millions of YouTube videos in order to recognize cats, and even then the system was error-prone and far less efficient than a single glance from a human eye.
In 2016, in the man-machine match in which AlphaGo defeated Go champion Lee Sedol, AlphaGo consumed roughly 1 million watts of power per game. By comparison, the human brain runs on about 20 watts, one fifty-thousandth of AlphaGo's draw.
After 2018, the Transformer architecture and BERT gave rise to large pre-trained models. AI performance grew stronger, but the computing power required also surged; for most small and medium-sized enterprises, purpose-building such a cluster is simply unaffordable.
Computing power is in such short supply that it has become a scarce resource across the entire AI field. This is also a major reason so many academic AI luminaries have flocked to large technology companies such as Google, Microsoft, and Alibaba: these companies have rich business scenarios and almost inexhaustible computing resources.
And AI's problems do not stop there. In commercial deployment, enterprises must build a bespoke solution for every scenario, which quietly inflates development costs and erodes profits.
Early startups put their faith in the playbook of "build an SDK, standardize first, then scale; thin margins but high volume." Reality proved harsh. When AI companies rushed into industries armed with SDKs, they found that B-end customers, accustomed to heavily customized, hands-on service, neither wanted a standalone development kit nor had the ability to integrate one; what they needed were customized solutions. The dream of conquering the world with a single SDK package was shattered.
After the SDK dream broke, AI companies pivoted from light to heavy, taking the road of highly customized solutions. But the project-based model, full of bespoke work, easily drags a company into a whirlpool of losses: long customer-acquisition cycles, high implementation costs, labor-intensive delivery. High costs mean meager profits, and, if a company is careless, the more business it does, the more it loses.
The dream of standardization proved fragile, and the dilemma of customization proved hard to solve; commercially, AI companies were stuck between the two.
The facts show that these two drags, computing-power costs and project costs, are making AI stumble.
To shed these two drags, the industry must break out of its established thinking and take a new path. Experts told Leifeng.com that top universities and leading technology companies are currently exploring two directions: at the basic theoretical level, using innovative algorithms to make AI itself leaner and smarter; at the engineering level, driving down the cost of AI research and development.
Why cloud computing is a remedy for the "AI cost dilemma"
Among AI's costs, computing power is without doubt the biggest crux of the problem, and also the biggest point of breakthrough.
Scaling up computing clusters to drive down the unit cost of computing power is a clear and feasible path.
In the early days, AI's computing needs were modest and CPUs sufficed. But with the arrival of deep learning, high-quality AI algorithms came to depend on staggering volumes of data. Training at this scale far exceeded anything before, and more powerful GPUs gradually took the stage, becoming the mainstream of AI computing.
However, as deep learning advanced and models grew ever larger, a single GPU could no longer keep up, making GPU parallel computing clusters essential. Large-scale clusters not only reduce GPU procurement costs but also improve computing performance through parallelism.
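To make the idea of a GPU parallel cluster concrete, here is a purely illustrative sketch of data-parallel training, the scheme most large clusters use: each worker computes gradients on its own shard of the data, and the gradients are then averaged (an "all-reduce") before every replica applies the same update. This toy version runs the workers sequentially in NumPy; a real cluster would use a framework such as PyTorch's DistributedDataParallel, and every name and number below is made up for illustration.

```python
import numpy as np

# Toy sketch of data-parallel training: each "worker" holds one shard of
# the data, computes a local gradient, and the gradients are averaged
# (the "all-reduce" step) before the shared weights are updated.

def gradient(w, X, y):
    # Gradient of mean squared error for a linear model y ~ X @ w
    return 2 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

n_workers = 4  # stand-ins for 4 GPUs; shards are equal-sized
shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))

w = np.zeros(3)
for _ in range(200):
    # Each worker computes a gradient on its own shard (in parallel on
    # a real cluster; sequentially here)...
    grads = [gradient(w, Xs, ys) for Xs, ys in shards]
    # ...then the all-reduce averages them so every replica stays in sync.
    w -= 0.1 * np.mean(grads, axis=0)

print(np.round(w, 2))  # converges toward true_w = [1.0, -2.0, 0.5]
```

Because the shards are equal-sized, the averaged gradient equals the gradient over the full dataset, which is why the replicas stay mathematically in sync.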
But a new problem then emerged: having resources does not mean using them well. Without reasonable, efficient resource management, even the strongest GPU parallel cluster cannot by itself forge a high-quality AI model or carry a smooth AI application. The computing dilemma enterprises face today contains many concrete pain points:
Without linearly scalable computing power, 100 machines may deliver nowhere near 100 times the performance of one, with large amounts of time lost to non-compute overhead.
Without the means to improve resource utilization, an expensive GPU cluster can easily sit below 10% utilization.
Business growth is hard to forecast; when a project lands, capacity must be deployed fast, and procuring hardware offline makes it easy to miss the window of opportunity.
GPU cards fail at a high rate, forcing enterprises to divert staff to the thankless grind of IaaS operation and maintenance.
GPUs are refreshed roughly every six months; constantly swapping in the latest model keeps costs high while the old cards sit idle.
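The scalability pain point above can be made concrete with a back-of-the-envelope, Amdahl's-law-style model (the 20% overhead figure is purely illustrative, not a measurement): if a fixed fraction of every training step goes to communication and synchronization rather than computation, adding machines yields sharply diminishing returns.

```python
# Back-of-the-envelope scaling model: comm_fraction is the share of each
# training step spent on non-compute overhead (communication, sync).
# That share does not shrink as machines are added, so it caps the
# achievable speedup, Amdahl's-law style.

def effective_speedup(n_machines, comm_fraction):
    return 1 / (comm_fraction + (1 - comm_fraction) / n_machines)

for n in (1, 10, 100):
    print(n, round(effective_speedup(n, 0.20), 1))
# With 20% overhead, the printed speedups are roughly 1.0, 3.6, 4.8:
# 100 machines deliver under 5x the throughput of one, not 100x.
```

This is why "linear scalability" is a real engineering achievement rather than something a cluster gives you for free.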
This is where developing AI on the cloud comes in. Cloud computing's native traits, such as elasticity, sharing, and interoperability, map directly onto these pain points. Enterprises can scale capacity up or down on demand, anytime and anywhere, improving compute efficiency and cutting AI R&D costs, while infrastructure operation and maintenance can be handed to more specialized cloud vendors.
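A rough cost sketch shows why elasticity matters at the utilization levels described above (all figures here, the 8 GPUs, the $2.00/GPU-hour rate, and the 10% utilization, are hypothetical assumptions, not vendor prices):

```python
# Hypothetical cost comparison: an owned GPU cluster accrues (amortized)
# cost around the clock regardless of load, while elastic cloud capacity
# is billed only for hours actually used.

HOURS_PER_MONTH = 730

def monthly_cost_owned(n_gpus, hourly_cost):
    # Owned hardware costs the same whether busy or idle.
    return n_gpus * hourly_cost * HOURS_PER_MONTH

def monthly_cost_elastic(n_gpus, hourly_cost, utilization):
    # Elastic capacity is paid for only while jobs are running.
    return n_gpus * hourly_cost * HOURS_PER_MONTH * utilization

owned = monthly_cost_owned(8, 2.00)
elastic = monthly_cost_elastic(8, 2.00, 0.10)  # the sub-10% utilization case
print(owned, elastic)  # at 10% utilization, owned cost dwarfs elastic
```

The gap narrows as utilization rises, which is exactly the point: elasticity pays off most for the bursty, unpredictable workloads the pain points above describe.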
This lets enterprises fully exploit the technological dividends already on the market, empowering themselves and speeding up their own business iteration, even as AI models grow more complex and hungrier for compute.
Domestic Internet cloud vendors, represented by Alibaba Cloud, laid this groundwork early and already offer this set of technologies as external services.
Alibaba Cloud's Zhangbei data center, which can house millions of servers
It is worth noting that, unlike AI unicorns focused on to-B and to-G business, these Internet cloud giants run plenty of large-scale businesses of their own. That keeps their computing clusters highly utilized and spreads out GPU depreciation costs, avoiding the problem of idle cluster capacity.
The approach echoes Google's own story. Eric Schmidt, Google's former CEO, once said that one of the key factors behind Google Search's competitive edge was its low cost.
"Google's operating costs are a fraction of Microsoft's and Yahoo's; a single search costs only a fraction of a cent. With the money saved, Google can buy more servers and improve computing performance, so that at the same unit price as its competitors, Google can throw more hardware and better algorithms at the problem and deliver better search quality."
The first thing a truly first-class technology company should do is use technology to cut costs and raise efficiency; only by lowering the cost of production factors can AI genuinely penetrate industries.
Reducing one's own production costs, improving the utilization of computing resources, maximizing marginal returns, and moving toward large-scale application at the lowest possible cost: this is the best path for the development of the technology industry.
Beyond computing power, cloud AI services also effectively lower the development threshold for AI applications. Take Alibaba: its machine learning platform PAI, the foundational algorithm models developed by DAMO Academy, and a range of training-acceleration frameworks address AI algorithm development needs end to end, at a low barrier to entry.
Cloud vendors shoulder the burden of AI industrialization
Stepping back from the technical level to the commercial one, cloud computing is also helping the AI industry accelerate its breakthrough.
At present, starting from the project-based model, the domestic AI industry has followed three main evolution paths. The first, and the hardest to turn a profit, is multi-industry expansion: to grab market share quickly or find a business breakthrough, companies push into finance, healthcare, retail, and several other fields at once, operating on multiple fronts. The second is to focus on a single vertical industry, deepen solutions and services there, and then seek to build a platform within that field. The third is to first polish and productize the algorithms themselves, then serve them externally through a cloud platform, using the platform's infrastructure capabilities to help enterprises develop their own algorithms.
Three paths for the evolution of the domestic AI industry
Leading Internet cloud vendors, represented by Alibaba Cloud, are moving down the third, and healthiest, of these paths in the AI field.
The advantage of this model is that a cloud-platform base not only removes the shackles of most on-premises deployments, it also offers low-cost, self-developed algorithms that can quickly serve enterprises with weak algorithm R&D capabilities. For example, the vision, speech, and NLP algorithms developed by DAMO Academy are available on Alibaba Cloud as external services. Meanwhile, cloud computing, storage, networking, and the machine learning platform provide full-pipeline support for AI R&D and deployment to companies that do have algorithm R&D capabilities.
This marriage of cloud and AI is already bearing fruit. Take Mimozhixing, for example: the company runs its algorithm-training workloads on Alibaba Cloud, using the latter's object storage OSS and file storage CPFS to achieve tiered hot/cold storage and efficient movement of massive datasets, and trains distributed models on elastic GPU instances through the machine learning platform PAI. Throughput improved by 110%, and model maturity rose markedly in a short period. Reportedly, training efficiency can improve by up to 70%, while overall costs fall by about 20%.
Over the past decade, cloud computing has spread through every industry at the speed of DNA replication, thanks to its dual advantages in compute cost and business model. Today, the value it proved in general-purpose computing is being replicated in the AI field, helping AI break through its deployment bottleneck and reach thousands of industries.
Gartner makes no secret of this prediction either: its latest cloud AI services report projects that the artificial intelligence software market will reach US$134.8 billion by 2025, with cloud AI services among the indispensable core drivers.
In fact, looking back over the ebbs and flows of the artificial intelligence industry's past half century, every trough has been followed by a breakthrough brought on by some new variable. Today, cloud computing is that variable, carrying high expectations, and this time the job of putting the AI industry back on track has been handed to the cloud vendors. (Leifeng.com)
This article is reprinted from: https://www.leiphone.com/category/industrycloud/s5WnuUv4ywPbSX1R.html