Why is the CIPU cloud computing’s “anti-involution” blade?

In the surging cloud computing market, a “Huashan sword contest” is brewing, and at the center of this showdown is the hot new class of dedicated processors for cloud data centers.

The showdown is heating up. First Nvidia popularized the DPU (Data Processing Unit); then Intel and Google jointly introduced the IPU (Infrastructure Processing Unit) as a strong follow-up.

This week, with its newly released CIPU (Cloud Infrastructure Processing Unit), Alibaba Cloud joined this “Huashan sword contest” of the cloud computing 3.0 era as a strong contender.

Functionally, the DPU, IPU, and CIPU are roughly similar: all are dedicated processors designed for new cloud data centers. Alibaba Cloud believes that the CIPU will eventually replace the CPU as the control and acceleration center of cloud computing.

The same martial arts manual has given rise to three schools. Winning or losing this “sword contest” is about more than kung fu.

What the cloud computing giants are really fighting for is the right to define the next generation of cloud computing standards and become the “martial arts leader” in the new era.

Alibaba Cloud’s entry makes this contest more interesting.

After all, Zheng Weimin, an academician of the Chinese Academy of Engineering and a professor at Tsinghua University, has also praised the CIPU newly released by Alibaba Cloud. In his view, “the CIPU completely breaks with the previous generation of computing architecture and, building on foundational technology, delivers a world-leading dedicated processor for cloud data centers.”

How will CIPU open up a new battlefield for cloud computing?

Zhang Jianfeng, President of Alibaba Cloud Intelligence, released the CIPU

Cloud computing is entering the 3.0 era, and IPU/DPU/CIPU have become a new battlefield

CIPU, IPU and DPU are new concepts for most people, but they have become a battleground for the world’s leading cloud service providers.

That is because this new class of processor has become the key for large cloud service providers to enter the cloud computing 3.0 era.

Zhang Jianfeng, president of Alibaba Cloud Intelligence, believes that over the past ten years cloud computing technology has gone through two stages of development. In the first stage, distributed and virtualization technologies replaced mainframes, meeting the scale of computing power that enterprises needed at the time. In the second stage, resource pooling technology emerged. Taking Alibaba as an example, computing, storage, and network resources were pooled separately through an architecture that separates computing from storage, breaking through the bottlenecks of scale and stability and enabling ultra-large-scale cloud computing services.

“As data-intensive computing scenarios become widespread, users’ demands for low latency and high bandwidth keep growing, and the traditional CPU-centric computing architecture cannot adapt to this trend. To solve this problem, Alibaba Cloud’s R&D teams began technical research as early as 2015, kept deepening the core technologies of computing, networking, and storage, and integrated them vertically. From this evolved a new architecture centered on the CIPU, and cloud computing began to enter its third stage,” Zhang Jianfeng added.

In the traditional CPU-centric cloud computing architecture, the CPU must both carry out computing tasks and handle control logic, so it cannot provide high bandwidth. Developing new CIPU/IPU/DPU acceleration chips to meet ever-growing data-intensive computing needs has become an industry consensus, and a new battlefield for the cloud computing giants.

Yan Guihai, CEO of Zhongke Yushu, said: “The DPU was born out of the imbalance between bandwidth and computing performance. CPU performance growth has fallen from 30% per year 5-10 years ago to less than 3% per year three years ago, while network bandwidth still grows by about 35% per year. The ratio of processing performance growth to bandwidth growth has gone from about 1:1 to about 1:10 today.”

Dayu Zhixin CEO Li Shuang pointed out, “When the gap exceeds 10 times, you need to think about a new architecture. The DPU is really an architectural shift.”
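
The figures in these two quotes can be checked with a little arithmetic. The sketch below takes the quoted annual growth rates (about 3% for CPU performance, about 35% for network bandwidth) and computes the ratio of the two rates. It then adds an assumption that is not in the article, namely that both rates stay constant and compound yearly, to estimate how quickly the cumulative gap would reach the 10x level Li Shuang mentions. The function names are illustrative only.

```python
import math

# Annual growth rates quoted in the article, written as fractions.
CPU_GROWTH = 0.03        # "less than 3% per year"
BANDWIDTH_GROWTH = 0.35  # "about 35%" per year


def growth_rate_ratio(cpu: float, bandwidth: float) -> float:
    """Ratio of the bandwidth growth rate to the CPU growth rate."""
    return bandwidth / cpu


def cumulative_gap(cpu: float, bandwidth: float, years: float) -> float:
    """Factor by which bandwidth outgrows CPU performance after `years`.

    Assumption (not from the article): both rates stay constant and
    compound once per year.
    """
    return ((1 + bandwidth) / (1 + cpu)) ** years


def years_until_gap(cpu: float, bandwidth: float, target: float) -> float:
    """Years of compounding needed for the cumulative gap to reach `target`."""
    return math.log(target) / math.log((1 + bandwidth) / (1 + cpu))


if __name__ == "__main__":
    ratio = growth_rate_ratio(CPU_GROWTH, BANDWIDTH_GROWTH)
    print(f"growth-rate ratio: 1:{ratio:.1f}")
    print(f"gap after 5 years: {cumulative_gap(CPU_GROWTH, BANDWIDTH_GROWTH, 5):.1f}x")
    print(f"years to a 10x gap: {years_until_gap(CPU_GROWTH, BANDWIDTH_GROWTH, 10):.1f}")
```

With the quoted rates, the ratio of growth rates comes out slightly above 10, which is consistent with the “1:10” figure. The cumulative-gap functions are a separate reading of the word “gap” and depend entirely on the constant-rate assumption.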

Traditional chip giants, cloud service providers, and start-ups alike have all rushed into this field in recent years. According to Leifeng.com, the world’s leading cloud service providers are developing their own DPUs. Even so, Alibaba Cloud’s self-developed CIPU has outstanding advantages.

What is unique about the CIPU?

Unlike general-purpose computing chips such as CPUs and GPUs, the DPU, IPU, and CIPU are typical application-driven chips. For such chips, the integration of software and hardware and an understanding of application scenarios are crucial.

Jiang Linquan, head of Alibaba Cloud’s virtualization technology, said, “The CIPU is a chip that we define according to our own business. Upward, it connects to the Feitian cloud operating system, which links millions of servers around the world into one supercomputer; downward, it rapidly cloudifies data center computing, storage, and network resources and accelerates them in hardware. Our self-developed CIPU can more precisely solve the management, control, scheduling, and acceleration problems of certain core services in the cloud operating system.”

CIPU Architecture Diagram

Is the CIPU a combination of IPU and DPU? Jiang Jiangwei, head of Alibaba Cloud’s technical products, believes, “That statement is both right and wrong. An IPU or DPU on its own, without an operating system like Feitian behind it, is not that valuable. The CIPU naturally has to work together with a cloud computing operating system to generate value.”

A number of industry insiders also told Leifeng.com that, because the DPU is an application-driven chip, how well it integrates with a cloud service provider’s infrastructure is the key to its success. The underlying software and hardware architectures of different cloud service providers differ, so it is difficult for a DPU or IPU designed by an external chip company to fit any one provider perfectly. The advantages of self-developing the CIPU are obvious.

But compared with chips that other cloud service providers, such as AWS, have also developed in-house, what is unique about Alibaba Cloud’s CIPU?

Jiang Linquan believes: “We have entered a similar new stage, but in different markets we see different landscapes. First, in terms of product performance, the CIPU far surpasses other products in computing, network, and storage, because domestic customers pursue extreme performance and cost-effectiveness; this is also related to our deep vertical technology stack. Second, our customers differ significantly from those of overseas cloud service providers. Abroad there are many mature enterprise users, whereas in China there are many small and medium-sized customers, who need more inclusive services.”

Of course, data is the most direct way to prove the value the CIPU brings in practice. Under the new-generation cloud computing architecture built on the CIPU and the Feitian operating system, Alibaba Cloud’s computing, network, and storage performance have all made a comprehensive leap.

At the computing level, the CIPU can quickly connect Shenlong computing platforms with different types of resources, delivering “zero” loss of computing power along with hardware-level security hardening and isolation.

The gains show up across scenarios. In mainstream general-purpose computing, Nginx performance improves by 89%, Redis by 68%, and MySQL by 60%. In big data and AI scenarios, AI deep learning training performance improves by 30%, and Spark computing performance also improves by 30%.

With the CIPU combined with the network, basic bandwidth rises from 100G to 200G, network latency falls from 22us to 16us, and latency can be as low as 5.5us under the RDMA protocol.
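
To put these network figures on a common footing, the short sketch below converts them into percentage reductions and speedup factors. It uses only the numbers quoted above; the helper names are illustrative and not part of any Alibaba Cloud tooling.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


def factor(before: float, after: float) -> float:
    """How many times larger `before` is than `after`."""
    return before / after


# Figures quoted in the article.
BASELINE_LATENCY_US = 22.0   # network latency before the CIPU
CIPU_LATENCY_US = 16.0       # network latency with the CIPU
RDMA_LATENCY_US = 5.5        # lowest latency quoted under RDMA
OLD_BANDWIDTH_G = 100.0
NEW_BANDWIDTH_G = 200.0

if __name__ == "__main__":
    cut = percent_reduction(BASELINE_LATENCY_US, CIPU_LATENCY_US)
    rdma_cut = percent_reduction(BASELINE_LATENCY_US, RDMA_LATENCY_US)
    print(f"latency reduction with CIPU: {cut:.1f}%")
    print(f"latency reduction under RDMA: {rdma_cut:.1f}%")
    print(f"RDMA latency factor: {factor(BASELINE_LATENCY_US, RDMA_LATENCY_US):.1f}x lower")
    print(f"bandwidth factor: {factor(NEW_BANDWIDTH_G, OLD_BANDWIDTH_G):.1f}x")
```

Read this way, the drop from 22us to 16us is a reduction of a little over a quarter, while the RDMA figure is one quarter of the original latency.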

It is particularly worth mentioning that the CIPU can apply hardware acceleration to high-bandwidth physical networks. By building a large-scale eRDMA distributed high-performance network, Alibaba Cloud makes RDMA, a “noble” technology once confined to supercomputing, universally available on its cloud.

The combination of CIPU and storage enables hardware acceleration for block storage access in the storage-computing separation architecture. Cloud disk IOPS can reach up to 3 million, and long-tail latency is reduced by 50%, surpassing all cloud products on the market across the board. For local storage, it also brings more secure, reliable, and high-performance capabilities.
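
An IOPS figure only translates into throughput once a block size is fixed, and the article does not give one. The sketch below therefore assumes a 4 KiB block, a common benchmark size, purely for illustration, and compares the resulting throughput with the 200G network bandwidth quoted earlier, read here as 200 Gbit/s (also an assumption). All names are hypothetical.

```python
IOPS = 3_000_000                 # peak cloud disk IOPS quoted in the article
ASSUMED_BLOCK_BYTES = 4 * 1024   # assumption: 4 KiB per I/O, not stated in the article
ASSUMED_LINK_GBIT = 200.0        # assumption: "200G" means 200 Gbit/s


def throughput_bytes_per_s(iops: int, block_bytes: int) -> int:
    """Bytes moved per second at a given IOPS and block size."""
    return iops * block_bytes


def to_gbit_per_s(bytes_per_s: float) -> float:
    """Convert bytes per second to gigabits per second (decimal giga)."""
    return bytes_per_s * 8 / 1e9


if __name__ == "__main__":
    bps = throughput_bytes_per_s(IOPS, ASSUMED_BLOCK_BYTES)
    gbit = to_gbit_per_s(bps)
    print(f"throughput: {bps / 1e9:.2f} GB/s ({gbit:.1f} Gbit/s)")
    print(f"share of assumed link: {gbit / ASSUMED_LINK_GBIT:.0%}")
```

Under these assumptions the storage traffic would occupy roughly half of the link, which illustrates why block storage acceleration and network bandwidth are discussed together. A different block size changes the result proportionally.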

The comprehensive improvement the CIPU brings to the three core elements of cloud computing (computing, storage, and network) will affect not only the cloud and the inside of the data center but also the form of traditional computer terminals and the way software applications are distributed. It also means that cloud computing is entering its next era.

Alibaba Cloud believes that the new generation of cloud computing requires systematic innovation starting inside the data center, moving from the previous CPU-centric architecture to a CIPU-centric one.

Behind the CIPU battle is the battle for the right to define the next generation of cloud computing standards

Changes in the cloud computing architecture will also trigger a debate over the right to define the next-generation cloud computing standards. Past experience has taught us that only the best in the industry have the right to define standards.

“Today we can clearly see that Alibaba Cloud has achieved a perfect combination of software and hardware and formed a cloud computing technology system built on ‘Feitian + CIPU’,” said Zhang Jianfeng. “Alibaba Cloud’s core technology has always been at the forefront of the world. This new system is a new milestone on the long march of technology, and it is defining the next generation of cloud computing architecture.”

Alibaba Cloud’s confidence rests on 13 years of in-house development of core technologies, through which it has built a new computing architecture that integrates software and hardware, spanning self-developed chips, servers, computing, storage, and networking. Alan Kay, winner of the 2003 Turing Award, once said that anyone who is really serious about software should make their own hardware, in order to deliver a differentiated experience.

Operating systems and software are the products closest to end users, and only with a deep understanding of them can a company provide differentiated and competitive products.

Alibaba Cloud has chosen this path of self-development. First, it developed China’s only cloud operating system, Feitian, which connects millions of servers around the world into one supercomputer, with a single cluster scaling to 100,000 servers, hundreds of billions of files, and EB-level storage space.

With Feitian in place, further improvement requires mastering the core technology from top to bottom, from the system to the software to the hardware. That leads back to the three elements of computing, storage, and networking.

At the computing layer, to solve the long-standing problem of performance loss in server virtualization, Alibaba Cloud independently developed the Dragon Architecture. At the storage layer, Pangu, a distributed storage system developed by Alibaba, uses an advanced fault-tolerant distributed architecture and a flexible platform design, greatly improving the reliability and security of the storage system. At the network layer, Alibaba Cloud’s self-developed NetShen Yun network supports service deployments for millions of users, allowing more people to experience the efficient and convenient services that cloud computing brings.

Through its self-developed database PolarDB, Alibaba Cloud has further improved availability, concurrent processing, and elasticity, allowing it to cope efficiently with traffic peaks on the scale of “Double 11”.

On this basis, Alibaba Cloud released its self-developed Panjiu server and Longli operating system last year. The Panjiu server adopts the latest modular design, which increases server delivery efficiency by 50%. The Longli operating system, besides greatly improved performance, supports multiple chip architectures and computing scenarios, including x86, ARM, and LoongArch, making Alibaba Cloud the cloud vendor that supports the most types of CPUs in the world.

Because the Yitian 710 CPU that Alibaba Cloud released last year was designed for cloud computing, it can deliver industry-leading performance. The CIPU released this year is a natural next step for Alibaba Cloud: the path of a company that is serious about software moving on to self-developed hardware.

After years of in-house development, Alibaba Cloud stands at the peak of the cloud computing field. The newly released CIPU gives Alibaba Cloud the strongest link between the upper-level Feitian (Apsara) operating system and the underlying computing, network, and storage resources. It once again breaks the bottleneck of the data center, leads the data center from CPU-centric to CIPU-centric, and pushes cloud computing into the 3.0 era.

In this new stage of cloud computing, Alibaba Cloud will have the strength to define the next-generation cloud computing standards and the opportunity to stand at the top of the global cloud computing field.

This article is reprinted from: https://www.leiphone.com/category/industrynews/lCeeYBvr9YHvwpny.html
This site is for inclusion only, and the copyright belongs to the original author.
