DeepMind: AI has discovered more than 200 million protein structures

Original link: https://www.williamlong.info/archives/6880.html

On July 28, a team from DeepMind, an artificial intelligence company owned by Google parent Alphabet, and the European Bioinformatics Institute (EMBL-EBI) announced a major leap forward in biology. They used the artificial intelligence (AI) system AlphaFold to predict 214 million protein structures from more than 1 million species, covering almost all known proteins on Earth. This breakthrough is expected to accelerate the development of new drugs and bring about a new revolution in basic science.

At the end of 2020, when many people still knew DeepMind mainly for AlphaGo, the Go program that defeated top human players, the appearance of its AI system AlphaFold in the field of biology brought new surprises. At that time, AlphaFold successfully solved a problem that had challenged biology for 50 years, the protein folding problem: it was able to predict the three-dimensional structure of a protein from its amino acid sequence.

Just half a year later, DeepMind and EMBL-EBI published a database of protein structures predicted by AlphaFold, described in a Nature paper. The database covered 350,000 protein structures from humans and 20 commonly used model organisms and made accurate predictions for 98.5% of human protein structures. By comparison, the protein structures previously determined by the scientific community covered only 17% of the amino acids in human protein sequences. This series of breakthroughs in AI-based protein structure prediction was also named the 2021 Breakthrough of the Year by “Science”.

Now, DeepMind’s collaboration with EMBL-EBI has gone a step further. AlphaFold’s predictions are no longer limited to humans and model organisms but have expanded to cover 1 million species of animals, plants, bacteria, and other organisms, and the number of predicted protein structures has grown several hundredfold.

“With this database covering the entire protein universe, we have entered a new era of digital biology,” commented Dr. Demis Hassabis, CEO of DeepMind.

As early as 1972, Dr. Christian Anfinsen, the Nobel Laureate in Chemistry, proposed at the Nobel Prize ceremony that the amino acid sequence of a protein should completely determine its three-dimensional structure. However, because of the astronomical number of conformations that a chain of amino acids can adopt, it is extremely difficult to predict protein structures by calculation. Determining structures with traditional experimental methods (such as X-ray crystallography) is very time-consuming and expensive.

For the new data released today, DeepMind and the EMBL-EBI team stated that among the more than 200 million protein structure predictions, about 35% of the structures have high precision, reaching the structural accuracy obtained by experimental means; 80% of the structures are reliable enough for multiple follow-up analyses.
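To put those percentages in absolute terms, here is a minimal back-of-the-envelope sketch in Python. It assumes the 214 million total and the 350,000-structure earlier release quoted elsewhere in this article, and it treats the "about 35%" and "80%" shares as exact, which they are not. The function and variable names are illustrative only. The article does not say whether the 80% group includes the 35% group, so the two counts are computed independently.

```python
# Figures quoted in the article; all of them are approximate.
TOTAL_STRUCTURES = 214_000_000  # predictions in the expanded database
EARLIER_RELEASE = 350_000       # structures in the release a year earlier


def share_to_count(share: float, total: int = TOTAL_STRUCTURES) -> int:
    """Convert a fractional share into an approximate number of structures."""
    return round(share * total)


high_precision = share_to_count(0.35)  # "about 35%" with experimental-level accuracy
reliable = share_to_count(0.80)        # "80%" reliable enough for follow-up analyses
growth = TOTAL_STRUCTURES / EARLIER_RELEASE  # expansion factor between releases

print(f"High precision: about {high_precision / 1e6:.0f} million structures")
print(f"Reliable enough: about {reliable / 1e6:.0f} million structures")
print(f"Growth over the earlier release: about {growth:.0f}x")
```

Under these assumptions, the high-precision group comes to roughly 75 million structures and the reliable group to roughly 171 million, while the database as a whole is about 600 times larger than the earlier release.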

However, the current AlphaFold still has room for improvement. The next question for the research team is how to develop models that predict how proteins fold, not just their final structure, says Dr. Tomek Wlodarski of University College London.

Dr. Pushmeet Kohli, head of the scientific team at DeepMind, also pointed out that they are currently improving the accuracy and performance of AlphaFold: “We are trying to understand the behavior of these proteins, how they interact with other proteins.”

When the Nature paper was published a year ago, the research team made AlphaFold’s source code and database freely available to researchers. At present, more than 500,000 scholars from 190 countries and regions have accessed the database. This data has already been used in scenarios such as malaria vaccine development, combating antibiotic resistance and plastic pollution, and helping researchers accelerate the development of new drugs.

This time, the team has once again released the latest database for free, and all of the more than 200 million protein structures can be downloaded from it. Experts in the field of new drug development say the vast database allows researchers to focus more on confirming the details of protein structures, which are key to the success of many targeted drugs. “We no longer need to ask, ‘Where is the protein structure?’ but rather, ‘How useful are the protein structures we have?’” one expert said, adding: “This database allows us to expand the range of druggable genomes, greatly increasing the options scientists have in their discovery of innovative drugs.”

Dr. Demis Hassabis, CEO of DeepMind, said that AlphaFold gives us a glimpse into the future and shows the enormous potential of applying computing and AI technologies to biology. At its most basic level, biology can be thought of as an information processing system, albeit a very complex one. AI technology may hold the key to addressing the dynamic complexity of biology. The team at DeepMind is very excited to see the great potential of AI being realized, and AI promises to become one of the most powerful tools for human science to discover and understand the fundamental principles of life.

Source: WuXi AppTec

