Original link: https://www.msra.cn/zh-cn/news/features/ai4science
Chris Bishop, Microsoft Technical Fellow, Director of the Center for Scientific Intelligence at Microsoft Research
In the next decade, deep learning is destined to have a transformative impact on the natural sciences. The results have potentially far-reaching implications and could greatly improve our ability to model and predict natural phenomena on vastly different spatial and temporal scales. Does this capability represent the dawn of a new paradigm of scientific discovery?
Jim Gray, Turing Award winner and former Microsoft Technical Fellow, described the historical evolution of scientific discovery in terms of “four paradigms”. The first paradigm originated thousands of years ago and is purely empirical, based on direct observation of natural phenomena. While many regularities are evident in these observations, there was no systematic way to capture or express them. The second paradigm is characterized by theoretical models of nature, such as Newton’s laws of motion in the 17th century or Maxwell’s equations of electrodynamics in the 19th century. These equations were derived by induction from empirical observations and generalize to a much wider range of situations than those observed directly. Although they can be solved analytically in simple scenarios, it was not until the development of electronic computers in the 20th century that they could be solved in more general cases, giving rise to the third paradigm, based on numerical computation. In the early 2000s, computing transformed science again, this time through its ability to collect, store, and process vast amounts of data, giving birth to the fourth paradigm of data-intensive scientific discovery. Machine learning is an increasingly important component of the fourth paradigm, enabling the modeling and analysis of large-scale experimental scientific data. The four paradigms are complementary and coexist.
Paul Dirac, a pioneer of quantum physics, said in 1929: “The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.” For example, the Schrödinger equation describes the behavior of molecules and materials at the subatomic level with extremely high precision, but high-precision numerical solutions can be obtained only for very small systems consisting of a handful of atoms. Scaling to larger systems requires increasingly crude approximations, leading to a trade-off between scale and accuracy. Even so, quantum chemistry calculations have become one of the most important workloads for supercomputers.
In the past year or two, however, we have seen a new use for deep learning: as a powerful tool for addressing this trade-off between speed and accuracy in scientific discovery. This way of using machine learning is radically different from the data modeling of the fourth paradigm, because the data used to train the neural networks comes from numerical solutions of the fundamental equations of science rather than from empirical observation. We can think of numerical solutions of scientific equations as simulators of nature that can be used, at high computational cost, for many applications of interest, such as predicting the weather, simulating galaxy collisions, optimizing fusion reactor designs, or calculating the binding free energy of a drug candidate molecule to its target protein. From a machine learning perspective, however, the intermediate details of the simulation can be viewed as training data for deep learning emulators of the simulator. Such data is fully labeled, and the amount of data is limited only by the computational budget. Once trained, an emulator can perform new calculations very efficiently, sometimes delivering speedups of several orders of magnitude.
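The idea of training on simulator output rather than on observations can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the article: the pendulum "simulator", the function names `simulate`, `fit_polynomial` and `emulate`, and above all the use of a least-squares polynomial as a stand-in for a deep neural network.

```python
import math


def simulate(theta0, t_end=1.0, steps=2000):
    """Toy 'expensive' simulator (hypothetical): integrate the pendulum
    equation theta'' = -sin(theta) with many small semi-implicit Euler
    steps, starting at rest, and return the angle at time t_end."""
    dt = t_end / steps
    theta, omega = theta0, 0.0
    for _ in range(steps):
        omega -= math.sin(theta) * dt
        theta += omega * dt
    return theta


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    A polynomial stands in here for the deep network in the text."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for the matrix A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - known) / ata[i][i]
    return coeffs


def emulate(coeffs, theta0):
    """Cheap emulator: evaluate the fitted polynomial."""
    return sum(c * theta0 ** i for i, c in enumerate(coeffs))


# Training data comes from the simulator itself, so every input is labeled
# and the dataset size is limited only by how many simulations we can afford.
train_x = [1.5 * i / 20 for i in range(21)]
train_y = [simulate(x) for x in train_x]
coeffs = fit_polynomial(train_x, train_y, degree=5)

# Compare emulator and simulator on an initial angle not in the training set.
held_out = 0.77
print("simulator:", simulate(held_out))
print("emulator: ", emulate(coeffs, held_out))
```

In this toy setting each emulator call costs a handful of arithmetic operations, while each simulator call runs 2,000 integration steps. Real emulators replace the polynomial with a neural network and the pendulum with a far more expensive solver, but the workflow of simulating, fitting, and then querying the cheap model is the same.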
This “fifth paradigm” of scientific discovery represents one of the most exciting frontiers in machine learning and the natural sciences. While these emulators still have a long way to go before they are fast, robust, and general-purpose enough to become mainstream, their potential real-world impact is clear. For example, the number of small-molecule drug candidates alone is estimated to be as high as 10^60, while the total number of stable materials is thought to be about 10^180 (roughly the square of the number of atoms in the known universe). Finding more efficient ways to explore these vast spaces will enhance our ability to discover new substances, such as better drugs to treat disease, better substrates for capturing atmospheric carbon dioxide, better battery materials, new fuel cell electrodes to power the hydrogen economy, and countless others.
“AI4Science is an effort deeply rooted in Microsoft’s mission: to leverage our artificial intelligence capabilities to develop new tools for scientific discovery, so that we and the rest of the scientific community can address some of the most important challenges facing humanity. For more than 30 years, Microsoft Research has maintained a tradition of curiosity and exploration. I am confident that the AI4Science team, which spans geographies and scientific fields, will make an extraordinary contribution to this tradition.”
–Kevin Scott, Executive Vice President and Chief Technology Officer, Microsoft
Today, I’m excited to announce that I’m leading a new global team at Microsoft Research, with members from across the UK, China, the Netherlands, and more, focused on bringing the Fifth Paradigm to life. Our AI4Science team consists of world-class experts in machine learning, computational physics, computational chemistry, molecular biology, software engineering and other disciplines who work together to solve some of the most pressing challenges in the field.
Take the Graphormer model as an example. Its development is led by my colleague Dr. Tie-Yan Liu, a distinguished scientist at Microsoft who heads our China team. Graphormer is a general-purpose molecular modeling model with powerful molecular representation capabilities, which will be of great help in new material design and drug discovery. Recently, Graphormer won the Open Catalyst Challenge, a molecular dynamics competition that aims to model catalyst-adsorbate reaction systems with AI and whose dataset contains more than 660,000 catalyst-adsorbate relaxation systems (144 million structure-energy frames) simulated with density functional theory (DFT) software.
Another project is Generative Chemistry, a collaboration between our Cambridge team and Novartis, in which we use AI to empower scientists to accelerate the discovery and development of breakthrough drugs. As Iya Khalil, Global Head of Novartis’ AI Innovation Lab, recently noted, this work is no longer science fiction but scientific reality:
“Not only can AI learn from our past experiments, but with each new iteration of design and testing in the lab, machine learning algorithms can identify new patterns and guide the early drug discovery and development process. We hope that in this way we can augment the expertise of human scientists to design better molecules faster.”
Using this platform, the team has generated several very promising early-stage molecules that have been synthesized for further exploration.
In addition to the teams in China and the United Kingdom, our team in the Netherlands is also growing and includes Max Welling, a world-renowned machine learning expert. Today, I am equally pleased to announce that our brand-new laboratory in Amsterdam will be housed in Matrix One, which is currently under construction at Amsterdam Science Park. This purpose-built space is in close proximity to the University of Amsterdam and Vrije Universiteit Amsterdam, with which we will work closely through programmes such as joint doctoral training.
Matrix One, Amsterdam Science Park
It is with pride and excitement that we have come together as a cross-regional team to follow in the footsteps of pioneers and contribute to the next paradigm of scientific discovery, and in the process to have a positive impact on many important societal challenges. If you share our passion and ambition and want to join our team, you are welcome to check out our open positions or get in touch with our team members.