MSRA – FENQ

USB: The first semi-supervised classification learning benchmark that unifies vision, language, and audio classification tasks

Original link: https://www.msra.cn/zh-cn/news/features/usb Editor’s note: Semi-supervised learning is currently developing rapidly. However, existing semi-supervised learning benchmarks are mostly limited to computer vision classification tasks, which precludes consistent and diverse evaluation on classification tasks in other areas such as natural language processing and audio processing. In addition, most semi-supervised papers are published by large institutions, […]

A gathering of big names! What is Microsoft’s annual research event focused on?

Original link: https://www.msra.cn/zh-cn/news/features/microsoft-research-summit-plenaries Microsoft Research Summit 2022 will be held online from October 18 to 20. Each day of the three-day conference opens with a keynote speech and in-depth discussions on topics including the potential impact of deep learning on scientific discovery; how technology can be used to make medical care more accurate and inclusive; how

Is it possible to edit speech like text?

Original link: https://www.msra.cn/zh-cn/news/features/text-based-speech-editing Editor’s note: Videos published on social networking platforms are popular because they are easy to shoot, can be shared in real time, and invite interaction. Video has profoundly changed the way people observe the world, record their lives, and express their emotions. However, much of the video and audio editing software on the

General-Purpose Multimodal Foundation Model BEiT-3: Leading Text, Image, and Multimodal Pre-training Toward “Unification”

Original link: https://www.msra.cn/zh-cn/news/features/beit-3 Editor’s note: In recent years, research on foundation models (also known as pre-trained models) has been gradually converging at the technical level: the foundation models of different fields of artificial intelligence (such as natural language processing, computer vision, speech processing, and multimodality) are technically dependent on three

Deng Pan’s “Greedy” Algorithm: What is it like to move from biology to computer science?

Original link: https://www.msra.cn/zh-cn/news/features/ada-workshop-pan-deng Editor’s note: The road of scientific research is not lined with flowers; more often it means exploring the unknown on a path with no footprints. What should the path of scientific research look like? How can one seize the opportunity to change direction? In the speech titled “The ‘Greedy’ Algorithm of Life,” Deng Pan, a researcher

How to efficiently and accurately perform image search? Take a look at Lightweight Vision Pretrained Models

Original link: https://www.msra.cn/zh-cn/news/features/lightweight-vision-pre-training Editor’s note: Have you ever had trouble with image retrieval? Perhaps it is hard to find the desired image accurately in a massive collection, or text-based retrieval returns unsatisfactory results. To address this problem, researchers from Microsoft Research Asia and the Microsoft Cloud Computing and Artificial Intelligence Division have conducted in-depth research on

LayoutLMv3, a multimodal pre-trained model for document intelligence: combining versatility and superior performance

Original link: https://www.msra.cn/zh-cn/news/features/layoutlmv3 Editor’s note: In the digital transformation of enterprises, structured analysis and content extraction from multimodal sources such as documents and images is a key step. It allows information in contracts, bills, and reports to be processed quickly, automatically, and accurately, which is crucial to improving the productivity of modern enterprises.

The infinite visual generative model NUWA-Infinity allows the free extension of visual art creation

Original link: https://www.msra.cn/zh-cn/news/features/nuwa-infinity Editor’s note: Microsoft Research Asia previously proposed the multimodal model NUWA, which can generate images or videos from a given textual, visual, or multimodal input, and supports a variety of visual creation tasks, including text-to-image or text-to-video generation, image completion, and video prediction. Recently, Microsoft Research Asia has publicly published

OSDI 2022 | Come and take a look! The latest papers in the field of computer systems from Microsoft Research Asia

Original link: https://www.msra.cn/zh-cn/news/features/osdi-2022 Editor’s note: OSDI (Operating Systems Design and Implementation) is one of the top academic conferences in the field of computer systems, bringing together the forward-looking thinking of computer scientists around the world. The 16th OSDI will be held from July 11th to 13th, 2022. A total of 253 papers were submitted for

AI4Science empowers the fifth paradigm of scientific discovery

Original link: https://www.msra.cn/zh-cn/news/features/ai4science Author: Chris Bishop, Microsoft Technical Fellow, Director of the Center for Scientific Intelligence at Microsoft Research. In the next decade, deep learning is destined to have a transformative impact on the natural sciences. The results have potentially far-reaching implications and could greatly improve our ability to model and predict natural phenomena on vastly

Poor-quality video becomes clear in seconds: the “Da Vinci” toolset does it for you automatically

Original link: https://www.msra.cn/zh-cn/news/features/davinci Editor’s note: Do you often dig up old movies and animations to recall the old days? Do you also have some precious videos that let you relive the good old days? However, we have become accustomed to the high-definition experience. Looking back at these old images, the picture quality may be

CVPR 2022 | One click to unlock the cutting-edge progress in computer vision from Microsoft Research Asia!

Original link: https://www.msra.cn/zh-cn/news/features/cvpr-2022 Editor’s note: The Conference on Computer Vision and Pattern Recognition (CVPR) is one of the most academically influential conferences in the field of artificial intelligence. Microsoft Research Asia held its CVPR 2022 paper sharing session in April. Today, we have selected 8 excellent papers from Microsoft Research Asia

Speech synthesized by the NaturalSpeech model reaches human level for the first time in CMOS tests

Original link: https://www.msra.cn/zh-cn/news/features/naturalspeech Editor’s note: AI-synthesized speech is now commonplace, but to listeners it still does not feel as natural as hearing a real person talk or read aloud. However, NaturalSpeech, a new end-to-end speech synthesis model jointly launched by Microsoft Research Asia and the Microsoft Azure Speech team, has reached the level of human speech
