NUWA-Infinity, the infinite visual generation model that lets visual art creation extend freely

Original link: https://www.msra.cn/zh-cn/news/features/nuwa-infinity

Editor’s note: Microsoft Research Asia previously proposed NUWA, a multimodal model that generates images or videos from textual, visual, or multimodal input and supports a variety of visual creation tasks, including text-to-image and text-to-video generation, image completion, and video prediction. Microsoft Research Asia has now published new research: NUWA-Infinity, an upgraded version of NUWA and an infinite visual generation model. It brings an “infinite flow” to visual art creation and can generate high-resolution images of arbitrary size or long-duration videos. Come and experience the infinite creativity of AI!


Perhaps you have wondered this too: what scenery lies beyond the frame of those world-famous paintings?

Let NUWA-Infinity take us to “see it”!

Discover the wider landscape beyond the frame of Van Gogh’s Starry Night:

Peek at the people beyond the edges of the 5.287-meter scroll “Along the River During the Qingming Festival”:

“Along the River During the Qingming Festival”

Part of a painting generated by NUWA-Infinity (resolution: 38912×2048) after learning from “Along the River During the Qingming Festival”

NUWA-Infinity can also transform still images into ultra-high-definition video, bringing “life” to them.

Original still image

Video generated by NUWA-Infinity from the still image

NUWA-Infinity can also generate ultra-high-definition images from text, giving artistic creation more room for imagination.

Want to explore more? Visit the NUWA-Infinity demo page (linked at the end of this article) to experience its open-ended creative capabilities for yourself.

Why did Microsoft Research Asia develop NUWA-Infinity, and what new technologies were used behind it?

As the consumption-based attention economy gradually shifts to a production-based creative economy, more and more people have become everyday creators, using a variety of photo and video editing tools to create new artworks or rework existing ones. However, creating high-quality visual art is never easy: it often requires specialized skills and equipment, and it takes a lot of time. At the same time, everyday visual art creation increasingly demands higher-resolution images and longer videos.

To this end, the NUWA team at Microsoft Research Asia developed the infinite visual generation model NUWA-Infinity. Compared with NUWA, which also covers image and video creation, NUWA-Infinity performs better at generating high-resolution and variable-size visual content. It supports five high-resolution visual generation tasks: unconditional image generation, text to high-resolution image, text to high-resolution video, image to high-resolution animation, and image to high-resolution image.

In the NUWA-Infinity model, the researchers proposed an autoregressive-over-autoregressive generation mechanism, in which a global autoregressive model nests a local one. The global autoregression models the dependencies between visual patches, and the local autoregression models the dependencies between the visual tokens within each patch. This allows NUWA-Infinity to generate high-quality images and videos that are globally consistent and locally detailed. The researchers also proposed an Arbitrary Direction Controller (ADC), which decides a suitable generation order and learns order-aware position embeddings. Compared with other multimodal generation models, NUWA-Infinity can generate ultra-high-resolution images of arbitrary shape and size, conditioned on a given text, image, or video, to suit different devices, platforms, and scenarios. More importantly, NUWA-Infinity also supports the generation of long-duration videos, such as image animations.
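The nested mechanism can be pictured as two loops, one over patches and one over the tokens inside each patch. The sketch below is a toy illustration only, not the authors' implementation: `patch_order`, `sample_token`, and `generate` are hypothetical names, the "model" is a random stand-in, and the raster-scan order is just one simple order that a controller such as ADC might choose.

```python
import random
from typing import Dict, List, Tuple

Patch = Tuple[int, int]  # (row, col) position of a patch in the patch grid


def patch_order(rows: int, cols: int) -> List[Patch]:
    """Hypothetical stand-in for a generation-order controller: raster scan."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def sample_token(context: List[int], rng: random.Random, vocab_size: int) -> int:
    """Toy 'model': samples uniformly; a real model would condition on `context`."""
    return rng.randrange(vocab_size)


def generate(rows: int, cols: int, tokens_per_patch: int,
             vocab_size: int = 16, seed: int = 0) -> Dict[Patch, List[int]]:
    rng = random.Random(seed)
    canvas: Dict[Patch, List[int]] = {}
    # Global autoregression: patches are produced one after another.
    for patch in patch_order(rows, cols):
        # Context for this patch: tokens of every patch generated so far.
        context = [tok for done in canvas.values() for tok in done]
        tokens: List[int] = []
        # Local autoregression: tokens inside the patch are produced one by one,
        # each seeing the context plus the tokens already sampled in this patch.
        for _ in range(tokens_per_patch):
            tokens.append(sample_token(context + tokens, rng, vocab_size))
        canvas[patch] = tokens
    return canvas
```

Calling `generate(2, 3, 4)` fills a 2×3 grid of patches with four toy tokens each. Because the outer loop only needs an ordered list of patch positions, the grid can in principle be extended in any direction by changing what `patch_order` returns.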

In addition, the NUWA-Infinity model introduces a Nearby Context Pool (NCP), which caches nearby patches that have already been generated and uses them as context for the patch currently being generated. This significantly reduces computational cost. NUWA-Infinity thus addresses two shortcomings of existing techniques: they only support generating visual content of limited size, and visual content creation is computationally expensive.
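As a rough picture of the caching idea, the sketch below keeps only generated patches that can still serve as nearby context. It is a minimal sketch under stated assumptions, not the paper's NCP: the class and method names, the `radius` parameter, the Chebyshev-distance neighbourhood, and the row-based eviction rule (which assumes raster-order generation) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Patch = Tuple[int, int]  # (row, col) position of a patch in the patch grid


class NearbyContextPool:
    """Toy cache of already generated patches, keyed by (row, col)."""

    def __init__(self, radius: int = 1) -> None:
        self.radius = radius  # assumed neighbourhood size, measured in patches
        self._pool: Dict[Patch, List[int]] = {}

    def __len__(self) -> int:
        return len(self._pool)

    def add(self, patch: Patch, tokens: List[int]) -> None:
        """Cache the tokens of a freshly generated patch."""
        self._pool[patch] = tokens

    def select(self, patch: Patch) -> Dict[Patch, List[int]]:
        """Return cached patches within `radius` of `patch` (Chebyshev distance)."""
        r, c = patch
        return {p: t for p, t in self._pool.items()
                if max(abs(p[0] - r), abs(p[1] - c)) <= self.radius}

    def evict_before_row(self, row: int) -> None:
        """Drop patches more than `radius` rows above `row`. Under raster-order
        generation they can no longer be neighbours of any future patch."""
        for p in [p for p in self._pool if p[0] < row - self.radius]:
            del self._pool[p]


pool = NearbyContextPool(radius=1)
for r in range(4):
    pool.evict_before_row(r)           # forget rows that are now out of reach
    for c in range(4):
        context = pool.select((r, c))  # nearby context for the current patch
        pool.add((r, c), [r, c])       # placeholder tokens for illustration
```

The point of the design is that the context handed to the model stays bounded by the neighbourhood size rather than growing with the full canvas, which is where the cost saving would come from.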

Next, the NUWA team will continue to develop NUWA and hopes to build technologies that empower both professional and everyday art creators in three respects:

  • Conception: Lower the threshold for conception through automatic, fast, and diverse design generation, giving art creators more information and inspiration at the conception stage.
  • Aesthetics: Lower the threshold for creation and help ordinary users produce creative works of appropriate aesthetic and design quality (the NUWA model learns from a large number of high-quality images with high aesthetic standards).
  • Efficiency: Improve creative efficiency and reduce creative workload by integrating NUWA’s capabilities into a set of intelligent tools.

In the future, AI-generated high-resolution visual content will better meet the content creation needs of image design, advertising, animation, games, and other industries, providing creators with a steady stream of inspiration. We welcome more researchers and developers to explore the broad future of AI visual creation together with Microsoft Research Asia.

NUWA is a research-level project and a cutting-edge exploration of the automatic generation of visual artworks. It aims to provide visual art creators with more intelligent tools and to help them make better use of their creativity. Microsoft remains committed to fighting disinformation and does everything it can to provide the latest technology to detect manipulated content and help people identify “deepfake” information (for more information on Microsoft’s efforts to combat disinformation, please visit: https://ift.tt/QwKgxIc). At the same time, Microsoft’s technological progress is guided by its responsible AI process and follows the principles of fairness, inclusiveness, reliability and safety, transparency, privacy and security, and accountability.

Paper link:

NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis

https://ift.tt/1W3mLjt

Demo page: https://ift.tt/2dBR1Or

NUWA-Infinity project page: https://ift.tt/jlv5Pk2

