Microsoft's New AI Can Make Photos Talk: Innovations and Risks Explored

Microsoft has unveiled an artificial intelligence model that animates photographs in sync with an audio track, producing striking yet potentially risky results.

VASA-1 (image source: © Microsoft)

Advances in machine learning have greatly expanded what artificial intelligence can do. For example, Microsoft's latest AI model can bring static images of people to life.

The model, named VASA-1, animates a human portrait so that the face appears to speak in sync with a sound recording. It turns ordinary photos into realistic animations of people talking or singing.

Turning a simple photo into a realistic animation

Microsoft ran its experiments on portraits of non-existent people generated with StyleGAN2 and DALL-E 3. The model works on realistic photos of people as well as cartoon avatars, and the experiments even included animating the famous Mona Lisa.

The VASA-1 model goes beyond synchronizing lip movements; it captures the richness of facial expressions and natural head movements, enhancing the realism of the animations.

The model generates animations at a resolution of 512 x 512 pixels and up to 45 frames per second in offline mode. In real time, it can produce video at up to 40 frames per second with a delay of just 170 ms on a desktop computer equipped with an NVIDIA GeForce RTX 4090 graphics card.
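To put these figures in perspective, the short Python sketch below converts the reported frame rates into a time budget per frame and expresses the 170 ms delay as a number of frames. The function names are illustrative only and are not part of any Microsoft tool; the calculation is a plain unit conversion and makes no claim about how VASA-1 works internally.

```python
# Back-of-the-envelope arithmetic on the figures reported for VASA-1.
# All names here are hypothetical helpers made up for this illustration.

def frame_interval_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_ms: float, fps: float) -> float:
    """Express a delay as the number of frame intervals it spans."""
    return latency_ms / frame_interval_ms(fps)


WIDTH, HEIGHT = 512, 512      # reported output resolution
OFFLINE_FPS = 45              # reported offline frame rate
REALTIME_FPS = 40             # reported real-time frame rate
LATENCY_MS = 170              # reported real-time delay

pixels_per_frame = WIDTH * HEIGHT
offline_budget = frame_interval_ms(OFFLINE_FPS)
realtime_budget = frame_interval_ms(REALTIME_FPS)
delay_frames = latency_in_frames(LATENCY_MS, REALTIME_FPS)

print(f"Pixels per frame: {pixels_per_frame}")
print(f"Offline budget per frame: {offline_budget:.1f} ms")
print(f"Real-time budget per frame: {realtime_budget:.1f} ms")
print(f"170 ms delay spans about {delay_frames:.1f} frames at 40 fps")
```

In other words, at 40 frames per second each frame has to be ready within 25 ms, and the 170 ms delay is equivalent to a little under seven frames.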

The potential risks of new technology

Although Microsoft's research primarily focuses on generating animations for virtual portraits rather than creating misleading content, the company acknowledges the potential misuse of this technology for impersonation.

Microsoft has publicly stated its opposition to using the VASA-1 model for deceptive purposes or for creating harmful content with the images of real people. Consequently, the company has decided not to release a demo, an API, or a finished product to the public. Microsoft says it remains interested in applying the technology to improve the detection of forged content.
