Runway’s latest AI video generator brings giant cotton candy monsters to life

Screenshot of a Runway Gen-3 Alpha video generated with the prompt “A giant humanoid, made of fluffy blue cotton candy, stomping the ground and roaring towards the sky, a clear blue sky behind them.”

On Sunday, Runway announced a new AI video synthesis model called Gen-3 Alpha. It is still under development, but it appears to create video of similar quality to OpenAI’s Sora, which debuted earlier this year (and has not yet been released). It can generate novel, high-definition video from text prompts, with subjects ranging from realistic humans to surreal monsters stomping across the countryside.

Unlike Runway’s previous best model from June 2023, which could only create two-second clips, Gen-3 Alpha can reportedly create 10-second video segments of people, places, and objects with a consistency and coherence that easily surpass Runway’s Gen-2. If 10 seconds seems short compared to Sora’s full minute of video, consider that the company is working on a shoestring compute budget compared to the more generously funded OpenAI, and that it actually has a track record of shipping video generation capabilities to commercial users.

Gen-3 Alpha does not generate audio to accompany its video clips, and it is highly likely that temporally coherent generations (those that keep a character consistent over time) depend on similar high-quality training material. But Runway’s improvement in visual fidelity over the past year is hard to ignore.

AI video is heating up

It’s been a busy few weeks for AI video synthesis in the AI research community, including the launch of the Chinese model Kling, created by Beijing-based Kuaishou Technology (sometimes called “Kwai”). Kling can generate two minutes of 1080p HD video at 30 frames per second with a level of detail and coherence that reportedly matches Sora’s.

Gen-3 Alpha Prompt: “A woman’s subtle reflections on the window of a high-speed train in a Japanese city.”

Shortly after Kling’s debut, people on social media began creating surreal AI videos using Luma AI’s Luma Dream Machine. These videos were novel and strange but generally lacked coherence; we tested Dream Machine and weren’t impressed with anything we saw.

Meanwhile, one of the pioneers of text-to-video generation, New York-based Runway, founded in 2018, recently found itself the target of memes showing its Gen-2 technology falling out of favor compared to newer video synthesis models. This may have motivated the announcement of Gen-3 Alpha.

Gen-3 Alpha Prompt: “An astronaut running down an alley in Rio de Janeiro.”

Generating realistic humans has always been tricky for video synthesis models, so Runway specifically shows off Gen-3 Alpha’s ability to create what its developers call “expressive” human characters with a range of actions, gestures, and emotions. The examples the company provided weren’t particularly expressive (mostly people just stared and blinked slowly), but they did look realistic.

The human examples provided include generated videos of a woman on a train, an astronaut running down a street, a man with his face lit by the glow of a television, a woman driving a car, and a woman running, among others.

Gen-3 Alpha Prompt: “A close-up of a young woman driving a car, looking at a blurred, reflected green forest visible through the rainy car window.”

The generated demo videos also include more surreal examples, including a giant creature walking through a dilapidated city, a man made of rocks walking through a forest, and the giant cotton candy monster seen below, which is probably the best video on the entire page.

Gen-3 Alpha Prompt: “A giant humanoid, made of fluffy blue cotton candy, stomping the ground and roaring towards the sky, a clear blue sky behind them.”

Gen-3 will power various Runway AI editing tools (one of the company’s most notable claims to fame), including Multi Motion Brush, Advanced Camera Controls, and Director Mode. It can create videos from text or image prompts.

Runway says Gen-3 Alpha is the first in a series of models trained on new infrastructure designed for large-scale multimodal training, taking a step toward developing what it calls “General World Models,” which are hypothetical AI systems that build internal representations of environments and use them to simulate future events within those environments.

News Source : arstechnica.com