Artificial Intelligence · 10 months ago
Google’s VideoPoet Multimodal Model Creates Both Video and Audio
Google researchers introduced VideoPoet, a language model that accepts multimodal inputs (text, images, video, and audio) and generates video from them. VideoPoet employs a decoder-only...