OpenAI Sora: The Largest Model, Sora, Can Generate a Minute of High-Fidelity Video

米言看科技 2024-04-18 18:05:33
OpenAI Sora is a revolutionary text-to-video AI model and a significant advance for artificial intelligence and its real-world applications. It can generate videos up to one minute long from text prompts while maintaining exceptional visual quality. Sora uses a diffusion model that evolves a video from static noise into a coherent visual narrative, setting a new standard in AI technology.

OpenAI also published research on video generation models as world simulators. The work explores large-scale training of generative models on video data. Specifically, OpenAI trains text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios, using a transformer architecture that operates on spacetime patches of video and image latent codes. The largest model, Sora, can generate a minute of high-fidelity video. The results suggest that scaling video generation models is a promising path toward building general-purpose simulators of the physical world.

Hyperrealistic video can be used to generate highly useful AI training data. This is in line with training compute scaling by 100 times every year. By 2025, this kind of OpenAI video generation could scale to many hours of video. By 2026, weeks of video could be generated every hour. Training data could then be generated at many multiples of real time.
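To make the idea of "spacetime patches of video and image latent codes" more concrete, here is a minimal NumPy sketch of how a latent video could be cut into patch tokens for a transformer. Everything specific in it is an assumption for illustration: the function name `spacetime_patches`, the `(T, H, W, C)` latent layout, and the patch sizes `pt`, `ph`, `pw` are hypothetical, and the article does not state what Sora actually uses.

```python
import numpy as np

def spacetime_patches(latent, pt, ph, pw):
    """Split a latent video of shape (T, H, W, C) into flattened spacetime patches.

    Hypothetical helper: returns an array of shape
    (num_patches, pt * ph * pw * C), one row per patch token.
    """
    T, H, W, C = latent.shape
    if T % pt or H % ph or W % pw:
        raise ValueError("latent dimensions must be divisible by the patch size")
    # Split each of time, height and width into (number of blocks, block size).
    x = latent.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Move the three block indices to the front and the block contents to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten: one row per spacetime patch.
    return x.reshape(-1, pt * ph * pw * C)

# Toy latent: 8 frames of 16x16 with 4 channels (assumed sizes).
latent = np.arange(8 * 16 * 16 * 4, dtype=np.float32).reshape(8, 16, 16, 4)
tokens = spacetime_patches(latent, pt=2, ph=4, pw=4)
print(tokens.shape)  # (64, 128)
```

Because the number of tokens simply follows from the latent's duration and resolution, the same routine handles clips of different lengths and aspect ratios, which is one plausible reading of how a single model can be trained on variable-sized videos and images.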
