IFML Seminar
Generating a Video: Reflecting on a Two-Year Odyssey
Atlas Wang, Associate Professor, The University of Texas at Austin
The University of Texas at Austin
Gates Dell Complex (GDC 6.302)
United States
Abstract: In this talk, I will recount the developmental trajectory of video generation models at Picsart AI Research over the past two years, a journey that has taken us from initial baselines to the frontiers of ultra-long video streaming and storytelling. Our inaugural project, Text2Video-Zero, presented at ICCV 2023, marked a milestone as the first training-free video generator to leverage pre-trained Stable Diffusion models, serving as a versatile foundation for subsequent works and earning widespread acclaim. Building on this success, our team ventured into creating the first open-source video generator capable of producing ultra-long sequences. Our new model, StreamingT2V, reliably generates up to 1200 frames (a video duration of 2 minutes), with potential for scaling to even longer durations. To conclude the talk, I will share personal insights and reflections gleaned from this intensive R&D period, while highlighting the untapped possibilities for future video generation models.
Speaker Bio: Atlas Wang is an associate professor at UT Austin, affiliated with ECE (primary), CS (GSC), and Oden Institute. He leads the VITA research group (https://vita-group.github.io/