Move a Mouse, Create a Video: New Technology from the Technion

A technology developed at the Technion gives everyday users intuitive tools for creating realistic video clips, without requiring enormous computing resources

A groundbreaking technology developed at the Technion enables ordinary users to create realistic video clips intuitively, without the need for massive computing resources. Called Time to Move (TTM), it offers unprecedented control over the movement of objects and characters in AI-generated videos using nothing more than mouse movements, eliminating the need for complex and expensive infrastructure or training on millions of videos.

Dr. Or Litany of the Henry and Marilyn Taub Faculty of Computer Science, who led the research together with Faculty colleague Prof. Ron Kimmel and students Asaf Singer, Noam Rotstein, and Amir Mann, presented the work at the 2026 International Conference on Learning Representations (ICLR), held in Brazil last month. ICLR is one of the world’s leading conferences in deep learning and AI.

Dr. Or Litany

“Our development,” explains Dr. Litany, “solves one of the main limitations of AI-based video generation: the difficulty of precisely controlling the movement of objects and characters over time. TTM does not require retraining and can be integrated as a plug-in into existing video models. Unlike previous approaches, which require model-specific adaptation and substantial computing resources, this technology operates with no additional computational cost. In doing so, it helps democratize AI video creation by expanding access beyond giant companies such as Google and Meta.”

Demonstration of the new technology compared with existing technologies. In each image pair, the left side shows current capabilities, while the right side demonstrates the capabilities of TTM.

The key innovation behind the new technology is dual-clock denoising, a highly efficient method for refining motion that balances fidelity to the user’s intent with natural, realistic movement. Experiments conducted by Dr. Litany demonstrate that TTM matches or surpasses training-based methods in motion accuracy and realism. The technology also introduces capabilities that prior trained methods cannot offer, such as editing object appearance and adding new objects to a scene. This marks a significant step toward intuitive, creative, and controllable tools for generative video.

Dr. Litany joined the Taub Faculty of Computer Science at the Technion as a senior lecturer in 2023 after being selected as both an Azrieli Faculty Fellow and a Taub Fellow, two prestigious distinctions awarded to outstanding early-career researchers. He came to the Technion with an extensive academic and industry track record, including two postdoctoral fellowships at leading AI centers (Stanford University and FAIR at Meta), major awards, and developments considered foundational contributions to the field of computer vision.


On the left is the user’s input to the video; on the right is the output. Here is a comparison:

https://time-to-move.github.io/UserObjectControl_Comparison/Hamburger_concat.mp4

On the left is the user input again, followed by the researchers’ method implemented on top of a strong video model (WAN) and a weaker one (CogVideoX); on the right is the baseline, a trained method that the researchers compare against:

https://time-to-move.github.io/UserObjectControl/splash_knight_concat_960x320_optimized.mp4

Demonstration of the system’s performance:
https://time-to-move.github.io/assets/teaser_video_5_v13_720p_trimmed_lightweight.mp4