A real-time action prediction framework by encoding temporal evolution

Fahimeh Rezazadegan, Sareh Shirazi, Larry S. Davis

Research output: Contribution to journal › Conference paper › peer-review


Anticipating future actions is a key component of intelligence, particularly in real-time systems such as robots or autonomous cars. While recent work has addressed the prediction of raw RGB pixel values in future video frames, we focus on predicting further into the future by predicting a summary of the moving pixels across a sequence of frames, which we call a dynamic image. More precisely, given a dynamic image, we predict the motion evolution through the next unseen video frames. Since this representation encodes a sequence of frames, we can predict one second further into the future than previous work in this field. We employ convolutional LSTMs to train our network on the dynamic images in an unsupervised learning process. Since our final goal is predicting the next action of a complex task, such as an assembly task, we exploit labelled actions for the recognition process on top of the predicted dynamic images. We show the effectiveness of our method at predicting the next human action in the above-mentioned task through a two-step process: predicting the next dynamic image and recognizing the action it represents.
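
To make the notion of a dynamic image concrete, the following is a minimal sketch of one common way to collapse a short clip into a single image: a weighted sum of frames with linear weights, as in approximate rank pooling. The abstract does not state how the dynamic images are computed, so the weighting scheme and the names `rank_pooling_weights` and `dynamic_image` are assumptions made purely for illustration, not the authors' implementation.

```python
from typing import List

# A grayscale frame as rows of pixel intensities (hypothetical toy representation).
Frame = List[List[float]]


def rank_pooling_weights(num_frames: int) -> List[float]:
    """Linear weights alpha_t = 2t - T - 1 for t = 1..T (assumed scheme).

    Later frames get larger weights and the weights sum to zero,
    so static content cancels and only temporal change remains.
    """
    return [2.0 * t - num_frames - 1.0 for t in range(1, num_frames + 1)]


def dynamic_image(frames: List[Frame]) -> Frame:
    """Summarize a sequence of equally sized frames as one weighted-sum image."""
    if not frames:
        raise ValueError("need at least one frame")
    weights = rank_pooling_weights(len(frames))
    height, width = len(frames[0]), len(frames[0][0])
    out = [[0.0] * width for _ in range(height)]
    for weight, frame in zip(weights, frames):
        for y in range(height):
            for x in range(width):
                out[y][x] += weight * frame[y][x]
    return out


# A bright pixel moving left to right across a 1x3 strip.
moving = [[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]], [[0.0, 0.0, 1.0]]]
summary = dynamic_image(moving)
```

In this toy clip the earliest position receives a negative value and the latest a positive one, so the single output image records the direction of motion. Under the two-step process described in the abstract, a predictor would take such an image as input and output the next one, and a separate recognizer would label the predicted image.
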
Original language: English
Number of pages: 9
Journal: arXiv.org e-Print archive
Publication status: E-pub ahead of print - 2017
Externally published: Yes

