TY - GEN
T1 - Enhancing human action recognition with region proposals
AU - Rezazadegan, Fahimeh
AU - Shirazi, Sareh
AU - Sünderhauf, Niko
AU - Milford, Michael
AU - Upcroft, Ben
PY - 2015
Y1 - 2015
N2 - Deep convolutional network models have dominated recent work in human action recognition as well as image classification. However, these methods are often unduly influenced by the image background, learning and exploiting the presence of cues in typical computer vision datasets. For unbiased robotics applications, the degree of variation and novelty in action backgrounds is far greater than in computer vision datasets. To address this challenge, we propose an "action region proposal" method that, informed by optical flow, extracts image regions likely to contain actions for input into the network both during training and testing. In a range of experiments, we demonstrate that manually segmenting the background is not enough; rather, through active action region proposals during training and testing, state-of-the-art or better performance can be achieved on individual spatial and temporal video components. Finally, we show that by focusing attention through action region proposals, we can further improve upon the existing state-of-the-art in spatio-temporally fused action recognition performance.
UR - http://www.scopus.com/inward/record.url?scp=85023764888&partnerID=8YFLogxK
UR - http://www.araa.asn.au/category/acra-2015/
M3 - Conference proceeding contribution
AN - SCOPUS:85023764888
T3 - Australasian Conference on Robotics and Automation, ACRA
BT - ACRA 2015
PB - Australian Robotics and Automation Association
T2 - 2015 Australasian Conference on Robotics and Automation, ACRA 2015
Y2 - 2 December 2015 through 4 December 2015
ER -