Deep learning to frame objects for visual target tracking

Shuchao Pang, Juan José del Coz, Zhezhou Yu, Oscar Luaces, Jorge Díez

Research output: Contribution to journal › Article › peer-review

29 Citations (Scopus)


We present a new approach to visual target tracking. The method uses a convolutional neural network that ranks a set of patches according to how well the target is framed (centered). To cope with possible interferences, we feed the network with patches of different sizes located around the object detected in the previous frame, thus accounting for possible changes of scale. To train the network, we created a large ad hoc dataset of positive and negative examples of framed objects extracted from the ImageNet detection database. The positive examples contain the object correctly framed, while the negative examples are incorrectly framed. Finally, we select the most promising patch using a matching function based on the deep features provided by the well-known AlexNet network. The entire training stage is performed offline, so the method is fast and suitable for real-time visual tracking. Experimental results show that the method is very competitive with state-of-the-art algorithms and is also very robust against the typical interferences that arise during visual target tracking.
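As a rough illustration of the candidate-generation and selection steps described in the abstract, the following minimal sketch may help. It is not the authors' implementation: the shift and scale grids, the (cx, cy, w, h) box format, the function names, and the use of cosine similarity as the matching function are all assumptions made for illustration. The feature vectors stand in for deep features that would, in the paper, come from AlexNet.

```python
import math
from itertools import product


def candidate_patches(prev_box, shifts=(-0.25, 0.0, 0.25), scales=(0.9, 1.0, 1.1)):
    """Return (cx, cy, w, h) boxes around prev_box at several offsets and sizes.

    Shifts are fractions of the previous width/height; scales multiply both
    sides. The default grids are hypothetical, not the paper's settings.
    """
    cx, cy, w, h = prev_box
    return [
        (cx + dx * w, cy + dy * h, w * s, h * s)
        for dx, dy, s in product(shifts, shifts, scales)
    ]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_patch(patches, features, target_feature):
    """Pick the patch whose feature vector best matches the target's.

    Cosine similarity is an assumed stand-in for the paper's matching function.
    """
    scores = [cosine_similarity(f, target_feature) for f in features]
    best = max(range(len(patches)), key=scores.__getitem__)
    return patches[best], scores[best]
```

In a full tracker, the patches returned by `candidate_patches` would first be ranked by the framing network, and `select_patch` would then be applied to the feature vectors of the top-ranked candidates.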

Original language: English
Pages (from-to): 406-420
Number of pages: 15
Journal: Engineering Applications of Artificial Intelligence
Publication status: Published - Oct 2017
Externally published: Yes


  • deep convolutional networks
  • deep learning
  • target tracking visualization

