Robust visual tracking via scale-and-state-awareness

Yuankai Qi, Lei Qin, Shengping Zhang, Qingming Huang*, Hongxun Yao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

27 Citations (Scopus)


Convolutional neural networks (CNNs) have been applied to visual tracking with demonstrated success in recent years. However, the performance of CNN-based trackers can be further improved, because the predicted upright bounding box cannot tightly enclose the target when it undergoes deformations or rotations. In addition, many existing CNN-based trackers fail to distinguish the occluded state of the target from non-occluded states, so samples collected during occlusions wrongly update the tracker and cause it to focus on other objects. To address these problems, we propose to adaptively combine level set segmentation and bounding box regression to obtain a tightly enclosing box, and we design a CNN to recognize whether the target is occluded. Extensive experimental results on a large benchmark dataset demonstrate the effectiveness of the proposed method compared to several state-of-the-art tracking algorithms.
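The occlusion-awareness idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `SampleBuffer` class, the `maybe_add` method, the 0.5 threshold, and the example frame values are all hypothetical, and the occlusion probability is assumed to come from some separate classifier. The sketch only shows the general principle of withholding occluded frames from the set of samples used to update a tracker online.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class SampleBuffer:
    """Hypothetical store of target boxes used for online tracker updates."""
    boxes: List[Box] = field(default_factory=list)

    def maybe_add(self, box: Box, occlusion_prob: float,
                  threshold: float = 0.5) -> bool:
        """Keep the box only if the target is judged non-occluded.

        The threshold is an assumed value, not one taken from the paper.
        """
        if occlusion_prob >= threshold:
            # Skip occluded frames so the occluder is not learned as the target.
            return False
        self.boxes.append(box)
        return True


buffer = SampleBuffer()
# Made-up per-frame (predicted box, occlusion probability) pairs.
frames = [
    ((10.0, 10.0, 40.0, 80.0), 0.1),
    ((12.0, 11.0, 40.0, 80.0), 0.9),  # treated as occluded, so skipped
    ((15.0, 12.0, 41.0, 79.0), 0.2),
]
kept = [buffer.maybe_add(box, prob) for box, prob in frames]
```

Here only the first and third frames are stored, so a later model update would not see the frame in which the target was judged occluded.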

Original language: English
Pages (from-to): 75-85
Number of pages: 11
Publication status: Published - 15 Feb 2019
Externally published: Yes


  • Visual tracking
  • Convolutional neural network
  • Bounding box refinement
  • Occlusion awareness

