TY - JOUR
T1 - Stacked convolutional denoising auto-encoders for feature representation
AU - Du, Bo
AU - Xiong, Wei
AU - Wu, Jia
AU - Zhang, Lefei
AU - Zhang, Liangpei
AU - Tao, Dacheng
PY - 2017/4/1
Y1 - 2017/4/1
N2 - Deep networks have achieved excellent performance in learning representations from visual data. However, supervised deep models such as convolutional neural networks require large quantities of labeled data, which are very expensive to obtain. To solve this problem, this paper proposes an unsupervised deep network, called the stacked convolutional denoising auto-encoders, which can map images to hierarchical representations without any label information. The network, optimized by layer-wise training, is constructed by stacking layers of denoising auto-encoders in a convolutional way. In each layer, high-dimensional feature maps are generated by convolving features of the lower layer with kernels learned by a denoising auto-encoder. The auto-encoder is trained on patches extracted from feature maps in the lower layer to learn robust feature detectors. To better train the large network, a layer-wise whitening technique is introduced into the model. Before each convolutional layer, a whitening layer is embedded to sphere the input data. Through successive layers of mapping, raw images are transformed into high-level feature representations that boost the performance of the subsequent support vector machine classifier. The proposed algorithm is evaluated through extensive experiments and demonstrates classification performance superior to that of state-of-the-art unsupervised networks.
AB - Deep networks have achieved excellent performance in learning representations from visual data. However, supervised deep models such as convolutional neural networks require large quantities of labeled data, which are very expensive to obtain. To solve this problem, this paper proposes an unsupervised deep network, called the stacked convolutional denoising auto-encoders, which can map images to hierarchical representations without any label information. The network, optimized by layer-wise training, is constructed by stacking layers of denoising auto-encoders in a convolutional way. In each layer, high-dimensional feature maps are generated by convolving features of the lower layer with kernels learned by a denoising auto-encoder. The auto-encoder is trained on patches extracted from feature maps in the lower layer to learn robust feature detectors. To better train the large network, a layer-wise whitening technique is introduced into the model. Before each convolutional layer, a whitening layer is embedded to sphere the input data. Through successive layers of mapping, raw images are transformed into high-level feature representations that boost the performance of the subsequent support vector machine classifier. The proposed algorithm is evaluated through extensive experiments and demonstrates classification performance superior to that of state-of-the-art unsupervised networks.
KW - Convolution
KW - Deep learning
KW - Denoising auto-encoders
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=84960969688&partnerID=8YFLogxK
U2 - 10.1109/TCYB.2016.2536638
DO - 10.1109/TCYB.2016.2536638
M3 - Article
C2 - 26992191
AN - SCOPUS:84960969688
SN - 2168-2267
VL - 47
SP - 1017
EP - 1027
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 4
ER -