TY - JOUR
T1 - Link sign prediction by Variational Bayesian Probabilistic Matrix Factorization with Student-t Prior
AU - Wang, Yisen
AU - Liu, Fangbing
AU - Xia, Shu Tao
AU - Wu, Jia
PY - 2017/9/1
Y1 - 2017/9/1
N2 - In signed social networks, link sign prediction refers to using the observed link signs to infer the signs of the remaining links, which is important for mining and analyzing the evolution of social networks. The widely used matrix factorization-based approach, Bayesian Probabilistic Matrix Factorization (BMF), assumes that the noise between the real and predicted entries is Gaussian and that the prior of the latent features is a multivariate Gaussian distribution. However, the Gaussian noise model is sensitive to outliers and is not robust. The Gaussian prior model neglects the differences between latent features; that is, it does not distinguish between important and unimportant features. Thus, models based on Gaussian assumptions perform poorly on real-world (sparse) datasets. To address these issues, a novel Variational Bayesian Probabilistic Matrix Factorization model with a Student-t prior (TBMF) is proposed in this paper. A univariate Student-t distribution is used to fit the prediction noise, and a multivariate Student-t distribution is adopted for the prior of the latent features. Owing to the high kurtosis of the Student-t distribution, TBMF can select informative latent features automatically, characterize long-tail cases, and obtain reasonable representations on many real-world datasets. Experimental results show that TBMF improves the prediction performance significantly compared with state-of-the-art algorithms, especially when the observed links are few.
AB - In signed social networks, link sign prediction refers to using the observed link signs to infer the signs of the remaining links, which is important for mining and analyzing the evolution of social networks. The widely used matrix factorization-based approach, Bayesian Probabilistic Matrix Factorization (BMF), assumes that the noise between the real and predicted entries is Gaussian and that the prior of the latent features is a multivariate Gaussian distribution. However, the Gaussian noise model is sensitive to outliers and is not robust. The Gaussian prior model neglects the differences between latent features; that is, it does not distinguish between important and unimportant features. Thus, models based on Gaussian assumptions perform poorly on real-world (sparse) datasets. To address these issues, a novel Variational Bayesian Probabilistic Matrix Factorization model with a Student-t prior (TBMF) is proposed in this paper. A univariate Student-t distribution is used to fit the prediction noise, and a multivariate Student-t distribution is adopted for the prior of the latent features. Owing to the high kurtosis of the Student-t distribution, TBMF can select informative latent features automatically, characterize long-tail cases, and obtain reasonable representations on many real-world datasets. Experimental results show that TBMF improves the prediction performance significantly compared with state-of-the-art algorithms, especially when the observed links are few.
KW - Matrix factorization
KW - Signed networks
KW - Student-t distribution
UR - http://www.scopus.com/inward/record.url?scp=85017496935&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2017.04.014
DO - 10.1016/j.ins.2017.04.014
M3 - Article
AN - SCOPUS:85017496935
SN - 0020-0255
VL - 405
SP - 175
EP - 189
JO - Information Sciences
JF - Information Sciences
ER -