Robust size estimation of online social networks via subgraph sampling

Yangfan Jiang, Yao Fu, Yipeng Zhou, Di Wu*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Estimating the number of nodes in a large-scale online social network is a fundamental problem in the field of network science. However, it is observed that the performance of traditional RW (Random Walk)-based network size estimators is not stable enough. The relative error of estimated network size can be extremely high for social networks with high skewness of degree distribution, even when the sampling rate is over 50%. In this paper, we propose a robust network size estimation algorithm called Graser, which significantly improves the robustness of size estimation on different types of online social networks. To understand the problem, we first conduct an in-depth study to examine influential factors that lead to inaccuracy of traditional RW-based estimators. Our study identifies a few key factors (e.g., degree distribution) closely related to the inaccuracy of size estimation. Next, we design an efficient network size estimator called Graser, which can iteratively reduce the estimation error by subgraph sampling. We also perform theoretical analysis on the performance of Graser, and prove that our algorithm can guarantee to achieve a relative error that is much smaller than that of traditional RW-based estimators. Finally, we conduct extensive experiments using real datasets to evaluate the performance of our proposed algorithm and the results show that Graser performs much better than traditional RW-based estimators on different types of social networks, and can reduce the relative error of size estimation significantly even with an extremely low sampling rate.

Original languageEnglish
Pages (from-to)2702-2713
Number of pages12
JournalIEEE Transactions on Network Science and Engineering
Volume7
Issue number4
DOIs
Publication statusPublished - Oct 2020

Keywords

  • Size estimation, online social networks, subgraph sampling

Fingerprint Dive into the research topics of 'Robust size estimation of online social networks via subgraph sampling'. Together they form a unique fingerprint.

Cite this