Projects per year
Estimating the number of nodes in a large-scale online social network is a fundamental problem in the field of network science. However, it is observed that the performance of traditional RW (Random Walk)-based network size estimators is not stable enough. The relative error of estimated network size can be extremely high for social networks with high skewness of degree distribution, even when the sampling rate is over 50%. In this paper, we propose a robust network size estimation algorithm called Graser, which significantly improves the robustness of size estimation on different types of online social networks. To understand the problem, we first conduct an in-depth study to examine influential factors that lead to inaccuracy of traditional RW-based estimators. Our study identifies a few key factors (e.g., degree distribution) closely related to the inaccuracy of size estimation. Next, we design an efficient network size estimator called Graser, which can iteratively reduce the estimation error by subgraph sampling. We also perform theoretical analysis on the performance of Graser, and prove that our algorithm can guarantee to achieve a relative error that is much smaller than that of traditional RW-based estimators. Finally, we conduct extensive experiments using real datasets to evaluate the performance of our proposed algorithm and the results show that Graser performs much better than traditional RW-based estimators on different types of social networks, and can reduce the relative error of size estimation significantly even with an extremely low sampling rate.
|Number of pages||12|
|Journal||IEEE Transactions on Network Science and Engineering|
|Publication status||Published - Oct 2020|
- Size estimation, online social networks, subgraph sampling