Searching the big data: Practices and experiences in efficiently querying knowledge bases

Wei Emma Zhang, Quan Z. Sheng*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

1 Citation (Scopus)

Abstract

Knowledge bases (KBs) are computer systems that store complex structured and unstructured facts, i.e., knowledge. KBs are described as open, shared databases of the world's knowledge and typically use the entity-relational model. Most existing knowledge bases publish their data in the RDF format. Tools for querying, inferencing, and reasoning over facts have been developed to consume this knowledge. In this chapter, we introduce a client-side caching framework that aims to accelerate overall query response speed. In particular, we improve a suboptimal graph edit distance function to estimate the similarity of SPARQL queries and develop an approach for transforming SPARQL queries into feature vectors. Machine learning algorithms use these feature vectors to identify similar queries that could potentially be the subsequent queries. We adapt multiple dimensionality reduction algorithms to reduce the identification time. We then prefetch and cache the results of these queries to improve overall querying performance. We also develop a forecasting method, namely Modified Simple Exponential Smoothing, to implement cache replacement. Our approach has been evaluated on a very large set of real-world queries. The empirical results show that our approach has great potential to enhance the cache hit rate and accelerate querying speed on SPARQL endpoints.

Original language: English
Title of host publication: Handbook of big data technologies
Editors: Albert Y. Zomaya, Sherif Sakr
Place of Publication: Cham, Switzerland
Publisher: Springer, Springer Nature
Chapter: 13
Pages: 429-453
Number of pages: 25
ISBN (Electronic): 9783319493404
ISBN (Print): 9783319493398
Publication status: Published - 25 Feb 2017
Externally published: Yes
