Abstract
Knowledge bases (KBs) are computer systems that store complex structured and unstructured facts, i.e., knowledge. KBs are often described as open, shared databases of the world's knowledge and typically use the entity-relational model. Most existing knowledge bases make their data available in the RDF format. Tools for querying, inferencing, and reasoning over these facts have been developed to consume the knowledge. In this chapter, we introduce a client-side caching framework that aims to accelerate the overall query response speed. In particular, we improve a suboptimal graph edit distance function to estimate the similarity of SPARQL queries and develop an approach to transform SPARQL queries into feature vectors. Machine learning algorithms then use these feature vectors to identify similar queries that could potentially be issued next. We adapt multiple dimensionality reduction algorithms to reduce the identification time. We then prefetch and cache the results of these queries to improve the overall querying performance. We also develop a forecasting method, namely Modified Simple Exponential Smoothing, to implement cache replacement. Our approach has been evaluated using a very large set of real-world queries. The empirical results show that our approach has great potential to enhance the cache hit rate and accelerate querying on SPARQL endpoints.
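To make the cache replacement idea concrete, the following is a minimal sketch of a cache whose eviction is driven by textbook simple exponential smoothing, s_t = alpha * x_t + (1 - alpha) * s_(t-1), where x_t is 1 if the entry is requested at step t and 0 otherwise. It is not the chapter's Modified Simple Exponential Smoothing, whose details the abstract does not give. The class name `SesCache`, its methods, and the parameters `capacity` and `alpha` are hypothetical names chosen for illustration.

```python
class SesCache:
    """Toy query-result cache that evicts the entry with the lowest
    exponentially smoothed access score (plain SES, an assumption;
    not the chapter's modified variant)."""

    def __init__(self, capacity, alpha=0.2):
        self.capacity = capacity
        self.alpha = alpha
        self.tick = 0        # one tick per lookup
        self.entries = {}    # key -> (value, score, tick of last update)

    def _score_now(self, key):
        # Decay lazily: every tick without a hit multiplies the score by (1 - alpha).
        _, score, last = self.entries[key]
        return score * (1 - self.alpha) ** (self.tick - last)

    def get(self, key):
        self.tick += 1
        if key not in self.entries:
            return None
        value = self.entries[key][0]
        # A hit is an observation x_t = 1, so it adds alpha to the decayed score.
        self.entries[key] = (value, self.alpha + self._score_now(key), self.tick)
        return value

    def put(self, key, value):
        if key in self.entries:
            _, score, last = self.entries[key]
            self.entries[key] = (value, score, last)
            return
        if len(self.entries) >= self.capacity:
            # Evict the entry forecast to be least likely to be requested again.
            victim = min(self.entries, key=self._score_now)
            del self.entries[victim]
        # A newly cached entry starts as a single observed request.
        self.entries[key] = (value, self.alpha, self.tick)
```

Under these assumptions, an entry that has been requested many times can outlive one that was requested once but more recently, which is where this policy departs from least-recently-used eviction. A smaller `alpha` gives more weight to long-term access history, a larger one to recent requests.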
Field | Value |
---|---|
Original language | English |
Title of host publication | Handbook of big data technologies |
Editors | Albert Y. Zomaya, Sherif Sakr |
Place of Publication | Cham, Switzerland |
Publisher | Springer, Springer Nature |
Chapter | 13 |
Pages | 429-453 |
Number of pages | 25 |
ISBN (Electronic) | 9783319493404 |
ISBN (Print) | 9783319493398 |
Publication status | Published - 25 Feb 2017 |
Externally published | Yes |