TY - GEN
T1 - Ontology augmentation via attribute extraction from multiple types of sources
AU - Fang, Xiu Susie
AU - Wang, Xianzhi
AU - Sheng, Quan Z.
PY - 2015
Y1 - 2015
N2 - A comprehensive ontology can ease the discovery, maintenance and popularization of knowledge in many domains. As a means of enhancing existing ontologies, attribute extraction has attracted considerable research attention. However, most existing attribute extraction techniques explore only a single type of source, such as structured (e.g., relational databases), semi-structured (e.g., Extensible Markup Language (XML)) or unstructured sources (e.g., Web texts, images), which leads to poor coverage of knowledge bases (KBs). This paper presents a framework for ontology augmentation that extracts attributes from four types of sources, namely existing KBs, query stream, Web texts, and Document Object Model (DOM) trees. In particular, we use the query stream and two major KBs, DBpedia and Freebase, to seed attribute extraction from Web texts and DOM trees. We focus especially on extraction from DOM trees, which has rarely been studied in previous work. Algorithms and a series of filters are developed. Experiments show the capability of our approach in augmenting existing KB ontologies.
KW - DOM tree
KW - Information extraction
KW - Knowledge base
KW - Web data
UR - http://www.scopus.com/inward/record.url?scp=84959440354&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-19548-3_2
DO - 10.1007/978-3-319-19548-3_2
M3 - Conference proceeding contribution
AN - SCOPUS:84959440354
SN - 9783319195476
T3 - Lecture Notes in Computer Science
SP - 16
EP - 27
BT - Databases Theory and Applications
A2 - Sharaf, Mohamed A.
A2 - Cheema, Muhammad Aamir
A2 - Qi, Jianzhong
PB - Springer, Springer Nature
CY - Cham, Switzerland
T2 - 26th Australasian Database Conference, ADC 2015
Y2 - 4 June 2015 through 7 June 2015
ER -