Improving data discovery in metadata repositories through semantic search

Chad Berkley*, Shawn Bowers, Matthew B. Jones, Joshua S. Madin, Mark Schildhauer

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution

20 Citations (Scopus)
6 Downloads (Pure)

Abstract

The amount of ecological data available electronically is increasing at a rapid rate, e.g., over 15,000 data sets are available today in the Knowledge Network for Biocomplexity (KNB) alone. Using the existing search capabilities of these online data repositories, however, scientists struggle to quickly locate data that are relevant to their needs or that will integrate with their current data sets. Semantic technologies aim at addressing many of these problems and hold the promise of enabling more powerful "smart" searches of online data archives. We describe new semantic search features within the Metacat metadata system, which is used by many ecological research sites around the world for archiving their data using a standardized metadata format. Our semantic search system adds to Metacat the ability to store OWL-DL ontologies in addition to semantic annotations that link data set attributes to ontology terms. Our approach also extends Metacat to improve metadata search in multiple ways: (i) by expanding standard keyword searches with ontology term hierarchies; (ii) by allowing keyword searches to be applied to annotations in addition to traditional metadata; and (iii) by allowing more structured searches over annotations via ontology terms. We describe our implementation of these extensions, and compare and contrast these different types of search for a corpus of annotated documents. As data repositories continue to grow, these tools will be instrumental in helping scientists precisely locate and then interpret data for their research needs.
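The three search modes listed in the abstract can be illustrated with a toy sketch. This is not Metacat's API or the authors' implementation. The ontology terms, the `DataSet` class, the corpus, and all function names are hypothetical. A plain dictionary of subclass links stands in for an OWL-DL ontology, and substring matching stands in for keyword search.

```python
from dataclasses import dataclass, field

# Toy ontology (hypothetical terms): each term maps to its direct subclasses.
SUBCLASSES = {
    "Biomass": ["DryMass", "WetMass"],
    "DryMass": [],
    "WetMass": [],
    "Height": [],
}


@dataclass
class DataSet:
    doc_id: str
    metadata: str  # free-text metadata such as a title or abstract
    annotations: dict = field(default_factory=dict)  # attribute name -> ontology term


# Hypothetical annotated corpus.
CORPUS = [
    DataSet("ds1", "Plant biomass survey", {"wt": "DryMass"}),
    DataSet("ds2", "Tree measurements", {"h": "Height", "m": "WetMass"}),
    DataSet("ds3", "Soil samples with drymass column", {}),
]


def descendants(term):
    """Return the term followed by all of its transitive subclasses."""
    found = [term]
    for child in SUBCLASSES.get(term, []):
        found.extend(descendants(child))
    return found


def expanded_keyword_search(keyword):
    """(i) Match metadata text against the keyword and its subclass names."""
    words = [t.lower() for t in descendants(keyword)]
    return [d.doc_id for d in CORPUS
            if any(w in d.metadata.lower() for w in words)]


def annotation_keyword_search(keyword):
    """(ii) Match the keyword against metadata text and annotation term names."""
    k = keyword.lower()
    return [d.doc_id for d in CORPUS
            if k in d.metadata.lower()
            or any(k in term.lower() for term in d.annotations.values())]


def structured_term_search(term):
    """(iii) Match data sets with an attribute annotated by the term or a subclass."""
    wanted = set(descendants(term))
    return [d.doc_id for d in CORPUS if wanted & set(d.annotations.values())]
```

In this toy corpus, a plain metadata keyword match on "Biomass" would find only `ds1`. The expanded search also finds `ds3` through the subclass name "DryMass", and the structured search finds `ds2` through its `WetMass` annotation even though its metadata text never mentions biomass.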

Original language: English
Title of host publication: Proceedings of the International Conference on Complex, Intelligent and Software Intensive Systems, CISIS 2009
Place of Publication: Los Alamitos, CA, USA
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 1152-1159
Number of pages: 8
ISBN (Print): 9780769535753
DOIs: https://doi.org/10.1109/CISIS.2009.122
Publication status: Published - 2009
Event: International Conference on Complex, Intelligent and Software Intensive Systems, CISIS 2009 - Fukuoka, Japan
Duration: 16 Mar 2009 to 19 Mar 2009

Other

Other: International Conference on Complex, Intelligent and Software Intensive Systems, CISIS 2009
Country: Japan
City: Fukuoka
Period: 16/03/09 to 19/03/09

Bibliographical note

Copyright 2009 IEEE. Reprinted from Proceedings of the International Conference on Complex, Intelligent and Software Intensive Systems : 16-19 March 2009, Fukuoka, Fukuoka Prefecture, Japan. This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of Macquarie University’s products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.


Cite this

Berkley, C., Bowers, S., Jones, M. B., Madin, J. S., & Schildhauer, M. (2009). Improving data discovery in metadata repositories through semantic search. In Proceedings of the International Conference on Complex, Intelligent and Software Intensive Systems, CISIS 2009 (pp. 1152-1159). [5066940] Los Alamitos, CA, USA: Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/CISIS.2009.122