IRMA-International.org: Creator of Knowledge
Information Resources Management Association
Advancing the Concepts & Practices of Information Resources Management in Modern Organizations

Acquiring Semantic Sibling Associations from Web Documents

Author(s): Marko Brunzel (German Research Center for Artificial Intelligence (DFKI GmbH), Germany) and Myra Spiliopoulou (Otto-von-Guericke-Universität Magdeburg, Germany)
Copyright: 2008
Pages: 17
Source title: Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications
Source Author(s)/Editor(s): John Wang (Montclair State University, USA)
DOI: 10.4018/978-1-59904-951-9.ch118

Abstract

The automated discovery of relationships among terms contributes to the automation of the ontology engineering process and allows for sophisticated query expansion in information retrieval. While there are many findings on the identification of direct hierarchical relations among concepts, less attention has been paid to the discovery of sibling terms. These are terms that share a common, a priori unknown parent, such as co-hyponyms and co-meronyms. In this study, we present our results on the discovery of pairs or groups of sibling terms with XTREEM-SA (Xhtml TREE mining for sibling associations), an algorithm that extracts semantics from Web documents. While conventional methods process an appropriately prepared corpus, XTREEM-SA takes as input an arbitrary collection of Web documents on a given topic and finds sibling relations between terms in this corpus. It is thus independent of domain and language, does not require linguistic preprocessing, and does not rely on syntactic or other rules of text formation. We describe XTREEM-SA and evaluate it against two reference ontologies. In this context, we also elaborate on the challenges of evaluating semantics extracted from the Web against handcrafted ontologies of high quality but possibly low coverage.
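
The abstract does not spell out the steps of XTREEM-SA, so the following Python fragment is only an illustrative sketch of the general idea of mining markup structure for sibling candidates, not the authors' algorithm. It assumes well-formed XHTML input, treats the texts of same-tag elements under a common parent as one candidate group, and counts how often two terms from a given vocabulary land in the same group. The class `SiblingGroupParser`, the function `sibling_pair_counts`, the vocabulary, and the sample documents are all hypothetical names and data introduced for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
from itertools import combinations

# Elements that never have a closing tag; skipped so the stack stays balanced.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class SiblingGroupParser(HTMLParser):
    """Collects texts of elements that share the same parent and tag name."""

    def __init__(self):
        super().__init__()
        self.stack = []   # open elements: [tag, text_parts, texts_of_children_by_tag]
        self.groups = []  # candidate sibling groups (lists of lowercased texts)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append([tag, [], {}])

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            self.stack[-1][1].append(text)

    def handle_endtag(self, tag):
        if not self.stack or self.stack[-1][0] != tag:
            return  # assumption: input is well formed; mismatches are ignored
        tag, parts, children = self.stack.pop()
        # Same-tag children of this element form one candidate group.
        for texts in children.values():
            if len(texts) >= 2:
                self.groups.append(texts)
        # Pass this element's full text up to its parent.
        text = " ".join(parts)
        if text and self.stack:
            self.stack[-1][2].setdefault(tag, []).append(text.lower())
            self.stack[-1][1].append(text)


def sibling_pair_counts(documents, vocabulary):
    """Counts co-occurrences of vocabulary terms within candidate groups."""
    counts = Counter()
    for html in documents:
        parser = SiblingGroupParser()
        parser.feed(html)
        parser.close()
        for group in parser.groups:
            terms = sorted(set(group) & vocabulary)
            counts.update(combinations(terms, 2))
    return counts


# Hypothetical sample data, not taken from the chapter.
docs = [
    "<html><body><ul><li>Hotel</li><li>Hostel</li><li>Campsite</li></ul></body></html>",
    "<html><body><p>Stay at a <b>hotel</b> or a <b>hostel</b>.</p></body></html>",
]
counts = sibling_pair_counts(docs, {"hotel", "hostel", "campsite"})
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs that recur across many documents would be the stronger sibling candidates in such a scheme. Because the sketch relies only on markup structure and a term list, it needs no linguistic preprocessing, which mirrors the property the abstract claims for XTREEM-SA.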
