OntoExtractor: A Tool for Semi-Automatic Generation and Maintenance of Taxonomies from Semi-Structured Documents
Abstract
This chapter introduces OntoExtractor, a tool for the semi-automatic generation of a taxonomy from a set of documents or data sources. The tool builds the taxonomy bottom-up. Starting from a structural analysis of the documents, it produces a set of clusters, which can then be refined by a further grouping based on content analysis. Metadata describing the content of each cluster is generated automatically and analysed by the tool to produce the final taxonomy. The chapter also describes a simulation of a tool for maintaining the taxonomy, based on an implicit and explicit voting mechanism. The author depicts a system that can generate the taxonomy from heterogeneous sources of information, using wrappers to convert each document from its original format to a structured one. In this way, OntoExtractor can generate the taxonomy from virtually any source of information simply by adding the appropriate wrapper. Moreover, the trust mechanism provides a reliable method for maintaining the taxonomy and for correcting the wrong classes that are unavoidably generated.
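The two-stage, bottom-up clustering summarised in the abstract can be illustrated with a minimal sketch. This is not OntoExtractor's actual algorithm, which the abstract does not specify. It assumes that a wrapper has already converted each document into a flat Python dict, that the set of field names stands in for the document's structure, and that Jaccard similarity over word sets stands in for content analysis. The names `structural_signature`, `tokens`, `jaccard`, `cluster`, and the `threshold` parameter are all hypothetical.

```python
from collections import defaultdict


def structural_signature(doc):
    """Hypothetical structure key: the sorted field names of a wrapped document."""
    return tuple(sorted(doc.keys()))


def tokens(doc):
    """Lower-cased word set over all field values (a stand-in for content analysis)."""
    return {word.lower() for value in doc.values() for word in str(value).split()}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.3):
    """Group documents by structure, then refine each group by content."""
    # Stage 1: documents with the same field layout share a structural cluster.
    by_structure = defaultdict(list)
    for doc in docs:
        by_structure[structural_signature(doc)].append(doc)

    # Stage 2: greedy single pass; a document joins the first sub-cluster
    # whose first member is similar enough, otherwise it starts a new one.
    clusters = []
    for signature, group in by_structure.items():
        refined = []  # list of (token set of first member, members)
        for doc in group:
            doc_tokens = tokens(doc)
            for representative, members in refined:
                if jaccard(representative, doc_tokens) >= threshold:
                    members.append(doc)
                    break
            else:
                refined.append((doc_tokens, [doc]))
        clusters.extend((signature, members) for _, members in refined)
    return clusters


# Illustrative input only; these documents are invented for the example.
docs = [
    {"title": "wine red", "body": "red wine from italy"},
    {"title": "wine white", "body": "white wine from italy"},
    {"title": "car engine", "body": "diesel engine repair"},
    {"name": "x", "price": "10"},
]
result = cluster(docs)
```

In this sketch the structural pass runs first because it is cheap and partitions the input, so the content comparison only runs within each structural group. A real system would presumably use richer structural and content representations than field names and word sets.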