This paper describes our algorithm for retrieving textual information from Wikipedia. Experiments show that the algorithm improves standard measures of retrieval quality. The improvement is achieved with a two-phase approach. In the first phase, the algorithm extends the set of content indexed by the specified keywords, which increases Recall. In the second phase, the user is presented with so-called Conceptual Directions, and this interaction is used to filter the search results, which increases Precision. A preliminary evaluation on multi-sense test phrases indicates that the algorithm can increase Precision within the result set without loss of Recall. We also describe an additional method for extending the result set, based on creating cluster prototypes and finding the most similar content in the text repository that has not yet been retrieved. In our demo implementation, a web portal, clustering is used to present the search results organized in thematic groups instead of as a ranked list.
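The two-phase idea can be illustrated with a toy sketch. The abstract does not specify how the result set is extended or how Conceptual Directions are derived, so everything below is an assumption made for illustration: the four-article corpus, the `related` links used for expansion, the `direction` labels, and the function names `keyword_search`, `expand`, and `refine` are all hypothetical and are not the authors' implementation.

```python
# Toy illustration of a two-phase retrieval flow (hypothetical data and names).
# Phase 1 widens the result set to raise Recall; phase 2 filters it by a
# user-chosen direction to raise Precision.

# Hypothetical corpus: each article has index keywords, a thematic label
# standing in for a "Conceptual Direction", and links to related articles.
CORPUS = {
    "Jaguar (animal)": {"keywords": {"jaguar"}, "direction": "Animals",
                        "related": ["Panthera onca"]},
    "Jaguar Cars": {"keywords": {"jaguar"}, "direction": "Automobiles",
                    "related": ["Jaguar E-Type"]},
    "Panthera onca": {"keywords": {"panthera"}, "direction": "Animals",
                      "related": []},
    "Jaguar E-Type": {"keywords": {"e-type"}, "direction": "Automobiles",
                      "related": []},
}

# Articles the user actually wants (the animal sense of "jaguar").
RELEVANT = {"Jaguar (animal)", "Panthera onca"}


def keyword_search(query):
    """Return titles of articles indexed by the query keyword."""
    return {title for title, art in CORPUS.items() if query in art["keywords"]}


def expand(results):
    """Phase 1 (assumed mechanism): add articles related to the initial hits."""
    expanded = set(results)
    for title in results:
        expanded.update(CORPUS[title]["related"])
    return expanded


def refine(results, direction):
    """Phase 2: keep only results that match the direction the user picked."""
    return {title for title in results if CORPUS[title]["direction"] == direction}


def precision_recall(retrieved, relevant):
    """Compute (Precision, Recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    initial = keyword_search("jaguar")
    widened = expand(initial)
    refined = refine(widened, "Animals")
    for name, result in [("keyword only", initial),
                         ("after expansion", widened),
                         ("after refinement", refined)]:
        print(name, precision_recall(result, RELEVANT))
```

On this toy data, keyword search alone gives Precision 0.5 and Recall 0.5, expansion raises Recall to 1.0 while Precision stays at 0.5, and filtering by the "Animals" direction raises Precision to 1.0 without lowering Recall. The numbers only show the mechanics of the two measures and say nothing about the results reported in the paper.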
Authors
Additional information
- DOI: 10.1007/978-3-319-14977-6_42
- Category: Conference activity
- Type: conference proceedings indexed in Web of Science
- Language: English
- Publication year: 2015