Publications Repository - Gdańsk University of Technology

Self–Organizing Map representation for clustering Wikipedia search results

The article presents an approach to the automated organization of textual data. The experiments were performed on a selected subset of Wikipedia. A term-based Vector Space Model representation was used to build groups of similar articles, extracted from Kohonen Self-Organizing Maps with DBSCAN clustering. To keep data processing efficient, we performed linear dimensionality reduction of the raw data using Principal Component Analysis. We introduce a hierarchical organization of the categorized articles by changing the granularity of the SOM network. The categorization method has been used in the implementation of a system that clusters the results of keyword-based search in the Polish Wikipedia.
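The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`tf_vectors`, `train_som`, `dbscan`, `cluster_documents`), the plain term-frequency weighting, the grid size, and the `eps`/`min_pts` values are all assumptions chosen for illustration, and the PCA reduction step is left out to keep the example within the standard library. The sketch builds term vectors, trains a small Self-Organizing Map, runs DBSCAN over the SOM unit vectors, and gives each document the cluster label of its best-matching unit.

```python
import math
import random
from collections import Counter


def tf_vectors(docs):
    """Build L2-normalised term-frequency vectors (a simple Vector Space Model)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        v = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bmu(units, v):
    """Index of the best-matching unit (closest SOM unit) for vector v."""
    return min(range(len(units)), key=lambda i: dist(units[i], v))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM with a Gaussian neighbourhood (assumed schedule)."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    units = [list(rng.choice(data)) for _ in grid]  # initialise from samples
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        for v in rng.sample(data, len(data)):
            b = bmu(units, v)
            for i, (r, c) in enumerate(grid):
                d2 = (r - grid[b][0]) ** 2 + (c - grid[b][1]) ** 2
                h = lr * math.exp(-d2 / (2 * sigma * sigma))
                units[i] = [u + h * (x - u) for u, x in zip(units[i], v)]
    return units


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbours(j)
            if len(nbj) >= min_pts:  # only core points expand the cluster
                queue.extend(nbj)
    return labels


def cluster_documents(docs, rows=2, cols=2, eps=0.5, min_pts=1):
    """VSM -> SOM -> DBSCAN on SOM units -> label each document via its BMU."""
    vectors = tf_vectors(docs)
    units = train_som(vectors, rows, cols)
    unit_labels = dbscan(units, eps, min_pts)
    return [unit_labels[bmu(units, v)] for v in vectors]


if __name__ == "__main__":
    docs = ["som map neurons", "som map neurons", "river water fish"]
    print(cluster_documents(docs))
```

Calling `cluster_documents` with a larger grid (for example `rows=4, cols=4`) gives a finer partition of the same documents, which is one way to picture the hierarchy obtained by changing SOM granularity.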

Authors

Additional information

DOI
10.1007/978-3-642-20042-7_15
Category
Conference activity
Type
conference proceedings indexed in Web of Science
Language
English
Publication year
2011

Source: MOSTWiedzy.pl - publication "Self–Organizing Map representation for clustering Wikipedia search results"
