Towards Effective Processing of Large Text Collections

Julian Szymański; Henryk Krawczyk

doi:10.1109/intech.2012.6457784

In this article we describe an approach to the parallel implementation of elementary operations for textual data categorization. In the experiments we evaluate parallel computation of similarity matrices and of the k-means algorithm. The test datasets were prepared as graphs created from Wikipedia articles connected by links. When creating the clustering data packages, we compute pairs of eigenvectors and eigenvalues for visualization of the datasets. We describe the method used to evaluate clustering quality. Finally, we discuss the results achieved, point out some improvements, and outline perspectives for future development.
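As an illustration of the kind of operation the abstract refers to, the following is a minimal sketch of a similarity matrix computed row by row by a pool of workers. It is not the authors' implementation: the choice of cosine similarity, the dense list-of-floats document representation, and the names cosine, similarity_row, and similarity_matrix are all assumptions made for this example. A thread pool is used only to show how the work decomposes into independent rows; in CPython it does not give a real speedup for CPU-bound code, so a process pool or a cluster framework would be needed for that.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def cosine(u, v):
    """Cosine similarity of two equal-length dense vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def similarity_row(i, vectors):
    """Row i of the matrix: similarity of document i to every document."""
    return [cosine(vectors[i], v) for v in vectors]


def similarity_matrix(vectors, workers=4):
    """Compute all rows independently; each row is one unit of parallel work."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so row i lands at index i.
        return list(pool.map(lambda i: similarity_row(i, vectors),
                             range(len(vectors))))


if __name__ == "__main__":
    # Three toy "documents" as term-weight vectors (hypothetical data).
    docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in similarity_matrix(docs):
        print([round(x, 3) for x in row])
```

Because each row depends only on the read-only input vectors, the rows can be distributed without synchronization, which is what makes this operation a natural candidate for parallel execution.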


Additional information

DOI: 10.1109/intech.2012.6457784
Category: Conference activity
Type: conference proceedings indexed in Web of Science
Language: English
Year of publication: 2012

Data source: MOSTWiedzy.pl - publication "Towards Effective Processing of Large Text Collections"


Publication repository - Politechnika Gdańska
