The paper analyzes existing approaches to approximate string matching: linear search with Levenshtein distance, and the AllScan and CPMerge algorithms using the cosine, Jaccard and Dice measures. These methods are presented and compared with our approach, which improves indexing time using Locality-Sensitive Hashing. Advantages and drawbacks of each method are identified from theoretical considerations and from empirical evaluations on real-life dictionaries.
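For readers unfamiliar with the measures named in the abstract, the following is a minimal illustrative sketch, not the paper's implementation. It assumes strings are compared either by edit distance or as sets of character bigrams; the function names, the choice of n = 2, and the set-based (unweighted) form of the cosine measure are all assumptions made for this example.

```python
from math import sqrt


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ngrams(s: str, n: int = 2) -> set:
    """Set of character n-grams (n = 2 is an assumption for this sketch)."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def jaccard(x: set, y: set) -> float:
    """|X ∩ Y| / |X ∪ Y|; two empty sets are treated as identical."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0


def dice(x: set, y: set) -> float:
    """2 |X ∩ Y| / (|X| + |Y|); two empty sets are treated as identical."""
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 1.0


def cosine(x: set, y: set) -> float:
    """|X ∩ Y| / sqrt(|X| |Y|), the set-based form of cosine similarity."""
    if not x or not y:
        return 1.0 if not x and not y else 0.0
    return len(x & y) / sqrt(len(x) * len(y))


if __name__ == "__main__":
    a, b = ngrams("night"), ngrams("nacht")
    print(levenshtein("kitten", "sitting"))
    print(jaccard(a, b), dice(a, b), cosine(a, b))
```

A linear search would apply one of these functions to every dictionary entry. Index-based algorithms such as those named above aim to avoid that full scan.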
Additional information
- DOI
- 10.15439/2016e311
- Category
- Conference activity
- Type
- conference proceedings indexed in Web of Science
- Language
- English
- Publication year
- 2016