The algorithm and the software for conducting the procedure of Preprocessing of the reviews of films in the Polish language were developed. This algorithm contains the following steps: Text Adaptation Procedure; Procedure of Tokenization; Procedure of Transforming Words into the Byte Format; Part-of-Speech Tagging; Stemming / Lemmatization Procedure; Presentation of Documents in the Vector Form (Vector Space Model) Procedure; Forming the Documents Models Database Procedure. The experiments of this algorithm conduction on the test sampling of reviews analysis was performed and the main conclusion was formulated.
Authors
Additional information
- Category
- Publikacja w czasopiśmie
- Type
- artykuły w czasopismach recenzowanych i innych wydawnictwach ciągłych
- Language
- angielski
- Publication year
- 2017