This paper investigates the use of machine learning algorithms applied to synthetic data to classify the risk of preeclampsia in pregnant women. Synthetic datasets were generated based on parameter distributions from three real patient studies. Four models were compared: XGBoost, Support Vector Machine (SVM), Random Forest, and Explainable Boosting Machine (EBM). XGBoost and EBM consistently outperformed the other models. An analysis of patient subsets grouped by pregnancy history showed that prediction accuracy was highest for patients in their first pregnancy. The study also explored the efficacy of risk prediction based on various parameters and found that the results vary with the model used and the degree of class balance in the dataset. Finally, an additional test was performed on a dataset annotated by physicians.
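As a rough illustration of what generating synthetic data from parameter distributions with a chosen class balance can look like, the sketch below draws patient-like records from per-class normal distributions. It is not the authors' procedure, which the abstract does not detail. The feature names (`systolic_bp`, `bmi`), the means and standard deviations, the assumption of independent Gaussian features, and the helper `make_synthetic` are all hypothetical.

```python
import random

# Hypothetical per-class parameter distributions: (mean, standard deviation).
# These numbers are illustrative placeholders, NOT values from the study.
PARAMS = {
    0: {"systolic_bp": (115.0, 10.0), "bmi": (24.0, 3.5)},  # low-risk class
    1: {"systolic_bp": (135.0, 12.0), "bmi": (29.0, 4.5)},  # high-risk class
}


def make_synthetic(n, positive_fraction, seed=0):
    """Draw n synthetic records; positive_fraction sets the class balance."""
    rng = random.Random(seed)
    n_pos = round(n * positive_fraction)
    labels = [1] * n_pos + [0] * (n - n_pos)
    rng.shuffle(labels)
    records = []
    for label in labels:
        # Sample each feature independently from its class-specific normal.
        row = {name: rng.gauss(mu, sd) for name, (mu, sd) in PARAMS[label].items()}
        row["label"] = label
        records.append(row)
    return records


# Two datasets that differ only in their degree of class balance.
balanced = make_synthetic(1000, positive_fraction=0.5)
imbalanced = make_synthetic(1000, positive_fraction=0.1)
```

Datasets built this way, differing only in `positive_fraction`, could then be given to each classifier under comparison to see how the degree of class balance affects the results.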
Authors
- dr inż. Magdalena Mazur-Milecka,
- mgr inż. Natalia Kowalczyk,
- Kinga Jaguszewska,
- Dorota Zamkowska,
- Dariusz Wójcik,
- dr n. med Krzysztof Preis,
- Henriette Skov,
- Stefan Rahr Wagner,
- Puk Sandager,
- mgr inż. Milena Sobotka,
- prof. dr hab. inż. Jacek Rumiński
Additional information
- DOI
- 10.1007/978-3-031-38430-1_21
- Category
- Monographic publication
- Type
- chapter or article in a book (collective work or textbook) in a language of international reach
- Language
- English
- Publication year
- 2024