The article addresses the problem of splitting data for k-fold cross-validation so that class proportions are preserved, under the additional constraint that the data is divided into groups that cannot be split across different cross-validation sets. This problem often arises in, for example, medical data processing, where all samples from one patient must be placed in the same cross-validation set. Because the problem is NP-complete, the article proposes and describes a heuristic, anytime, polynomial-time algorithm and compares it experimentally with two simpler algorithms.
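To make the problem statement concrete, here is a minimal Python sketch of a simple greedy baseline. It is not the algorithm proposed in the article, whose details are not given in this record. The function name `greedy_group_stratified_folds` and the squared-deviation cost are assumptions chosen for illustration only.

```python
from collections import Counter, defaultdict


def greedy_group_stratified_folds(labels, groups, k):
    """Return a fold index per sample; each group stays whole in one fold.

    Hypothetical greedy baseline (an assumption, not the article's method):
    groups are placed largest first into the fold where they least increase
    the squared deviation from the ideal per-fold class counts.
    """
    # Count how many samples of each class every group contains.
    group_counts = defaultdict(Counter)
    for y, g in zip(labels, groups):
        group_counts[g][y] += 1

    classes = sorted(set(labels))
    totals = Counter(labels)
    # Ideal number of samples of each class in every fold.
    target = {c: totals[c] / k for c in classes}

    fold_counts = [Counter() for _ in range(k)]
    group_to_fold = {}

    def cost(fold, counts):
        # Change in squared deviation from the target if `counts` joins `fold`.
        delta = 0.0
        for c in classes:
            before = fold_counts[fold][c] - target[c]
            after = before + counts[c]
            delta += after ** 2 - before ** 2
        return delta

    # Largest groups first; ties are broken by group name for determinism.
    order = sorted(group_counts,
                   key=lambda g: (-sum(group_counts[g].values()), str(g)))
    for g in order:
        best = min(range(k), key=lambda f: cost(f, group_counts[g]))
        group_to_fold[g] = best
        fold_counts[best].update(group_counts[g])

    return [group_to_fold[g] for g in groups]


if __name__ == "__main__":
    # Two patients ("a" and "b"), each with two samples of each class.
    labels = [0, 0, 1, 1, 0, 0, 1, 1]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(greedy_group_stratified_folds(labels, groups, k=2))
```

A greedy pass like this runs in polynomial time but offers no optimality guarantee, which is consistent with the abstract's point that the exact problem is NP-complete and calls for heuristics.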
Authors
Additional information
- Category: Monograph publication
- Type: chapter or article in a book (collective work or textbook) in a language of international reach
- Language: English
- Year of publication: 2014