Cross-validated stepwise regression for identification of novel non-nucleoside reverse transcriptase inhibitor resistance associated mutations
Van der Borght, Koen; Van Craenenbroeck, Elke; Lecocq, Pierre; Van Houtte, Margriet; Van Kerckhove, Barbara; Bacheler, Lee; Verbeke, Geert; van Vlijmen, Herman
BMC Bioinformatics vol:12 pages:386
Linear regression models are used to quantitatively predict drug resistance, the phenotype, from the HIV-1 viral genotype. As new antiretroviral drugs become available, new resistance pathways emerge and the number of resistance associated mutations continues to increase. To accurately identify which drug options are left, the main goal of the modeling has been to maximize predictivity and not interpretability. However, we originally selected linear regression as the preferred method for its transparency as opposed to other techniques such as neural networks. Here, we apply a method to lower the complexity of these phenotype prediction models using a 3-fold cross-validated selection of mutations.
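The idea of cross-validated mutation selection can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain forward selection with ordinary least squares, uses 3-fold cross-validated mean squared error as the selection criterion, and runs on synthetic data. The names `cv_mse` and `forward_select`, the `min_gain` stopping threshold, and the simulated mutation effects are all hypothetical.

```python
import numpy as np

def cv_mse(X, y, cols, folds):
    """Cross-validated mean squared error of an OLS fit using columns `cols`."""
    cols = np.asarray(cols, dtype=int)
    errors = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Design matrix: intercept plus the chosen mutation indicator columns.
        A_train = np.column_stack([np.ones(train.sum()), X[train][:, cols]])
        A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx][:, cols]])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        errors.append(np.mean((y[test_idx] - A_test @ beta) ** 2))
    return float(np.mean(errors))

def forward_select(X, y, n_folds=3, min_gain=0.01, seed=0):
    """Add one mutation at a time while the CV error drops by at least min_gain."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    selected = []
    best = cv_mse(X, y, selected, folds)  # intercept-only baseline
    while len(selected) < X.shape[1]:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = [cv_mse(X, y, selected + [j], folds) for j in candidates]
        k = int(np.argmin(scores))
        if best - scores[k] < min_gain:  # assumed stopping rule
            break
        selected.append(candidates[k])
        best = scores[k]
    return selected, best

# Synthetic example: 12 candidate mutations (0/1 indicators), 3 of which
# truly shift the simulated phenotype (e.g. a log fold change in resistance).
rng = np.random.default_rng(1)
X = (rng.random((300, 12)) < 0.3).astype(float)
true_effects = np.zeros(12)
true_effects[[2, 5, 9]] = [1.5, 1.0, 0.8]
y = X @ true_effects + 0.1 * rng.standard_normal(300)

selected, cv_error = forward_select(X, y)
print("selected mutation columns:", sorted(selected))
print("3-fold CV mean squared error:", round(cv_error, 4))
```

Scoring candidates on held-out folds rather than on the training fit is what keeps the model small here: a mutation enters only if it improves out-of-sample prediction, so noise columns tend to be left out.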