miSTAR: miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure
In microRNA (miRNA) target prediction, two levels of information typically need to be modeled: the number of potential miRNA binding sites present in a target mRNA and the genomic context of each individual site. Single-model structures cope poorly with this complex training data structure, which consists of feature vectors of unequal length as a consequence of the varying number of miRNA binding sites in different mRNAs. To circumvent this problem, we developed a two-layered, stacked model in which the influence of binding site context is modeled separately. Using logistic regression and random forests, we applied the stacked model approach to a unique data set of 7990 probed miRNA–mRNA interactions, thereby including the largest number of miRNAs in model training to date. Compared to lower-complexity models, a particular stacked model, named miSTAR (miRNA stacked model target prediction; www.mi-star.org), displays higher general performance and higher precision on top-scoring predictions. More importantly, our model outperforms published and widely used miRNA target prediction algorithms. Finally, we highlight flaws in cross-validation schemes for the evaluation of miRNA target prediction models and adopt a fairer and more stringent approach.
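The unequal-length problem described in the abstract, and the way a two-layered structure sidesteps it, can be illustrated with a minimal sketch. This is not the miSTAR implementation: the site features, the fixed weights, and the choice of aggregates (site count, maximum score, summed score) are all hypothetical, and both layers use a plain logistic function with hand-set parameters rather than trained logistic regression or random forest models. The point is only that scoring each site separately and then summarizing the scores gives every miRNA–mRNA pair an input of the same length.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def site_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Layer 1 (hypothetical): score one binding site from its context features."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def aggregate(site_scores: List[float]) -> List[float]:
    """Collapse any number of site scores into a fixed-length vector.

    The three aggregates (count, max, sum) are assumptions for illustration.
    """
    if not site_scores:
        return [0.0, 0.0, 0.0]
    return [float(len(site_scores)), max(site_scores), sum(site_scores)]


def interaction_score(
    sites: List[Sequence[float]],
    w1: Sequence[float], b1: float,
    w2: Sequence[float], b2: float,
) -> float:
    """Layer 2 (hypothetical): score a miRNA-mRNA pair from aggregated site scores."""
    scores = [site_score(s, w1, b1) for s in sites]
    return sigmoid(b2 + sum(w * x for w, x in zip(w2, aggregate(scores))))


# Hand-set, purely illustrative parameters (not fitted to any data).
W1, B1 = [1.5, -0.5], -0.2       # two made-up context features per site
W2, B2 = [0.3, 1.0, 0.5], -1.5   # weights for count, max, sum

# Two mRNAs with a different number of candidate sites for the same miRNA.
mrna_one_site = [[0.9, 0.1]]
mrna_three_sites = [[0.9, 0.1], [0.4, 0.7], [0.2, 0.3]]

p_one = interaction_score(mrna_one_site, W1, B1, W2, B2)
p_three = interaction_score(mrna_three_sites, W1, B1, W2, B2)
```

Both calls go through the same second-layer function even though the inputs contain one and three sites respectively, because `aggregate` always returns three values.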
Gert Van Peer, Ayla De Paepe, Michiel Stock, Jasper Anckaert, Pieter-Jan Volders, Jo Vandesompele, Bernard De Baets and Willem Waegeman
Post date: 05 January 2017
Copyright 2021 Center for Medical Genetics, Gent.