New version of the MDR method for stratified samples
Abstract
A new version of the multifactor dimensionality reduction (MDR) method for identifying relevant factors within a given collection X_1, ..., X_n is introduced for stratified samples in the case of a binary response variable Y. We establish a criterion for strong consistency of estimates (involving a K-fold cross-validation procedure and a penalty) for a specified prediction error function. A cost approach is proposed to compare experiments with random and nonrandom numbers of observations. Analytic results are accompanied by simulations.
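To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of an MDR-style selection step: a K-fold cross-validated estimate of a class-weighted prediction error, minimized over subsets of factors. It is an illustration under stated assumptions, not the estimator studied in the paper. In particular, the penalty is assumed to weight each class by the inverse of its frequency (a balanced error), the factors are assumed to take finitely many values, and the function names `fit_rule`, `balanced_error`, `cv_error` and `select_factors` are hypothetical.

```python
import itertools
import random
from collections import defaultdict


def fit_rule(rows, labels, subset):
    """Plug-in rule on the cells spanned by the factors in `subset`.

    A cell is labelled 1 when its share of cases exceeds its share of
    controls in the training data (assumed balanced-error rule).
    """
    n1 = sum(labels)
    n0 = len(labels) - n1
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for x, y in zip(rows, labels):
        counts[tuple(x[i] for i in subset)][y] += 1
    # cases / n1 > controls / n0, written without division
    return {cell: int(c[1] * n0 > c[0] * n1) for cell, c in counts.items()}


def balanced_error(rule, rows, labels, subset):
    """Average of the per-class error rates (assumed penalty choice)."""
    err, tot = [0, 0], [0, 0]
    for x, y in zip(rows, labels):
        pred = rule.get(tuple(x[i] for i in subset), 0)  # unseen cell -> 0
        tot[y] += 1
        err[y] += int(pred != y)
    parts = [e / t for e, t in zip(err, tot) if t > 0]
    return sum(parts) / len(parts)


def cv_error(rows, labels, subset, k=5, seed=0):
    """K-fold cross-validated estimate of the balanced error."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[j::k] for j in range(k)]
    total = 0.0
    for test in folds:
        held = set(test)
        train = [i for i in idx if i not in held]
        rule = fit_rule([rows[i] for i in train],
                        [labels[i] for i in train], subset)
        total += balanced_error(rule, [rows[i] for i in test],
                                [labels[i] for i in test], subset)
    return total / k


def select_factors(rows, labels, r, k=5):
    """Return the r-subset of factor indices with the smallest CV error."""
    n = len(rows[0])
    return min(itertools.combinations(range(n), r),
               key=lambda s: cv_error(rows, labels, s, k))


# Synthetic illustration: Y depends only on factors 0 and 2.
rng = random.Random(1)
rows = [[rng.randrange(3) for _ in range(4)] for _ in range(400)]
labels = [int((x[0] + x[2]) % 2 == 0) for x in rows]
best = select_factors(rows, labels, r=2)
```

The class-weighted error is used here because, in a stratified design, the proportion of cases in the sample is fixed by the experimenter and need not match the population proportion. The exhaustive search over subsets is only practical for small r and a moderate number of factors.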