Active Effects Selection which Considers Heredity Principle in Multi-Factor Experiment Data Analysis
Abstract
The sparsity principle suggests that only a small number of effects contribute significantly to the response variable of an experiment. Researchers therefore need an efficient selection procedure to identify those active effects. Most procedures in the literature treat each effect as an individual entity, so the selection process works on one effect at a time. Another principle to consider in experimental data analysis is the heredity principle, which allows an interaction effect to be included in the model only if the corresponding main effects are also in the model. This paper addresses the selection problem while taking the heredity principle into account, as Yuan et al. (2007) did using least angle regression (LARS). Instead of selecting effects individually, the proposed approach performs the selection in groups. The advantage of the proposed approach, which uses a genetic algorithm, is that it lets the analyst specify the desired number of effects, which the LARS approach cannot do.
References
Aalaei, S., Shahraki, H., Rowhanimanesh, A., and Eslami, S. (2016). Feature selection using genetic algorithm for breast cancer diagnosis: experiment on three different datasets. Iranian Journal of Basic Medical Sciences, 19:476-482.
Algamal, Z. Y. (2019). Variable selection in count data regression model based on firefly algorithm. Stat. Optim. Inf. Comput., 7:520-529.
Asadzadeh, L. and Zamanifar, K. (2010). An agent-based parallel approach for the job shop scheduling problem with genetic algorithms. Mathematical and Computer Modelling, 52:1957-1965.
Broadhurst, D., Goodacre, R., Jones, A., Rowland, J. J., and Kell, D. B. (1997). Genetic algorithms as a method for variable selection in multiple linear regression and partial least squares regression, with applications to pyrolysis mass spectrometry. Analytica Chimica Acta, 348:71-86.
Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32:407-499.
Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96:1348-1360.
Georgiou, S. (2014). Supersaturated designs: A review of their construction and analysis. Journal of Statistical Planning and Inference, 144:92-109.
Hamada, M. and Wu, C. F. J. (1992). Analysis of designed experiments with complex aliasing. Journal of Quality Technology, 24:130-137.
Lesiak, P. and Bojarczyk, P. (2015). Application of genetic algorithms in design of public transport network. Logistics and Transport, 52:75-81.
Meier, L., van de Geer, S., and Bühlmann, P. (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society. Series B (methodological), 70:53-71.
Raghavarao, D. and Altan, S. (2003). A heuristic analysis of highly fractionated 2^n factorial experiments. Metrika, 156:185-191.
Rais, F., Kamoun, A., Chaabouni, M., Claeys-Bruno, M., Phan-Tan-Luu, R., and Sergent, M. (2009). Supersaturated design for screening factors influencing the preparation of sulfated amides of olive pomace oil fatty acids. Chemometrics and Intelligent Laboratory Systems, 99:71-78.
Rawlings, J., Pantula, S., and Dickey, D. A. (1998). Applied Regression Analysis: A Research Tool, Second Edition. Springer.
Schoen, E. D., Eendebak, P. T., and Nguyen, M. V. M. (2010). Complete enumeration of pure-level and mixed-level orthogonal arrays. Journal of Combinatorial Designs, 18:123-140.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (methodological), 58:267-288.
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., and Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society. Series B (methodological), 67:91-108.
Umbarkar, A. and Sheth, P. (2015). Crossover operators in genetic algorithms: a review. ICTACT Journal on Soft Computing, 6:1083-1092.
Vafaie, H. and De Jong, K. (1992). Genetic algorithms as a tool for feature selection in machine learning. In Proceedings of the 4th International Conference on Tools with Artificial Intelligence.
Vandewater, L., Brusic, V., Wilson, W., Macaulay, L., and Zhang, P. (2015). An adaptive genetic algorithm for selection of blood-based biomarkers for prediction of Alzheimer's disease progression. BMC Bioinformatics, 16:1-10.
Wu, C. F. J. and Hamada, M. (2000). Experiments: Planning, Analysis and Parameter Design Optimization. Wiley, New York.
Yang, J. and Honavar, V. (1997). Feature subset selection using a genetic algorithm. Computer Science Technical Reports, 156.
Yuan, M., Joseph, V. R., and Lin, Y. (2007). An efficient variable selection approach for analyzing designed experiments. Technometrics, 49:430-438.
Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society. Series B (methodological), 68:49-67.
Zelenkov, Y., Fedorova, E., and Chekrizov, D. (2017). Two-step classification method based on genetic algorithm for bankruptcy forecasting. Expert Systems with Applications, 88:393-401.