CSIR Central

A Combinatorial Approach to the Variable Selection in Multiple Linear Regression: Analysis of Selwood et al. Data Set - A Case Study.

IR@CDRI: CSIR-Central Drug Research Institute, Lucknow

 
 
Field Value
 
Creator Prabhakar, Yenamandra S
 
Date 2008-02-25T05:24:55Z
2003
 
Identifier QSAR & Combinatorial Chemistry Science (2003), 22, 538
http://hdl.handle.net/123456789/81
 
Description A combinatorial protocol (CP) is introduced and interfaced with multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR rests primarily on restricting the entry of correlated variables at the model development stage. The method has been used to analyse the Selwood et al. data set [16], and the resulting models are compared with those reported from the GFA [8] and MUSEUM [9] approaches. For this data set, CP-MLR identified three highly independent models (27, 28 and 31) with Q2 values ranging from 0.632 to 0.518. These models are also divergent and unique. Although the present study shares no models with the GFA [8] and MUSEUM [9] results, several descriptors are common to all of these studies, including the present one. A simulation was also carried out on the same data set to explain model formation in CP-MLR. The results indicate that the proposed method should be able to offer solutions for data sets with 50 to 60 descriptors in a reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR, one can identify divergent models and handle data sets larger than the present one without excessive computer time.
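The central idea in the description, keeping correlated variables out of the same candidate model, can be illustrated with a minimal Python sketch. It is not the published CP-MLR procedure: the function names, the toy descriptor values and the 0.8 cutoff are all assumptions made for illustration, and the regression and Q2 steps are omitted.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def candidate_subsets(descriptors, size, cutoff):
    """Yield descriptor-name tuples of the given size whose pairwise
    absolute correlations all stay below the cutoff (hypothetical helper)."""
    names = sorted(descriptors)
    for combo in combinations(names, size):
        if all(abs(pearson(descriptors[a], descriptors[b])) < cutoff
               for a, b in combinations(combo, 2)):
            yield combo


# Hypothetical toy data: four descriptors measured on five compounds.
toy = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly collinear with d1
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "d4": [0.0, 1.0, 1.0, 0.0, 1.0],
}

# Assumed cutoff of 0.8; the record does not state the values used.
subsets = list(candidate_subsets(toy, size=2, cutoff=0.8))
```

In this toy case the pair ("d1", "d2") is rejected before any regression would be fitted, while the other five pairs remain as candidates. Each surviving subset would then be passed to an MLR fit and a cross-validated Q2 check; a lower cutoff prunes more combinations, which is how the cutoff trades coverage against computing time.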
 
Format 360884 bytes
application/pdf
 
Language en
 
Relation CDRI Communication Number 6225
 
Subject Regression analysis
variable selection
combinatorial approach
antimycin A1 analogues
antifilarial
 
Title A Combinatorial Approach to the Variable Selection in Multiple Linear Regression: Analysis of Selwood et al. Data Set - A Case Study.
 
Type Article