Identifying promising compounds from a vast collection of feasible compounds is an important yet challenging problem in the pharmaceutical industry. An efficient solution to this problem would help reduce expenditure at the early stages of drug discovery. In an attempt to solve this problem, Mandal, Wu and Johnson (2006) proposed the SELC algorithm, which was motivated by the SEL algorithm of Wu, Mao and Ma (1990). However, because SELC is not based on any statistical modeling of the data, it fails to extract substantial information from the data to guide the search efficiently. The current approach uses Gaussian Process (GP) modeling to improve upon the SELC method, and is hence named G-SELC. The performance of the proposed methodology is illustrated using four- and five-dimensional test functions, and its higher success rates are demonstrated via simulations. Finally, we apply the proposed approach to a real pharmaceutical data set to find a group of chemical compounds with optimal properties.

Abhyuday Mandal, Pritam Ranjan, and C.F. Jeff Wu

