Pharmacophore Modeling and Docking Based QSAR Studies of Aryl Amidino Isoxazoline Derivatives to Design Potential FXa Inhibitors

In order to elucidate the structural requirements for human factor Xa receptor antagonism, 72 antagonists belonging to isoxazolidine chemical class were selected from the literature and conducted molecular modeling studies. Best binding conformations were isolated by docking selected molecules into the receptor binding site. To further explore the structure-activity relationships within the considered chemical class, a pharmacophore model and QSAR analyses were developed. Pharmacophore models of these inhibitors were established by using the HipHop and HypoGen algorithms implemented in the Catalyst software package. The best quantitative pharmacophore model, which has the highest correlation coefficient (0.92), consisted of two hydrogen bond donor, a hydrophobic aromatic, and a ring aromatic feature. The model was further validated by test set and cross validation method. Molecular shape analysis (MSA) and Molecular field analysis (MFA) were used as the QSAR techniques. Two conformer-based alignment strategies were employed to construct reliable 3D-QSAR models. The docked conformer-based alignment strategy gave the best 3D-QSAR models. The best MFA and MSA models gave a cross-validated coefficient q(2) of 0.641 and 0.816, non-cross-validated r(2) values of 0.736 and 0.902, 20% out r(2) values of 0.743 and 0.871, respectively. The information obtained from molecular modelling studies was very helpful to design some novel selective inhibitors of factor Xa with desired activity.


Introduction
Over the past twenty to thirty years, scientists have used computer models of new chemical entities to help define activity profiles, geometries and reactivities [1]. Computational chemistry/molecular modeling is the science (or art) of representing molecular structures numerically and simulating their behavior with the equations of quantum and classical physics [2]. The computational approaches used to design the drugs are dependent upon the amount of information that is available about the ligand and receptor. Based on the information that is available, one can apply either structure-based or ligand-based molecular design methods [3].
To discover novel ligands for receptors of known structure, investigators often use docking computer programs to screen multi-compound databases for molecules that fit a binding site on the receptor. Ligandfit [4] is one of such a drug discovery software program. QSAR and pharmacophore 12 Suresh B Vepuri et al.: Pharmacophore Modeling and Docking Based QSAR Studies of Aryl Amidino Isoxazoline Derivatives To Design Potential FXa Inhibitors activities for new molecules, perhaps prioritizing or screening a large group of molecules whose activities are not known A pharmacophore is defined as an ensemble of universal chemical features that characterize a specific mode of action of a ligand in the active site of the macromolecule in 3D space [7]. Chemical features are e.g. hydrogen bonds, charge interactions, hydrophobic areas. CATALYST [8] is a pharmacophore generating modeling tool. It has been used successfully, in conjunction with traditional research techniques, to examine the structural properties of existing compounds, develop and quantify a hypothesis which relates these properties to observed activity and utilize these "rules" to predict properties and activities for new chemical entities [8]. Catalyst tools help to rationally design small molecules as drug candidates using 3D pharmacophore and shape-based models, and to suggest potentially active compounds suitable for synthesis and biological testing The clinical usefulness of anticoagulants such as warfarin (Coumadin) and low molecular weight heparins (LMWHs) have generated a great interest in searching for new anticoagulants for the treatment and prevention of thromboembolic diseases. However, warfarin is an indirect inhibitor of blood coagulation and takes several days to reach effective anticoagulant levels [9]. LMWH requires a physiological cofactor, antithrombin III, to inactivate factor Xa (fXa). Therefore, the current drug discovery research focuses on designing small molecule inhibitors that directly act on coagulation factors such as thrombin and fXa.
The direct thrombin inhibitors have shown good antithrombotic efficacy in animal thrombosis models. However, it is not certain whether the antithrombotic effects of these inhibitors can be achieved without bleeding complications in humans [10]. The alternative target of new anticoagulants is inhibition of the earlier sites of the coagulation cascade such as fXa. Factor Xa (fXa), a trypsin-like serine protease, is a critical enzyme which in combination with fVa and Ca 2+ on a phospholipid surface forms the prothrombinase complex that converts biologically inactive prothrombin to thrombin. The activation of thrombin by fXa is a highly amplified process. Because of its central position at the convergent point of the intrinsic and extrinsic pathways of coagulation it is believed that inhibition of fXa may reduce the production of thrombin by either the extrinsic or intrinsic pathways without interfering with a basal level of thrombin activity necessary for normal hemostasis. The present study involves estimation of structure-activity relationship of aryl amidino isoxazoline derivatives which were reported as fXa inhibitors. This study would help in the rational development of new, synthetic, potent fXa inhibitors since it gives an insight into structural requirements for this series of inhibitors with the help of different QSAR methods.

Docking
All docking studies were performed using Accelrys Ligandfit installed on a Silicon Graphic Octane desktop Workstation. Shape and volume of the active site has been identified by selecting the aminoacid residues around the non stranded residue of the protein at 5 A o distance. A set of aryl amidino isoxazoline derivatives from literature was used in this study [11][12][13][14][15] (Table-1 to Table-5). Ten Molecules with varied range of activities were selected from the study set molecules. Conformational search on each selected molecules was done by simplex search method. Generated conformers were docked in the active site and calculated for their interactions. Ten best conformations for each molecule were selected by the system based on dock score and ligand interaction energies.

3D-QSAR
Factor Xa binding affinity data reported by Quan  The major steps of MSA were (1) generation of conformers and energy minimization; (2) hypothesizing an active conformer (global minimum of the most active compound); (3) selecting a candidate shape reference compound (based on the active conformation); (4) performing pairwise molecular superimposition using the maximum common subgroup (MCSG) method; (5) measuring molecular shape commonality using MSA descriptors; (6) determining other molecular features by calculating quantum mechanical, spatial, electronic and conformational parameters; (7) selection of conformers; (8) generation of QSAR equations by genetic function algorithm (GFA) or stepwise regression. A complete list of descriptors used for the study molecules were presented in Table-7. Multiple conformations of each molecule were generated using the Boltzmann jump conformational search method. The upper limit of the number of conformations per molecule was 150. Each conformer was subjected to an energy minimization procedure using the smart minimizer with the Drieding force field to generate the lowest energy conformation for each structure. Varying the conformation of shape reference compound during alignment, two separate studies were conducted. In one study (MSA1) the best conformer of the most active inhibitor 51 was used as shape reference for the alignment. Best conformer was selected based on dock score and interaction energy which were obtained in flexible docking study (Ligandfit). In another study (MSA2) lowest energy conformer obtained in conformational search was used. A rigid fit of atom pairings was performed to superimpose each structure so that it overlays the shape-reference compound.
The major steps of MFA were (1) generating conformers and energy minimization; (2) matching atoms using maximum common substructure (MCS) search and aligning molecules using the default options; (3) setting MFA preferences (rectangular grid with 2.00 A o step size, charges by Gasteiger algorithm, H + , CH 3 and HOas probes); (4) creating the field; (5) analysis by the Genetic partial least squares (G/PLS) method. The MFA formalism calculates probe interaction energies on a rectangular grid around a bundle of active molecules. The surface was generated from a ''Shape Field''. The atomic coordinates of the contributing models were used to compute field values on each point of a 3D grid. Grid size was adjusted to default 2.00 A ˚. The MFA evaluates the energy between a probe (H + , CH 3 and HO -) and a molecular model at a series of points defined by a rectangular grid. Fields of molecules were represented using grids in MFA and each energy associated with an MFA grid point serves as input for the calculation of a QSAR. These energies were added to the study table to form new columns headed according to the probe type. Statistical analysis of data was done using techniques like G/PLS, genetic function approximation (GFA) using QSAR+ environment of Cerius2 software. The GFA technique was used to generate a population of equations rather than one single equation for correlation between biological activity and physicochemical properties. The GFA was done as followed: (1) an initial population of equations is generated by random choice of descriptors; (2) pairs from the population of equations were chosen at random and ''crossovers'' were performed and progeny equations were generated; (3) the fitness of each progeny equation was assessed by the LOF measure; (4) a progeny equation with better fitness score was preserved. The model with a proper balance of all statistical terms explained the variance of the biological activity. The G/PLS algorithm was used as an alternative to a GFA calculation. The G/PLS is derived from two QSAR calculation methods: GFA and partial least squares (PLS). The G/PLS algorithm uses GFA to select appropriate basis functions to be used in a model of the data and PLS regression as the fitting technique to weigh the basis functions' relative contributions in the final model.     The PLS is a generalization of regression, which can handle data with strongly correlated and/or noisy or numerous X variables. r 2 , r and least square error (LSE) were taken as statistical measures for PLS equations, and LOF was noted for the GFA-derived equations. The 3D-QSAR equations generated were validated by PRESS (leave-one-out) and bootstrap statistics which were calculated using the QSAR+ module of the Cerius2 software and the reported parameters were cross-validation r 2 (q 2 ), predicted residual sum of squares (PRESS), standard deviation based on PRESS (SPRESS), standard deviation of error of prediction (SDEP) and bootstrap r 2 (bsr 2 ). Additionally, the final models were subjected to leave-20%-out cross validation with 15 trials in each case.

Pharmacophore Modeling (Catalyst)
All pharmacophore modeling studies were performed using Catalyst 4.11 installed on a Silicon Graphic Octane desktop Workstation. The flexibility of each molecule was represented by a set of energetically reasonable conformers which were generated with the Catalyst catConf module choosing a maximum number of 250 conformers, the best quality generation type, and an energy threshold of 20 kcal/mol beyond the calculated global energy minimum. The number of conformers generated for each molecule was limited to a maximum of 255. Ten hypotheses were generated using these conformers for each of the molecules and estimated activity values were generated after selection of the following features for the drugs; hydrogen bond donor, hydrophobic, negative charge, positive ionizable and ring aromatic. After assessing all 10 hypotheses generated for each data set, the lowest energy cost hypothesis was considered the best. The goodness of the structure activity correlation was estimated by means of the correlation coefficient (r). Catalyst also calculates the total energy cost of the generated pharmacophores from the deviation between the estimated activity and the observed activity, combined with the complexity of the hypothesis (i.e., the number of pharmacophore features). A null hypothesis was additionally calculated, which presumes that there is no relationship in the data and that experimental activities are normally distributed about their mean. Hence, the greater the difference between the energy cost of the generated hypothesis and the energy cost of the null hypothesis, the less likely it is that the hypothesis reflects a chance correlation. This criterion was then used as an assessment of the pharmacophore model selected.

Results & Discussion
Results of docking studies were displayed in Table-6. Difference in interaction energy values and the binding mode of conformations (Fig-1) can be used to explain the activity variance between the molecules. Conformer with better interaction energy values for 51 (most active molecule) was isolated from the others and was used as shape reference to align all other study molecules for 3D-QSAR studies.   Fig-2 and 3 respectively. MSA1 study was found to be superior to MSA2. The best equation obtained from GFA regression (at 100,000 crossovers and F value for inclusion of variables was set to 4) in MSA1 is the following: Activity = -2.1378 -0.022792*<Foct> -0.008514* <NCOSV> -0.324377* <AlogP98> + 0.045833*<Mol Ref>

Molecular field analysis
The generated field was of the rectangular type. The probes used in the study (MFA1) were H + and CH 3 . In another study (MFA2) HOprobe was also included. The charge method used was Gasteiger and the energy cutoff was kept at -30 to +30 kcal. QSAR equations were generated using both G/PLS and GFA method. The number of iterations was set to 1000,000 to obtain the final equation. The mutation probabilities were set to the system defaults. The final best result was obtained with G/PLS in MFA2 and was discussed here. A view of aligned molecules studied in the field was shown in Fig-7 Fig-7); i.e., these represent interactions at points 220 by H + , 674 by CH 3 , etc. Equation 4 is of very good statistical quality. It shows 90.2% explained variance while leave-one-out cross-validation r 2 is found to be 80.1%. The final models were also subjected to leave-20%-out cross-validation tests with 15 trials and the r 2 value between the observed and predicted values were found to be 0.871 and 0.870 respectively ( Table-9). Predicted activity values for each molecule by the QSAR model were given in Table-10. Fig-4

Pharmacophore Model
Catalyst uses a collection of molecules with activity spanning orders of magnitude to construct a useful model of the chemical features and their position in 3D space (Fig-9 and Table-11) necessary for a biological response. After several iterations the model of our 30-molecule data set produced a good correlation when compared with the estimated (r of 0.92, Table-12). The model contained four features necessary for activity, namely two hydrogen bond donor, a hydrophobic aromatic, and a ring aromatic feature (Fig-10 and Table-11). The features were compared with highest and low activity molecules in the dataset using comparefit and found that ring aromatic feature was missed for the later one (Fig-10 & 11). The generated catalyst hypotheses can serve as query features in 3D database (DB) search for virtual screening to detect novel lead compounds as factor Xa inhibitors.

Conclusions
The present 3D-QSAR analysis based on the alignment of best docked conformation explores the spatial, shape and thermodynamic requirements for the binding affinity of aryl amidino isoxazoline derivatives to factor Xa. The MSA-derived equations shows the importance of thermodynamic and quantum mechanical descriptors, molecular refractivity and radius of gyration contribution to activity. The MFA-derived equation shows interaction energies at different grid points with positive, negative and neutral bulk probes. Statistically reliable 3D-QSAR and pharmcophore models obtained from this study on aryl amidino Isoxazoline derivatives suggest that these techniques could be useful to design potent factor Xa inhibitors.