chemogenomics references

[Caldwell1995introduction] J. Caldwell, I. Gardner, and N. Swales. An introduction to drug disposition: the basic principles of absorption, distribution, metabolism, and excretion. Toxicol. Pathol., 23(2):102-114, 1995.
A knowledge of the fate of a drug, its disposition (absorption, distribution, metabolism, and excretion, known by the acronym ADME) and pharmacokinetics (the mathematical description of the rates of these processes and of concentration-time relationships), plays a central role throughout pharmaceutical research and development. These studies aid in the discovery and selection of new chemical entities, support safety assessment, and are critical in defining conditions for safe and effective use in patients. ADME studies provide the only basis for critical judgments from situations where the behavior of the drug is understood to those where it is unknown: this is most important in bridging from animal studies to the human situation. This presentation is intended to provide an introductory overview of the life cycle of a drug in the animal body and indicates the significance of such information for a full understanding of mechanisms of action and toxicity.

Keywords: chemogenomics
[Bockaert1999Molecular] J. Bockaert and J. P. Pin. Molecular tinkering of G protein-coupled receptors: an evolutionary success. EMBO J., 18(7):1723-1729, Apr 1999.
Among membrane-bound receptors, the G protein-coupled receptors (GPCRs) are certainly the most diverse. They have been very successful during evolution, being capable of transducing messages as different as photons, organic odorants, nucleotides, nucleosides, peptides, lipids and proteins. Indirect studies, as well as two-dimensional crystallization of rhodopsin, have led to a useful model of a common 'central core', composed of seven transmembrane helical domains, and its structural modifications during activation. There are at least six families of GPCRs showing no sequence similarity. They use an amazing number of different domains both to bind their ligands and to activate G proteins. The fine-tuning of their coupling to G proteins is regulated by splicing, RNA editing and phosphorylation. Some GPCRs have been found to form either homo- or heterodimers with a structurally different GPCR, but also with membrane-bound proteins having one transmembrane domain such as nina-A, odr-4 or RAMP, the latter being involved in their targeting, function and pharmacology. Finally, some GPCRs are unfaithful to G proteins and interact directly, via their C-terminal domain, with proteins containing PDZ and Enabled/VASP homology (EVH)-like domains.

Keywords: chemogenomics
[Egan2000Prediction] W. J. Egan, K. M. Merz, and J. J. Baldwin. Prediction of drug absorption using multivariate statistics. J. Med. Chem., 43(21):3867-3877, Oct 2000.
Literature data on compounds both well- and poorly-absorbed in humans were used to build a statistical pattern recognition model of passive intestinal absorption. Robust outlier detection was utilized to analyze the well-absorbed compounds, some of which were intermingled with the poorly-absorbed compounds in the model space. Outliers were identified as being actively transported. The descriptors chosen for inclusion in the model were PSA and AlogP98, based on consideration of the physical processes involved in membrane permeability and the interrelationships and redundancies between available descriptors. These descriptors are quite straightforward for a medicinal chemist to interpret, enhancing the utility of the model. Molecular weight, while often used in passive absorption models, was shown to be superfluous, as it is already a component of both PSA and AlogP98. Extensive validation of the model on hundreds of known orally delivered drugs, "drug-like" molecules, and Pharmacopeia, Inc. compounds, which had been assayed for Caco-2 cell permeability, demonstrated a good rate of successful predictions (74-92%, depending on the dataset and exact criterion used).

Keywords: chemogenomics
[Veber2002Molecular] D. F. Veber, S. R. Johnson, H.-Y. Cheng, B. R. Smith, K. W. Ward, and K. D. Kopple. Molecular properties that influence the oral bioavailability of drug candidates. J. Med. Chem., 45(12):2615-2623, Jun 2002.
Oral bioavailability measurements in rats for over 1100 drug candidates studied at SmithKline Beecham Pharmaceuticals (now GlaxoSmithKline) have allowed us to analyze the relative importance of molecular properties considered to influence that drug property. Reduced molecular flexibility, as measured by the number of rotatable bonds, and low polar surface area or total hydrogen bond count (sum of donors and acceptors) are found to be important predictors of good oral bioavailability, independent of molecular weight. That on average both the number of rotatable bonds and polar surface area or hydrogen bond count tend to increase with molecular weight may in part explain the success of the molecular weight parameter in predicting oral bioavailability. The commonly applied molecular weight cutoff at 500 does not itself significantly separate compounds with poor oral bioavailability from those with acceptable values in this extensive data set. Our observations suggest that compounds which meet only the two criteria of (1) 10 or fewer rotatable bonds and (2) polar surface area equal to or less than 140 Å² (or 12 or fewer H-bond donors and acceptors) will have a high probability of good oral bioavailability in the rat. Data sets for the artificial membrane permeation rate and for clearance in the rat were also examined. Reduced polar surface area correlates better with increased permeation rate than does lipophilicity (C log P), and increased rotatable bond count has a negative effect on the permeation rate. A threshold permeation rate is a prerequisite of oral bioavailability. The rotatable bond count does not correlate with the data examined here for the in vivo clearance rate in the rat.

Keywords: chemogenomics
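
The two criteria quoted in the Veber et al. abstract above can be written as a small filter. This is only an illustrative sketch: the `Descriptors` container and the function name are hypothetical, the descriptor values are assumed to be computed elsewhere, and reading the parenthetical H-bond count as an alternative to the polar surface area criterion is an interpretation of the abstract's wording.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    rotatable_bonds: int
    polar_surface_area: float  # in square angstroms
    hbond_donors: int
    hbond_acceptors: int


def passes_veber_criteria(d: Descriptors) -> bool:
    """Return True if the molecule meets both criteria from the abstract.

    (1) 10 or fewer rotatable bonds, and
    (2) polar surface area <= 140 square angstroms, or, as the stated
        alternative, 12 or fewer H-bond donors plus acceptors.
    """
    flexibility_ok = d.rotatable_bonds <= 10
    hbond_count = d.hbond_donors + d.hbond_acceptors
    polarity_ok = d.polar_surface_area <= 140.0 or hbond_count <= 12
    return flexibility_ok and polarity_ok


# Made-up descriptor values, for illustration only.
rigid_and_nonpolar = Descriptors(5, 90.0, 2, 5)
too_flexible = Descriptors(12, 90.0, 2, 5)
```

Note that the abstract reports these criteria for oral bioavailability in the rat, so a filter of this kind is a coarse prioritization aid rather than a prediction for other species.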
[Schuffenhauer2002ontology] A. Schuffenhauer, J. Zimmermann, R. Stoop, J. J. van der Vyver, S. Lecchini, and E. Jacoby. An ontology for pharmaceutical ligands and its application for in silico screening and library design. J. Chem. Inf. Comput. Sci., 42(4):947-955, 2002.
Annotation efforts in biosciences have focused in past years mainly on the annotation of genomic sequences. Only very limited effort has been put into annotation schemes for pharmaceutical ligands. Here we propose annotation schemes for the ligands of four major target classes, enzymes, G protein-coupled receptors (GPCRs), nuclear receptors (NRs), and ligand-gated ion channels (LGICs), and outline their usage for in silico screening and combinatorial library design. The proposed schemes cover ligand functionality and hierarchical levels of target classification. The classification schemes are based on those established by the EC, GPCRDB, NuclearDB, and LGICDB. The ligands of the MDL Drug Data Report (MDDR) database serve as a reference data set of known pharmacologically active compounds. All ligands were annotated according to the schemes when attribution was possible based on the activity classification provided by the reference database. The purpose of the ligand-target classification schemes is to allow annotation-based searching of the ligand database. In addition, the biological sequence information of the target is directly linkable to the ligand, hereby allowing sequence similarity-based identification of ligands of next homologous receptors. Ligands of specified levels can easily be retrieved to serve as comprehensive reference sets for cheminformatics-based similarity searches and for design of target class focused compound libraries. Retrospective in silico screening experiments within the MDDR01.1 database, searching for structures binding to dopamine D2, all dopamine receptors and all amine-binding class A GPCRs using known dopamine D2 binding compounds as a reference set, have shown that such reference sets are in particular useful for the identification of ligands binding to receptors closely related to the reference system. The potential for ligand identification drops with increasing phylogenetic distance. 
The analysis of the focus of a tertiary amine based combinatorial library compared to known amine binding class A GPCRs, peptide binding class A GPCRs, and LGIC ligands constitutes a second application scenario which illustrates how the focus of a combinatorial library can be treated quantitatively. The provided annotation schemes, which bridge chem- and bioinformatics by linking ligands to sequences, are expected to be of key utility for further systematic chemogenomics exploration of previously well explored target families.

Keywords: chemogenomics
[Hopkins2002druggable] A. L. Hopkins and C. R. Groom. The druggable genome. Nat. Rev. Drug Discov., 1(9):727-730, Sep 2002.
An assessment of the number of molecular targets that represent an opportunity for therapeutic intervention is crucial to the development of post-genomic research strategies within the pharmaceutical industry. Now that we know the size of the human genome, it is interesting to consider just how many molecular targets this opportunity represents. We start from the position that we understand the properties that are required for a good drug, and therefore must be able to understand what makes a good drug target.

Keywords: chemogenomics
[Balakin2002Property-based] K. V. Balakin, S. E. Tkachenko, S. A. Lang, I. Okun, A. A. Ivashchenko, and N. P. Savchuk. Property-based design of GPCR-targeted library. J. Chem. Inf. Comput. Sci., 42(6):1332-1342, 2002.
The design of a GPCR-targeted library, based on a scoring scheme for the classification of molecules into "GPCR-ligand-like" and "non-GPCR-ligand-like", is outlined. The methodology is a valuable tool that can aid in the selection and prioritization of potential GPCR ligands for bioscreening from large collections of compounds. It is based on the distillation of knowledge from large databases of GPCR and non-GPCR active agents. The method employed a set of descriptors for encoding the molecular structures and by training of a neural network for classifying the molecules. The molecular requirements were profiled and validated by using available databases of GPCR- and non-GPCR-active agents [5736 diverse GPCR-active molecules and 7506 diverse non-GPCR-active molecules from the Ensemble Database (Prous Science, 2002)]. The method enables efficient qualification or disqualification of a molecule as a potential GPCR ligand and represents a useful tool for constraining the size of GPCR-targeted libraries that will help speed up the development of new GPCR-active drugs.

Keywords: chemogenomics
[Schuffenhauer2003Similarity] A. Schuffenhauer, P. Floersheim, P. Acklin, and E. Jacoby. Similarity metrics for ligands reflecting the similarity of the target proteins. J. Chem. Inf. Comput. Sci., 43(2):391-405, 2003.
In this study we evaluate how far the scope of similarity searching can be extended to identify not only ligands binding to the same target as the reference ligand(s) but also ligands of other homologous targets without initially known ligands. This "homology-based similarity searching" requires molecular representations reflecting the ability of a molecule to interact with target proteins. The Similog keys, which are introduced here as a new molecular representation, were designed to fulfill such requirements. They are based only on the molecular constitution and are counts of atom triplets. Each triplet is characterized by the graph distances and the types of its atoms. The atom-typing scheme classifies each atom by its function as H-bond donor or acceptor and by its electronegativity and bulkiness. In this study the Similog keys are investigated in retrospective in silico screening experiments and compared with other conformation independent molecular representations. Studied were molecules of the MDDR database for which the activity data was augmented by standardized target classification information from public protein classification databases. The MDDR molecule set was split randomly into two halves. The first half formed the candidate set. Ligands of four targets (dopamine D2 receptor, opioid delta-receptor, factor Xa serine protease, and progesterone receptor) were taken from the second half to form the respective reference sets. Different similarity calculation methods are used to rank the molecules of the candidate set by their similarity to each of the four reference sets. The accumulated counts of molecules binding to the reference target and groups of targets with decreasing homology to it were examined as a function of the similarity rank for each reference set and similarity method. 
In summary, similarity searching based on Unity 2D-fingerprints or Similog keys is found to be equally effective in the identification of molecules binding to the same target as the reference set. However, the application of the Similog keys is more effective in comparison with the other investigated methods in the identification of ligands binding to any target belonging to the same family as the reference target. We attribute this superiority to the fact that the Similog keys provide a generalization of the chemical elements and that the keys are counted instead of merely noting their presence or absence in a binary form. The second most effective molecular representation is the occurrence counts of the public ISIS key fragments, which, like the Similog method, incorporate key counting as well as a generalization of the chemical elements. The results obtained suggest that ligands for a new target can be identified by the following three-step procedure: 1. Select at least one target with known ligands which is homologous to the new target. 2. Combine the known ligands of the selected target(s) to a reference set. 3. Search candidate ligands for the new targets by their similarity to the reference set using the Similog method. This clearly enlarges the scope of similarity searching from the classical application for a single target to the identification of candidate ligands for whole target families and is expected to be of key utility for further systematic chemogenomics exploration of previously well explored target families.

Keywords: chemogenomics
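
The abstract above attributes the advantage of Similog keys partly to counting keys rather than recording presence or absence. A minimal sketch of similarity ranking over counted keys follows. The abstract does not say which similarity coefficient was used, so the count-based Tanimoto (min/max) coefficient here is an assumption, as are the function names, the string key labels, and the choice to score each candidate by its best match in the reference set.

```python
from collections import Counter


def count_tanimoto(a: Counter, b: Counter) -> float:
    """Count-based Tanimoto: sum of per-key minima over sum of per-key maxima."""
    keys = set(a) | set(b)
    shared = sum(min(a[k], b[k]) for k in keys)  # Counter returns 0 for missing keys
    total = sum(max(a[k], b[k]) for k in keys)
    return shared / total if total else 0.0


def rank_candidates(candidates: dict, reference_set: list) -> list:
    """Rank candidate names by their highest similarity to any reference ligand.

    Assumes reference_set is non-empty.
    """
    scores = {
        name: max(count_tanimoto(keys, ref) for ref in reference_set)
        for name, keys in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Made-up key counts, for illustration only.
reference = [Counter({"k1": 2, "k2": 1})]
candidates = {
    "cand_a": Counter({"k3": 5}),
    "cand_b": Counter({"k1": 2, "k2": 1}),
    "cand_c": Counter({"k1": 1, "k3": 1}),
}
ranking = rank_candidates(candidates, reference)
```

In the three-step procedure described in the abstract, `reference` would hold the combined known ligands of the homologous target(s) and `candidates` the compounds to be searched.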
[Mirzadegan2003Sequence] T. Mirzadegan, G. Benkö, S. Filipek, and K. Palczewski. Sequence analyses of G-protein-coupled receptors: similarities to rhodopsin. Biochemistry, 42(10):2759-2767, Mar 2003.
Keywords: chemogenomics
[Horn2003GPCRDB] F. Horn, E. Bettler, L. Oliveira, F. Campagne, F. E. Cohen, and G. Vriend. GPCRDB information system for G protein-coupled receptors. Nucl. Acids Res., 31(1):294-297, 2003.
The GPCRDB is a molecular class-specific information system that collects, combines, validates and disseminates heterogeneous data on G protein-coupled receptors (GPCRs). The database stores data on sequences, ligand binding constants and mutations. The system also provides computationally derived data such as sequence alignments, homology models, and a series of query and visualization tools. The GPCRDB is updated automatically once every 4-5 months and is freely accessible at http://www.gpcr.org/7tm/.

Keywords: chemogenomics
[Bissantz2003Protein-based] C. Bissantz, P. Bernard, M. Hibert, and D. Rognan. Protein-based virtual screening of chemical databases. II. Are homology models of G-protein coupled receptors suitable targets? Proteins, 50(1):5-25, Jan 2003.
The aim of the current study is to investigate whether homology models of G-Protein-Coupled Receptors (GPCRs) that are based on bovine rhodopsin are reliable enough to be used for virtual screening of chemical databases. Starting from the recently described 2.8 Å-resolution X-ray structure of bovine rhodopsin, homology models of an "antagonist-bound" form of three human GPCRs (dopamine D3 receptor, muscarinic M1 receptor, vasopressin V1a receptor) were constructed. The homology models were used to screen three-dimensional databases using three different docking programs (Dock, FlexX, Gold) in combination with seven scoring functions (ChemScore, Dock, FlexX, Fresno, Gold, Pmf, Score). Rhodopsin-based homology models turned out to be suitable, indeed, for virtual screening since known antagonists seeded in the test databases could be distinguished from randomly chosen molecules. However, such models are not accurate enough for retrieving known agonists. To generate receptor models better suited for agonist screening, we developed a new knowledge- and pharmacophore-based modeling procedure that might partly simulate the conformational changes occurring in the active site during receptor activation. Receptor coordinates generated by this new procedure are now suitable for agonist screening. We thus propose two alternative strategies for the virtual screening of GPCR ligands, relying on a different set of receptor coordinates (antagonist-bound and agonist-bound states).

Keywords: chemogenomics
[Cavasotto2003Structure-based] C. N. Cavasotto, A. J. W. Orry, and R. A. Abagyan. Structure-based identification of binding sites, native ligands and potential inhibitors for G-protein coupled receptors. Proteins, 51(3):423-433, May 2003.
G-protein coupled receptors (GPCRs) are the largest family of cell-surface receptors involved in signal transmission. Drugs associated with GPCRs represent more than one fourth of the 100 top-selling drugs and are the targets of more than half of the current therapeutic agents on the market. Our methodology based on the internal coordinate mechanics (ICM) program can accurately identify the ligand-binding pocket in the currently available crystal structures of seven transmembrane (7TM) proteins [bacteriorhodopsin (BR) and bovine rhodopsin (bRho)]. The binding geometry of the ligand can be accurately predicted by ICM flexible docking with and without the loop regions, a useful finding for GPCR docking because the transmembrane regions are easier to model. We also demonstrate that the native ligand can be identified by flexible docking and scoring in 1.5% and 0.2% (for bRho and BR, respectively) of the best scoring compounds from two different types of compound database. The same procedure can be applied to the database of available chemicals to identify specific GPCR binders. Finally, we demonstrate that even if the sidechain positions in the bRho binding pocket are entirely wrong, their correct conformation can be fully restored with high accuracy (0.28 Å) through the ICM global optimization with and without the ligand present. These binding site adjustments are critical for flexible docking of new ligands to known structures or for docking to GPCR homology models. The ICM docking method has the potential to be used to "de-orphanize" orphan GPCRs (oGPCRs) and to identify antagonists-agonists for GPCRs if an accurate model (experimentally and computationally validated) of the structure has been constructed or when future crystal structures are determined.

Keywords: chemogenomics
[Shacham2004PREDICT] S. Shacham, Y. Marantz, S. Bar-Haim, O. Kalid, D. Warshaviak, N. Avisar, B. Inbal, A. Heifetz, M. Fichman, M. Topf, Z. Naor, S. Noiman, and O. M. Becker. PREDICT modeling and in-silico screening for G-protein coupled receptors. Proteins, 57(1):51-86, Oct 2004.
G-protein coupled receptors (GPCRs) are a major group of drug targets for which only one x-ray structure is known (the nondrugable rhodopsin), limiting the application of structure-based drug discovery to GPCRs. In this paper we present the details of PREDICT, a new algorithmic approach for modeling the 3D structure of GPCRs without relying on homology to rhodopsin. PREDICT, which focuses on the transmembrane domain of GPCRs, starts from the primary sequence of the receptor, simultaneously optimizing multiple 'decoy' conformations of the protein in order to find its most stable structure, culminating in a virtual receptor-ligand complex. In this paper we present a comprehensive analysis of three PREDICT models for the dopamine D2, neurokinin NK1, and neuropeptide Y Y1 receptors. A shorter discussion of the CCR3 receptor model is also included. All models were found to be in good agreement with a large body of experimental data. The quality of the PREDICT models, at least for drug discovery purposes, was evaluated by their successful utilization in in-silico screening. Virtual screening using all three PREDICT models yielded enrichment factors 9-fold to 44-fold better than random screening. Namely, the PREDICT models can be used to identify active small-molecule ligands embedded in large compound libraries with an efficiency comparable to that obtained using crystal structures for non-GPCR targets.

Keywords: chemogenomics
[Perlman2004Multidimensional] Z. E. Perlman, M. D. Slack, Y. Feng, T. J. Mitchison, L. F. Wu, and S. J. Altschuler. Multidimensional drug profiling by automated microscopy. Science, 306(5699):1194-1198, Nov 2004.
We present a method for high-throughput cytological profiling by microscopy. Our system provides quantitative multidimensional measures of individual cell states over wide ranges of perturbations. We profile dose-dependent phenotypic effects of drugs in human cell culture with a titration-invariant similarity score (TISS). This method successfully categorized blinded drugs and suggested targets for drugs of uncertain mechanism. Multivariate single-cell analysis is a starting point for identifying relationships among drug effects at a systems level and a step toward phenotypic profiling at the single-cell level. Our methods will be useful for discovering the mechanism and predicting the toxicity of new drugs.

Keywords: chemogenomics, highcontentscreening
[Okada2004retinal] T. Okada, M. Sugihara, A.-N. Bondar, M. Elstner, P. Entel, and V. Buss. The retinal conformation and its environment in rhodopsin in light of a new 2.2 Å crystal structure. J. Mol. Biol., 342(2):571-583, Sep 2004.
A new high-resolution structure is reported for bovine rhodopsin, the visual pigment in rod photoreceptor cells. Substantial improvement of the resolution limit to 2.2 Å has been achieved by new crystallization conditions, which also reduce significantly the probability of merohedral twinning in the crystals. The new structure completely resolves the polypeptide chain and provides further details of the chromophore binding site including the configuration about the C6-C7 single bond of the 11-cis-retinal Schiff base. Based on both an earlier structure and the new improved model of the protein, a theoretical study of the chromophore geometry has been carried out using combined quantum mechanics/force field molecular dynamics. The consistency between the experimental and calculated chromophore structures is found to be significantly improved for the 2.2 Å model, including the angle of the negatively twisted 6-s-cis-bond. Importantly, the new crystal structure refinement reveals significant negative pre-twist of the C11-C12 double bond and this is also supported by the theoretical calculation although the latter converges to a smaller value. Bond alternation along the unsaturated chain is significant, but weaker in the calculated structure than the one obtained from the X-ray data. Other differences between the experimental and theoretical structures in the chromophore binding site are discussed with respect to the unique spectral properties and excited state reactivity of the chromophore.

Keywords: chemogenomics
[Lin2004Orphan] S. H. S. Lin and O. Civelli. Orphan G protein-coupled receptors: targets for new therapeutic interventions. Ann. Med., 36(3):204-214, 2004.
With the completion of the human genome, many genes will be uncovered with unknown functions. The 'orphan' G protein coupled receptors (GPCRs) are examples of genes without known functions. These are genes that exhibit the seven helical conformation hallmark of the GPCRs but that are called 'orphans' because they are activated by none of the primary messengers known to activate GPCRs in vivo. They are the targets of undiscovered transmitters and this lack of knowledge precludes understanding their function. Yet, because they belong to the supergene family that has the widest regulatory role in the organism, the orphan GPCRs have generated much excitement in academia and industry. They hold much hope for revealing new intercellular interactions that will open new areas of basic research which ultimately will lead to new therapeutic applications. However, the first step in understanding the function of orphan GPCRs is to 'deorphanize' them, to identify their natural transmitters. Here we review the search for the natural primary messengers of orphan GPCRs and focus on two recently deorphanized GPCR systems, the melanin-concentrating hormone (MCH) and prolactin-releasing peptide (PrRP) systems, to illustrate the strategies applied to solve their function and to exemplify the therapeutic potentials that such systems hold.

Keywords: chemogenomics
[Bredel2004Chemogenomics] M. Bredel and E. Jacoby. Chemogenomics: an emerging strategy for rapid target and drug discovery. Nat. Rev. Genet., 5(4):262-275, Apr 2004.
Keywords: chemogenomics
[Becker2004G] O. M. Becker, Y. Marantz, S. Shacham, B. Inbal, A. Heifetz, O. Kalid, S. Bar-Haim, D. Warshaviak, M. Fichman, and S. Noiman. G protein-coupled receptors: in silico drug discovery in 3D. Proc. Natl. Acad. Sci. USA, 101(31):11304-11309, Aug 2004.
The application of structure-based in silico methods to drug discovery is still considered a major challenge, especially when the x-ray structure of the target protein is unknown. Such is the case with human G protein-coupled receptors (GPCRs), one of the most important families of drug targets, where in the absence of x-ray structures, one has to rely on in silico 3D models. We report repeated success in using ab initio in silico GPCR models, generated by the PREDICT method, for blind in silico screening when applied to a set of five different GPCR drug targets. More than 100,000 compounds were typically screened in silico for each target, leading to a selection of <100 "virtual hit" compounds to be tested in the lab. In vitro binding assays of the selected compounds confirm high hit rates, of 12-21% (full dose-response curves, Ki < 5 µM). In most cases, the best hit was a novel compound (New Chemical Entity) in the 1- to 100-nM range, with very promising pharmacological properties, as measured by a variety of in vitro and in vivo assays. These assays validated the quality of the hits as lead compounds for drug discovery. The results demonstrate the usefulness and robustness of ab initio in silico 3D models and of in silico screening for GPCR drug discovery.

Keywords: chemogenomics
[Rolland2005G-protein-coupled] C. Rolland, R. Gozalbes, A. Nicolaï, M.-F. Paugam, L. Coussy, F. Barbosa, D. Horvath, and F. Revah. G-protein-coupled receptor affinity prediction based on the use of a profiling dataset: QSAR design, synthesis, and experimental validation. J. Med. Chem., 48(21):6563-6574, Oct 2005.
A QSAR model accounting for "average" G-protein-coupled receptor (GPCR) binding was built from a large set of experimental standardized binding data (1939 compounds systematically tested over 40 different GPCRs) and applied to the design of a library of "GPCR-predicted" compounds. Three hundred and sixty of these compounds were randomly selected and tested in 21 GPCR binding assays. Positives were defined by their ability to inhibit by more than 70% the binding of reference compounds at 10 µM. A 5.5-fold enrichment in positives was observed when comparing the "GPCR-predicted" compounds with 600 randomly selected compounds predicted as "non-GPCR" from a general collection. The model was efficient in predicting strongest binders, since enrichment was greater for higher cutoffs. Significant enrichment was also observed for peptidic GPCRs and receptors not used to develop the QSAR model, suggesting the usefulness of the model to design ligands binding with newly identified GPCRs, including orphan ones.

Keywords: chemogenomics
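
Several abstracts in this list, including the one above, report screening performance as a fold enrichment. A minimal sketch of that arithmetic is given here, assuming enrichment is taken as the ratio of the positive rate in the selected set to the positive rate in the comparison set. The function name and the counts are hypothetical and are not taken from any of the cited papers.

```python
def fold_enrichment(selected_hits: int, selected_total: int,
                    baseline_hits: int, baseline_total: int) -> float:
    """Ratio of the hit rate in a selected set to the hit rate in a baseline set."""
    if selected_total <= 0 or baseline_total <= 0:
        raise ValueError("set sizes must be positive")
    if baseline_hits <= 0:
        raise ValueError("baseline hit rate is zero; enrichment is undefined")
    selected_rate = selected_hits / selected_total
    baseline_rate = baseline_hits / baseline_total
    return selected_rate / baseline_rate


# Made-up counts: 30 positives among 100 selected compounds versus
# 6 positives among 100 baseline compounds.
example = fold_enrichment(30, 100, 6, 100)
```

Because the ratio depends on how a positive is defined, the same data can give different enrichments at different activity cutoffs, which is consistent with the abstract's remark that enrichment was greater for higher cutoffs.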
[Lapinsh2005Improved] M. Lapinsh, P. Prusis, S. Uhlén, and J. E. S. Wikberg. Improved approach for proteochemometrics modeling: application to organic compound-amine G protein-coupled receptor interactions. Bioinformatics, 21(23):4289-4296, Dec 2005.
MOTIVATION: Proteochemometrics is a novel technology for the analysis of interactions of series of proteins with series of ligands. We have here customized it for analysis of large datasets and evaluated it for the modeling of the interaction of psychoactive organic amines with all the five known families of amine G protein-coupled receptors (GPCRs). RESULTS: The model exploited data for the binding of 22 compounds to 31 amine GPCRs, correlating chemical descriptions and cross-descriptions of compounds and receptors to binding affinity using a novel strategy. A highly valid model (q2 = 0.76) was obtained which was further validated by external predictions using data for 10 other entirely independent compounds, yielding the high q2ext = 0.67. Interpretation of the model reveals molecular interactions that govern psychoactive organic amines overall affinity for amine GPCRs, as well as their selectivity for particular amine GPCRs. The new modeling procedure allows us to obtain fully interpretable proteochemometrics models using essentially unlimited number of ligand and protein descriptors.

Keywords: chemogenomics
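
The abstract above reports model quality as q2 and q2ext. As a reminder of what such a statistic measures, here is a minimal sketch using the commonly used definition, one minus the ratio of the predictive residual sum of squares to the total sum of squares about the observed mean. Whether the paper used exactly this form is an assumption, and the function name and numbers are hypothetical.

```python
def q_squared(observed: list, predicted: list) -> float:
    """Predictive q2 = 1 - PRESS / SS_total.

    `predicted` should hold predictions made for samples that were left out
    of model fitting (cross-validation or an external set), not fitted values.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("observed values are constant; q2 is undefined")
    return 1.0 - press / ss_total


# Made-up affinities and held-out predictions, for illustration only.
q2_example = q_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

A value of 1 means perfect prediction of held-out data, 0 means no better than predicting the observed mean, and negative values mean worse than that.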
[Kratochwil2005automated] N. A. Kratochwil, P. Malherbe, L. Lindemann, M. Ebeling, M. C. Hoener, A. Mühlemann, R. H. P. Porter, M. Stahl, and P. R. Gerber. An automated system for the analysis of G protein-coupled receptor transmembrane binding pockets: alignment, receptor-based pharmacophores, and their application. J. Chem. Inf. Model., 45(5):1324-1336, 2005.
G protein-coupled receptors (GPCRs) share a common architecture consisting of seven transmembrane (TM) domains. Various lines of evidence suggest that this fold provides a generic binding pocket within the TM region for hosting agonists, antagonists, and allosteric modulators. Here, a comprehensive and automated method allowing fast analysis and comparison of these putative binding pockets across the entire GPCR family is presented. The method relies on a robust alignment algorithm based on conservation indices, focusing on pharmacophore-like relationships between amino acids. Analysis of conservation patterns across the GPCR family and alignment to the rhodopsin X-ray structure allows the extraction of the amino acids lining the TM binding pocket in a so-called ligand binding pocket vector (LPV). In a second step, LPVs are translated to simple 3D receptor pharmacophore models, where each amino acid is represented by a single spherical pharmacophore feature and all atomic detail is omitted. Applications of the method include the assessment of selectivity issues, support of mutagenesis studies, and the derivation of rules for focused screening to identify chemical starting points in early drug discovery projects. Because of the coarseness of this 3D receptor pharmacophore model, however, meaningful scoring and ranking procedures of large sets of molecules are not justified. The LPV analysis of the trace amine-associated receptor family and its experimental validation is discussed as an example. The value of the 3D receptor model is demonstrated for a class C GPCR family, the metabotropic glutamate receptors.

Keywords: chemogenomics
[Frimurer2005physicogenetic] T. M. Frimurer, T. Ulven, C. E. Elling, L.-O. Gerlach, E. Kostenis, and T. Högberg. A physicogenetic method to assign ligand-binding relationships between 7TM receptors. Bioorg. Med. Chem. Lett., 15(16):3707-3712, Aug 2005. [ bib | DOI | http | .pdf ]
A computational protocol has been devised to relate 7TM receptor proteins (GPCRs) with respect to physicochemical features of the core ligand-binding site as defined from the crystal structure of bovine rhodopsin. The identification of such receptors that already are associated with ligand information (e.g., small molecule ligands with mutagenesis or SAR data) is used to support structure-guided drug design of novel ligands. A case targeting the newly identified prostaglandin D2 receptor CRTH2 serves as a primary example to illustrate the procedure.

Keywords: chemogenomics
[Freyhult2005Unbiased] E. Freyhult, P. Prusis, M. Lapinsh, J. E. S. Wikberg, V. Moulton, and M. G. Gustafsson. Unbiased descriptor and parameter selection confirms the potential of proteochemometric modelling. BMC Bioinformatics, 6:50, 2005. [ bib | DOI | http ]
BACKGROUND: Proteochemometrics is a new methodology that allows prediction of protein function directly from real interaction measurement data without the need of 3D structure information. Several reported proteochemometric models of ligand-receptor interactions have already yielded significant insights into various forms of bio-molecular interactions. The proteochemometric models are multivariate regression models that predict binding affinity for a particular combination of features of the ligand and protein. Although proteochemometric models have already offered interesting results in various studies, no detailed statistical evaluation of their average predictive power has been performed. In particular, variable subset selection performed to date has always relied on using all available examples, a situation also encountered in microarray gene expression data analysis. RESULTS: A methodology for an unbiased evaluation of the predictive power of proteochemometric models was implemented and results from applying it to two of the largest proteochemometric data sets yet reported are presented. A double cross-validation loop procedure is used to estimate the expected performance of a given design method. The unbiased performance estimates (P2) obtained for the data sets that we consider confirm that properly designed single proteochemometric models have useful predictive power, but that a standard design based on cross validation may yield models with quite limited performance. The results also show that different commercial software packages employed for the design of proteochemometric models may yield very different and therefore misleading performance estimates. In addition, the differences in the models obtained in the double CV loop indicate that detailed chemical interpretation of a single proteochemometric model is uncertain when data sets are small. 
CONCLUSION: The double CV loop employed offers unbiased performance estimates of a given proteochemometric modelling procedure, making it possible to identify cases where the proteochemometric design does not result in useful predictive models. Chemical interpretations of single proteochemometric models are uncertain and should instead be based on all the models selected in the double CV loop employed here.

Keywords: chemogenomics
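The double cross-validation loop described in the Freyhult et al. abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy one-dimensional data, the k-nearest-neighbour regressor standing in for a proteochemometric model, the function names, and the P2-like statistic computed as 1 - PRESS/SS over the outer folds. It is not the authors' implementation.

```python
import random

def k_folds(n, k):
    """Split indices 0..n-1 into k disjoint folds (round-robin)."""
    return [list(range(i, n, k)) for i in range(k)]

def split(data, fold):
    """Return (train, test) lists for one held-out fold of indices."""
    held = set(fold)
    train = [p for i, p in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test

def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def sse(train, test, k):
    """Sum of squared errors of a k-NN model built on train, scored on test."""
    return sum((y - knn_predict(train, x, k)) ** 2 for x, y in test)

def select_k(data, candidates, n_inner):
    """Inner loop: pick the k with the lowest cross-validated error."""
    def cv_error(k):
        return sum(sse(*split(data, fold), k)
                   for fold in k_folds(len(data), n_inner))
    return min(candidates, key=cv_error)

def double_cv(data, candidates, n_outer=5, n_inner=4):
    """Outer loop: each test fold is scored by a model whose k was
    selected using only the corresponding outer training split."""
    press, chosen = 0.0, []
    for fold in k_folds(len(data), n_outer):
        train, test = split(data, fold)
        k = select_k(train, candidates, n_inner)
        chosen.append(k)
        press += sse(train, test, k)
    mean_y = sum(y for _, y in data) / len(data)
    ss = sum((y - mean_y) ** 2 for _, y in data)
    return 1.0 - press / ss, chosen

# Hypothetical toy data: a noisy linear relationship.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(60)]
data = [(x, 2.0 * x + rng.gauss(0, 1)) for x in xs]
p2, chosen_k = double_cv(data, candidates=[1, 3, 5, 7])
```

Because `select_k` only ever sees the outer training split, the estimate is not influenced by the held-out fold; the list `chosen_k` shows how much the selected model varies between outer folds, which relates to the abstract's caution about interpreting a single model.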
[Evers2005Structure-based] A. Evers and T. Klabunde. Structure-based drug discovery using GPCR homology modeling: successful virtual screening for antagonists of the alpha1A adrenergic receptor. J. Med. Chem., 48(4):1088-1097, Feb 2005. [ bib | DOI | http ]
In this paper, we describe homology modeling of the alpha1A receptor based on the X-ray structure of bovine rhodopsin. The protein model has been generated by applying ligand-supported homology modeling, using mutational and ligand SAR data to guide the protein modeling procedure. We performed a virtual screening of the company's compound collection to test how well this model is suited to identify alpha1A antagonists. We applied a hierarchical virtual screening procedure guided by 2D filters and three-dimensional pharmacophore models. The ca. 23,000 filtered compounds were docked into the alpha1A homology model with GOLD and scored with PMF. From the top-ranked compounds, 80 diverse compounds were tested in a radioligand displacement assay. 37 compounds revealed Ki values better than 10 µM; the most active compound binds with 1.4 nM to the alpha1A receptor. Our findings suggest that rhodopsin-based homology models may be used as the structural basis for GPCR lead finding and compound optimization.

Keywords: chemogenomics
[Bock2005Virtual] J. R. Bock and D. A. Gough. Virtual screen for ligands of orphan G protein-coupled receptors. J. Chem. Inf. Model., 45(5):1402-1414, 2005. [ bib | DOI | http | .pdf ]
This paper describes a virtual screening methodology that generates a ranked list of high-binding small molecule ligands for orphan G protein-coupled receptors (oGPCRs), circumventing the requirement for receptor three-dimensional structure determination. Features representing the receptor are based only on physicochemical properties of primary amino acid sequence, and ligand features use the two-dimensional atomic connection topology and atomic properties. An experimental screen comprised nearly 2 million hypothetical oGPCR-ligand complexes, from which it was observed that the top 1.96% predicted affinity scores corresponded to "highly active" ligands against orphan receptors. Results representing predicted high-scoring novel ligands for many oGPCRs are presented here. Validation of the method was carried out in several ways: (1) A random permutation of the structure-activity relationship of the training data was carried out; by comparing test statistic values of the randomized and nonshuffled data, we conclude that the value obtained with nonshuffled data is unlikely to have been encountered by chance. (2) Biological activities linked to the compounds with high cross-target binding affinity were analyzed using computed log-odds from a structure-based program. This information was correlated with literature citations where GPCR-related pathways or processes were linked to the bioactivity in question. (3) Anecdotal, out-of-sample predictions for nicotinic targets and known ligands were performed, with good accuracy in the low-to-high "active" binding range. (4) An out-of-sample consistency check using the commercial antipsychotic drug olanzapine produced "active" to "highly-active" predicted affinities for all oGPCRs in our study, an observation that is consistent with documented findings of cross-target affinity of this compound for many different GPCRs. 
It is suggested that this virtual screening approach may be used in support of the functional characterization of oGPCRs by identifying potential cognate ligands. Ultimately, this approach may have implications for pharmaceutical therapies to modulate the activity of faulty or disease-related cellular signaling pathways. In addition to application to cell surface receptors, this approach is a generalized strategy for discovery of small molecules that may bind intracellular enzymes and involve protein-protein interactions.

Keywords: chemogenomics
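Validation step (1) in the Bock and Gough abstract above, a random permutation of the structure-activity relationship, can be sketched as a generic label-permutation test. The sketch below is illustrative only: the Pearson correlation used as the test statistic, the toy data, and all function names are assumptions, since the abstract does not state which statistic the authors used.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(features, activities, n_perm=999, seed=0):
    """Fraction of shuffled data sets whose statistic is at least as
    extreme as the one obtained with the nonshuffled data."""
    rng = random.Random(seed)
    observed = abs(pearson(features, activities))
    shuffled = list(activities)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the feature-activity link
        if abs(pearson(features, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data with a genuine feature-activity relationship.
rng = random.Random(1)
features = [float(i) for i in range(20)]
activities = [0.5 * x + rng.gauss(0, 0.5) for x in features]
p_value = permutation_p_value(features, activities)
```

A small `p_value` indicates that the statistic obtained with the nonshuffled data is unlikely to have arisen by chance, which is the conclusion the abstract draws for its own test statistic.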
[Martin2005bioavailability] Y. C. Martin. A bioavailability score. J. Med. Chem., 48(9):3164-3170, May 2005. [ bib | DOI | http ]
Responding to a demonstrated need for scientists to forecast the permeability and bioavailability (F) properties of compounds before their purchase, synthesis, or advanced testing, we have developed a score that assigns the probability that a compound will have F > 10% in the rat. Neither the rule-of-five, log P, log D, nor the combination of the number of rotatable bonds and polar surface area successfully categorized compounds. Instead, different properties govern the bioavailability of compounds depending on their predominant charge at biological pH. The fraction of anions with >10% F falls from 85% if the polar surface area (PSA) is ≤ 75 Å², to 56% if 75 < PSA < 150 Å², to 11% if PSA is ≥ 150 Å². On the other hand, whereas 55% of the neutral, zwitterionic, or cationic compounds that pass the rule-of-five have >10% F, only 17% of those that fail have >10% F. This same categorization distinguishes compounds that are poorly permeable from those that are permeable in Caco-2 cells. Further validation is provided with human bioavailability values from the literature.

Keywords: chemogenomics
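The categorization reported in the Martin abstract above can be written as a small lookup function. This is a minimal sketch that uses only the percentages quoted in the abstract; the function name, its arguments, and the charge-class labels are hypothetical, and determining the predominant charge, the polar surface area, and rule-of-five compliance is assumed to happen elsewhere.

```python
def bioavailability_score(charge_class, psa=None, passes_rule_of_five=None):
    """Return the fraction of compounds with F > 10% in the rat for the
    category a compound falls into, using the figures quoted in the
    abstract.

    charge_class: 'anion', 'neutral', 'zwitterion' or 'cation'
                  (predominant charge at biological pH; labels are
                  hypothetical).
    psa: polar surface area in square angstroms (used for anions).
    passes_rule_of_five: bool (used for all other charge classes).
    """
    if charge_class == "anion":
        if psa is None:
            raise ValueError("psa is required for anions")
        if psa <= 75:
            return 0.85
        if psa < 150:
            return 0.56
        return 0.11
    if charge_class in ("neutral", "zwitterion", "cation"):
        if passes_rule_of_five is None:
            raise ValueError("passes_rule_of_five is required")
        return 0.55 if passes_rule_of_five else 0.17
    raise ValueError(f"unknown charge class: {charge_class!r}")

# Hypothetical usage: an anion with PSA of 120, and a cation failing
# the rule-of-five.
anion_score = bioavailability_score("anion", psa=120.0)
cation_score = bioavailability_score("cation", passes_rule_of_five=False)
```

The branch on charge class comes first because, according to the abstract, different properties govern bioavailability depending on the predominant charge: PSA for anions, rule-of-five compliance for everything else.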
[2006Chemical] S. E. Jaroch and H. Weinmann, editors. Chemical Genomics: Small Molecule Probes to Study Cellular Function. Ernst Schering Research Foundation Workshop. Springer, Berlin, 2006. [ bib ]
Keywords: chemogenomics
[Yao2006Coupling] X. Yao, C. Parnot, X. Deupi, V. R. P. Ratnala, G. Swaminath, D. Farrens, and B. Kobilka. Coupling ligand structure to specific conformational switches in the beta2-adrenoceptor. Nat. Chem. Biol., 2(8):417-422, Aug 2006. [ bib | DOI | http ]
G protein-coupled receptors (GPCRs) regulate a wide variety of physiological functions in response to structurally diverse ligands ranging from cations and small organic molecules to peptides and glycoproteins. For many GPCRs, structurally related ligands can have diverse efficacy profiles. To investigate the process of ligand binding and activation, we used fluorescence spectroscopy to study the ability of ligands having different efficacies to induce a specific conformational change in the human beta2-adrenoceptor (beta2-AR). The 'ionic lock' is a molecular switch found in rhodopsin-family GPCRs that has been proposed to link the cytoplasmic ends of transmembrane domains 3 and 6 in the inactive state. We found that most partial agonists were as effective as full agonists in disrupting the ionic lock. Our results show that disruption of this important molecular switch is necessary, but not sufficient, for full activation of the beta2-AR.

Keywords: chemogenomics
[Okuno2006GLIDA] Y. Okuno, J. Yang, K. Taneishi, H. Yabuuchi, and G. Tsujimoto. GLIDA: GPCR-ligand database for chemical genomic drug discovery. Nucleic Acids Res., 34(Database issue):D673-D677, Jan 2006. [ bib | DOI | http ]
G-protein coupled receptors (GPCRs) represent one of the most important families of drug targets in pharmaceutical development. GPCR-LIgand DAtabase (GLIDA) is a novel public GPCR-related chemical genomic database that is primarily focused on the correlation of information between GPCRs and their ligands. It provides correlation data between GPCRs and their ligands, along with chemical information on the ligands, as well as access information to the various web databases regarding GPCRs. These data are connected with each other in a relational database, allowing users in the field of GPCR-related drug discovery to easily retrieve such information from either biological or chemical starting points. GLIDA includes structure similarity search functions for the GPCRs and for their ligands. Thus, GLIDA can provide correlation maps linking the searched homologous GPCRs (or ligands) with their ligands (or GPCRs). By analyzing the correlation patterns between GPCRs and ligands, we can gain more detailed knowledge about their interactions and improve drug design efforts by focusing on inferred candidates for GPCR-specific drugs. GLIDA is publicly available at http://gdds.pharm.kyoto-u.ac.jp:8081/glida. We hope that it will prove very useful for chemical genomic research and GPCR-related drug discovery.

Keywords: chemogenomics
[Klabunde2006Chemogenomics] T. Klabunde and R. Jäger. Chemogenomics approaches to G-protein coupled receptor lead finding. Ernst Schering Res Found Workshop, 58:31-46, 2006. [ bib ]
G-protein coupled receptors (GPCRs) are promising targets for the discovery of novel drugs. In order to identify novel chemical series, high-throughput screening (HTS) is often complemented by rational chemogenomics lead finding approaches. We have compiled a GPCR directed screening set by ligand-based virtual screening of our corporate compound database. This set of compounds is supplemented with novel libraries synthesized around proprietary scaffolds. These target-directed libraries are designed using the knowledge of privileged fragments and pharmacophores to address specific GPCR subfamilies (e.g., purinergic or chemokine-binding GPCRs). Experimental testing of the GPCR collection has provided novel chemical series for several GPCR targets including the adenosine A1, the P2Y12, and the chemokine CCR1 receptor. In addition, GPCR sequence motifs linked to the recognition of GPCR ligands (termed chemoprints) are identified using homology modeling, molecular docking, and experimental profiling. These chemoprints can support the design and synthesis of compound libraries tailor-made for a novel GPCR target.

Keywords: chemogenomics
[Hill2006G-protein-coupled] S. J. Hill. G-protein-coupled receptors: past, present and future. Br. J. Pharmacol., 147 Suppl 1:S27-S37, Jan 2006. [ bib | DOI | http ]
The G-protein-coupled receptor (GPCR) family represents the largest and most versatile group of cell surface receptors. Drugs active at these receptors have therapeutic actions across a wide range of human diseases ranging from allergic rhinitis to pain, hypertension and schizophrenia. This review provides a brief historical overview of the properties and signalling characteristics of this important family of receptors.

Keywords: chemogenomics
[Guba2006Chemogenomics] W. Guba. Chemogenomics strategies for G-protein coupled receptor hit finding. Ernst Schering Res Found Workshop, 58:21-29, 2006. [ bib | DOI ]
Targeting protein superfamilies via chemogenomics is based on a similarity clustering of gene sequences and molecular structures of ligands. Both target and ligand clusters are linked by generating binding affinity profiles of chemotypes vs a target panel. The application of this multidimensional similarity paradigm will be described in the context of Lead Generation to identify novel chemical hit classes for G-protein coupled receptors.

Keywords: chemogenomics
[Erhan2006Collaborative] D. Erhan, P.-J. L'heureux, S. Y. Yue, and Y. Bengio. Collaborative filtering on a family of biological targets. J. Chem. Inf. Model., 46(2):626-635, 2006. [ bib | DOI | http | .pdf ]
Building a QSAR model of a new biological target for which few screening data are available is a statistical challenge. However, the new target may be part of a bigger family, for which we have more screening data. Collaborative filtering or, more generally, multi-task learning, is a machine learning approach that improves the generalization performance of an algorithm by using information from related tasks as an inductive bias. We use collaborative filtering techniques for building predictive models that link multiple targets to multiple examples. The more commonalities between the targets, the better the multi-target model that can be built. We show an example of a multi-target neural network that can use family information to produce a predictive model of an undersampled target. We evaluate JRank, a kernel-based method designed for collaborative filtering. We show their performance on compound prioritization for an HTS campaign and the underlying shared representation between targets. JRank outperformed the neural network both in the single- and multi-target models.

Keywords: chemogenomics
[Okuno2007GLIDA] Y. Okuno, A. Tamon, H. Yabuuchi, S. Niijima, Y. Minowa, K. Tonomura, R. Kunimoto, and C. Feng. GLIDA: GPCR-ligand database for chemical genomics drug discovery - database and tools update. Nucleic Acids Res., 36(Database issue):D907-D912, Nov 2007. [ bib | DOI | http ]
G-protein coupled receptors (GPCRs) represent one of the most important families of drug targets in pharmaceutical development. GLIDA is a public GPCR-related Chemical Genomics database that is primarily focused on the integration of information between GPCRs and their ligands. It provides interaction data between GPCRs and their ligands, along with chemical information on the ligands, as well as biological information regarding GPCRs. These data are connected with each other in a relational database, allowing users in the field of Chemical Genomics research to easily retrieve such information from either biological or chemical starting points. GLIDA includes a variety of similarity search functions for the GPCRs and for their ligands. Thus, GLIDA can provide correlation maps linking the searched homologous GPCRs (or ligands) with their ligands (or GPCRs). By analyzing the correlation patterns between GPCRs and ligands, we can gain more detailed knowledge about their conserved molecular recognition patterns and improve drug design efforts by focusing on inferred candidates for GPCR-specific drugs. This article provides a summary of the GLIDA database and user facilities, and describes recent improvements to database design, data contents, ligand classification programs, similarity search options and graphical interfaces. GLIDA is publicly available at http://pharminfo.pharm.kyoto-u.ac.jp/services/glida/. We hope that it will prove very useful for Chemical Genomics research and GPCR-related drug discovery.

Keywords: chemogenomics
[Kobilka2007G] B. K. Kobilka. G protein coupled receptor structure and activation. Biochim. Biophys. Acta, 1768(4):794-807, Apr 2007. [ bib | DOI | http ]
G protein coupled receptors (GPCRs) are remarkably versatile signaling molecules. The members of this large family of membrane proteins are activated by a spectrum of structurally diverse ligands, and have been shown to modulate the activity of different signaling pathways in a ligand specific manner. In this manuscript I will review what is known about the structure and mechanism of activation of GPCRs focusing primarily on two model systems, rhodopsin and the beta(2) adrenoceptor.

Keywords: chemogenomics
[Jacob2007Kernel] L. Jacob and J.-P. Vert. Kernel methods for in silico chemogenomics. Technical Report 0709.3931v1, arXiv, 2007. [ bib | http ]
Keywords: chemogenomics
[Deupi2007Structural] X. Deupi, N. Dölker, M. L. López-Rodríguez, M. Campillo, J. A. Ballesteros, and L. Pardo. Structural models of class A G protein-coupled receptors as a tool for drug design: insights on transmembrane bundle plasticity. Curr. Top. Med. Chem., 7(10):991-998, 2007. [ bib ]
G protein-coupled receptors (GPCRs) interact with an extraordinary diversity of ligands by means of their extracellular domains and/or the extracellular part of the transmembrane (TM) segments. Each receptor subfamily has developed specific sequence motifs to adjust the structural characteristics of its cognate ligands to a common set of conformational rearrangements of the TM segments near the G protein binding domains during the activation process. Thus, GPCRs have fulfilled this adaptation during their evolution by customizing a preserved 7TM scaffold through conformational plasticity. We use this term to describe the structural differences near the binding site crevices among different receptor subfamilies, responsible for the selective recognition of diverse ligands among different receptor subfamilies. By comparing the sequence of rhodopsin at specific key regions of the TM bundle with the sequences of other GPCRs we have found that the extracellular region of TMs 2 and 3 provides a remarkable example of conformational plasticity within Class A GPCRs. Thus, rhodopsin-based molecular models need to include the plasticity of the binding sites among GPCR families, since the "quality" of these homology models is intimately linked with the success in the processes of rational drug-design or virtual screening of chemical databases.

Keywords: chemogenomics
[Chen2007GPCR] J.-Z. Chen, J. Wang, and X.-Q. Xie. GPCR structure-based virtual screening approach for CB2 antagonist search. J. Chem. Inf. Model., 47(4):1626-1637, 2007. [ bib | DOI | http | .pdf ]
The potential for therapeutic specificity in regulating diseases has made cannabinoid (CB) receptors one of the most important G-protein-coupled receptor (GPCR) targets in the search for new drugs. Considering the lack of related 3D experimental structures, we have established a structure-based virtual screening protocol to search for CB2 bioactive antagonists based on the 3D CB2 homology structure model. However, the existing homology-predicted 3D models often deviate from the native structure and therefore may incorrectly bias the in silico design. To overcome this problem, we have developed a 3D testing database query algorithm to examine the constructed 3D CB2 receptor structure model as well as the predicted binding pocket. In the present study, an antagonist-bound CB2 receptor complex model was initially generated using flexible docking simulation and then further optimized by molecular dynamics and mechanics (MD/MM) calculations. The refined 3D structural model of the CB2-ligand complex was then inspected by exploring the interactions between the receptor and ligands in order to predict the potential CB2 binding pocket for its antagonist. The ligand-receptor complex model and the predicted antagonist binding pockets were further processed and validated by FlexX-Pharm docking against a testing compound database that contains known antagonists. Furthermore, a consensus scoring (CScore) function algorithm was established to rank the binding interaction modes of a ligand on the CB2 receptor. Our results indicated that the known antagonists seeded in the testing database can be distinguished from a large number of randomly chosen molecules. Our studies demonstrated that the established GPCR structure-based virtual screening approach provided a new strategy with a high potential for identifying novel CB2 antagonist leads in silico based on the homology-generated 3D CB2 structure model.

Keywords: chemogenomics
[Catapano2007G] L. A. Catapano and H. K. Manji. G protein-coupled receptors in major psychiatric disorders. Biochim. Biophys. Acta, 1768(4):976-993, Apr 2007. [ bib | DOI | http ]
Keywords: chemogenomics
[Avlani2007Critical] V. A. Avlani, K. J. Gregory, C. J. Morton, M. W. Parker, P. M. Sexton, and A. Christopoulos. Critical role for the second extracellular loop in the binding of both orthosteric and allosteric G protein-coupled receptor ligands. J. Biol. Chem., 282(35):25677-25686, Aug 2007. [ bib | DOI | http ]
The second extracellular (E2) loop of G protein-coupled receptors (GPCRs) plays an essential but poorly understood role in the binding of non-peptidic small molecules. We have utilized both orthosteric ligands and allosteric modulators of the M2 muscarinic acetylcholine receptor, a prototypical Family A GPCR, to probe possible E2 loop binding dynamics. We developed a homology model based on the crystal structure of bovine rhodopsin and predicted novel cysteine substitutions that should dramatically reduce E2 loop flexibility via disulfide bond formation and significantly inhibit the binding of both types of ligands. This prediction was validated experimentally using radioligand binding, dissociation kinetics, and cell-based functional assays. The results argue for a flexible "gatekeeper" role of the E2 loop in the binding of both allosteric and orthosteric GPCR ligands.

Keywords: chemogenomics
[Klabunde2007Chemogenomic] T. Klabunde. Chemogenomic approaches to drug discovery: similar receptors bind similar ligands. Br. J. Pharmacol., 152:5-7, May 2007. [ bib | DOI | http ]
Within recent years, a paradigm shift from traditional receptor-specific studies to a cross-receptor view has taken place within pharmaceutical research to increase the efficiency of modern drug discovery. Receptors are no longer viewed as single entities but grouped into sets of related proteins or receptor families that are explored in a systematic manner. This interdisciplinary approach attempting to derive predictive links between the chemical structures of bioactive molecules and the receptors with which these molecules interact is referred to as chemogenomics. Insights from chemogenomics are used for the rational compilation of screening sets and for the rational design and synthesis of directed chemical libraries to accelerate drug discovery. British Journal of Pharmacology advance online publication, 29 May 2007; doi:10.1038/sj.bjp.0707308.

Keywords: chemogenomics
[Fredholm2007G-protein-coupled] B. B. Fredholm, T. Hökfelt, and G. Milligan. G-protein-coupled receptors: an update. Acta Physiol., 190(1):3-7, May 2007. [ bib | DOI | http ]
The receptors that couple to G proteins (GPCR) and which span the cell membranes seven times (7-TM receptors) were the focus of a symposium in Stockholm 2006. The ensemble of GPCR has now been mapped in several animal species. They remain a major focus of interest in drug development, and their diverse physiological and pathophysiological roles are being clarified, i.a. by genetic targeting. Recent developments hint at novel levels of complexity. First, many, if not all, GPCRs are part of multimeric ensembles, and physiology and pharmacology of a given GPCR may be at least partly guided by the partners it was formed together with. Secondly, at least some GPCRs may be constitutively active. Therefore, drugs that are inverse agonists may prove useful. Furthermore, the level of activity may vary in such a profound way between cells and tissues that this could offer new ways of achieving specificity of drug action. Finally, it is becoming increasingly clear that some of these receptors can signal via novel types of pathways, and hence that 'GPCRs' may not always be G-protein-coupled. Thus there are many challenges for the basic scientist and the drug industry.

Keywords: chemogenomics
[Lefkowitz2008crystal] R. J. Lefkowitz, J.-P. Sun, and A. K. Shukla. A crystal clear view of the beta2-adrenergic receptor. Nat. Biotechnol., 26(2):189-191, Feb 2008. [ bib | DOI | http ]
Keywords: chemogenomics
[Kellenberger2008How] E. Kellenberger, C. Schalon, and D. Rognan. How to measure the similarity between protein ligand-binding sites? Current Computer-Aided Drug Design, 4(3):209-220, Sep 2008. [ bib | DOI | http | .pdf ]
Quantification of local similarity between protein 3D structures is a promising tool in computer-aided drug design and prediction of biological function. Over the last ten years, several computational methods have been proposed, mostly based on geometrical comparisons. This review summarizes the recent literature and gives an overview of available programs. Particular attention is given to the underlying methodologies. Our analysis points out strengths and weaknesses of the various approaches. Although all described methods work relatively well when two binding sites obviously resemble each other, scoring potential solutions remains a difficult issue, especially if the similarity is low. The other challenging question is protein flexibility, which is difficult to evaluate from a static representation. Finally, most of the recently developed techniques are fast and can be applied to large amounts of data. Examples were carefully chosen to illustrate the wide applicability domain of the most popular methods: detection of common structural motifs, identification of secondary targets for a drug-like compound, comparison of binding sites across a functional family, comparison of homology models, and database screening.

Keywords: chemogenomics
[Jacob2008Protein] L. Jacob and J.-P. Vert. Protein-ligand interaction prediction: an improved chemogenomics approach. Bioinformatics, 24(19):2149-2156, 2008. [ bib | DOI | http | .pdf ]
Keywords: chemogenomics
[Jacob2008Efficient] L. Jacob and J.-P. Vert. Efficient peptide-MHC-I binding prediction for alleles with few known binders. Bioinformatics, 24(3):358-366, Feb 2008. [ bib | DOI | http | .pdf ]
MOTIVATION: In silico methods for the prediction of antigenic peptides binding to MHC class I molecules play an increasingly important role in the identification of T-cell epitopes. Statistical and machine learning methods in particular are widely used to score candidate binders based on their similarity with known binders and non-binders. The genes coding for the MHC molecules, however, are highly polymorphic, and statistical methods have difficulties building models for alleles with few known binders. In this context, recent work has demonstrated the utility of leveraging information across alleles to improve the performance of the prediction. RESULTS: We design a support vector machine algorithm that is able to learn peptide-MHC-I binding models for many alleles simultaneously, by sharing binding information across alleles. The sharing of information is controlled by a user-defined measure of similarity between alleles. We show that this similarity can be defined in terms of supertypes, or more directly by comparing key residues known to play a role in the peptide-MHC binding. We illustrate the potential of this approach on various benchmark experiments where it outperforms other state-of-the-art methods. AVAILABILITY: The method is implemented on a web server: http://cbio.ensmp.fr/kiss. All data and codes are freely and publicly available from the authors.

Keywords: chemogenomics immunoinformatics
[Jacob2008Virtual] L. Jacob, B. Hoffmann, V. Stoven, and J.-P. Vert. Virtual screening of GPCRs: an in silico chemogenomics approach. BMC Bioinformatics, 9:363, 2008. [ bib | DOI | http | .pdf ]
Keywords: chemogenomics
[Cavasotto2008Discovery] C. N. Cavasotto, A. J. W. Orry, N. J. Murgolo, M. F. Czarniecki, S. A. Kocsi, B. E. Hawes, K. A. O'Neill, H. Hine, M. S. Burton, J. H. Voigt, R. A. Abagyan, M. L. Bayne, and F. J. Monsma. Discovery of novel chemotypes to a G-protein-coupled receptor through ligand-steered homology modeling and structure-based virtual screening. J. Med. Chem., 51(3):581-588, Feb 2008. [ bib | DOI | http ]
Melanin-concentrating hormone receptor 1 (MCH-R1) is a G-protein-coupled receptor (GPCR) and a target for the development of therapeutics for obesity. The structure-based development of MCH-R1 and other GPCR antagonists is hampered by the lack of an available experimentally determined atomic structure. A ligand-steered homology modeling approach has been developed (where information about existing ligands is used explicitly to shape and optimize the binding site) followed by docking-based virtual screening. Top scoring compounds identified virtually were tested experimentally in an MCH-R1 competitive binding assay, and six novel chemotypes as low micromolar affinity antagonist "hits" were identified. This success rate is more than a 10-fold improvement over random high-throughput screening, which supports our ligand-steered method. Clearly, the ligand-steered homology modeling method reduces the uncertainty of structure modeling for difficult targets like GPCRs.

Keywords: chemogenomics
[Wassermann2009Ligand] A. M. Wassermann, H. Geppert, and J. Bajorath. Ligand prediction for orphan targets using support vector machines and various target-ligand kernels is dominated by nearest neighbor effects. J. Chem. Inf. Model., 49(10):2155-2167, Oct 2009. [ bib | DOI | http | .pdf ]
Support vector machine (SVM) calculations combining protein and small molecule information have been applied to identify ligands for simulated orphan targets (i.e., targets for which no ligands were available). The combination of protein and ligand information was facilitated through the design of target-ligand kernel functions that account for pairwise ligand and target similarity. The design and biological information content of such kernel functions was expected to play a major role for target-directed ligand prediction. Therefore, a variety of target-ligand kernels were implemented to capture different types of target information including sequence, secondary structure, tertiary structure, biophysical properties, ontologies, or structural taxonomy. These kernels were tested in ligand predictions for simulated orphan targets in two target protein systems characterized by the presence of different intertarget relationships. Surprisingly, although there were target- and set-specific differences in prediction rates for alternative target-ligand kernels, the performance of these kernels was overall similar and also similar to SVM linear combinations. Test calculations designed to better understand possible reasons for these observations revealed that ligand information provided by nearest neighbors of orphan targets significantly influenced SVM performance, much more so than the inclusion of protein information. As long as ligands of closely related neighbors of orphan targets were available for SVM learning, orphan target ligands could be well predicted, regardless of the type and sophistication of the kernel function that was used. These findings suggest simplified strategies for SVM-based ligand prediction for orphan targets.

Keywords: chemogenomics, chemoinformatics
[Shivakumar2009Structural] Pavithra Shivakumar and Michael Krauthammer. Structural similarity assessment for drug sensitivity prediction in cancer. BMC Bioinformatics, 10(Suppl 9):S17, 2009. [ bib | DOI | http | .pdf ]
BACKGROUND: The ability to predict drug sensitivity in cancer is one of the exciting promises of pharmacogenomic research. Several groups have demonstrated the ability to predict drug sensitivity by integrating chemo-sensitivity data and associated gene expression measurements from large anti-cancer drug screens such as NCI-60. The general approach is based on comparing gene expression measurements from sensitive and resistant cancer cell lines and deriving drug sensitivity profiles consisting of lists of genes whose expression is predictive of response to a drug. Importantly, it has been shown that such profiles are generic and can be applied to cancer cell lines that are not part of the anti-cancer screen. However, one limitation is that the profiles cannot be generated for untested drugs (i.e., drugs that are not part of an anti-cancer drug screen). In this work, we propose using an existing drug sensitivity profile for drug A as a substitute for an untested drug B given high structural similarities between drugs A and B. RESULTS: We first show that structural similarity between pairs of compounds in the NCI-60 dataset highly correlates with the similarity between their activities across the cancer cell lines. This result shows that structurally similar drugs can be expected to have a similar effect on cancer cell lines. We next set out to test our hypothesis that we can use existing drug sensitivity profiles as substitute profiles for untested drugs. In a cross-validation experiment, we found that the use of substitute profiles is possible without a significant loss of prediction accuracy if the substitute profile was generated from a compound with high structural similarity to the untested compound. CONCLUSION: Anti-cancer drug screens are a valuable resource for generating omics-based drug sensitivity profiles. We show that it is possible to extend the usefulness of existing screens to untested drugs by deriving substitute sensitivity profiles from structurally similar drugs part of the screen.

Keywords: chemogenomics
