In silico Drug Design: some concepts & tools - Success stories


Comments about virtual ligand screening and drug design

(BO Villoutreix, PhD, Research Director Inserm, Feb 10 2014)
The process of drug discovery and development is challenging, time consuming, expensive, and requires consideration of many aspects. Some numbers: it takes 7–15 years and about $1.2 billion to bring a new molecule to the market; only about 5 out of 50,000 compounds tested in animals (themselves usually resulting from the experimental screening of millions of molecules, at a cost of roughly $500,000 per million compounds screened) reach clinical trials, and only 1 of 5 compounds entering clinical studies is approved; it is often said that about 90% of the compounds entering clinical trials will fail.

A first step is to find hits. There are many ways to do this, as illustrated in the figure below:

[Figure: overview of the different approaches used to identify hits]

But finding binders is not enough! One needs to optimize the compounds (hit-to-lead, lead optimization...) and here again in silico approaches can help, for instance with multiparameter optimization tools.



Drug discovery requires a multidisciplinary approach. Several disciplines, skills and techniques (e.g., NMR, X-ray crystallography, bioinformatics, chemoinformatics, medicinal chemistry, toxicology, medical sciences, biology, genetics...) have to work together, as no single technology or science is likely to succeed alone. Drugs can be small chemical compounds, proteins, peptides, vaccines... but here we discuss only small chemical compounds and how in silico approaches can help the process.

To find interesting candidates, one possible way is to find substrates of the target (e.g., an enzyme) and to make them drug-like. Another way is to start with someone else's hit, for example by mining patent databases for new ideas. These starting points can then be modified by combining, for instance, medicinal chemistry and chemoinformatics strategies. Since around the 1990s (and still today), a very common approach to find hits has been high-throughput screening (HTS). In this case one usually assumes that a target is critical to a disease condition (there is of course experimental evidence linking the target to the disease, but the true value of a target is often only fully known at the end of the process, when it can be too late). The target is then screened experimentally, with robots, against hundreds of thousands of compounds. This has drawbacks in cost and time: it can easily take over a year to screen 200,000 compounds, even with robots, not to mention the time needed to analyze the data, and the hit rate can be low (~0.2%, or even lower for challenging targets such as the modulation of protein-protein interactions). Moreover, only a very small part of the chemical space can be explored this way, e.g., 1 million compounds, while the number of possible small drug-like molecules is essentially infinite; only in silico approaches can really explore this almost infinite chemical space.

It is possible to use in silico screening instead of HTS (or together with HTS) and then generate a short list of molecules (e.g., 300 molecules) for experimental assays. Virtual screening can also be used to search for latent hits missed in an HTS project. Of course, combining experimental and virtual screening seems to be the best solution, but to save time and money, often only virtual screening is used, at least in many academic groups and small companies. In silico tools are not perfect: scoring and docking errors, failures due to the use of a wrong training set, difficulties in handling flexibility... are well known, and solutions to these problems are hard to find. Yet, in many cases, the results obtained after in silico screening are interesting, and the hit rate with in silico approaches is generally better than with HTS (from 1% to 10% or more). Many parameters play a role here, such as the preparation of the compound collection (filtering, ADMETox prediction, use of focused libraries, libraries designed to explore new challenging targets such as collections prepared to modulate protein-protein interactions; see for instance our website CDithem for additional information). And again, finding hits is one thing, finding a drug is something else.
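
To give a concrete (and deliberately simplified) idea of what compound-collection filtering can look like in practice, the sketch below applies classical Lipinski-type drug-likeness rules to a small list of molecules. It is an illustration only: RDKit is used here as one freely available toolkit (it is not the specific software discussed above), the thresholds are the standard rule-of-five cutoffs, and the tiny SMILES list is a placeholder, not a real screening collection.

```python
# Minimal sketch of compound-collection filtering prior to virtual screening.
# RDKit is used purely as an illustration; thresholds are the classical
# Lipinski rule-of-five cutoffs.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def passes_basic_filters(smiles: str) -> bool:
    """Return True if the molecule satisfies simple drug-likeness rules."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:                       # unparsable structure -> reject
        return False
    return (Descriptors.MolWt(mol) <= 500
            and Descriptors.MolLogP(mol) <= 5
            and Lipinski.NumHDonors(mol) <= 5
            and Lipinski.NumHAcceptors(mol) <= 10)

# Toy "library": ethanol, benzoic acid, and a very greasy C24 alkane
library = ["CCO", "c1ccccc1C(=O)O", "CCCCCCCCCCCCCCCCCCCCCCCC"]
filtered = [s for s in library if passes_basic_filters(s)]
print(filtered)   # the long alkane is rejected (LogP > 5)
```

In a real project this kind of physicochemical filter is only one of many preparation steps (removal of reactive or promiscuous compounds, ADMETox predictions, design of focused libraries, etc.).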

After in vitro, phenotypic, or in silico screening, hits have to be optimized, and here again many in silico approaches can be used to assist the process. Most of the approaches mentioned above can also be used for chemical biology endeavors. Finding binders is of course very different from finding a drug, yet one important step is to find high-quality starting compounds. In the paragraphs below, I'll give some examples illustrating the use of in silico drug design and screening. Obviously, you can find many others in PubMed or on the Web.

Example 1: AIDS, compound docking, receptor flexibility, new binding pocket and the generation of ideas to design Isentress
A binding pocket for a new class of drugs to treat AIDS was discovered using docking while considering the flexibility of the receptor through molecular dynamics. McCammon and his colleagues used AutoDock in conjunction with the relaxed complex method to discover novel modes of inhibition of HIV integrase (J Med Chem, 2004). Researchers at Merck then used these data to design the orally available raltegravir (an HIV integrase inhibitor, brand name Isentress), which was approved by the Food and Drug Administration in 2007 and received approval for pediatric use in 2011.

Example 2: Virtual screening versus experimental screening, in silico investigations proposed interesting compounds at a reduced cost
An interesting example, which can serve as a proof of principle for the benefit of using in silico approaches, involves a type I TGF-beta receptor kinase inhibitor. The same molecule (HTS-466284), a 27 nM inhibitor, was discovered independently using virtual screening by Biogen IDEC (J. Singh et al., Bioorg. Med. Chem. Lett. 13, 2003, p4355) and traditional enzyme and cell-based high-throughput screening by Eli Lilly (J.S. Sawyer et al., J. Med. Chem. 46, 2003, p3953). The in silico work involved pharmacophore screening of 200,000 compounds and used as a starting point the knowledge of hit compounds published several years before. The compound discovered experimentally at Lilly required the in vitro screening of a large compound library in a TGF-β-dependent cell-based assay, together with chemical synthesis.

Example 3: antianxiety, antidepressant 5-HT1A agonist developed using several in silico strategies
An in silico modeling drug development program (homology modeling, virtual screening with DOCK, hit-to-lead optimization and in silico profiling) led to clinical trials of a novel, potent, and selective antianxiety, antidepressant 5-HT1A agonist in less than 2 years from the start, requiring less than 6 months of lead optimization and the synthesis of only 31 compounds (O.M. Becker et al., J. Med. Chem. 49, 2006, p3116).

Example 4: ADMETox, drug design, cost and ethics
Applying QSAR algorithms to toxicity data and the corresponding chemical structures has led to the development of in silico tools that predict toxicity response (mutagenicity, carcinogenicity) and toxicity dosing (no observed effect level, NOEL; maximum recommended starting dose, MRSD). For example, a carcinogenicity QSAR model using 53 descriptors and data from 2-year rodent studies stored in an FDA database exhibited 76% sensitivity and 84% specificity (Contrera et al., QSAR modeling of carcinogenic risk using discriminant analysis and topological molecular descriptors, Curr. Drug Discov. Technol. 2, 2005, 55–67; see also Regul. Toxicol. Pharmacol. 40, 2004, 185–206). Rodent carcinogenicity studies are required for the marketing of most chronically administered drugs. They are the most costly and time-consuming nonclinical regulatory testing requirement in the development of a drug: approximately $2 million for a study in rats and mice, requiring 2 years of treatment and at least an additional 1–2 years for histopathological analysis and report writing. Thus, computational or predictive toxicology has potential regulatory and drug development applications that can ultimately benefit public health as well as reduce the use of animals in safety assessment.
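
To make the reported performance figures concrete, the short sketch below shows how sensitivity and specificity are computed from counts of true/false positives and negatives. The counts are invented for illustration (chosen so that they happen to reproduce 76% and 84%); they are not the actual data behind the model cited above.

```python
# Illustration only: how sensitivity and specificity of a toxicity classifier
# are computed from a confusion matrix. The counts below are invented; they
# are NOT the data behind the 76%/84% figures cited above.
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    sensitivity = tp / (tp + fn)   # fraction of true carcinogens flagged
    specificity = tn / (tn + fp)   # fraction of non-carcinogens cleared
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=76, fn=24, tn=84, fp=16)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")  # 0.76, 0.84
```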

Example 5: compound optimization, myocardial infarction and ligand-based screening
In a review, Clark (Expert Opinion on Drug Discovery (2008) 8: 841-851) commented on Aggrastat (tirofiban). This Merck molecule, a GP IIb/IIIa antagonist used in myocardial infarction (an anticoagulant, platelet aggregation inhibitor and protein-protein interaction inhibitor), resulted from a lead compound that was further optimized using ligand-based pharmacophore screening and medicinal chemistry. The compound modulates a protein-protein interaction, between the platelet integrin glycoprotein IIb/IIIa and fibrinogen. It is among the first drugs whose origins can be traced back to in silico design. (See Hartman et al. (1992). "Non-Peptide Fibrinogen Receptor Antagonists. Discovery and Design of Exosite Inhibitors". J Med Chem 35: p4640.)

Example 6: 1,2,4-Oxadiazoles identified by virtual screening and their non-covalent inhibition of the human 20S proteasome
Although several constitutive proteasome inhibitors have been reported in recent years, potent organic, noncovalent and readily available inhibitors are still poorly documented. Two studies were performed by two different groups, one using experimental HTS, the other virtual screening. Ozcan et al. screened 50,000 molecules from ChemBridge, while Maréchal et al. screened 400,000 molecules from the same vendor in silico and tested a selection experimentally; both groups converged on oxadiazole noncovalent proteasome inhibitors. The cellular effects of these compounds validate their utility as potential pharmacological agents for anti-cancer preclinical studies.

Example 7: Relenza, combining X-ray studies with computer modeling and medicinal chemistry
Zanamivir is a neuraminidase inhibitor (a transition-state analogue inhibitor) used in the treatment and prophylaxis of influenza caused by influenza A and B viruses. Zanamivir was the first neuraminidase inhibitor to be developed commercially; the initial steps were performed by a small company together with a university group in Melbourne. It is currently marketed by GlaxoSmithKline under the trade name Relenza as a powder for oral inhalation. The strategy relied on the availability of the X-ray structure of influenza neuraminidase, and computational chemistry techniques were also used: the active site was investigated in silico and suggestions were made to optimize the initial hits, up to the design of zanamivir.

Example 8: A 2012 report from GlaxoSmithKline about the contributions of in silico drug design

  • Direct contributions to the discovery and design of 2 molecules that reached positive proof-of-concept clinical decisions
  • Direct contributions to 8 candidate and pre-candidate decisions
  • 37 contributions resulting in new hit/lead series
  • 18 examples of significant contributions to lead optimization
  • More than 70 examples of screening data analysis resulting in program progression
  • Contributions to drug-discovery programs recognized in 25 published manuscripts and 12 issued or published patents
    See Green DV et al., J Comput Aided Mol Des (2012) 26:51–56.

Example 9: SYK
In collaboration with the group of Dr. P. Dariavach, we identified non-enzymatic inhibitors of the tyrosine kinase SYK, potential anti-allergic drug-like compounds, by combining virtual and in vitro screening.


These compounds inhibit a protein-protein interaction rather than acting on the kinase catalytic site. Their most likely binding area was identified using binding pocket prediction and validated by site-directed mutagenesis; this region was not known prior to our computational analysis. Some compounds are active in animal models and some molecules have been patented. (See Villoutreix et al., PLoS One 2011; Mazuc et al., J Allergy Clin Immunol. 2008.)

Conclusion

In silico methods help the drug discovery process. They can be (and are) combined with biophysical approaches, experimental high-throughput screening, and biology/chemistry/toxicology/clinical studies. They assist decision making, contribute to reducing costs and to the generation of new ideas and concepts, bring solutions to problems, allow new hypotheses to be tested rapidly, and make it possible to explore areas that cannot be assessed experimentally, either because the experiments cannot be performed, would cost too much, or would not be ethical. For instance, they allow new compounds to be investigated before they are even synthesized. In silico tools help to analyze, mine and rationalize millions of (heterogeneous) data points coming from multiple sources, assist in defining the functions of a molecule, help to understand and predict polypharmacology, off-targets and ADMETox properties, provide supplemental information to resolve conflicting experimental results, reduce the need to repeat studies or to perform certain experiments, can accelerate clinical trials by supporting the entry of subjects before standard toxicology studies are completed, support risk-based testing and the reduction of animal testing, and supply additional supporting information for the selection of the first dose in humans in standard phase I clinical trials.
Although promising, in silico methods are not without limitations and thus have to be continuously developed and challenged; research and funding in this field are needed, not only for applications but also for methodological developments. Apart from the screening tools, we now see intense development of data mining and pipelining tools to keep pace with the massive amount of data generated by both experimental and computational studies. Choosing the right strategy is critical, and increasing the interaction between experimentalists and computational groups should increase the quality and efficiency of the lead discovery stage and the development of new and safer drugs.

NB (also important to keep in mind when trying to improve a process):
A possible way to further improve productivity in drug discovery (this applies to many fields) is to enhance decision-making processes along the R&D pipeline. In fact, successful drug development is the culmination of thousands of decisions over a 10-15 year period, involving the combined judgment and experience of many individuals.
Cognitive research has shown that human decision-making processes are inherently flawed or biased. Although this can be a problem, these biases are thought to serve as shortcuts that help the brain process the vast amount of information it receives and the thousands of decisions it makes daily. These observations suggest that drug discovery could be optimized by acting on these biases.

Additional comments about cost and the need to combine and integrate in silico and in vitro strategies

In silico strategies usually contribute to a better understanding of a molecular event; they tend to shed new light, contribute new ideas, and allow the exploration of data that the human mind cannot rationalize. This is very valuable, as mentioned above, but the impact on cost and time is also important. For example, in a talk by P. Ertl, the following points were made: it is possible to predict some ADMETox properties, such as the interaction of a compound with the potassium ion channel protein hERG (blockade of which can lead to a potentially fatal disorder called long QT syndrome). One experimentally measured value of hERG blockade with the patch-clamp technique occupies one laboratory assistant for a day and consumes many research chemicals; if the prediction method is accurate, one can easily conceive the impact on time and money. Further, computer models allow predictions to be made for compounds that have not yet been synthesized (for a research chemist, it typically takes a week to synthesize a compound if everything goes right). But clearly computer tools are not only about cost and number crunching: they allow us to gain new knowledge and can guide experimental design.
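
As a purely illustrative sketch of how such a structure-based ADMETox prediction could be set up, the code below computes a few 2D descriptors from SMILES strings and trains a simple classifier. RDKit and scikit-learn are used here as convenient stand-ins (neither is the tool referred to in the talk mentioned above), and the molecules, descriptors and "blocker" labels are placeholders, not real hERG data.

```python
# Illustrative sketch only: predicting an ADMETox property (e.g., hERG
# blockade) from structure before synthesis. The SMILES and labels below are
# placeholders, not real hERG data; the descriptors are arbitrary choices.
from rdkit import Chem
from rdkit.Chem import Descriptors
from sklearn.ensemble import RandomForestClassifier

def featurize(smiles: str):
    """Compute a few simple 2D descriptors for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    return [Descriptors.MolWt(mol),
            Descriptors.MolLogP(mol),
            Descriptors.TPSA(mol),
            Descriptors.NumRotatableBonds(mol)]

# Placeholder training set: (SMILES, 1 = "blocker", 0 = "non-blocker")
train = [("CCO", 0), ("CCN(CC)CCc1ccccc1", 1),
         ("CC(=O)Oc1ccccc1C(=O)O", 0), ("CN(C)CCCc1ccccc1", 1)]
X = [featurize(s) for s, _ in train]
y = [label for _, label in train]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Score a compound that has not been synthesized yet (hypothetical structure)
candidate = "COc1ccccc1CCN(C)C"
print(model.predict_proba([featurize(candidate)]))
```

Once such a model has been trained and validated on real measured data, scoring a not-yet-synthesized structure takes seconds, which is the point made above about saving patch-clamp time and synthesis effort.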

Note about experimental work

In silico tools are sometimes considered inaccurate by experimental scientists. While this may be true in some cases (such problems are usually alleviated when the tools are used by experts in the field), it is important to note that there is almost no experimental measurement without error. Even for a simple log P value, different scientists working in different laboratories will measure different values. So wisdom and humility are needed on both sides, at the bench and behind the computer.

Some codes for machine learning

Random forests
Random Forests http://www.stat.berkeley.edu/~breiman/RandomForests/
randomForest R package http://cran.r-project.org/web/packages/randomForest/index.html
FastRandomForest https://code.google.com/p/fast-random-forest/

KNN
kNN classifier http://www.fit.vutbr.cz/~bartik/Arcbc/kNN.htm
k Nearest Neighbor demo http://www.cs.cmu.edu/~zhuxj/courseproject/knndemo/KNN.html
GPU-FS-kNN http://sourceforge.net/projects/gpufsknn/
GA/KNN http://www.niehs.nih.gov/research/resources/software/biostatistics/gaknn/
Dense K Nearest Neighbor http://www.autonlab.org/autonweb/10522.html

SVM
mySVM http://www-ai.cs.uni-dortmund.de/SOFTWARE/MYSVM/index.html
e1071 R package http://cran.r-project.org/web/packages/e1071/index.html
BSVM http://www.csie.ntu.edu.tw/~cjlin/bsvm/
LS-SVMlab http://www.esat.kuleuven.be/sista/lssvmlab/
LIBSVM http://www.csie.ntu.edu.tw/~cjlin/libsvm/
SVM light http://svmlight.joachims.org/
M-SVM http://www.loria.fr/~guermeur/

Neural network
NuClass http://www.uta.edu/faculty/manry/new_software.html
sciengyrpf http://sourceforge.net/projects/sciengyrpf/
Sharky Neural Network http://sharktime.com/us_SharkyNeuralNetwork.html
BrainMaker http://www.calsci.com/
fann http://leenissen.dk/fann/

Decision tree
Simple Decision Tree https://sites.google.com/site/simpledecisiontree/
OC1 http://www.cbcb.umd.edu/~salzberg/announce-oc1.html
SMILES http://users.dsic.upv.es/~flip/smiles/
PC4.5 http://www.cs.nyu.edu/~binli/pc4.5/
YaDT http://www.di.unipi.it/~ruggieri/software.html
C4.5 and C5.0 http://www.rulequest.com/Personal/
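
The packages listed above are written in R, Java, C/C++ and other languages. As a compact, language-neutral illustration of how these algorithm families are typically applied, the sketch below compares them on a synthetic dataset with 5-fold cross-validation; scikit-learn is used here only as a stand-in (it is not among the tools listed) and the data are random, not chemical.

```python
# Minimal sketch comparing the algorithm families listed above (random forest,
# k-NN, SVM, neural network, decision tree) on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))              # 200 "compounds" x 10 descriptors
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # synthetic activity label

models = {
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf", C=1.0),
    "neural network": MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                                    random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print(f"{name:15s} accuracy = {scores.mean():.2f} +/- {scores.std():.2f}")
```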

Further reading: some recent reviews in the field of in silico screening

  • Virtual screening - what does it give us? Koppen (Boehringer, Germany). Current Opinion Drug Discovery & Dev (2009) 12: 397‐407
  • From virtuality to reality. Rester (Bayer, Germany). Current Opinion Drug Discovery & Dev (2008) 11: 559‐568
  • In silico tools used for compound selection during target-based drug discovery and development. Caldwell (Janssen Research, USA). Expert Opin Drug Discov. 2015 May 8:1-23
  • What has virtual screening ever done for drug discovery? Clark (Argenta Discovery Ltd, UK). Expert Opinion on Drug Discovery (2008) 8: 841‐851
  • Docking and chemoinformatic screens for new ligands and targets. Kolb et al., Current Opin Biotech (2009) 20:1‐8
  • High‐throughput and in silico screenings in drug discovery. Phatak et al., Expert Opin Drug Discov (2009) 4: 947‐959
  • Structure‐based virtual ligand screening: recent success stories. Villoutreix et al., Comb Chem High Throughput Screen (2009) 12:1000‐16
  • Successful Applications of Computer Aided Drug Discovery: Moving Drugs from Concept to the Clinic. Talele et al, Curr Topics in Med Chemistry (2010) 10:127‐141
  • Computer‐aided drug discovery and development (CADDD): In silico‐chemico‐biological approach. Kapetanovic. ChemicoBiological Interactions 171 (2008) 165–176
  • Impact of high‐throughput screening in biomedical research. Macarron et al. Nat Rev Drug Discov. (2011)10:188‐95
  • Streamlining lead discovery by aligning in silico and high‐throughput screening. Davies et al. Curr Opin Chem Biol. (2006) 10:343‐51
  • Docking‐based virtual screening: recent developments. Tuccinardi. Comb Chem High Throughput Screen. (2009), 12:303‐14
  • Established and emerging trends in computational drug discovery in the structural genomics era. Taboureau et al., Chem Biol. (2012) 19:29‐41
  • Toward in silico structure‐based ADMET prediction in drug discovery. Moroy et al., Drug Discov Today. (2012) 17:44‐55
  • Druggable pockets and binding site centric chemical space: a paradigm shift in drug discovery. Perot et al., Drug Discov Today. (2010) 15:656‐67
  • Rationalizing the chemical space of protein‐protein interaction inhibitors. Sperandio et al. Drug Discov Today. (2010) 15:220‐9
  • Computational Drug Design Targeting Protein‐Protein Interactions. Bienstock, Curr Pharm Des. (2012) in press
  • 1,2,4-Oxadiazoles identified by virtual screening and their non-covalent inhibition of the human 20S proteasome. Maréchal X, Genin E, Qin L, Sperandio O, Montes M, Basse N, Richy N, Miteva MA, Reboud-Ravaux M, Vidal J, Villoutreix BO. Curr Med Chem. 2013;20(18):2351-62
  • Oxadiazole-isopropylamides as potent and noncovalent proteasome inhibitors. Ozcan S, Kazi A, Marsilio F, Fang B, Guida WC, Koomen J, Lawrence HR, Sebti SM. J Med Chem. 2013;56:3783-805
