In silico Drug Design: some concepts & tools
Drug discovery, chemical biology & precision medicine
Many in silico tools are available to carry out, for instance, ADMET predictions, binding pocket analysis, prediction of protein-protein interaction (PPI) binding sites, PPI network analysis, drug repositioning, prediction of the 3D structure of a macromolecule, grafting of a sugar onto a protein structure, peptide docking, protein docking, systems pharmacology, adverse drug reaction prediction, compound collection annotation, virtual screening, and analysis of point mutations observed in patients (the term "variation" is nowadays used more often than "mutation").
If you are new to the field, here are some general ideas. The first step is to define the in silico tools you need, which depends directly on the questions you want to address, the type and stage of the project, and the data you have to start with. In some cases, in silico approaches cannot really help initially: some experiments have to be performed first, and only then do in silico prediction engines become very valuable. In other situations, in vitro and in silico experiments need to start at the same time, and in yet others the in silico study can come first. This is the notion of parallel integration of the approaches, iterative integration (experimental, then in silico, then experimental again) and focused integration, which starts in silico to filter out unwanted molecules.
There are also some key differences between chemical biology projects and drug discovery projects, and these have to be considered. Chemical probes can be critical for drug discovery projects; they are challenging to produce, but at the same time easier to develop than drug candidates with regard to many ADMET properties. It also depends on how far you want to go, which in most cases means how much funding you have, obviously a key limiting factor.
About 90% of projects entering clinical trials fail
If your project is about target-based screening, you may need the 3D structure of your target. You can check the PDB for experimental 3D structures (in general X-ray or NMR). If none is available, you can try to predict the 3D structure, for instance via comparative model building (over 35 million structures can be built this way, although in 2019 UniProt held about 200 million sequences). You can find many valuable standalone tools and online servers in the Shortlist page and in the sections Modeling Molecules, Simulations, etc. These belong to the overall field of structural bioinformatics.
Next, assuming you have the 3D structure of your selected protein, you may need a peptide, a small non-peptidic chemical compound or an antibody that binds to your target. If you know the binding pocket, you can use structure-based virtual screening approaches (Chemoinformatics section) or peptide-protein docking (Bioinformatics section). In general, you will need to prepare a compound collection if you are searching for a molecule that modulates your system (a small molecule or a biologic such as a peptide). Once you have prepared this collection or found it online, in silico screening or related approaches should let you define a short list of molecules (maybe 20 peptides or 200-500 small chemical compounds) that will need to be tested experimentally. It is important to think early about the assays: how mechanisms are going to be investigated and whether you can get direct binding data.
If you do not know the binding pocket, you can use tools that predict binding cavities and the so-called druggable pockets (Bioinformatics section). If your therapeutic target is flexible, you will need simulation tools such as molecular dynamics, among many others. For all these tasks, in silico tools have been developed over the last 10 to 20 years or so.
If your therapeutic molecules are RNA, DNA or monoclonal antibodies (mAbs), you can also find tools that help, for instance to stabilize a protein or to graft a small compound onto a mAb or a peptide.
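As a rough illustration of the shortlisting step described above (reducing a screened collection to a small list for experimental testing), here is a minimal Python sketch. It assumes that each compound already has a docking-style score where lower means better; the compound names, the scores and the function name select_top_hits are hypothetical and do not come from any particular screening package.

```python
from typing import Dict, List, Tuple


def select_top_hits(scores: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """Return the n best-scoring compounds.

    Assumption: lower score = better predicted binding, as is common for
    docking energies. Reverse the sort for scores where higher is better.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return ranked[:n]


# Hypothetical identifiers and scores, for illustration only.
example_scores = {"cmpd_A": -7.2, "cmpd_B": -9.1, "cmpd_C": -5.4, "cmpd_D": -8.3}
shortlist = select_top_hits(example_scores, 2)
```

In a real project the cut-off would rarely rest on the score alone: visual inspection, chemical diversity and compound availability usually weigh in before molecules are ordered and tested.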
If you know a small molecule that binds to your target, you can search databases for other molecules that are similar to your query, then test these new molecules in vitro and build some structure-activity relationships (SAR); see for instance the ligand-based virtual screening tools (Chemoinformatics section).
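One common way to quantify "similar to your query" is the Tanimoto coefficient computed on molecular fingerprints. The sketch below is only an illustration under stated assumptions: fingerprints are represented as plain sets of "on" bit indices, the molecule names and bit values are made up, and the 0.5 threshold is arbitrary. In practice a chemoinformatics toolkit would generate the fingerprints.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared bits / bits present in either fingerprint."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: Set[int], library: Dict[str, Set[int]], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return library molecules at or above the threshold, most similar first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints (sets of bit indices), for illustration only.
query_fp = {1, 4, 7, 9}
library_fps = {
    "mol_1": {1, 4, 7, 9, 12},
    "mol_2": {1, 4, 20, 21},
    "mol_3": {1, 4, 7},
}
hits = similarity_search(query_fp, library_fps)
```

Here mol_1 and mol_3 are retained and mol_2 is discarded; the retained molecules would be the ones to test in vitro when building the SAR.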
If you are searching for a hit compound that could be used as a starting point for drug discovery, you will need to predict some ADME-Tox properties (this can also be very valuable for chemical biology projects). You should check whether the molecules contain structural alerts, or are PAINS (pan-assay interference compounds) or promiscuous compounds. Depending on the outcome of this analysis, you may have to perform additional experiments to double-check your initial results. Tools of interest at this stage can be found in the QSAR section, the virtual screening sections and, obviously, the ADME-Tox section (Chemoinformatics section).
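As a very simple example of a property filter, the sketch below applies Lipinski's rule of five, a widely used oral drug-likeness heuristic (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors). This is a swapped-in illustration, not one of the ADME-Tox or PAINS tools mentioned above. It assumes the descriptors have already been computed elsewhere; the class name, the example values and the tolerance of one violation are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one compound (assumed to come from another tool)."""
    mol_weight: float        # in daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four Lipinski criteria the compound breaks."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Accept compounds with at most max_violations broken criteria (an assumed tolerance)."""
    return rule_of_five_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only.
small_compound = Descriptors(mol_weight=350.4, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large_compound = Descriptors(mol_weight=812.0, logp=6.3, h_bond_donors=6, h_bond_acceptors=12)
```

Such a filter only flags compounds for a closer look. It says nothing about toxicity or assay interference, which need the dedicated tools listed in the ADME-Tox section.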
In most cases, you will have to look at databases to see whether your target has already been screened or whether your favorite compounds are already known to hit many targets. Open databases include, for instance, PubChem and ChEMBL; these are listed in the Chemoinformatics section.
You may want to know whether your compound could bind to other, secondary targets, often called off-targets; when the effect on health is unfavorable, these secondary targets are called anti-targets. To do this, you can use tools from the off-targets, repurposing and repositioning section, where different approaches are available, from ligand-similarity searches to reverse docking (Chemoinformatics section). If you use phenotypic screening, several of these methods can also help to identify the target(s) involved.
As mentioned, developing a new drug is likely to take many years and success is far from certain. It has been estimated that it takes 13.5 years to bring a new molecular entity to market, and that the success rate for taking oncology drugs from phase I to approval by the US Food and Drug Administration (FDA) is only around 7%. These numbers vary somewhat between reports, but they give an overall idea. Repositioning can thus be valuable in some cases: it builds on previous research, allowing compounds to progress more quickly and saving a substantial amount of money when it works. In silico strategies can help here. One concept that helps to understand repositioning is polypharmacology: a small-molecule drug is likely to have, on average, six to seven targets. Thus one may reposition a drug onto another target.
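To make the 7% figure more concrete, the short Python sketch below converts a success rate into the expected number of candidates that must enter phase I for one approval. It is a back-of-the-envelope calculation that assumes every candidate succeeds independently with the same probability, which is a simplification; the function name is hypothetical.

```python
def candidates_per_approval(success_rate: float) -> float:
    """Expected number of phase I entrants needed per approved drug.

    Assumes independent candidates with an identical success probability,
    so the expectation is simply 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


# Success rate quoted in the text for oncology drugs (phase I to FDA approval).
ONCOLOGY_PHASE1_TO_APPROVAL = 0.07
needed = candidates_per_approval(ONCOLOGY_PHASE1_TO_APPROVAL)  # roughly 14
```

Under these assumptions, roughly 14 oncology candidates have to enter phase I for each one that is eventually approved, which is why approaches that build on existing clinical data, such as repositioning, can be attractive.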
If you are interested in protein-protein interactions (Bioinformatics section) and in modulating these interactions with a small compound, you may need to use protein docking methods. You may want to see all the known interactions involving your target and will thus need some "network" tools. If you have a 3D structure of your protein-protein complex, you may want to analyze the interface and predict hotspot residues. We have some recent reviews about "in silico approaches and compound design", for instance about protein-protein interaction inhibitors (see Villoutreix et al., Molecular Informatics, June 2014).
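The idea behind "network" tools can be sketched in a few lines of Python: interactions are stored as an undirected graph, from which the known partners of a target and the highly connected proteins (hubs) can be read off. The protein names and the interaction list below are hypothetical; real data would come from an interaction database.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_network(interactions: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected interaction network as an adjacency mapping."""
    network: Dict[str, Set[str]] = defaultdict(set)
    for protein_a, protein_b in interactions:
        network[protein_a].add(protein_b)
        network[protein_b].add(protein_a)
    return dict(network)


def partners(network: Dict[str, Set[str]], protein: str) -> List[str]:
    """Return the known interaction partners of a protein, sorted by name."""
    return sorted(network.get(protein, set()))


def hubs(network: Dict[str, Set[str]], min_partners: int) -> List[str]:
    """Return proteins with at least min_partners interaction partners."""
    return sorted(p for p, nbrs in network.items() if len(nbrs) >= min_partners)


# Hypothetical pairwise interactions, for illustration only.
example_interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
ppi_network = build_network(example_interactions)
```

With this toy data, partners(ppi_network, "P1") lists P2, P3 and P4, and P1 is the only protein with three or more partners.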
If your protein has point mutations (experimental or naturally occurring, which relates to the idea of precision medicine), you may want to predict the impact of the amino acid substitutions on folding, function, etc. Here again you need a different set of tools, which you can find in the sections Simulations and Mutations (Bioinformatics section).
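Before any impact prediction, the substitution has to be mapped onto the protein sequence. The Python sketch below parses a substitution written in the common one-letter form (wild-type residue, 1-based position, variant residue, for example "T3S") and applies it after checking that the wild-type residue matches. The sequence is made up, the function name is hypothetical, and only this simple notation is handled.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter variant residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, mutation: str) -> str:
    """Return the sequence carrying the given single amino acid substitution."""
    match = _SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognised substitution: {mutation!r}")
    wild_type, position, variant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + variant + sequence[position:]


# Hypothetical short sequence, for illustration only.
mutant = apply_substitution("MKTAYIAK", "T3S")  # gives "MKSAYIAK"
```

The wild-type check is worth keeping: numbering mismatches between a patient report, a UniProt entry and a PDB file are a frequent source of errors when mutations are mapped onto structures.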
You may also need to search patent databases, find databases on diseases, find tools to help represent and visualize the data, or find some commercial tools. These are in the related tools section.