In silico Drug Design: some concepts & tools - Structural Bioinformatics: molecular modeling & mutations...


Bioinformatics: molecular modeling, mutations...

There are obviously many tools in these sections because many research groups have been working in this field since at least the 1990s. I only give some names below, essentially for proteins, but there are also packages that try to predict the structure of RNA, DNA, or sugar molecules. When you have a 3D structure, and if its quality is acceptable, you can try to find binding pockets, run protein-protein docking engines, look at mutations (notion of biostructural pathology), run simulations such as electrostatics or molecular dynamics, dock a peptide, perform virtual screening...

Databases of models: MODBASE, Protein Model Portal, SWISS-MODEL Repository...

Tools to complete missing loops in a PDB file: one module of PDB_hydro, ...
Template selection: Phyre2, HHpred, PSIPRED (pGenThreader...)...
Alignment tools for sequences: CLUSTALW, MUSCLE, T-Coffee...
Homology modeling online: RaptorX, 3DJIGSAW, CPHModel, ESyPred3D, GeneSilico, Geno3D, HHpred, LOMETS (meta-server combining 9 different programs), MODELLER (ModWeb: A Server for Protein Structure Modeling), Phyre and Phyre2, Protinfo, ROBETTA, BHAGEERATH-H, SWISS-MODEL, TIP-STRUCTFAST, WHAT IF, FALCON@home...

Threading online: RaptorX, 3D-PSSM (now Phyre2), HHpred, I-TASSER, LOOPP, mGenTHREADER/GenTHREADER, MUSTER, Phyre and Phyre2, SPARKSx/SP series...

Ab initio structure prediction: EVfold, QUARK, I-TASSER, ROBETTA, Bhageerath, PEP-FOLD...

Secondary structure prediction: RaptorX-SS8, NetSurfP, Jpred, Meta-PP, PREDATOR, PredictProtein, PSIPRED, SymPred, YASSPP, PSSpred...

Transmembrane helix prediction: TMHMM, Phobius, PHDhtm, MEMSAT, HMMTOP...

Model evaluation: DFIRE, COLORADO-3D (ANOLEA, PROSA, PROVE, VERIFY3D), FRST, HARMONY, ModFOLD, MolProbity, PROCHECK, ProQ, QMEAN, SAVES... Optimization of a 3D model structure: ModRefiner...

Macromolecule simulation with NMA: iMODS, DFprot, NOMAD-ref, MolMovDB, HingeProt, PATH-ENM, iENM, NMSim, KOSMOS, FlexServ, ElNemo, AD-ENM...

Mutations (or variations) - Personalized medicine - Precision medicine
Based on genome sequencing of individuals, it is estimated that each person’s proteome contains roughly 10 000–11 000 mutations compared to a reference proteome. A subset of these mutations has severe functional consequences; for the great majority, however, it is difficult to predict a priori what their effect will be on the structure and/or function of the resulting protein.
There are different types of mutations. Single nucleotide polymorphisms (SNPs) fall within either non-coding or coding regions of the DNA molecules. Synonymous mutations do not change the encoded protein sequence, while non-synonymous SNPs (the most common disease-promoting mutations) produce polypeptide sequences that either carry an amino acid substitution (missense mutations) or are truncated (nonsense mutations, a less common event than an amino acid change). Some mutations exert their effects via changes to the mRNA that can lead to altered mRNA splicing, folding or stability. Some mutations affect how patients respond to certain drug treatments (concept of pharmacogenomics).
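To make the distinction between synonymous, missense and nonsense changes concrete, here is a minimal Python sketch that classifies a single-nucleotide change within one codon using the standard genetic code. The function name `classify_coding_snp` and the extra "other" label (used when the reference codon is itself a stop codon, a case not discussed above) are illustrative choices, not part of any tool listed on this page.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a one-nucleotide change between two DNA codons.

    Hypothetical helper for illustration: returns "synonymous",
    "missense", "nonsense", or "other" (reference codon is a stop).
    """
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three letters from A, C, G, T")
    # A SNP changes exactly one position of the codon.
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("expected exactly one nucleotide difference")

    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # encoded residue unchanged
    if alt_aa == "*":
        return "nonsense"     # premature stop, truncated protein
    if ref_aa == "*":
        return "other"        # stop codon lost; not covered in the text
    return "missense"         # amino acid substitution


print(classify_coding_snp("CGA", "CGG"))  # synonymous (Arg -> Arg)
print(classify_coding_snp("CGA", "CAA"))  # missense   (Arg -> Gln)
print(classify_coding_snp("CGA", "TGA"))  # nonsense   (Arg -> stop)
```

This only looks at the protein-coding consequence of the change; effects on mRNA splicing, folding or stability, mentioned above, are not captured by a codon lookup.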

With the decreasing cost of sequencing technologies, the study of the human genome on a large scale is now possible. Rapid advances in this field of research foreshadow the use of whole-genome or whole-exome sequencing towards the goal of personalized medicine, also called precision medicine. This can be defined as therapy decisions tailored to individual patients (or small groups), so as to improve therapeutic efficacy and minimize side effects.
It is important to note that synonymous mutations can also cause human diseases and thus cannot be ignored in genome-wide association studies. The key to how a synonymous mutation can affect proteins most likely lies in RNA molecules. Also, single nucleotide polymorphisms represent the most common source of genetic variation in the human population; they often determine which patients are most likely to respond to, or suffer adverse consequences from, specific medical treatments. In addition to genomic studies, epigenetic investigations are also of major importance.

Here, in silico approaches can help to understand the relationships between sequence, structure, function and the disease state. For instance, just to introduce this huge field of research, in silico approaches can help to discriminate between deleterious nsSNPs leading to a protein disorder and neutral polymorphisms. There are numerous tools. One way to cluster them is to distinguish methods that make use of machine learning from methods that are rule-based and/or attempt to compute a delta-delta G using the 3D structure of the protein. Then there are meta-tools that combine approaches. As always, some methods could fit in several groups. A possible way to list the methods that attempt to predict deleterious variants is suggested here:
Approaches using machine learning of some kind:
CUPSAT, I-Mutant2.0, LS-SNP (Large-scale annotation of coding nsSNPs), Parepro, PhD-SNP, PON-P2, SNPs&Go, SNPs3D, MutPred, nsSNPAnalyzer, PMUT, SNAP, Polymorphism Phenotyping (PolyPhen-2), MutationTaster, AUTO-MUTE....

Approaches that can be considered as rule-based:
Sorting Intolerant From Tolerant (SIFT), Align-GVGD, D-Mutant, DS-SCORE, FASTSNP, FoldX, GERP++, Gumby, LogR E-value, MAPP, Mutation-Assessor, PANTHER, PhastCONS, PolyPhen-1, PopMuSic, SCONE, Skippy, SNPeffect...

Meta-tools:
F-SNP, pfSNP, SNP functional portal, SNPit, Vista, CONDEL, META-SNP, PolyDOMS, Pro-Maya... (PON-P is no longer running; it has been replaced by PON-P2, which, however, is not a meta-predictor.)
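The delta-delta G computed by the structure-based methods can be written out explicitly. A common definition is sketched here under one particular sign convention; conventions differ between programs, so the documentation of each tool should be checked before comparing values:

```latex
\Delta G_{\mathrm{fold}} = G_{\mathrm{folded}} - G_{\mathrm{unfolded}}
\qquad
\Delta\Delta G = \Delta G_{\mathrm{fold}}^{\mathrm{mutant}} - \Delta G_{\mathrm{fold}}^{\mathrm{wild\ type}}
```

With this convention, a positive value means that the substitution is predicted to destabilize the folded protein, and a negative value means that it is predicted to stabilize it.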

Some databases (they do not all list the same type of data and there are overlaps, so they are not easy to compare):
dbSNP, HGMD (Human Gene Mutation Database), OMIM, ClinVar, UniProt/Swiss-Prot, 1000 Genomes... Some are specific to cancer: COSMIC, TCGA... Some others: NewHumanVar, ProNit, VariBench, MutDB, SNPedia, StSNP, ProTherm, PicSNP, HOPE...

In addition to these tools, if the protein is known in 3D, you can also run your own simulations and perform structural analysis. Just check the Simulation section to see if you find a tool that can help you.

See for instance the table below, published in Current Protein and Peptide Science, 2002, 3, 341-364: "Structural Bioinformatics: Methods, Concepts and Applications to Blood Coagulation Proteins" by Villoutreix BO, section "Biostructural Pathology and Conformational Diseases: Some Rules for Assigning the Effects of Missense Mutations on Molecular Functions, Folding and Stability".

It should also be mentioned that, in general, "mutation" data have not been fully explored to improve the effectiveness and efficiency of drug discovery. Genetic, epigenetic and environmental factors define pathophysiological states. For complex diseases, the "one gene, one drug" paradigm may not be the best approach.

Additional notes: Mutation versus variation
There are several recommendations to systematically use the term variation for the products of the mutation process; see for instance the Human Genome Variation Society (HGVS) nomenclature.
The VariOtator tool (http://variationontology.org/VariOtator.php) provides VariO variation-type annotations automatically from the sequence and variation details.
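As an illustration of what a systematic nomenclature makes possible, the sketch below parses a simple protein-level substitution written in the HGVS style with three-letter amino acid codes (for example `p.Arg506Gln`). It is a deliberately simplified, hypothetical helper: it only handles plain substitutions and stop-gain changes, the example strings are illustrative and not tied to a specific gene, and it is in no way a full implementation of the HGVS recommendations.

```python
import re

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# "p." + reference residue + position + new residue ("Ter" or "*" = stop).
_SUBSTITUTION = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*)$")


def parse_protein_substitution(description):
    """Return (ref, position, alt) in one-letter codes.

    Hypothetical helper: accepts only simple descriptions such as
    "p.Arg506Gln", "p.Trp26Ter" or "p.Trp26*".
    """
    match = _SUBSTITUTION.match(description.strip())
    if match is None:
        raise ValueError(f"unsupported description: {description!r}")
    ref3, position, alt3 = match.groups()
    if ref3 not in THREE_TO_ONE:
        raise ValueError(f"unknown reference residue: {ref3!r}")
    if alt3 in ("Ter", "*"):
        alt1 = "*"  # stop codon
    elif alt3 in THREE_TO_ONE:
        alt1 = THREE_TO_ONE[alt3]
    else:
        raise ValueError(f"unknown new residue: {alt3!r}")
    return THREE_TO_ONE[ref3], int(position), alt1


print(parse_protein_substitution("p.Arg506Gln"))  # ('R', 506, 'Q')
print(parse_protein_substitution("p.Trp26Ter"))   # ('W', 26, '*')
```

A legacy shorthand such as `R506Q` is rejected on purpose, which shows why agreeing on one notation matters when variants are exchanged between databases and prediction tools.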

See Resources for a unified genetic nomenclature (by Dr Vihinen, Trends in Genetics, 2015)
Gene Ontology http://geneontology.org/
HGNC http://www.genenames.org/
HGVS http://www.hgvs.org/
HVP http://www.humanvariomeproject.org/
Global Alliance for Genomics and Health (GA4GH) http://ga4gh.org/
LRG (Locus Reference Genomic sequences) http://www.lrg-sequence.org/
VariO http://variationontology.org/
Sequence Ontology (SO) http://www.sequenceontology.org/
VarioML data exchange format http://www.varioml.org/
The Phenotype and Genotype Object Model (PAGE-OM) http://www.omg.org/spec/PAGE-OM/

Peptides (binding sites and/or folding and/or docking)
Peptides can be used as chemical probes; they are usually easy to synthesize and are thus very often used in biology labs. In fact, for many years they were used in priority because small chemical compounds were difficult to obtain in academic labs. Times are changing, and yet, in academia, it is almost a "cultural thing": I have a target, I need to act on it, I try a peptide and I patent it if it does something, without really looking further and asking questions such as: for this protein or for this disease, do I need a peptide, a therapeutic protein, a mAb, a small compound...?

There are many debates about peptides as drugs, with pros and cons and many claims: peptides have greater chances in clinical trials, PK/PD is not an issue, cost is not an issue, etc. In my opinion, it is important to really read several reviews listing strengths and weaknesses, and to decide for yourself whether a peptide is appropriate for your target and for the type of disease.

 



© Bruno Villoutreix. A first version of this Website was launched in 2006. Thanks to Natacha Oliveira