Exploration of Novel Sulpho Tyrosine Based Unnatural Amino Acid Ligand for Inhibition of Human Shp2; A Computational Approach

SHP2 is a member of the protein tyrosine phosphatase (PTP) family. PTPs catalyze the dephosphorylation of phosphotyrosine residues in protein substrates and play critical roles in regulating the intracellular signal transduction that controls cell growth, differentiation, motility, and metabolism. SHP2 is a non-receptor PTP containing two N-terminal Src homology 2 (SH2) domains, a PTP domain, and a C-terminal tail. In its basal state SHP2 adopts an autoinhibited conformation in which the N-terminal SH2 domain interacts with the PTP domain and blocks access to the catalytic site. Phosphorylated proteins bind to the SH2 domains of SHP2 and activate dephosphorylation, which down-regulates receptor tyrosine kinase (RTK) dependent signaling and leads to the activation of oncogenes. Modulation of SHP2 enzymatic activity may therefore constitute a therapeutic approach for the treatment of cancer. In the present work we designed four sulfotyrosine-based unnatural amino acid libraries through in silico modeling to demonstrate the utility of a phenyl sulfoacetic acid (PSAA) based cap group (a novel sulfotyrosine mimic) combined with a novel N-heterocyclic unnatural amino acid spacer in Library-1, an N-dioxothiazolidine spacer in Library-2, an N-pyridazine spacer in Library-3, and an N-imidazole spacer in Library-4, for the development of novel anticancer SHP2 inhibitors. Five ligands (Ligand-1a, 1b, 2a, 4a, and 4b) showed the most promising predicted SHP2 inhibitory activity when compared with the standard ligand SHP099.


Introduction
Most post-translational modification (PTM) events in biological processes are catalyzed by enzymes; PTMs are regulatory processes in which a protein is chemically modified after its translation. One such PTM, phosphorylation/dephosphorylation, is a key reversible modification that regulates protein activity, localization, degradation, and complex formation. Reversible protein phosphorylation and dephosphorylation reactions are catalyzed by protein kinases (PKs) and protein phosphatases, respectively [1]. Proper levels of these enzymatic processes play critical roles in regulating the intracellular signal transduction pathways responsible for controlling cell growth, differentiation, motility, and metabolism. SHP2, a phosphatase encoded by the PTPN11 gene, is a non-receptor PTP containing two N-terminal Src homology 2 (SH2) domains, a PTP domain, and a C-terminal tail. In its basal state SHP2 adopts an autoinhibited conformation in which the N-terminal SH2 domain interacts with the PTP domain and blocks access to the catalytic site. Others have previously demonstrated that bis-phosphotyrosyl peptides (e.g., IRS-1) or phosphorylated proteins bind to the SH2 domains of SHP2 and activate the phosphatase, which imparts cancer dependence [2]. Modulation of SHP2 signaling, protein-protein interactions, and enzymatic activity may therefore constitute a therapeutic approach for the treatment of cancer, diabetes, and certain immunologic disorders.
The positively charged environment of the SHP2 PTP catalytic pocket presents unique drug discovery challenges. Most catalytic-site inhibitors require multiple ionizable functional groups in order to inhibit the enzyme, and these polar and ionic groups in turn complicate drug discovery and development because they lower cell permeability and oral bioavailability. SHP2 PTP catalytic-site inhibitors also often lack robust selectivity over other PTPs (e.g., SHP1, PTP1B) [3].
Despite these challenges, in the present work we designed four focused libraries through in silico modeling to demonstrate the utility of a phenyl sulfoacetic acid (PSAA) based cap group (a novel sulfotyrosine mimic) combined with a novel N-heterocyclic unnatural amino acid spacer in Library-1, a dioxothiazolidine scaffold-based pharmacophore linker in Library-2, a pyridazine-based linker in Library-3, and an imidazole-based linker in Library-4, for the development of novel anticancer SHP2 inhibitors. As a proof of concept, we aim to achieve cell permeability, minimize toxicity, reduce the dosage, and eradicate cancer cells while leaving normal cells unharmed.

Selection of Target Protein Sequences
Sequences of the SHP2 domains involved in PTP signaling pathways were obtained from the UniProtKB/Swiss-Prot protein knowledgebase (ID: Q06124). The availability of protein sequences has made a telling difference in countless studies of biologically important molecules. SHP2 (567 aa) regulates the dephosphorylation of RTK signaling pathways, which leads to down-regulation of RTK-dependent signaling proteins, protein families, and transcription factors and results in the activation of various oncogenes. The vast quantity of domain data associated with these proteins poses enormous challenges for sequence/structure/function annotation. In addition, structure-based programmatic initiatives for taxonomic classification at the molecular level, together with pattern-recognition approaches, help in the study of disease targets.

Templates Identification
The search for template sequences was performed with the PSI-BLAST program against the PDB database. The protein structure was modeled on the template sequences of the SHP2 domains. Conserved protein sequence regions are extremely useful for identifying and studying functionally and structurally important regions. Sequence conservation among homologous sequences is rarely homogeneous along their length; as sequences diverge, their conservation becomes localized to specific regions. To obtain the sequence similarities of conserved-region proteins, it is necessary to decide the scale at which the proteins are clustered.

Homology Modeling
We built the three-dimensional protein structures of the targets and templates using Swiss-PdbViewer. The target protein structure of SHP2 and the template protein sequences include a C-SH domain region, an N-SH region, and a PTP domain [3]. The 3D structure of the template shows significant amino acid sequence similarity to the target sequence. Building a homology model comprises alignment of the target and template structures, model building, and model quality evaluation. The stereochemical quality of the modeled structure was assessed by analyzing residue-by-residue geometry and overall structure geometry, in order to estimate the overall accuracy of the different modeling servers with regard to the sequence identity between target and template.

Structure Validation and Analysis
The predicted 3D models were verified with SAVS, and the stereochemical quality of the models was validated with WHATIF, ERRAT, PROCHECK, PDBsum, Ramachandran Plot 2.0, and Verify3D. A Ramachandran plot was generated for each computationally predicted 3D protein structure to check for steric hindrance among the protein residues, and outlier errors were corrected accordingly using WinCoot. Most of the structural misalignment errors in rotamers occurred because of side-chain packing and folding. The values of RMSD (root mean square deviation), SDM (structural deviation measure), B-factor, and Q (quality) factor were used to further verify the quality of the resulting SHP2 protein structures using SAVS.

Active Site Prediction
Active sites and ligand binding sites in the protein were predicted using CASTp. The interaction energy between the protein and a simple van der Waals probe is calculated to locate energetically favorable binding sites. Energetically favorable probe sites are clustered according to their spatial proximity, and the clusters are then ranked according to the sum of the interaction energies of the sites within each cluster. The probable ligand binding pockets of the SHP2 protein were assessed for geometric accuracy using the RMSD and superimposition of the target onto its native structure.
RMSD depends on the number of equivalent atom pairs of both proteins (target and template) that are compared, which in turn depends upon the maximum allowed distance between atom pairs.
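The dependence of RMSD on the set of equivalent atom pairs can be illustrated with a minimal sketch. It assumes the two structures are already superimposed and that atoms are paired by index; the function names and the example coordinates are hypothetical and are not taken from the tools used in this study.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def rmsd_within_cutoff(coords_a, coords_b, max_dist):
    """RMSD over the atom pairs that lie within max_dist of each other.

    Returns (rmsd, number_of_equivalent_pairs). Raising max_dist admits
    more pairs and therefore changes the reported RMSD.
    """
    pairs = [(a, b) for a, b in zip(coords_a, coords_b)
             if math.dist(a, b) <= max_dist]
    if not pairs:
        raise ValueError("no atom pairs within the distance cutoff")
    kept_a, kept_b = zip(*pairs)
    return rmsd(list(kept_a), list(kept_b)), len(pairs)

# Hypothetical, already superimposed coordinates (in Angstrom).
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 3.0)]
```

With these toy coordinates, the RMSD over all three pairs is larger than the RMSD over the two pairs that survive a 1.0 Å cutoff, which is why an RMSD should be reported together with the number of pairs compared.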

Ligand Preparation
The entry point for any chemistry program within drug discovery research is generally the identification of specifically acting low-molecular-weight modulators with adequate activity in a suitable target assay. The four sulfotyrosine-based unnatural amino acid ligand libraries, containing 50 molecules in total, were drawn in ACD/ChemSketch. The physicochemical and ADMET properties of all ligands were obtained from Molinspiration and Chemicalize.

In Silico-High Throughput Screening
Four novel sulfotyrosine-based unnatural amino acid ligand libraries [4], containing 50 molecules in total, were designed on the basis of drug-like properties and the ability of the ligands to interact with and inhibit the polar catalytic site of the SHP2 protein. The ligand structures were designed using ACD/ChemSketch and saved in .MOL file format. Lead compounds or scaffolds can be identified from a diversified compound pool by accelerated screening, and the screened pool is then focused on the biological target relevant to the disease. We used structure-based screening through molecular docking, using binding and activation, to further probe the parent library. The structures of the lead fragments, i.e., "the testing ligands", were designed on the basis of docking studies of the sulfotyrosine-based unnatural amino acid ligand library with the SHP2 protein. The fragments were identified on the basis of Lipinski's Rule of Five and may therefore represent suitable starting points for the evolution of good-quality lead compounds.
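The Rule of Five screen mentioned above can be sketched as follows. This is a minimal illustration that assumes the descriptors have already been computed by an external tool; the class name, the example entries, and their values are hypothetical and do not correspond to ligands in the libraries. Allowing at most one violation is a common convention and is exposed here as a parameter.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    name: str
    mol_weight: float   # molecular weight in Da
    log_p: float        # octanol-water partition coefficient (log10)
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def lipinski_violations(d):
    """Count how many of the four Rule of Five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)

def passes_rule_of_five(d, max_violations=1):
    """True if the molecule has no more than max_violations violations."""
    return lipinski_violations(d) <= max_violations

# Hypothetical descriptor values, for illustration only.
example_library = [
    Descriptors("hypothetical-A", 420.5, 1.6, 3, 8),
    Descriptors("hypothetical-B", 655.0, 5.8, 6, 12),
]
retained = [d.name for d in example_library if passes_rule_of_five(d)]
```

Here `retained` keeps only the first hypothetical entry, since the second violates all four criteria.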

Molecular Docking
The newly designed sulfotyrosine-based unnatural amino acid ligand libraries were used as the initial coordinates for docking with the SHP2 protein, and AutoDock 4.2 was used to evaluate the binding energies of the chosen compounds at the active sites of the SHP2 protein. Grid maps of different grid points, centered on the ligand of the complex structures, were used for the receptors to cover the binding pockets. The Lamarckian genetic algorithm was used for the molecular docking simulations, with a population size of 150, a mutation rate of 0.02, and a crossover rate of 0.8. Simulations were performed using up to 2.5 million energy evaluations with a maximum of 27,000 generations. Each simulation was performed 10 times, yielding 10 docked conformations. The lowest-energy conformations were regarded as the binding conformations between the ligands and the proteins.

Reverse Validation
In this validation process, the complete strategy followed in this study was reversed to ensure that the identified hits really fit the generated models and active sites of both targets. All the parameters required for molecular docking were set as in the actual process.
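To make the role of the genetic algorithm settings concrete (population size 150, mutation rate 0.02, crossover rate 0.8), the following toy sketch runs a Lamarckian-style search on a simple quadratic function. It is an illustration under stated assumptions, not the AutoDock implementation: the `energy` function is a hypothetical stand-in for a docking score, the operators are simplified, and all names are invented for this example. The Lamarckian element is that the result of the local search is written back into each individual before selection.

```python
import random

def energy(x):
    """Toy stand-in for a docking score (lower is better); not a force field."""
    return sum((xi - 1.0) ** 2 for xi in x)

def local_search(x, rng, steps=3, step_size=0.1):
    """Random-perturbation hill climb that accepts only improving moves."""
    best, best_e = list(x), energy(x)
    for _ in range(steps):
        cand = [xi + rng.gauss(0.0, step_size) for xi in best]
        cand_e = energy(cand)
        if cand_e < best_e:
            best, best_e = cand, cand_e
    return best

def lamarckian_ga(dim=3, pop_size=150, mutation_rate=0.02,
                  crossover_rate=0.8, generations=30, seed=0):
    """Return (best_individual, best_energy, best_energy_per_generation)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Lamarckian step: locally refined coordinates replace the genome.
        pop = [local_search(ind, rng) for ind in pop]
        pop.sort(key=energy)
        history.append(energy(pop[0]))
        next_pop = [pop[0]]                  # elitism: keep the best as is
        parents = pop[: pop_size // 2]       # truncation selection
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, dim)  # one-point crossover (dim >= 2)
                child = a[:cut] + b[cut:]
            else:
                child = list(a)
            # Each gene is re-drawn with probability mutation_rate.
            child = [rng.uniform(-5.0, 5.0) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=energy)
    return best, energy(best), history
```

Calling `lamarckian_ga()` returns the best toy solution together with the best energy of each generation; because the best individual is always carried over, that per-generation energy never increases.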

Sequence Similarity Scores
The selected SHP2 protein sequences were searched and analyzed using PSI-BLAST. The scores of the proteins were arranged in descending order so that the proteins most likely to bind a ligand would be clustered at the top. Scores were generated by calculating the similarity between each protein sequence relative to the length of the SHP2 protein sequence (Table 1). The sequence similarity threshold was selected both empirically and experimentally as three identities and one similarity for every five amino acids. The C-terminal SH region of SHP2 was the most similar, and the functional domains showed 99.62% similarity in the N-terminal SH and PTP-domain regions. As a result, the sequences similar to these templates are uncharacterized, but they show functional properties relevant to disease causation. These sequences are therefore useful for building three-dimensional structure predictions by homology modeling.
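As a rough illustration of an identity-based score, the sketch below computes percent identity over the aligned, non-gap columns of a pairwise alignment and compares it with a 60% cutoff, which is one reading of "three identities for every five amino acids". The function names and the short sequences are hypothetical, and the sketch ignores the similarity (conservative substitution) component of the threshold; PSI-BLAST scoring itself is more involved.

```python
def percent_identity(aligned_query, aligned_target):
    """Percent identity over alignment columns where neither side has a gap."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_target):
        if q == "-" or t == "-":
            continue  # skip gap columns
        compared += 1
        if q == t:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared

def meets_identity_threshold(aligned_query, aligned_target, threshold=60.0):
    """True if percent identity is at least the threshold (assumed 3 of 5)."""
    return percent_identity(aligned_query, aligned_target) >= threshold

# Hypothetical aligned fragments, for illustration only.
query = "ACDEF"
target = "ACDQF"
```

For these toy fragments, four of the five aligned residues are identical, so the pair passes the assumed 60% cutoff.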

Homology Modeling of Protein Structures
The target and template sequences (PDB ID: 5XZR_A) were aligned using the comparative protein modeling program SWISS-MODEL. The theoretical models were built using Swiss-PdbViewer. The protein structures were superimposed in order to deduce the structural alignment (Figure 1).
The SHP2 protein possesses C-SH, N-SH, and PTP domains; the template showed 82.9% similarity and was used for modeling, and after modeling the resulting structure showed 99% similarity. The geometric accuracy of the theoretical 3D protein models was corrected before the quality of the proteins was studied. The reliability of the protein structures was validated using the protein structure analysis and verification server (SAVS). The modeled protein structures were compared with statistical measures of quality derived from X-ray crystallographic data. Ramachandran plots and stereochemical reports for SHP2 make it easy to identify and isolate regions of the predicted structures that require further treatment (Figure 2).

Q-Score
Quality assessment of modeled protein structures is important. A scoring function tends to achieve a higher objective value when it is based solely on the sequence similarity of target and template. Q-score values range from zero to one, where 1 represents identical structures and 0 represents dissimilar structures. The predicted models of SHP2 showed a high Q-score. The structural deviation measure (SDM) of the SHP2 models should be low when compared with the template protein, which confirms the good quality of the predicted structure. Some disturbing noise values in the SDM were observed because of the sequence diversity between the target and template proteins. However, the important functional domains and motifs showed 98-99% conservation. The overall RMSD values for the predicted models of the SHP2 domains range from 0.3 to 0.6 Å. Evolutionarily highly conserved backbone structures were observed in all models, with few fluctuating residues and rotamers.
We mapped the predicted sites onto the ligand coordinates using RMSD calculations. Probe coordinates were clustered within 1.6 Å of a ligand atom. A typical energy threshold for retaining methyl probe binding sites is -1.0 to -1.9 kcal/mol, whereas the predicted SHP2 sites varied around a binding-energy cutoff of 1.4 kcal/mol. More than 15 active-site amino acids were predicted, and these amino acids were used for the ligand docking studies.
The sulfotyrosine-based unnatural amino acid ligands contain an N-heterocyclic ring in the first library, an N-dioxothiazolidine ring in the second library, an N-pyridazine ring in the third library, and an imidazole-based scaffold in the fourth library. This novel pharmacophoric chemical space was designed to interact with the polar SHP2 catalytic site, for inhibition of the SHP2 enzyme in various cancers. The ligands generated from the four libraries (Figure 4), 50 molecules in total, and their biological properties were analyzed using HyperChem 7.5 and Molinspiration Professional.
A desirable Log P value (octanol-water partition coefficient) is no more than 5 (part of the so-called Lipinski Rule of Five; a Log P of 5 corresponds to a 1:100,000 concentration ratio between the water and octanol phases). Ligands 2a and 4a have Log P values in the optimum range of 1-2, which balances hydrophilic and hydrophobic character. They comply well with the Lipinski rule and were the best molecules in the molecular descriptor studies based on Molinspiration and HyperChem 7.5 (Table 3 and Figure 5).
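The relationship between Log P and the concentration ratio quoted above follows directly from the definition Log P = log10([octanol]/[water]). The short sketch below makes the arithmetic explicit; the function names are illustrative, and the fraction calculation assumes equal phase volumes.

```python
def partition_ratio(log_p):
    """Octanol-to-water concentration ratio implied by a Log P value."""
    return 10.0 ** log_p

def fraction_in_octanol(log_p):
    """Fraction of solute in the octanol phase, assuming equal phase volumes."""
    ratio = partition_ratio(log_p)
    return ratio / (1.0 + ratio)

# Log P = 5 gives a 100,000:1 octanol:water ratio (the Rule of Five limit),
# while Log P = 1 and Log P = 2 give 10:1 and 100:1 respectively.
ratios = {log_p: partition_ratio(log_p) for log_p in (1, 2, 5)}
```

A Log P of 1-2 therefore corresponds to a compound that still favors the lipid-like phase but retains appreciable aqueous solubility, which is the balance referred to in the text.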

Molecular Docking
Docking of the newly designed sulfotyrosine-based unnatural amino acid ligand library (50 structures in total) and the standard ligand SHP099 onto the conserved domain regions of SHP2 was performed using the AutoDock 4.2 software package. Polar hydrogen atoms were added to the homology model of SHP2, and its non-polar hydrogen atoms were merged. For the ligands, non-polar hydrogen atoms were merged and Gasteiger charges were assigned. All rotatable bonds of the ligands were set to be rotatable. Docking was performed using the genetic algorithm and local search methods. A population size of 150 and 10 million energy evaluations were used for 100 search runs, with a grid box of 80 x 80 x 80 points and 0.375 Å grid spacing around the domain. Clustering histogram analyses were performed after the docking searches. The best conformations were chosen as the lowest docked energy poses in the most populated cluster, with a root-mean-square deviation (RMSD) of not more than 1.5 Å within the cluster. The hydrogen-bond interactions and binding energies were evaluated for the best affinity using the PyMOL molecular visualization software. The docking of the SHP2 domains with the most active ligands (Ligand-1a, Ligand-1b, Ligand-2b, Ligand-4a, and Ligand-4b) together with the standard is shown in Figures 6-11 and Table 4.
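The pose-selection rule described above (lowest-energy pose in the most populated cluster, with a 1.5 Å RMSD tolerance) can be sketched as follows. This is a simplified, greedy clustering written for illustration; it is not the AutoDock clustering code, it assumes atoms are matched by index without symmetry handling, and the energies and coordinates are hypothetical.

```python
import math

def pose_rmsd(coords_a, coords_b):
    """RMSD between two poses with identical atom ordering."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def cluster_poses(poses, tolerance=1.5):
    """Greedy clustering of (energy, coords) poses.

    Poses are visited from lowest to highest energy; each pose joins the
    first cluster whose lowest-energy member lies within the tolerance,
    otherwise it starts a new cluster.
    """
    clusters = []
    for pose in sorted(poses, key=lambda p: p[0]):
        for cluster in clusters:
            if pose_rmsd(pose[1], cluster[0][1]) <= tolerance:
                cluster.append(pose)
                break
        else:
            clusters.append([pose])
    return clusters

def best_pose(poses, tolerance=1.5):
    """Lowest-energy pose of the most populated cluster, plus cluster size."""
    clusters = cluster_poses(poses, tolerance)
    # Ties in size are broken in favor of the lower-energy cluster.
    top = max(clusters, key=lambda c: (len(c), -c[0][0]))
    return top[0], len(top)

# Hypothetical two-atom poses with energies in kcal/mol.
example_poses = [
    (-7.2, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-6.9, [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)]),
    (-6.5, [(0.1, 0.1, 0.0), (1.1, 0.1, 0.0)]),
    (-8.0, [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]),
]
```

In this toy set, the pose at -8.0 kcal/mol has the lowest energy overall but sits alone in its cluster, so the rule selects the -7.2 kcal/mol pose from the three-member cluster instead. This shows how selecting by cluster population can differ from selecting purely by energy.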

Discussion
A foremost goal of anticancer agents is to specifically eradicate tumor cells while leaving normal cells unharmed. Hence, in this work we focused on allosteric inhibition of the SHP2 PTP catalytic site by small molecules, in particular because this protein-protein interaction is altered or deregulated in cancer cells. Such an approach to anticancer therapy is attractive because it yields molecules that can modulate tumor-specific proteins and protein-protein interactions.
The present work aimed to develop novel sulfotyrosine-based allosteric inhibitors of SHP2 as anticancer therapeutics. SHP2 down-regulates RTK-dependent signaling proteins, which leads to the activation of oncogenes; inhibition of the SHP2 PTP catalytic site is therefore expected to restore proper levels of RTK-dependent phosphorylation. In this context we developed sulfotyrosine-based unnatural amino acid ligands as SHP2 allosteric inhibitors [5].
The positively charged environment of the SHP2 PTP catalytic pocket presents unique drug discovery challenges. Most catalytic-site inhibitors require multiple ionizable functional groups in order to inhibit the enzyme, and these polar and ionic groups in turn complicate drug discovery and development because they lower cell permeability and oral bioavailability. SHP2 PTP catalytic-site inhibitors also often lack robust selectivity over other PTPs (e.g., SHP1, PTP1B) [6][7][8][9][10]. Despite these challenges, toxicity has to be minimized. Because the phosphate group is sensitive to hydrolysis, we replaced it and generated four novel sulfotyrosine-based unnatural amino acid libraries of SHP2 allosteric inhibitors through in silico high-throughput design and molecular docking. The resulting SHP2 ligands contain an N-heterocyclic ring spacer in the first library, an N-dioxothiazolidine ring in the second library, an N-pyridazine ring in the third library, and an N-imidazole ring in the fourth library. This novel pharmacophoric chemical space interacted with the polar SHP2 catalytic site with the help of N-protonation. The molecular docking results showed strong hydrogen bonds, electrostatic interactions, and pi-pi interactions (Table 4 and Figures 6 to 11). Our results therefore indicate that the most promising ligands (1a, 1b, 2a, 4a, and 4b) showed significant predicted SHP2 inhibitor activity through binding and activation when compared with the standard ligand SHP099.

Conclusion
By demonstrating the utility of in silico design, we obtained a novel sulfotyrosine-based unnatural amino acid chemical space for the SHP2 protein tyrosine phosphatase enzyme family. Within this space, ligands 1a, 1b, 2a, 4a, and 4b showed remarkable predicted anticancer SHP2 inhibitor activity through binding and activation.