Data Analysis of Single Nucleotide Polymorphism in Human AGT Gene Using Computational Approach

Background: The AGT gene encodes angiotensinogen, a protein that helps regulate blood pressure and fluid balance in the body. Hypertension has many causes, one of which is a defect in the AGT gene. Hypertension usually has no symptoms; however, it is a major risk factor for heart disease, stroke, kidney failure, and eye problems. Objectives: This study uses computational tools to analyze the AGT gene, to summarize the predictions statistically, and to detect the SNPs that may cause disease. Material and Methods: nsSNPs of the AGT gene were retrieved from the NCBI website and analyzed with several software tools: SIFT, Polyphen-2, Provean, I-Mutant, PHD-SNP, SNPs&GO, Project HOPE and GeneMANIA. Results: Of 172 nsSNPs, SIFT predicted 46 to be deleterious and 126 to be tolerated. Polyphen-2 classified two as benign, 11 as possibly damaging and 33 as probably damaging. Using Provean, 19 nsSNPs were neutral and 27 were deleterious. PHD-SNP classified 20 nsSNPs as disease related and 18 as neutral. SNPs&GO classified 32 nsSNPs as neutral and 14 as disease-associated variants. Using I-Mutant, 13 nsSNPs were predicted to increase the stability of the protein and 33 to decrease it. Conclusions: Extensive functional and structural analyses were carried out to predict potentially damaging and deleterious nsSNPs of the AGT gene using bioinformatics and computational methods. In the study, 14 high-confidence damaging nsSNPs were identified from 172 nsSNPs. Although bioinformatics tools have their limitations, the results of the present study may be useful for future population-based research and for the development of precision medicine.


Introduction
AGT is the gene that encodes angiotensinogen, a protein that forms part of the renin-angiotensin system, which regulates blood pressure and fluid balance in the body [1]. This regulation begins with the conversion of angiotensinogen into angiotensin I [1]. Angiotensin I is then converted into angiotensin II, which causes blood vessels to narrow, leading to an increase in blood pressure [2]. Angiotensin II also induces production of the hormone aldosterone [2], which plays a role in salt absorption by the kidney [1], leading to an increase in body fluids and hence in blood pressure [3]. Normal blood pressure is important during fetal life, when it supports the delivery of oxygen to body tissues; it is also needed for kidney development, especially of the proximal tubules, and for the growth factors involved in kidney structure [4].
Several health conditions are associated with mutations in the AGT gene. Among these is hypertension, which can be caused by a specific mutation in the AGT gene [5]. Hypertension is a major risk factor for heart disease, stroke, kidney failure, and eye problems. When blood pressure is elevated, the heart and arteries have to work harder than normal to pump blood through the body [6]. The extra work thickens the muscles of the heart and arteries and hardens or damages artery walls [7]. As a result, the flow of blood and oxygen to the heart and other organs is reduced.

Material and Methods
Data retrieval: SNP data were retrieved from dbSNP (http://www.ncbi.nlm.nih.gov/SNP/). Information regarding SNPs of the AGT gene was obtained during the year 2019. Interaction of this gene with other genes was investigated using GeneMANIA. The functional effect of the nsSNPs on the protein was investigated using SIFT, Polyphen-2, and Provean. The stability of the protein as a result of each mutation was studied using I-Mutant. Lastly, the effect of the SNPs on the protein structure was predicted using Project HOPE.
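The workflow above produces one label per tool for each nsSNP, and the Results report how many variants received each label. A minimal sketch of that bookkeeping is given here, assuming the per-tool labels have already been collected by hand from the web tools. The variant names and labels are made-up placeholders, and the rule that a consensus variant must be flagged by every listed tool is an assumption of this sketch, not a rule stated in the study.

```python
from collections import Counter

# Toy predictions for three made-up variants. The IDs and labels are
# placeholders for illustration only, not results from the study.
predictions = {
    "variant_1": {"SIFT": "deleterious", "Polyphen-2": "probably damaging", "Provean": "deleterious"},
    "variant_2": {"SIFT": "deleterious", "Polyphen-2": "benign", "Provean": "neutral"},
    "variant_3": {"SIFT": "tolerated", "Polyphen-2": "benign", "Provean": "neutral"},
}

# Labels treated as "damaging" for each tool (an assumption of this sketch).
DAMAGING = {
    "SIFT": {"deleterious"},
    "Polyphen-2": {"possibly damaging", "probably damaging"},
    "Provean": {"deleterious"},
}


def tally(preds, tool):
    """Count how many variants received each label from one tool."""
    return Counter(calls[tool] for calls in preds.values())


def consensus(preds):
    """Return the variants that every tool in DAMAGING flags as damaging."""
    return sorted(
        variant
        for variant, calls in preds.items()
        if all(calls[tool] in labels for tool, labels in DAMAGING.items())
    )


print(tally(predictions, "SIFT"))
print(consensus(predictions))
```

Keeping the labels in one table per variant makes it easy to check that the per-tool counts add up to the number of variants analyzed.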

GeneMANIA
GeneMANIA (http://www.genemania.org) [8] is a web interface that finds other genes related to a set of input genes, using a very large set of functional association data. The gene name was entered into the software, and the results show association data that include protein and genetic interactions, pathways, co-expression, co-localization and protein domain similarity.

SIFT (Sorting Intolerant from Tolerant) http://blocks.fhcrc.org/sift/SIFT.html [9]
SIFT is an online tool that uses sequence homology to predict whether an amino acid substitution affects protein function. The SNPs retrieved from dbSNP at NCBI were entered into the software, and each substitution was reported as deleterious or tolerated according to whether it is predicted to affect protein function, based on sequence homology and the physical properties of amino acids.
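As an illustration of how a SIFT score maps to the two labels, the following sketch applies a cutoff to a score between 0 and 1. The cutoff of 0.05 is the value conventionally used with SIFT and is an assumption here, since the text does not state which cutoff was applied; the function name is hypothetical.

```python
SIFT_CUTOFF = 0.05  # assumed conventional cutoff; not stated in the text


def classify_sift(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Label a SIFT score: low scores indicate substitutions that are not tolerated."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SIFT scores lie between 0 and 1, got {score}")
    return "deleterious" if score <= cutoff else "tolerated"


# Made-up example scores, for illustration only.
for example_score in (0.00, 0.05, 0.32):
    print(example_score, classify_sift(example_score))
```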

Polyphen-2 (Polymorphism Phenotyping v2)
http://genetics.bwh.harvard.edu/pph2/. Polyphen-2 is used to predict the possible impact of an amino acid substitution on both the structure and function of a protein by analysis of multiple sequence alignments and the protein 3D structure [1]. The software estimates the position-specific independent counts (PSIC) score for each of the two variants (wild-type and mutant residue) and then determines the difference between them; the higher the PSIC score difference, the higher the likely functional impact of the substitution. Predictions are classified as probably damaging, possibly damaging or benign according to a score ranging from 0 to 1.
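The three-way classification can be sketched as two cutoffs on the 0 to 1 score. The cutoff values below are illustrative placeholders, not values taken from the text; Polyphen-2 derives its own cutoffs from the classifier model selected at run time, so they should be replaced with the values reported by the tool. The function name is hypothetical.

```python
def classify_polyphen(score: float,
                      possibly_cutoff: float = 0.45,
                      probably_cutoff: float = 0.95) -> str:
    """Map a 0-1 score to benign / possibly damaging / probably damaging.

    Both cutoffs are illustrative assumptions and should be replaced with
    the values reported by the tool for the chosen classifier model.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must lie between 0 and 1, got {score}")
    if not possibly_cutoff < probably_cutoff:
        raise ValueError("possibly_cutoff must be below probably_cutoff")
    if score >= probably_cutoff:
        return "probably damaging"
    if score >= possibly_cutoff:
        return "possibly damaging"
    return "benign"


# Made-up example scores, for illustration only.
for example_score in (0.10, 0.60, 0.99):
    print(example_score, classify_polyphen(example_score))
```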

Provean (Protein Variation Effect Analyzer)
(http://provean.jcvi.org/index.php). Provean is a software tool that predicts whether an amino acid substitution has an impact on the biological function of a protein. SNPs were entered together with the protein sequence. Predictions are classified as neutral or deleterious.
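A short sketch of the neutral/deleterious call follows. It assumes the tool's default score cutoff of -2.5, which the text does not state, so the cutoff is exposed as a parameter; the function name is hypothetical.

```python
PROVEAN_CUTOFF = -2.5  # assumed default cutoff; not stated in the text


def classify_provean(score: float, cutoff: float = PROVEAN_CUTOFF) -> str:
    """Label a Provean score: scores at or below the cutoff are called deleterious."""
    return "deleterious" if score <= cutoff else "neutral"


# Made-up example scores, for illustration only.
for example_score in (-6.1, -2.5, -0.4):
    print(example_score, classify_provean(example_score))
```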

I-Mutant3.0
(http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi). I-Mutant3.0 was used to study the effect of mutations on protein stability. It is a neural network based tool that predicts the change in the stability of the protein upon mutation [10]. The output reports the predicted direction of the stability change upon mutation (increased or decreased stability) together with the Gibbs free energy change (DDG).
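The relation between the DDG value and the reported direction can be sketched as a sign check. The sketch assumes the convention that DDG is the free energy value of the mutant minus that of the wild type, in kcal/mol, so that a negative value means decreased stability; the function name is hypothetical.

```python
def classify_stability(ddg: float) -> str:
    """Return the direction of the stability change for a DDG value in kcal/mol.

    Assumed sign convention: DDG = DG(mutant) - DG(wild type), so a negative
    value means the mutation decreases stability.
    """
    if ddg < 0:
        return "decrease"
    if ddg > 0:
        return "increase"
    return "no change"


# Made-up example values, for illustration only.
for example_ddg in (-1.32, 0.0, 0.47):
    print(example_ddg, classify_stability(example_ddg))
```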

Project HOPE
(http://www.cmbi.ru.nl/hope/). It is an automatic program that analyzes the structural and functional effects of point mutations. Five SNPs were submitted to the software, and the results show the effect of each mutation on the amino acid properties and how this affects the protein; an image of the protein structure is also displayed whenever available [10].

SNPs&GO (Single Nucleotide Polymorphisms and Gene Ontology), PHD-SNP
(http://snps.biofold.org/snps-and-go) [2]. SNPs&GO is an accurate method that, starting from a protein sequence, can predict whether a variation is disease related or not by exploiting the corresponding protein functional annotation. SNPs&GO collects in a single framework information derived from the protein sequence, evolutionary information, and function as encoded in the Gene Ontology terms, and outperforms other available predictive methods [2]. The protein sequence, obtained from UniProtKB / ExPASy, was submitted in FASTA format. The mutations were then submitted in the XPOSY format, where X and Y are the wild-type and mutant residues respectively and POS is the position of the residue in the sequence. The result is reported as neutral or disease. PHD-SNP results are presented as part of the SNPs&GO output.
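Because the mutation list has to follow the XPOSY pattern exactly, it can help to validate each entry before submission. The sketch below parses entries such as I345S (isoleucine to serine at position 345) with a regular expression and optionally checks the wild-type residue against a sequence. The function and class names are hypothetical, and restricting residues to the 20 standard one-letter codes is an assumption of this sketch.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
_XPOSY = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


class Substitution(NamedTuple):
    wild_type: str
    position: int  # 1-based position in the protein sequence
    mutant: str


def parse_substitution(text: str) -> Substitution:
    """Parse one XPOSY entry, e.g. 'I345S', into its three parts."""
    match = _XPOSY.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not in XPOSY format: {text!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def matches_sequence(sub: Substitution, sequence: str) -> bool:
    """Check that the wild-type residue is found at the stated position."""
    in_range = 1 <= sub.position <= len(sequence)
    return in_range and sequence[sub.position - 1] == sub.wild_type


print(parse_substitution("I345S"))
print(parse_substitution("R458C"))
```

Checking the wild-type residue against the FASTA sequence catches entries that use a position from a different isoform or numbering scheme.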

Results
In this study, the AGT gene was found to be associated with 20 other genes. The most important among them are REN (responsible for the production of renin in the kidney) and AGTR2 (which encodes a receptor for angiotensin II) (Figure 1 and Table 1). The physical interactions and co-expression of AGT with other related genes are shown in Figure 1. The genes co-expressed with AGT are listed in Table 3, Appendix.
The total number of SNPs obtained was 173. Using SIFT, 46 nonsynonymous SNPs were predicted to be deleterious and 127 were predicted to be tolerated. Analysis using Polyphen-2 classified one SNP as benign, 12 as possibly damaging and 33 as probably damaging. Analysis with Provean showed that 19 were neutral and 27 were deleterious. Protein stability was checked using I-Mutant, which showed 13 SNPs increasing protein stability and 33 decreasing it. PHD-SNP classified 20 SNPs as disease related and 18 as neutral. SNPs&GO classified 32 SNPs as neutral and 14 as disease-associated variants. The detailed results for SIFT, Polyphen-2 and Provean are shown in Table A1. The detailed results for I-Mutant, SNPs&GO and PHD-SNP.
The mutant residue is smaller than the wild-type residue. This will cause a possible loss of external interactions. The wild-type residue charge was NEUTRAL, the mutant residue charge is POSITIVE. This can cause repulsion between the mutant residue and neighboring residues.
The wild-type residue is more hydrophobic than the mutant residue. The mutation might cause loss of hydrophobic interactions with other molecules on the surface of the protein.
2. rs137858911, Isoleucine into a Serine at position 345. The mutant residue is smaller than the wild-type residue. The mutation will cause an empty space in the core of the protein.
The wild-type residue is very conserved, but a few other residue types have been observed at this position too.
The wild-type residue is more hydrophobic than the mutant residue. The mutation will cause loss of hydrophobic interactions in the core of the protein.
3. rs137858911, Isoleucine into a Serine at position 345. The mutant residue is smaller than the wild-type residue. The mutation will cause an empty space in the core of the protein.
The hydrophobicity of the wild-type and mutant residue differs.
The mutation will cause loss of hydrophobic interactions in the core of the protein.
4. rs375261929, Arginine into a Cysteine at position 458. The mutant residue is smaller than the wild-type residue. The wild-type residue forms a hydrogen bond with Proline at position 347. The size difference between the wild-type and mutant residue means that the new residue is not in the correct position to make the same hydrogen bond as the original wild-type residue did.
The wild-type residue charge was POSITIVE, the mutant residue charge is NEUTRAL. The difference in charge will disturb the ionic interaction made by the original, wild-type residue.
The mutant residue is more hydrophobic than the wild-type residue. The difference in hydrophobicity will affect hydrogen bond formation. The wild-type residue forms a salt bridge with Aspartic Acid at position 352. Results are shown in Figure 3.

Conclusions
In conclusion, extensive functional and structural analyses were carried out to predict potentially damaging and deleterious nsSNPs of the AGT gene using bioinformatics and computational methods. In the study, 14 high-confidence damaging nsSNPs were identified from 172 nsSNPs. Although bioinformatics tools have their limitations, the results of the present study may be useful for future population-based research and for the development of precision medicine.

Recommendations
SNPs in the AGT gene cause many diseases, mainly hypertension and other related diseases, and hypertension is a major chronic disease worldwide. More wet-lab research on these 14 SNPs is recommended. The SNPs in the non-coding region also need to be considered.

Figure A1. Genes co-expressed with AGT gene using GeneMANIA software.