Analysis of the Differentially Expressed Genes and microRNAs and Prediction of miRNA-mRNA Negative Regulatory Network in Nasopharyngeal Carcinoma

: Introduction: Nasopharyngeal carcinoma (NPC) is an endemic cancer in southern China, particularly in Guangdong population, but the prognosis of NPC is poor. Recently microRNA (miR) has been shown to have function in aiding the treatment of cancer. Thus, in this study, miRNAs and genes associated with NPC were analyzed. Methods: mRNA-sequencing and miR-sequencing data were obtained from the Gene Expression Omnibus. The differentially expressed genes (DEGs) and miRNAs (DEMs) were filter out. Then, the gene function annotations about the DEGs were predicted using Gene Ontology (GO) and KEGG pathway. Subsequently, the protein-protein interaction (PPI) network was established based on the STRING database, and function modules were identified using Cytoscape. Finally, DEGs targeted by DEMs were predicted by using the miRDB, miRTarBase, TargetScan and DIANA databases, and the DEM-DEG negative interaction network was built. Results: In all, 704 DEGs (about 49.9% upregulated) were enriched in 234 GO terms and 53 KEGG pathways. Seven hub genes (APP, GNG2, VAV1, RAC2, YES1, EGFR and GNB5) in 6 function modules were found for the PPI network. In addition, 86 DEMs were identified containing 56 upregulated and 30 downregulated miRNAs. There were 538 DEM-DEG pairs, of which miR-93-5p/TGFBR2, miR-455-3p/STK17B and miR-766-5p/ITGAV had functions in other cancers, moreover, these pairs may potentially contributed to NPC pathogenesis. Conclusion: The constructed miRNA-mRNA negetive regulatory network will give help in elucidating the molecular mechanisms of NPC. The important DEGs, DEMs and DEM-DEG pairs associated with NPC may contribute to the diagnosis and treatment of NPC in the future.


Introduction
Nasopharyngeal carcinoma (NPC) is an endemic cancer in southern China, particularly in Guangdong population [1,2].There are about 129,000 new NPC cases and 73,000 associated deaths in 2018, moreover, it showed an uptrend every year [3].Multiple factors contribute to the accumulation of the NPC development, including predisposing genetic factors, environmental carcinogens, and Epstein-Barr virus infection [2,4].So far, the NPC pathogenesis has not been well known, which limits the diagnosis and therapy of NPC [5].Thus, exploring the molecules taken part in the NPC progression and their relationships is essential.
RNA sequencing (RNA-Seq) is a revolutionary tool that has been widely used in cancer research [6].MicroRNAs (miRNAs) are useful diagnostic and prognostic indicators for human cancer [7,8].By comparing the RNA sequencing results of cancer tissue and normal tissue samples can provide insights into the molecular mechanism of tumorigenesis [9].The subsequent combination of bioinformatics analysis plays an important role in screening tumor biomarkers [10].Ge et al [11] performed bioinformatic analysis of the GSE12452 dataset which included 31 cancer tissues of NPC and 10 control samples, and identified some important genes that may contribute in NPC development.Zhu et al [12] analyzed three mRNA expression datasets (GSE12452, GSE34573 and GSE64634) comprising of 59 NPC cases and 17 controls and also identified some important pathways and significant expressed genes in NPC development by bioinformatic tools.Liu et al [13] collected 3 mRNA expression profiles (GSE53819, GSE13597, GSE12452) and 5 miRNA expression profilies (GSE32906, GSE46172, GSE22587, GSE32960 and GSE36682) and found some miRNAs and genes might play essential functions in the NPC development process.Thus, screening the differentially expressed miRNAs or genes in case and control helps in elucidating the molecular mechanisms of NPC.Nevertheless, the studies focused on the bioinformatics analysis of the regulation between miRNAs and mRNAs in NPC are still limited, the combination analysis of the miRNAs and mRNAs may be more helpful in uncovering the mechanisms of the occurrence and development of NPC.
Based on the previous research, here, three expression profile datasets were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/).Among which, GSE68799 and GSE118719 were mRNA data and GSE118720 was miRNA data.The differentially expressed genes (DEGs) and miRNAs (DEMs) were screened out by comparing the data of the NPC cancer tissues and normal nasopharyngeal mucosal controls.Then, the DEGs were carried out a series of bioinformatics analysis, including the functional enrichment analysis, the protein-protein interactions, the functional modules and the hub genes analysis.After that, for the DEMs, the potential downstream target DEGs were predicted, and the miRNAs-mRNAs negative regulated network in NPC progression were constructed.The results will be helpful in further uncovering the NPC pathogenesis.

The mRNA and miRNA Sequencing Data
The mRNA expression profiles (GSE68799 and GSE118719) and miRNA expression profile (GSE118720) were obtained from GEO.Data GSE68799 included 42 Chinese NPC specimens and 4 control specimens, while data GSE118719 and GSE118720 were from the same group containing 7 NPC tissuess and 4 normal nasopharyngeal mucosal controls.In addition, GSE68799 was achieved using the platform of GPL11154 Illumina HiSeq 2000, while GSE118719 and GSE118720 were obtained using GPL20301 Illumina HiSeq 4000 (Illumina, San Diego, CA, USA).

Screening out NPC-associated mRNAs and microRNAs
After three expressing profiles GSE68799, GSE118719 and GSE118720 were downloaded, the data were processed using R/Bioconductor package limma linear models for sequencing data to analyze the DEGs and DEMs between the NPC tissues and non-NPC tissues.The false discovery rates (FDR) < 0.05 and the fold-change (FC) ≥ 2 were used as a cut-off.

Functional Annotation and Pathway Enrichment
Analysis of the DEGs DAVID (https://david.ncifcrf.gov) is a well-known database which can supply functional interpretation of gene clusters [14].The functional enrichment analysis includes Gene Ontology (GO) functional enrichment analysis comprised of biology process (BP), molecular function (MF) and cellular component (CC).In this study, DEGs and the predicted miRNA target genes were analyzed using DAVID, including enrichment analysis of GO functional terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.P < 0.05 was regarded as significant.

Protein-protein Interaction (PPI) Network Construction and Module Analysis
The interactions between different proteins encoded by the DEGs were performed using STRING database (https://string-db.org/)[15], and the combined score of > 0.9 was thought as a cut-off.The PPI network was visualized using Cytoscape software v3.7.2 (http://www.cytoscape.org).The functional modules within the PPI network were established using the Molecular Complex Detection (MCODE) in Cytoscape, and the significance score of the module was set to > 5.0.

miRNA-mRNA Interaction Network Construction
The mRNAs targeted by different DEMs in this study were predicted by four databases: miRDB (http://mirdb.org/),miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/),TargetScan (http://www.targetscan.org/vert_72/)and DIANA (http://carolina.imis.athena-innovation.gr/diana_tools/web/index.php).It should be noted that the DIANA database was used for predicting the mRNAs targeted by EBV BART miRNAs specially.All the predicted mRNA targets were taken intersected with the DEGs, and the overlapping genes were defined as the mRNA target dataset.Then the miRNA-mRNA (DEMs-DEGs) interaction network with negative interaction relationships was constructed and visualized using Cytoscape software v3.7.2.

Expression Profiling Datasets Retrieved from GEO Datasets of NCBI
The sequencing samples contained in the present study  1.

DEGs and DEMs Analysis
Following analysis of GSE68799, GSE118719, GSE118720 datasets, there were 2339 DEGs (1352 up and 987 down), 2732 DEGs (945 up and 1787 down) and 86 DEMs (56 up and 30 down) were screened out, respectively.The cluster analysis of DEGs showed great differences in the normal nasopharyngeal controls and NPC specimens (Figure 1, Figure 2 and Figure 3).Furthermore, using Venn diagram analysis, GSE68799 and GSE118719 shared 704 identical DEGs, among which, 351 were upregulated (Figure 4).
The pathway enrichment analysis was performed and there were 53 KEGG significant pathways.The top ten significant pathways were shown in Figure 5D, among which, the three primarily associated pathways were as follows: B cell receptor signaling pathway (p=2.32E-14),Pathways in cancer (p=1.30E-08),Focal adhesion (p=3.95E-07)(Figure 5D).
Moreover, the DEGs included in different modules were carried out pathway enrichment analysis.For DGEs in module 1, the most significant enriched KEGG pathway was Chemokine signaling pathway.The DEGs in module 2 were all down-regulated, and the most significant pathway was B cell receptor signaling pathway.In module 3, the top enriched pathway was ECM-receptor interaction.The DEGs involved in module 4 and module 5 were upregulated, the top enriched pathways were Protein digestion and absorption and Arrhythmogenic right ventricular cardiomyopathy, respectively.In module 6, the top enriched KEGG pathway was Regulation of actin cytoskeleton (Table 2).

Discussion
In the present research, bioinformatics methods were applied to analyze three public NPC sequencing data, and found some potential important genes and miRNAs related with the pathogenesis of NPC, the miRNA-mRNA correlation pairs may supply some clues for the molecular mechanisms of NPC pathogenesis.
Through protein-protein interaction network analysis, 7 hub genes (APP, GNG2, VAV1, RAC2, YES1, EGFR and GNB5) were found, which suggested their potential important roles in the pathogenesis of NPC.Among these genes, EGFR is a well known cancer-related gene [13].It were reported to be overexpressed in most head-and-neck cancers [14].Aberrant expression of EGFR presented potential unfavorable prognosis for patients with NPC [16,17], and activation of EGFR signal transduction can promote the invasion and metastasis of NPC cells [15].Interesting, another hub gene APP was reported to be a downstream protein of EGFR in NPC.Knockdown of APP can suppress the progression of NPC through inhibiting cell viability, migration and invasion [18], and upregulation of secretory APP by EGFR may relate to the pathogenesis of NPC [19].Our studies were consistent with these previous research that EGFP/APP took part in NPC progression.RAC2 can regulate many biological processes in cells, such as secretion, and phagocytosis [20].RAC2 was also reported to play roles in cancer, and it may be a prognostic biomarker in clear cell renal cell carcinoma [21].Moreover, RAC2 had also been regarded as a hub gene in the NPC high-throughput data bioinformatics analysis [11], but the molecular mechanism of RAC2 in NPC was not clear and deseared researching in the next experiment.The research of other 4 hub genes (GNG2, VAV1, GNB5 and YES1) in NPC was limited.However, VAV1, GNB5 and YES1 were reported to affect the progression of other cancer types.For example, YES1 can drive lung cancer growth and progression and predict sensitivity to Dasatinib [22].What's more, YES1 has an important role in antitumor immunity, and phosphorylated Yes (p-Yes) level combined with the pIgR level can work as a prognostic biomarker in specific hepatocellular carcinoma patients [23].Among the 7 hub genes, GNG2 was the least reported in tumor.It was reported the GNG2 expression level was droped in malignant melanoma, which indicated GNG2 may be a cancer suppress gene in NPC and this needed further confirmation [24].
In this study, a total of 86 DEMs were found in NPC.We focused on the DEMs which had more target genes within the DEGs.Among the up-regulated DEMs, hsa-miR-548p, hsa-miR-93-5p, hsa-miR-548ah-3p, hsa-miR-329-3p and hsa-miR-455-3p had the top 5 target mRNAs (Table 2).Among which, many of the miRNAs or the related miRNA-gene pairs have been reported in NPC or other cancer types.For example, hsa-miR-93-5p was probably to be a candidate biomarker in NPC [25].The hsa-miR-93/TGFBR2 (Transforming Growth Factor Beta Receptor 2, TGFBR2) pair promoted the proliferation and invasion in prostate tumor [13,14].hsa-miR-455-3p inhibited the proliferation and invasion of esophageal squamous cell carcinoma, and could serve as a prognostic marker [9].The hsa-miR-455-3p/STK17B (serine/threonine kinase 17B, STK17B) pair can stimulate hepatocellular carcinoma cell proliferation and metastasis [14].Interesting, in this study, we also predicted hsa-miR-93/TGFBR2 and hsa-miR-455-3p/STK17B took part in NPC progression, and this will need further experimental verification.It should be noted that hsa-miR-548p was reported to be a tumor suppressors in breast cancer and hepatitis B virus X protein associated hepatocellular carcinoma [26,27], however, it was upregulated in this study, which suggested it may be a tumor promoter in NPC.Now the research about hsa-miR-548p is not enough, and whether hsa-miR-548p plays different roles in various cancer types will need further experiments.
For the down-regulated DEMs, hsa-miR-1273h-5p, hsa-miR-29a-3p, hsa-miR-766-5p, hsa-miR-29b-3p and hsa-miR-29c-3p had the top 5 target mRNAs.All the 5 miRNAs were cancer-related.hsa-miR-766-5p was downregulated in lung cancer tissues and cells [28] and inhibited the cancer progression in colorectal cancer [29].hsa-miR-1273h-5p can inhibit cell proliferation, migration and invasion by targeting chemokine ligand (CXCL12) in gastric cancer [30].The hsa-miR-29 family members (miR-29a, -29b, and -29c) were cancer suppressors in different cancer types.In nasopharyngeal carcinoma cells, hsa-miR-29-3p can improve the radiotherapy sensitivity by targeting COL1A1 3'-UTR [31].In this study, we predicted the other 4 collagen genes (COL4A2, COL4A5, COL5A1, COL7A1) may be potential targets of hsa-miR-29a-3p (Table 2).MiR-29b-3p was sharply reduced in NPC and this result was in consistence with a meta analysis of NPC [32].Notably, most predicted targeted mRNAs of miR-29 family were cancer associated, such as vascular endothelial growth factor A (VEGFA) [16], serpin family h member 1 (SERPINH1) [15], collagen type IV alpha 2 (COL4A2) [11].Of these targets, VEGFA was a key target for the hsa-miR-29 family.For example, miR-29a can suppress the growth and invasion of gastric cancer cells by targeting VEGFA [23], hsa-miR-29b can inhibit angiogenesis by targeting VEGFA in endometrial carcinoma [23] and hsa-miR-29c can function as a tumor suppressor through targeting VEGFA in lung cancer [33].SERPINH1 was also an important target for the hsa-miR-29a, -29b, and -29c, it had been reported that hsa-miR-29a, -29b, and -29c can directly regulate SERPINH1 and participates in the pathogenesis of renal cell carcinoma [24].Therefore, hsa-miR-29 was a potential important regulator in the NPC progression.There are still some limitation in this study, the research was performed by a series of bioinformatics analysis and lack of experiment verification.In the future, more biological experiments will be needed to study the functions of the hub genes and the predicted miRNA-mRNA pairs in NPC.

Conclusions
Taken together, the present study have found many DEGs and DEMs associated with NPC, and constructed a miRNA-mRNA negetive regulatory network.The important DEGs, DEMs and DEM-DEG pairs associated with NPC may contribute to the diagnosis and treatment of NPC in the future.The constructed miRNA-mRNA negetive regulatory network will give help in elucidating the molecular mechanisms of NPC.

Figure 1 .
Figure 1.Cluster analysis of the mRNAs in GSE68799.Row and column represented mRNAs and tissue samples, respectively.The color scale represented the expression levels.

Figure 3 .
Figure 3. Cluster analysis of the microRNAs in GSE118720.Row and column represented microRNAs and tissue samples, respectively.The color scale represented the expression levels.

Figure 4 .
Figure 4. Venn diagram analysis of the DEGs in different datasets.The blue circle represented the GSE68799 dataset and the yellow circle represented the GSE118719 dataset.The intersection of the two circles represented the identical DEGs between the two datasets.(A) The upregulated DEGs from both datasets; and (B) The downregulated DEGs from both datasets.

Figure 5 .
Figure 5.The top 10 significantly enriched GO terms and KEGG pathways of the DEGs.The X-axis represented -log10 (P-value) and the Y-axis showed GO terms or KEGG pathways.(A) Biological Process; (B) Cellular Component; (C) Molecular Funtion; and (D) KEGG pathways.

Figure 6 .
Figure 6.Protein-protein interaction (PPI) networks of the DEGs.The red and green ellipses represented proteins encoded by the up and downregulated DEGs.Ellipses with black border were the 7 hub DEGs.
Analysis of the Differentially Expressed Genes and microRNAs and Prediction of miRNA-mRNA Negative Regulatory Network in Nasopharyngeal Carcinoma were NPC tissue samples and non-NPC tissue samples.In this study, mRNA expression profiles GSE68799 included a collection of 42 Chinese Nasopharyngeal Carcinoma patients tissues and 4 non-NPC nasopharyngeal tissues.mRNA expression profiles GSE118719 and miRNA expression profiles GSE118720 included a collection of 7 NPC samples and 4 normal nasopharyngeal mucosal samples.Information of expression profiling sequencing datasets was present in Table Ping Ouyang et al.:

Table 1 .
Information of sequencing datasets.

Table 2 .
The most significant pathways for the DEGs in different modules of the PPI network.

Table 3 .
The top 5 up-and down-regulated miRNAs and their targets in the miRNA-mRNA network.