Multiple sclerosis (MS) is an autoinflammatory disease that might lead to severe disability. The diagnosis of MS is defined due to the urgency for biomarkers with both reliability and efficiency. Demyelination of axons are deeply involved in the pathogenesis of MS. Our study aims to identify the underlying molecular mechanism and screening for related biomarkers and signaling pathways. We obtained next generation sequencing (NGS) dataset (GSE138614) from the GEO database. Differentially expressed genes (DEGs) were screened by the DESeq2 package in R bioconductor with considering specific criteria. Gene Ontology (GO) enrichment analysis, REACTOME pathway enrichment analysis were performed; a protein-protein interaction (PPI) network was constructed; significant modules were analyzed and hub genes were identified by Human Integrated Protein-Protein Interaction rEference (HiPPIE). Subsequently, miRNA-hub gene regulatory network, TF-hub gene regulatory network and drug-hub gene interaction network were built by Cytoscape to predict the underlying microRNAs (miRNAs), transcription factors (TFs) and drugs associated with hub genes. Receiver operating characteristic (ROC) curves analysis was performed to calculate diagnostic value of hub genes. Finally, we performed molecular docking study for prediction of drug molecules against protein targets. A total of 959 DEGs (479 up-regulated and 480 down-regulated genes) were identified in the MS samples and compared with normal control samples. The DEGs were predominantly enriched in an ensemble of genes encoding the immune system process, developmental process, immune system and regulation of cholesterol biosynthesis by SREBP (SREBF). A PPI network was obtained through HiPPIE analysis, and the results were imported into Cytoscape software. The DEGs were sequenced by the Network Analyzer plug-in by various calculation methods, and 10 hub genes (LCK, PYHIN1, SLAMF1, DOK2, TAB2, CFTR, RHOB, LMNA, EGLN3 and ERBB3) were finally selected. Based on the miRNA-hub gene regulatory network and TF-hub gene regulatory network construction, miRNAs including hsa-mir-6794-3p, hsa-mir-3689a-3p, hsa-mir-4651, hsa-mir-548q, BRCA1, HNF4A, TFAP2C and NR2F1 were determined to be potential key biomarkers. Drug-hub gene interaction network constructed from DrugBank, which identified targeted therapeutic drugs (Palivizumab, Cu-Bicyclam, Lumacaftor and Zonisamide) for the hub genes. From molecular docking study we showed good drug - protein bind affinity and amino acid interactions. This study identified novel biomarkers for MS and established a reliable diagnostic model as well as predicted novel drug molecules. The transcriptional changes identified may help to reveal the pathogenesis and molecular mechanisms of MS.
Multiple sclerosis (MS) is the commonest autoimmune inflammatory disorder of CNS white matter [1]. MS is one of leading causes of disability in children’s, young aged adults and middle aged adults in many developed countries [2–4] and it affects 2.3 million people worldwide with an expected increase in the number of cases in future [5]. Main features of MS include selective primary demyelination with partial preservation of axons and reactive astrocytic gliosis [6]. It might affect many other organs and cause other MS associated complications include spasticity [7], paralysis [8], bladder, bowel and sexual dysfunction [9], psychiatric disorders [10], cardiovascular diseases [11], neurodegenerative diseases [12], spinal cord injury [13], obesity [14], diabetes mellitus [15] and hypertension [16]. Autoimmune inflammation is triggered by both genetic susceptibility [17] and environmental factors [18]. However, the etiology and symptoms of MS are complex in clinical practice, making its diagnosis challenging.
Correct early diagnosis assessment of MS is very difficult, even though much disease related genes and cellular pathways related to MS have appeared [19]. The common treatments of MS are interferons, glatiramer acetate, teriflunomide, sphingosine 1-phosphate receptor modulators, fumarates, cladribine and monoclonal antibodies [20], but there are no valid treatment tactics available to treat MS. Therefore, it is vitally important to explore potential diagnostic and prognostic biomarkers, and therapeutic targets of MS.
The fast development of next generation sequencing (NGS) technology and bioinformatics analysis based on high-throughput data provide new tactics to identify differentially expressed genes (DEGs) and discover therapeutic targets for the initiation and evolution of MS [21]. Altered expression of key genes plays an essential role in the initiation and progression of MS, so mastering the alteration in the characteristics of critical genes and signaling pathways promotes to comprehensively understanding of MS progression [22]. Several biomarkers including CLEC16A (C-type lectin domain containing 16A) [23], IL2RA (interleukin 2 receptor subunit alpha), RPL5 (ribosomal protein L5) and CD58 (CD58 molecule) [24], SLC9A9 (solute carrier family 9 member A9) [25], IL22RA2 (interleukin 22 receptor subunit alpha 2) [26] and ANKRD55 (ankyrin repeat domain 55) [27] were identified for the MS. Indeed, some researchers found signaling pathways include mTOR signaling pathway [28], Keap1/Nrf2/ARE signaling pathway [29], NF-κB signaling pathways [30], wnt signaling pathway [31] and Nogo-A/NgR signaling pathway [32] have been shown to be associated with MS. The identification of novel biomarkers and signaling pathway might be helpful to improve the clinical outcome of MS patients.
However, the use of bioinformatics analysis method to find the relevant genes and signaling pathways of MS has not yet been confirmed. In the current study, we downloaded NGS dataset GSE138614 [33] from the NCBI Gene Expression Omnibus (GEO) [https://www.frankenthalerfoundation.org [34] database, including NGS data from 73 patients with MS brain white matter samples and 25 normal control brain white matter samples. We identified DEGs using the DESeq2 in R bioconductor tool. The GO term and REACTOME pathway enrichment analyses were performed via g:Profiler. Moreover, protein-protein interaction (PPI) network and modules, and the hub genes were identified according to toplogical properties include node degree, betweenness, stress and closeness scores. Subsequently, miRNA-hub gene regulatory network, TF-hub gene regulatory network and drug-hub gene interaction network were constructed and analyzed according to the screened hub genes. Subsequently, ten hub genes were subject to receiver operating characteristic curve (ROC) analysis to determine diagnostic value of hub genes. Finally, molecular docking study was performed for prediction of possible binding affinity of selected ligand molecules. This approach will enhance our understanding of the role of key genes and signaling pathways in MS and help to elucidate the specific molecular mechanisms of damage of white matter in brain. Additionally, the identification of these genes and signaling pathways might provide novel targets for intervention in the treatment of
GEO is a public functional genomics data repository of high throughput NGS data (RNA-Sequencing). The GSE138614 [33] NGS dataset generated using the GPL21697 NextSeq 550 (Homo sapiens). The GSE138614 dataset contained 73 MS brain white matter samples and 25 normal brain white matter control samples.
Determining DEGs between MS and normal control samples was performed using DESeq2 package [35] in R bioconductor tool.The adjusted P-values (adj. P) and Benjamini and Hochberg false discovery rate were applied to strike a balance between finding statistically important genes and limiting false positives [36]. |log FC (fold change)| > 2.2 for up regulated genes, |log FC (fold change)| < −0.866 for down regulated genes and adj. P-value < 0.05 were considered statistically significant. DEGs were visualized by the “volcano” and “heatmap” by using ggplot2 and gplot packages in R software.
Online tool g:Profiler (https://www.frankenthalerfoundation.org [37] was used to perform the GO functional annotation (https://www.frankenthalerfoundation.org) [38] and REACTOME (https://www.frankenthalerfoundation.org [39] pathway enrichment analyses. GO enrichment analysis is a premier bioinformatics analysis for identifying high-quality functional gene annotation based on biological process (BP), cellular component (CC) and molecular function (MF). The REACTOME is a resource of pathway database for the clarification of high-level features and effects of biological systems. Adj. P- value <0.05 was considered as the threshold.
Information of DEGs’ protein experimental interactions and prediction was obtained by Human Integrated Protein-Protein Interaction rEference (HiPPIE, https://www.frankenthalerfoundation.org [40]. Subsequently, a specific PPI network of DEGs was constructed by cytoscape (version 3.10.1, https://www.frankenthalerfoundation.org [41] based on the interactions retrieved from HiPPIE. To screen the hub genes that might be involved in MS, we applied the Network Analyzer plug-in, using various topological parameters such as node degree (refers to the number of connections (edges) a given protein (node) has within the network) [42], betweenness (measure that identifies the extent to which a node (protein) serves as a bridge or intermediary for communication between other nodes (proteins) in the network) [43], stress (network metric that measures how much a node (protein) is involved in “stress” or bottleneck situations within the network) [44] and closeness (measure that indicates how close a node (protein) is to all other nodes in the network) [45]. In addition, PEWCC analysis [46] in cytoscape was used to identify the significant modules of the PPI network
miRNet database (https://www.frankenthalerfoundation.org [47] is used to explore miRNA- hub gene interactions for the input hub genes from PPI network and assess the effect of the miRNA on the expression of the hub gene. The parameters used were as follows: specify organism: H. sapiens (human); set ID type: Official Gene Symbol; hub gene - miRNA interaction database: TarBase, miRTarBase, miRecords, miRanda (S mansoni only), miR2Disease, HMDD, PhenomiR, SM2miR, PharmacomiR, EpimiR, starBase, TransmiR, ADmiRE and TAM 2. Cytoscape (v. 3.10.1) [41] software was used to visualize the miRNA-hub gene regulatory network.
NetworkAnalyst (https://www.frankenthalerfoundation.org [48] is used to explore TF- hub gene interactions for the input hub genes from PPI network and assess the effect of the TF on the expression of the hub gene. The parameters used were as follows: specify organism: H. sapiens (human); set ID type: Official Gene Symbol; hub gene -TF interaction database: JASPER. Cytoscape (v. 3.10.1) [41] software was used to visualize the TF-hub gene regulatory network.
ROC curve analysis, which shows indicators of accuracy such as the area under the curve (AUC), provides the fundamental principle and rationale for distinguishing between the specificity and sensitivity of diagnostic performance of hub genes. The diagnostic utility of hub genes in MS was then assessed using a pROC package in R bioconductor tool [49]. Expressions of hub genes within the training and validating datasets were illustrated for assessment. When the area under the curve (AUC) exceeds 0.9, the hub gene will be regarded as good diagnostic value for MS.
Swiss-model, RCSBPDB, Prank web, ChemDraw, Avogadro tool, Autodock 1.7.1, and Autodock Vina tools, Biovia Discovery Studio client 2021.
The 3D crystal structure of the respective proteins was obtained from the respective genes by using the PDB IDs were downloaded from the RCSB Protein Data Bank (PDB) [50] (Table 1). The missing residues were remodelled by the Swiss-Model web server and downloaded in the pdb format [51]. Then the binding sites were identified by using the server Prank web [52], the pdb format of protein was inserted in the Software Auto Dock Tools 1.7.1, water molecules and hat atoms were removed. The receptor has also been screened for them issuing aminoacid residues, repaired missing residues, Kollman charges, and only Polar hydrogens were added. The grid was fixed using the Autogrid utilityin such a way that to covered the whole receptor, and later saved the grid dimension file [53].The structures of ligands were drawn by using ChemDraw (Fig.1) and copied as SMILES, followed by loading them into Avogadro software for converting them from 2D to 3D structures in pdb format, then pdb format of the ligand was inserted in Auto Dock Tools 1.7.1, then the root of the ligand molecule was detected and chosen. Finally, the ligand molecule was saved in the pdbqt format.
There are two ways to run Auto Dock Vina: from the Autodock tool or the command line (cmd). For running Autodock Vina, the configuration file was prepared, the grid dimension file saved previously consists of active site, x, y, and z coordinates, and n-pointsof the protein. That data was inserted into the config file; aconfig file was created which included the information of the active site of the protein, also it includes the log file in the form of .txt format and output file in the form of pdbqt. Autodock Vinawas run on the command line. The command “vina.exe -- config config.txt” was given to run Auto Dock Vina once the run was completed, docked coordinates were out in the