Identification of Key Modules of Lung Cancer in Gene Regulatory Network using Greedy Modularity Optimization Approach (Scientific Article, Ministry of Science)
Cancer is a complex and dangerous disease in which cells begin to grow uncontrollably. Mutated genes in some cells cause abnormalities in the cell. These abnormalities are transferred to other genes through specific gene-gene interactions, disrupting normal cell function and ultimately leading to cancer. In cancer, modules are clusters of genes and regulatory molecules that play a role in cancer initiation and progression. These modules usually have a specific gene sequence as a central unit that is important in controlling and regulating cancer-related cellular processes.
In this study, a novel network-based method called mdGRN is proposed for identifying modules that contribute to the occurrence of lung cancer in the gene regulatory network. First, a lung cancer regulatory network is constructed from gene expression data and regulatory interactions. Then, communities related to lung cancer are identified using a greedy modularity optimization approach. Next, the obtained communities are ranked using influence diffusion metrics in the network. Finally, the top-ranked communities are reported as effective modules.
To assess the efficacy of the proposed method, the standard Cancer Genome Atlas (TCGA) database and four classifiers (decision tree, k-nearest neighbors, support vector machine, and random forest) were used. The results show that the proposed mdGRN method outperforms other methods in identifying cancer modules in terms of the average harmonic mean metric with the support vector machine classifier. In terms of the AUC metric, the proposed method achieved a value of 0.997 with the random forest classifier, indicating better performance than previous methods in identifying cancer modules.
Furthermore, the number of genes identified by the top module is compared with the numbers identified by previous computational and network-based methods. The results show that the top-ranked module, in addition to containing a considerable number of driver genes, contains unique genes that have not been identified by other methods.
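The community-detection and ranking steps described above can be illustrated with a minimal sketch. It is not the mdGRN implementation: the edge list is a toy network with placeholder gene labels (`g1` to `g7`), the function names `greedy_modularity_modules` and `rank_modules` are hypothetical, the merging loop is a naive agglomerative variant of greedy modularity optimization, and mean member degree is used only as a crude stand-in for the influence diffusion metrics, which the abstract does not specify.

```python
from collections import defaultdict

# Toy undirected regulatory network (assumption: placeholder labels, not real data).
EDGES = [
    ("g1", "g2"), ("g2", "g3"), ("g1", "g3"),   # a triangle
    ("g4", "g5"), ("g4", "g6"), ("g4", "g7"),
    ("g5", "g6"), ("g5", "g7"), ("g6", "g7"),   # a 4-clique
    ("g3", "g4"),                               # single bridge between the two groups
]


def greedy_modularity_modules(edges):
    """Naive greedy modularity optimization on an unweighted, undirected graph.

    Every node starts in its own community. In each round the pair of connected
    communities whose merge increases modularity the most is merged; the loop
    stops when no merge yields a positive gain.
    """
    nodes = sorted({n for edge in edges for n in edge})
    m = len(edges)
    label = {n: i for i, n in enumerate(nodes)}  # node -> community id

    while True:
        degree = defaultdict(int)    # community id -> sum of member degrees
        between = defaultdict(int)   # (ci, cj) -> number of edges joining them
        for u, v in edges:
            cu, cv = label[u], label[v]
            degree[cu] += 1
            degree[cv] += 1
            if cu != cv:
                between[(min(cu, cv), max(cu, cv))] += 1

        best_gain, best_pair = 0.0, None
        for (ci, cj), e_ij in sorted(between.items()):
            # Modularity gain of merging ci and cj: e_ij/m - d_i*d_j/(2*m^2)
            gain = e_ij / m - degree[ci] * degree[cj] / (2 * m * m)
            if gain > best_gain + 1e-12:
                best_gain, best_pair = gain, (ci, cj)

        if best_pair is None:
            break
        keep, drop = best_pair
        for n in nodes:
            if label[n] == drop:
                label[n] = keep

    groups = defaultdict(set)
    for n, c in label.items():
        groups[c].add(n)
    return list(groups.values())


def rank_modules(edges, modules):
    """Rank modules by mean member degree (a stand-in score, highest first)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(
        modules,
        key=lambda mod: sum(degree[n] for n in mod) / len(mod),
        reverse=True,
    )


modules = greedy_modularity_modules(EDGES)
ranked = rank_modules(EDGES, modules)
for module in ranked:
    print(sorted(module))
```

On this toy graph the procedure separates the triangle from the 4-clique and ranks the clique first, because its members have the higher mean degree. For real networks, a library routine such as NetworkX's `greedy_modularity_communities` would be a more practical choice than this quadratic-time loop.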