Identification of KLF9 and FOSL2 as Endoplasmic Reticulum Stress Signature Genes in Osteoarthritis with Multiple Machine Learning Approaches
keywords: Osteoarthritis, endoplasmic reticulum stress, machine learning
Objective: This study aims to identify endoplasmic reticulum (ER) stress signature genes in osteoarthritis (OA) using machine learning approaches, to provide new insights and methods for OA treatment. Methods: We obtained the GSE55235 and GSE98918 datasets from the Gene Expression Omnibus (GEO) database and identified ER stress-related genes from the GeneCards database. Using R software, we performed batch correction, extracted the OA ER stress-related genes, and conducted differential expression analysis. We performed Gene Ontology (GO) functional analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathway analysis, and gene set enrichment analysis (GSEA) on the differentially expressed genes (DEGs). Additionally, we used machine learning algorithms, namely Least Absolute Shrinkage and Selection Operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE), together with weighted gene co-expression network analysis (WGCNA), to screen OA ER stress signature genes. Human chondrocytes were used to establish an OA model, and untreated cells served as the control. Results: We obtained 236 DEGs related to OA ER stress. GO and KEGG enrichment analyses showed that these genes were mainly involved in positive regulation of leukocyte activation, the collagen-containing extracellular matrix, the phagosome, and other biological functions or signaling pathways. GSEA-GO analysis revealed that ER stress genes were significantly enriched in negative regulation of nucleobase-containing compound metabolic processes (NES = -2.50, P < 0.001), while OA ER stress genes were significantly enriched in the processing and presentation of peptide antigens (NES = 2.40, P < 0.001). By intersecting the results of WGCNA, LASSO regression, and SVM-RFE, we identified KLF9 and FOSL2 as potential OA ER stress signature genes, and validation supported their accuracy as OA signature genes. KLF9 expression in the OA group was higher than that in the control group, while FOSL2 expression was lower (P < 0.05). Conclusion: Machine learning and co-expression network analysis can effectively identify the genes and potential factors characteristic of ER stress in OA, which can help elucidate its pathogenesis and provide a new direction for better clinical treatment.
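The final screening step described in the abstract, taking the genes common to the WGCNA, LASSO, and SVM-RFE results, can be illustrated with a minimal Python sketch. This is not the authors' code (the study used R). The function name `intersect_candidates` and the three input lists are assumptions made for illustration; the `GENE_*` entries are placeholders, and the lists are arranged only so that the overlap matches the two genes reported in the abstract.

```python
def intersect_candidates(*gene_lists):
    """Return the genes present in every input list, sorted alphabetically."""
    gene_sets = [set(genes) for genes in gene_lists]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


# Hypothetical candidate lists; GENE_* names are placeholders, not study results.
wgcna_module_genes = ["KLF9", "FOSL2", "GENE_A", "GENE_B"]
lasso_genes = ["KLF9", "FOSL2", "GENE_C"]
svm_rfe_genes = ["FOSL2", "KLF9", "GENE_A", "GENE_D"]

# Only genes selected by all three methods are kept as signature candidates.
signature = intersect_candidates(wgcna_module_genes, lasso_genes, svm_rfe_genes)
print(signature)
```

Requiring agreement across all three methods is a conservative choice: a gene selected by only one or two of them is dropped, which tends to shrink the candidate list to a small set suitable for experimental validation.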
reference: Vol. 43, 2024, No. 4, pp. 777–796