Published Online: January 2026
Product Name: The IUP Journal of Computer Sciences
Product Type: Article
Product Code: IJCS040126
DOI: 10.71329/IUPJCS/2026.20.1.38-54
Author Name: Ishola Olabisi Bolarinwa, Mistura Mohammed Usman and Akanbi Bello Muhammed
Subject/Domain: Engineering
Pages: 38-54
Breast cancer (BC) has become the most prevalent type of cancer and is a major cause of mortality among women worldwide. The paper compares several machine learning (ML) algorithms to determine which model is most effective for early detection and diagnosis of the disease. To build the prediction model, the study applied five ML techniques, namely Decision Tree (DT), Logistic Regression (LR), Naïve Bayes (NB), Random Forest (RF), and K-Nearest Neighbor (KNN), to a BC dataset. Feature importance was assessed using principal component analysis (PCA), and performance was improved by tuning various hyperparameters. The results showed that RF surpassed the other algorithms in prediction performance with a score of 98.4%, while DT achieved 93.8%. The paper identifies seven significant risk factors for BC: radiation, number of children, age at menopause, residence, smoking, exposure to chemicals and pesticides, and alcohol intake. Explainable artificial intelligence (XAI) was then applied to the best-performing model to improve its interpretability, trust, and transparency.
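The comparison described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation, and several parts are assumptions: the public Wisconsin breast cancer dataset bundled with scikit-learn stands in for the study's dataset (which is not available here), PCA is used as a plain dimensionality-reduction step with an arbitrary component count, the hyperparameter grids are small illustrative choices, and accuracy is assumed as the metric because the abstract does not name one. The numbers it prints are therefore not comparable to the reported 98.4% and 93.8%.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: stand-in data (Wisconsin dataset), not the dataset used in the paper.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The five classifiers named in the abstract, each paired with a small,
# purely illustrative hyperparameter grid (values are assumptions).
candidates = {
    "DT": (DecisionTreeClassifier(random_state=0), {"clf__max_depth": [3, 5, None]}),
    "LR": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1.0, 10.0]}),
    "NB": (GaussianNB(), {"clf__var_smoothing": [1e-9, 1e-7]}),
    "RF": (RandomForestClassifier(random_state=0), {"clf__n_estimators": [100, 200]}),
    "KNN": (KNeighborsClassifier(), {"clf__n_neighbors": [3, 5, 7]}),
}

scores = {}
for name, (clf, grid) in candidates.items():
    # Scale, project onto principal components, then classify.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=10)),  # ASSUMPTION: component count is arbitrary
        ("clf", clf),
    ])
    # 5-fold cross-validated grid search on the training split only.
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # Accuracy of the best refitted pipeline on the held-out test split.
    scores[name] = search.score(X_test, y_test)

# Print the models from highest to lowest held-out accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Wrapping the scaler, PCA, and classifier in a single `Pipeline` keeps preprocessing inside each cross-validation fold, so the held-out score is not inflated by information leaking from the validation data.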
Cancer significantly contributes to mortality rates and poses a substantial challenge to improving global life expectancy (Devi et al., 2024).