Automatic Classification of BI-RADS in Mammography Reports Using Data Fusion

Zahabi, M; Shiri, ME; Haj Seyed Javadi, H; Broumandzadeh, M

doi:10.61186/armaghanj.29.3.365

Volume 29, Issue 3 (4-2024)

__Armaghane Danesh__ 2024, 29(3): 365-385

M Zahabi¹, ME Shiri², H Haj Seyed Javadi³, M Broumandzadeh⁴

1- Department of Computer Engineering, Borujerd Branch, Islamic Azad University, Borujerd, Iran
2- Department of Mathematics and Computer Science, Amir-Kabir University of Technology, Tehran, Iran, shiri@aut.ac.ir
3- Department of Mathematics and Computer Science, Shahed University, Tehran, Iran
4- Department of Computer Engineering and Information Technology, Payam-e-Nour University, Tehran, Iran

Abstract:

Background & aim: Breast cancer is one of the most common cancers in women and a leading cause of cancer-related death, and mammography is the primary imaging method for early detection of breast masses. Rapid, accurate diagnosis is a major concern for physicians and healthcare centers, so this study aimed to classify BI-RADS categories automatically from mammography reports using data fusion.

Methods: This descriptive, analytical, retrospective study was conducted in 2023. Mammography reports and electronic patient records were extracted from the picture archiving and communication system and from the records available at the medical training center of Shahidzadeh Hospital in Behbahan, Iran. The dataset comprised the mammography reports and electronic records of 250 patients with sufficiently complete information. The proposed method was implemented in Python in the Visual Studio Code environment, and cross-validation was used to evaluate the quality and validity of the results.
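The cross-validation step can be illustrated with a minimal sketch. The abstract does not state the number of folds or whether the splits were stratified, so the five stratified folds, the function name `stratified_k_fold`, and the synthetic labels below are all assumptions made for illustration only.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with roughly equal class mix per fold."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Shuffle each class, then deal its samples round-robin across the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical labels: 250 reports spread evenly over five categories.
labels = [i % 5 for i in range(250)]
splits = list(stratified_k_fold(labels, k=5))
```

Every report appears in exactly one test fold, so averaging a classifier's score over the folds uses each of the 250 records for evaluation once.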

Results: The results confirmed that the proposed approach, which combines Word2vec with TF-IDF and integrates the result with hospital information system (HIS) data, had a significant impact on the accuracy of medical text classification. The Word2vec output vectors were used for BI-RADS level classification with and without TF-IDF, and with and without HIS integration, using the CNN, MLP, DT, and k-NN classifiers. The results were compared using accuracy, precision, sensitivity, positive predictive value, negative predictive value, and F1 score. The best accuracy of the proposed method, obtained with the multilayer perceptron (MLP) classifier, was 98.74%; without HIS, the same classifier reached 92.23%.
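One plausible reading of this fusion is sketched below: word vectors are averaged with TF-IDF weights to form a report vector, which is then concatenated with structured patient features. The abstract does not give the exact weighting or fusion scheme, so this is an assumption, not the authors' implementation. The tiny three-dimensional embeddings stand in for a trained Word2vec model, and the HIS features (age and a family-history flag) are hypothetical.

```python
import math
from collections import Counter


def idf_weights(tokenized_docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def tfidf_weighted_vector(tokens, embeddings, idf, dim):
    """Average the word vectors of a report, weighted by each term's TF-IDF."""
    counts = Counter(t for t in tokens if t in embeddings)
    total = [0.0] * dim
    weight_sum = 0.0
    for term, count in counts.items():
        weight = (count / len(tokens)) * idf.get(term, 1.0)
        weight_sum += weight
        for d in range(dim):
            total[d] += weight * embeddings[term][d]
    if weight_sum == 0.0:
        return total  # no known words: return the zero vector
    return [value / weight_sum for value in total]


def fuse(text_vector, his_features):
    """Feature-level fusion: concatenate text and structured features."""
    return list(text_vector) + list(his_features)


# Toy corpus and hypothetical embeddings (a real model would supply these).
docs = [["irregular", "mass", "left", "breast"], ["no", "mass", "seen"]]
embeddings = {
    "irregular": [1.0, 0.0, 0.0],
    "mass": [0.0, 1.0, 0.0],
    "breast": [0.0, 0.0, 1.0],
}
idf = idf_weights(docs)
text_vec = tfidf_weighted_vector(docs[0], embeddings, idf, dim=3)
his = [52.0, 1.0]  # hypothetical HIS features: age, family-history flag
fused = fuse(text_vec, his)
```

The fused vector would then be passed to a classifier such as an MLP. In this toy corpus, "mass" occurs in both reports and so receives a lower weight than "irregular", which is the effect TF-IDF weighting is meant to have on uninformative terms.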

Conclusion: Combining Word2vec with TF-IDF can increase the accuracy of text classification, and patients' medical history, which is important in diagnosing disease, can improve it further. The results indicate that classification should not rely on medical reports alone; other clinical information and patient history should also be used. Using HIS data alongside medical text reports can therefore improve BI-RADS classification and have a positive effect on diagnosis and treatment processes.

Keywords: medical text classification, breast cancer, feature extraction, BI-RADS, HIS

Type of Study: Research | Subject: General
Received: 2023/11/08 | Accepted: 2024/02/26 | Published: 2024/05/20

Zahabi M, Shiri M, Haj Seyed Javadi H, Broumandzadeh M. Automatic Classification of BI-RADS in Mammography Reports Using Data Fusion. armaghanj 2024; 29(3): 365-385
URL: http://armaghanj.yums.ac.ir/article-1-3554-en.html

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
