Methodology and tools for creating training samples for artificial intelligence systems for recognizing lung cancer on CT images
https://doi.org/10.46563/0044-197X-2020-64-6-343-350
Abstract
Introduction. Medical imaging techniques can detect many diseases at early stages of their development, improving patient survival. Artificial intelligence (AI) systems are a promising means of improving diagnostic quality, but they require high-quality annotated and marked-up sets of medical images.
The purpose of the study was to develop a methodology and software for creating training sets for AI systems.
Material and methods. We compared the performance and accuracy of the main annotation methods and built the information system on the method that proved most efficient on both criteria. To mark up objects of interest, we used the cluster model of lesion localization previously developed by the authors. The software was written in C++ and Kotlin.
Results. A structured annotation template with an accompanying glossary of terms became the basis of the information system. The system consists of three interacting modules: two run on a remote server, and one runs on the end user's personal computer or mobile device. The first module is a web service responsible for the workflow logic. The second module, a web server, handles interaction with client applications: it identifies users and manages the connections to the database and the Picture Archiving and Communication System (PACS). The front-end module is a web application with a graphical interface that assists the end user in marking up and annotating images.
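As an illustration only, a structured annotation template backed by a glossary can be sketched in C++ as a record whose descriptive field must come from a closed vocabulary. The abstract does not describe the actual data model, so every type, field, and function name below (Glossary, Finding, Annotation, invalidFindings), as well as the validity rules, is a hypothetical assumption rather than the authors' implementation.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical glossary: the closed vocabulary an annotator may choose from.
struct Glossary {
    std::set<std::string> terms;
    bool contains(const std::string& term) const { return terms.count(term) > 0; }
};

// One marked-up finding: a position in the CT volume plus a glossary term.
// The field layout is an assumption made for this sketch.
struct Finding {
    int slice;           // axial slice index, assumed to start at 0
    double x_mm;         // in-plane position, millimetres
    double y_mm;
    double diameter_mm;  // approximate lesion size, millimetres
    std::string term;    // description, expected to be a glossary term
};

// A filled-in template for one study.
struct Annotation {
    std::string study_id;
    std::vector<Finding> findings;
};

// Returns the indices of findings that break the template: a term outside
// the glossary, a non-positive diameter, or a negative slice index.
std::vector<std::size_t> invalidFindings(const Annotation& a, const Glossary& g) {
    std::vector<std::size_t> bad;
    for (std::size_t i = 0; i < a.findings.size(); ++i) {
        const Finding& f = a.findings[i];
        const bool ok = g.contains(f.term) && f.diameter_mm > 0.0 && f.slice >= 0;
        if (!ok) {
            bad.push_back(i);
        }
    }
    return bad;
}
```

Under these assumptions, a client could call invalidFindings before submitting an annotation and highlight the returned entries for the annotator. Restricting descriptions to glossary terms is what keeps labels comparable across annotators.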
Conclusions. An algorithmic basis and a software package have been created for annotation and markup of CT images. The resulting information system was used in a large-scale lung cancer screening project for the creation of medical imaging datasets.
About the Authors
Nikolay S. Kulberg
Russian Federation
MD, Ph.D., head of the Department, Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies, Moscow, 109029, Russia.
e-mail: kulberg@npcmr.ru
Maxim A. Gusev
Russian Federation
Roman V. Reshetnikov
Russian Federation
Alexey B. Elizarov
Russian Federation
Vladimir P. Novik
Russian Federation
Sergey B. Prokudaylo
Russian Federation
Yuriy N. Philippovich
Russian Federation
Victor A. Gombolevsky
Russian Federation
Anton V. Vladzymyrskyy
Russian Federation
Natalya N. Kamynina
Russian Federation
Sergey P. Morozov
Russian Federation
For citations:
Kulberg N.S., Gusev M.A., Reshetnikov R.V., Elizarov A.B., Novik V.P., Prokudaylo S.B., Philippovich Yu.N., Gombolevsky V.A., Vladzymyrskyy A.V., Kamynina N.N., Morozov S.P. Methodology and tools for creating training samples for artificial intelligence systems for recognizing lung cancer on CT images. Health care of the Russian Federation. 2020;64(6):343-350. (In Russ.) https://doi.org/10.46563/0044-197X-2020-64-6-343-350