Conference Proceedings

Evaluating classification power of linked admission data sources with text mining

S Kocbek, L Cavedon, D Martinez, C Bain, C MacManus, G Haffari, I Zukerman, K Verspoor, K Barbuto, L Schaper, K Verspoor

CEUR Workshop Proceedings | Published: 2015


Lung cancer is a leading cause of death in developed countries. This paper presents a text mining system that uses Support Vector Machines to detect lung cancer admissions. We evaluate the system's performance across different clinical data sources, starting with radiology reports and then adding other sources, such as pathology reports, patient demographic information and hospital admission information. Results show that mining over linked data sources significantly improves classification performance, with a maximum F-score improvement of 0.057.
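The evaluation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' system: it trains no Support Vector Machine, and the function names (`link_sources`, `f_score`), the admission identifiers, the report snippets and the label vectors are all hypothetical, invented for illustration. It shows only how text from several sources might be linked per admission, and how an F-score difference between two configurations is computed, using the standard definition F = 2PR / (P + R).

```python
def link_sources(admission_ids, *sources):
    """Concatenate the text available for each admission across sources.

    Each source is a dict mapping admission id -> text (hypothetical layout).
    Admissions missing from a source simply contribute no text from it.
    """
    return {
        aid: " ".join(src[aid] for src in sources if aid in src)
        for aid in admission_ids
    }


def f_score(gold, predicted):
    """Balanced F-score (F1) for binary labels given as 0/1 sequences."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up data: two sources keyed by admission id.
radiology = {"a1": "mass in left upper lobe", "a2": "clear lung fields"}
pathology = {"a1": "adenocarcinoma confirmed"}
linked = link_sources(["a1", "a2"], radiology, pathology)

# Toy, made-up labels: 1 = lung cancer admission, 0 = other admission.
gold = [1, 1, 1, 0, 0, 0, 1, 0]
radiology_only_pred = [1, 0, 1, 0, 1, 0, 0, 0]  # classifier on one source
linked_pred = [1, 1, 1, 0, 1, 0, 0, 0]          # classifier on linked sources

improvement = f_score(gold, linked_pred) - f_score(gold, radiology_only_pred)
print(f"F-score improvement: {improvement:.3f}")
```

In a real experiment the two prediction vectors would come from classifiers trained on the single-source and linked-source text respectively; the toy numbers here bear no relation to the 0.057 improvement reported in the paper.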