Data Mining Problems in Medicine
C. Groselj
University Medical Center, Nuclear Medicine Department, Ljubljana, Slovenia
ciril.groselj@kclj.si
Abstract
Any retrospective investigation based on patient data rests on searching for patients by
problem or sign, not by name. With a computer-archived database properly encoded by
problem, the data mining process would be easy: one would only need to enter the request
to obtain the proper data in a short time.
Medical archives are frequently based on paper records only, with the patient's name as the
entry key. Finding the proper record in such an archive requires a detective's strategy. The
process continues with collecting the usually enormous amount of paper, finding the
appropriate records within it, and finally encoding them and arranging them in a table. The
whole process can be separated into patient, paper and data mining. Because they are so
slow, these phases can be the most time-consuming part of an investigation based on
medical data. The author describes his data mining experience.
Key words: data mining, medicine, coronary artery disease, data bank
1. Introduction
Any retrospective investigation based on patients' medical data has four main phases:
planning the study, data mining, processing the data and interpreting the results. For an
investigation to be carried out well, each of these phases is equally important.
The data mining process starts with defining the pool in which a sufficient number of
patients fulfilling the selected criteria is expected to be found. It continues with identifying
the planned number of such patients, collecting their records, verifying the relevance of each
patient and record, extracting the proper data, encoding the data, both qualitative and
quantitative, and arranging them in a table. The data are then ready for processing.
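The encoding and tabulation steps can be illustrated with a minimal Python sketch. It is not the author's procedure: the field names, the categories and the integer codes in CODE_BOOK are hypothetical and chosen only to show how qualitative and quantitative values might end up in one table.

```python
import csv
import io

# Hypothetical code book: maps qualitative findings to integer codes.
# The fields, categories and codes are illustrative, not taken from the study.
CODE_BOOK = {
    "gender": {"female": 0, "male": 1},
    "chest_pain": {"none": 0, "atypical": 1, "typical": 2},
}

def encode_record(record):
    """Turn one abstracted patient record into a row of numbers."""
    row = {}
    for field, value in record.items():
        if field in CODE_BOOK:
            # Qualitative value: look up its code; None marks an unknown category.
            row[field] = CODE_BOOK[field].get(value.strip().lower())
        else:
            # Quantitative value: keep it as a number.
            row[field] = float(value)
    return row

def to_table(records):
    """Arrange the encoded rows as CSV text, one patient per line."""
    rows = [encode_record(r) for r in records]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Two invented records, as they might be copied from paper files.
records = [
    {"age": "61", "gender": "Male", "chest_pain": "typical"},
    {"age": "54", "gender": "female", "chest_pain": "Atypical"},
]
table = to_table(records)
print(table)
```

Keeping the code book separate from the records means that the same qualitative description is always encoded the same way, whoever transcribes the paper file.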
If a computerized medical data bank of suitable patients exists, the main steps of this
process are easy. One only needs to enter a request specifying the patients' diagnosis, age,
gender or the results of the investigations performed, and the computer returns a list of
roughly suitable data, depending on the level of data encoding.
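Such a request can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the PatientEntry fields, the diagnosis label "CAD" and the find_patients function are hypothetical and do not describe any existing data bank.

```python
from dataclasses import dataclass

@dataclass
class PatientEntry:
    # Hypothetical minimal entry of an encoded data bank.
    patient_id: int
    diagnosis_code: str
    age: int
    gender: str

def find_patients(bank, diagnosis_code, min_age=None, max_age=None, gender=None):
    """Return the entries that match the diagnosis and any optional criteria."""
    matches = []
    for entry in bank:
        if entry.diagnosis_code != diagnosis_code:
            continue
        if min_age is not None and entry.age < min_age:
            continue
        if max_age is not None and entry.age > max_age:
            continue
        if gender is not None and entry.gender != gender:
            continue
        matches.append(entry)
    return matches

# Invented entries for demonstration.
bank = [
    PatientEntry(1, "CAD", 58, "M"),
    PatientEntry(2, "CAD", 71, "F"),
    PatientEntry(3, "OTHER", 60, "M"),
]
candidates = find_patients(bank, "CAD", min_age=50, max_age=65)
print([p.patient_id for p in candidates])
```

The result is only as precise as the encoding: a search can select on fields that were encoded, so the returned list still has to be checked against the full records.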
If we operate only with a paper archive, the data mining process can be more difficult…
2. The Medical Data Bank
In general, any patient's medical process consists of diagnosis and treatment. The process
takes place in an office, at the bedside in hospital, or in diagnostic or interventional facilities.
The majority of diagnostic results are images. All on-line remarks and the majority of
diagnostic, therapeutic and final results are described qualitatively. The usual final report is a
review in which the problem, findings, interventions and further suggestions are briefly
described. For the needs of epidemiologists, the final diagnosis is encoded.
There are probably few ways to computerize all these data for the purpose of creating a
data bank. All descriptive records, numerical results, digital image records or their
descriptions should be collected in a computer database with the patient's name or diagnosis
as the access key.
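One possible layout for such a database is sketched below with Python's built-in sqlite3 module. The table names, column names and sample rows are assumptions made for illustration; the paper does not prescribe a schema. The two indexes reflect the two access keys mentioned above, the patient's name and the diagnosis.

```python
import sqlite3

# In-memory database for illustration; a real data bank would use a file or server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE record (
    record_id      INTEGER PRIMARY KEY,
    patient_id     INTEGER NOT NULL REFERENCES patient(patient_id),
    diagnosis_code TEXT NOT NULL,
    kind           TEXT NOT NULL,  -- 'report', 'numeric' or 'image'
    content        TEXT NOT NULL   -- text, a number as text, or an image description
);
-- The two access keys: patient name and diagnosis.
CREATE INDEX idx_patient_name ON patient(name);
CREATE INDEX idx_record_diagnosis ON record(diagnosis_code);
""")

# Invented sample rows.
conn.executemany("INSERT INTO patient VALUES (?, ?)",
                 [(1, "Patient A"), (2, "Patient B")])
conn.executemany(
    "INSERT INTO record (patient_id, diagnosis_code, kind, content) VALUES (?, ?, ?, ?)",
    [
        (1, "CAD", "report", "final report text"),
        (1, "CAD", "image", "description of a scan"),
        (2, "OTHER", "numeric", "4.2"),
    ],
)

def records_by_diagnosis(code):
    """Open the data bank by diagnosis, independent of patient name."""
    cur = conn.execute(
        "SELECT p.name, r.kind, r.content FROM record r "
        "JOIN patient p ON p.patient_id = r.patient_id "
        "WHERE r.diagnosis_code = ? ORDER BY r.record_id",
        (code,),
    )
    return cur.fetchall()

def records_by_name(name):
    """Open the data bank by patient name, as a paper archive would."""
    cur = conn.execute(
        "SELECT r.diagnosis_code, r.kind, r.content FROM record r "
        "JOIN patient p ON p.patient_id = r.patient_id "
        "WHERE p.name = ? ORDER BY r.record_id",
        (name,),
    )
    return cur.fetchall()

print(records_by_diagnosis("CAD"))
print(records_by_name("Patient B"))
```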
Proceedings of the 15th IEEE Symposium on Computer-Based Medical Systems (CBMS 2002)
1063-7125/02 $17.00 © 2002 IEEE