Outlier detection is an important concept in data analysis, and the complex relationships among patient symptoms, diagnoses and behavior make medical data one of the most promising areas for outlier mining. This paper describes how outliers can be detected using statistical methods. Outlier detection matters because outliers often carry significant (and sometimes critical) information in a wide variety of application domains. Numerous formulations of the outlier detection problem have been explored in diverse disciplines such as statistics, machine learning, data mining and information theory. The study of medical data using data mining (DM) techniques, however, remains a virtually unexplored frontier that needs particular attention. In this study, the Pima data set was used in a simulation carried out with TANAGRA. A total of 193 outliers were detected across the following statistics: leverage, R-standard, R-student, DFFITS, Cook’s D and covariance ratio. The results suggest that the unusual behavior of outliers facilitates the exploration of valuable knowledge hidden in the domain and helps decision makers provide improved, reliable and efficient healthcare services.
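To make the six statistics named above concrete, the following is a minimal sketch of how they can be computed for a least-squares fit. It makes several assumptions that are not taken from the paper: it uses a single predictor for brevity (the analysis of the Pima data would involve several), the function names `influence_diagnostics` and `flag_outliers` are hypothetical, and the cutoffs are common textbook rules of thumb that may differ from the thresholds TANAGRA applies.

```python
import math


def influence_diagnostics(x, y):
    """Influence statistics for the simple linear fit y = a + b*x.

    Returns one dict per observation. Assumes len(x) > 3 and that
    x contains at least two distinct values.
    """
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance

    rows = []
    for xi, e in zip(x, resid):
        # Leverage: diagonal of the hat matrix (closed form for one predictor).
        h = 1.0 / n + (xi - x_mean) ** 2 / sxx
        # R-standard: internally studentized residual.
        r_std = e / math.sqrt(s2 * (1.0 - h))
        # R-student: externally studentized residual.
        r_stu = r_std * math.sqrt((n - p - 1) / (n - p - r_std ** 2))
        # DFFITS: scaled change in the fitted value when the point is deleted.
        dffits = r_stu * math.sqrt(h / (1.0 - h))
        # Cook's D: overall change in the coefficients when the point is deleted.
        cooks_d = r_std ** 2 / p * h / (1.0 - h)
        # Covariance ratio: effect of the point on the coefficient covariance.
        covratio = 1.0 / (((n - p - 1 + r_stu ** 2) / (n - p)) ** p * (1.0 - h))
        rows.append({
            "leverage": h,
            "r_standard": r_std,
            "r_student": r_stu,
            "dffits": dffits,
            "cooks_d": cooks_d,
            "covratio": covratio,
        })
    return rows


def flag_outliers(rows, p=2):
    """Indices flagged by each statistic, using assumed rule-of-thumb cutoffs."""
    n = len(rows)
    tests = {
        "leverage": lambda r: r["leverage"] > 2.0 * p / n,
        "r_standard": lambda r: abs(r["r_standard"]) > 2.0,
        "r_student": lambda r: abs(r["r_student"]) > 2.0,
        "dffits": lambda r: abs(r["dffits"]) > 2.0 * math.sqrt(p / n),
        "cooks_d": lambda r: r["cooks_d"] > 4.0 / n,
        "covratio": lambda r: abs(r["covratio"] - 1.0) > 3.0 * p / n,
    }
    return {name: [i for i, r in enumerate(rows) if test(r)]
            for name, test in tests.items()}


# Illustrative data (not from the Pima data set): the last point is off-trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 17.2, 18.9, 35.0]
flags = flag_outliers(influence_diagnostics(x, y))
```

Because each statistic flags observations independently, the same observation can be flagged more than once, so a total count across statistics is not necessarily a count of distinct records.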