Hybrid Approach to Outlier Detection in Medical Dataset

Authors

  • Archana Kadam, Assistant Professor, Department of Computer Engineering, PCCOE, Pune, India
  • Sagar G. Powar, Senior Technical Leader, Xpanxion International Pvt. Ltd., Pune, India

DOI:

https://doi.org/10.51983/ajcst-2017.6.2.1785

Keywords:

K-means clustering technique, density-based algorithm, trajectory outlier detection (TRAOD), Heart disease dataset

Abstract

Outlier detection is an important concept in data analysis, and the complex relationships among patient symptoms, diagnoses, and behavior make medical data one of its most promising application areas. Such data typically consists of records with several different types of features, such as patient age, blood group, and weight. Recently, density-based outlier detection has emerged as a viable and scalable alternative to traditional statistical and geometric approaches. Here, a density-based outlier detection algorithm is combined with the K-means partitioning technique to detect outliers, that is, abnormal records, in a heart disease dataset. Doctors can use this analysis to help predict heart disease in a particular patient.
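
To make the general idea concrete, the following is a minimal sketch of how a K-means partition can be followed by a density check inside each partition. It is not the authors' algorithm: the function names, the parameters `eps` and `min_pts`, the two features used (age, resting blood pressure), and the sample records are all assumptions made up for illustration.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct records as initial centroids
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def density_outliers(points, labels, eps, min_pts):
    """Flag points with fewer than min_pts neighbours within eps in their own partition."""
    flagged = []
    for i, p in enumerate(points):
        neighbours = sum(
            1
            for j, q in enumerate(points)
            if j != i and labels[j] == labels[i] and math.dist(p, q) <= eps
        )
        if neighbours < min_pts:
            flagged.append(i)
    return flagged


# Hypothetical (age, resting blood pressure) records, invented for this sketch.
records = [
    (40, 120), (41, 122), (42, 121), (39, 119), (40, 121),  # one dense group
    (65, 150), (66, 152), (64, 151), (65, 149), (66, 150),  # another dense group
    (50, 200),                                              # isolated record
]

labels = kmeans(records, k=2)
flagged = density_outliers(records, labels, eps=5.0, min_pts=2)
print(flagged)  # indices of records with too few close neighbours
```

In this toy data the isolated record has no neighbour within `eps`, so it should be the one flagged regardless of which partition K-means places it in. On real data the features would normally need scaling first, and `eps` and `min_pts` would have to be tuned; both choices are outside what the abstract specifies.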

Published

10-08-2017

How to Cite

Kadam, A., & Powar, S. G. (2017). Hybrid Approach to Outlier Detection in Medical Dataset. Asian Journal of Computer Science and Technology, 6(2), 18–22. https://doi.org/10.51983/ajcst-2017.6.2.1785