Diabetes Prediction using Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbors, Logistic Regression Classifiers

S. Peerbasha; Y. Mohammed Iqbal; Praveen K.P; M. Mohamed Surputheen; A Saleem Raja

doi:10.46947/joaasr542023680

Authors

S. Peerbasha Department of Computer Science, Jamal Mohamed College, Affiliated to Bharathidasan University, Trichy-20, Tamil Nadu, India.
Y. Mohammed Iqbal Department of Computer Science, Jamal Mohamed College, Affiliated to Bharathidasan University, Trichy-20, Tamil Nadu, India.
Praveen K.P Department of Computer Science, Jamal Mohamed College, Affiliated to Bharathidasan University, Trichy-20, Tamil Nadu, India.
M. Mohamed Surputheen Department of Computer Science, Jamal Mohamed College, Affiliated to Bharathidasan University, Trichy-20, Tamil Nadu, India.
A Saleem Raja Information Technology Department, University of Technology and Applied Sciences-Shinas, Sultanate of Oman

DOI:

https://doi.org/10.46947/joaasr542023680

Keywords:

Logistic Regression, Data Mining, K-Nearest Neighbors, Decision Tree, Machine Learning, Random Forest, Support Vector Machine.

Abstract

Diabetes is one of the world's deadliest diseases. It also gives rise to a range of complications, for example heart failure, blindness, and kidney disease. Patients must typically visit a hospital to consult doctors and obtain their reports, spending time and money on every visit. With the development of machine learning techniques, this problem can now be addressed. We have developed a data processing framework that predicts whether or not a patient has diabetes. Being able to foresee the onset of the disease is also crucial for patients. Data mining can extract hidden information from large volumes of diabetes-related data. The main outcome of this research is a theoretical framework that can reliably predict a patient's level of risk for developing diabetes. We applied existing classification methods, namely DT (Decision Tree), RF (Random Forest), SVM (Support Vector Machine), LR (Logistic Regression), and K-NN (K-Nearest Neighbors), to predict the severity of Type-II diabetes in patients. We obtained accuracies of 99% for Random Forest, 98.40% for Decision Tree, 78.54% for Logistic Regression, 77.94% for SVM (using an RBF kernel), and 77.64% for K-NN.
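To make the evaluation idea concrete, the following is a minimal sketch of how one of the listed classifiers, K-Nearest Neighbors, might be scored by hold-out accuracy. It is not the authors' implementation: the data are purely synthetic, the two features (`glucose`, `bmi`) are hypothetical, and the helper names `make_synthetic_patients`, `knn_predict`, and `accuracy` are assumptions introduced only for illustration. The figures it prints have no relation to the accuracies reported above.

```python
import math
import random
from collections import Counter


def make_synthetic_patients(n, seed=0):
    """Generate hypothetical (glucose, bmi) records with a binary label.

    Purely synthetic data for illustration; not the dataset used in the paper.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumption: the positive class is drawn from higher-mean distributions.
        glucose = rng.gauss(140 if label else 100, 15)
        bmi = rng.gauss(33 if label else 26, 4)
        rows.append(((glucose, bmi), label))
    return rows


def knn_predict(train, x, k=5):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of held-out records whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)


# Hold out the last 100 of 400 synthetic records for testing.
data = make_synthetic_patients(400)
train, test = data[:300], data[300:]
acc = accuracy(train, test, k=5)
print(f"k-NN hold-out accuracy: {acc:.2%}")
```

The same hold-out procedure would apply to the other four classifiers: fit on the training split, predict on the held-out split, and compare the resulting accuracy values. In practice the features would usually be scaled before computing distances, which this sketch omits for brevity.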