ISSN: 2321-9653
Estd : 2013
Ijraset Journal For Research in Applied Science and Engineering Technology

Prediction of Diabetes Using Ensemble Learning

Authors: V. Joe Nithin, Prof. S. Pallam Setty

DOI Link: https://doi.org/10.22214/ijraset.2022.47114

Abstract

Diabetes mellitus is a chronic condition that affects the everyday life of those who have it. It cannot be permanently cured; treatment can only keep blood glucose levels under control. As the proverb goes, “prevention is better than cure”: this model aims to predict the probability of developing the condition early enough to either avoid or delay it. An ensemble method is used for this prediction. Machine learning classification models are trained and ranked by accuracy, and hyperparameters are tuned for the five most accurate models. The classifiers are compared and then subjected to voting to choose the best possible prediction. Voting is carried out using both hard voting and soft voting. The results obtained are better than those of the individual classifiers.

Introduction

I. INTRODUCTION

Diabetes mellitus is a chronic condition in which the pancreas does not produce enough insulin, or the body cannot effectively use the insulin it produces, to regulate blood glucose. Over the long term, high glucose levels are associated with damage to the body and failure of various organs and tissues. According to the World Health Organization, the number of people with diabetes rose from 108 million in 1980 to 422 million in 2014. Moreover, in 2016, diabetes was the direct cause of an estimated 1.6 million deaths [1].

Approximately 537 million adults aged 20 to 79 are living with diabetes. Almost half of them (about 240 million) are undiagnosed. The total number of people living with diabetes is projected to rise to 643 million by 2030 and 783 million by 2045 [2].

Machine learning algorithms can predict an outcome by analysing the factors that determine it; weather forecasting is one familiar application. Classification algorithms such as decision trees and regression models are used for this purpose. Existing techniques for diabetes prediction apply several classification methods and then choose one of them manually according to its prediction accuracy. The proposed method instead makes that choice automatically by stacking models and selecting the best one by voting, an approach known as an ensemble of models. An ensemble combines different classifiers, whose results are then used as a classification model in their own right; this often yields higher accuracy than a single classifier. The healthcare sector maintains large databases that such techniques can use as part of big data analytics, contributing significantly to better healthcare. Government healthcare initiatives such as Ayushman Bharat [3], Aarogyasri [4] and the family doctor system [5] could benefit from this approach with only slight changes.

II. METHODOLOGY

Machine learning techniques already exist for basic classification of data. These classifiers learn from the input parameters and predict the likely outcome. Existing methods simply apply a single classifier that the developer believes to be the best. The proposed technique uses an ensemble, that is, the collective use of several predefined classifiers from which the algorithm itself chooses the best. Ensemble learning [6] evaluates the data through different classifier models. Here, linear discriminant analysis, logistic regression, the CatBoost classifier, the random forest classifier, the gradient boosting classifier, the extra trees classifier and the AdaBoost classifier are used for comparison.
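
The comparison of candidate classifiers can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are a synthetic stand-in generated with `make_classification` (768 rows and 8 features are assumptions chosen to resemble the size of the dataset), the estimator settings are arbitrary, and CatBoost is omitted because it lives in a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic stand-in data; replace with the real features and labels.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)

# Candidate classifiers named in the text (CatBoost left out, see lead-in).
candidates = {
    "lda": LinearDiscriminantAnalysis(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
    "adaboost": AdaBoostClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Rank the candidates from most to least accurate.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, accuracy in ranking:
    print(f"{name:20s} {accuracy:.3f}")
```

The printed accuracies depend entirely on the synthetic data and say nothing about the results reported in this paper.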

The constructed models are then stacked to create a model of models, which chooses the best model based on voting criteria. In the proposed technique, stacking is applied to the aforementioned models, but it can be used with any number of models, at the cost of increased runtime.
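
As a rough sketch of stacking, the example below uses scikit-learn's `StackingClassifier`, in which a final estimator learns from the cross-validated predictions of the base models. This is the standard form of stacked generalization and may differ from the exact procedure used here; the two base models, the logistic regression meta-learner and the synthetic data are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in data; replace with the real features and labels.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)

# Hold out 25% of the records for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Base models feed their cross-validated predictions to the final estimator.
stack = StackingClassifier(
    estimators=[
        ("lda", LinearDiscriminantAnalysis()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

# Accuracy on the held-out records.
stack_accuracy = stack.score(X_test, y_test)
print(f"stacked accuracy: {stack_accuracy:.3f}")
```

Adding more base models to `estimators` works the same way, but each one is refitted for every cross-validation fold, which is where the extra runtime comes from.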

The stacked models are ranked by accuracy, and the hyperparameters of each model are tuned using cross-validation to improve its accuracy. The five most accurate models are then considered for voting to select the best method for the given data. The voting methods used are:

  1. Soft Voting
  2. Hard Voting
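
The hyperparameter tuning step that precedes voting could look like the sketch below, which uses scikit-learn's `GridSearchCV` on a single model. The choice of a random forest, the parameter grid and the synthetic data are assumptions for illustration only; the search strategy and grids actually used are not stated in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Assumption: synthetic stand-in data; replace with the real features and labels.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)

# Hypothetical grid: every combination is scored by 5-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

# Best combination found and its mean cross-validated accuracy.
print(search.best_params_, round(search.best_score_, 3))
```

The same search would be repeated for each of the top-ranked models, with a grid suited to that model.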

Soft voting chooses the class with the highest average predicted probability across the classifiers. Hard voting chooses the class predicted by the majority of the classifiers. The choice of model may differ depending on exploratory data analysis of the training dataset. The dataset used to evaluate the model was the Pima Indians diabetes dataset, which contains records of 768 women. 75% of the records were used to train the model and 25% to test it.
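
The difference between the two voting rules can be seen in a small plain-Python sketch. The three probability vectors are made-up values for one hypothetical patient, chosen so that the two rules disagree; the function names `hard_vote` and `soft_vote` are likewise illustrative.

```python
from collections import Counter


def hard_vote(labels):
    """Return the label predicted by the most classifiers.

    Ties are resolved in favour of the label that appears first.
    """
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas):
    """Return the class index with the highest mean predicted probability.

    `probas` holds one list of class probabilities per classifier.
    """
    n_classes = len(probas[0])
    mean = [sum(p[c] for p in probas) / len(probas) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Made-up outputs of three classifiers: [P(no diabetes), P(diabetes)].
probas = [[0.1, 0.9], [0.6, 0.4], [0.6, 0.4]]

# Each classifier's own prediction is its most probable class.
labels = [max(range(len(p)), key=p.__getitem__) for p in probas]

hard = hard_vote(labels)   # two of three classifiers predict class 0
soft = soft_vote(probas)   # mean P(class 1) = 1.7 / 3, about 0.567

print(labels, hard, soft)  # [1, 0, 0] 0 1
```

Here one very confident classifier outweighs two mildly confident ones under soft voting, while hard voting counts each classifier equally. In scikit-learn, both rules are available through `VotingClassifier` with `voting="hard"` or `voting="soft"`.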

Conclusion

The final model was chosen by selecting a suitable model from the stacked generalization and then performing voting, which yielded 90% accuracy, compared with less than about 78% for the individual classifiers. The ensemble method thus proved better than individual classifiers, whose results must be checked and compared manually each time.

References

[1] World Health Organization, "Diabetes," who.int (accessed 16 December 2021).
[2] International Diabetes Federation, https://www.idf.org/aboutdiabetes/what-is-diabetes/facts-figures (accessed 16 December 2021).
[3] Ayushman Bharat, official website, "Ayushman Bharat | HWC," nhp.gov.in.
[4] Aarogyasri Scheme, "Aarogyasri Health Care Trust - Quality Medicare For All," telangana.gov.in.
[5] Family doctor, "Andhra Pradesh CM launches Family Doctor system to provide better health services," medicaldialogues.in.
[6] "A Gentle Introduction to Ensemble Learning Algorithms," machinelearningmastery.com.
[7] Fernando López, "Ensemble Learning: Stacking, Blending & Voting," Towards Data Science.

Copyright

Copyright © 2022 V. Joe Nithin, Prof. S. Pallam Setty. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Authors : Joe Nithin

Paper Id : IJRASET47114

Publish Date : 2022-10-18

ISSN : 2321-9653

Publisher Name : IJRASET

About Us

International Journal for Research in Applied Science and Engineering Technology (IJRASET) is an international peer reviewed, online journal published for the enhancement of research in various disciplines of Applied Science & Engineering Technologies.
