
Transportation Issues - Humana HealthCare

  • ckevinkusuma
  • Jun 3, 2021
  • 1 min read

Summary

This project aims to give Humana Healthcare deeper and broader insight into which traits and characteristics are most strongly associated with patients’ likelihood of having transportation issues that prevent them from getting to medical appointments, meetings, or work, or from obtaining necessities for daily living. A second aim is to accurately predict which patients are most likely to have transportation issues.


A generalized linear model is used to determine the most influential variables among the 826 provided by the competition committee. The model identifies 49 variables that are most strongly associated with patients who have transportation issues. Several of them offer insights that lead to the discovery of a subgroup with a distinct trait that separates it from the rest of the observations. This subgroup plays an essential role in formulating the project’s recommendations.
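To make the idea of ranking variables with a generalized linear model concrete, here is a minimal sketch in Python (the project itself was done in R, and its actual selection procedure is not described here). It assumes a logistic-regression GLM fitted by plain gradient descent and ranks roughly standardized variables by the absolute size of their coefficients. The data are synthetic, and the names `var_a`, `var_b`, and `var_c` are hypothetical stand-ins, not variables from the competition dataset.

```python
import math
import random


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic-regression GLM (logit link) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
            err = p - yi                    # gradient of the log-loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic data (assumption): three standard-normal variables, of which
# only var_a strongly drives the outcome and var_b has no effect at all.
random.seed(0)
names = ["var_a", "var_b", "var_c"]
true_w = [2.0, 0.0, 0.5]
X, y = [], []
for _ in range(400):
    row = [random.gauss(0.0, 1.0) for _ in names]
    z = sum(t * v for t, v in zip(true_w, row))
    prob = 1.0 / (1.0 + math.exp(-z))
    X.append(row)
    y.append(1 if random.random() < prob else 0)

w, b = fit_logistic(X, y)

# Rank variables by absolute coefficient size (meaningful only because
# the variables share the same scale).
ranked = sorted(zip(names, w), key=lambda pair: abs(pair[1]), reverse=True)
for name, coef in ranked:
    print(f"{name}: {coef:+.3f}")
```

With 826 candidate variables, a real analysis would more likely rely on significance tests or a penalized fit than on raw coefficient size; the sketch only shows the general shape of the idea.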


The predictive part of this project involves various data cleaning and preparation methods to ready the data for training several machine learning algorithms in RStudio. The Gradient Boosting Machine model outperforms the other models when evaluated against a testing dataset held out from the original data. The best model’s Receiver Operating Characteristic (ROC) score is 75%, which is not very high for most classification modeling. However, because the competition data come from a survey, a ROC score lower than normally accepted is generally acceptable. Survey data are often incomplete and prone to human error, so a high ROC score could itself raise questions about the data’s validity or about bias.
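As a rough illustration of what a ROC score of 75% means, the sketch below assumes the score refers to the area under the ROC curve (AUC). It is written in Python for illustration rather than in the R used for the project, and the labels and scores are hypothetical toy values, not competition data. Under that reading, 0.75 means a randomly chosen patient with a transportation issue receives a higher predicted score than a randomly chosen patient without one about three times out of four.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = patient reported a transportation issue.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.5, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")  # 12 of 16 positive/negative pairs are ordered correctly: 0.75
```

The pairwise form is quadratic in the number of cases, so it suits small examples; a rank-based computation gives the same value on larger datasets.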





 
 
 


©2021 by Christian Kevin Kusuma. Proudly created with Wix.com
