This repository contains code and analyses for building an XGBoost model that classifies individuals with diabetes based on healthcare and lifestyle survey data. The primary focus is feature selection: using a variety of methods to reduce the number of features while maximising the F1 score.
+---notebooks
| +---Feature_Importance_Cheatsheet <- Example code for computing feature importance for an XGBoost model.
| +---Feature_Importance_Workshop <- Trains an XGBoost model and performs feature selection.
| +---load_diabetes_data <- Loads the data from the UCI repository and balances the classes.
|
| README.md <- Quick start guide
The Feature_Importance_Workshop and Feature_Importance_Cheatsheet notebooks run in Google Colab.
This dataset comprises healthcare statistics and lifestyle survey information about individuals in the United States. It was collected by the Centers for Disease Control and Prevention (CDC) and is publicly available through the UCI Machine Learning Repository.
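The class balancing done in load_diabetes_data can be sketched as downsampling the majority class; the column name and tiny inline data below are illustrative assumptions, and the real notebook may balance differently.

```python
# Illustrative sketch of balancing an imbalanced binary label by
# downsampling each class to the size of the smallest class.
import pandas as pd

df = pd.DataFrame({
    "bmi": range(10),
    "Diabetes_binary": [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],  # 7 vs 3: imbalanced
})

n_min = df["Diabetes_binary"].value_counts().min()
balanced = (
    df.groupby("Diabetes_binary")
      .sample(n=n_min, random_state=0)   # draw n_min rows from each class
      .reset_index(drop=True)
)
print(balanced["Diabetes_binary"].value_counts())
```

Downsampling discards majority-class rows, which is acceptable here because the CDC dataset is large; for smaller datasets, upsampling the minority class or class weights would waste less data.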
Citation: Markelle Kelly, Rachel Longjohn, and Kolby Nottingham, The UCI Machine Learning Repository.
The work in this repository has been influenced by a number of helpful articles and tutorials:
- Feature Importance and Feature Selection with XGBoost in Python
- Calculate Feature Importance with Python
- Feature Selection with Real and Categorical Data
- A Guide to 21 Feature Importance Methods and Packages in Machine Learning
- Why You Should Stop Using Recursive Feature Elimination
- Python Feature Importance Libraries
- Best Practice to Calculate and Interpret Model Feature Importance
- XGBoost Feature Importance