theaathish/Stroke-Prediction-using-Machine-Learning

🧠 Stroke Prediction using Machine Learning

Course Project: 191ROE051T - Machine Learning for Robotics

Predicting stroke risk using patient health metrics and machine learning

🌟 Overview

This project implements a machine learning pipeline that predicts the likelihood of a patient having a stroke from their health parameters. The goal is to flag stroke risk early enough to allow timely medical intervention.

Key Metrics (on test set):

  • Accuracy: 95.3%
  • Precision: 0.72
  • Recall: 0.52
  • F1-Score: 0.60
  • AUC-ROC: 0.86
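As a quick sanity check on the figures above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision (0.72) and recall (0.52). The helper name `f1_from_precision_recall` is illustrative and is not taken from the project's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the Key Metrics list above.
reported_precision = 0.72
reported_recall = 0.52

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")  # prints "F1 = 0.60"
```

A recall of 0.52 means that roughly half of the actual stroke cases in the test set are caught, which is worth reading alongside the high accuracy figure.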

✨ Features

  • Comprehensive EDA with interactive visualizations
  • Feature Engineering with domain-specific transformations
  • Multiple ML Models including Random Forest, XGBoost, and LightGBM
  • Hyperparameter Tuning using Optuna
  • Model Explainability with SHAP values
  • Deployment-ready API using FastAPI

🚀 Installation

  1. Clone the repository

    git clone https://github.com/theaathish/Stroke-Prediction-using-Machine-Learning.git
    cd Stroke-Prediction-using-Machine-Learning
  2. Create and activate virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Download the dataset

    • Get the dataset from Kaggle
    • Place healthcare-dataset-stroke-data.csv in the data/ directory
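To confirm the file is in place before running the pipeline, a small check like the following may help. It is a sketch that uses only the standard library. The target column name `stroke` is an assumption based on the public Kaggle dataset, and `class_balance` is a hypothetical helper, not part of this repository.

```python
import csv
from collections import Counter
from pathlib import Path


def class_balance(csv_path, target="stroke"):
    """Count how many rows fall into each value of the target column.

    The default column name "stroke" is an assumption about the CSV header.
    """
    with Path(csv_path).open(newline="") as f:
        reader = csv.DictReader(f)
        return Counter(row[target] for row in reader)


if __name__ == "__main__":
    # Location taken from the installation step above.
    path = Path("data/healthcare-dataset-stroke-data.csv")
    if path.exists():
        print(class_balance(path))
    else:
        print(f"Dataset not found at {path}")
```

The counts also show how imbalanced the two classes are, which helps when interpreting accuracy.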

💻 Usage

1. Run the full pipeline

# Preprocess data
python -m src.data_preprocessing

# Train model
python -m src.model

# Start the web app
python -m src.app

2. Explore the notebooks

Check out the Jupyter notebooks in the notebooks/ directory for detailed analysis and experimentation.

📁 Project Structure

stroke-prediction/
├── data/                    # Raw and processed data
│   ├── raw/                 # Original dataset
│   └── processed/           # Processed datasets
│
├── notebooks/               # Jupyter notebooks
│   └── Stroke_Prediction_Analysis.ipynb
│
├── src/                     # Source code
│   ├── __init__.py
│   ├── data_preprocessing.py
│   ├── feature_engineering.py
│   ├── model.py
│   ├── train.py
│   └── app.py
│
├── models/                  # Trained models
│   └── stroke_model.pkl
│
├── reports/                 # Reports and visualizations
│   └── figures/
│
├── tests/                   # Unit tests
│   └── test_*.py
│
├── .gitignore
├── requirements.txt
└── README.md

📊 Results

Feature Importance

[Figure: feature importance plot]

Confusion Matrix

[Figure: confusion matrix]

ROC Curve

[Figure: ROC curve]
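For readers unfamiliar with the AUC-ROC figure, it can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way on toy data. The labels and scores are invented for illustration and are not project results, and `roc_auc` is a hypothetical helper rather than the project's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and scores, invented for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # prints 0.75
```

The pairwise loop is quadratic in the number of samples, so it suits small checks. For a full test set, a library implementation would be the more practical choice.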

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📜 License

Distributed under the MIT License. See LICENSE for more information.

🙏 Acknowledgements


Developed with ❤️ by @theaathish

📅 August 2025
