- 🎯 Aspiring Software Engineer fueled by curiosity.
- 💼 Backend Python Developer specializing in Django and FastAPI
- 🧠 Passionate about applying Machine Learning & Deep Learning to Cybersecurity problems
- 🔭 Current focus: Malware detection using NLP techniques such as a multi-view DistilBERT model
- 💻 Check out my portfolio:
- 📫 Reach me at: soaebhasan04@gmail.com
-
Cybersecurity Using Machine Learning and Deep Learning Intern | IIT Roorkee (Jan'26 - Apr'26)
- Developed MalBERT-XAI, a novel multi-view DistilBERT model with cross-attention fusion for Android malware detection (99.5% binary accuracy, plus 5-class malware family classification)
- Built end-to-end pipeline: APK feature extraction, multi-view tokenization, transformer training, and 3-level explainability framework (LIME + SHAP + Attention)
-
Open Source Mentor | GirlScript Summer of Code (GSSOC) (Sep'25 - Nov'25)
- Mentored 10+ student contributors on full-stack development best practices with Python and Django, reviewing their code and giving constructive feedback
- Fostered a collaborative environment that led to 30+ pull requests being merged into the main project repositories
-
Android Malware Detection using Multi-View Transformer (Python | PyTorch | Transformers | Deep Learning | NLP | DistilBERT | LIME | SHAP)
- Built MalBERT-XAI, a multi-view transformer architecture with cross-attention fusion for Android malware detection, achieving 99.5% binary accuracy and 94.3% family classification
- Engineered a 4-view feature pipeline that extracts permissions, API calls, intents, and opcodes from APKs, replacing a single-input approach and improving binary accuracy from 91.6% to 99.5%
- Implemented a 3-level explainability framework (attention weights + SHAP + LIME) to interpret model decisions at view-level, global, and token-level granularity
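
To illustrate the cross-attention fusion idea from the bullets above, here is a minimal, dependency-free sketch. It is not the MalBERT-XAI code: the function names, the 4-dimensional toy embeddings, and the single-query setup are all assumptions made for illustration, and the real model works on DistilBERT outputs in PyTorch. The sketch only shows how scaled dot-product attention can weight the four views and how those weights double as a view-level explanation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(query, views):
    """Fuse per-view embeddings with scaled dot-product attention.

    query: embedding that attends over the views (hypothetical input).
    views: dict mapping view name -> embedding with the same length as query.
    Returns (fused_vector, attention_weight_per_view).
    """
    dim = len(query)
    names = list(views)
    # One score per view: dot(query, view) / sqrt(dim).
    scores = [
        sum(q * k for q, k in zip(query, views[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # The fused vector is the attention-weighted sum of the view embeddings.
    fused = [
        sum(w * views[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Made-up toy embeddings, one per view named in the pipeline above.
views = {
    "permissions": [1.0, 0.0, 0.0, 0.0],
    "api_calls": [0.0, 1.0, 0.0, 0.0],
    "intents": [0.0, 0.0, 1.0, 0.0],
    "opcodes": [0.0, 0.0, 0.0, 1.0],
}
query = [2.0, 0.0, 0.0, 0.0]
fused, weights = cross_attention_fuse(query, views)
```

In this toy setup the query aligns with the `permissions` vector, so that view receives the largest weight. Reading `weights` is the kind of view-level signal the attention part of the explainability framework relies on.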
-
AI-Powered Organ Matching Platform (Django | Python | Scikit-learn | Tailwind | HTMX | Alpine.js)
- Built an AI-powered organ matching platform using TF-IDF vectorization and cosine similarity to quantify donor-recipient compatibility, achieving 95% model accuracy
- Designed a multi-layered prediction pipeline combining unsupervised similarity scoring with rule-based clinical constraints to produce medically sound ranked recommendations
- Engineered end-to-end data preprocessing, feature engineering, and model serialization workflows using Scikit-learn, enabling real-time inference with sub-second response times
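
The two-stage idea above (hard rules first, similarity ranking second) can be sketched as follows. This is a standard-library illustration, not the platform's code: the project uses Scikit-learn, while the record fields, the profile strings, and the choice of ABO compatibility as the only clinical rule are assumptions for the example. The ABO table is simplified and ignores the Rh factor.

```python
import math
from collections import Counter

# Simplified ABO donor -> recipient compatibility (Rh factor ignored).
ABO_CAN_DONATE_TO = {
    "O": {"O", "A", "B", "AB"},
    "A": {"A", "AB"},
    "B": {"B", "AB"},
    "AB": {"AB"},
}


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per whitespace-tokenized document."""
    tokenized = [doc.lower().split() for doc in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed idf, so a term found in every document keeps a non-zero weight.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_donors(recipient, donors):
    """Drop donors that fail the hard rule, then rank the rest by similarity."""
    eligible = [
        d for d in donors if recipient["blood"] in ABO_CAN_DONATE_TO[d["blood"]]
    ]
    vectors = tfidf_vectors([recipient["profile"]] + [d["profile"] for d in eligible])
    scored = [(cosine(vectors[0], v), d["id"]) for v, d in zip(vectors[1:], eligible)]
    return sorted(scored, reverse=True)


# Hypothetical records; the profile strings are made-up feature tokens.
recipient = {"blood": "A", "profile": "kidney adult non-smoker hla-a2"}
donors = [
    {"id": "D1", "blood": "O", "profile": "kidney adult non-smoker hla-a2"},
    {"id": "D2", "blood": "A", "profile": "kidney child smoker hla-b7"},
    {"id": "D3", "blood": "B", "profile": "kidney adult non-smoker hla-a2"},
]
ranking = rank_donors(recipient, donors)
```

Here `D3` is removed by the rule even though its profile text matches perfectly, which is the reason for applying constraints before the similarity score is allowed to rank anyone.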
-
Multivariable Regression for Housing Price Prediction (Pandas | NumPy | Scikit-learn | Matplotlib)
- Developed a regression model to predict Boston housing prices, achieving an R² score of 0.85 through rigorous exploratory data analysis (EDA) and feature engineering
- Utilized Pandas for data cleaning and Matplotlib for data visualization to identify key price-influencing factors
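
For readers new to the technique, multivariable regression and the R² metric can be sketched in plain Python. This is not the project's code, which uses Scikit-learn on the Boston housing data. The helper names and the tiny noise-free dataset (rooms, distance, price) are made up for illustration, so the perfect fit here says nothing about the 0.85 reported above.

```python
def fit_ols(rows, targets):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X^T X) b = X^T y.

    Collinear (singular) inputs are not handled in this sketch.
    """
    X = [[1.0] + list(row) for row in rows]  # prepend an intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coefs, row):
    """Apply the fitted intercept and slopes to one feature row."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up data generated from price = 10 + 5*rooms - 2*distance (no noise).
rows = [(4, 1), (5, 3), (6, 2), (7, 5), (8, 4)]
targets = [28.0, 29.0, 36.0, 35.0, 42.0]

coefs = fit_ols(rows, targets)
r2 = r2_score(targets, [predict(coefs, row) for row in rows])
```

On real data the score should be computed on a held-out test split rather than on the training rows, as it is here for brevity.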

| Skill areas |
|---|
| 🌐 Web Development |
| 📱 Software Development |
| 🧠 Machine Learning & Cybersecurity |
| ⚙️ Backend & Frameworks |
| 🗄️ Databases & Tools |
| 🧠 DevOps |