UNSW Categorical Analysis – Final Project

This repository contains a Jupyter Notebook for analyzing and testing categorical features of the UNSW-NB15 dataset, a benchmark dataset widely used in network security and intrusion detection research.

📌 Project Overview

- Focuses on categorical preprocessing and evaluation for the UNSW-NB15 dataset.
- Explores encoding techniques, feature separation, and classification performance.
- Provides a reproducible framework for testing categorical handling methods on security-related data.
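
As a rough illustration of the feature separation mentioned above, the sketch below splits a small DataFrame into categorical and numeric columns. The rows are made-up stand-ins, not real UNSW-NB15 records, and `split_feature_types` is a hypothetical helper name; the notebook may separate features differently.

```python
import pandas as pd

# Toy rows standing in for UNSW-NB15 records (illustrative values only).
df = pd.DataFrame({
    "proto": ["tcp", "udp", "tcp", "arp"],
    "service": ["http", "dns", "-", "-"],
    "state": ["FIN", "CON", "FIN", "INT"],
    "dur": [0.12, 0.001, 1.5, 0.0],
    "sbytes": [496, 146, 1762, 46],
    "label": [0, 0, 1, 0],
})

def split_feature_types(frame, target="label"):
    """Return (categorical, numeric) column name lists, excluding the target.

    Hypothetical helper: anything that is not numeric is treated as categorical.
    """
    features = frame.drop(columns=[target])
    numeric = features.select_dtypes(include="number").columns.tolist()
    categorical = [col for col in features.columns if col not in numeric]
    return categorical, numeric

categorical_cols, numeric_cols = split_feature_types(df)
print("categorical:", categorical_cols)
print("numeric:", numeric_cols)
```

Treating every non-numeric column as categorical is a simplification; integer-coded categories would need to be listed by hand.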

📂 Repository Structure


unsw-categorical-final-separate-test.ipynb   # Main notebook with preprocessing and experiments
data/                                        # (Optional) Folder for dataset storage
results/                                     # (Optional) Folder for experiment outputs
README.md                                    # Project documentation

⚙️ Requirements

Make sure you have the following installed before running the notebook:

- Python 3.8+
- Jupyter Notebook or JupyterLab
- Required libraries:
  - pandas
  - numpy
  - scikit-learn
  - matplotlib
  - seaborn

Install dependencies with:

pip install -r requirements.txt

(If the repository does not include a requirements.txt, install the libraries listed above individually with pip.)
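
To confirm the environment before opening the notebook, a small check like the following can help. It is only a sketch using the standard-library `importlib.metadata` module (available from Python 3.8); `installed_versions` is a hypothetical helper, and the package names are the ones listed under Requirements.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the Requirements list.
REQUIRED = ["pandas", "numpy", "scikit-learn", "matplotlib", "seaborn"]

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

status = installed_versions(REQUIRED)
for name, ver in status.items():
    print(f"{name}: {ver or 'MISSING'}")
```

Any package reported as MISSING can then be installed with pip.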

🚀 How to Run

Clone the repository:

git clone https://github.com/YOUR_USERNAME/unsw-categorical-analysis.git
cd unsw-categorical-analysis

Launch Jupyter Notebook:
```
jupyter notebook
```
Open unsw-categorical-final-separate-test.ipynb and run the cells in order.

📊 Key Features

- Data Cleaning – handling missing values and separating categorical features from numeric ones.
- Encoding Methods – one-hot encoding, label encoding, frequency encoding.
- Model Training & Testing – supervised learning models trained on the processed categorical features.
- Evaluation – accuracy, precision, recall, and F1-score metrics.
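
The three encoding methods listed above can be sketched on a single toy column as follows. This is a minimal illustration, not the notebook's actual code: the `proto` values are made up, and mapping unseen test categories to a frequency of 0.0 is an assumption chosen for the example.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy training and test columns (illustrative values only).
train = pd.DataFrame({"proto": ["tcp", "udp", "tcp", "tcp", "arp"]})
test = pd.DataFrame({"proto": ["tcp", "icmp"]})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(train["proto"], prefix="proto")

# Label encoding: each category becomes an integer (classes sorted alphabetically).
label_encoder = LabelEncoder()
label_codes = label_encoder.fit_transform(train["proto"])

# Frequency encoding: each category becomes its relative frequency in the training set.
freq = train["proto"].value_counts(normalize=True)
train_freq = train["proto"].map(freq)

# Assumption for this sketch: categories unseen during training get frequency 0.0.
test_freq = test["proto"].map(freq).fillna(0.0)

print(one_hot.columns.tolist())
print(label_codes.tolist())
print(train_freq.tolist())
print(test_freq.tolist())
```

Fitting the frequency table on the training split only, then applying it to the test split, keeps test-set statistics out of the features. Note that `LabelEncoder` raises an error on categories it has not seen, so a separate test set needs explicit handling for such values.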

📈 Results

- Insights into the effect of different categorical encoding techniques.
- Model performance benchmarks on the UNSW-NB15 dataset.
- A framework that can be extended to other cybersecurity datasets.
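
A benchmark run of the kind described above might look like the sketch below, which trains a classifier on one-hot encoded categorical features and reports the four metrics listed under Key Features. Everything here is an assumption for illustration: the data is synthetic, the column names are placeholders, and `RandomForestClassifier` is just one possible model, since this README does not name the models used in the notebook.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (not real UNSW-NB15 records).
train = pd.DataFrame({
    "proto": ["tcp", "udp", "tcp", "arp", "udp", "tcp", "tcp", "udp"],
    "state": ["FIN", "CON", "INT", "INT", "CON", "FIN", "INT", "INT"],
    "label": [0, 0, 1, 1, 0, 0, 1, 1],
})
test = pd.DataFrame({
    "proto": ["tcp", "udp", "icmp"],
    "state": ["INT", "CON", "FIN"],
    "label": [1, 0, 0],
})
categorical_cols = ["proto", "state"]

# handle_unknown="ignore" encodes categories unseen in training as all zeros.
encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols)]
)
model = Pipeline([
    ("encode", encoder),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])
model.fit(train[categorical_cols], train["label"])
pred = model.predict(test[categorical_cols])

# zero_division=0 avoids warnings when a class is never predicted.
scores = {
    "accuracy": accuracy_score(test["label"], pred),
    "precision": precision_score(test["label"], pred, zero_division=0),
    "recall": recall_score(test["label"], pred, zero_division=0),
    "f1": f1_score(test["label"], pred, zero_division=0),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Swapping the `OneHotEncoder` step for another encoder while keeping the rest of the pipeline fixed is one way to compare encoding techniques on equal terms.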

🧠 Inspiration

This project was inspired by the need to better understand how categorical data preprocessing impacts intrusion detection and cybersecurity analytics.

🤝 Contributing

Contributions are welcome! If you’d like to extend the project (e.g., new encoding techniques, model architectures, or visualizations), please fork the repo and submit a PR.

📜 License

This project is licensed under the MIT License. Feel free to use, modify, and share with attribution.
