loki52501/autoencode_detection_research-Project

UNSW Categorical Analysis – Final Project

This repository contains a Jupyter Notebook for analyzing and testing categorical features of the UNSW-NB15 dataset, a benchmark dataset widely used in network security and intrusion detection research.


📌 Project Overview

  • Focuses on categorical preprocessing and evaluation for the UNSW-NB15 dataset.
  • Explores encoding techniques, feature separation, and classification performance.
  • Provides a reproducible framework for testing categorical handling methods on security-related data.

📂 Repository Structure


unsw-categorical-final-separate-test.ipynb   # Main notebook with preprocessing and experiments
data/                                        # (Optional) Folder for dataset storage
results/                                     # (Optional) Folder for experiment outputs
README.md                                    # Project documentation


⚙️ Requirements

Make sure you have the following installed before running the notebook:

  • Python 3.8+
  • Jupyter Notebook or JupyterLab
  • Required libraries:
    • pandas
    • numpy
    • scikit-learn
    • matplotlib
    • seaborn

Install dependencies with:

pip install -r requirements.txt

(If the repository has no requirements.txt, install the libraries manually: pip install pandas numpy scikit-learn matplotlib seaborn)
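To see which of the libraries listed above still need installing, a small check like the following may help. It is a minimal sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names introduced here, and the only assumption is that each library is importable under its usual module name (scikit-learn imports as `sklearn`).

```python
from importlib.util import find_spec

# Hypothetical mapping: pip package name -> module name used in `import`.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",  # pip name and import name differ
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-run install command for whatever is absent.
        print("pip install " + " ".join(missing))
    else:
        print("All required libraries are available.")
```

Running the script prints either a `pip install` command for the missing packages or a confirmation that everything is present.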


🚀 How to Run

  1. Clone the repository:

    git clone https://github.com/loki52501/autoencode_detection_research-Project.git
    cd autoencode_detection_research-Project
  2. Launch Jupyter Notebook:

    jupyter notebook
  3. Open unsw-categorical-final-separate-test.ipynb and run the cells in order.


📊 Key Features

  • Data Cleaning – handling missing values, categorical separation.
  • Encoding Methods – one-hot encoding, label encoding, frequency encoding.
  • Model Training & Testing – supervised learning models on processed categorical features.
  • Evaluation – accuracy, precision, recall, and F1-score metrics.
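The three encoding methods named above can be sketched with pandas as follows. This is an illustrative example, not the notebook's actual code: the column names `proto` and `service` follow UNSW-NB15's categorical fields, but the rows are made up for the demonstration.

```python
import pandas as pd

# Toy rows (assumption): column names mirror UNSW-NB15 categorical fields,
# the values are invented for illustration only.
df = pd.DataFrame(
    {
        "proto": ["tcp", "udp", "tcp", "tcp"],
        "service": ["http", "dns", "-", "http"],
    }
)

# One-hot encoding: one indicator column per distinct category.
one_hot = pd.get_dummies(df["proto"], prefix="proto")

# Label encoding: each category becomes an integer code.
# Codes follow the sorted category order and carry no real ordering.
label = df["service"].astype("category").cat.codes

# Frequency encoding: each category is replaced by its relative frequency.
freq = df["proto"].map(df["proto"].value_counts(normalize=True))

print(one_hot.columns.tolist())
print(label.tolist())
print(freq.tolist())
```

One-hot encoding widens the table by one column per category, while label and frequency encoding keep a single column. When a separate test set is used, the category codes and frequencies would normally be derived from the training split only and then applied to the test split.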

📈 Results

  • Insights into the effect of different categorical encoding techniques.
  • Model performance benchmarks on the UNSW-NB15 dataset.
  • Framework extendable to other cybersecurity datasets.
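For reference, the benchmark metrics listed under Key Features (accuracy, precision, recall, F1-score) can be computed from predictions as in the sketch below. It assumes binary labels with 1 marking an attack record; the `binary_metrics` helper and the sample labels are hypothetical. In the notebook's stack, the scikit-learn functions `accuracy_score`, `precision_score`, `recall_score`, and `f1_score` from `sklearn.metrics` compute the same quantities.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted/actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration: 1 = attack, 0 = normal (assumption).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Comparing these four numbers across encoding methods, with the model and data split held fixed, is one way to attribute performance differences to the encoding itself.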

🧠 Inspiration

This project was inspired by the need to better understand how categorical data preprocessing impacts intrusion detection and cybersecurity analytics.


🀝 Contributing

Contributions are welcome! If you'd like to extend the project (e.g., new encoding techniques, model architectures, or visualizations), please fork the repo and submit a PR.


📜 License

This project is licensed under the MIT License. Feel free to use, modify, and share with attribution.
