CMPE255-Team-Project-Fall-2021

Discover Malicious Websites Using Data Mining Algorithms

Team Members

[Srinivas Gutta](https://github.com/97gutta)

[Sai Gowtham Ande](https://github.com/SaiGowtham-11)

[Manish Mapakshi](https://github.com/manishm96)

[Pratibha Awasthi](https://github.com/PratibhaAwasthi)

What data you’ll use and where you’ll get it?

This dataset contains labeled examples of legitimate and phishing websites. Each website is described by a set of characteristics that indicate whether it is legitimate, so the data can serve as training input for machine learning models.

Dataset Link: https://data.mendeley.com/datasets/72ptz43s9v/1

DESCRIPTION OF THE PROBLEM:

Web attackers mostly target individuals in order to steal their personal information. An attacker duplicates a legitimate website and uses the copy to exploit the user, who believes the site is genuine. When the user enters credentials on the fraudulent website, the information is sent to the attacker's servers. The attacker can then obtain credit card details and other personal information, or install malware on the user's device. As the volume of online transactions grows, users become increasingly exposed to these attacks.

POTENTIAL METHODS:

We propose a strategy that uses data mining algorithms to identify malicious websites by analyzing their URLs, in order to limit the number of attacks that lead users to fake websites. Attackers change the subdomain and file path (if one occurs in the URL) or introduce a typographical error so that the URL resembles that of a legitimate website. To detect phishing websites, we must therefore study the URL and see what each element contains. Candidate methods include logistic regression, the Naive Bayes classifier, and random forest. As the project progresses, we plan to include additional methods.
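As a rough illustration of what "studying each element of the URL" could look like, the sketch below splits a URL with Python's standard `urllib.parse` module and counts a few traits per element. The function name `url_element_features`, the chosen traits, and the example URL are assumptions made for illustration; they are not the feature set of the dataset or the notebook.

```python
from urllib.parse import urlsplit


def url_element_features(url: str) -> dict:
    """Split a URL into its elements and count a few simple traits.

    Hypothetical helper for illustration only; the feature names are
    assumptions, not the columns of the project's dataset.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    return {
        "url_length": len(url),
        "host_length": len(host),
        # Naive assumption: the last two labels form the registered domain,
        # so anything before them is a subdomain. This is wrong for
        # suffixes such as "co.uk".
        "subdomain_count": max(len(labels) - 2, 0),
        # Number of non-empty segments in the file path.
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
        "dots_in_url": url.count("."),
        "hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
    }


# Made-up URL that imitates a bank domain with a typo ("examp1e").
features = url_element_features(
    "http://login.secure.examp1e-bank.com/account/verify.php"
)
for name, value in features.items():
    print(f"{name}: {value}")
```

Counts like these could then be passed as numeric inputs to any of the classifiers listed above.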

Measurement of Success

The goal of our project is to find the data mining algorithm that most accurately identifies fraudulent websites that steal information from users.
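A minimal sketch of this success measure follows: accuracy is the fraction of websites whose predicted label matches the true label, and the preferred algorithm is the one with the highest accuracy. The helper names `accuracy` and `best_model` are hypothetical, and the labels and predictions are toy values chosen for illustration, not results from this project.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def best_model(y_true, predictions):
    """Score each model's predictions and return (best_name, all_scores)."""
    scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Toy data: 1 = phishing, 0 = legitimate. These values are invented
# for the example and do NOT come from the project's experiments.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "logistic_regression": [1, 0, 1, 0, 0, 0, 1, 1],
    "naive_bayes": [1, 1, 0, 1, 0, 1, 1, 0],
    "random_forest": [1, 0, 1, 1, 0, 0, 1, 1],
}

winner, scores = best_model(y_true, predictions)
print(winner, scores)
```

In practice the predictions would come from models evaluated on a held-out test split rather than from hand-written lists.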

Packages Required to Run the Notebook

Install the Missing No library before running the notebook: `pip install missingpy`
