Machine Learning Module #382

110 changes: 110 additions & 0 deletions SaiPallavi_ML/Supervised and Unsupervised Machine Learning.md
# Machine Learning

Machine learning (ML) is a type of artificial intelligence (AI) that allows software applications to become more accurate at predicting outcomes without being explicitly programmed to do so.
Machine learning algorithms use historical data as input to predict new output values.

We have been using and witnessing the importance of Machine Learning ever since we started using the web extensively.
A few daily-life examples of machine learning include :

* YouTube recommendation system
* Spam mails filtration
* Virtual Personal Assistants (Google Assistant, Apple's Siri, and business based chatbots)
* Traffic predictions (GPS)
* Product recommendation system on online websites and many more


# Types of Machine Learning
* Supervised Machine Learning
* Unsupervised Machine Learning
* Reinforcement Learning


Let us have a quick look at what each of them is and where it is applied.

* Supervised Machine Learning :

In this type of machine learning, data scientists supply algorithms with labeled training data and define the variables they want the algorithm to assess for correlations.
Both the input and the output of the algorithm are specified. This is one of the simplest types, because we know what to expect from the data we supply to the model.

![Supervised Learning](https://github.com/sai-pallavi-2003/Open-contributions/blob/master/SaiPallavi_ML/supervised-machine-learning.png)

Types of Supervised Machine Learning:
* Classification :
Classification algorithms are used when the output variable is categorical, i.e. the output falls into discrete classes (often two, such as Yes-No, Male-Female, True-False, etc.). Spam filtering is a classic example. Popular classification algorithms include :
* Random Forest
* Decision Trees
* Logistic Regression
* Support Vector Machines
* KNN algorithm (K Nearest Neighbour)

One of the many applications of Classification is classifying handwritten digits using the KNN algorithm.

Link to the project : [Classification using KNN](https://github.com/sai-pallavi-2003/Hand-Written-Digit-Classification)
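Below is a minimal, illustrative sketch of that idea, assuming scikit-learn is installed and using its built-in 8x8 digits dataset (not the dataset from the linked project):

```python
# A minimal classification sketch (handwritten digits with KNN), assuming scikit-learn is installed.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Labeled data: X holds 8x8 pixel images flattened to 64 features, y holds the digit labels 0-9
X, y = load_digits(return_X_y=True)

# Hold out 20% of the data to check how well the model generalizes to unseen digits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Each test image is labeled by a majority vote of its 3 nearest training images
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, knn.predict(X_test)))
```

Because both the pixel values (input) and the digit labels (output) are supplied, this is supervised learning.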

* Regression :
Regression algorithms are used when the output variable is a continuous value and there is a relationship between the input and output variables.
They are used for the prediction of continuous quantities, such as in weather forecasting, market-trend analysis, etc.
Below are some popular regression algorithms which come under supervised learning:
* Linear Regression
* Regression Trees
* Non-Linear Regression
* Bayesian Linear Regression
* Polynomial Regression

One of the many applications of regression is predicting house prices (or any other continuous variable) using Linear Regression.

Link to the project : [House Rate Prediction using Linear Regression](https://github.com/sai-pallavi-2003/HousePricePrediction-using-Linear-Regression)
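A minimal sketch of the same idea, assuming scikit-learn and NumPy are installed; the single "area" feature and the synthetic prices below are invented for illustration and are not taken from the linked project:

```python
# A minimal regression sketch (price prediction with Linear Regression), assuming
# scikit-learn and NumPy are installed; the "area" feature and prices are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
area = rng.uniform(500, 3500, size=200)                 # house area (sq. ft), invented
price = 50 * area + 20000 + rng.normal(0, 10000, 200)   # price = 50*area + 20000 + noise

X = area.reshape(-1, 1)                                 # scikit-learn expects a 2-D feature matrix
X_train, X_test, y_train, y_test = train_test_split(X, price, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("Learned slope:", model.coef_[0], "intercept:", model.intercept_)
print("R^2 on held-out data:", model.score(X_test, y_test))
```

The model learns a straight line (slope and intercept) that maps the input feature to a continuous output, which is exactly what distinguishes regression from classification.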

Applications of Supervised learning :
* Text categorization
* Face Detection
* Signature recognition
* Customer discovery
* Spam detection
* Weather forecasting


* Unsupervised Machine Learning :
Unsupervised learning is a machine learning technique in which models are not supervised using a labeled training dataset. Instead, the model itself finds hidden patterns and insights in the given data. It can be compared to the learning that takes place in the human brain while learning new things. It can be defined as:

*Unsupervised learning is a type of machine learning in which models are trained on an unlabeled dataset and are allowed to act on that data without any supervision.*

![Unsupervised Machine Learning](https://github.com/sai-pallavi-2003/Open-contributions/blob/master/SaiPallavi_ML/unsupervised-machine-learning-1.png)

Types of unsupervised machine learning:

* Clustering:
Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have little or no similarity with the objects of another group. Cluster analysis finds the commonalities between data objects and categorizes them according to the presence or absence of those commonalities (a minimal clustering sketch follows this list).

* Association: An association rule is an unsupervised learning method used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules make marketing strategies more effective: for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical example of association-rule learning is Market Basket Analysis, also sketched below.
Note: We will learn these algorithms in later chapters.
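As a small illustration of clustering, here is a sketch using K-Means from scikit-learn (assuming it is installed); the blob data is synthetic and only meant to show that no labels are supplied:

```python
# A minimal clustering sketch with K-Means, assuming scikit-learn is installed.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# 300 unlabeled points that happen to form 3 natural groups (synthetic data)
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# No labels are supplied; K-Means groups the points purely by distance
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Cluster centres:\n", kmeans.cluster_centers_)
print("First 10 cluster assignments:", labels[:10])
```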
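And here is a small market-basket sketch of association-rule mining, assuming the third-party mlxtend package and pandas are installed; the four transactions are invented for illustration:

```python
# A minimal market-basket sketch, assuming the third-party mlxtend package and pandas
# are installed; the four transactions below are invented for illustration.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [
    ["bread", "butter"],
    ["bread", "jam"],
    ["bread", "butter", "jam"],
    ["milk", "bread", "butter"],
]

# One-hot encode the transactions into a boolean item matrix
encoder = TransactionEncoder()
basket = pd.DataFrame(encoder.fit(transactions).transform(transactions), columns=encoder.columns_)

# Find itemsets that are bought together often, then derive "if X then Y" rules from them
frequent_itemsets = apriori(basket, min_support=0.5, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```

Each rule reports how often the itemset appears overall (support) and how often the consequent appears when the antecedent does (confidence), which is the basis of the "bread implies butter/jam" style of insight described above.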

Applications of Unsupervised Learning:
* Products Segmentation
* Customer Segmentation
* Similarity Detection
* Recommendation Systems
* Labelling unlabelled datasets



# Reinforcement Learning

Reinforcement learning is an area of Machine Learning. It is about taking suitable actions to maximize reward in a particular situation. It is employed by various software systems and machines to find the best possible behaviour or path to take in a specific situation.

Main points in Reinforcement learning –

* Input: The input should be an initial state from which the model will start.
* Output: There are many possible outputs, as there are a variety of solutions to a particular problem.
* Training: The training is based on the input; the model returns a state, and the user decides whether to reward or punish the model based on its output.
* The model keeps learning from this feedback.
* The best solution is decided based on the maximum reward.
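The loop above can be illustrated with a toy example. The sketch below is a tabular Q-learning agent on an invented five-state corridor (the agent must move right to reach the goal and earn a reward of +1), written with NumPy only; the environment, reward, and hyperparameters are assumptions made for illustration, not a method prescribed by this text:

```python
# A toy reinforcement-learning sketch: tabular Q-learning on an invented 5-state corridor.
# Only NumPy is used; states, reward, and hyperparameters are assumptions for illustration.
import numpy as np

n_states, n_actions = 5, 2              # states 0..4; action 0 = move left, action 1 = move right
q_table = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0                                  # input: the initial state the model starts from
    while state != n_states - 1:               # episode ends when the goal state 4 is reached
        # Epsilon-greedy: usually exploit the best known action, occasionally explore
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(q_table[state]))
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0    # reward only at the goal
        # Q-learning update: nudge the estimate toward reward + discounted best future value
        q_table[state, action] += alpha * (
            reward + gamma * np.max(q_table[next_state]) - q_table[state, action]
        )
        state = next_state

print("Learned Q-values (rows = states, columns = [left, right]):")
print(q_table.round(2))
```

After training, the Q-values for the "right" action dominate, so the greedy policy walks straight to the goal; this is the "best solution decided based on the maximum reward" mentioned above.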





By : Sai Pallavi Sistla

[Reach me at my mail]([email protected])
53 changes: 53 additions & 0 deletions SaiPallavi_ML/Underfitting and overfitting.md

# Overfitting and Underfitting

Overfitting and Underfitting are the two main problems that occur in machine learning and degrade the performance of the machine learning models.

The main goal of every machine learning model is to generalize well. Generalization here means the ability of an ML model to produce a suitable output when given a set of unseen inputs. In other words, after being trained on a dataset, the model can produce reliable and accurate output on new data.
Hence, underfitting and overfitting are the two conditions that need to be checked to judge whether a model is generalizing well or not.

Before diving further, let’s understand two important terms:

* Bias: the assumptions made by a model to make the target function easier to learn.
* Variance: the amount by which the model's learned function changes when it is trained on different data. If you train a model and obtain a very low error on its training data, but experience a high error when the data changes, the model has high variance.

# Underfitting

Underfitting occurs when our machine learning model is not able to capture the underlying trend of the data.
To avoid overfitting, the feeding of training data can be stopped at an early stage, due to which the model may not learn enough from the training data. As a result, it may fail to find the best fit for the dominant trend in the data.

In the case of underfitting, the model is not able to learn enough from the training data; hence its accuracy is reduced and it produces unreliable predictions.

*An underfitted model has high bias and low variance.*

Techniques to reduce underfitting:

* Increase model complexity
* Increase the number of features by performing feature engineering
* Remove noise from the data



# Overfitting

Overfitting occurs when our machine learning model tries to cover all the data points, or more data points than required, in the given dataset.
Because of this, the model starts capturing the noise and inaccurate values present in the dataset, and these factors reduce the efficiency and accuracy of the model.

*The overfitted model has low bias and high variance.*

Techniques to reduce overfitting:

* Increase training data.
* Reduce model complexity.
* Early stopping during the training phase (keep an eye on the loss during training; as soon as the validation loss begins to increase, stop training).


A general view of both the problems :

![Overfitting and underfitting](https://github.com/sai-pallavi-2003/Open-contributions/blob/master/SaiPallavi_ML/overfitting_2.png)
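The same picture can be reproduced numerically. The sketch below, assuming scikit-learn and NumPy are installed, fits polynomial models of increasing degree to invented noisy data: the low-degree model typically underfits (high error on both splits, high bias), while the very high-degree model typically overfits (low training error, high test error, high variance):

```python
# A small sketch contrasting underfitting and overfitting, assuming scikit-learn and
# NumPy are installed; the noisy data and polynomial degrees are chosen for illustration.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 40)   # noisy non-linear data

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

for degree in (1, 4, 15):   # too simple, roughly right, too flexible
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree {degree:2d}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

Comparing the training and test errors across degrees is a quick way to tell whether a model is underfitting, fitting well, or overfitting.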



By : Sai Pallavi Sistla

[Reach me here]([email protected])
Binary file added SaiPallavi_ML/overfitting_2.png
Binary file added SaiPallavi_ML/supervised-machine-learning.png
Binary file added SaiPallavi_ML/unsupervised-machine-learning-1.png