# Gradient descent

Gradient descent is an iterative optimization algorithm used to find a (local) minimum of a function, most commonly in the context of training machine learning models. (The same idea with the sign flipped, stepping along the positive gradient, is called gradient ascent and is used for maximization.) Its objective is to adjust the parameters of a model in the direction that reduces the value of a given loss function, allowing the model to better fit the training data and make more accurate predictions.

Here's how the gradient descent algorithm works, step by step:

Initialization: The algorithm starts with an initial guess for the model's parameters (weights and biases). These parameters represent the internal variables that the model uses to make predictions.
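
As a concrete running example for the steps below, consider fitting a simple linear model with NumPy. Everything here (the data shapes, the parameter names `w` and `b`, the random seed) is an illustrative assumption, not part of any particular library or dataset:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical data: 100 samples with 3 features, generated from a
# known linear relationship plus a little noise.
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

# Initialization: small random weights and a zero bias.
w = rng.normal(scale=0.01, size=3)
b = 0.0
```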

Forward Pass: The input data is fed into the model, and the forward propagation process computes the model's predictions for the given inputs using the current values of the parameters.
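
For the linear model in the running example, the forward pass is a single matrix-vector product:

```python
# Forward pass: predictions under the current parameter values.
y_pred = X @ w + b
```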

Loss Computation: The loss function is evaluated, measuring how well the model's predictions match the true target values.
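
One common choice of loss, used here purely for illustration, is mean squared error (MSE):

```python
# Loss computation: mean squared error between predictions and targets.
loss = np.mean((y_pred - y) ** 2)
```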

Backpropagation: The key step in gradient descent is computing the gradient of the loss function with respect to each model parameter. In neural networks this is done via backpropagation, which applies the chain rule layer by layer; each gradient indicates how a small change in that parameter would change the loss.
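
For the linear model and MSE loss above, the chain rule gives closed-form gradients; deep learning frameworks compute the analogous quantities automatically through automatic differentiation:

```python
# Backpropagation: gradients of the MSE loss with respect to w and b,
# derived by the chain rule for this particular linear model.
n = len(y)
error = y_pred - y
grad_w = (2.0 / n) * (X.T @ error)  # one entry per weight
grad_b = (2.0 / n) * error.sum()
```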

Gradient Update: With the gradients computed, the algorithm updates the model's parameters by subtracting the gradients, scaled by a small step size called the learning rate, from their current values.
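
The update itself is one step against the gradient; the learning rate value below is an arbitrary illustrative choice:

```python
# Gradient update: step against the gradient, scaled by the learning rate.
lr = 0.1
w -= lr * grad_w
b -= lr * grad_b
```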

Iteration: The steps above are repeated, each time adjusting the parameters in the direction that reduces the loss. The algorithm continues until a stopping condition is met, such as reaching a maximum number of iterations or the change in loss becoming negligible.
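
Putting the steps together, a minimal end-to-end loop for the running example might look like the following sketch; both stopping conditions mentioned above are included:

```python
# Full-batch gradient descent combining all of the steps above.
lr = 0.1
prev_loss = float("inf")
for step in range(1000):                   # stop 1: iteration budget
    y_pred = X @ w + b                     # forward pass
    loss = np.mean((y_pred - y) ** 2)      # loss computation
    if abs(prev_loss - loss) < 1e-9:       # stop 2: negligible change in loss
        break
    prev_loss = loss
    error = y_pred - y                     # backpropagation (closed form here)
    grad_w = (2.0 / len(y)) * (X.T @ error)
    grad_b = (2.0 / len(y)) * error.sum()
    w -= lr * grad_w                       # gradient update
    b -= lr * grad_b

print(f"final loss: {loss:.6f}, learned w: {w}")
```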

Variants of gradient descent, such as stochastic gradient descent (SGD), mini-batch gradient descent, and adaptive learning rate methods (e.g., Adam), change how the gradient is estimated or how the step size is chosen, improving the convergence speed and stability of training.
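
To make the distinction concrete, here is a mini-batch SGD version of the loop above: the update rule is unchanged, but each step uses a gradient estimated from a random subset of the data. The batch size and epoch count are illustrative assumptions:

```python
# Mini-batch SGD sketch: same update rule, noisier but cheaper gradients.
batch_size, lr = 16, 0.05
for epoch in range(50):
    perm = rng.permutation(len(y))          # reshuffle the data every epoch
    for start in range(0, len(y), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        error = Xb @ w + b - yb
        w -= lr * (2.0 / len(yb)) * (Xb.T @ error)
        b -= lr * (2.0 / len(yb)) * error.sum()
```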