Adaptive moment estimation (Adam) algorithm for deep learning optimization
AdamGh.m
Let's go through the code line by line:
1. `clc, clear, close all`: These are commands to clear the command window, clear
all variables from the workspace, and close all open figures.
3. `% Define hyperparameters`: This is a comment indicating that the following
lines define hyperparameters for the algorithm.
5. `X=1:0.1:10;`: This creates a row vector `X` with values from 1 to 10 in steps of 0.1 (91 elements in total).
6. `X=X';`: This transposes the vector `X` to make it a column vector.
9. `y=sin(X);`: This assigns the sine of each element in `X` to the variable `y`.
11-14. These lines define the hyperparameters: `alpha` is the learning rate, `beta1` and `beta2` are the exponential decay rates for the first and second moment estimates, and `epsilon` is a small constant that prevents division by zero in the parameter update.
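The exact numeric values used in AdamGh.m are not repeated here; a typical choice, using the defaults proposed in the original Adam paper, would look like the following (the names match the walkthrough, the values are an assumption):

```matlab
% Typical Adam hyperparameters (common defaults; the values in AdamGh.m may differ)
alpha   = 0.001;   % learning rate (step size)
beta1   = 0.9;     % decay rate for the first moment (gradient) average
beta2   = 0.999;   % decay rate for the second moment (squared-gradient) average
epsilon = 1e-8;    % small constant to avoid division by zero
```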
17. `theta = randn(91, 1);`: This initializes `theta` as a 91-by-1 column vector of values drawn from a standard normal distribution; 91 matches the number of elements in `X`, so there is one parameter per sample.
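Because 91 is just the number of samples in `X`, an equivalent initialization that derives the size from the data would be the following; this is a suggested variant, not the line as written in AdamGh.m:

```matlab
theta = randn(numel(X), 1);   % one parameter per sample in X
```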
20-23. These lines initialize the variables used by the Adam update: `m` (the first moment estimate), `v` (the second moment estimate), and `t` (the iteration counter).
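In Adam, `m` is the running average of the gradient, `v` is the running average of the squared gradient, and `t` counts iterations. A minimal sketch of this initialization, assuming `m` and `v` are sized to match `theta`, is:

```matlab
m = zeros(size(theta));   % first moment estimate (exponential average of gradients)
v = zeros(size(theta));   % second moment estimate (exponential average of squared gradients)
t = 0;                    % iteration counter
```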
26. `max_iterations=10;`: This sets the maximum number of iterations for the
optimization algorithm to 10.
29. `loss_history = zeros(max_iterations, 1);`: This initializes a column vector
`loss_history` of size `max_iterations` to store the loss values at each iteration.
30. `theta_history = zeros(max_iterations, length(theta));`: This initializes a matrix
`theta_history` of size `max_iterations` by the length of `theta` to store the values
of `theta` at each iteration.
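The history arrays are filled inside the loop. The exact bookkeeping lines are explained later, but they likely take a form such as the following (an assumption about the indexing, not a quote from AdamGh.m):

```matlab
loss_history(t)     = loss;     % scalar loss at iteration t
theta_history(t, :) = theta';   % parameters at iteration t, stored as a row
```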
32-78. This block of code is a while loop that runs the optimization for at most `max_iterations` iterations.
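One iteration of that loop follows the standard Adam update. The condensed sketch below assumes a helper `[grad, loss] = compute_gradient(theta, y)` (the actual signature in AdamGh.m may differ) and omits the history bookkeeping discussed above:

```matlab
while t < max_iterations
    t = t + 1;

    % Gradient and loss at the current parameters
    [grad, loss] = compute_gradient(theta, y);

    % Biased moment estimates
    m = beta1 * m + (1 - beta1) * grad;
    v = beta2 * v + (1 - beta2) * grad.^2;

    % Bias-corrected moment estimates
    m_hat = m / (1 - beta1^t);
    v_hat = v / (1 - beta2^t);

    % Parameter update
    theta = theta - alpha * m_hat ./ (sqrt(v_hat) + epsilon);
end
```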
34. `t = t + 1;`: This increments the iteration counter `t` by 1.
37-38. These lines call the function `compute_gradient()` to obtain the gradient of the loss with respect to `theta` and the current value of the loss.
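The body of `compute_gradient()` is not shown in this section. A plausible implementation, assuming the model treats `theta` as a direct estimate of `y` and uses a mean-squared-error loss (both assumptions; the actual function may differ), would be:

```matlab
function [grad, loss] = compute_gradient(theta, y)
    % Mean squared error between the parameter vector and the target
    residual = theta - y;                 % assumes theta directly approximates y
    loss     = mean(residual.^2);
    grad     = 2 * residual / numel(y);   % gradient of the MSE w.r.t. theta
end
```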
41. `m = beta1 * m + (1 - beta1) * grad;`: This updates the first moment estimate `m` as an exponential moving average of the gradient, with decay rate `beta1`.