Gradient Descent "until converge" question

Below is a Python implementation of gradient descent for linear regression supplied by my machine learning professor. I believe I understand how it works; however, the professor said in lecture that this is not a complete implementation of the algorithm, since it is supposed to repeat until theta_0 (m) and theta_1 (c) converge. He suggested we revise this implementation to run until convergence instead of for a fixed number of epochs, to get more familiar with the algorithm, but I'm not sure where to start. Any tips?

```
# Building the model
m = 0
c = 0

L = 0.0001  # The learning rate
epochs = 1000  # The number of iterations to perform gradient descent

n = float(len(X))  # Number of elements in X (X, Y are the training data defined earlier)

# Performing gradient descent
for i in range(epochs):
    Y_pred = m*X + c  # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred))  # Derivative wrt m
    D_c = (-2/n) * sum(Y - Y_pred)  # Derivative wrt c
    m = m - L * D_m  # Update m
    c = c - L * D_c  # Update c

print(m, c)
```

CodePudding user response:

Essentially, convergence is when the loss settles at a certain value, i.e. it becomes more or less constant. So to stop gradient descent at convergence, calculate the cost function (a.k.a. the loss function) using the current values of m and c at each iteration. You can then add a threshold on how much the loss changes between iterations, or check whether it has become constant; that is when your model has converged.
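Here is a minimal sketch of that idea, assuming X and Y are NumPy arrays holding your training data; the names `tol` and `max_iters` are illustrative, not part of the original code:

```
import numpy as np

m, c = 0.0, 0.0
L = 0.0001          # learning rate
tol = 1e-9          # stop when the loss changes by less than this between iterations
max_iters = 100000  # safety cap so the loop cannot run forever
n = float(len(X))

prev_loss = float('inf')
for i in range(max_iters):
    Y_pred = m * X + c
    loss = np.mean((Y - Y_pred) ** 2)   # current MSE loss
    if abs(prev_loss - loss) < tol:     # loss has (almost) stopped changing
        break
    prev_loss = loss
    D_m = (-2 / n) * np.sum(X * (Y - Y_pred))
    D_c = (-2 / n) * np.sum(Y - Y_pred)
    m -= L * D_m
    c -= L * D_c

print(m, c, "stopped after", i, "iterations")
```

The `max_iters` cap is just a safeguard in case the chosen learning rate or tolerance keeps the loss from settling.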

CodePudding user response:

I think what he means is to keep running until the derivative of the loss with respect to theta_0 (m) and theta_1 (c) becomes close to 0, or until there is no further improvement in the loss function. What you can do is remove the for loop that runs for a fixed number of epochs and add a while loop that keeps running until m and c no longer improve (i.e. the loss value becomes almost static).
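A rough sketch of that while-loop variant, again assuming X and Y are NumPy arrays; `grad_tol` is an illustrative threshold name and the iteration cap is only a safeguard:

```
import numpy as np

m, c = 0.0, 0.0
L = 0.0001
grad_tol = 1e-6   # treat gradients smaller than this as "close to 0"
n = float(len(X))

D_m, D_c = float('inf'), float('inf')
iters = 0
# Loop until both gradients are near zero (or the safety cap is hit)
while (abs(D_m) > grad_tol or abs(D_c) > grad_tol) and iters < 1_000_000:
    Y_pred = m * X + c
    D_m = (-2 / n) * np.sum(X * (Y - Y_pred))
    D_c = (-2 / n) * np.sum(Y - Y_pred)
    m -= L * D_m
    c -= L * D_c
    iters += 1

print(m, c, iters)
```

When the gradients are essentially zero, the updates to m and c become negligible, which is the same as saying the loss has stopped improving.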
