Loss function

As we start with random values, our learnable parameters, w and b, will result in y_pred, which will not be anywhere close to the actual y. So, we need to define a function which tells the model how close its predictions are to the actual values. Since this is a regression problem, we use a loss function called sum of squared error (SSE). We take the difference between the predicted y and the actual y and square it. SSE helps the model to understand how close the predicted values are to the actual values. The torch.nn library has different loss functions, such as MSELoss and cross-entropy loss. However, for this chapter, let's implement the loss function ourselves:

def loss_fn(y,y_pred):
loss = (y_pred-y).pow(2).sum()
for param in [w,b]:
if not param.grad is None: param.grad.data.zero_()
loss.backward()
return loss.data[0]

Apart from calculating the loss, we also call the backward operation, which calculates the gradients of our learnable parameters, w and b. As we will use the loss function more than once, we remove any previously calculated gradients by calling the grad.data.zero_() operation. The first time we call the backward function, the gradients are empty, so we zero the gradients only when they are not None