Super Kai (Kazuya Ito)

The loss functions for Neural Network in PyTorch

A loss function is a function which computes the losses (differences) between a model's predictions and the true values to evaluate how good a model is. *A loss function is also called a Cost Function or Error Function.

There are popular loss functions as shown below:

(1) L1 Loss:

  • can compute the average of the absolute losses (differences) between a model's predictions and true values.
  • 's formula is as shown below:
    $MAE = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$
  • 's pros are as shown below:
    • Less sensitive to outliers.
    • We can easily compare the losses because they are just made absolute, so their range is not big.
  • 's cons are as shown below:
    • Not differentiable at zero.
  • is used for a regression model.
  • is also called Mean Absolute Error(MAE).
  • is L1Loss() in PyTorch.
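
Below is a minimal sketch of L1Loss() (the tensors are hypothetical example values):

```python
import torch
from torch import nn

# Hypothetical predictions and true values.
pred = torch.tensor([2.0, 5.0, 9.0])
true = torch.tensor([3.0, 4.0, 7.0])

loss_fn = nn.L1Loss()  # reduction='mean' by default

# (|2-3| + |5-4| + |9-7|) / 3 = (1 + 1 + 2) / 3
print(loss_fn(pred, true))  # tensor(1.3333)
```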

(2) L2 Loss:

  • can compute the average of the squared losses (differences) between a model's predictions and true values.
  • 's formula is as shown below:
    $MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
  • 's pros are as shown below:
    • All losses are differentiable because they are squared.
  • 's cons are as shown below:
    • Sensitive to outliers.
    • We cannot easily compare the losses because they are squared, so their range is big.
  • is used for a regression model.
  • is also called Mean Squared Error(MSE).
  • is MSELoss() in PyTorch.
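
Below is a minimal sketch of MSELoss() with the same hypothetical tensors:

```python
import torch
from torch import nn

# Hypothetical predictions and true values.
pred = torch.tensor([2.0, 5.0, 9.0])
true = torch.tensor([3.0, 4.0, 7.0])

loss_fn = nn.MSELoss()

# ((2-3)^2 + (5-4)^2 + (9-7)^2) / 3 = (1 + 1 + 4) / 3
print(loss_fn(pred, true))  # tensor(2.)
```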

(3) Huber Loss:

  • can do a computation similar to either L1 Loss or L2 Loss, depending on how the absolute losses (differences) between a model's predictions and true values compare with the delta which you set. *Memos:
    • delta is 1.0 by default.
    • Be careful, the computation is not exactly the same as L1 Loss or L2 Loss, as the formulas below show.
  • 's formula is as shown below. *The 1st case is the L2 Loss-like one and the 2nd case is the L1 Loss-like one:
    $L_{\delta} = \frac{1}{n}\sum_{i=1}^{n} l_i, \quad l_i = \begin{cases} \frac{1}{2}(y_i - \hat{y}_i)^2 & \text{if } |y_i - \hat{y}_i| \le \delta \\ \delta \left( |y_i - \hat{y}_i| - \frac{1}{2}\delta \right) & \text{otherwise} \end{cases}$
  • 's pros are as shown below:
    • Less sensitive to outliers.
    • All losses are differentiable.
    • We can more easily compare the losses than with L2 Loss because only small losses are squared, so their range is smaller than with L2 Loss.
  • 's cons are as shown below:
    • The computation costs more than L1 Loss and L2 Loss because the formula is more complex.
  • is used for a regression model.
  • is HuberLoss() in PyTorch.
  • with delta of 1.0 is the same as Smooth L1 Loss, which is SmoothL1Loss() in PyTorch.
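
Below is a minimal sketch of HuberLoss() with the same hypothetical tensors, showing how each loss takes the L2 Loss-like or L1 Loss-like branch depending on delta:

```python
import torch
from torch import nn

# Hypothetical predictions and true values.
pred = torch.tensor([2.0, 5.0, 9.0])
true = torch.tensor([3.0, 4.0, 7.0])

loss_fn = nn.HuberLoss()  # delta=1.0 by default

# |2-3| = 1 and |5-4| = 1 do not exceed delta (L2-like): 0.5 * 1^2 = 0.5 each.
# |9-7| = 2 exceeds delta (L1-like): 1.0 * (2 - 0.5 * 1.0) = 1.5.
# (0.5 + 0.5 + 1.5) / 3 ≈ 0.8333
print(loss_fn(pred, true))  # tensor(0.8333)
```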

(4) BCE(Binary Cross Entropy) Loss:

  • can compute the losses (differences) between a model's binary predictions and true binary values.
  • 's formula is as shown below:
    $BCE = -\frac{1}{n}\sum_{i=1}^{n}\left[ y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \right]$
  • is used for Binary Classification. *Binary Classification is the technology to classify data into two classes.
  • is also called Binary Cross Entropy or Log(Logarithmic) Loss.
  • is BCELoss() in PyTorch. *Memos:
    • The predictions must be probabilities between 0 and 1, e.g. the output of Sigmoid.
    • BCEWithLogitsLoss() in PyTorch combines Sigmoid and BCE Loss in one class.
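
Below is a minimal sketch of BCELoss() (the probabilities and labels are hypothetical example values):

```python
import torch
from torch import nn

# Hypothetical predicted probabilities (e.g. Sigmoid outputs) and true binary labels.
pred = torch.tensor([0.8, 0.3, 0.9])
true = torch.tensor([1.0, 0.0, 1.0])

loss_fn = nn.BCELoss()

# -(ln(0.8) + ln(1-0.3) + ln(0.9)) / 3 ≈ 0.2284
print(loss_fn(pred, true))  # tensor(0.2284)
```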

(5) Cross Entropy Loss:

  • can compute the losses (differences) between a model's predictions and true values. *A loss is 0 or greater.
  • 's formula is as shown below:
    $CE = -\frac{1}{n}\sum_{i=1}^{n}\sum_{c=1}^{C} y_{i,c}\log(\hat{y}_{i,c})$
  • is used for Multiclass Classification and Computer Vision. *Memos:
    • Multiclass Classification is the technology to classify data into multiple classes.
    • Computer vision is the technology which enables a computer to understand objects.
  • is CrossEntropyLoss() in PyTorch.
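
Below is a minimal sketch of CrossEntropyLoss() (the logits and labels are hypothetical example values). Note that CrossEntropyLoss() takes raw logits rather than probabilities because it applies log-softmax internally, and the targets are class indices:

```python
import torch
from torch import nn

# Hypothetical raw logits for 2 samples over 3 classes, and the true class indices.
pred = torch.tensor([[2.0, 1.0, 0.1],
                     [0.5, 2.5, 0.3]])
true = torch.tensor([0, 1])

loss_fn = nn.CrossEntropyLoss()  # applies log-softmax internally

# -(ln(softmax(pred)[0][0]) + ln(softmax(pred)[1][1])) / 2 ≈ 0.3185
print(loss_fn(pred, true))  # tensor(0.3185)
```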
