Summary of CrossEntropyLoss and NLLLoss
Introduction
nll_loss (negative log likelihood loss): the negative log-likelihood cost function used for maximum-likelihood training.
CrossEntropyLoss: the cross-entropy loss. Cross entropy measures the distance between two probability distributions; the smaller it is, the closer the two distributions are.
The input to NLLLoss is a vector of log-probabilities plus a target label; it does not compute the log-probabilities for us, so it fits a network whose last layer is log_softmax. nn.CrossEntropyLoss() is the same as NLLLoss(), except that it applies log_softmax for us.
Reference blog: https://zhuanlan.zhihu.com/p/383044774
Official NLLLoss documentation: https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html
CrossEntropyLoss() = log_softmax() + NLLLoss()
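A minimal sketch (the logits and labels below are made-up values) that checks this identity numerically:

import torch
import torch.nn as nn

# made-up logits for 2 samples and 3 classes, plus their target labels
logits = torch.tensor([[2.0, 3.0, 1.0],
                       [3.0, 7.0, 9.0]])
labels = torch.tensor([1, 2])

ce = nn.CrossEntropyLoss()(logits, labels)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), labels)
print(ce, nll)  # the two values should be identical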
NLLLoss explained:
For NLLLoss:
import torch
import torch.nn as nn

if __name__ == '__main__':
    predict = torch.Tensor([[2, 3, 1],
                            [3, 7, 9]])
    label = torch.tensor([1, 2])

    # default reduction is "mean"
    loss = nn.NLLLoss()(predict, label)
    print(loss)
    loss = nn.NLLLoss(reduction="sum")(predict, label)
    print(loss)
    loss = nn.NLLLoss(reduction="mean")(predict, label)
    print(loss)
Output:
tensor(-6.)
tensor(-12.)
tensor(-6.)
By default (reduction="mean"), NLLLoss simply takes the entries selected by the target indices (3 and 9 here), averages them, and negates the result: -(3 + 9) / 2 = -6.
With reduction="sum" it adds them up and negates the result: -(3 + 9) = -12.
So NLLLoss is not meant to be applied directly to raw scores; it needs to be paired with log_softmax, as the sketch below illustrates.
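A minimal sketch (reusing the made-up predict/label from above) that replicates what NLLLoss computes by hand:

import torch
import torch.nn as nn

predict = torch.tensor([[2.0, 3.0, 1.0],
                        [3.0, 7.0, 9.0]])
label = torch.tensor([1, 2])

# pick the entry selected by each label, then negate and reduce
picked = predict[torch.arange(predict.size(0)), label]  # tensor([3., 9.])
print(-picked.mean())                # tensor(-6.), matches reduction="mean"
print(-picked.sum())                 # tensor(-12.), matches reduction="sum"
print(nn.NLLLoss()(predict, label))  # tensor(-6.)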
CrossEntropyLoss explained
a is an n-dimensional vector of raw scores, $$a_i \in (-\infty, +\infty)$$
y is an n-dimensional vector, the ground-truth one-hot label, $$y_i \in \{0, 1\}$$
Example:
a = [2, 1]
y = [1, 0]
$$ CrossEntropyLoss(a, y) = -(y[0]\log(softmax(a)[0]) + y[1]\log(softmax(a)[1])) $$
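Working this example through by hand (values rounded to four decimals):
$$ softmax(a) = \left[\frac{e^2}{e^2+e^1}, \frac{e^1}{e^2+e^1}\right] \approx [0.7311, 0.2689] $$
$$ CrossEntropyLoss(a, y) = -(1 \cdot \log 0.7311 + 0 \cdot \log 0.2689) \approx 0.3133 $$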
import torch
import torch.nn.functional as F

if __name__ == '__main__':
    a = torch.tensor([[2, 1]], dtype=torch.float32)
    softmax_a = torch.softmax(a, dim=1)
    print("softmax_a")
    print(softmax_a)
    log_softmax_a = torch.log(softmax_a)
    print("log_softmax_a")
    print(log_softmax_a)

    one_hot_gt = torch.tensor([[1, 0]], dtype=torch.long)  # one-hot label
    ans_index = torch.tensor([0], dtype=torch.long)        # class-index label

    # loss1: cross entropy computed by hand from the one-hot label
    loss1 = 0
    for i in range(len(one_hot_gt)):
        for j in range(len(one_hot_gt[i])):
            loss1 += -1 * (one_hot_gt[i][j] * log_softmax_a[i][j])
    print("loss1")
    print(loss1)

    # loss2: NLLLoss applied to the log-probabilities
    loss2 = F.nll_loss(log_softmax_a, ans_index)
    print("loss2")
    print(loss2)

    # loss3: CrossEntropyLoss applied directly to the raw scores
    loss3 = F.cross_entropy(a, ans_index)
    print("loss3")
    print(loss3)
Output:
softmax_a
tensor([[0.7311, 0.2689]])
log_softmax_a
tensor([[-0.3133, -1.3133]])
loss1
tensor(0.3133)
loss2
tensor(0.3133)
loss3
tensor(0.3133)
$$ softmax(a)_i = \frac{e^{a_i}}{\sum_{j=1}^n e^{a_j}} $$
Here n is the length of the vector a.
$$ NLLLoss(a, y) = -\sum_{i=1}^n y_i \, a_i $$
$$ CrossEntropyLoss(a, y) = NLLLoss(log\_softmax(a), y) $$
$$ CrossEntropyLoss(a, y) = -\sum_{i=1}^n y_i \log(softmax(a)_i) $$
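A small NumPy sketch (the helper names softmax and cross_entropy are my own) that implements these formulas directly, so they can be cross-checked against the PyTorch results above:

import numpy as np

def softmax(a):
    e = np.exp(a - a.max())  # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(a, y):
    # y is a one-hot vector: -sum_i y_i * log(softmax(a)_i)
    return -(y * np.log(softmax(a))).sum()

a = np.array([2.0, 1.0])
y = np.array([1.0, 0.0])
print(cross_entropy(a, y))  # ~0.3133, matching F.cross_entropy above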
Code
Test of the computation logic:
import torch
import torch.nn.functional as F
import numpy as np

if __name__ == '__main__':
    # target class1 -> predicted class1, target class2 -> predicted class2 (both correct)
    data = torch.tensor(np.array([[2, 1], [1, 2]]), dtype=torch.float32)
    target = torch.tensor([0, 1])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

    # target class1 -> predicted class1, target class1 -> predicted class2 (one wrong)
    data = torch.tensor(np.array([[2, 1], [1, 2]]), dtype=torch.float32)
    target = torch.tensor([0, 0])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

    # target class1 -> predicted class2, target class2 -> predicted class1 (both wrong)
    data = torch.tensor(np.array([[1, 2], [2, 1]]), dtype=torch.float32)
    target = torch.tensor([0, 1])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

    # target class2 -> predicted class1, target class2 -> predicted class2 (one wrong)
    data = torch.tensor(np.array([[2, 1], [1, 2]]), dtype=torch.float32)
    target = torch.tensor([1, 1])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")
Output:
data: tensor([[2., 1.],
[1., 2.]])
target tensor([0, 1])
tensor(0.3133)
data: tensor([[2., 1.],
[1., 2.]])
target tensor([0, 0])
tensor(0.8133)
data: tensor([[1., 2.],
[2., 1.]])
target tensor([0, 1])
tensor(1.3133)
data: tensor([[2., 1.],
[1., 2.]])
target tensor([1, 1])
tensor(0.8133)
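In practice, the two losses correspond to two equivalent training setups: a model that outputs raw logits pairs with nn.CrossEntropyLoss, while a model that ends in log_softmax pairs with nn.NLLLoss. A hedged sketch (the tiny linear model and random data below are made up) showing both give the same loss:

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(4, 3)              # toy classifier: 4 features, 3 classes
x = torch.randn(2, 4)                # 2 random samples
target = torch.tensor([0, 2])

logits = model(x)
loss_ce = nn.CrossEntropyLoss()(logits, target)
loss_nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)
print(loss_ce, loss_nll)             # the two losses should be equal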