A Summary of CrossEntropyLoss and NLLLoss

Introduction

nll_loss (negative log likelihood loss): the negative log-likelihood cost function, which comes from maximum likelihood estimation.

CrossEntropyLoss: the cross-entropy loss function. Cross entropy measures the distance between two probability distributions; the smaller the cross entropy, the closer the two distributions are.

The input to NLLLoss is a vector of log-probabilities together with a target label. It does not compute the log-probabilities for us, so it is suited to networks whose last layer is log_softmax. The loss function nn.CrossEntropyLoss() is the same as NLLLoss(), except that it applies the log softmax for us.

Reference blog: https://zhuanlan.zhihu.com/p/383044774

Official NLLLoss documentation: https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html

CrossEntropyLoss()=log_softmax() + NLLLoss()
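
This identity can be checked directly; the scores and labels in the minimal sketch below are arbitrary illustration values:

import torch
import torch.nn.functional as F

if __name__ == '__main__':
    # Arbitrary raw scores for 2 samples and 3 classes, plus their target class indices
    scores = torch.tensor([[2.0, 3.0, 1.0],
                           [3.0, 7.0, 9.0]])
    target = torch.tensor([1, 2])

    # cross_entropy on raw scores should equal nll_loss on log_softmax of the same scores
    print(F.cross_entropy(scores, target))
    print(F.nll_loss(F.log_softmax(scores, dim=1), target))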

NLLLoss explained:

For NLLLoss:

import torch
import torch.nn.functional as F
import torch.nn as nn
import numpy as np

if __name__ == '__main__':
    # Raw scores for a batch of 2 samples with 3 classes each
    # (note: these are NOT log-probabilities)
    predict = torch.Tensor([[2, 3, 1],
                            [3, 7, 9]])
    label = torch.tensor([1, 2])

    # Default reduction is "mean"
    loss = nn.NLLLoss()(predict, label)
    print(loss)

    loss = nn.NLLLoss(reduction="sum")(predict, label)
    print(loss)

    loss = nn.NLLLoss(reduction="mean")(predict, label)
    print(loss)

Output:

tensor(-6.)
tensor(-12.)
tensor(-6.)

By default, NLLLoss takes the values picked out by the target labels, averages them, and negates the result.

With reduction="sum", it adds all the picked values up and negates the result, as worked out below.
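
Concretely, with the predict and label above, the values picked out are predict[0][1] = 3 and predict[1][2] = 9, so:

$$ \text{mean}: -\frac{3 + 9}{2} = -6, \qquad \text{sum}: -(3 + 9) = -12 $$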

Clearly, NLLLoss is not meant to be applied to raw scores by itself; it needs to be paired with log softmax, as in the sketch below.
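
A minimal sketch of the intended pairing, reusing the predict and label values from the example above:

import torch
import torch.nn as nn
import torch.nn.functional as F

if __name__ == '__main__':
    predict = torch.Tensor([[2, 3, 1],
                            [3, 7, 9]])
    label = torch.tensor([1, 2])

    # Turn raw scores into log-probabilities first, then apply NLLLoss
    log_probs = F.log_softmax(predict, dim=1)
    loss = nn.NLLLoss()(log_probs, label)
    print(loss)  # a proper positive loss, equal to nn.CrossEntropyLoss()(predict, label)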

CrossEntropyLoss explained

a is an n-dimensional vector, $$a_i\in(-\infty,+\infty)$$

y is an n-dimensional vector, the ground-truth one-hot label, $$y_i \in \{0, 1\}$$

Example:

a=[2, 1]
y=[1, 0]

$$ CrossEntropyLoss(a, y) = -(y[0]log(softmax_a[0])+y[1]log(softmax_a[1])) $$
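
Plugging the numbers in (these match the output of the code below):

$$ softmax_a = \left[\tfrac{e^2}{e^2+e^1}, \tfrac{e^1}{e^2+e^1}\right] \approx [0.7311, 0.2689] $$

$$ CrossEntropyLoss(a, y) = -(1\cdot\log 0.7311 + 0\cdot\log 0.2689) \approx 0.3133 $$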

import torch
import torch.nn.functional as F
import numpy as np

if __name__ == '__main__':
    a = torch.tensor([[2, 1]], dtype=torch.float32)
    softmax_a = torch.softmax(a, dim=1)
    print("softmax_a")
    print(softmax_a)

    log_softmax_a = torch.log(softmax_a)
    print("log_softmax_a")
    print(log_softmax_a)

    # The same target expressed as a one-hot vector and as a class index
    one_hot_gt = torch.tensor([[1, 0]], dtype=torch.long)
    ans_index = torch.tensor([0], dtype=torch.long)

    # loss1: cross entropy computed by hand from the formula above
    loss1 = 0
    for i in range(len(one_hot_gt)):
        for j in range(len(one_hot_gt[i])):
            loss1 += -1*(one_hot_gt[i][j]*log_softmax_a[i][j])
    print("loss1")
    print(loss1)

    # loss2: NLLLoss applied to the log-probabilities
    loss2 = F.nll_loss(log_softmax_a, ans_index)
    print("loss2")
    print(loss2)

    # loss3: CrossEntropyLoss applied directly to the raw scores
    loss3 = F.cross_entropy(a, ans_index)
    print("loss3")
    print(loss3)

Output:

tensor([[0.7311, 0.2689]])
tensor([[-0.3133, -1.3133]])
tensor(0.3133)
tensor(0.3133)
tensor(0.3133)

$$ softmax(a)_i = \frac{e^{a_i}}{\sum_{j=1}^n e^{a_j}} $$

Here n is the length of the vector a.

$$ NLLLoss(a, y) = -\sum_{i=1}^n y_i \cdot a_i $$

$$ CrossEntropyLoss(a, y) = NLLLoss(log\_softmax(a), y) $$

$$ CrossEntropyLoss(a, y) = -\sum_{i=1}^n y_i \cdot log(softmax(a)_i) $$
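
Taken together, these formulas can be checked with a small sketch; cross_entropy_manual below is a hypothetical helper written just for this illustration, and the tensors reuse the values from the tests that follow:

import torch
import torch.nn.functional as F

# Hypothetical helper: CrossEntropyLoss(a, y) = NLLLoss(log_softmax(a), y),
# written out directly from the one-hot formula above
def cross_entropy_manual(a, y_index):
    log_p = torch.log_softmax(a, dim=1)
    # one-hot encode the target indices, then -sum(y_i * log(softmax(a)_i)), averaged over the batch
    y = F.one_hot(y_index, num_classes=a.shape[1]).to(log_p.dtype)
    return -(y * log_p).sum(dim=1).mean()

if __name__ == '__main__':
    a = torch.tensor([[2.0, 1.0], [1.0, 2.0]])
    y_index = torch.tensor([0, 1])
    print(cross_entropy_manual(a, y_index))  # manual formula
    print(F.cross_entropy(a, y_index))       # built-in, should print the same value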

Code

Test the computation logic:

import torch
import torch.nn.functional as F
import numpy as np

if __name__ == '__main__':
    # both samples predicted correctly: row 0 favors class 0 (target 0), row 1 favors class 1 (target 1)
    data = torch.tensor(np.array([[2, 1], [1, 2]]), dtype=torch.float32)
    target = torch.tensor([0, 1])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

    # sample 1 correct, sample 2 wrong: row 1 favors class 1 but the target is class 0
    data = torch.tensor(np.array([[2, 1], [1, 2]]), dtype=torch.float32)
    target = torch.tensor([0, 0])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

    # both samples wrong: each row favors the other class
    data = torch.tensor(np.array([[1, 2], [2, 1]]), dtype=torch.float32)
    target = torch.tensor([0, 1])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

    # sample 1 wrong, sample 2 correct: row 0 favors class 0 but the target is class 1
    data = torch.tensor(np.array([[2, 1], [1, 2]]), dtype=torch.float32)
    target = torch.tensor([1, 1])
    print("data:", data)
    print("target", target)
    print("")
    entropy_out = F.cross_entropy(data, target)
    print(entropy_out)
    print("")

Output:

data: tensor([[2., 1.],
        [1., 2.]])
target tensor([0, 1])

tensor(0.3133)

data: tensor([[2., 1.],
        [1., 2.]])
target tensor([0, 0])

tensor(0.8133)

data: tensor([[1., 2.],
        [2., 1.]])
target tensor([0, 1])

tensor(1.3133)

data: tensor([[2., 1.],
        [1., 2.]])
target tensor([1, 1])

tensor(0.8133)
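
Each result is just the average of the per-sample losses: a sample predicted correctly (larger score on the target class, e.g. [2, 1] with target 0) contributes about 0.3133, and a sample predicted wrongly contributes about 1.3133. The second and fourth cases have one of each, hence:

$$ \frac{0.3133 + 1.3133}{2} = 0.8133 $$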