Evaluation Metrics for Multi-Label Classification (Metrics for measuring the prediction quality of a multi-label system)
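Most classification problems we meet in machine learning and deep learning are binary or multi-class, but multi-label classification problems also come up from time to time.
A multi-label classifier needs a reasonable, objective way of being evaluated. Here I reimplement the five metrics (Aiming, Coverage, Accuracy, AbsoluteTrue, AbsoluteFalse) described in Kuo-Chen Chou's paper "Some remarks on predicting multi-label attributes in molecular biosystems". Together they give a fairly comprehensive picture of how well a multi-label classifier performs, and I share them here for reference. (These metrics come from a biomedical background; if you spot a problem, let's discuss and learn together.)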
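For reference, the quantities computed by the code in this post can be written as formulas. The notation is my own assumption, not copied from the paper: N is the number of samples, M the number of labels, Y_k the set of real labels of sample k, and Y_k^* the set of predicted labels. In the code, a term whose intersection is empty is simply counted as 0, which also covers the case of an empty denominator.

```latex
\begin{aligned}
\text{Aiming}        &= \frac{1}{N}\sum_{k=1}^{N}\frac{|Y_k \cap Y_k^{*}|}{|Y_k^{*}|} \\
\text{Coverage}      &= \frac{1}{N}\sum_{k=1}^{N}\frac{|Y_k \cap Y_k^{*}|}{|Y_k|} \\
\text{Accuracy}      &= \frac{1}{N}\sum_{k=1}^{N}\frac{|Y_k \cap Y_k^{*}|}{|Y_k \cup Y_k^{*}|} \\
\text{AbsoluteTrue}  &= \frac{1}{N}\sum_{k=1}^{N}\Delta(Y_k, Y_k^{*}),
\qquad \Delta(Y_k, Y_k^{*}) =
\begin{cases} 1 & \text{if } Y_k = Y_k^{*} \\ 0 & \text{otherwise} \end{cases} \\
\text{AbsoluteFalse} &= \frac{1}{N}\sum_{k=1}^{N}\frac{|Y_k \cup Y_k^{*}| - |Y_k \cap Y_k^{*}|}{M}
\end{aligned}
```

The first four are better when higher (best value 1); AbsoluteFalse is an error rate, so lower is better (best value 0).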
def Aiming(y_hat, y):
    '''
    The "Aiming" rate (also called "Precision") reflects the average ratio of the
    correctly predicted labels over the predicted labels; it measures the percentage
    of the predicted labels that hit the target of the real labels.
    '''
    import numpy as np
    y_hat, y = np.asarray(y_hat), np.asarray(y)  # accept lists as well as arrays
    n, m = y_hat.shape
    score_k = 0
    for v in range(n):
        # count the labels of sample v that are both predicted and real
        intersection = 0
        for h in range(m):
            if y_hat[v,h] == 1 and y[v,h] == 1:
                intersection += 1
        if intersection == 0:
            # no hit contributes 0; this also avoids dividing by zero
            continue
        score_k += intersection/sum(y_hat[v])
    return score_k/n
def Coverage(y_hat, y):
    '''
    The "Coverage" rate (also called "Recall") reflects the average ratio of the
    correctly predicted labels over the real labels; it measures the percentage of
    the real labels that are covered by the hits of the prediction.
    '''
    import numpy as np
    y_hat, y = np.asarray(y_hat), np.asarray(y)  # accept lists as well as arrays
    n, m = y_hat.shape
    score_k = 0
    for v in range(n):
        # count the labels of sample v that are both predicted and real
        intersection = 0
        for h in range(m):
            if y_hat[v,h] == 1 and y[v,h] == 1:
                intersection += 1
        if intersection == 0:
            # no hit contributes 0; this also avoids dividing by zero
            continue
        score_k += intersection/sum(y[v])
    return score_k/n
def Accuracy(y_hat, y):
    '''
    The "Accuracy" rate reflects the average ratio of the correctly predicted labels
    over the total labels, including the correctly and incorrectly predicted labels
    as well as the real labels that are missed in the prediction.
    '''
    import numpy as np
    y_hat, y = np.asarray(y_hat), np.asarray(y)  # accept lists as well as arrays
    n, m = y_hat.shape
    score_k = 0
    for v in range(n):
        union = 0         # labels that are predicted or real
        intersection = 0  # labels that are predicted and real
        for h in range(m):
            if y_hat[v,h] == 1 or y[v,h] == 1:
                union += 1
            if y_hat[v,h] == 1 and y[v,h] == 1:
                intersection += 1
        if intersection == 0:
            # no hit contributes 0; this also avoids dividing by an empty union
            continue
        score_k += intersection/union
    return score_k/n
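To make the three ratios above concrete, here is a small worked example for a single sample, written with plain Python sets. The label vectors and variable names are made up for illustration; each `*_term` is the per-sample value that the functions above would add to their running sum before dividing by the number of samples.

```python
# Hypothetical single sample over 5 labels (illustrative data only).
y_true = [1, 1, 0, 0, 1]   # real labels:      {0, 1, 4}
y_pred = [1, 0, 1, 0, 1]   # predicted labels: {0, 2, 4}

real = {i for i, v in enumerate(y_true) if v == 1}
pred = {i for i, v in enumerate(y_pred) if v == 1}

hits = real & pred     # correctly predicted labels: {0, 4}
either = real | pred   # labels that are real or predicted: {0, 1, 2, 4}

aiming_term = len(hits) / len(pred)      # 2 / 3: share of predictions that are right
coverage_term = len(hits) / len(real)    # 2 / 3: share of real labels that are found
accuracy_term = len(hits) / len(either)  # 2 / 4: hits over everything involved

print(aiming_term, coverage_term, accuracy_term)
```

Accuracy is the strictest of the three because its denominator counts both the wrong predictions and the missed real labels.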
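def AbsoluteTrue(y_hat, y):
    '''
    The "Absolute-True" rate: a sample counts only if its predicted labels match
    the real labels exactly; a single wrong label makes the sample score zero.
    '''
    import numpy as np
    y_hat, y = np.asarray(y_hat), np.asarray(y)  # accept lists as well as arrays
    n, m = y_hat.shape
    score_k = 0
    for v in range(n):
        # exact match of the whole label vector
        if list(y_hat[v]) == list(y[v]):
            score_k += 1
    return score_k/n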
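def AbsoluteFalse(y_hat, y):
    '''
    The "Absolute-False" rate (Hamming loss): the average fraction of labels per
    sample that are wrong, i.e. predicted but not real, or real but missed.
    '''
    import numpy as np
    y_hat, y = np.asarray(y_hat), np.asarray(y)  # accept lists as well as arrays
    n, m = y_hat.shape
    score_k = 0
    for v in range(n):
        union = 0         # labels that are predicted or real
        intersection = 0  # labels that are predicted and real
        for h in range(m):
            if y_hat[v,h] == 1 or y[v,h] == 1:
                union += 1
            if y_hat[v,h] == 1 and y[v,h] == 1:
                intersection += 1
        # union - intersection = number of labels on which prediction and truth disagree
        score_k += (union-intersection)/m
    return score_k/n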
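The nested loops above are easy to follow but slow on large label matrices. As a minimal sketch (not part of the original post), the same five quantities can be computed with NumPy array operations. The function name `multilabel_metrics` and the helper `mean_ratio` are my own, and the sketch assumes both inputs are 0/1 matrices of shape (n_samples, n_labels). Samples with a zero denominator contribute 0, mirroring the `continue` branch in the loop versions.

```python
import numpy as np

def multilabel_metrics(y_hat, y):
    """Vectorized sketch of the five metrics for 0/1 arrays of shape (n, m)."""
    y_hat = np.asarray(y_hat).astype(bool)
    y = np.asarray(y).astype(bool)
    n, m = y_hat.shape

    inter = (y_hat & y).sum(axis=1)  # per-sample count of labels predicted AND real
    union = (y_hat | y).sum(axis=1)  # per-sample count of labels predicted OR real

    def mean_ratio(num, den):
        # Divide only where the denominator is positive; other samples stay 0.
        out = np.zeros(n, dtype=float)
        np.divide(num, den, out=out, where=den > 0)
        return float(out.mean())

    return {
        "Aiming": mean_ratio(inter, y_hat.sum(axis=1)),
        "Coverage": mean_ratio(inter, y.sum(axis=1)),
        "Accuracy": mean_ratio(inter, union),
        "AbsoluteTrue": float((y_hat == y).all(axis=1).mean()),
        "AbsoluteFalse": float(((union - inter) / m).mean()),
    }

# Illustrative data: 3 samples, 4 labels.
y_demo = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
y_hat_demo = [[1, 0, 1, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
scores = multilabel_metrics(y_hat_demo, y_demo)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In this example the first sample is predicted perfectly, the second has no hit at all, and the third gets one of its two labels right, so Aiming and Coverage both come out at 0.5, Accuracy at 4/9, and AbsoluteTrue and AbsoluteFalse at 1/3 each.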