Subjects: Computer Science >> Computer Application Technology submitted time 2023-02-15 Cooperative journals: 《桂林电子科技大学学报》
Abstract: Modern neural networks can produce high-confidence predictions for inputs from outside the training distribution,
posing a potential threat to machine learning models. Detecting out-of-distribution (OOD) inputs is therefore a central issue in
the safe deployment of models in the real world. Energy-based detection methods compute a sample's energy score directly
from the feature vectors extracted by the model, so reliance on insignificant features may degrade
detection performance. To alleviate this problem, a loss function based on sparse regularization is proposed to
fine-tune a pre-trained classification model, increasing the sparsity of in-distribution sample features while
preserving the model's classification ability during learning. This yields lower energy scores for in-distribution
samples and a larger score gap between in-distribution and out-of-distribution samples, thereby improving
detection performance. Moreover, the method introduces no external auxiliary dataset, avoiding the effect of correlation
between samples. Experimental results on the CIFAR-10 and CIFAR-100 datasets show that the method reduces the average
FPR95 across six OOD datasets by 15.02% and 15.41%, respectively.
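The two ingredients the abstract mentions can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: the energy score is the standard free-energy score computed from classifier logits, E(x) = -T · log Σᵢ exp(fᵢ(x)/T), and the sparsity term shown here is an assumed L1 penalty on in-distribution features added to the cross-entropy loss; the paper's actual regularizer may differ.

```python
import numpy as np

def energy_score(logits, T=1.0):
    """Free-energy OOD score: E(x) = -T * logsumexp(f(x)/T).
    Lower (more negative) energy indicates a more in-distribution sample."""
    z = logits / T
    m = z.max(axis=-1, keepdims=True)  # shift for numerical stability
    return -T * (m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1)))

def sparse_regularized_loss(logits, labels, features, lam=1e-3):
    """Illustrative fine-tuning objective (an assumption, not the paper's
    exact loss): cross-entropy plus an L1 sparsity penalty that pushes
    in-distribution feature activations toward zero."""
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()
    sparsity = np.abs(features).mean()  # L1 penalty on features
    return ce + lam * sparsity

# Peaked (confident) logits yield a lower energy than flat logits,
# which is why widening this gap improves OOD separability.
peaked = np.array([[10.0, 0.0, 0.0]])
flat = np.array([[1.0, 1.0, 1.0]])
print(energy_score(peaked))  # ≈ -10.0
print(energy_score(flat))    # ≈ -2.099
```

In this scheme, detection thresholds the energy score: samples with energy above the threshold are flagged as out-of-distribution. Fine-tuning with the sparsity penalty lowers the energy of in-distribution samples, enlarging the gap the threshold exploits.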