Abstract
Deep Neural Networks (DNNs) have achieved high accuracy in multiple Natural Language Processing (NLP) applications. This success relies on the test data being drawn from the same distribution as the training samples. However, research has found that current models classify out-of-distribution, adversarial, and erroneous samples incorrectly with high confidence. Researchers have also traced the problem to the softmax layer of the DNN. In this paper, we address this issue and propose a method that bypasses the softmax layer in the DNN architecture. Specifically, we estimate the parameters of the distribution of the training samples' pre-softmax outputs using the Dirichlet Process Gaussian Mixture Model (DPGMM). Then, we compute the Mahalanobis distance between a test sample and the distribution of the training samples to obtain the classification result. We evaluate our method on a classic NLP task, sentiment analysis, by conducting extensive experiments on different models across several real-world datasets. The results demonstrate that our method assigns correct labels to samples that are misclassified by current DNNs with a softmax layer. Our method can be applied to any pre-trained DNN without re-training the model, and it does not require supervised learning.
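The distance-based classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the mixture components have already been identified, and it replaces the DPGMM fit with a plain per-component mean and covariance estimate computed from a hypothetical `assignments` array. The function names, the ridge term, and the toy two-dimensional "pre-softmax" vectors are all assumptions made for illustration.

```python
import numpy as np


def fit_components(features, assignments):
    """Estimate a mean and an inverse covariance for each component.

    Stand-in for the DPGMM fit (assumption): `assignments` is a
    hypothetical array saying which component each training vector
    belongs to.
    """
    params = {}
    for k in np.unique(assignments):
        x = features[assignments == k]
        mean = x.mean(axis=0)
        # A small ridge keeps the covariance invertible (assumption).
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
        params[int(k)] = (mean, np.linalg.inv(cov))
    return params


def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance from x to a Gaussian component."""
    d = x - mean
    return float(d @ inv_cov @ d)


def classify(x, params):
    """Return the component whose distribution is closest to x."""
    return min(params, key=lambda k: mahalanobis_sq(x, *params[k]))


# Toy stand-ins for pre-softmax outputs of a two-class sentiment model.
rng = np.random.default_rng(0)
neg = rng.normal(loc=[-3.0, 3.0], scale=0.5, size=(200, 2))
pos = rng.normal(loc=[3.0, -3.0], scale=0.5, size=(200, 2))
features = np.vstack([neg, pos])
assignments = np.array([0] * 200 + [1] * 200)

params = fit_components(features, assignments)
pred_neg = classify(np.array([-2.5, 2.0]), params)
pred_pos = classify(np.array([2.0, -2.5]), params)
```

In this sketch the label is taken from the nearest component rather than from the largest softmax probability, which is the idea the abstract outlines. How components are mapped to sentiment labels in the actual method is not stated in the abstract.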
Original language | English |
---|---|
Title of host publication | IJCNN 2021 - International Joint Conference on Neural Networks, conference proceedings |
Place of Publication | Piscataway, NJ |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Number of pages | 7 |
ISBN (Electronic) | 9780738133669 |
Publication status | Published - 2021 |
Event | 2021 International Joint Conference on Neural Networks, IJCNN 2021 - Virtual, Shenzhen, China Duration: 18 Jul 2021 → 22 Jul 2021 |
Conference
Conference | 2021 International Joint Conference on Neural Networks, IJCNN 2021 |
---|---|
Country/Territory | China |
City | Virtual, Shenzhen |
Period | 18/07/21 → 22/07/21 |
Keywords
- Deep Neural Network
- Misclassification samples
- Sentiment Analysis
- Softmax