Abstract:
To improve text classification accuracy and to address the insufficient use of node features by text graph convolutional networks, this paper proposes a new text classification model that combines the strengths of text graph convolution and the Stacking ensemble learning method. The model first uses a text graph convolutional network to learn global representations of documents and words together with the grammatical structure information of the documents. An ensemble learner then performs a second stage of learning on the features extracted by the text graph convolution, which compensates for the underused node features and improves both single-label text classification accuracy and the generalization ability of the overall model. To reduce the time cost of ensemble learning, the fusion algorithm removes the k-fold cross-validation mechanism from the ensemble stage. In this way, the fusion algorithm couples text graph convolution with the Stacking ensemble learning method. Compared with the traditional classification model, classification performance on the R8, R52, MR, Ohsumed, and 20NG datasets improves by more than 1.5%, 2.5%, 11%, 12%, and 7%, respectively. The method also performs well against other classification algorithms in the same field.
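The two-stage idea can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: the toy 2-D vectors stand in for document features produced by the text graph convolution, the base learners (`NearestCentroid`, `KNearest`) and the lookup-table meta-learner (`VoteTableMeta`) are hypothetical choices, and a single fixed hold-out split is used as one possible replacement for k-fold cross-validation.

```python
import math
from collections import Counter, defaultdict

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class NearestCentroid:
    """Base learner: assign each sample to the class with the closest mean vector."""
    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: euclidean(x, self.centroids[c]))
                for x in X]

class KNearest:
    """Base learner: majority vote among the k closest training samples."""
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        out = []
        for x in X:
            order = sorted(range(len(self.X)), key=lambda i: euclidean(x, self.X[i]))
            votes = Counter(self.y[i] for i in order[:self.k])
            out.append(votes.most_common(1)[0][0])
        return out

class VoteTableMeta:
    """Meta-learner: map each tuple of base predictions to its most frequent true label."""
    def fit(self, base_preds, y):
        table = defaultdict(Counter)
        for preds, label in zip(base_preds, y):
            table[tuple(preds)][label] += 1
        self.table = {k: c.most_common(1)[0][0] for k, c in table.items()}
        return self

    def predict(self, base_preds):
        # Unseen combinations fall back to the first base learner's prediction.
        return [self.table.get(tuple(p), p[0]) for p in base_preds]

def fit_stacking(features, labels, holdout_every=3):
    """Train base learners on one part and the meta-learner on a single hold-out part.

    Assumption: one fixed split replaces the k-fold loop of standard Stacking.
    """
    base_idx = [i for i in range(len(features)) if i % holdout_every != 0]
    meta_idx = [i for i in range(len(features)) if i % holdout_every == 0]
    bases = [NearestCentroid(), KNearest(k=3)]
    for model in bases:
        model.fit([features[i] for i in base_idx], [labels[i] for i in base_idx])
    meta_X = [features[i] for i in meta_idx]
    base_preds = list(zip(*(m.predict(meta_X) for m in bases)))
    meta = VoteTableMeta().fit(base_preds, [labels[i] for i in meta_idx])
    return bases, meta

def predict_stacking(bases, meta, features):
    base_preds = list(zip(*(m.predict(features) for m in bases)))
    return meta.predict(base_preds)

# Toy stand-in for graph-convolution document features (two separated classes).
features = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.3], [0.3, 0.1], [0.2, 0.2],
            [2.0, 2.1], [2.2, 2.0], [2.1, 2.2], [2.0, 2.3], [2.3, 2.1], [2.2, 2.2]]
labels = ["a"] * 6 + ["b"] * 6

bases, meta = fit_stacking(features, labels)
print(predict_stacking(bases, meta, [[0.1, 0.1], [2.1, 2.1]]))  # ['a', 'b']
```

Each base learner is trained once rather than k times, which is where the time saving would come from in this sketch. The trade-off is that the meta-learner sees fewer out-of-sample predictions than it would under k-fold cross-validation.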