Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2020-09-28 Cooperative journals: 《计算机应用研究》
Abstract: Deep neural networks increasingly need to be deployed on devices with limited memory and computing resources, so efficient and compact network structures are required. This paper proposes a model compression method (KE) based on improved attention transfer for designing compact neural networks. A wide residual network (WRN) serves as the teacher and guides a compact student network (KENet) by transferring both spatial and channel-wise attention, and the method is applied to real-time object detection. Image classification experiments on CIFAR verify that knowledge distillation with improved attention transfer improves the performance of the compact model. Object detection experiments on VOC show that the resulting model, KEDet, achieves good accuracy (72.7 mAP) and speed (86 FPS). These results indicate that an object detection model based on improved attention transfer can be both accurate and real-time.
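The idea of matching spatial and channel-wise attention between teacher and student can be sketched as follows. This is a minimal NumPy illustration, not the KE method itself: the function names, the exponent `p`, the weight `beta`, and the assumption that teacher and student feature maps have the same shape at the matched layer are all hypothetical choices made for the example.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero when normalizing


def spatial_attention(feat, p=2):
    """Collapse channels of a (C, H, W) feature map into a unit-norm spatial map."""
    amap = np.sum(np.abs(feat) ** p, axis=0).ravel()
    return amap / (np.linalg.norm(amap) + EPS)


def channel_attention(feat, p=2):
    """Collapse spatial positions of a (C, H, W) feature map into a unit-norm channel vector."""
    amap = np.mean(np.abs(feat) ** p, axis=(1, 2))
    return amap / (np.linalg.norm(amap) + EPS)


def attention_transfer_loss(student_feat, teacher_feat, beta=1.0):
    """Squared distance between attention maps (assumes identical feature shapes)."""
    if student_feat.shape != teacher_feat.shape:
        raise ValueError("this sketch assumes matching feature shapes")
    spatial_term = np.sum(
        (spatial_attention(student_feat) - spatial_attention(teacher_feat)) ** 2
    )
    channel_term = np.sum(
        (channel_attention(student_feat) - channel_attention(teacher_feat)) ** 2
    )
    return float(spatial_term + beta * channel_term)
```

In training, a term like this would typically be added to the student's ordinary task loss. Because the maps are normalized, the loss is insensitive to the overall scale of the activations.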
Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2019-01-28 Cooperative journals: 《计算机应用研究》
Abstract: Most existing deep learning-based methods for hand pose estimation use a standard three-dimensional convolutional neural network (3D CNN) to extract 3D features and estimate the 3D coordinates of hand joints. The features extracted by these methods lack multi-scale information about the hand, which limits the accuracy of hand pose estimation. In addition, because of the high computational cost and memory requirements of the 3D CNN, these methods often struggle to meet real-time requirements. To overcome these weaknesses, the proposed method uses a spatial filter and a depth filter to approximate 3D convolutions, which reduces the number of parameters. The proposed method also extracts and integrates features at multiple scales, making full use of the 3D information of the hand pose. Experiments show that the proposed method improves estimation accuracy, reduces model size, and runs at over 119 FPS on a standard computer with a single GPU.
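The parameter saving from replacing a full 3D convolution with a spatial filter followed by a depth filter can be checked with simple arithmetic. The sketch below assumes a k×k×k kernel factorized into a 1×k×k spatial filter and a k×1×1 depth filter, ignores bias terms, and introduces a hypothetical intermediate channel count `c_mid`; the layer sizes actually used in the paper are not given in the abstract.

```python
def full_conv3d_params(k, c_in, c_out):
    """Weights in a standard k x k x k 3D convolution (bias ignored)."""
    return k * k * k * c_in * c_out


def factorized_conv3d_params(k, c_in, c_out, c_mid=None):
    """Weights in a 1 x k x k spatial filter followed by a k x 1 x 1 depth filter."""
    if c_mid is None:
        c_mid = c_out  # assumption: intermediate width equals the output width
    spatial = k * k * c_in * c_mid
    depth = k * c_mid * c_out
    return spatial + depth


if __name__ == "__main__":
    k, c_in, c_out = 3, 64, 64
    full = full_conv3d_params(k, c_in, c_out)
    factorized = factorized_conv3d_params(k, c_in, c_out)
    print(f"full: {full}, factorized: {factorized}, ratio: {factorized / full:.2f}")
```

For a 3×3×3 kernel with 64 input and 64 output channels, the full convolution has 110592 weights and the factorized version has 49152, under the assumptions stated above.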
Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2019-01-03 Cooperative journals: 《计算机应用研究》
Abstract: With the rapid growth of video sharing applications and platforms, the volume of video data is increasing exponentially. Because the speed and accuracy of current similar-video retrieval methods still cannot meet users' requirements, this paper proposes a fast similar-video retrieval method that combines a three-dimensional convolutional neural network with hash learning and applies it to video data. The method can quickly learn spatiotemporal feature representations of videos and can also greatly shorten retrieval time. Experimental results on the dataset show that the similarity retrieval performance of the proposed method is superior to that of current mainstream methods.
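Hash-based retrieval is fast because similarity search reduces to comparing short binary codes. The sketch below illustrates only that retrieval step: it assumes real-valued video features are already available (for example from a 3D CNN, which is not implemented here) and binarizes them by sign, which is a common simplification rather than the learned hash function described in the paper. The function names are hypothetical.

```python
import numpy as np


def to_hash_codes(features):
    """Binarize real-valued features: 1 where the value is positive, else 0."""
    return (np.asarray(features) > 0).astype(np.uint8)


def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query, with the distances."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    order = np.argsort(dists, kind="stable")  # stable sort keeps ties in database order
    return order, dists[order]


if __name__ == "__main__":
    db_features = np.array([[0.9, -0.2, 0.4, -0.7],
                            [0.3, 0.8, 0.1, -0.5],
                            [-0.6, 0.2, -0.9, 0.4]])
    query_features = np.array([0.5, -0.1, 0.7, -0.3])
    order, dists = hamming_rank(to_hash_codes(query_features), to_hash_codes(db_features))
    print(order, dists)
```

Comparing bits is much cheaper than computing distances between high-dimensional real vectors, which is the source of the shortened retrieval time.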
Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2018-05-20 Cooperative journals: 《计算机应用研究》
Abstract: Blind source separation (BSS) is an active research topic in signal processing. Among the many algorithms for solving BSS, the fixed-point algorithm (FastICA) is known for its fast convergence. However, its convergence is easily affected by the choice of the initial demixing matrix. To address this shortcoming, this paper introduces gradient descent to reduce sensitivity to the initial value and proposes an improved secant method to accelerate convergence. Experimental results show that, compared with other FastICA algorithms, the improved algorithm not only improves separation performance but also reduces the number of iterations and enhances convergence stability. The improved algorithm therefore overcomes the sensitivity to initial value selection and achieves faster and more robust speech separation.
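For context, the baseline that the paper improves on can be sketched as the standard one-unit FastICA fixed-point iteration. The code below is that textbook baseline with a tanh nonlinearity on whitened data, not the paper's variant: the gradient descent initialization and the improved secant step are not reproduced, and the function names and default parameters are assumptions made for the example.

```python
import numpy as np


def whiten(X):
    """Center and whiten mixtures X of shape (n_signals, n_samples)."""
    X = X - X.mean(axis=1, keepdims=True)
    cov = X @ X.T / X.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Whitening matrix D^{-1/2} E^T applied to the centered data
    return (eigvecs / np.sqrt(eigvals)).T @ X


def fastica_one_unit(Z, w0, max_iter=200, tol=1e-8):
    """Estimate one demixing vector from whitened data Z, starting from w0."""
    w = np.asarray(w0, dtype=float)
    w = w / np.linalg.norm(w)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        y = w @ Z
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2
        # Fixed-point update: w+ = E[z g(w^T z)] - E[g'(w^T z)] w
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < tol  # direction stopped changing
        w = w_new
        if converged:
            break
    return w, n_iter
```

The returned iteration count depends on `w0`, which is the initial-value sensitivity the abstract refers to. Running the function from several starting vectors gives a simple way to observe it.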