• Multi-Depot Vehicle Routing Optimization Based on End-to-End Deep Reinforcement Learning

    Subjects: Computer Science >> Integration Theory of Computer Science | Submitted: 2022-05-18 | Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: This paper proposes an end-to-end deep reinforcement learning framework to improve the efficiency of solving the Multi-Depot Vehicle Routing Problem (MDVRP). It first gives a novel Markov Decision Process (MDP) formulation of the MDVRP, including definitions of the state, action, and reward. It then uses an improved Graph Attention Network (GAT) as the encoder to embed features of the MDVRP graph representation, and designs a Transformer-based decoder. The encoder-decoder model is trained with an improved REINFORCE algorithm. Because the model is not bound to a particular graph size, a trained model can be applied to MDVRP instances of different scales. Results on randomly generated instances and published benchmark instances verify the feasibility and effectiveness of the framework. Notably, on MDVRP instances with 100 customer nodes, the trained model takes only two milliseconds on average to obtain a solution that is very competitive with existing methods.
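    To make the MDP framing concrete, the sketch below shows one possible way to cast MDVRP route construction as a sequence of state, action, and reward steps. It is an illustration only and not the paper's formulation: the class `MDVRPEnv`, the two-phase action scheme (dispatch from a depot, then visit customers), the reward defined as negative Euclidean travel distance, and the nearest-neighbour `greedy_policy` standing in for a learned encoder-decoder policy are all assumptions made here.

```python
import math


class MDVRPEnv:
    """Toy MDVRP episode (hypothetical). Nodes are indexed depots first, then customers."""

    def __init__(self, depots, customers, demands, capacity):
        self.coords = list(depots) + list(customers)
        self.n_depots = len(depots)
        self.demands = [0.0] * self.n_depots + list(demands)
        self.capacity = capacity
        self.reset()

    def reset(self):
        self.visited = set()     # customers already served
        self.home = None         # depot of the active vehicle (None = no active vehicle)
        self.current = None      # node where the active vehicle is located
        self.remaining = 0.0     # remaining capacity of the active vehicle

    def done(self):
        n_customers = len(self.coords) - self.n_depots
        return self.home is None and len(self.visited) == n_customers

    def feasible_actions(self):
        if self.home is None:                    # idle: choose a depot to dispatch from
            return list(range(self.n_depots))
        fits = [i for i in range(self.n_depots, len(self.coords))
                if i not in self.visited and self.demands[i] <= self.remaining]
        if self.current == self.home:            # fresh vehicle must serve a customer
            return fits
        return fits + [self.home]                # otherwise it may also close its route

    def step(self, action):
        """Apply an action and return the reward (assumed: negative travel distance)."""
        assert action in self.feasible_actions()
        if self.home is None:                    # dispatching a vehicle costs nothing
            self.home = self.current = action
            self.remaining = self.capacity
            return 0.0
        cost = math.dist(self.coords[self.current], self.coords[action])
        if action == self.home:                  # route closed, vehicle retired
            self.home = self.current = None
        else:
            self.current = action
            self.visited.add(action)
            self.remaining -= self.demands[action]
        return -cost


def greedy_policy(env):
    """Nearest-neighbour stand-in for a learned policy (assumption, not the paper's model)."""
    actions = env.feasible_actions()
    if env.home is None:
        unvisited = [i for i in range(env.n_depots, len(env.coords))
                     if i not in env.visited]
        return min(actions, key=lambda d: min(
            math.dist(env.coords[d], env.coords[c]) for c in unvisited))
    return min(actions, key=lambda a: math.dist(env.coords[env.current], env.coords[a]))


def rollout(env, policy):
    """Run one episode and return the total reward (negative total route length)."""
    env.reset()
    total = 0.0
    while not env.done():
        total += env.step(policy(env))
    return total


env = MDVRPEnv(depots=[(0, 0), (10, 0)], customers=[(1, 0), (9, 0)],
               demands=[1, 1], capacity=1)
print(rollout(env, greedy_policy))  # -4.0: each depot serves its adjacent customer
```

    In a learned setting, `greedy_policy` would be replaced by sampling from the decoder's distribution over `feasible_actions()`, and the episode return from `rollout` would serve as the signal for a REINFORCE-style update. The sketch assumes every customer demand fits within one vehicle's capacity.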