Multi-Depot Vehicle Routing Optimization Based on End-to-End Deep Reinforcement Learning (postprint)

Author: 雷坤 ¹, 郭鹏 ¹,³, 王祺欣 ¹, 赵文超 ¹, 唐连生 ²
Institute:

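1. School of Mechanical Engineering, Southwest Jiaotong University

2. School of Economics and Management, Ningbo University of Technology

3. Technology and Equipment of Rail Transit Operation and Maintenance Key Laboratory of Sichuan Province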
Submit Time: 2022-05-18 16:08:24

Abstract: This paper proposes an end-to-end deep reinforcement learning framework to improve the efficiency of solving the Multi-Depot Vehicle Routing Problem (MDVRP). First, it gives a novel formulation of the MDVRP as a Markov Decision Process (MDP), including definitions of the state, action, and reward. Second, it uses an improved Graph Attention Network (GAT) as the encoder to embed features of the graph representation of the MDVRP, paired with a Transformer-based decoder. The encoder-decoder model is trained with an improved REINFORCE algorithm. The model is not bound to a fixed graph size: once trained, it can solve MDVRP instances of different scales. Results on randomly generated instances and published benchmark instances verify the feasibility and effectiveness of the proposed framework. Notably, even for MDVRP instances with 100 customer nodes, the trained model takes only two milliseconds on average to obtain a solution that is highly competitive with existing methods.

Keywords: multi-depot vehicle routing problem; deep reinforcement learning; graph neural network; REINFORCE algorithm; Transformer model

Journal: Application Research of Computers (计算机应用研究)
Subject: Computer Science >> Integration Theory of Computer Science
Cite as: ChinaXiv:202205.00136 (or this version ChinaXiv:202205.00136V1)
DOI:10.12074/202205.00136V1
CSTR:32003.36.ChinaXiv.202205.00136.V1
Recommended reference: 雷坤, 郭鹏, 王祺欣, 赵文超, 唐连生. (2022). Multi-Depot Vehicle Routing Optimization Based on End-to-End Deep Reinforcement Learning. Application Research of Computers (计算机应用研究). [ChinaXiv:202205.00136]

