Legend: Future planning of the ego vehicle; predictions for neighboring vehicles; ground-truth ego trajectory; driving history of the ego vehicle.
Achieving human-like driving behaviors in complex open-world environments is a critical challenge in autonomous driving. Contemporary learning-based planning approaches, such as imitation learning methods, often struggle to balance competing objectives and lack safety assurance, owing to limited adaptability and inadequacy in learning the complex multi-modal behaviors commonly exhibited in human planning, not to mention their strong reliance on fallback strategies with predefined rules. We propose a novel transformer-based Diffusion Planner for closed-loop planning, which can effectively model multi-modal driving behavior and ensure trajectory quality without any rule-based refinement. Our model jointly models the prediction and planning tasks under the same architecture, enabling cooperative behaviors between vehicles. Moreover, by learning the gradient of the trajectory score function and employing a flexible classifier guidance mechanism, Diffusion Planner effectively achieves safe and adaptable planning behaviors. Evaluations on the large-scale real-world autonomous planning benchmark nuPlan and our newly collected 200-hour delivery-vehicle driving dataset demonstrate that Diffusion Planner achieves state-of-the-art closed-loop performance and transfers robustly across diverse driving styles.
Figure 1: Diffusion Planner Architecture
Enforcing versatile and controllable driving behavior is crucial for real-world autonomous driving. For example, vehicles must ensure safety and comfort while adjusting their speed to align with user preferences. Thanks to its close relationship to Energy-Based Models, the diffusion model can conveniently inject such preferences via classifier guidance. We briefly describe some applicable energy functions that can be used to customize the planning behavior of the model.
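As a concrete illustration of the idea, a guided denoising step can shift the model's trajectory estimate down the gradient of such an energy function. The sketch below is a minimal, hypothetical example, not the paper's implementation: the hinge-style `collision_energy`, the static obstacle, and the finite-difference gradient are all our own simplifications standing in for a learned denoiser and analytic guidance gradients.

```python
import numpy as np

def collision_energy(traj, obstacle, safe_dist=2.0):
    """Hypothetical energy: penalize waypoints within safe_dist of a static obstacle.

    traj: (T, 2) array of ego waypoints; obstacle: (2,) obstacle position.
    """
    d = np.linalg.norm(traj - obstacle, axis=-1)         # distance per waypoint
    return np.sum(np.maximum(0.0, safe_dist - d) ** 2)   # hinge-squared penalty

def numerical_grad(energy_fn, traj, eps=1e-4):
    """Finite-difference gradient of the energy w.r.t. the trajectory."""
    grad = np.zeros_like(traj)
    flat, g = traj.ravel(), grad.ravel()                 # views into traj / grad
    for i in range(flat.size):
        orig = flat[i]
        flat[i] = orig + eps
        e_plus = energy_fn(traj)
        flat[i] = orig - eps
        e_minus = energy_fn(traj)
        flat[i] = orig                                   # restore coordinate
        g[i] = (e_plus - e_minus) / (2 * eps)
    return grad

def guided_denoise_step(traj, denoiser, energy_fn, guidance_scale=1.0):
    """One guided step: the denoiser's estimate, nudged down the energy gradient."""
    clean_estimate = denoiser(traj)
    grad = numerical_grad(energy_fn, clean_estimate)
    return clean_estimate - guidance_scale * grad
```

In a full sampler this step would be applied at each diffusion timestep, so the preference (collision avoidance, target speed, comfort) steers the whole denoising process rather than post-processing the final trajectory.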
Figure 2: Trajectories driven under different guidance settings, all starting from the same position.
Case studies for collision guidance: Starting from the same position, we visualized the closed-loop test results. The dashed line with the blurred yellow car represents the results without guidance, while the solid line with the solid car represents the results with guidance.
Case studies for drivable guidance: Starting from the same position, we visualized the closed-loop test results. The dashed line with the blurred yellow car represents the results without guidance, while the solid line with the solid car represents the results with guidance.
Case studies for speed and comfort guidance: For target speed guidance, the speed changes before and after guidance are plotted. For comfort guidance, the longitudinal jerk changes are compared before and after applying comfort guidance on top of collision avoidance guidance.
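For reference, the longitudinal jerk compared in the comfort case study can be approximated from a planned trajectory by finite differences (jerk is the time derivative of acceleration). The sketch below is a simplified illustration; the timestep `dt` and the sum-of-squared-jerk comfort cost are our own assumptions, not the paper's exact formulation.

```python
import numpy as np

def longitudinal_jerk(positions, dt=0.1):
    """Approximate longitudinal jerk along a planned path via finite differences.

    positions: (T, 2) array of ego waypoints; dt: planning timestep (assumed).
    """
    vel = np.diff(positions, axis=0) / dt        # (T-1, 2) velocity vectors
    speed = np.linalg.norm(vel, axis=-1)         # longitudinal speed profile
    accel = np.diff(speed) / dt                  # longitudinal acceleration
    return np.diff(accel) / dt                   # longitudinal jerk

def comfort_cost(positions, dt=0.1):
    """Sum of squared jerk, a common comfort penalty usable as a guidance energy."""
    return float(np.sum(longitudinal_jerk(positions, dt) ** 2))
```

A constant-speed trajectory has zero cost under this measure, while abrupt speed changes are penalized quadratically, which is why comfort guidance smooths the jerk profile on top of collision-avoidance guidance.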
Table 1: Closed-loop planning results on the nuPlan dataset. Blue: the highest score among baselines of each type. *: using pre-searched reference lines as model input provides prior knowledge, reducing the difficulty of planning compared to standard learning-based methods. NR: non-reactive mode. R: reactive mode.
Figure 4: Future trajectory generation visualization. A frame from a challenging narrow road turning scenario in the closed-loop test.
Figure 5: Scenario count by type in the delivery-vehicle driving dataset, with representative visualizations.
We collected approximately 200 hours of real-world data using an autonomous logistics delivery vehicle. The task of the delivery vehicle is similar to that of a robotaxi in nuPlan: it autonomously navigates a designated route. During operation, the vehicle must comply with traffic regulations, ensure safety, and complete the delivery as efficiently as possible.
Compared to the vehicles in the nuPlan dataset, the delivery vehicle
Table 2: Closed-loop planning results on delivery-vehicle driving dataset.
The dataset is coming soon...
@inproceedings{
zheng2025diffusion,
title={Diffusion-Based Planning for Autonomous Driving with Flexible Guidance},
author={Yinan Zheng and Ruiming Liang and Kexin Zheng and Jinliang Zheng and Liyuan Mao and Jianxiong Li and Weihao Gu and Rui Ai and Shengbo Eben Li and Xianyuan Zhan and Jingjing Liu},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=wM2sfVgMDH}
}