XPENG-Peking University Collaborative Research Accepted by AAAI 2026: Introducing a Novel Visual Token Pruning Framework for Autonomous Driving
- XPENG-PKU Research Breakthrough: XPENG, in collaboration with Peking University, has developed FastDriveVLA, a novel visual token pruning framework that enables autonomous driving AI to "drive like a human" by focusing only on essential information, achieving a 7.5x reduction in computational load.
- Top-Tier AI Recognition: The research has been accepted by AAAI 2026, one of the world's premier AI conferences, which had a highly selective acceptance rate of just 17.6% this year.
- Accelerating L4 Autonomy: This achievement underscores XPENG's full-stack capabilities in AI-driven mobility and advances the industry toward efficient, scalable deployment of next-generation autonomous driving systems.
The paper introduces FastDriveVLA, an efficient visual token pruning framework specifically designed for end-to-end autonomous driving Vision-Language-Action (VLA) models. This work offers a new approach to visual token pruning by enabling AI to "drive like a human", focusing only on essential visual information while filtering out irrelevant data.
As large AI models evolve rapidly, VLA models are being widely adopted in end-to-end autonomous driving systems due to their strong capabilities in complex scene understanding and action reasoning. These models encode camera images into large numbers of visual tokens, which serve as the foundation for the model to "see" the world and make driving decisions. However, processing large numbers of tokens increases the computational load onboard the vehicle, impacting inference speed and real-time performance.
While visual token pruning has been recognized as a viable way to accelerate VLA inference, existing approaches, whether based on text-visual attention or token similarity, have shown limitations in driving scenarios. To address this, FastDriveVLA introduces an adversarial foreground-background reconstruction strategy that enhances the model's ability to identify and retain valuable tokens. On the nuScenes autonomous driving benchmark, FastDriveVLA achieved state-of-the-art performance across various pruning ratios. When the number of visual tokens was reduced from 3,249 to 812, the framework achieved a nearly 7.5x reduction in computational load while maintaining high planning accuracy.
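For illustration only, the core idea of score-based visual token pruning, ranking tokens by estimated relevance and keeping only the top ones, can be sketched in a few lines of Python. The scoring values and selection rule below are hypothetical simplifications; they are not XPENG's actual implementation, which learns to separate foreground from background via adversarial reconstruction.

```python
def prune_visual_tokens(tokens, scores, keep):
    """Keep the `keep` highest-scoring visual tokens.

    tokens: list of token embeddings (one per image patch)
    scores: per-token relevance scores (a learned scorer in practice;
            dummy values are used in the example below)
    keep:   number of tokens to retain
    """
    # Rank token indices by score and keep the top `keep` of them.
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:keep]
    # Restore the original (spatial) order of the surviving tokens.
    order.sort()
    return [tokens[i] for i in order]

# Example matching the paper's reported ratio: 3,249 tokens down to 812.
tokens = [[float(i)] for i in range(3249)]               # dummy 1-dim "embeddings"
scores = [((i * 37) % 100) / 100 for i in range(3249)]   # dummy relevance scores
pruned = prune_visual_tokens(tokens, scores, keep=812)
print(len(pruned))  # 812
```

Since the downstream language model's cost grows with sequence length, keeping roughly a quarter of the tokens cuts per-step compute substantially, consistent with the reported ~7.5x reduction.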
For more information, please visit https://www.xpeng.com/.
Contacts:
For Media Enquiries:
Email: liangrq3@xiaopeng.com
View original content: https://www.prnewswire.com/news-releases/xpeng-peking-university-collaborative-research-accepted-by-aaai-2026-introducing-a-novel-visual-token-pruning-framework-for-autonomous-driving-302650038.html