At the Chengdu Auto Show, Lang Xianpeng, Vice President of Smart Driving R&D at Li Auto, made a bold statement: Li Auto’s new end-to-end + VLM (Vision-Language Model) solution has surpassed Tesla’s Full Self-Driving (FSD) system.
The end-to-end + VLM dual system is an architecture designed to mimic human thinking and cognition, so that the car drives in a more human-like way.
Built on the "OneModel" framework, this technology integrates perception, decision-making, and execution into one comprehensive model.
Lang emphasized that this breakthrough allows Li Auto to challenge and surpass Tesla, the global leader in autonomous driving.
Li Auto recently launched a trial program, inviting 10,000 users to experience its new end-to-end + VLM system.
This system is offered to Li Auto owners free of charge, allowing them to experience the next-generation Level 3 autonomous driving features firsthand.
End-to-end + VLM
The end-to-end + VLM approach is a significant leap for Li Auto, with AI driving the car almost as if it were "learning" to handle real-world situations independently.
Unlike traditional rule-based, modular architectures, Li Auto's model handles everything—from sensor input to driving output—through a single, integrated model.
Lang elaborated that this system allows for more efficient information processing, reducing delays, redundancies, and potential errors that occur with multi-layered systems.
He revealed that Li Auto's current training compute power has hit 5.39 EFLOPS, and by the end of 2024, it’s expected to surpass 8 EFLOPS.
The company invests over 1 billion yuan annually into boosting compute power, with 2 billion yuan set to be spent this year alone.
According to Lang, achieving full autonomous driving will likely demand up to 100 EFLOPS of compute power.
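To put those figures in proportion, the short calculation below compares the three compute numbers Lang quoted. It is only an illustrative sketch: the constant and function names are made up for this example, and because Lang said compute would "surpass" 8 EFLOPS and full autonomy would need "up to" 100 EFLOPS, the ratios are rough bounds rather than forecasts.

```python
# Compute figures quoted by Lang in the article, in EFLOPS.
# The names below are hypothetical and exist only for this illustration.
CURRENT_EFLOPS = 5.39
END_OF_2024_EFLOPS = 8.0       # "expected to surpass 8 EFLOPS", so a lower bound
FULL_AUTONOMY_EFLOPS = 100.0   # "up to 100 EFLOPS", so an upper estimate


def scale_factor(target: float, baseline: float) -> float:
    """Return how many times larger `target` is than `baseline`."""
    return target / baseline


growth_to_2024 = scale_factor(END_OF_2024_EFLOPS, CURRENT_EFLOPS)
growth_to_full = scale_factor(FULL_AUTONOMY_EFLOPS, CURRENT_EFLOPS)
growth_after_2024 = scale_factor(FULL_AUTONOMY_EFLOPS, END_OF_2024_EFLOPS)

print(f"Current -> end of 2024: {growth_to_2024:.2f}x")
print(f"Current -> full autonomy: {growth_to_full:.1f}x")
print(f"End of 2024 -> full autonomy: {growth_after_2024:.1f}x")
```

On these numbers, the planned 2024 expansion is roughly a 1.5x increase, while the 100 EFLOPS estimate is more than 18 times the current figure.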
Lang explained that the end-to-end solution involves feeding environmental data into a neural network "black box," which processes the inputs and directly outputs driving commands.
According to Lang, this approach leads to faster, more accurate decisions, optimizing tasks like steering, braking, and acceleration in real time.
Compared to traditional systems that divide tasks into separate modules, end-to-end models offer a more seamless, human-like driving experience.
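The structural difference between the two designs can be sketched in a few lines of Python. This is a toy illustration, not Li Auto's or Tesla's implementation: every name here (`perceive`, `plan`, `control`, `toy_model`, the two-value sensor reading) is hypothetical, and `toy_model` is a hand-written placeholder standing in for a trained neural network.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class DrivingCommand:
    steering: float  # -1.0 (full left) to 1.0 (full right)
    throttle: float  # 0.0 to 1.0
    brake: float     # 0.0 to 1.0


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


# --- Modular pipeline: separate hand-written stages with explicit hand-offs ---
# Hypothetical sensor reading: [distance_to_obstacle_m, lane_offset_m]
def perceive(sensor: Sequence[float]) -> dict:
    return {"obstacle_m": sensor[0], "lane_offset_m": sensor[1]}


def plan(scene: dict) -> dict:
    # Hand-coded rule: stop if an obstacle is closer than 10 m.
    return {"stop": scene["obstacle_m"] < 10.0,
            "correction": -scene["lane_offset_m"]}


def control(decision: dict) -> DrivingCommand:
    steering = clamp(decision["correction"], -1.0, 1.0)
    if decision["stop"]:
        return DrivingCommand(steering, 0.0, 1.0)
    return DrivingCommand(steering, 0.5, 0.0)


def modular_drive(sensor: Sequence[float]) -> DrivingCommand:
    return control(plan(perceive(sensor)))


# --- End-to-end: one model maps sensor input straight to a command ---
Model = Callable[[Sequence[float]], Tuple[float, float, float]]


def end_to_end_drive(sensor: Sequence[float], model: Model) -> DrivingCommand:
    steering, throttle, brake = model(sensor)
    return DrivingCommand(steering, throttle, brake)


def toy_model(sensor: Sequence[float]) -> Tuple[float, float, float]:
    # Placeholder for a trained network: a smooth mapping instead of rules.
    brake = clamp(1.0 - sensor[0] / 20.0, 0.0, 1.0)
    throttle = 0.5 * (1.0 - brake)
    steering = clamp(-sensor[1], -1.0, 1.0)
    return steering, throttle, brake


reading = [5.0, 0.2]  # obstacle 5 m ahead, car 0.2 m right of lane centre
print(modular_drive(reading))
print(end_to_end_drive(reading, toy_model))
```

In the modular version, each stage can be inspected and tuned on its own, but information is lost or delayed at every hand-off. In the end-to-end version there are no intermediate interfaces, which is the "black box" trade-off Lang describes: behaviour comes from what the model has learned rather than from rules written for each stage.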
OneModel to Tackle L4 Autonomy
While end-to-end models and VLMs are sufficient for supervised L3 driving, Lang emphasized that the "OneModel" becomes essential when transitioning to unsupervised L4 autonomous driving.
At the L3 level, end-to-end and VLM systems can effectively handle most driving tasks, with VLM providing additional support, such as identifying complex road conditions or interpreting traffic rules.
However, Lang noted that for L4 autonomous driving, where the vehicle must independently handle all scenarios and unexpected events, the computational and data demands increase significantly.
This is where the OneModel comes in—a sophisticated framework that allows the AI to simulate and process a wide array of real-world driving conditions in a virtual environment.
The OneModel equips the car to anticipate and respond to unknown situations, significantly enhancing the safety and decision-making capabilities of the vehicle.
Tesla's Pure End-to-End Neural Network Approach
Li Auto isn’t alone in this shift toward end-to-end models. Tesla was one of the first to deploy this technology, and it’s now expanding its regulatory reach by seeking approval for supervised FSD operations in Europe and China.
Elon Musk said on X that since FSD v12, end-to-end neural networks have been entirely responsible for the task of autonomous driving.
Meanwhile, other automakers like XPeng and NIO are also embracing the end-to-end model.
Back in July, XPeng founder and chairman He Xiaopeng said that the pure end-to-end approach Tesla is using can reach L2 or L3 driving but may fall short of L4 autonomy.
To achieve L4, he believes the end-to-end approach needs to be paired with a large model, which is the approach XPeng is currently adopting.
Li Auto's Lang explained that while Tesla and other competitors also employ end-to-end models, the integration of the OneModel in Li Auto’s system provides a more holistic approach to handling complex and unpredictable driving environments.
3 Trillion Yuan Industry
Research from Guosen Securities shows that the smart driving industry is approaching a tipping point, with a potential market value of nearly 3 trillion yuan.
However, the high costs associated with R&D in this space pose challenges for automakers.
Lang noted that Li Auto's current computing power has already reached 5.39 EFLOPS, with plans to exceed 8 EFLOPS by the end of the year.
The company expects to spend 2 billion yuan on computing power alone in 2024.
Lang stressed that achieving true Level 4 autonomy will require exponential growth in both data and computing power.
"Every year, we’ll need at least $1 billion in investment, and in five years, the demand for computing power will only continue to skyrocket.
It’s not just about spending billions on autonomous driving, but whether we have the data and computing resources to support it," Lang said.