9.1. How does Apollo implement Reinforcement Learning for Trajectory Planning?

Apollo implements vehicle trajectory planning through reinforcement learning. How can this implementation be run, and how can the existing reinforcement learning algorithms be improved?

9.1.1. Answer