Reinforcement Learning in Eco-driving for Connected and Automated Vehicles
Author | : Zhaoxuan Zhu |
Release | : 2021 |
OCLC | : 1337862963 |
Book excerpt: Connected and Automated Vehicles (CAVs) can significantly improve transportation efficiency by taking advantage of advanced connectivity technologies. Meanwhile, combining CAVs with powertrain electrification, as in Hybrid Electric Vehicles (HEVs) and Plug-in Hybrid Electric Vehicles (PHEVs), offers greater potential to improve fuel economy because of the extra control flexibility compared with vehicles that have a single power source. In this context, the eco-driving control optimization problem seeks the optimal speed and powertrain component usage profiles, based on information received through advanced mapping or Vehicle-to-Everything (V2X) communications, that minimize the energy consumed by the vehicle over a given itinerary. To overcome the real-time computational complexity and embrace the stochastic nature of the driving task, this dissertation studies the application and extension of state-of-the-art (SOTA) Deep Reinforcement Learning (Deep RL, DRL) algorithms to the eco-driving problem for a mild HEV. For better training and a more comprehensive evaluation, an RL environment is developed, consisting of a mild HEV powertrain and vehicle dynamics model and a large-scale microscopic traffic simulator. To benchmark the developed strategies, three references are implemented: two causal controllers, namely a baseline strategy representing human drivers and a deterministic optimal-control-based strategy, and the non-causal wait-and-see solution. In the first RL application, the eco-driving problem is formulated as a Partially Observable Markov Decision Process (POMDP), and a SOTA model-free DRL (MFDRL) algorithm, Proximal Policy Optimization (PPO) with Long Short-Term Memory (LSTM) as the function approximator, is used.
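The excerpt names Proximal Policy Optimization as the model-free algorithm but does not describe the implementation. As a rough illustration only, the sketch below computes the standard clipped surrogate objective that Proximal Policy Optimization maximizes, for a batch of probability ratios and advantage estimates. The function name, the plain-list inputs, and the clipping value of 0.2 are assumptions made for this example; the recurrent (Long Short-Term Memory) network, the reward design, and the hyperparameters used in the dissertation are not given here and are not reproduced.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range; 0.2 is an assumed, commonly used value
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal length")

    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)
```

Taking the minimum of the two terms removes any incentive to move the policy far from the one that collected the data: once the ratio leaves the clipping range in the direction that would increase the objective, the term stops growing.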
Evaluated over 100 trips randomly generated in the city of Columbus, OH, the MFDRL agent shows a 17% fuel economy improvement over the baseline strategy while keeping the average travel time comparable. While performing comparably to the optimal-control-based strategy, the actor of the MFDRL agent offers an explicit control policy that significantly reduces onboard computation. Subsequently, a model-based DRL (MBDRL) algorithm, Safe Model-based Off-policy Reinforcement Learning (SMORL), is proposed. The algorithm addresses the following issues that emerged from the MFDRL development: a) the cumbersome process of designing the reward mechanism, b) the lack of constraint satisfaction and feasibility guarantees, and c) the low sample efficiency. Specifically, SMORL consists of three key components: a massively parallelizable dynamic programming trajectory optimizer, a value function learned in an off-policy fashion, and a safe set learned as a generative model. Evaluated under the same conditions, the SMORL agent shows a 21% reduction in fuel consumption relative to the baseline and outperforms both the MFDRL agent and the deterministic optimal-control-based controller.
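To give a feel for what a dynamic programming trajectory optimizer does in an eco-driving setting, the following is a minimal toy sketch, not the SMORL algorithm. It plans a speed profile over a route split into equal-length segments by backward recursion on a coarse speed grid, subject to per-segment speed limits and a bound on the speed change between segments. Everything specific here is an assumption made for illustration: the vehicle mass, the lumped drag coefficient, the speed grid, the cost terms and the time weight. It has no hybrid powertrain model, no learned value function and no learned safe set, and it runs sequentially rather than in parallel.

```python
import math

# Hypothetical parameters, chosen only for illustration.
MASS_KG = 1500.0
DRAG_COEFF = 0.4  # lumped 0.5 * rho * Cd * A, in kg/m
SPEED_GRID = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0]  # candidate speeds, m/s


def stage_cost(v, v_next, seg_len, time_weight):
    """Toy cost of one segment: traction energy (J) plus a travel-time penalty."""
    v_avg = 0.5 * (v + v_next)
    kinetic = max(0.0, 0.5 * MASS_KG * (v_next ** 2 - v ** 2))  # no regeneration
    drag = DRAG_COEFF * v_avg ** 2 * seg_len
    return kinetic + drag + time_weight * (seg_len / v_avg)


def plan_speeds(speed_limits, seg_len=100.0, dv_max=5.0, time_weight=2000.0):
    """Backward DP: start and end at standstill, respect limits and dv_max."""
    n, m = len(speed_limits), len(SPEED_GRID)
    cost_to_go = [[math.inf] * m for _ in range(n + 1)]
    best_next = [[None] * m for _ in range(n)]
    cost_to_go[n][0] = 0.0  # terminal constraint: stop at the end of the route

    for k in range(n - 1, -1, -1):
        for i, v in enumerate(SPEED_GRID):
            for j, v_next in enumerate(SPEED_GRID):
                if v_next > speed_limits[k] or abs(v_next - v) > dv_max:
                    continue  # violates speed limit or speed-change bound
                if v + v_next == 0.0:
                    continue  # the vehicle would never leave the segment
                total = stage_cost(v, v_next, seg_len, time_weight) + cost_to_go[k + 1][j]
                if total < cost_to_go[k][i]:
                    cost_to_go[k][i] = total
                    best_next[k][i] = j

    # Forward rollout of the optimal decisions, starting from standstill.
    speeds, i = [0.0], 0
    for k in range(n):
        i = best_next[k][i]
        if i is None:
            raise ValueError("no feasible speed profile for these constraints")
        speeds.append(SPEED_GRID[i])
    return speeds, cost_to_go[0][0]
```

Calling `plan_speeds([15.0, 15.0, 10.0, 15.0])` returns a list of five node speeds that starts and ends at zero, together with the total toy cost. The constraints are enforced by pruning infeasible transitions inside the recursion, so any returned profile is feasible by construction; a single-segment route raises `ValueError` because the vehicle cannot both start and end at standstill.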