Deep Reinforcement Learning-Based Approaches for the MPPT Control of Standalone Solar PV Systems

Nwachukwu, Sampson

Thesis / Dissertation

2024

Publisher

University of Cape Town

Department

Department of Electrical Engineering

Faculty

Faculty of Engineering and the Built Environment

Abstract

Solar photovoltaic (PV) systems convert solar energy into electricity and drastically reduce the use of fossil fuels, which would otherwise cause environmental degradation. However, the first major challenge in adopting solar PV for electricity generation is that its operation depends strongly on weather conditions, including changes in temperature and irradiance. As a result, researchers employ maximum power point tracking (MPPT) methods to ensure that a solar PV system operates at its maximum power point (MPP) under varying weather conditions, which increases its energy conversion efficiency. A second challenge is that environmental obstructions such as buildings, clouds, and trees can shade series-connected solar panels and change how each panel behaves. This phenomenon is known as partial shading conditions (PSCs). It can create one global MPP (GMPP) and multiple local MPPs (LMPPs) on the solar PV array's current-voltage (I-V) and power-voltage (P-V) curves. It directly reduces the system's potential energy production and consequently makes developing a reliable MPPT control system quite challenging. The MPPT control problem is commonly addressed using classical techniques such as hill-climbing (HC) and perturb and observe (P&O), because these methods are easy and inexpensive to implement. Although these methods work well under uniform irradiation, they usually suffer from power oscillations close to the MPP, drift problems, and slow tracking speed, and they often fail to track the solar PV GMPP when partial shading is present. Recently, metaheuristic optimization algorithms, including particle swarm optimization (PSO) and the genetic algorithm (GA), have been used to address the limitations of classical methods.
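The behaviour of a fixed-step P&O tracker described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-peak `pv_power` curve is a toy stand-in for a partially shaded P-V curve (it is not a PV model from the dissertation), and the step size and starting voltages are arbitrary.

```python
import math


def pv_power(v):
    """Toy P-V curve (hypothetical): local peak near 12 V, global peak near 30 V."""
    local_peak = 60.0 * math.exp(-((v - 12.0) ** 2) / (2 * 3.0 ** 2))
    global_peak = 100.0 * math.exp(-((v - 30.0) ** 2) / (2 * 4.0 ** 2))
    return local_peak + global_peak


def perturb_and_observe(v_start, step=0.5, iterations=200):
    """Fixed-step P&O: keep perturbing in the same direction while power rises,
    reverse the direction whenever power falls."""
    v = v_start
    p_prev = pv_power(v)
    direction = 1.0
    history = []
    for _ in range(iterations):
        v += direction * step          # perturb the operating voltage
        p = pv_power(v)                # observe the new power
        if p < p_prev:                 # power dropped: reverse direction
            direction = -direction
        p_prev = p
        history.append((v, p))
    return history


if __name__ == "__main__":
    for start in (8.0, 26.0):
        v_end, p_end = perturb_and_observe(start)[-1]
        print(f"start={start:4.1f} V  final V={v_end:5.1f}  final P={p_end:6.2f}")
```

On this toy curve, a tracker started on the low-voltage side settles around the local peak and never crosses the valley to the global peak, and in both cases the operating point keeps stepping back and forth around the peak instead of coming to rest, which mirrors the oscillation and LMPP-trapping issues noted above.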
The PSO and GA can track the solar PV MPP under uniform irradiation and PSCs. However, the optimal operation of solar PV is not always guaranteed, because the algorithms must be restarted whenever the irradiation changes, which prolongs their convergence time. They also require accurate modelling of solar PV parameters (i.e., they are model-based), which is difficult to achieve for complex systems and weather conditions. Furthermore, they cannot learn solar PV behaviour or store and reuse optimization information for future optimization tasks. Thus, it is crucial to develop robust and reliable algorithms that can learn and adjust to the varying behaviour of solar PV under various environmental conditions, especially under PSCs. Reinforcement learning (RL) is a model-free optimization approach that can learn solar PV behaviour and store and reuse optimization information for future optimization tasks. Because it learns from data, RL adapts more flexibly to unpredictable environmental conditions than the classical and metaheuristic methods. In this study, a solar PV array's MPPT control problem under PSCs is investigated and addressed by applying three model-free, off-policy deep reinforcement learning (DRL) algorithms: Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), and Deep Q-Network (DQN). They are used mainly because of their robustness and their ability to handle continuous state spaces, unlike traditional RL methods, which operate on discrete action and state domains. The DQN algorithm can handle continuous state spaces thanks to deep neural networks. However, it applies only to systems with limited, discrete action spaces. Thus, the SAC and DDPG methods are used to overcome the DQN algorithm's limited action space.
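The difference between a discrete and a continuous action space mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say what the agents' action is, so the converter duty-cycle increment, the `DUTY_STEPS` values, and all function names here are hypothetical.

```python
# Hypothetical discrete action set: decrease, hold, or increase the duty cycle.
DUTY_STEPS = (-0.01, 0.0, 0.01)


def discrete_action(q_values):
    """DQN-style selection: pick the action whose estimated Q-value is largest.
    Only the values listed in DUTY_STEPS can ever be chosen."""
    best = max(range(len(q_values)), key=lambda i: q_values[i])
    return DUTY_STEPS[best]


def continuous_action(actor_output, max_step=0.01):
    """DDPG/SAC-style action: any real value inside the allowed range."""
    return max(-max_step, min(max_step, actor_output))


def apply_action(duty, delta):
    """Apply a duty-cycle change and keep the result inside [0, 1]."""
    return min(1.0, max(0.0, duty + delta))


if __name__ == "__main__":
    print(discrete_action([0.1, 0.5, 0.2]))   # one of the three fixed steps
    print(continuous_action(0.0037))          # an arbitrary step inside the range
```

A discrete agent can only choose among the listed steps, so its resolution is fixed when the action set is designed, whereas a continuous actor can output arbitrarily small or large adjustments within the allowed range.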
The DDPG agent employs a continuous deterministic actor, which implements a deterministic policy over continuous action spaces, whereas the SAC agent employs a continuous Gaussian actor, which implements a stochastic policy over continuous action spaces. In this dissertation, MATLAB/Simulink was used to develop the MPPT system, and the RL toolbox in MATLAB was used to train the DRL algorithms. The P&O method was also implemented as a baseline to validate the DRL algorithms' performance. The MPPT system's performance was tested and assessed under constant irradiance, varying irradiance, and PSCs. The simulation results showed that both the P&O and the DRL algorithms can extract over 99% of the power from solar PV at various static and varying irradiance levels. However, the P&O method showed high power oscillations near the MPP because it uses a fixed step size. The results also show that the P&O method is unreliable under PSCs, as it mostly converges to an LMPP instead of the GMPP, leading to large power losses in most cases. It was further observed that at standard test conditions (STC), the SAC, DDPG, and DQN algorithms tracked the MPP faster than the P&O method, with zero power oscillations at steady state. Unlike the P&O method, the DRL algorithms successfully distinguished the LMPPs from the GMPP and converged to the GMPP. Overall, the SAC algorithm achieved consistent, reliable, and more stable tracking of the solar PV power in all the tests performed, with zero power fluctuations at steady state, even under complex PSCs. This is attributed to the maximum-entropy objective in SAC's learning process, which ensures that system model and estimation errors are minimized, especially in complex control problems. In contrast, the DQN and DDPG methods were inconsistent and produced large power oscillations at steady state in certain environmental conditions, especially under PSCs.
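The distinction drawn above between a deterministic actor (DDPG) and a Gaussian actor (SAC) can be sketched as follows. This is a simplified illustration and not the dissertation's MATLAB implementation: a single linear layer stands in for the actor network, the weights and state values are made up, and the tanh squashing to [-1, 1] is a common convention assumed here. The log-probability handling that SAC training also needs is omitted.

```python
import math
import random


def deterministic_actor(state, weights, bias):
    """DDPG-style actor: the same state always maps to the same action."""
    pre_activation = sum(w * s for w, s in zip(weights, state)) + bias
    return math.tanh(pre_activation)


def gaussian_actor(state, weights, bias, log_std, rng):
    """SAC-style actor: the network output is the mean of a Gaussian;
    the action is a sample from it, squashed into [-1, 1] by tanh."""
    mean = sum(w * s for w, s in zip(weights, state)) + bias
    std = math.exp(log_std)
    return math.tanh(rng.gauss(mean, std))


if __name__ == "__main__":
    rng = random.Random(0)                       # seeded for repeatability
    state, weights, bias = [0.5, -0.2], [0.8, 0.3], 0.1
    print("deterministic:", deterministic_actor(state, weights, bias))
    print("stochastic:   ", [gaussian_actor(state, weights, bias, -1.0, rng)
                             for _ in range(3)])
```

Repeated calls to the deterministic actor return identical actions for a given state, while the Gaussian actor returns a different sample each time; as the standard deviation shrinks, its samples approach the deterministic output.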

Keywords

Engineering

Collections

Masters