Evaluating transformers as memory systems in reinforcement learning

dc.contributor.advisor: Shock, Jonathan
dc.contributor.advisor: Pretorius, Arnu
dc.contributor.author: Makkink, Thomas
dc.date.accessioned: 2022-02-23T15:40:59Z
dc.date.available: 2022-02-23T15:40:59Z
dc.date.issued: 2021
dc.date.updated: 2022-02-23T15:34:07Z
dc.description.abstract: Memory is an important component of effective learning systems and is crucial in non-Markovian and partially observable environments. In recent years, Long Short-Term Memory (LSTM) networks have been the dominant mechanism for providing memory in reinforcement learning; however, the success of transformers in natural language processing tasks has highlighted a promising and viable alternative. Memory in reinforcement learning is particularly difficult because rewards are often sparse and distributed over many time steps. Early research into transformers as memory mechanisms for reinforcement learning indicated that the canonical model is not suitable, and that additional gated recurrent units and architectural modifications are necessary to stabilize these models. Several further improvements to the canonical model have extended its capabilities, such as increasing the attention span, dynamically selecting the number of per-symbol processing steps, and accelerating convergence. It remains unclear, however, whether combining these improvements could provide meaningful performance gains overall. This dissertation examines several extensions to the canonical Transformer as memory mechanisms in reinforcement learning and empirically studies their combination, which we term the Integrated Transformer. Our findings support prior work suggesting that gated variants of the Transformer architecture may outperform LSTMs as memory networks in reinforcement learning. However, our results indicate that while gated variants of the Transformer architecture may be able to model dependencies over a longer temporal horizon, they do not necessarily outperform LSTMs when tasked with retaining increasing quantities of information.
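The gated recurrent units mentioned in the abstract replace the plain residual connections around each transformer sublayer with a GRU-style gate (as in the GTrXL architecture of Parisotto et al.). A minimal NumPy sketch of one such gate is shown below; the weight names and dimensions are illustrative assumptions, not the dissertation's actual implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_gate(x, y, params, b_g=2.0):
    """GRU-style gating layer: blend the sublayer output y back into the
    residual stream x, instead of the canonical residual update x + y.
    b_g biases the update gate so the layer is near-identity at init."""
    Wr, Ur, Wz, Uz, Wg, Ug = params
    r = sigmoid(y @ Wr + x @ Ur)        # reset gate
    z = sigmoid(y @ Wz + x @ Uz - b_g)  # update gate, biased toward 0
    h = np.tanh(y @ Wg + (r * x) @ Ug)  # candidate state
    return (1.0 - z) * x + z * h        # gated residual

d = 8                                   # hypothetical model width
rng = np.random.default_rng(0)
params = [rng.normal(scale=0.1, size=(d, d)) for _ in range(6)]
x = rng.normal(size=(4, d))             # residual stream (4 time steps)
y = rng.normal(size=(4, d))             # self-attention sublayer output
out = gru_gate(x, y, params)
print(out.shape)  # → (4, 8)
```

With a positive gate bias `b_g`, the update gate starts near zero, so the block initially behaves like an identity map; this near-identity initialization is what the GTrXL line of work credits with stabilizing transformer training under sparse reinforcement-learning rewards.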
dc.identifier.apacitation: Makkink, T. (2021). <i>Evaluating transformers as memory systems in reinforcement learning</i>. Faculty of Science, Department of Mathematics and Applied Mathematics. Retrieved from http://hdl.handle.net/11427/35840
dc.identifier.chicagocitation: Makkink, Thomas. <i>"Evaluating transformers as memory systems in reinforcement learning."</i> Faculty of Science, Department of Mathematics and Applied Mathematics, 2021. http://hdl.handle.net/11427/35840
dc.identifier.citation: Makkink, T. 2021. Evaluating transformers as memory systems in reinforcement learning. Faculty of Science, Department of Mathematics and Applied Mathematics. http://hdl.handle.net/11427/35840
dc.identifier.ris:
TY  - Master Thesis
AU  - Makkink, Thomas
PY  - 2021
DA  - 2021
DB  - OpenUCT
DP  - University of Cape Town
KW  - Mathematics and Applied Mathematics
LK  - https://open.uct.ac.za
T1  - Evaluating transformers as memory systems in reinforcement learning
TI  - Evaluating transformers as memory systems in reinforcement learning
UR  - http://hdl.handle.net/11427/35840
ER  -
dc.identifier.uri: http://hdl.handle.net/11427/35840
dc.identifier.vancouvercitation: Makkink T. Evaluating transformers as memory systems in reinforcement learning. Faculty of Science, Department of Mathematics and Applied Mathematics; 2021 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/35840
dc.language.rfc3066: eng
dc.publisher.department: Department of Mathematics and Applied Mathematics
dc.publisher.faculty: Faculty of Science
dc.subject: Mathematics and Applied Mathematics
dc.title: Evaluating transformers as memory systems in reinforcement learning
dc.type: Master Thesis
dc.type.qualificationlevel: Masters
dc.type.qualificationlevel: MSc
Files
Original bundle
Name: thesis_sci_2021_makkink thomas.pdf
Size: 4.92 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 0 B
Format: Item-specific license agreed to upon submission
Collections