Browsing by Author "Dubb, Roland"

    Open Access
    Addressing deep reinforcement learning: empirical algorithm performance evaluations
    (2025) Dubb, Roland; Shock, Jonathan
    The rapid pace at which deep reinforcement learning (RL) research papers are produced has led some recent publications to critique how RL algorithm performance is evaluated. Building on this scrutiny, our work attempts to identify the precise aspects of empirical deep RL algorithm performance evaluation that need improvement. This dissertation begins by briefly introducing the RL problem. Thereafter, we review the literature and discuss recent scrutiny of four aspects of deep RL algorithm performance evaluation: (i) the choice of RL environment, (ii) the measurement of uncertainty, (iii) the collection of data, and (iv) the aggregation of that data. From this discussion, we identify two particular problems with RL evaluations: the non-linear scaling of algorithm performance scores with the level of skill achieved by the algorithm, and the (potentially) biased weighting of scores across RL environments in the data aggregation process. As multi-agent RL (MARL) is a recently popular research paradigm whose evaluation procedures have not yet been carefully scrutinised in the literature, we analyse a dataset by Gorsane et al. [1] that documents the evaluation methodologies of many recent deep cooperative MARL publications. This analysis, which reveals several flaws in MARL evaluation, together with the RL evaluation issues reviewed from the literature, motivates an attempt to construct an improved guideline for the empirical performance evaluation of RL algorithms. Multi-criteria decision analysis (MCDA) is discussed as a potential framework offering a data aggregation procedure that resolves the two aforementioned problems. Combining MCDA with our insights from the literature, we propose an improved guideline for the empirical performance evaluation of deep RL algorithms.
    This guideline is contrasted with the one proposed by Gorsane et al. [1] before a proof-of-concept test is conducted. Overall, we aim to move toward better evaluation of RL algorithms and to contribute toward an increased sensitivity to a lack of scientific rigour [2, 3] in the field of machine learning.