Investigating user experience and bias mitigation of the multi-modal retrieval of historical data

Master Thesis


Decolonisation has prompted discussion of technology's responsibility to present multiple perspectives to users. This is particularly relevant to precolonial African heritage artefact data, where the data carries the bias of the artefacts' curators and the social responsibility of the systems that serve it is a primary concern. Historians have argued that common information retrieval algorithms may further bias the results presented to users. While research on mitigating bias in information retrieval is steered towards artificial intelligence and automation, an often-neglected approach is that of user-control. User-control has proven beneficial in other research areas and is strongly aligned with the core principles of decolonisation. Thus, this study investigated the effects on user experience, bias mitigation, and retrieval effectiveness of adding user-control and algorithmic variation to a multimodal information retrieval system containing precolonial African heritage data. This was done by conducting two experiments: 1) an experiment to provide a baseline offline evaluation of various algorithms for text and image retrieval and 2) an experiment to investigate users' experience with a retrieval system that allowed them to compare algorithms. In the first experiment, the differences in retrieval effectiveness between colour-based pre-processing algorithms, shape-based pre-processing algorithms, and pre-processing algorithms based on a combination of colour- and shape-detection were explored. The differences in retrieval effectiveness between stemming, stopword removal, and synonym query expansion were also evaluated for text retrieval. In the second experiment, the manner in which users experience bias was explored in the context of common information retrieval algorithms for both the textual and image data that are available in typical historical archives.
Users were presented with the results generated by multiple algorithmic variations, in a variety of result formats and using a variety of search methods, affording them the opportunity to decide which they deem to provide a more relevant set of results. The results of the study show that algorithmic variation can lead to significantly improved retrieval performance with respect to image-based retrieval. The results also show that users potentially prefer shape-based image algorithms to colour-based image algorithms, and that shape-based image algorithms can lead to significantly improved retrieval of historical data. Finally, the results show that users have justifiable preferences for multimodal query and result formats to improve user experience, and that users believe they can control bias using algorithmic variation.