Assessing GridSim for modeling the global distribution of next-generation astronomy data

Thesis / Dissertation

2025

Publisher

University of Cape Town

Abstract
The transfer of big data between geographic locations incurs various costs that are better managed when computing resources are used efficiently. Measuring the energy used by a computing facility is a mechanism for managing computational efficiency because the energy provided to the facility can be measured and managed. The Square Kilometer Array (SKA) radio telescope will share large volumes of science-ready astronomical data with the project's collaborating partners. This dissertation attempts to address the weaknesses of the GridSim simulation toolkit for the configuration of the SKA data grid. Some of the GridSim features suited to the simulation project are: a) a network extension that claims to model realistic network communication; b) an extensible application programming interface, owing to its implementation in the Java programming language; c) a data grid extension that simulates distributed data storage and tasks for managing the distributed files; d) packet- and flow-level network extensions; and e) prior use in simulations of similar real-world networks, e.g., the Australian GrangeNet Gigabit network. GridSim was built primarily for modeling resources and application scheduling of parallel computing and distributed computation grids, and to assess different job scheduling policies. The SKA wide area collaborative network will send data to its distributed partners, who have their own network and energy-related policies. This work proposes a design to implement, in GridSim, a prototype of ECOFEN (Orgerie, 2015), an end-to-end energy cost model for large-scale networks. The purpose of this work is to demonstrate the utility of the GridSim toolkit in spite of a few known problems with the software. Invalidation exercises were performed to determine the cause of lost events in a network extension simulation, and to assess GridSim's implementation of the Routing Information Protocol across multiple executions of the same simulation and configuration.
This work found that GridSim simulations lose events, and a solution is suggested. In addition, the work found that routing tables do not always contain matching shortest-path information across multiple executions of a simulation. Implementing the proposed design for an ECOFEN model extension in GridSim is left for future work, following one unsuccessful attempt to implement the model in GridSim. This work considered other simulation tools as potential alternatives to the GridSim toolkit, finding SimGrid to be a likely candidate. Modern computational systems are too complex for popular software simulation tools to reproduce dependably, which has supported a return to live network emulation testbeds for the accurate and scalable modeling of real-world systems.