Browsing by Author "Gebbie, Timothy"
- Item (Open Access): A reproducible approach to equity backtesting (2019). Arbi, Riaz; Gebbie, Timothy. Research findings relating to anomalous equity returns should ideally be repeatable by others. Usually, only a small subset of the decisions made in a particular backtest workflow is released, which limits reproducibility. Data collection and cleaning, parameter setting, algorithm development and report generation are often done with manual point-and-click tools which do not log user actions. This problem is compounded by the fact that the trial-and-error approach of researchers increases the probability of backtest overfitting. Borrowing practices from the reproducible research community, we introduce a set of scripts that completely automate a portfolio-based, event-driven backtest. Based on free, open source tools, these scripts can completely capture the decisions made by a researcher, resulting in a distributable code package that allows easy reproduction of results.
- Item (Open Access): Calibrating a Latent Order Book Model to Market Data (2022). Gant, Michael; Gebbie, Timothy. We investigate the formulation of the Latent Order Book (LOB) as a reaction-diffusion Partial Differential Equation (PDE) and its subsequent numerical solution through an explicit method based on discrete stochastic processes. The numerical solution is calibrated using likelihood-free methods, Approximate Bayesian Computation (ABC) and an iterative extension, Population Monte-Carlo ABC (PMC-ABC), as well as a black-box approach using the Nelder-Mead algorithm. We show that in the diffusion limit, the master equation becomes the LOB reaction-diffusion PDE and certain free parameters are recoverable with the iterative calibration techniques.
- Item (Open Access): Calibrating high frequency trading data to agent based models using approximate Bayesian computation (2021). Goosen, Kelly; Gebbie, Timothy. We consider Sequential Monte Carlo Approximate Bayesian Computation (SMC ABC) as a method of calibration for the use of agent based models in market micro-structure. To date, there are no successful calibrations of agent based models to high frequency trading data. Here we test whether a more sophisticated calibration technique, SMC ABC, will achieve this feat on one of the leading agent based models in high frequency trading literature (the Preis-Golke-Paul-Schneider Agent Based Model (Preis et al., 2006)). We find that, although SMC ABC's naive approach of updating distributions can successfully calibrate simple toy models, such as autoregressive moving average models, it fails to calibrate this agent based model for high frequency trading. This may be for two key reasons: either the parameters of the model are not uniquely identifiable given the model output, or the SMC ABC rejection mechanism results in information loss, rendering parameters unidentifiable given insufficient summary statistics.
- Item (Open Access): Correlation emergence in two coupled limit order books in the fluid limit (2024). Bauer, Dominic; Gebbie, Timothy. We use random walks to simulate the fluid limit of two coupled diffusive limit order books to model correlation emergence. The model implements the arrival, cancellation and diffusion of orders coupled by a pairs trader profiting from the mean-reversion between the two order books. In the fluid limit, for a lit order book with vanishing boundary conditions and order volume conservation, we are able to demonstrate the recovery of an Epps effect. We show how various stylised facts depend on the model parameters and the numerical scheme and discuss various strengths and weaknesses of the approach. We demonstrate how the Epps effect depends on different choices of time and price discretisation and show how an Epps effect can emerge without recourse to market microstructure effects.
- Item (Open Access): High-frequency correlation dynamics: Is the Epps effect a bias? (2021). Chang, Patrick; Gebbie, Timothy; Pienaar, Etienne. We tackle the question of whether Trade and Quote data from high-frequency finance are representative of discrete connected events, or whether these measurements can still be faithfully represented as random samples of some underlying Brownian diffusion in the context of modelling correlation dynamics. In particular, is the implicit notion of instantaneous correlation dynamics that are independent of the time-scale a reasonable assumption? To this end, we apply kernel averaging non-uniform fast Fourier transforms in the context of the Malliavin-Mancino integrated and instantaneous volatility estimators to speed up the estimators. We demonstrate the implicit time-scale investigated by the estimator by comparing it to the theoretical Epps effect arising from asynchrony. We compare the Malliavin-Mancino and Cuchiero-Teichmann Fourier instantaneous estimators and demonstrate the relationship between the instantaneous Epps effect and the cutting frequencies in the Fourier estimators. We find that using the previous tick interpolation in the Cuchiero-Teichmann estimator results in unstable estimates when dealing with asynchrony, while the ability to bypass the time domain with the Malliavin-Mancino estimator allows it to produce stable estimates and is therefore better suited for ultra high-frequency finance. We derive the Epps effect arising from asynchrony and provide a refined approach to correct the effect. We compare methods to correct for the Epps effect arising from asynchrony when the underlying process is a Brownian diffusion, and when the underlying process is from discrete connected events (proxied using a D-type Hawkes process). We design three experiments using the Epps effect to discriminate the underlying processes. These experiments demonstrate that using a Hawkes representation recovers the empiricism reported in the literature under simulation conditions that cannot be achieved when using a Brownian representation. The experiments are applied to Trade and Quote data from the Johannesburg Stock Exchange and the evidence suggests that the empirical measurements are from a system of discrete connected events where correlations are an emergent property of the time-scale rather than an instantaneous quantity that exists at all time-scales.
- Item (Open Access): Introduction to fast Super-Paramagnetic Clustering (2019). Yelibi, Lionel; Gebbie, Timothy. We map stock market interactions to spin models to recover their hierarchical structure using a simulated annealing based Super-Paramagnetic Clustering (SPC) algorithm. This is directly compared to a modified implementation of a maximum likelihood approach to fast-Super-Paramagnetic Clustering (f-SPC). The methods are first applied to standard toy test-case problems, and then to a dataset of 447 stocks traded on the New York Stock Exchange (NYSE) over 1249 days. The signal to noise ratio of stock market correlation matrices is briefly considered. Our results recover clusters approximately representative of standard economic sectors, as well as mixed clusters whose dynamics shine light on the adaptive nature of financial markets and raise concerns relating to the effectiveness of industry based static financial market classification in the world of real-time data-analytics. A key result is that the standard maximum likelihood methods are confirmed to converge to solutions within a Super-Paramagnetic (SP) phase. We use insights arising from this to discuss the implications of using a Maximum Entropy Principle (MEP) as opposed to the Maximum Likelihood Principle (MLP) as an optimization device for this class of problems.
- Item (Open Access): Market Simulations with a Matching Engine (2022). Jericevich, Ivan; Gebbie, Timothy. We demonstrate the CoinTossX Java web-application as a low-latency, high-throughput, open-source matching engine/artificial exchange/simulation platform and deploy it to a cloud environment for asynchronous order matching and submission in a controlled framework via two separate simulation techniques: Hawkes processes and agent-based modelling. A 10-variate Hawkes model stress tests the software whilst measuring the extent to which a matching engine can cloud the modelling of underlying order submission and management processes in a continuous-double auction. Estimation and calibration to the subsequent trade-and-quote data results in a model specification statistically different from the original, providing insight into the limits of the software, inference conducted on HFT models and future market microstructure modelling considerations. An asynchronous ABM with interacting low-frequency liquidity takers and high-frequency liquidity-providers is subsequently formulated with the aim of producing realistic trading scenarios/price action without relying on restrictive modelling assumptions or additional sources of noise. The resulting simulations are shown to replicate many stylized facts along with non-trivial price-impact curves and we use this to argue for future simple, reactive/actor-based financial model specifications that mimic real-world work-flow and system implementation.
- Item (Open Access): Market state discovery (2022). Singo, Unarine; Gebbie, Timothy. We explore the concept of financial market state discovery by assessing the robustness of two unsupervised machine learning algorithms: Inverse Covariance Clustering (ICC) and Agglomerative Super Paramagnetic Clustering (ASPC). The assessment is carried out by: simulating market datasets varying in complexity; implementing ICC and ASPC to estimate the underlying states (using only simulated log-returns as inputs); and measuring the algorithms' ability to recover the underlying states, using the Adjusted Rand Index (ARI) as a performance metric. Experiments revealed that ASPC is a more robust and better performing algorithm than ICC. ICC is able to produce competitive results in 2-state markets; however, ICC's primary disadvantage is its inability to maintain strong performance in 3, 4 and 5-state markets. For example, ASPC produced ARI numbers that were up to 800% superior to ICC in 5-state markets. Furthermore, ASPC does not rely on the art of selecting good hyper-parameters, such as the number of states, a priori. ICC's utility as a market state discovery algorithm is limited.
- Item (Open Access): Online Non-linear Prediction of Financial Time Series Patterns (2020). da Costa, Joel; Gebbie, Timothy. We consider a mechanistic non-linear machine learning approach to learning signals in financial time series data. A modularised and decoupled algorithm framework is established and is proven on daily sampled closing time-series data for JSE equity markets. The input patterns are based on input data vectors of data windows preprocessed into a sequence of daily, weekly and monthly or quarterly sampled feature measurement changes (log feature fluctuations). The data processing is split into a batch processed step where features are learnt using a Stacked AutoEncoder (SAE) via unsupervised learning, and then both batch and online supervised learning are carried out on Feedforward Neural Networks (FNNs) using these features. The FNN output is a point prediction of measured time-series feature fluctuations (log differenced data) in the future (ex-post). Weight initializations for these networks are implemented with restricted Boltzmann machine pretraining, and variance based initializations. The validity of the FNN backtest results is shown under a rigorous assessment of backtest overfitting using both Combinatorially Symmetrical Cross Validation and Probabilistic and Deflated Sharpe Ratios. Results are further used to develop a view on the phenomenology of financial markets and the value of complex historical data under unstable dynamics.
- Item (Open Access): Pricing Offshore Services: Evidence from the Paradise Papers (2021). Gawronsky, Marcus; Gebbie, Timothy; Rajaratnam, Kanshukan. The Paradise Papers represent one of the largest public data leaks comprising 13.4 million confidential electronic documents. A dominant theory presented by Neal (2014) and Griffith, Miller and O'Connell (2014) concerns the use of these offshore services in the relocation of intellectual property for the purposes of compliance, privacy and tax avoidance. Building on the work of Fernandez (2011), Billio et al. (2016) and Kou, Peng and Zhong (2018) in Spatial Arbitrage Pricing Theory (s-APT) and work by Kelly, Lustig and Van Nieuwerburgh (2013), Ahern (2013), Herskovic (2018) and Procházková (2020) on the impacts of network centrality on firm pricing, we use market response, discussed in O'Donovan, Wagner and Zeume (2019), to characterise the role of offshore services in securities pricing and the transmission of price risk. Following the spatial modelling selection procedure proposed in Mur and Angulo (2009), we identify Profit Margin and Price-to-Research as firm-characteristics describing market response over this event window. Using a social network lag explanatory model, we provide evidence for social exogenous effects, as described in Manski (1993), which may characterise the licensing or exchange of intellectual property between connected firms found in the Paradise Papers. From these findings, we hope to provide insight to policymakers on the role and impact of offshore services on securities pricing.
- Item (Open Access): Representation learning for regime detection in financial markets (2025). Orton, Alexa; Gebbie, Timothy. We investigate financial market regime detection from the perspective of deep representation learning of the causal (reflexive) information geometry underpinning complex (multi-scale) dynamical traded asset systems using an emergent hierarchical correlation structure to characterise evolving macroeconomic market phases. Specifically, we assess the robustness of three toy models: SPD Matrix Network (SPDNet), SPD Matrix Network with Riemannian Batch Normalisation (SPDNetBN) and U-shaped SPD Matrix Network (U-SPDNet), whose architectures respect the underlying Riemannian manifold of input block hierarchical Symmetric Positive Definite (SPD) correlation matrices by employing Log-Euclidean Metrics (LEMs). Market phase detection for each model is carried out using three data configurations: i.) randomised Johannesburg Stock Exchange (JSE) Top 60 data, ii.) synthetically-generated block hierarchical SPD matrices, and iii.) chronology-preserving block-resampled JSE Top 60 data. We show that using a singular performance metric is misleading in our financial market use cases. We confirm that U-SPDNet performs improved latent feature extraction with better classification performance in stressed and rally market phases, despite achieving lower Out-of-Sample (OOS) backtest scenario accuracy than that of the benchmark SPDNet. The SPDNet-based models fail to capture latent reflexive spatio-temporal block hierarchical correlation dynamics and deliver corner solutions across all input data sets. The U-SPDNet is promising in terms of its utility in regime dependent portfolio optimisation strategy generation as a model better-suited to capturing latent block hierarchical correlation structures arising from lead-lag causal feedback information loops that often drive the evolution of market regimes.
- Item (Open Access): Systematic asset allocation using flexible views for South African markets (2021). Sebastian, Ponni; Gebbie, Timothy. We implement a systematic asset allocation model using the Historical Simulation with Flexible Probabilities (HS-FP) framework developed by Meucci [142, 144, 145]. The HS-FP framework is a flexible non-parametric estimation approach that considers future asset class behavior to be conditional on time and market environments, and derives a forward-looking distribution that is consistent with this view while remaining as close as possible to the prior distribution. The framework derives the forward-looking distribution by applying unequal time and state conditioned probabilities to historical observations of asset class returns. This is achieved using relative entropy to find estimates with the least distortion to the prior distribution. Here, we use the HS-FP framework on South African financial market data for asset allocation purposes, estimating expected returns, correlations and volatilities that are better represented through the measured market cycle. We demonstrate a range of state variables that can be useful towards understanding market environments. Concretely, we compare the out-of-sample performance for a specific configuration of the HS-FP model relative to classic Mean Variance Optimization (MVO) and Equally Weighted (EW) benchmark models. The framework displays low probability of backtest overfitting and the out-of-sample net returns and Sharpe ratio point estimates of the HS-FP model outperform the benchmark models. However, the results are inconsistent when training windows are varied, the Sharpe ratio is seen to be inflated, and the method does not demonstrate statistically significant outperformance on a gross and net basis.
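
Several of the calibration abstracts listed above rely on Approximate Bayesian Computation (ABC). As a rough illustration only, the sketch below shows plain rejection ABC on a toy AR(1) model. It is not the SMC ABC or PMC-ABC scheme used in those theses, and the function names, the flat prior on `phi`, the lag-1 autocorrelation summary statistic and the tolerance are all assumptions made for this example.

```python
import random


def simulate_ar1(phi, n, rng):
    """Simulate n steps of x_t = phi * x_{t-1} + eps_t, with eps_t ~ N(0, 1)."""
    x, path = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def lag1_autocorr(path):
    """Summary statistic (an assumed choice): sample lag-1 autocorrelation."""
    mean = sum(path) / len(path)
    num = sum((a - mean) * (b - mean) for a, b in zip(path, path[1:]))
    den = sum((a - mean) ** 2 for a in path)
    return num / den


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within tolerance of the observed one."""
    target = lag1_autocorr(observed)
    accepted = []
    for _ in range(n_draws):
        phi = rng.uniform(-0.9, 0.9)  # assumed flat prior over the stationary range
        simulated = simulate_ar1(phi, len(observed), rng)
        if abs(lag1_autocorr(simulated) - target) < tolerance:
            accepted.append(phi)
    return accepted


rng = random.Random(0)                      # fixed seed so the run is repeatable
observed = simulate_ar1(0.6, 400, rng)      # synthetic "data" with a known phi
posterior = abc_rejection(observed, 1000, 0.05, rng)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), round(posterior_mean, 2))
```

The accepted draws approximate the posterior of `phi` given the chosen summary statistic. If that statistic does not pin down the parameters, the accepted set stays wide, which is the identifiability concern raised in the agent-based model abstract.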