Dynamic Hand Gesture Recognition Using Ultrasonic Sonar Sensors and Deep Learning

Master Thesis


Permanent link to this Item
Journal Title
Link to Journal
Journal ISSN
Volume Title
The space of hand gesture recognition using radar and sonar is dominated mostly by radar applications. In addition, the machine learning algorithms used by these systems are typically based on convolutional neural networks with some applications exploring the use of long short term memory networks. The goal of this study was to build and design a Sonar system that can classify hand gestures using a machine learning approach. Secondly, the study aims to compare convolutional neural networks to long short term memory networks as a means to classify hand gestures using sonar. A Doppler Sonar system was designed and built to be able to sense hand gestures. The Sonar system is a multi-static system containing one transmitter and three receivers. The sonar system can measure the Doppler frequency shifts caused by dynamic hand gestures. Since the system uses three receivers, three different Doppler frequency channels are measured. Three additional differential frequency channels are formed by computing the differences between the frequency of each of the receivers. These six channels are used as inputs to the deep learning models. Two different deep learning algorithms were used to classify the hand gestures; a Doppler biLSTM network [1] and a CNN [2]. Six basic hand gestures, two in each x- y- and z-axis, and two rotational hand gestures are recorded using both left and right hand at different distances. The gestures were also recorded using both left and right hands. Ten-Fold cross-validation is used to evaluate the networks' performance and classification accuracy. The LSTM was able to classify the six basic gestures with an accuracy of at least 96% but with the addition of the two rotational gestures, the accuracy drops to 47%. This result is acceptable since the basic gestures are more commonly used gestures than rotational gestures. The CNN was able to classify all the gestures with an accuracy of at least 98%. Additionally, The LSTM network is also able to classify separate left and right-hand gestures with an accuracy of 80% and The CNN with an accuracy of 83%. The study shows that CNN is the most widely used algorithm for hand gesture recognition as it can consistently classify gestures with various degrees of complexity. The study also shows that the LSTM network can also classify hand gestures with a high degree of accuracy. More experimentation, however, needs to be done in order to increase the complexity of recognisable gestures.