Sriya Rajyam
Age 17 | Toronto, ON
Outstanding Project Mention at YSC 2020 Online STEM Fair,
YSC Toronto Regional Award Winner, Agincourt CI Top Project
Edited by: Suhail Kandanur
Machine learning (ML) is a subset of artificial intelligence that enables systems to improve from experience without explicit instructions. ML algorithms use statistics to identify patterns in data, and these patterns in turn help make predictions or decisions about unknown data. Gesture recognition is the practice of interpreting human gestures with algorithms. This project uses ML to track human activity through gesture recognition.
By identifying gestures such as walking, climbing stairs, and running, the project allows for the monitoring of daily physical activity, encouraging a healthy lifestyle. It has the potential to mimic health apps on a smartwatch or smartphone in a user-friendly, accessible, and cost-effective way. The future goal of the project is to provide an efficient way to track health for those who cannot afford these devices or whose needs are not met by them.
The prototype successfully identifies six gestures from accelerometer and gyroscope data. It was built with Python’s NumPy and SciPy data science libraries, and involved much research and many trials.
MATERIALS
Computer with Python Installed
Human Activity Recognition Using Smartphones Data Set (Anguita et al., 2013)
PROCEDURE
The first requirement was a data set that translates gestures into a numerical format. How exactly would a phone or smartwatch know if someone is walking, climbing stairs, or lying down? The answer lies in the built-in gyroscope and accelerometer. These measure the rotation and acceleration of the device, which produce distinct data for each gesture. Because the data would be difficult to collect first-hand, a suitable data set was found on the web (Anguita et al., 2013). It came from an experiment carried out with 30 volunteers, each performing six activities (walking, walking upstairs, walking downstairs, sitting, standing, and laying) while wearing a smartphone. The embedded accelerometer and gyroscope captured 3-axial linear acceleration and 3-axial angular velocity. The total data set of 2000 points was randomly divided into two sets, with the training set covering 70% of the data and the test set 30%. Keeping the test set separate allows the program’s predictions to be compared against known answers.
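The random 70/30 split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `split_train_test`, the random seed, and the placeholder arrays (six values per point, classes encoded 0 to 5) are assumptions made for the example.

```python
import numpy as np

def split_train_test(features, labels, train_fraction=0.7, seed=0):
    """Randomly split samples into a training set and a test set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))            # shuffled row indices
    n_train = int(round(train_fraction * len(features)))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])

# Placeholder data standing in for the real sensor readings (an assumption).
rng = np.random.default_rng(42)
features = rng.normal(size=(2000, 6))    # e.g. 3 accelerometer + 3 gyroscope values
labels = rng.integers(0, 6, size=2000)   # six activity classes encoded 0..5

X_train, y_train, X_test, y_test = split_train_test(features, labels)
print(X_train.shape, X_test.shape)       # (1400, 6) (600, 6)
```

With 2000 points, this yields 1400 training points and 600 test points.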
The data then needed to be put to use. The prototype was built using the Python programming language, along with the NumPy and SciPy libraries. Two major types of machine learning problems are regression and classification. Regression is used to predict a numerical quantity. Classification is used to predict a category, or class. Predicting the type of activity from six classes is a classification problem, so an algorithm was required to classify the gestures. The algorithm needed to take a point from the test set, compare it to all the points in the training set, and locate the “best fit”. The class of this best fit in the training set becomes the classification. The algorithm chosen was K-Nearest Neighbours (KNN). KNN hinges on the idea of finding the points with the smallest numerical distance between them (as identical gestures produce similar data). KNN measures this similarity by calculating the straight-line distance between points, known as Euclidean distance (Harrison, 2019): the square root of the sum of the squared differences between the points’ coordinates.
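As a small illustration of Euclidean distance, the sketch below computes it with NumPy for two points of any dimension. The function name `euclidean_distance` is hypothetical and chosen only for this example.

```python
import numpy as np

def euclidean_distance(a, b):
    """Square root of the sum of squared coordinate differences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

# The classic 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

The same function works unchanged for points with many coordinates, such as a row of sensor readings.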
First, the hyper-parameter K, which refers to the number of “best fits” the algorithm finds, had to be set (Navlani, n.d.). As only the closest point is desired, K was set to 1. One by one, the program measures the distance between a point in the test set and each point in the training set. It sorts all the measured distances from smallest to largest and selects the first one (as it denotes the smallest distance). The corresponding class is picked and the classification is complete.
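The K = 1 procedure above might look roughly like the sketch below. It is an illustration under assumptions rather than the project's code: the function name `nearest_neighbour_predict` and the tiny two-dimensional data set are made up for the example.

```python
import numpy as np

def nearest_neighbour_predict(X_train, y_train, X_test):
    """Give each test point the class of its closest training point (K = 1)."""
    predictions = []
    for point in X_test:
        # Euclidean distance from this test point to every training point.
        distances = np.sqrt(np.sum((X_train - point) ** 2, axis=1))
        order = np.argsort(distances)           # indices, smallest distance first
        predictions.append(y_train[order[0]])   # class of the closest point
    return np.array(predictions)

# Tiny made-up data set: two clusters with two classes.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array(["sitting", "sitting", "walking", "walking"])
X_test = np.array([[0.2, 0.1], [4.8, 5.1]])

print(nearest_neighbour_predict(X_train, y_train, X_test))  # ['sitting' 'walking']
```

Sorting every distance mirrors the description in the text; for K = 1 alone, `np.argmin` would find the same point without a full sort.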
Figure 1
Figure 2
To improve accuracy, the Fast Fourier Transform (FFT) was incorporated. The FFT is an algorithm that breaks down a function into its constituent sinusoidal functions (VanderPlas, 2013). Research showed that the FFT is a common feature transform used in ML to increase accuracy (Mironovova & Bila, 2015). It was therefore applied, raising accuracy by about 11 percentage points, from about 72% to about 83%.
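One way an FFT feature transform can be applied is sketched below, using NumPy's FFT routines on a synthetic signal. The function name `fft_features`, the 50 Hz sampling rate, the 100-sample window, and the 2 Hz sine wave are all assumptions for illustration; the source does not state which FFT call or window size the prototype used.

```python
import numpy as np

def fft_features(window):
    """Magnitude of each frequency component in a 1-D signal window."""
    spectrum = np.fft.rfft(window)   # FFT for real-valued input
    return np.abs(spectrum)

# Synthetic stand-in for one sensor axis: a 2 Hz sine sampled at 50 Hz.
sampling_rate = 50                    # samples per second (assumed)
n_samples = 100                       # 2 seconds of data (assumed)
t = np.arange(n_samples) / sampling_rate
signal = np.sin(2 * np.pi * 2.0 * t)

features = fft_features(signal)
freqs = np.fft.rfftfreq(n_samples, d=1 / sampling_rate)
print(freqs[np.argmax(features)])     # dominant frequency in Hz
```

The magnitudes in `features` can then be passed to KNN in place of the raw samples, so that gestures are compared by their frequency content rather than by the exact timing of each movement.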
RESULTS
The final prototype yielded an accuracy of approximately 83%, the highest accuracy obtained.
The accuracy denotes the success of the program after running all 600 test points. It was calculated by comparing each predicted classification with the actual corresponding class. Hence, the program predicted 83% of the gestures correctly. This number is expected to be approximately the same for any similar set of data.
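The accuracy calculation described above can be sketched as the fraction of predictions that match the true classes. The function name `accuracy` and the five example labels are made up for illustration.

```python
import numpy as np

def accuracy(predicted, actual):
    """Fraction of predictions that equal the actual class."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    return float(np.mean(predicted == actual))

# Made-up example: four of the five predictions are correct.
predicted = ["walking", "sitting", "laying", "standing", "walking"]
actual = ["walking", "sitting", "laying", "sitting", "walking"]
print(accuracy(predicted, actual))  # 0.8
```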
Many different approaches were tried before arriving at this number, including different algorithms and different variations of each one. Initially, the FFT was not used and the program relied only on KNN. The highest accuracy with KNN alone was about 72%. The hyper-parameter K was tested at 1, 3, and 5. For K greater than 1, the most frequently occurring gesture among the K nearest points was adopted as the classification. As expected, accuracy declined as K was increased, since each subsequent point is further and further away from the first. At K = 3, accuracy was 64.6% and at K = 5 it was 61.2%. It became clear that K = 1, at about 72%, was the most rational choice.
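The majority vote used for K = 3 and K = 5 can be sketched as below. The function name `knn_predict` and the small data set are hypothetical. The tie-breaking behaviour, where a tie goes to the class encountered first among the nearest points, is an assumption of this sketch and not something the source specifies.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, point, k=3):
    """Classify a point by majority vote among its k nearest training points."""
    distances = np.sqrt(np.sum((X_train - point) ** 2, axis=1))
    nearest = np.argsort(distances)[:k]          # indices of the k closest points
    votes = Counter(y_train[nearest].tolist())   # count each class among them
    return votes.most_common(1)[0][0]            # most frequent class

# Made-up data: the single closest point disagrees with the next two.
X_train = np.array([[0.0, 0.0], [1.0, 1.0], [1.2, 0.9], [5.0, 5.0]])
y_train = np.array(["sitting", "walking", "walking", "standing"])
point = np.array([0.4, 0.4])

print(knn_predict(X_train, y_train, point, k=1))  # sitting
print(knn_predict(X_train, y_train, point, k=3))  # walking
```

The example shows how a larger K lets more distant points outvote the closest one, which is the effect the text gives as the reason for the lower accuracy at K = 3 and K = 5.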
DISCUSSION
The results are notable, with room for improvement in the future. The program may be incorporated into health-related applications. Allowing users to track physical behaviour promotes a healthy lifestyle. The program is not limited to individual use and may also track others’ well-being, opening gateways to family members and caregivers.
FUTURE STEPS
The program can be modified to alert authorities when certain events are detected, such as an abrupt fall or a prolonged lack of activity.
For mass data storage, a formal GUI and a database using MySQL can be created. Users can track activity over long periods, observing trends and developments.
In the future, the gesture library may be expanded with more gestures in order to interpret sign language. The software would provide real-time translations by comparing uploaded gestures with pre-installed reference images.
CONCLUSION
The program was deemed a successful prototype and an effective way to track human health and activity. The project, built on fundamentals, has several possibilities for the future and the potential to be expanded into a full-fledged product.
REFERENCES
Grus, J. (2015). Data science from scratch (1st ed.). Sebastopol, CA: O’Reilly Media.
Anguita, D., Ghio, A., Oneto, L., Parra, X., & Reyes-Ortiz, J. L. (2013). A Public Domain Dataset for Human Activity Recognition Using Smartphones. 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2013. Bruges, Belgium, 24-26 April 2013.
Hao, K. (2020, April 02). What is machine learning? Retrieved August 03, 2020, from https://www.technologyreview.com/2018/11/17/103781/what-is-machine-learning-we-drew-you-another-flowchart/
Harrison, O. (2019, July 14). Machine Learning Basics with the K-Nearest Neighbors Algorithm. Retrieved from https://towardsdatascience.com/machine-learning-basics-with-the-k-nearest-neighbors-algorithm-6a6e71d01761
Li, H. (n.d.). Machine Learning: What it is and why it matters. Retrieved from https://www.sas.com/en_ca/insights/analytics/machine-learning.html
Mironovova, M., & Bila, J. (2015). Fast Fourier transform for feature extraction and neural network for classification of electrocardiogram signals. 1-6.
Navlani, A. (n.d.). KNN Classification using Scikit-learn. Retrieved from https://www.datacamp.com/community/tutorials/k-nearest-neighbor-classification-scikit-learn
VanderPlas, J. (2013, August 28). Understanding the FFT Algorithm. Retrieved from https://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/
ABOUT THE AUTHOR
Sriya Rajyam
Sriya Rajyam is a 17-year-old Grade 11 student from Toronto, Ontario. She has been fascinated by STEM from a young age and has developed a deep interest in the field. Her ambition is to improve and advance human health from a technological perspective. To do so, she understands she has a lot to learn and strives to keep learning and innovating.