Open Access Article

Deep Learning for Human Action Recognition – Survey

K. Kiruba, D. Shiloah Elizabeth, C. Sunil Retmin Raj

Section: Survey Paper, Product Type: Journal Paper
Volume-6, Issue-10, Page no. 323-328, Oct-2018

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v6i10.323328

Online published on Oct 31, 2018

Copyright © K.Kiruba, D. Shiloah Elizabeth, C Sunil Retmin Raj. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

IEEE Style Citation: K.Kiruba, D. Shiloah Elizabeth, C Sunil Retmin Raj, “Deep Learning for Human Action Recognition – Survey,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.10, pp.323-328, 2018.

MLA Style Citation: K.Kiruba, D. Shiloah Elizabeth, C Sunil Retmin Raj "Deep Learning for Human Action Recognition – Survey." International Journal of Computer Sciences and Engineering 6.10 (2018): 323-328.

APA Style Citation: K.Kiruba, D. Shiloah Elizabeth, C Sunil Retmin Raj, (2018). Deep Learning for Human Action Recognition – Survey. International Journal of Computer Sciences and Engineering, 6(10), 323-328.

BibTex Style Citation:
@article{Elizabeth_2018,
author = {K.Kiruba and D. Shiloah Elizabeth and C Sunil Retmin Raj},
title = {Deep Learning for Human Action Recognition – Survey},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {10 2018},
volume = {6},
number = {10},
month = {10},
year = {2018},
issn = {2347-2693},
pages = {323-328},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=3025},
doi = {10.26438/ijcse/v6i10.323328},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i10.323328
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=3025
TI  - Deep Learning for Human Action Recognition – Survey
T2  - International Journal of Computer Sciences and Engineering
AU  - K.Kiruba
AU  - D. Shiloah Elizabeth
AU  - C Sunil Retmin Raj
PY  - 2018
DA  - 2018/10/31
PB  - IJCSE, Indore, INDIA
SP  - 323
EP  - 328
IS  - 10
VL  - 6
SN  - 2347-2693
ER  - 

Abstract

Human action recognition (HAR) in visual data has become one of the most attractive research areas in computer vision, a field that also includes object detection, recognition, retrieval, domain adaptation, transfer learning, and segmentation. Over the last decade, HAR has evolved from heuristic hand-crafted features to systematic feature learning, namely deep feature learning, which automatically learns features from raw inputs. Deep learning algorithms, especially Convolutional Neural Networks (CNNs), have rapidly become the methodology of choice for action recognition in videos. This paper discusses recent trends and approaches in deep learning used in HAR, including CNNs, Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and autoencoders. Challenges are identified to motivate researchers toward future work.

Keywords / Index Terms

HAR, CNN, LSTM, Deep Learning model
