Temporal fusion strategy for violence detection: utilising convolutional and LSTM neural networks for surveillance videos
Document type
article
Peer-reviewed
publishedVersion
Author
Merit, Khaled
Beladgham, Mohammed
Taleb-Ahmed, Abdelmalik
Rights
Creative Commons Attribution 4.0 International License
http://creativecommons.org/licenses/by/4.0/
openAccess
Abstract
Modern smart cities pursue the highest possible degree of automation and service integration. One of the major challenges in the surveillance industry is automating real-time video analysis to identify critical cases. This paper introduces models based on Convolutional Neural Networks (CNNs), specifically MobileNet V3, VGG16, and InceptionV3, as well as models using LSTM and feedforward networks. These models are designed to classify videos into two mutually exclusive classes: “Non-Violence” and “Violence”. The RLVS database is used for this classification task. The temporal fusion approaches use various data representations. The best result obtained was an accuracy of 91.03 % and an F1-score of 90.90 %, which exceeds the results reported in similar research on the same database for recognising violent actions in surveillance videos.
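The abstract does not spell out how the temporal fusion is performed, so the following is only a minimal sketch of one simple variant, score-level (late) fusion, together with the two reported metrics. It assumes that some per-frame classifier (for example a CNN) has already produced a violence probability for each frame; the function names `fuse_frame_scores` and `accuracy_and_f1`, the 0.5 threshold, and the example scores are all hypothetical and are not taken from the paper.

```python
from statistics import fmean


def fuse_frame_scores(frame_scores, threshold=0.5):
    """Late temporal fusion: average per-frame violence probabilities,
    then threshold the mean to obtain one label per video.

    The threshold of 0.5 is an assumption, not a value from the paper.
    """
    if not frame_scores:
        raise ValueError("need at least one frame score")
    video_score = fmean(frame_scores)
    label = "Violence" if video_score >= threshold else "Non-Violence"
    return label, video_score


def accuracy_and_f1(y_true, y_pred, positive="Violence"):
    """Compute accuracy and the F1-score of the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical per-frame probabilities for four short videos.
scores = [
    [0.9, 0.8, 0.7],
    [0.1, 0.2, 0.3],
    [0.6, 0.4, 0.2],
    [0.2, 0.3, 0.1],
]
truth = ["Violence", "Non-Violence", "Violence", "Non-Violence"]

preds = [fuse_frame_scores(s)[0] for s in scores]
acc, f1 = accuracy_and_f1(truth, preds)
print(preds)
print(f"accuracy={acc:.4f}, f1={f1:.4f}")
```

Averaging frame scores ignores the order of frames; a recurrent model such as the LSTM mentioned in the abstract would instead consume the sequence of per-frame features, which this sketch does not attempt to reproduce.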
Except where otherwise noted, this record is licensed under the Creative Commons Attribution 4.0 International License.