Temporal fusion strategy for violence detection: utilising convolutional and LSTM neural networks for surveillance videos

Merit, Khaled; Beladgham, Mohammed; Taleb-Ahmed, Abdelmalik

Document type

article
Peer-reviewed
publishedVersion

Rights

Creative Commons Attribution 4.0 International License
http://creativecommons.org/licenses/by/4.0/
openAccess

Abstract

Modern smart cities pursue the highest possible degree of automation and service integration. One of the major challenges in the surveillance industry is automating real-time video analysis to identify critical cases. This paper introduces models based on Convolutional Neural Networks (CNNs), specifically MobileNet V3, VGG16, and InceptionV3, as well as LSTM and feedforward networks. These models are designed to categorise videos into two mutually exclusive classes: “Non-Violence” and “Violence”. The RLVS database is used for this classification task, and the temporal fusion approaches operate on various data representations. The best result was an accuracy of 91.03 % and an F1-score of 90.90 %, which is superior to the results of similar research on the same database for recognising violent actions in surveillance videos.
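
The two metrics quoted in the abstract can be made concrete with a short sketch. The function below computes accuracy and F1-score for the binary task, treating “Violence” as the positive class. That choice, the function name `accuracy_and_f1`, and the toy labels are assumptions made for illustration; they are not taken from the paper, and the toy numbers are unrelated to the reported results.

```python
def accuracy_and_f1(y_true, y_pred, positive="Violence"):
    """Return (accuracy, F1) for a binary labelling task.

    `positive` names the class counted as a detection; using
    "Violence" here is an assumption, not something the paper states.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical per-video labels, for illustration only.
truth = ["Violence", "Violence", "Violence",
         "Non-Violence", "Non-Violence", "Non-Violence"]
preds = ["Violence", "Violence", "Non-Violence",
         "Non-Violence", "Non-Violence", "Violence"]

acc, f1 = accuracy_and_f1(truth, preds)
print(f"accuracy = {acc:.4f}, F1 = {f1:.4f}")
```

Accuracy counts every correctly labelled video, whereas F1 ignores true negatives, so the two values can diverge when the classes are imbalanced.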
