Combining Network Anomaly Detectors

Grill, Martin

Document type

doctoral thesis

Author

Grill, Martin

Supervisor

Pevný, Tomáš

Field of study

Information Science and Computer Engineering

Study program

Electrical Engineering and Information Technology

Degree-granting institution

Czech Technical University in Prague. Faculty of Electrical Engineering. Department of Computer Science

Abstract

Anomaly-based network intrusion detection systems (IDS) typically suffer from a high false alarm rate, which renders them useless in practice, because the subsequent analysis by the network operator is costly and can be done only for a small number of raised alarms. This thesis introduces several novel anomaly detectors and develops techniques for combining them to achieve much lower false positive rates. We propose an IDS architecture that uses a number of simple network anomaly detectors able to identify anomalies relevant to malicious network communication using NetFlow (CAMNEP IDS) or HTTP access log (Cisco Cognitive Threat Analytics, CTA) telemetry data. We introduce several novel network anomaly detection techniques that enrich the ensemble of state-of-the-art network anomaly detection methods used in both detection systems. The detectors are designed to apply different anomaly detection algorithms to different subsets of features, which introduces diversity and allows a wider range of malicious behaviors to be detected. The outputs of the anomaly detectors are combined using two parallel aggregation functions, one constructed in a supervised and one in an unsupervised manner. The unsupervised combination uses a state-of-the-art method that is robust to the presence of low-accuracy detectors. The supervised combination is created using a novel technique that finds a convex combination of the outputs of the anomaly detectors maximizing the accuracy within a given quantile of the most anomalous samples. An extensive experimental evaluation and comparison to prior art on real network data, using the anomaly detectors of both the CAMNEP and CTA intrusion detection systems, shows that the proposed method not only outperforms prior art but is also more robust to noise in the training data labels, which is another important property for deployment in practice.
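The supervised combination described above can be illustrated with a small sketch. It is not the optimization technique of the thesis: the function names, the toy scores, and the coarse grid search over weights are all assumptions made for illustration. It only shows what a convex combination of detector outputs is and how accuracy among the most anomalous samples could be measured.

```python
import math

def convex_combination(scores, weights):
    """Combine per-detector scores of each sample using convex weights."""
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]

def precision_in_top_quantile(combined, labels, quantile):
    """Share of malicious samples among the top `quantile` fraction by score."""
    k = max(1, math.ceil(quantile * len(combined)))
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k

# Toy data (made up): one row per sample, one column per detector.
scores = [
    [0.90, 0.20], [0.80, 0.70], [0.30, 0.95], [0.20, 0.10],
    [0.10, 0.90], [0.40, 0.30], [0.35, 0.50], [0.15, 0.25],
]
labels = [1, 1, 0, 0, 0, 0, 0, 0]  # 1 = malicious, 0 = benign
quantile = 0.25                    # evaluate the top quarter of samples

# Stand-in for the real optimizer: try a few weight vectors on the simplex.
candidates = [(w / 4, 1 - w / 4) for w in range(5)]
best_weights = max(
    candidates,
    key=lambda w: precision_in_top_quantile(
        convex_combination(scores, w), labels, quantile
    ),
)
best_precision = precision_in_top_quantile(
    convex_combination(scores, best_weights), labels, quantile
)
print(best_weights, best_precision)
```

In this toy setting the second detector ranks benign samples highest, so weight vectors that favor the first detector score better on the top quantile. A grid search does not scale beyond a handful of detectors and is used here only to keep the example short.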
Moreover, we propose to smooth the outputs of the ensembles by online Local Adaptive Multivariate Smoothing (LAMS) to further reduce the number of false positives. LAMS reduces the false positives introduced by anomaly detection by replacing the anomaly detector's output on a network event with an aggregate of its outputs on all similar network events observed in the past. These claims are supported by an extensive experimental evaluation involving ensembles of anomaly detectors of both the CTA and CAMNEP intrusion detection systems. We also describe an effective implementation of the proposed solution for processing large streams of non-stationary data. Finally, an extensive experimental evaluation using real network data collected in a number of corporate networks, with a large number of labeled samples, shows that each of these techniques significantly improves the efficacy of the anomaly-based intrusion detection system.
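The smoothing idea can be sketched as follows. This is a minimal illustration of replacing a detector's output with a similarity-weighted aggregate of its past outputs, not the LAMS implementation from the thesis. The class name, the Gaussian similarity, the `bandwidth` parameter, and the bounded history are all assumptions.

```python
import math
from collections import deque

class LocalSmoother:
    """Hypothetical online smoother: averages a detector's raw outputs
    over past events, weighted by similarity to the current event."""

    def __init__(self, bandwidth=1.0, max_history=1000):
        self.bandwidth = bandwidth
        # Bounded history so old events are eventually forgotten
        # (a simple way to cope with non-stationary streams).
        self.history = deque(maxlen=max_history)

    def _similarity(self, a, b):
        # Gaussian kernel on squared Euclidean distance (an assumption).
        sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-sq_dist / (2 * self.bandwidth ** 2))

    def smooth(self, features, raw_score):
        """Store the event, then return the similarity-weighted mean score."""
        features = tuple(features)
        self.history.append((features, raw_score))
        weighted_sum = 0.0
        weight_total = 0.0
        for past_features, past_score in self.history:
            w = self._similarity(features, past_features)
            weighted_sum += w * past_score
            weight_total += w
        # weight_total >= 1 because the current event matches itself.
        return weighted_sum / weight_total

smoother = LocalSmoother(bandwidth=1.0)
# Nine events of a familiar kind, each with a low raw anomaly score.
for _ in range(9):
    smoother.smooth((0.0, 0.0), 0.1)
# A spurious high score on the same kind of event is pulled down.
familiar = smoother.smooth((0.0, 0.0), 0.9)
# A high score on an event unlike anything seen before stays high.
novel = smoother.smooth((10.0, 10.0), 0.9)
print(familiar, novel)
```

An isolated high score on a frequently seen kind of event is averaged with the many low scores recorded for it, while an event with no similar history keeps its raw score. Scanning the whole history for every event is linear in the history size, so a production system would need a more compact summary of past events.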

The following license files are associated with this record:

Original license