|dc.description.abstract||This habilitation thesis presents advancements in machine learning for computer security,
arising from problems in network intrusion detection and steganography.
The thesis emphasizes the explanation of traits shared by steganalysis, network intrusion
detection, and other security domains, traits that make these domains different from
computer vision, speech recognition, and other fields where machine learning is typically
studied. The thesis then presents methods developed to at least partially solve the identified
problems, with the overall goal of making machine-learning-based intrusion detection
systems viable. Most of these methods are general in the sense that they can be used outside intrusion
detection and steganalysis on problems with similar constraints.
A common feature of all methods is that they are generally simple, yet surprisingly
effective. According to large-scale experiments, they almost always improve on the prior art,
likely because they are tailored to security problems and designed for large volumes of data.
Specifically, the thesis addresses the following problems:
anomaly detection with low computational and memory complexity such that efficient
processing of large data is possible;
multiple-instance anomaly detection improving the signal-to-noise ratio by classifying
larger groups of samples;
supervised classification of tree-structured data simplifying their encoding in neural networks;
clustering of structured data;
supervised training with an emphasis on precision in the top p% of returned data;
and finally, explanation of anomalies to help humans understand the nature of an anomaly
and speed up their decisions.
Many algorithms and methods presented in this thesis are deployed in a real intrusion
detection system protecting millions of computers around the globe.||cze