Abstract:
To protect the integrity and security of networks, experts and researchers are constantly improving ways to detect network-based threats. As security systems advance towards efficient and effective recognition of such threats, attackers and malicious users introduce novel, previously unseen attacks to circumvent these defences. Novel threats are difficult to detect because little information about them is available at first, yet they cannot simply be ignored until sufficient information has been gathered to identify them. A capable security system should therefore detect not only previously known attacks but also unknown, novel ones. We propose a method that detects novel attacks without prior knowledge of, or assumptions about, their intentions and purposes. Our method is based on (1) unsupervised anomaly detection algorithms, which do not require labelled data and therefore serve as a useful exploratory tool for finding potential novel threats in the given data, and (2) local density algorithms, such as LOF and DBSCAN, which analyse data robustly without being affected by arbitrarily shaped data distributions. Our method uses a two-phase structure to mitigate one of the main drawbacks of unsupervised anomaly detection, namely its relative underperformance compared with its supervised and semi-supervised counterparts. We evaluate each phase of our method, as well as the method as a whole, on KDD'99 data to show that it works as designed, and we compare the results against other possible methods to report its strengths and weaknesses. In our experiments, the proposed method detected between 95% and 98% of intrusions, on average, in one KDD'99 dataset and about 88% of intrusions in another, which suggests that our method is accurate. The standard deviations of our results were very small, which suggests that our method is also precise.
Our method outperformed clustering-based and PCA-based approaches in both accuracy and precision.
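
For readers unfamiliar with local density scoring, the following is a minimal Python sketch of the standard LOF score. It is not the two-phase method proposed here, only an illustration of the kind of label-free, density-based scoring the method builds on. The function names (`knn`, `lof_scores`), the neighbourhood size `k`, and the toy points are illustrative assumptions, and ties at the k-th neighbour are cut off at exactly `k` points for simplicity.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return dists[:k]


def lof_scores(points, k=3):
    """Compute a Local Outlier Factor score for every point (no labels needed).

    Scores close to 1 indicate a point whose local density matches that of its
    neighbours; scores well above 1 indicate a point in a much sparser region.
    """
    n = len(points)
    neighbours = [knn(points, i, k) for i in range(n)]

    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k_dist[j], d(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        mean_reach = sum(reach) / len(reach)
        # Guard against division by zero when points coincide.
        lrd.append(1.0 / max(mean_reach, 1e-12))

    # LOF: mean ratio of the neighbours' densities to the point's own density.
    return [
        sum(lrd[j] / lrd[i] for _, j in neighbours[i]) / len(neighbours[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy data (assumed for illustration): a tight cluster and one distant point.
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    for point, score in zip(points, lof_scores(points, k=3)):
        print(point, round(score, 2))
```

In this toy example the distant point receives a score far above 1 while the clustered points stay close to 1. Choosing the threshold that separates the two is left open in this sketch.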