Identification of Cyber Threats Information from Online News using Hybrid Machine Learning Algorithm

Publisher

ASTU

Abstract

Large volumes of freely available online news about cyber threats may contain valuable information. The amount of cyber threat information is growing rapidly, and it can be analysed to gain insight into the current situation. However, news is delivered in a variety of forms, and the emergence of new cyber-attacks, together with ambiguous news items, has made detecting related news more challenging. To address this, this paper proposes an identification mechanism for cyber threat information. The system first identifies cyber-attack features, which are used to classify cyber threat information with a Bidirectional Long Short-Term Memory model with a Conditional Random Field layer (BiLSTM-CRF), and then groups similar news articles using Latent Semantic Analysis (LSA) to eliminate ambiguous cyber threat news. Data were collected from news articles about cyber-attacks, including reports of incidents and attacks that had already happened. The sources were news websites such as Recorded Future, FireEye, SecurityWeek, Trend Micro, and the Ethiopian Monitor. The cyber-attack features identified from the collected data include the type of cyber threat, the threat actor, the affected organization, and the affected country. For this research, 2019 cyber-related news articles were collected in the form of unstructured text. Experimental results demonstrate that BiLSTM-CRF is effective for this classification task. The model achieves an overall F-measure of 98.48% for cyber threat information identification, with an accuracy of 99.12%. The findings of this study should assist individuals by presenting a realistic picture of cyber-attack occurrences in our environment and by providing useful information to the general public, thereby improving societal awareness of cyber-attack activities. In addition, the model requires further improvement in selecting cyber-attack features that are difficult for a machine to categorize.
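
As a rough illustration of how sequence-labelling output of this kind is commonly scored, the sketch below converts BIO tags into labelled spans and computes a span-level precision, recall, and F-measure. It is a minimal sketch under stated assumptions, not the evaluation procedure used in this work: the tag names (`ACTOR`, `THREAT`, `ORG`, `COUNTRY`), the BIO scheme, the example sentence, and the helper functions `bio_to_spans` and `f_measure` are all hypothetical, chosen only to mirror the four features named in the abstract.

```python
from typing import List, Set, Tuple

# A span is (label, start_index, end_index_exclusive). Hypothetical type alias.
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labelled spans from a BIO tag sequence (assumed tag scheme)."""
    spans: Set[Span] = set()
    label, start = None, 0
    # A sentinel "O" closes any span still open at the end of the sequence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        # "B-" always opens a span; a stray "I-" is treated leniently as an opener.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def f_measure(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Span-level precision, recall, and F1; a span counts only on exact match."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up sentence: "ExampleGroup deployed ransomware against Acme Bank in Ethiopia"
gold = ["B-ACTOR", "O", "B-THREAT", "O", "B-ORG", "I-ORG", "O", "B-COUNTRY"]
pred = ["B-ACTOR", "O", "B-THREAT", "O", "B-ORG", "O", "O", "B-COUNTRY"]

precision, recall, f1 = f_measure(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this made-up example the predicted organization span stops one token early, so three of the four spans match exactly. Token-level accuracy would be higher than the span-level F-measure here, which is one reason the two figures are usually reported separately.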
