Intrusion Detection Using Heterogeneous Data Sources

dc.contributor.advisor: Ghaderi, Majid
dc.contributor.author: Abhari, Bardia
dc.contributor.committeemember: Henry, Ryan Douglas
dc.contributor.committeemember: Hudson, Jonathan Wiliam
dc.date.accessioned: 2024-01-19T21:55:51Z
dc.date.available: 2024-01-19T21:55:51Z
dc.date.issued: 2024-01-18
dc.description.abstract: Amidst the growing sophistication of cyber-attacks and malware, conventional Intrusion Detection Systems (IDS) often fall short, primarily because they rely on a single data source, as in Network-based (NIDS) or Host-based Intrusion Detection Systems (HIDS). As highlighted in the existing literature, such systems tend to lack a comprehensive view of network activities. Recent research efforts have attempted to integrate multiple heterogeneous data sources, yet they often treat each data source in isolation, thereby overlooking the complex interrelations that exist among data sources within the same network. This thesis introduces IMD-IDS, which stands apart in its ability to fuse multiple heterogeneous data sources effectively for anomaly detection. The centrepiece of IMD-IDS is a machine learning (ML) based detection engine trained concurrently on all available data sources, whether heterogeneous or not. This approach enables IMD-IDS to uncover and understand the intricate relationships between different data sources. To achieve this, a novel fusion algorithm is presented that leverages BERT encoders to convert textual host data into numerical vectors. These vectors are then integrated with feature vectors derived from network data, forming a rich, combined dataset. The XGBoost model employed within IMD-IDS uses this unified dataset to enhance anomaly detection accuracy, benefiting from simultaneous access to diverse data sources. Through experimental validation, this thesis demonstrates that IMD-IDS achieves superior performance compared to previous multi-data-source IDS approaches in detecting both known and zero-day attacks. The results show an average performance improvement of 12% and 10%, respectively, for these attack types.
dc.identifier.citation: Abhari, B. (2024). Intrusion detection using heterogeneous data sources (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
dc.identifier.uri: https://hdl.handle.net/1880/118009
dc.identifier.uri: https://doi.org/10.11575/PRISM/42853
dc.language.iso: en
dc.publisher.faculty: Arts
dc.publisher.institution: University of Calgary
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: Network Security
dc.subject: Intrusion Detection
dc.subject: Word Embedding
dc.subject: Applied Machine Learning
dc.subject: Multi Data Source
dc.subject.classification: Education--Sciences
dc.title: Intrusion Detection Using Heterogeneous Data Sources
dc.type: master thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Calgary
thesis.degree.name: Master of Science (MSc)
ucalgary.thesis.accesssetbystudent: I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible.
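
The abstract describes feature-level fusion: textual host data is encoded into numerical vectors, and those vectors are combined with network feature vectors before a single model is trained on the result. The following is only a rough sketch of that idea and not the thesis's algorithm. It assumes records are aligned by a shared time-window id (the abstract does not say how alignment is done), and it swaps in a small hashed bag-of-words function as a dependency-free stand-in for a BERT encoder. The names `embed_text`, `fuse`, `EMBED_DIM`, and the sample data are all hypothetical.

```python
import hashlib
from typing import Dict, List

# Hypothetical embedding width; real BERT embeddings are far larger.
EMBED_DIM = 8


def embed_text(text: str, dim: int = EMBED_DIM) -> List[float]:
    """Stand-in for a BERT encoder: hashed bag-of-words, normalised to sum to 1."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        # Map each token to a fixed slot using a stable hash.
        idx = int(hashlib.md5(tok.encode("utf-8")).hexdigest(), 16) % dim
        vec[idx] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def fuse(host_logs: Dict[int, str],
         net_features: Dict[int, List[float]]) -> Dict[int, List[float]]:
    """Concatenate the host-text embedding with the network features
    for every window id present in both sources (assumed alignment key)."""
    fused = {}
    for window in sorted(host_logs.keys() & net_features.keys()):
        fused[window] = embed_text(host_logs[window]) + list(net_features[window])
    return fused


# Made-up sample data, for illustration only.
host_logs = {0: "sshd failed password for root", 1: "cron job started"}
net_features = {0: [120.0, 3.0], 1: [15.0, 1.0], 2: [7.0, 0.0]}

fused = fuse(host_logs, net_features)
```

Each fused row could then be passed, together with a label, to a single classifier (XGBoost in the thesis), so that one model sees host and network evidence for the same window at once.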
Files
Original bundle
Name: ucalgary_2024_abhari_bardia.pdf
Size: 1.04 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.62 KB
Format: Item-specific license agreed upon to submission