Context
Notations
- Machine Learning (ML)
- Deep Learning (DL)
- Neural Network (NN)
- Network Intrusion Detection System (NIDS)
- Host Intrusion Detection System (HIDS)
- Denial of Service (DoS), Distributed Denial of Service (DDoS)
Why Machine Learning for Network Security?
Traditional approaches
Traditional approaches to cyber threat detection, such as signature-based detection, rule-based systems, and heuristic-based systems, are limited to known threats.
This means they can be bypassed by unknown and emerging threats.
Example: Snort (snort.org) rule that detects an ICMP echo request sent from an external source (a network-scanning attempt)
alert icmp any any -> $HOME_NET any (msg:"ICMP Ping"; itype:8; sid:1000001; rev:1;)
How It Can Be Bypassed:
Non-ICMP Scan: Attackers can use other protocols or methods to scan or probe a network (e.g., TCP SYN scan, UDP scan)
ML-based network intrusion detection
Definition: Machine learning (ML) is the process of using historical data to create a prediction algorithm for future data
Machine Learning offers great adaptability and the ability to learn from data, enabling the identification of unknown and evolving threats
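As a toy illustration of this definition, the sketch below "trains" on a handful of hypothetical flow records and classifies a new flow by its nearest class centroid. The feature values and labels are invented for illustration only; a real NIDS would use far richer features and models.

```python
# Minimal sketch: learn a decision boundary from historical flow features.
# All feature values below are hypothetical, for illustration only.

def centroid(rows):
    """Mean of each feature across the given rows."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def train(benign, malicious):
    """'Training' here = storing one centroid per class."""
    return {"benign": centroid(benign), "malicious": centroid(malicious)}

def predict(model, flow):
    """Classify a new flow by its nearest class centroid (Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(model, key=lambda label: dist(model[label], flow))

# Hypothetical historical data: [packets/sec, mean packet size in bytes]
benign    = [[20, 800], [35, 900], [25, 750]]
malicious = [[900, 60], [1200, 64], [800, 70]]   # e.g. flood-style traffic

model = train(benign, malicious)
print(predict(model, [1000, 64]))   # high-rate, tiny packets -> "malicious"
```

The point is the workflow, not the algorithm: the prediction rule is derived from historical data, so previously unseen flows can still be classified.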
Examples of traditional approaches
Rule-based detection:
- firewall rules (e.g., block a range of source IP addresses), NIDS (the system can detect a sequence of network packets that matches the characteristics of a SQL injection attack)
- Access Control List (ACL) rules (e.g., a rule might specify that only certain users are allowed to access a sensitive database)
Signature-based detection:
- antivirus signatures (specific patterns of bytes)
- email spam filters (e.g., keywords, URLs, or patterns commonly associated with spam or phishing)
- application whitelisting (e.g., if an application lacks a valid signature or has an unapproved signature, it’s blocked from execution)
- Realistic example: SNORT, an open-source network intrusion detection/prevention system (for instance, rules 3 (SQL injection) and 5 (malware signature) at https://www.sapphire.net/security/snort-rules-examples/)
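A minimal sketch of how signature-based detection works: scan a payload for known byte patterns. The signature set below is hypothetical (one entry reuses a fragment of the harmless EICAR antivirus test string).

```python
# Toy signature-based scan: flag a payload if it contains any known byte
# pattern. The signature database here is hypothetical, for illustration.
SIGNATURES = {
    "eicar-like test string": b"X5O!P%@AP",      # fragment of the EICAR test file
    "fake-malware marker":    b"\xde\xad\xbe\xef",
}

def scan(payload: bytes):
    """Return the names of all signatures found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]

print(scan(b"hello X5O!P%@AP world"))   # matches the EICAR-like fragment
print(scan(b"completely benign"))       # [] -> nothing detected
```

Note the limitation discussed above: any payload whose bytes differ from every stored pattern, even trivially, passes the scan.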
Limitations of traditional approaches
- Static Nature: Traditional methods rely on predefined rules, signatures, or heuristics that are static and inflexible. They can only detect threats for which they have explicit definitions. When faced with new or modified threats, these methods often lack the capacity to recognize unfamiliar patterns.
- Behavioural Drift: Over time, normal network traffic patterns can drift, and a heuristic system may continue to rely on outdated models
- Reduced Effectiveness: Over time, the NIDS might struggle to adapt to changes in legitimate network usage and to keep up with new threats, leading to inefficiencies in detection.
- Lack of Context: Traditional methods may lack the contextual understanding required to differentiate between legitimate deviations and actual threats (e.g., flash crowds confused with DDoS attacks)
- Zero-Day Vulnerabilities: Zero-day vulnerabilities are newly discovered vulnerabilities that have not yet been patched. Traditional methods struggle to detect attacks exploiting these vulnerabilities because there are no known signatures or rules to match against
Sophisticated attacks:
- polymorphic malware (constantly changing appearance or behaviour)
- slow, low-profile attacks (low-rate traffic to avoid detection, e.g., port scan)
- multi-stage attacks (reconnaissance, initial compromise, infiltration, lateral movement)
ML-based network intrusion detection
- Adaptability and Generalisation: ML models are designed to learn from data rather than rely on predefined rules or signatures
- Feature Extraction: some ML models can automatically extract relevant features from data
- Learning Context: ML models learn from historical data and understand the context of normal operations
- Feature Learning: Some ML methods, like deep learning, can learn hierarchical representations of data
- Behavioural Analysis: Anomaly detection using ML can capture deviations from normal behaviour, even if the specific attack patterns are previously unseen
- Continuous Learning: ML models can continuously learn and adapt as new data becomes available
- Multi-Dimensional Analysis: ML models can analyse various dimensions of data, including temporal, spatial, and behavioural aspects
- Granular Detection: ML models can detect subtle variations in behaviour, attributes, or patterns within data, making them effective against polymorphic threats that modify their behaviour
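As one concrete instance of continuous learning, the sketch below keeps an exponential-moving-average baseline of the request rate, so "normal" is re-estimated with every observation rather than fixed once. All traffic numbers and the smoothing factor are synthetic assumptions.

```python
# Sketch of "continuous learning": an online model that updates its estimate
# of the normal request rate with every new observation (exponential moving
# average), so the notion of "normal" adapts as traffic evolves.
class OnlineBaseline:
    def __init__(self, alpha=0.1):
        self.alpha = alpha        # how fast the model adapts to new data
        self.mean = None

    def update(self, rate):
        if self.mean is None:
            self.mean = float(rate)
        else:
            self.mean = (1 - self.alpha) * self.mean + self.alpha * rate
        return self.mean

m = OnlineBaseline()
for r in [100, 102, 98, 101]:     # synthetic "normal" requests/sec
    m.update(r)
print(round(m.mean, 1))           # baseline near 100

# Legitimate usage grows; the baseline follows instead of staying static.
for r in [200] * 50:
    m.update(r)
print(round(m.mean))              # baseline has drifted toward 200
```

A static threshold rule would keep alerting on the new legitimate load; the adaptive baseline does not.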
Use cases: DDoS
Distributed Denial of Service (DDoS)
A DDoS attack involves multiple connected online devices (a botnet) that are used to overwhelm a target host or network
These attacks overload the target with an excessive amount of traffic, making it difficult or impossible for legitimate users to access the resources or services provided by the target
DDoS attack detection
Rule-based approaches involve creating specific rules or conditions that define known attack patterns. Some examples:
- Threshold-Based Rule: If the incoming request rate from a single IP address exceeds a predefined threshold within a short time window, trigger an alert
- Protocol Anomaly Rule: If a large percentage of incoming requests do not adhere to the expected protocol behaviour (e.g., incomplete or malformed requests), trigger an alert
- Resource Utilisation Rule: If server CPU, memory, or network utilisation exceeds predefined thresholds, trigger an alert (DDoS attacks aim to overwhelm server resources)
- Rate Spike Rule: If the rate of incoming requests across the entire network exceeds a certain threshold, trigger an alert
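The first rule above (per-IP request rate over a time window) can be sketched as a sliding-window counter. The window size and threshold below are hypothetical values, not recommendations.

```python
from collections import deque, defaultdict

# Hypothetical thresholds; real deployments tune these per network.
WINDOW_SECONDS = 10
MAX_REQS_PER_IP = 100

class ThresholdDetector:
    """Threshold-based rule: alert if one IP exceeds MAX_REQS_PER_IP
    requests inside a sliding WINDOW_SECONDS window."""
    def __init__(self):
        self.hits = defaultdict(deque)   # ip -> timestamps of recent requests

    def observe(self, ip, ts):
        q = self.hits[ip]
        q.append(ts)
        # Drop timestamps that fell out of the window.
        while q and ts - q[0] > WINDOW_SECONDS:
            q.popleft()
        return len(q) > MAX_REQS_PER_IP   # True => raise an alert

det = ThresholdDetector()
alerts = [det.observe("203.0.113.7", t * 0.05) for t in range(150)]
print(any(alerts))   # 150 requests in ~7.5s from one IP -> True
```

This also shows the brittleness of the approach: an attacker sending just under the threshold from many IPs triggers nothing.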
Heuristic algorithms don’t rely on strict predefined signatures or fixed thresholds like rule-based approaches. Instead, they use a combination of rules and conditions based on general principles, experience, and patterns of behaviour:
- Can involve various logical operators and conditions to determine whether a particular behaviour or pattern should raise an alert
- In a heuristic-based DDoS detection system, an operator with domain expertise might craft rules that take into account the distribution of incoming requests, the types of requests, the geographic origin of traffic, sudden spikes in traffic, and more
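A heuristic check of the kind described might combine several weighted conditions into a score. The conditions, weights, and thresholds below are invented for illustration.

```python
# Sketch of a heuristic DDoS check combining several weighted conditions,
# as an operator with domain expertise might craft. All weights and
# thresholds are hypothetical.
def heuristic_alert(reqs_per_sec, pct_malformed, distinct_src_ips, baseline_rps):
    score = 0
    if reqs_per_sec > 5 * baseline_rps:   score += 2   # sudden traffic spike
    if pct_malformed > 0.3:               score += 2   # protocol anomalies
    if distinct_src_ips > 1000:           score += 1   # many distributed sources
    return score >= 3                                  # raise alert above this score

print(heuristic_alert(50_000, 0.4, 8_000, baseline_rps=1_000))  # True
print(heuristic_alert(1_200, 0.01, 40, baseline_rps=1_000))     # False
```

Unlike a single fixed threshold, no one condition alone decides; an alert needs corroborating evidence across several signals.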
ML-based approaches involve training models to learn from historical data and then using those models to detect anomalies or deviations from normal behaviour
- ML algorithms can be used to predict potential DDoS attacks by analysing historical traffic data and recognising trends that typically precede an attack
- Even when traffic is encrypted, ML models can still identify suspicious patterns in traffic behaviour (e.g., volume, timing) without relying on the content of the packets
- ML techniques like anomaly detection can spot subtle deviations in network traffic behaviour that might go unnoticed with traditional approaches
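A minimal anomaly-detection sketch in this spirit: model normal traffic as a Gaussian over requests/sec and flag rates whose z-score exceeds a threshold. The baseline numbers are synthetic.

```python
# Sketch of statistical anomaly detection: fit mean/std of "normal"
# request rates and flag large deviations (z-score). Synthetic data.

def fit(normal_rates):
    n = len(normal_rates)
    mean = sum(normal_rates) / n
    var = sum((x - mean) ** 2 for x in normal_rates) / n
    return mean, var ** 0.5

def is_anomaly(model, rate, z_threshold=3.0):
    mean, std = model
    return abs(rate - mean) / std > z_threshold

baseline = [100, 110, 95, 105, 98, 102, 97, 108]   # normal requests/sec
model = fit(baseline)
print(is_anomaly(model, 104))    # within normal variation -> False
print(is_anomaly(model, 5000))   # sudden flood -> True
```

Because the model is learned from observed traffic rather than written by hand, it flags deviations without needing a signature for the specific attack.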
Use case: Polymorphic malware
Polymorphic malware consists of two parts:
- Encrypted virus body: code that changes its shape
- Virus encryption/decryption routine: code that doesn’t change its shape and decrypts and encrypts the other part
This malware has the capability to constantly change its code structure and appearance, making it nearly impossible to create a static signature that can accurately identify it
An ML model might focus on features such as communication patterns, data exfiltration behaviour, system resource usage, and other subtle indicators that remain relatively consistent despite the polymorphic changes in the malware’s code
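The contrast can be sketched as follows: two byte-level variants of the same payload defeat a hash-based signature, while a toy behavioural score (entirely hypothetical features and thresholds) treats them alike.

```python
import hashlib

# Why static signatures fail: two functionally equivalent payloads whose
# bytes differ (a toy stand-in for polymorphic re-encryption) hash
# differently, so a hash-based signature misses one of them.
variant_a = b"\x90\x90" + b"PAYLOAD"          # hypothetical byte sequences
variant_b = b"\xeb\x00" + b"PAYLOAD"
print(hashlib.sha256(variant_a).hexdigest()
      == hashlib.sha256(variant_b).hexdigest())   # False

# Behavioural features, by contrast, stay comparable across variants.
# Hypothetical per-process observations: connections/min, bytes sent/min.
def behaviour_score(conns_per_min, bytes_out_per_min):
    """Toy rule: flag processes that beacon frequently AND upload a lot."""
    return conns_per_min > 30 and bytes_out_per_min > 1_000_000

print(behaviour_score(45, 5_000_000))   # both variants behave alike -> True
```

A trained model would replace the hand-written rule, but the insight is the same: behaviour is harder to mutate than bytes.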
Limitations of Machine Learning in Network Security
Data
- Dependency on Training Data: ML models require high-quality, up-to-date, and diverse training data to be effective. This applies to both benign and malicious training data
- Data Privacy: ML operations might raise privacy concerns, especially in cases where network data contains sensitive information.
ML model design and training issues
- False Positives and Negatives: ML models can generate false positives (legitimate traffic classified as malicious) or false negatives (malicious traffic classified as legitimate). Achieving the right balance between these two is challenging and often requires tuning the model parameters and thresholds
- Resource Intensive: Training and maintaining ML models can be computationally intensive and resource-demanding
- Overfitting and Generalisation: ML models can overfit the training data, meaning they perform well on the training data but poorly on unseen data
- Model Complexity: Complex ML models might require significant computational resources and expertise to develop, deploy, and maintain. Simpler models might not capture intricate attack patterns effectively
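A toy demonstration of the overfitting point above: a "model" that simply memorises its training set scores perfectly on seen data but roughly at chance on unseen data when the labels carry no learnable pattern. The data here is random noise, generated purely for illustration.

```python
import random

random.seed(0)

# Labels are pure noise: there is no real pattern to learn, so a model
# that fits the training data perfectly cannot possibly generalise.
train_data = [(random.random(), random.choice([0, 1])) for _ in range(50)]
test_data  = [(random.random(), random.choice([0, 1])) for _ in range(50)]

memory = dict(train_data)            # "model" = a lookup table

def predict(x):
    return memory.get(x, 0)          # unseen input -> fixed guess

train_acc = sum(predict(x) == y for x, y in train_data) / len(train_data)
test_acc  = sum(predict(x) == y for x, y in test_data) / len(test_data)
print(train_acc)   # 1.0 -- perfect on data it has seen
print(test_acc)    # roughly chance -- fails to generalise
```

The gap between training and held-out accuracy is exactly what evaluation on unseen data is meant to expose.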
Operational issues
- Adversarial Attacks: Adversaries can manipulate network traffic to evade ML-based NIDS, or can poison the training data to influence the NIDS’s operations.
- Concept Drift: Network behaviours and attack techniques change over time (“concept drift”), so a model trained on past traffic gradually loses accuracy unless it is retrained.
- Zero-Day Attacks: While ML models can detect novel attacks based on behavioural patterns, they might not always be effective against entirely new and unknown attack techniques until they have enough data to learn from.
- Lack of Interpretability: Many ML models, especially complex ones, are often considered “black boxes.” This lack of interpretability can make it difficult to understand how a model arrives at its decisions, which is crucial for security analysis and forensics.