Concept Drift in Anomaly Detection
Concept drift
Concept drift is a phenomenon in machine learning and data mining where the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways.
This means that the relationship between the input data and the model's output is no longer valid.
In simpler terms, the patterns the model learned from historical data are no longer accurate in predicting future outcomes because the underlying conditions have changed.
Examples of concept drift
Machine learning-based detection algorithms often operate under a closed-world assumption, where training and testing samples are independent and identically distributed.
Concept drift in industrial applications:
- Water Treatment: Changes in the quality and composition of incoming water due to seasonal variations, industrial discharges, or changes in source water can affect treatment processes
- Industry:
  - Gradual degradation of machinery and equipment can lead to changes in vibration patterns, energy consumption, and other operational metrics, affecting predictive maintenance models
  - Persistent exposure to electromagnetic noise can degrade the performance of sensors. Sensors might start to produce noisy or biased measurements, leading to a shift in the data distribution
  - Introduction of new electronic equipment or machinery that generates electromagnetic interference can cause variations in the noise level. This can lead to changes in the data patterns collected by sensors
Types of concept drift
- Sudden Drift: The change happens abruptly. For instance, sensor cleaning, redundant devices starting operations
- Incremental Drift: The change occurs gradually over time. An example could be the slow but steady increase of dust on optical sensors
- Seasonal Drift: Changes recur in a cyclical pattern, such as variations in ICS operations due to seasons or holidays
- Recurring Drift: Similar to seasonal drift, but the changes are not necessarily tied to a specific period
- For example, day-night shifts or varying demand periods
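These drift types are easy to visualise on synthetic data. The sketch below is a minimal illustration whose signal parameters (jump size, slope, period) are invented for the example; it superimposes a sudden, an incremental, and a seasonal drift on a stationary baseline.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
t = np.arange(1000)                       # time steps
baseline = rng.normal(0.0, 1.0, t.size)   # stationary "normal" signal

# Sudden drift: the mean jumps abruptly at t = 500
sudden = baseline + np.where(t >= 500, 3.0, 0.0)

# Incremental drift: the mean rises slowly but steadily
incremental = baseline + 0.005 * t

# Seasonal / recurring drift: the mean follows a cyclical pattern
seasonal = baseline + 2.0 * np.sin(2 * np.pi * t / 250)
```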
Impact on anomaly detection
Anomaly detection systems rely on building a model of normal behaviour based on historical data
When concept drift occurs, the model becomes outdated, leading to:
- Increased false positives: Normal data that deviates from the old model is incorrectly flagged as anomalous
- Increased false negatives: True anomalies that align with the new normal data distribution might be missed or ignored
The system’s ability to accurately detect anomalies deteriorates
Addressing concept drift
- Model Retraining: Regularly retraining the model, from scratch or starting from prior knowledge, using updated or additional data (a minimal sketch follows this list)
- Adaptive Learning: Implementing online learning algorithms that can adapt to new data in real-time
- Ensemble Methods: Combining multiple models to mitigate the impact of drift by leveraging diverse perspectives on the data
- Feedback Loops: Incorporating feedback from domain experts or automated systems to continuously refine and update the model based on real-time anomaly detection results
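As a concrete illustration of periodic retraining, the sketch below refits a simple normality model on a sliding window of recent scores so that the threshold tracks the current distribution. It is a minimal example under stated assumptions: the window size, the retraining interval, and the mean-plus-k-std "model" are all invented here, not a prescribed method.

```python
import numpy as np

def retrain_threshold(scores_window: np.ndarray, k: float = 2.0) -> float:
    """Refit a simple normality model (mean + k*std) on recent scores."""
    return float(scores_window.mean() + k * scores_window.std())

def detect_with_retraining(scores: np.ndarray,
                           window: int = 500,
                           retrain_every: int = 100) -> np.ndarray:
    """Flag anomalies while periodically retraining the threshold
    on the most recent `window` anomaly scores."""
    flags = np.zeros(scores.size, dtype=bool)
    threshold = retrain_threshold(scores[:window])
    for i in range(window, scores.size):
        if (i - window) % retrain_every == 0:
            threshold = retrain_threshold(scores[i - window:i])
        flags[i] = scores[i] > threshold
    return flags
```

Note that retraining blindly on recent data can fold true anomalies into the normality model, which is one reason the feedback loops mentioned above matter.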
Types of anomalies
Point Anomalies: These are individual data points that deviate significantly from the rest of the dataset
- For example, a sudden spike in temperature in a cooling system could be a point anomaly
Contextual Anomalies: Anomalies that are only unusual in a specific context
- For instance, a high temperature might be normal during peak operation hours but anomalous during downtime
Collective Anomalies: A series of data points that, as a group, represent an anomalous pattern, even if individual points are not
- For example, a sequence of individually plausible sensor readings that together trace an abnormal trend
Selecting the anomaly threshold
When only normal data is available for anomaly detection, the task is unsupervised
- The challenge is to identify a threshold that separates the normal data from potential anomalies
- Two common methods for anomaly threshold selection:
- Z-score
- Percentile
The Z-score
Z-score: The Z-score is a statistical measure of how many standard deviations a data point is from the mean of a dataset
$$Z(\mathbf{x}) = \frac{\mathbf{x} - \mu}{\sigma}$$
NOTE: In anomaly detection, a data point is the anomaly score of a sample (e.g., the reconstruction error)
- Z-score of 0: This means the data point’s value is the same as the mean value of the data set
- Z-score of 1.0: This value indicates one standard deviation above the mean
- Z-score greater than 1.0: The data point is considered unusual or farther from the mean
- Example: an autoencoder threshold applied to the reconstruction error $MSE(\mathbf{x}, \mathbf{x}')$
  - A Z-score of 2 means the data point is two standard deviations above the mean
  - Equivalent to $T = \mu + 2\sigma$ (see the worked example below)
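As a quick worked example with assumed values (not taken from any lab), suppose the validation reconstruction errors have $\mu = 0.02$ and $\sigma = 0.005$; a Z-score threshold of 2 then gives $T = 0.02 + 2 \times 0.005 = 0.03$, and any sample with $MSE(\mathbf{x}, \mathbf{x}') > 0.03$ is flagged as anomalous.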
Example: anomaly detection with autoencoders
In this lab, the anomaly threshold is computed as the sum of the mean ($\mu$) and the standard deviation ($\sigma$) of the reconstruction errors obtained on the validation samples (lines 3 and 5 in the code snippet)
$$T = \mu + \sigma$$
In the test phase (lines 13 and 17), the samples whose reconstruction error is larger than the threshold are considered anomalies
$$MSE(\mathbf{x}, \mathbf{x}') > T$$
This is equivalent to a Z-score threshold of 1 standard deviation:
$$T = \mu + \sigma \iff T - \mu = \sigma \iff \frac{T - \mu}{\sigma} = 1$$
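The lab's snippet itself is not reproduced in this section, so the following is only a minimal sketch of the computation it describes; the function names and the 2-D array layout (samples × features) are assumptions, and `x_val_rec`/`x_test_rec` stand for the autoencoder's reconstructions.

```python
import numpy as np

def mse_per_sample(x: np.ndarray, x_rec: np.ndarray) -> np.ndarray:
    """Per-sample reconstruction error MSE(x, x')."""
    return np.mean((x - x_rec) ** 2, axis=1)

def fit_threshold(x_val: np.ndarray, x_val_rec: np.ndarray) -> float:
    """Validation phase: T = mean + std of the validation errors."""
    errors = mse_per_sample(x_val, x_val_rec)
    return float(errors.mean() + errors.std())   # T = mu + sigma

def flag_anomalies(x_test: np.ndarray, x_test_rec: np.ndarray,
                   threshold: float) -> np.ndarray:
    """Test phase: a sample is anomalous if MSE(x, x') > T."""
    return mse_per_sample(x_test, x_test_rec) > threshold
```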
Percentile-based anomaly threshold
Percentile Method. A percentile is a measure that indicates the value below which a given percentage of observations in a dataset falls
- Example: the 90th percentile is the value below which 90% of the data points lie
Steps for Percentile-Based Anomaly Detection:
- Calculate Anomaly Scores. If you have raw data, you might first need to compute an anomaly score for each data point
  - This could be based on distance metrics, reconstruction errors (e.g., MSE), or other methods
- Compute Percentiles. Once you have your anomaly scores, calculate the relevant percentiles
  - For example, you could calculate the 95th, 99th, or even 99.9th percentiles, depending on how strict you want to be in identifying anomalies
- Set the Anomaly Threshold. Determine which percentile will serve to compute your threshold for flagging anomalies
  - Typically, a high percentile (e.g., 95th, 99th) is chosen
  - The threshold is the minimum anomaly score among the samples that fall outside the chosen percentile
  - This means any data point with an anomaly score above this threshold is considered an anomaly (a sketch of these steps follows)
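A minimal sketch of these steps with NumPy follows; the 99th percentile and the synthetic gamma-distributed scores are illustrative choices, not values from any lab, and here the percentile value itself serves as the decision boundary.

```python
import numpy as np

def percentile_threshold(scores: np.ndarray, q: float = 99.0) -> float:
    """Use the q-th percentile of the (normal) anomaly scores as the
    threshold: q% of the scores fall below the returned value."""
    return float(np.percentile(scores, q))

# Illustrative scores standing in for validation reconstruction errors
rng = np.random.default_rng(seed=0)
val_scores = rng.gamma(shape=2.0, scale=0.01, size=10_000)

threshold = percentile_threshold(val_scores, q=99.0)
test_scores = np.array([0.01, 0.05, 0.20])
print(test_scores > threshold)  # True entries are flagged as anomalies
```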
Anomaly threshold and concept drift
- Anomaly detection systems typically rely on the assumption that normal behaviour is consistent over time
- When concept drift occurs, what was once normal might now be considered anomalous or vice-versa
- This makes it difficult to maintain a fixed threshold for detecting anomalies
Static Threshold Issues
A threshold that works well in one period may become ineffective if the data distribution shifts.
For example, a threshold that was effective in detecting anomalies in a dataset of sensor readings might become either too sensitive (high false positive rate) or too conservative (high false negative rate) after a concept drift.
Adaptive Thresholding
One approach is to dynamically adjust the threshold in response to detected concept drift.
However, determining the optimal adjustment mechanism is complex and often requires real-time monitoring and tuning, which can be resource-intensive.
It might also involve human intervention to ensure that the data used to tune the threshold that discriminates normal behaviour from anomalies is correctly labelled (e.g., all the samples are normal).
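One possible realisation is sketched below: it keeps exponentially weighted moving estimates of the score mean and variance and recomputes $T = \mu + k\sigma$ on the fly. The smoothing factor `alpha`, the multiplier `k`, and the normal-only update rule are all assumptions of this sketch, not a prescribed method.

```python
import numpy as np

class AdaptiveThreshold:
    """Adaptive threshold T = mu + k*sigma, where mu and sigma are
    exponentially weighted moving estimates of the score stream."""

    def __init__(self, mu0: float, var0: float,
                 alpha: float = 0.01, k: float = 3.0):
        self.mu, self.var = mu0, var0   # initial calibration values
        self.alpha, self.k = alpha, k

    def update(self, score: float) -> bool:
        """Classify one score and, if it looks normal, adapt the threshold."""
        is_anomaly = score > self.mu + self.k * np.sqrt(self.var)
        if not is_anomaly:
            # Only scores judged normal update the estimates; mislabelled
            # samples would silently drag the threshold (hence the need
            # for correctly labelled data noted above).
            delta = score - self.mu
            self.mu += self.alpha * delta
            self.var = (1 - self.alpha) * (self.var + self.alpha * delta ** 2)
        return is_anomaly
```

Calibrating `mu0` and `var0` on validation scores and then calling `update` once per incoming score lets the boundary track gradual drift, at the cost of the monitoring and tuning burden described above.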
Drift detection delay
Detecting concept drift in real-time is difficult.
There is often a delay between the occurrence of the drift and its detection.
During this delay, the anomaly detection system may operate with a suboptimal threshold, leading to inaccurate detection rates.
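To make the delay concrete, here is a minimal sketch of the Page-Hinkley test, a classic drift detector; the tolerance `delta` and alarm threshold `lam` are illustrative values. The alarm fires only after the cumulative deviation from the running mean exceeds `lam`, so some lag after the drift onset is unavoidable.

```python
import numpy as np

class PageHinkley:
    """Page-Hinkley test for detecting an upward shift in a stream's mean."""

    def __init__(self, delta: float = 0.05, lam: float = 50.0):
        self.delta, self.lam = delta, lam       # tolerance, alarm threshold
        self.n, self.mean = 0, 0.0
        self.cum, self.cum_min = 0.0, 0.0

    def update(self, x: float) -> bool:
        self.n += 1
        self.mean += (x - self.mean) / self.n   # running mean
        self.cum += x - self.mean - self.delta  # cumulative deviation
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.lam  # drift alarm

# The stream's mean jumps at index 500; the alarm fires some steps later,
# and that gap is the detection delay discussed above.
rng = np.random.default_rng(seed=0)
stream = np.concatenate([rng.normal(0.0, 1.0, 500),
                         rng.normal(3.0, 1.0, 500)])
ph = PageHinkley()
alarm_at = next((i for i, x in enumerate(stream) if ph.update(x)), None)
print(alarm_at)  # typically shortly after 500
```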