Unsupervised anomaly detection with Temporian and scikit-learn¶
In this tutorial, we use Temporian and Scikit-Learn to detect anomalies in a multivariate time series dataset.
Anomaly detection in time series, time sequences, and other temporal data is critical in a variety of domains. For instance, it is used in manufacturing to detect equipment failures on production lines, in computer systems and financial institutions to detect fraudulent activities, and in energy management to detect outages.
We will use the Server Machine Dataset (SMD) dataset, published as part of the OmniAnomaly paper, which is available in CSV files in that same repository.
This dataset contains aggregated resource usage metrics (e.g., CPU and RAM utilization, network traffic) from 28 computers in 3 data centers over a period of 5 weeks. The timestamps and feature names have been anonymized and normalized. Therefore, calendar feature engineering or the use of expert knowledge is not possible.
The dataset will be loaded, feature engineered and converted into a tabular dataset using Temporian. Then, we will use an isolation forest model on this tabular data to detect anomalies. Finally, we will evaluate the quality of our detection on the ground truth anomalies, available as part of the testing dataset.
We will evaluate our detections first with AUC-ROCs (Area Under the Receiver Operating Characteristic curve) and then with AMOCs (Activity Monitoring Operating Characteristic). Unlike a ROC, an AMOC takes into account the time to detection or forecasting horizon, which is often critical in temporal problems like ours. If you don't know about AMOCs yet, don't worry: we will explain them.
Check out the Supervised anomaly detection tutorial for a version of this notebook that trains a supervised model using the ground truth labels, which is less common in an anomaly detection setting, but can yield better performance if those are available.
Installation and imports¶
%pip install temporian -q
import os
from pathlib import Path
from typing import List
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score
import urllib.request
import temporian as tp
Download dataset¶
The dataset is made of 3 groups of 8, 9, and 11 machines respectively, with names "machine-1-1", ..., "machine-3-11".
Let's list the machine names, and download records locally.
# machines_per_group = [8, 9, 11] # Full dataset
machines_per_group = [3, 3, 3] # Subset of data
machines = [f"machine-{group}-{id}" for group, machine in zip(range(1, 4), machines_per_group) for id in range(1, machine + 1)]
print("{len(machines)} machines")
{len(machines)} machines
data_dir = Path("tmp/temporian_server_machine_dataset")
dataset_url = "https://raw.githubusercontent.com/NetManAIOps/OmniAnomaly/master/ServerMachineDataset"
data_dir.mkdir(parents=True, exist_ok=True)
# Download the data and labels for each machine to its own folder
for machine in machines:
print(f"Download data of {machine}")
machine_dir = data_dir / machine
machine_dir.mkdir(exist_ok=True)
data_path = machine_dir / "data.csv"
if not data_path.exists():
urllib.request.urlretrieve(f"{dataset_url}/test/{machine}.txt", data_path)
labels_path = machine_dir / "labels.csv"
if not labels_path.exists():
urllib.request.urlretrieve(f"{dataset_url}/test_label/{machine}.txt", labels_path)
Download data of machine-1-1 Download data of machine-1-2 Download data of machine-1-3 Download data of machine-2-1 Download data of machine-2-2 Download data of machine-2-3 Download data of machine-3-1 Download data of machine-3-2 Download data of machine-3-3
Load data¶
We use Pandas to load the data into a DataFrame and perform tabular manipulations before importing it into Temporian as an EventSet.
We index the data by the machine column in Temporian. This way, we can apply Temporian transformations to each machine individually as well as across all machines.
dataframes = []
for machine in machines:
machine_dir = data_dir / machine
# Read the data and labels
print(f"Load data of {machine}...", end="")
df = pd.read_csv(machine_dir / "data.csv", header=None).add_prefix("f")
labels = pd.read_csv(machine_dir/ "labels.csv", header=None)
df = df.assign(label=labels)
df["machine"] = machine
df["timestamp"] = range(df.shape[0])
print(f"found {df.shape[0]} events")
dataframes.append(df)
dataframes[0].head(3)
Load data of machine-1-1...found 28479 events Load data of machine-1-2...found 23694 events Load data of machine-1-3...found 23703 events Load data of machine-2-1...found 23694 events Load data of machine-2-2...found 23700 events Load data of machine-2-3...found 23689 events Load data of machine-3-1...found 28700 events Load data of machine-3-2...found 23703 events Load data of machine-3-3...found 23703 events
f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | ... | f31 | f32 | f33 | f34 | f35 | f36 | f37 | label | machine | timestamp | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.075269 | 0.065678 | 0.070234 | 0.074332 | 0.0 | 0.933333 | 0.274011 | 0.0 | 0.031081 | 0.000000 | ... | 0.048893 | 0.000386 | 0.000034 | 0.064432 | 0.064500 | 0.0 | 0.0 | 0 | machine-1-1 | 0 |
1 | 0.086022 | 0.080508 | 0.075808 | 0.076655 | 0.0 | 0.930769 | 0.274953 | 0.0 | 0.031081 | 0.000122 | ... | 0.050437 | 0.000386 | 0.000022 | 0.065228 | 0.065224 | 0.0 | 0.0 | 0 | machine-1-1 | 1 |
2 | 0.075269 | 0.064619 | 0.071349 | 0.074332 | 0.0 | 0.928205 | 0.274953 | 0.0 | 0.030940 | 0.000366 | ... | 0.055069 | 0.000386 | 0.000045 | 0.067111 | 0.067178 | 0.0 | 0.0 | 0 | machine-1-1 | 2 |
3 rows × 41 columns
# Convert the dataframes into a single Temporian EventSet
evset = tp.combine(*map(tp.from_pandas, dataframes))
# Index the EventSet according to the machine name.
evset = evset.set_index("machine")
# Cast the features and label to smaller dtypes to save memory.
evset = evset.cast(tp.float32).cast({"label": tp.int32})
evset
WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_). WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_).
timestamp | f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 | f13 | f14 | f15 | … |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.07527 | 0.06568 | 0.07023 | 0.07433 | 0 | 0.9333 | 0.274 | 0 | 0.03108 | 0 | 0.1341 | 0.08108 | 0.0274 | 0.06781 | 0.1258 | 0.1506 | … |
1 | 0.08602 | 0.08051 | 0.07581 | 0.07666 | 0 | 0.9308 | 0.275 | 0 | 0.03108 | 0.000122 | 0.1488 | 0.1622 | 0.0548 | 0.0714 | 0.1231 | 0.1645 | … |
2 | 0.07527 | 0.06462 | 0.07135 | 0.07433 | 0 | 0.9282 | 0.275 | 0 | 0.03094 | 0.000366 | 0.1348 | 0.0946 | 0.0274 | 0.06328 | 0.129 | 0.1515 | … |
3 | 0.08602 | 0.04873 | 0.06355 | 0.07085 | 0 | 0.9282 | 0.2731 | 0 | 0.02725 | 0.000244 | 0.1313 | 0.08108 | 0.0274 | 0.06784 | 0.1104 | 0.1456 | … |
4 | 0.08602 | 0.05191 | 0.06243 | 0.07085 | 0 | 0.9333 | 0.274 | 0 | 0.03094 | 0.000244 | 0.1027 | 0.1081 | 0.0411 | 0.07565 | 0.1191 | 0.1184 | … |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
timestamp | f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 | f13 | f14 | f15 | … |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.23 | 0.1095 | 0.1266 | 0.1338 | 0.2264 | 0.621 | 0.4093 | 0 | 9.3e-05 | 1.5e-05 | 0 | 0.00722 | 0 | 0.03695 | 0.003551 | 0.02929 | … |
1 | 0.25 | 0.1089 | 0.1271 | 0.134 | 0.2264 | 0.5473 | 0.337 | 0 | 9.3e-05 | 1.5e-05 | 0 | 0.009025 | 0 | 0.03926 | 0.004129 | 0.03305 | … |
2 | 0.24 | 0.1228 | 0.1308 | 0.1353 | 0.2264 | 0.4785 | 0.2648 | 0 | 0.000653 | 0.000134 | 8e-06 | 0.01805 | 0 | 0.09858 | 0.009166 | 0.09242 | … |
3 | 0.23 | 0.1093 | 0.1286 | 0.1348 | 0.2264 | 0.4841 | 0.2648 | 0 | 0.001026 | 0.000134 | 8e-06 | 0.01444 | 0 | 0.102 | 0.01292 | 0.09136 | … |
4 | 0.24 | 0.09617 | 0.1254 | 0.1338 | 0.2264 | 0.485 | 0.2649 | 0 | 0.000187 | 1.5e-05 | 0 | 0.02347 | 0 | 0.02632 | 0.009579 | 0.02257 | … |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
timestamp | f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 | f13 | f14 | f15 | … |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.2424 | 0.01391 | 0.01491 | 0.01739 | 0.9286 | 0.4351 | 0.374 | 0 | 0.002196 | 0.000394 | 0 | 0.03985 | 0 | 0.01917 | 0.01931 | 0.01111 | … |
1 | 0.2424 | 0.01385 | 0.01479 | 0.01739 | 0.9286 | 0.4373 | 0.3764 | 0 | 0.000119 | 0 | 0 | 0.01414 | 0 | 0.008432 | 0.002625 | 0.00104 | … |
2 | 0.2222 | 0.01391 | 0.01498 | 0.01755 | 0.9286 | 0.436 | 0.3749 | 0 | 0.000178 | 0 | 0 | 0.01671 | 0 | 0.007965 | 0.003243 | 0.000964 | … |
3 | 0.2323 | 0.009957 | 0.01373 | 0.01708 | 0.9286 | 0.4347 | 0.3734 | 0 | 0.000178 | 0 | 0 | 0.01028 | 0 | 0.007965 | 0.002789 | 0.000832 | … |
4 | 0.2323 | 0.009734 | 0.01324 | 0.01692 | 0.9286 | 0.4353 | 0.374 | 0 | 0.000297 | 0 | 0 | 0.01671 | 0 | 0.008803 | 0.004606 | 0.001191 | … |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
timestamp | f0 | f1 | f2 | f3 | f4 | f5 | f6 | f7 | f8 | f9 | f10 | f11 | f12 | f13 | f14 | f15 | … |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.1111 | 0.009689 | 0.0171 | 0.04285 | 0.1579 | 0.8356 | 0.4777 | 0 | 0 | 0 | 0 | 0.03304 | 0 | 0.01214 | 0.005469 | 0.01656 | … |
1 | 0.101 | 0.006856 | 0.01541 | 0.04118 | 0.1579 | 0.8359 | 0.4779 | 0 | 0 | 0 | 0 | 0.02261 | 0 | 0.0238 | 0.003045 | 0.02151 | … |
2 | 0.1212 | 0.01429 | 0.01815 | 0.04248 | 0.1579 | 0.839 | 0.4815 | 0 | 0.001073 | 0 | 0 | 0.01391 | 0 | 0.06879 | 0.01759 | 0.04456 | … |
3 | 0.1313 | 0.01273 | 0.01872 | 0.04267 | 0.1579 | 0.8518 | 0.4956 | 0 | 0.001877 | 0 | 0 | 0.01565 | 0 | 0.06783 | 0.03362 | 0.1281 | … |
4 | 0.1111 | 0.009114 | 0.01751 | 0.04155 | 0.1579 | 0.8665 | 0.5114 | 0 | 0 | 0 | 0 | 0.01913 | 0 | 0.02304 | 0.001367 | 0.01609 | … |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
Awesome! Each machine has more than 20,000 events and 39 features (counting the "label" one).
As stated previously, all metrics seem to be anonymized and normalized to [0, 1], so we won't need to take care of that ourselves.
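As a quick sanity check (a small sketch, not part of the original notebook), we can verify that the raw feature values indeed stay within [0, 1], using the dataframes loaded above:
# Sketch: verify that all raw feature values lie in [0, 1].
# Assumes the `dataframes` list from the loading step is still in memory.
feature_cols = [c for c in dataframes[0].columns if c.startswith("f")]
all_values = pd.concat([df[feature_cols] for df in dataframes])
print("min:", all_values.min().min(), "max:", all_values.max().max())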
Data visualization¶
Let's take a look at some of the first machine's features.
# Plot the first 3 features
evset.plot(indexes="machine-1-1", max_num_plots=3)
# Plot the labels
evset["label"].plot(indexes="machine-1-1")
The number of plots (39) is larger than "options.max_num_plots=3". Only the first plots will be printed.
Great! A lot to unpack here:
- It seems easy to tell when an anomaly occurs (the label takes the value 1) by looking at the other plots. Features 11 to 14, for example, seem to be highly correlated with the label (the sketch after this list checks this numerically).
- The data seems to have some periodicity to it.
- Some features seem empty, and we could evaluate dropping them if needed.
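Below is a small sketch (not part of the original pipeline) that checks the correlation and "empty feature" observations numerically on the first machine, using the pandas dataframes loaded earlier:
# Sketch: rank the first machine's features by absolute correlation with the anomaly label,
# and count features that never vary (the "empty" ones we might drop).
df_first = dataframes[0].drop(columns=["machine", "timestamp"])
corr_with_label = df_first.corr()["label"].drop("label")
print(corr_with_label.abs().sort_values(ascending=False).head(5))
print("Constant features:", int((df_first.drop(columns=["label"]).std() == 0).sum()))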
Data preparation¶
To prepare our data to train a model on it, let's start off by separating the features from the labels.
feature_names = evset.schema.feature_names()
feature_names.remove("label")
raw_features = evset[feature_names]
labels = evset["label"]
print("Raw features:", raw_features.schema)
print("Labels:", labels.schema)
Raw features: features: [('f0', float32), ('f1', float32), ('f2', float32), ('f3', float32), ('f4', float32), ('f5', float32), ('f6', float32), ('f7', float32), ('f8', float32), ('f9', float32), ('f10', float32), ('f11', float32), ('f12', float32), ('f13', float32), ('f14', float32), ('f15', float32), ('f16', float32), ('f17', float32), ('f18', float32), ('f19', float32), ('f20', float32), ('f21', float32), ('f22', float32), ('f23', float32), ('f24', float32), ('f25', float32), ('f26', float32), ('f27', float32), ('f28', float32), ('f29', float32), ('f30', float32), ('f31', float32), ('f32', float32), ('f33', float32), ('f34', float32), ('f35', float32), ('f36', float32), ('f37', float32)] indexes: [('machine', str_)] is_unix_timestamp: False Labels: features: [('label', int32)] indexes: [('machine', str_)] is_unix_timestamp: False
Next, we'll split our dataset into training and testing sets, using an 80/20 split.
Note that the train labels will only be used for evaluation purposes.
We'll be creating reusable functions for each step, since we'll do some iteration over the feature engineering -> training -> evaluation cycle.
DROP_COLS = ["machine", "timestamp"]
def split_event_set(evset: tp.EventSet, test_frac: float = 0.2):
# Average length of the records
average_duration = np.mean([len(machine_data.timestamps) for _, machine_data in evset.data.items()])
print("average_duration:", average_duration)
# Select the train/test cutoff.
# Note: All the machines are cut at the same time. This way, we can apply pre-processing that
# exchange data between the machines without risk of label leakage!
train_cutoff = average_duration * (1 - test_frac)
print("train_cutoff:", train_cutoff)
# Compute masks and split data based on cutoff
train_mask = evset.timestamps() <= int(train_cutoff)
test_mask = ~train_mask
# Split EventSets
train_evset = evset.filter(train_mask)
test_evset = evset.filter(test_mask)
print(f"Train events: {train_evset.num_events()}")
print(f"Test events: {test_evset.num_events()}")
return train_evset, test_evset
def evsets_to_tabular(*evsets: tp.EventSet) -> List[np.ndarray]:
datasets = []
for evset in evsets:
# Fill missing values with -1 (raw data doesn't have any, but we will create some in our feature engineering)
df = tp.to_pandas(evset).fillna(-1)
# Remove timestamp and machine columns
df = df.drop(columns=DROP_COLS)
# Convert to numpy (and convert from 2D to 1D array if it has a single feature, in the case of the labels)
arr = df.to_numpy().squeeze()
datasets.append(arr)
return datasets
features_train, features_test = split_event_set(raw_features)
labels_train, labels_test = split_event_set(labels) # Note that we won't be using the train labels for training in this unsupervised setting
X_train, X_test, y_train, y_test = evsets_to_tabular(features_train, features_test, labels_train, labels_test)
average_duration: 24785.0 train_cutoff: 19828.0 Train events: 178461 Test events: 44604 average_duration: 24785.0 train_cutoff: 19828.0 Train events: 178461 Test events: 44604
Training¶
Having done all that work to prepare our data, all that remains is to train our model.
contamination = y_train.sum() / len(y_train)
print(f"{contamination=}")
def train(X_train):
model = IsolationForest(
contamination=contamination,
n_jobs=-1,
verbose=1,
random_state=0,
)
model.fit(X_train)
return model
contamination=0.04820661096822275
model = train(X_train)
[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers. [Parallel(n_jobs=8)]: Done 2 out of 8 | elapsed: 0.4s remaining: 1.3s [Parallel(n_jobs=8)]: Done 8 out of 8 | elapsed: 0.5s finished
Evaluation using AUC-ROC¶
We'll report the model's ROC AUC score, which provides an aggregate measure of performance across all possible classification thresholds (our model outputs an anomaly score for each sample, and in a real-world scenario it would be up to us to define the threshold above which an event is marked as anomalous).
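As a side note, here is a small sketch (not used in the evaluation below) of how such a threshold could be applied; the 95th percentile is an arbitrary, illustrative choice:
# Sketch: convert anomaly scores into binary detections with a hand-picked threshold.
scores = -model.score_samples(X_test)           # higher score = more anomalous
threshold = np.percentile(scores, 95)           # arbitrary operating point (top 5% of scores)
detections = (scores >= threshold).astype(int)  # 1 = flagged as anomalous
print(f"{detections.sum()} events flagged out of {len(detections)}")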
figsize=(20,3)
results = {}
def evaluate(model, X_test, y_test, name):
"""Evaluates a model on its training data and unseen test data, computing accuracy score and plotting ground truth vs predictions."""
# Compute and print scores
roc_auc = roc_auc_score(y_test, -model.score_samples(X_test))
results[name] = {"ROC AUC": roc_auc}
print("Results:")
print(pd.DataFrame(results))
# Plot predictions
preds = model.predict(X_test)
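# IsolationForest.predict returns 1 for inliers and -1 for outliers; map them to 0/1 labels.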
preds[preds == 1] = 0
preds[preds == -1] = 1
tp.event_set(timestamps=list(range(len(y_test))), features={"labels": y_test, "predictions": preds}).plot(style="vline")
evaluate(model, X_test, y_test, "raw features")
Results: raw features ROC AUC 0.675865
That's pretty decent for a first try! Our model seems to be learning.
The plotted predictions seem to be all over the place though, with the model sporadically predicting anomalies throughout several non-anomalous periods.
There's plenty of room for improvement, so let's kick off the feature engineering!
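To give an idea of what that feature engineering could look like (a sketch only, with arbitrary window lengths; not the pipeline used in the rest of this notebook), Temporian can augment the raw features with per-machine moving statistics:
# Sketch: augment the raw features with moving statistics.
# Window lengths are arbitrary; events are regularly sampled with a period of 1 time-unit.
sma = raw_features.simple_moving_average(window_length=10).prefix("sma_")
std = raw_features.moving_standard_deviation(window_length=10).prefix("std_")
augmented_features = tp.glue(raw_features, sma, std)
print(len(augmented_features.schema.feature_names()), "features after augmentation")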
Evaluation using Detection AMOC¶
A detection AMOC shows the relation between the number of false alerts (or equivalently, the average time between false alerts) and the average time it takes to detect anomalies.
Using Temporian, we compute an AMOC as follows.
The objective is to detect anomalous events called "anomalies". An anomaly is created each time the "label" (or "target") is true. Anomalies are detected using a numerical "signal" and a "threshold". An "alert" is created each time the signal is above the threshold.
In this dataset, two anomalies that happen close in time can be considered the same anomaly. Therefore, we apply a "shutoff" on the anomalies: two anomalies cannot happen within "anomaly_shutoff" of each other. A "shutoff" could also be applied on the alerts; instead, we apply it on the "false alerts" (see next definition).
A "false alert" is an alert not detecting an anomaly event, i.e. an alert at time t without any anomaly in [t-anomaly_window, t].
A "missed anomaly" is an anomaly not detected by any alert, i.e. an anomaly at time t without any alert in [t, t+anomaly_window].
The "time to detection" of an anomaly at time t is the time between the anomaly and the first following alert in [t, t+anomaly_window].
If an anomaly is not detected by any alert, its "time to detection" is set to "anomaly_window" (this makes the plots easier to interpret).
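To make these definitions concrete, here is a toy NumPy sketch with made-up timestamps and a single threshold (the actual Temporian implementation follows below):
# Toy sketch of the definitions above: one threshold, made-up anomalies and alerts.
toy_anomalies = np.array([3.0, 12.0])    # anomaly timestamps
toy_alerts = np.array([4.0, 5.0, 17.0])  # timestamps where the signal crossed the threshold
anomaly_window = 4.0
# False alert: an alert with no anomaly in [t - anomaly_window, t].
false_alerts = [t for t in toy_alerts
                if not np.any((toy_anomalies >= t - anomaly_window) & (toy_anomalies <= t))]
# Time to detection: delay to the first alert in [t, t + anomaly_window], capped at the window.
time_to_detection = [min((a - t for a in toy_alerts if t <= a <= t + anomaly_window),
                         default=anomaly_window) for t in toy_anomalies]
print("false alerts:", false_alerts, "| time to detection:", time_to_detection)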
We can compute an AMOC on a model or any other numerical signal. For instance, we can compute the AMOC on a feature.
We can find which one of the features f1 and f2 alone can best detect anomalies.
# @title Define "compute_detection_amoc" and "plot_detection_amoc" to compute and plot amocs.
from dataclasses import dataclass
from typing import Optional, Dict, List
import matplotlib.pyplot as plt
@dataclass
class ThresholdEval:
"""Model metric for each threshold value applied on the score."""
threshold: float
num_false_alerts: int
num_missed_anomalies: int
average_time_to_detection: float
total_record_duration: float
num_machines: int
def compute_detection_amoc(target : tp.EventSet,
signal : tp.EventSet,
detection_thresholds: List[float] = np.linspace(0,1,100, dtype=np.float32).tolist(),
anomaly_window : float = 1000.,
anomaly_shutoff : float = 200.,
false_alert_shutoff : float = 200.,
plot_first: bool = True,
) -> List[ThresholdEval]:
"""Computes a detection AMOC.
Args:
target: A boolean feature defining an anomaly to detect.
signal: A numerical feature detecting recent anomalies. A signal above a threshold becomes an alert.
detection_thresholds: List of evaluated detection thresholds applied on the "signal".
anomaly_window: For how long an anomaly can be detected.
anomaly_shutoff: Minimum distance between two anomalies.
false_alert_shutoff: Minimum distance between two false alerts.
plot_first: Plot the alerts, anomalies of the first index and the mid-threshold.
"""
# List the anomalies
anomalies = target.cast(tp.bool_).filter()
anomalies = anomalies.filter_moving_count(anomaly_shutoff)
# Final timestamp used to collect all the metrics.
end_of_records = signal.end()
end_of_records_global = end_of_records.drop_index(keep=False).end()
def sum_by_index(evset) -> int:
"""Sums of the last value of a feature in all the indexes."""
assert len(evset.schema.features) == 1 # There should be only one feature
return float(evset.drop_index(keep=False).cumsum(sampling=end_of_records_global).get_arbitrary_index_data().features[0][0])
# The duration of record in each index is the time distance between the first and last event.
record_duration = signal.begin().since_last(sampling=signal.end())
total_record_duration = sum_by_index(record_duration)
print("total_record_duration:",total_record_duration)
# There is one index per machine
num_machines = len(target.data)
print("num_machines:",num_machines)
num_anomalies = int(anomalies.drop_index(keep=False).moving_count(np.inf, sampling=end_of_records_global).get_arbitrary_index_data().features[0][0])
print("num_anomalies:",num_anomalies)
# Can also be computed this way:
assert num_anomalies == np.sum([len(machine.timestamps) for key, machine in anomalies.data.items()])
amoc = []
for threshold_idx, threshold in enumerate(detection_thresholds):
is_mid_threshold = threshold_idx == (len(detection_thresholds)//2)
alerts = (signal >= threshold).filter()[[]]
# False alerts
false_alerts = anomalies.moving_count(anomaly_window, sampling=alerts).equal(0).filter()
false_alerts = false_alerts.filter_moving_count(false_alert_shutoff)
num_false_alerts = false_alerts.moving_count(np.inf, sampling=end_of_records)
# Missed targets
missed_anomalies = alerts.moving_count(anomaly_window, sampling=anomalies.lag(anomaly_window)).equal(0).filter().leak(anomaly_window)
missed_anomalies = missed_anomalies.filter_moving_count(false_alert_shutoff)
num_missed_anomalies = missed_anomalies.moving_count(np.inf, sampling=end_of_records)
# Time to detection
time_to_detection = anomalies.until_next(sampling=alerts, timeout=anomaly_window)
# Set the time to detection of non detected anomalies with the maximum anomaly_window.
time_to_detection = time_to_detection.isnan().where(anomaly_window, time_to_detection)
# Note: Some machines don't have alerts
sum_time_to_detection = time_to_detection.cumsum(sampling=end_of_records)
if plot_first and is_mid_threshold:
anomalies.name = "anomalies"
alerts.name = "alerts"
missed_anomalies.name = "missed_anomalies"
false_alerts.name = "false_alerts"
tp.plot([
target,
signal,
anomalies,
alerts,
missed_anomalies,
false_alerts
],
indexes="machine-1-1",
height_per_plot_px=100)
amoc.append(ThresholdEval(
threshold=threshold,
num_false_alerts=sum_by_index(num_false_alerts) / num_machines,
num_missed_anomalies=sum_by_index(num_missed_anomalies) / num_machines,
average_time_to_detection=sum_by_index(sum_time_to_detection) / num_anomalies,
total_record_duration=total_record_duration / num_machines,
num_machines=num_machines,
))
return amoc
def plot_detection_amoc(amoc_dict: Dict[str, List[ThresholdEval]]):
# Plot the AMOC and other related plots
fig, axs = plt.subplots(1, 2, figsize=(5*2, 3), squeeze=False)
for name, amoc in amoc_dict.items():
ax = axs[0,0]
ax.plot([e.average_time_to_detection for e in amoc],
[e.num_false_alerts for e in amoc],
label=name)
ax = axs[0,1]
ax.plot([e.average_time_to_detection for e in amoc if e.num_false_alerts != 0],
[e.total_record_duration / e.num_false_alerts for e in amoc if e.num_false_alerts != 0],
label=name)
ax = axs[0,0]
ax.set_xlabel("Avg. time to detection")
ax.set_ylabel("Num. false alerts")
ax.legend()
ax.grid(True)
ax = axs[0,1]
ax.set_xlabel("Avg. time to detection")
ax.set_ylabel("Avg. time between false alerts")
ax.set_yscale("log")
ax.invert_xaxis()
ax.legend()
ax.grid(True)
amoc_f1 = compute_detection_amoc(evset["label"], evset["f1"])
amoc_f2 = compute_detection_amoc(evset["label"], evset["f2"], plot_first=False)
plot_detection_amoc({
"f1":amoc_f1,
"f2":amoc_f2,
})
total_record_duration: 223056.0 num_machines: 9 num_anomalies: 132 total_record_duration: 223056.0 num_machines: 9 num_anomalies: 132
We can see that both features perform similarly. For instance, with an average time to detection of 400 time-units, both features generate approximately 30 false alerts per machine.
What about our detection model now?
def model_prediction_to_evset(model):
df = tp.to_pandas(features_test)[["machine", "timestamp"]]
df["detection"] = -model.score_samples(X_test)
df["label"] = y_test
evset = tp.from_pandas(df).add_index("machine")
return evset
pred_evset = model_prediction_to_evset(model)
amoc_model_raw_features = compute_detection_amoc(pred_evset["label"], pred_evset["detection"])
plot_detection_amoc({
"f1":amoc_f1,
"f2":amoc_f2,
"model": amoc_model_raw_features,
})
WARNING:root:Feature "machine" is an array of numpy.object_ and will be casted to numpy.string_ (Note: numpy.string_ is equivalent to numpy.bytes_).
total_record_duration: 44595.0 num_machines: 9 num_anomalies: 31
The model is much better than the features alone. This is expected, but it is a good check to do :).
For example, the model is able to detect the anomalies within 200 time-units while raising ~8 false alerts.
By comparison, using f1 or f2 alone will generate ~55 false alerts for the same time to detection.
Wrapping up¶
In this notebook, we learned how to load, visualize, and prepare temporal data using Temporian, train an unsupervised isolation forest on the resulting tabular dataset, and evaluate its detections with both ROC AUC and AMOC curves, applying it to a real-world anomaly detection use case.