Anomaly Detection
PyRapide's AnomalyDetector learns the normal causal structure of your system and flags deviations from it. Unlike statistical anomaly detectors, which look at metric values, PyRapide detects anomalies in the causal topology: the pattern of which events cause which other events.
AnomalyDetector
Like the CausalPredictor, the detector has two phases: learn normal behavior from historical computations, then detect anomalies in new computations.
from pyrapide import AnomalyDetector
detector = AnomalyDetector()
Learning Normal Behavior
# Feed normal (healthy) computations
detector.learn(healthy_computation_1)
detector.learn(healthy_computation_2)
detector.learn(healthy_computation_3)
# The detector builds a model of:
# - Which event types normally appear
# - Which causal relationships are expected
# - Normal timing between cause and effect
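To make the three learned components concrete, here is a minimal pure-Python sketch of what such a model could hold. It is not PyRapide's internal representation: `ToyModel` and the `Edge` tuple are hypothetical stand-ins, and a computation is assumed to reduce to a list of (cause type, effect type, delay in seconds) edges.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical stand-in for one causal edge in a computation:
# (cause event type, effect event type, seconds between them).
Edge = tuple[str, str, float]


@dataclass
class ToyModel:
    """Illustrative model only; PyRapide's real model may differ."""

    event_types: set[str] = field(default_factory=set)
    causal_pairs: set[tuple[str, str]] = field(default_factory=set)
    delays: dict[tuple[str, str], list[float]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def learn(self, edges: list[Edge]) -> None:
        for cause, effect, delay in edges:
            # Which event types normally appear
            self.event_types.update((cause, effect))
            # Which causal relationships are expected
            self.causal_pairs.add((cause, effect))
            # Normal timing between cause and effect
            self.delays[(cause, effect)].append(delay)


model = ToyModel()
model.learn([("Database.query_start", "Database.query_result", 0.2)])
model.learn([("Database.query_start", "Database.query_result", 0.4)])
```

Each call to `learn` only accumulates observations, which is why feeding more healthy computations widens what the model treats as normal.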
Detecting Anomalies
anomalies = detector.detect(new_computation)
for anomaly in anomalies:
    print(f"Type: {anomaly.type}")
    print(f"Event: {anomaly.event}")
    print(f"Severity: {anomaly.severity:.2f}")
    print(f"Description: {anomaly.description}")
    print()
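In practice you will often want to act only on the most severe results rather than print all of them. The sketch below shows one way to triage a list of anomalies. The `Anomaly` dataclass is a hypothetical stand-in that carries only the four fields printed above, and `triage` is an illustrative helper, not part of PyRapide.

```python
from dataclasses import dataclass


@dataclass
class Anomaly:
    """Hypothetical stand-in with the four fields used above."""

    type: str
    event: str
    severity: float
    description: str


def triage(anomalies, threshold=0.8):
    """Return anomalies above the threshold, most severe first."""
    urgent = [a for a in anomalies if a.severity > threshold]
    return sorted(urgent, key=lambda a: a.severity, reverse=True)


sample = [
    Anomaly("TIMING_ANOMALY", "Database.query_result", 0.67, "slow query"),
    Anomaly("UNSEEN_EVENT", "DatabaseServer.deadlock_detected", 0.95, "new event type"),
    Anomaly("UNUSUAL_CAUSE", "Alerter.alert", 0.82, "unexpected cause"),
]
urgent = triage(sample)
```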
Three Anomaly Types
The detector identifies three distinct categories of causal anomalies:
1. Unseen Events
An event type appears that was never observed during learning. This can indicate a new failure mode, an unexpected code path, or a configuration change.
# Anomaly: unseen event type
# Type: UNSEEN_EVENT
# Event: DatabaseServer.deadlock_detected
# Severity: 0.95
# Description: Event type 'DatabaseServer.deadlock_detected'
# was never observed in training data.
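Conceptually, this check is a set difference between the event types seen in training and those in the new computation. The following is a minimal sketch of that idea; `unseen_event_types` is a hypothetical helper, and how PyRapide performs the check internally is not described here.

```python
def unseen_event_types(known_types, observed_types):
    """Return observed event types absent from training, sorted for stable output."""
    return sorted(set(observed_types) - set(known_types))


known = {"Database.query_start", "Database.query_result", "Alerter.alert"}
observed = ["Database.query_start", "DatabaseServer.deadlock_detected"]
unseen = unseen_event_types(known, observed)
```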
2. Unusual Causes
An event has a causal predecessor that was never observed as a cause of that event type during learning. The event type itself is known, but it was caused by something unexpected.
# Anomaly: unusual causal relationship
# Type: UNUSUAL_CAUSE
# Event: Alerter.alert
# Severity: 0.82
# Description: 'Alerter.alert' was caused by
# 'BackupService.timeout', which was never observed
# as a cause of this event type.
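One way to picture this check is as a lookup against the set of (cause, effect) pairs seen in training. The sketch below assumes that representation; `unusual_causes` is a hypothetical helper and `Monitor.threshold_exceeded` is an invented event type used only for illustration.

```python
def unusual_causes(known_pairs, known_types, observed_edges):
    """Return edges whose effect type is known but whose pairing was never learned."""
    return [
        (cause, effect)
        for cause, effect in observed_edges
        if effect in known_types and (cause, effect) not in known_pairs
    ]


# Training data: alerts were only ever caused by the (hypothetical) monitor event.
known_types = {"Monitor.threshold_exceeded", "Alerter.alert", "BackupService.timeout"}
known_pairs = {("Monitor.threshold_exceeded", "Alerter.alert")}

observed_edges = [
    ("Monitor.threshold_exceeded", "Alerter.alert"),
    ("BackupService.timeout", "Alerter.alert"),
]
flagged = unusual_causes(known_pairs, known_types, observed_edges)
```

The `effect in known_types` condition keeps this category separate from unseen events: an edge into a never-seen event type belongs to the first category instead.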
3. Timing Anomalies
The causal relationship exists in the training data, but the time between cause and effect is outside the learned distribution.
# Anomaly: timing deviation
# Type: TIMING_ANOMALY
# Event: Database.query_result
# Severity: 0.67
# Description: Time between 'Database.query_start' and
# 'Database.query_result' was 4.2s (expected: 0.1s-0.5s).
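The simplest reading of "outside the learned distribution" is a range check against the delays observed in training. The sketch below assumes a plain min/max range, which matches the "expected: 0.1s-0.5s" wording above; the actual detector may well use a statistical tolerance instead. `is_timing_anomaly` is a hypothetical helper.

```python
def is_timing_anomaly(training_delays, observed_delay):
    """Flag a delay that falls outside the min/max range seen in training."""
    low, high = min(training_delays), max(training_delays)
    return not (low <= observed_delay <= high)


# Delays (in seconds) between query_start and query_result in healthy runs.
training = [0.1, 0.2, 0.35, 0.5]

slow_query = is_timing_anomaly(training, 4.2)
normal_query = is_timing_anomaly(training, 0.3)
```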
Full Example
from pyrapide import AnomalyDetector, StreamProcessor

# Train on historical healthy runs
detector = AnomalyDetector()
for run in load_healthy_runs():
    detector.learn(run)

# Detect anomalies in real-time
processor = StreamProcessor()
processor.add_source("app", app_source)

async def check_anomalies():
    computation = processor.computation()
    anomalies = detector.detect(computation)
    for a in anomalies:
        if a.severity > 0.8:
            alert_ops_team(a)

processor.watch(
    pattern="*",
    callback=lambda e: check_anomalies()
)

await processor.run()
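One Python detail is worth keeping in mind with callbacks like the one above: calling an `async def` function from a synchronous callback only creates a coroutine object, and the body runs only once something awaits or schedules it. Whether `processor.watch` does that for you is not stated here. If it does not, a callback can schedule the work itself, as in this standalone sketch that uses only `asyncio` and hypothetical names (`on_event`, `checked`).

```python
import asyncio

checked = []


async def check_anomalies(event):
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    checked.append(event)


def on_event(event):
    # Schedule the coroutine on the running loop. A bare call to
    # check_anomalies(event) would create a coroutine without running it.
    return asyncio.get_running_loop().create_task(check_anomalies(event))


async def main():
    tasks = [on_event(name) for name in ("a", "b")]
    await asyncio.gather(*tasks)


asyncio.run(main())
```

The same `asyncio.run(main())` wrapper is also what a script needs in order to use a top-level `await` such as the final line of the example above, since `await` is only valid inside a coroutine or an async-aware REPL.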
Next Steps
- Prediction: predict events before they happen
- Analysis and Querying: query the causal graph
- Streaming: real-time event processing