There is a danger lurking around every corner, hidden amidst the data for those clever enough to see. This is particularly true for IT security, where IT staff rely on technology to spot abnormal activity by mining huge amounts of data and correlating system behaviour with its likely causes. But what happens when there is correlation without causation?
For example, recent experiments with correlations revealed that there is a 0.6 correlation between the number of people who drowned by falling into a swimming pool in the US and the number of films featuring Nicolas Cage. An even scarier correlation of 0.94 appears when you compare per capita cheese consumption in the US with the number of people who died by becoming tangled in their bed sheets.
Hopefully, for most of the readership (and I fear for those of you for whom this is not the case), these are obvious examples of correlation without causation. While Nicolas Cage and cheese may have many mysterious qualities, it is common sense to realise that the ability to magically kill people in their pools and bed sheets is not among the likely attributes of either Mr. Cage or cheese (even the nastiest kind).
However, correlation analysis has been a practice (or perhaps a buzzword) in IT monitoring and security event monitoring for as long as anyone can remember. And yet the common sense that helps to rule out murderous cheeses is much harder to apply to obscure IT metrics drawn from log analysis and IT monitoring data.
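To see how easily unrelated IT metrics can appear linked, here is a minimal Python sketch. The two series are made-up numbers for illustration only, and the metric names (`disk_used_gb`, `failed_logins`) are hypothetical. Both simply grow over ten weeks, which is enough to produce a high Pearson correlation. Comparing week-over-week changes instead of raw values is one common sanity check, though not the only one.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def differences(xs):
    """Change from each point to the next, which removes a steady trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Made-up weekly samples (assumption, not real data): both metrics trend upward.
disk_used_gb = [100, 104, 109, 111, 118, 121, 127, 130, 136, 139]
failed_logins = [20, 23, 24, 26, 29, 30, 31, 34, 37, 40]

raw = pearson(disk_used_gb, failed_logins)
detrended = pearson(differences(disk_used_gb), differences(failed_logins))

print(f"raw correlation:       {raw:.2f}")
print(f"correlation of deltas: {detrended:.2f}")
```

With these sample numbers the raw correlation is close to 1, while the correlation of the weekly changes is close to 0. The shared upward trend, not any causal link, is doing the work.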
Using a monitoring system in conjunction with your security system can help keep you grounded in reality and avoid spurious correlations.
1. Mandatory Safeguard
For example, several different security compliance standards require that an external monitoring system exist to ensure that network perimeter security devices (firewalls, intrusion detection systems, etc.) are up and running and that IT staff are alerted in case they are not.
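As a rough illustration of such an external check, the Python sketch below probes each perimeter device over TCP and reports the ones that do not answer. The device names, addresses and ports are hypothetical placeholders (the addresses come from a range reserved for documentation), and a successful TCP connection only shows that a device is reachable, not that it is filtering traffic correctly. A real monitoring product would do considerably more.

```python
import socket


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def find_down_devices(devices, probe=tcp_probe):
    """Return the names of devices whose probe fails.

    `devices` maps a device name to a (host, port) pair. `probe` can be
    swapped out, for example to use a different protocol or for testing.
    """
    return [name for name, (host, port) in devices.items()
            if not probe(host, port)]


# Hypothetical inventory (assumption): replace with your own devices.
PERIMETER_DEVICES = {
    "edge-firewall": ("192.0.2.10", 443),
    "ids-sensor": ("192.0.2.20", 22),
}
```

Calling `find_down_devices(PERIMETER_DEVICES)` on a schedule, from a host outside the security infrastructure itself, and alerting on a non-empty result is the basic idea. How the alert reaches IT staff is left out here.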
2. Sanity Check
In addition to knowing whether your security infrastructure is up or down, it’s also a good idea to know if it is running within the green zone, as opposed to the yellow or red zone. Overutilisation of security infrastructure can be indicative of anything from a runaway process or a bad software patch to a stealth attack attempt.
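The green, yellow and red zones can be sketched as simple thresholds on a utilisation percentage. The 70% and 90% cut-offs below are assumptions chosen for illustration. Sensible values depend on the device and on what normal load looks like in your environment.

```python
def classify_zone(utilisation_pct, yellow_at=70.0, red_at=90.0):
    """Map a utilisation percentage (0-100) to 'green', 'yellow' or 'red'.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    if not 0 <= utilisation_pct <= 100:
        raise ValueError("utilisation must be between 0 and 100")
    if yellow_at >= red_at:
        raise ValueError("yellow threshold must be below red threshold")
    if utilisation_pct >= red_at:
        return "red"
    if utilisation_pct >= yellow_at:
        return "yellow"
    return "green"


# Hypothetical CPU readings from a firewall, in percent.
for reading in (35.0, 78.5, 96.0):
    print(reading, classify_zone(reading))
```

A single red reading may be noise. Checking whether a device stays in the yellow or red zone across several consecutive samples is usually a better trigger for an alert.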
SIEM products like ArcSight focus primarily on security logs, products like Splunk and Logstash focus primarily on IT event logs, and unified monitoring products focus mainly on performance and availability data. The ultimate in cross-checking comes from the new field of IT operations analytics, which combines multiple sources of log and IT monitoring data with Big Data analysis techniques and uses event correlation both to provide new insights and to verify suspected problems. Machine learning techniques are then used to find patterns that humans may have missed.
Lastly, don’t be a dummy – know your architecture well enough to avoid spurious correlations by knowing what makes sense and what doesn’t. Nobody wants to wake up the boss at 3:00 AM for a scenario that turns out to be no more realistic than a serial-murder Stilton.