To effectively combat pervasive cyberthreats like ransomware, organizations need to centralize their detection and response efforts. Oftentimes, they will turn to an Endpoint Detection and Response (EDR) tool or Managed Detection and Response (MDR) to achieve this. EDR solutions combine real-time continuous monitoring and collection of endpoint data with automated response capabilities, such as quarantining an infected device or blocking specific IP addresses. So, what does this look like in practice?
A user is tricked into doing something that compromises their machine. This may happen through a malicious link that was shared, a weaponized e-mail attachment masquerading as an invoice that needs to be paid, or that handy new USB storage device handed out by the nice fellow at the trade show who said it could be used to encrypt a crypto wallet. The machine is now compromised, and the EDR/MDR has detected that low-level changes have been made to the device, or that the device is now communicating with a known-malicious domain published in your latest threat intel report. The EDR/MDR takes action by isolating the compromised machine from the network and revoking the user’s access to the domain or VPN. This is great. You have stopped the spread of ransomware, or stopped an attacker from gaining access to your network, at the cost of keeping an employee or group of employees from getting that end-of-quarter report done. This happens every day in IT security, and it’s a well-accepted trade-off.
However, for critical infrastructure organizations relying heavily on operational technology (OT) to power their business, EDR and MDR are generally not a good fit because the technology is too intrusive for sensitive industrial control system endpoints. The level of control that an EDR tool has over an endpoint can potentially impact processes that are too critical to shut off in an OT system, such as a safety mechanism or critical power generation device.
Let’s say another scenario plays out, but this time in an OT environment. A company has good controls in place: activities like checking e-mail or using removable media on OT systems are well controlled and managed. It’s now Friday afternoon, and you have personnel from your original equipment manufacturer (OEM) onsite to fix a bug: a memory leak that causes the safety system to restart occasionally. The update is going well, but the vendor drops the approved removable media, and it falls through the drainage grate on the manufacturing floor, never to be seen again. To continue working, the vendor grabs a backup removable drive that he used at his last customer site, which runs the same system, because he knows the software update is on it. He plugs it in, uses it and accidentally infects the safety system with command-and-control software. The EDR/MDR solution isolates the safety system from the network, which means it can no longer monitor for unsafe operating conditions, creating a safety hazard for the organization. This is a prime example of an IT-based response that is unfit for OT.
A promising alternative to EDR/MDR solutions for OT environments is to use an OT-specific asset management tool to monitor endpoints for suspicious activity and to integrate that data into a centralized machine learning engine within a SIEM, such as Splunk. Security teams can then use common data models for behavioral detection, while customizing response workflows for OT systems in their SOAR tool instead of relying on more intrusive automated containment.
So, let’s take our OT scenario again. This time, instead of using an EDR/MDR solution, we leverage data from our OT asset management system and combine it with the power of a SIEM that also has SOAR technology. The OT vendor uses the tainted removable media and infects the safety system with command-and-control software. The OT asset management system detects the change and shares the alert, along with deep context about that endpoint such as criticality and function, with the SIEM. The SIEM then triggers an OT response workflow in the SOAR platform, alerting a SOC analyst and the asset owner in OT to the compromise. The SOC analyst looks at the network data and can see the malware attempting to phone home to its C&C server, but the firewall is blocking the traffic. The analyst then digs into the asset information to ensure no other unauthorized changes have occurred. In parallel, the OT asset owner safely transitions to a backup safety system and works with the vendor to restore the primary system to a known good state.
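The workflow above can be sketched as a small routing function that picks response steps from the asset context carried with the alert. This is a minimal illustration only, not how any particular SOAR product implements playbooks: the `OTAlert` fields, the `Criticality` levels and every step name are hypothetical and chosen purely to mirror the scenario.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Criticality(Enum):
    # Hypothetical levels; a real asset inventory would define its own scale.
    LOW = 1
    HIGH = 2
    SAFETY = 3


@dataclass(frozen=True)
class OTAlert:
    # Hypothetical alert shape: the detection plus the asset context
    # (function, criticality) shared by the OT asset management system.
    asset_id: str
    function: str
    criticality: Criticality
    description: str


def plan_response(alert: OTAlert) -> list[str]:
    """Return ordered response steps for an alert.

    People are always notified first. Automatic network isolation is
    only planned for low-criticality assets, never for safety systems.
    """
    steps = ["notify_soc_analyst", "notify_ot_asset_owner"]
    if alert.criticality is Criticality.SAFETY:
        # Keep the safety function online while humans investigate.
        steps += [
            "confirm_firewall_blocks_outbound_traffic",
            "review_asset_for_other_unauthorized_changes",
            "request_failover_to_backup_safety_system",
        ]
    elif alert.criticality is Criticality.HIGH:
        # Isolation is possible, but only with the owner's approval.
        steps.append("request_owner_approval_for_isolation")
    else:
        steps.append("isolate_from_network")
    return steps


safety_alert = OTAlert(
    asset_id="sis-01",
    function="safety instrumented system",
    criticality=Criticality.SAFETY,
    description="unexpected executable written from removable media",
)
for step in plan_response(safety_alert):
    print(step)
```

The design choice worth noting is that the decision hinges on asset context rather than on the detection alone: the same alert on a low-criticality workstation would still be isolated automatically, while the safety system stays connected and the response is handed to the SOC analyst and the OT asset owner.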
To learn more about how you can put this into practice, watch this short video to see how our OTML Engine feeds contextual OT endpoint and network data into Splunk’s SIEM to centralize machine learning and enable advanced SOAR use cases.