Part 5 – Incident Recovery

Introduction

Incidents can’t be avoided entirely, but a mature Incident Detection and Response function can greatly reduce the damage. In the CSIRT Series, we have been looking in detail at the various functions that make up a good IR process framework. Incident Containment and Incident Recovery are complementary processes. While Containment is aimed at stopping the spread of a breach, Recovery is all about getting back on your feet by reverting to a “known good state”. The term “known good state” is, in our opinion, ambiguous: it may apply to a single machine or to an entire network. We see the Recovery process, or getting back to a “known good state”, as a combination of three sub-steps:

  1. Pre-Recovery – Forensic Evidence Collection is, in our opinion, a Pre-Recovery step. It is a critical process for collecting and maintaining evidence that may be required to pursue future legal action.
  2. Recovery from Backup – Ensure that systems or networks are returned to the pre-breach state.
  3. Post-Recovery – Remediate the threat vector, so that the infection or threat vector behind the incident is no longer an issue.

Let us look at each of these sub-steps in detail.

Pre-Recovery – In cases that call for legal action, it is important to clearly document how all evidence has been collected, preserved and handled so that it is admissible in court. This is called Forensic Evidence Collection. Legal requirements vary from region to region and from jurisdiction to jurisdiction, and a forensic investigator should be aware of the ones that apply. We recommend that some team members obtain computer forensics training and certification so that the team can handle the entire process end to end. However, it is not uncommon to get professional third-party help for forensic evidence collection and investigation during an incident. Forensics is a field in its own right, and detailing all of its process steps here would be impractical. Hence, we have tried to give a succinct summary of what forensics entails:

    1. Determine legal issues regarding the incident that may cause an impact.
    2. Determine technology and processes within the scope of the forensic analysis.
    3. Identify evidence from the infected machine or person. The evidence can be electronic or physical.
      • Document and collect the identified evidence following the chain of custody.
      • Perform forensic investigation and analysis.
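To make the chain-of-custody step a little more concrete, here is a minimal Python sketch of how collected evidence might be hashed and logged so that later tampering can be detected. The function names, record fields and handler names are our own illustrative assumptions, not a forensic standard. Real investigations should follow the procedures required in the relevant jurisdiction.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(evidence_id, path, handler, action):
    """Build one chain-of-custody record (field names are hypothetical)."""
    return {
        "evidence_id": evidence_id,
        "path": path,
        "sha256": sha256_of_file(path),
        "handler": handler,
        "action": action,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }


def is_unaltered(entry):
    """Re-hash the evidence file and compare it with the recorded digest."""
    return sha256_of_file(entry["path"]) == entry["sha256"]
```

Each time the evidence changes hands, a new entry would be appended to the log, and `is_unaltered` can be run before analysis to confirm the copy still matches what was collected.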

Once the forensic process has been initiated, its results may require the incident to be reclassified, and the entire recovery and remediation process then takes on a different character. For example: a malicious code incident was originally triaged and classified as a medium security incident. The forensic analysis reveals that the malicious code has installed hidden back-door processes that can now be traced to additional systems that were not originally identified as being affected. The incident should be reclassified from a medium security incident to a high security incident.
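The reclassification example can be sketched as a small rule in Python. The severity scale, the function name and the rule itself (a back-door plus newly discovered hosts raises the incident to high) are assumptions made for illustration. Your own classification scheme will define the real criteria.

```python
# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high"]


def reclassify(current, originally_affected, traced_by_forensics, backdoor_found):
    """Return the severity after forensic findings; never downgrade."""
    newly_found = set(traced_by_forensics) - set(originally_affected)
    # Hypothetical rule: a back-door that reaches hosts outside the
    # original scope makes this a high severity incident.
    proposed = "high" if backdoor_found and newly_found else current
    return max(current, proposed, key=SEVERITY_ORDER.index)


severity = reclassify(
    current="medium",
    originally_affected={"ws-01"},
    traced_by_forensics={"ws-01", "srv-07"},
    backdoor_found=True,
)
print(severity)  # high
```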

Recovery from Backup – If the incident fits the criteria of high severity and/or high impact, the CSIRT should determine whether IT business continuity, disaster recovery and/or backup restoration procedures should be initiated. This is limited to high severity and high impact incidents purely for practical reasons. The goal of the Recovery phase is to safely put the impacted systems back into production. To complete the recovery process, the following three steps have to be followed:

  • Validation of the recovered systems – Involves asking the user base whether the system is operating properly, or using profiling tools to verify that the ports and services of the system are consistent with what is expected
  • Restoring Operations – Involves placing the system into full production, allowing it to interact fully with other devices on the network
  • Monitoring – Involves checking systems for back-doors or any other issues which may have escaped previous detection. If possible, host-based and network-based monitoring should be used to confirm that the attacker did not leave any back-doors on the system
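The validation step can be illustrated with a short Python sketch that compares the ports and services observed on a recovered system against a known-good baseline profile. It assumes both profiles are already available as simple port-to-service mappings (for example, exported from whatever profiling tool you use). The function name and data shape are illustrative.

```python
def compare_to_baseline(baseline, observed):
    """Compare two {port: service} profiles and report the differences."""
    unexpected = {p: s for p, s in observed.items() if p not in baseline}
    missing = {p: s for p, s in baseline.items() if p not in observed}
    changed = {
        p: (baseline[p], observed[p])
        for p in baseline.keys() & observed.keys()
        if baseline[p] != observed[p]
    }
    return {"unexpected": unexpected, "missing": missing, "changed": changed}


# Hypothetical profiles for one recovered server.
baseline = {22: "ssh", 443: "https"}
observed = {22: "ssh", 443: "https", 4444: "unknown"}

report = compare_to_baseline(baseline, observed)
print(report["unexpected"])  # {4444: 'unknown'}
```

An unexpected listener such as the one above would be a reason to hold the system back from full production until it has been explained.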

Ultimately, when services are restored, the system should have an effective defence against future attacks of the same nature. Any access methods which may have been used to conduct such an attack should be corrected. When restoring services, systems or data from archived backups, take into account the type of attack, the data affected and, most importantly, the point in time at which the attack initially took place. This information should have been discovered and documented as part of the forensic analysis. This step in the recovery process is critical so that vulnerabilities, malware or corrupted data are not re-introduced into the operating environment. Depending on the severity of the incident, a full system rebuild may be required in order to re-establish the integrity of the system.
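As a rough illustration of the timeline consideration, the Python sketch below picks the most recent backup taken before the earliest known point of compromise. The backup names and timestamps are made up, and the sketch assumes the forensic analysis has produced a reliable "first compromise" time. A backup selected this way should still be scanned before it is restored.

```python
from datetime import datetime


def latest_clean_backup(backups, first_compromise):
    """Return the newest (name, taken_at) backup taken strictly before
    first_compromise, or None if every backup post-dates the attack."""
    candidates = [b for b in backups if b[1] < first_compromise]
    return max(candidates, key=lambda b: b[1], default=None)


# Hypothetical backup catalogue and forensic finding.
backups = [
    ("weekly-01", datetime(2016, 3, 6, 2, 0)),
    ("weekly-02", datetime(2016, 3, 13, 2, 0)),
    ("weekly-03", datetime(2016, 3, 20, 2, 0)),
]
first_compromise = datetime(2016, 3, 15, 9, 30)

choice = latest_clean_backup(backups, first_compromise)
print(choice[0])  # weekly-02
```

If the function returns None, no backup predates the attack, which is one of the situations where a full system rebuild becomes the safer option.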

Post-Recovery – Once the recovery is completed, incident remediation steps should be followed. Most of the time, the threat vector will be a system vulnerability or a network vulnerability. For such vectors, available patches or system updates should be applied. System hardening techniques may also need to be applied, and core deployment images may need to be updated to prevent the weakness from being introduced elsewhere in the organization. For a vector that is not vulnerability related, the root cause should be identified and appropriate fixes implemented.

Conclusion

It is important for the CSIRT to have a well-defined and smoothly functioning recovery capability. Without one, the same security incident or issue remains likely to recur.
Part 6 – Continuous Improvement

Introduction

One of the things we do in the Incident Recovery phase is determine the root cause of the incident and identify appropriate remediation steps. This typically follows the Root Cause Analysis workflow, which many of you will be familiar with. Once the remediation is done, it is important to document the “lessons learnt”.

Why is it important?

Lessons learnt are an important aspect of a CSIRT organization. “A stationary object gathers more moss”. This is the philosophy of a CSIRT organization: continuous evolution and improvement. The exercise typically takes 15 to 30 minutes, and every CSIRT member who handled the incident should take part. In this exercise, the following key items should be discussed:

  1. What process, technology or people worked?
  2. What did not work? Why?
  3. How effective were the response and the resolution? Why?
  4. Any recurring issues or themes?

Once answers to all these questions have been written down and discussed, a detailed action plan needs to be devised on how to improve the CSIRT function. The action plan can be categorized under two major groups:

  • Control Improvements – This section should describe any changes or improvements that should be put in place to better detect future incidents of this type and/or prevent similar incidents. Some examples are:
    1. Policy changes, typically to organization-wide policies covering users, IT systems, etc.
    2. Monitoring system changes, typically configuration changes made in the SIEM, perimeter or endpoint defences to improve detection and make reporting more efficient
    3. Architectural changes, typically long-term major changes in the way the systems are built
  • Process Improvements – This section should describe any improvements that could be made to the actual response process itself. Some examples are:
    1. Improving the Incident handling cheat sheet with additional details
    2. Improving the communications plan to get speedier response
    3. Escalation matrix improvements
    4. Process automation
    5. Staff training and awareness

Rinse and Repeat!!!

As you can see from the above, the goal is not to do this exercise as a one-time activity. Instead, it is a repetitive process. However, it may not be possible to do it for every single incident that is detected and worked by the CSIRT, so practicality dictates that the process be done in a way that is scalable. Keeping that in mind, below is a recommended approach:

  1. Perform a “Lessons Learnt” exercise for all Major and High Severity incidents
  2. Perform a “Lessons Learnt” exercise for all repeat incident categories (refer to Incident Classification for more details)
  3. Perform a “Lessons Learnt” exercise on a monthly or quarterly basis for CSIRT processes.
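The first two rules of this approach can be sketched as a simple filter in Python. The incident fields (`id`, `severity`, `category`) and the definition of "repeat" (a category seen more than once in the review period) are our assumptions for illustration. The periodic process review in the third rule would be scheduled separately.

```python
from collections import Counter

# Severities that always trigger a review (labels are assumed).
REVIEW_SEVERITIES = {"major", "high"}


def needs_review(incident, category_counts):
    """True if the incident is severe or belongs to a repeat category."""
    is_severe = incident["severity"].lower() in REVIEW_SEVERITIES
    is_repeat = category_counts[incident["category"]] > 1
    return is_severe or is_repeat


def incidents_to_review(incidents):
    """Return the ids of incidents that warrant a lessons learnt exercise."""
    category_counts = Counter(i["category"] for i in incidents)
    return [i["id"] for i in incidents if needs_review(i, category_counts)]


# Hypothetical incidents from one review period.
incidents = [
    {"id": 1, "severity": "low", "category": "phishing"},
    {"id": 2, "severity": "high", "category": "malware"},
    {"id": 3, "severity": "low", "category": "phishing"},
    {"id": 4, "severity": "medium", "category": "dos"},
]

print(incidents_to_review(incidents))  # [1, 2, 3]
```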

Conclusion

Lessons learnt are an important part of continued learning and the quest for functional perfection. The CSIRT is no different, and it too should be improved on a regular basis. These improvements should be aimed at detecting and responding to cyber incidents efficiently and in a timely fashion.

With this post, the CSIRT Series comes to a conclusion. Please feel free to post your comments in the section below.

 

SIEM Product Comparison – 101

Please refer to the SIEM Comparison 2016 for the latest comparison.

We at Infosecnirvana.com have done several posts on SIEM. After the Dummies Guide on SIEM, we are following it up with a SIEM Product Comparison – 101 deck. So, here it is for your viewing pleasure. Let us know what you think by posting your comments below. The key products compared here are drawn from the Gartner Magic Quadrant, which is what organizations typically use to select SIEM vendors. The vendors covered in the deck are:

  1. HP ArcSight
  2. McAfee Nitro
  3. IBM QRadar
  4. Splunk SIEM
  5. RSA Security Analytics
  6. LogRhythm

If you need any other vendor evaluated on the parameters mentioned in the deck, please let us know and we can post the evaluation for your use.