Introduction
Incidents cannot be avoided entirely, but a mature Incident Detection and Response function can greatly reduce the damage. In the CSIRT Series, we have been looking in detail at the various functions that make up a good IR process framework. Incident Containment and Incident Recovery are complementary processes. While Containment is aimed at stopping the spread of a breach, Recovery is about getting back on your feet by reverting to a “known good state”. The term “known good state” is ambiguous: it may apply to a single machine or to an entire network. In our opinion, the Recovery process, or getting back to a “known good state”, is a combination of three sub-steps:
- Pre-Recovery – Forensic evidence collection is, in our opinion, a pre-recovery step. It is critical for collecting and maintaining evidence that may be required to pursue future legal action.
- Recovery from Backup – Ensure that systems or networks are returned to the pre-breach state.
- Post-Recovery – Remediation of the threat vector is a crucial post-recovery step. It ensures that the infection or threat vector is no longer an issue.
Let us look at each of these sub-steps in detail.
Pre-Recovery – In cases that call for legal action, it is important to clearly document how all evidence has been collected, preserved and handled so that it is admissible in court. This is called Forensic Evidence Collection. Note that legal requirements vary from region to region and jurisdiction to jurisdiction, and a forensic investigator should be aware of the ones that apply. We recommend having some team members obtain computer forensics training and certification so that the team can handle the entire process end to end. However, it is not uncommon to bring in professional third-party help for forensic evidence collection and investigation during an incident. Forensics is a field in its own right, and detailing all of its process steps here would be impractical. Hence, we give a succinct summary of what forensics entails:
- Determine the legal issues surrounding the incident that may have an impact
- Determine the technology and processes within the scope of the forensic analysis
- Identify evidence from the infected machine or from a person. The evidence can be electronic or physical.
- Document and collect the identified evidence, following the chain of custody
- Perform the forensic investigation and analysis
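To make the "document and collect" step more concrete, here is a minimal sketch of a chain-of-custody record. It assumes that a SHA-256 hash taken at collection time is an acceptable integrity check. The names `EvidenceItem`, `CustodyEvent` and `collect` are hypothetical, chosen for illustration only. Actual evidentiary requirements depend on the jurisdiction, so treat this as a sketch and not as a forensic tool.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustodyEvent:
    """One entry in the custody log (hypothetical structure)."""
    handler: str    # person who handled the evidence
    action: str     # e.g. "collected", "transferred", "analysed"
    timestamp: str  # ISO 8601 timestamp in UTC


@dataclass
class EvidenceItem:
    """A piece of evidence plus its hash and custody history."""
    label: str
    sha256: str
    events: list = field(default_factory=list)

    def record(self, handler: str, action: str) -> None:
        # Append a timestamped entry so every hand-over is documented.
        now = datetime.now(timezone.utc).isoformat()
        self.events.append(CustodyEvent(handler, action, now))

    def verify(self, data: bytes) -> bool:
        # Re-hash the data and compare with the hash taken at collection.
        return hashlib.sha256(data).hexdigest() == self.sha256


def collect(label: str, data: bytes, handler: str) -> EvidenceItem:
    """Hash the evidence at collection time and open its custody log."""
    item = EvidenceItem(label, hashlib.sha256(data).hexdigest())
    item.record(handler, "collected")
    return item
```

Calling `item.verify(data)` again later shows whether the evidence still matches the hash recorded at collection, and `item.events` lists who handled it and when.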
Once the incident forensic process has been initiated, the incident may need to be reclassified based on its results. If so, the entire recovery and remediation process takes on a different character. For example, suppose a malicious code incident was originally triaged and classified as a medium security incident. The forensic analysis then reveals that the malicious code has installed hidden back-door processes that can now be traced to additional systems not originally identified as affected. The incident should be reclassified from a medium security incident to a high security incident.
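The reclassification example above can be written as a simple rule. This is only a sketch: the `Severity` levels and the `reclassify` function are hypothetical names, and the rule itself (escalate to high when a back door is found and forensics traces it to systems outside the original scope) is an assumption drawn from the example, not a general classification policy.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def reclassify(current, originally_affected, forensic_affected, backdoor_found):
    """Return the severity after forensic findings are taken into account.

    Assumed rule: a back door combined with newly discovered affected
    systems escalates the incident to HIGH. Severity is never lowered.
    """
    newly_found = set(forensic_affected) - set(originally_affected)
    if backdoor_found and newly_found:
        return max(current, Severity.HIGH)
    return current
```

A real CSIRT would replace this rule with its own classification matrix. The point is that the forensic findings feed back into triage.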
Recovery from Backup – If the incident meets the criteria for high severity and/or high impact, the CSIRT should determine whether IT business continuity, disaster recovery, and/or backup restoration procedures should be initiated. This is limited to high-severity and high-impact incidents purely for practical reasons. The goal of the Recovery phase is to safely put the impacted systems back into production. To complete the recovery process, the following three steps have to be followed:
- Validation of the recovered systems – Involves asking the user base whether the system is operating properly, or using profiling tools to check that the system's ports and services are consistent with its expected profile
- Restoring Operations – Involves placing the system into full production, allowing it to interact fully with other devices on the network
- Monitoring – Involves checking systems for back-doors or any other issues which may have escaped previous detection. If possible, host-based and network-based monitoring should be used to verify that the attacker did not leave any back-doors on the system
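The validation step can be illustrated with a small comparison of a recovered system's observed ports and services against a known baseline. This is a minimal sketch: it assumes the baseline and the observed profile have already been gathered by some profiling tool and are available as port-to-service mappings. The function name `compare_profile` is hypothetical.

```python
def compare_profile(baseline: dict, observed: dict) -> dict:
    """Compare two {port: service_name} mappings.

    Returns the ports that are open but not in the baseline, the ports
    expected by the baseline but not observed, and the ports whose
    service name differs from the baseline.
    """
    unexpected = sorted(p for p in observed if p not in baseline)
    missing = sorted(p for p in baseline if p not in observed)
    changed = sorted(
        p for p in baseline
        if p in observed and baseline[p] != observed[p]
    )
    return {"unexpected": unexpected, "missing": missing, "changed": changed}
```

Any entry under `unexpected` or `changed` is worth investigating before the system is placed back into full production, since it may point to a back-door that survived recovery.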
Ultimately, when services are restored, the system should have an effective defence against future attacks of the same nature. Any access methods which may have been used to conduct such an attack should be corrected. When restoring services, systems or data from archived backups, take into account the type of attack, the data affected and, most importantly, the timeline in which the attack initially took place. This information should have been discovered and documented as part of the forensic analysis. This step in the recovery process is critical so that vulnerabilities, malware or corrupted data are not reintroduced into the operating environment. Depending on the severity of the incident, a full system rebuild may be required to re-establish the integrity of the system.
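The role of the attack timeline in choosing a backup can be sketched as follows. The example assumes that the forensic analysis has produced an estimated time of initial compromise, and that any backup taken strictly before that time is a candidate for restoration. The function name `latest_clean_backup` is hypothetical, and a real selection would also weigh the type of attack and the data affected.

```python
from datetime import datetime
from typing import List, Optional


def latest_clean_backup(
    backups: List[datetime], compromise_time: datetime
) -> Optional[datetime]:
    """Return the most recent backup taken before the compromise.

    Returns None when every backup was taken at or after the estimated
    compromise time, in which case a full rebuild may be needed.
    """
    candidates = [b for b in backups if b < compromise_time]
    return max(candidates) if candidates else None
```

If the estimated compromise time is uncertain, it is safer to err towards an earlier time, since restoring a backup taken after the initial compromise could reintroduce the malware or corrupted data.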
Post-Recovery – Once recovery is complete, incident remediation steps should be followed. Most of the time, the threat vector will be a system or network vulnerability. For such vectors, available patches or system updates should be applied. System hardening techniques may also need to be applied, and core deployment images may need to be updated to prevent the same weakness from being introduced elsewhere in the organization. For a vector that is not vulnerability-related, the root cause should be identified and appropriate fixes implemented.
Conclusion
It is important for the CSIRT to have a well-defined and smoothly functioning recovery capability. Without it, a security incident or issue remains likely to recur.