As discussed in Part 1 – Incident Detection and Part 2 – Incident Classification, identifying an incident and accurately classifying it by category and severity are the first steps in an Incident Response process. Next comes the core of the process: Incident Handling. Readers may have known or used Incident handling processes and procedures for a long time, but if we were to compare them side by side, they would all be similar in purpose but different in execution. Consider Incident handling to be like an organization’s signature: “Unique and cannot be replicated easily”.
With this post, we are trying to provide our “unique signature” regarding Incident handling.
Before the work starts, it is important to define the foundational building blocks. Without these prerequisites, a structured Incident response will be difficult. These prerequisites are:
- Responder Groups – People who “do the analysis and investigation” on the ground are called Responder groups. These need to be defined as part of the CSIRT governance function. In smaller organizations, this can be one or two persons. They are typically the analysts, reverse engineers, forensic experts etc. They are the first line of defence.
- Resolver Groups – People who “do the remediation” are the resolver groups. Typically, this group comes into action mostly post-incident. However, these groups also assist during the incident investigation from an infrastructure angle. They typically comprise Network teams, Server teams, Application teams etc.
- Management Groups – People who are the “top brass in the organization” are the management groups. These groups are very important and need to be activated if the incident impact is going to be enterprise-wide. They typically comprise the ISM (info-sec manager), CISO, CIO, CTO etc.
- External Communication Groups – People outside of the IT department like Legal, HR, Crisis team, regulators etc. are called external communication groups. These groups take care of interacting with external parties like law enforcement, media, shareholders etc.
- Communication Protocols – Defining “How” to communicate among the various groups is very important because this can’t be established during a live Cyber Incident. Some of the protocols can be email, phone call, template forms, ticketing systems, encrypted communication lines etc.
- Service Level Agreements (if any) – As organizations mature in operating a CSIRT, they want to track the performance of their people and processes. This can be done using SLA metrics by defining the “time to respond” and the “time to resolve” or, in ITIL terms, the “Response SLA” and the “Resolution SLA”.
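The two SLA metrics above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the severity names, the target durations in `SLA_TARGETS`, and the `IncidentTimes` and `sla_status` names are all assumptions made for the example, and every organization would set its own values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical targets per severity: (time to respond, time to resolve).
SLA_TARGETS = {
    "high": (timedelta(minutes=30), timedelta(hours=4)),
    "medium": (timedelta(hours=2), timedelta(hours=24)),
    "low": (timedelta(hours=8), timedelta(hours=72)),
}


@dataclass
class IncidentTimes:
    severity: str
    opened: datetime
    first_response: Optional[datetime] = None  # when a responder picked it up
    resolved: Optional[datetime] = None        # when remediation was completed


def sla_status(inc: IncidentTimes, now: datetime) -> dict:
    """Compare elapsed time against the Response and Resolution SLA targets."""
    respond_target, resolve_target = SLA_TARGETS[inc.severity]
    # A milestone that has not happened yet is measured against "now",
    # so a breach is visible while the incident is still open.
    respond_elapsed = (inc.first_response or now) - inc.opened
    resolve_elapsed = (inc.resolved or now) - inc.opened
    return {
        "response_sla_met": respond_elapsed <= respond_target,
        "resolution_sla_met": resolve_elapsed <= resolve_target,
    }
```

For example, a "high" incident that was picked up after 20 minutes but is still open five hours later would meet the Response SLA and breach the Resolution SLA under these assumed targets.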
Every qualified and classified Incident (Part 1 and Part 2 of CSIRT Function) has to be analysed on its own merit. While the skeleton of the analysis is the same, the content and the context differ from organization to organization. In general, every analysis starts with the following 2 questions:
1. What do we know? – The answer to this question typically lies in gathering the details regarding the incident. The details can be as follows:
- Victim user/machine details like user name, machine name, IP Address etc.
- Logs that triggered the incident. The logs are typically from SIEM or the point products themselves.
- Attacker information from the logs like Attacker IP Address, Domain etc.
- Attack pattern if it is a signature alert from IDS/IPS/WAF etc.
2. What don’t we know? – This is everything else about the incident that we are yet to investigate or determine. This is the perfect jumping-off point for the investigation. Some of the most common items in this list are as follows:
- Forensic Analysis of the machine like disk analysis and memory analysis.
- Attack Vector synthesis
- Static and Dynamic analysis of malcode, Reversed binaries etc.
- Impact and spread of the attack in terms of data stolen, machines compromised, monetary impact etc.
Once the “known and the unknown” are identified, listing the course of action becomes easy. This will primarily assist in a timely and coordinated response. In this post, we will not be discussing the individual tools used in analysis. Instead, we will be talking about the overall process involved.
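The “known and unknown” split described above can be captured in a simple worksheet. The following is only a sketch under our own assumptions: the `AnalysisWorksheet` class, its field names, and the sample host names and addresses are hypothetical and are there purely to show how open questions turn into a course of action.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisWorksheet:
    incident_id: str
    known: dict = field(default_factory=dict)    # facts gathered so far
    unknown: list = field(default_factory=list)  # open questions to investigate

    def resolve(self, question: str, finding: str) -> None:
        """Move an open question into the known facts once it is answered."""
        self.unknown.remove(question)
        self.known[question] = finding

    def course_of_action(self) -> list:
        """Every open question becomes one investigation task."""
        return [f"Investigate: {q}" for q in self.unknown]


# Hypothetical incident: all identifiers and addresses are made up.
sheet = AnalysisWorksheet(
    incident_id="INC-0001",
    known={
        "victim_host": "WS-042",
        "victim_ip": "10.0.0.15",
        "attacker_ip": "203.0.113.7",
        "trigger": "IDS signature alert",
    },
    unknown=["attack vector", "malcode behaviour", "impact and spread"],
)
```

As responders report back, calling `sheet.resolve("attack vector", "phishing attachment")` moves that item into the known facts, and `sheet.course_of_action()` lists only what is still left to investigate.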
Once the Incident analysis is under way, there is bound to be a constant flow of information coming from the responder groups. Communicating this to the appropriate stakeholders is key to effective Incident handling. Different organizations have different communication protocols and, as mentioned in the prerequisites above, these can be defined along the following lines:
- Establish a communication protocol – Who to call? What is the number to call? What times to call?
- Primary and Secondary contact persons
- Communication template – Email, Report, SMS, Ticket updates, calls etc.
- Timelines – For example, First update – within 30 minutes, Second update – within 1 hour of the First update etc.
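The timeline item above lends itself to a small calculation. Here is a rough sketch that turns the example intervals (30 minutes, then 1 hour after the first update) into concrete deadlines. The `UPDATE_INTERVALS`, `CONTACTS` and `update_deadlines` names, as well as the contact details, are illustrative assumptions rather than part of any standard.

```python
from datetime import datetime, timedelta

# Gap before each update: the first is measured from the moment the incident
# is declared, each later one from the previous update's deadline.
UPDATE_INTERVALS = [timedelta(minutes=30), timedelta(hours=1)]

# Hypothetical primary and secondary contacts with their preferred channel.
CONTACTS = {
    "primary": {"name": "On-call ISM", "channel": "phone"},
    "secondary": {"name": "Deputy ISM", "channel": "email"},
}


def update_deadlines(declared_at: datetime, intervals=UPDATE_INTERVALS) -> list:
    """Return the latest acceptable time for each stakeholder update."""
    deadlines = []
    due = declared_at
    for gap in intervals:
        due = due + gap
        deadlines.append(due)
    return deadlines


def contact_for(primary_available: bool) -> dict:
    """Fall back to the secondary contact when the primary cannot be reached."""
    return CONTACTS["primary" if primary_available else "secondary"]
```

With an incident declared at 09:00, this yields update deadlines of 09:30 and 10:30, which matches the example timeline in the list.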
In our opinion, incident communication is one of the most underrated aspects of incident handling, and getting it right is important.
Once the analysis is complete, a decision needs to be made based on the collected facts. The decision can be to continue with the Incident containment function or move directly to the Incident Recovery function.