
Episode 3 – Security Investigation Series – Should I press the panic button?

 

Scenario:

  • A User's machine runs an Application Client that connects to the Web Application Server, which in turn connects to the Database servers on the back end.
  • The Client machine has been reported to be slow.
  • Basic SIEM-based analysis of the Network logs for the client machine shows Network connections resembling a Port Scan from the client machine within its own subnet.
  • The machine is also making random Internet connections to Public IP Addresses.
  • On checking, the Anti-Virus software on the machine shows as outdated.
  • The suspicion is that the machine is infected with some kind of malware.
  • Since this client machine has connections to the critical App and Database servers, this event is critical.

Should I press the panic button?

Before we press the panic button (raise an official Security Incident, which has an SLA tagged to it), there are several questions we need answered. This is something I learnt from one of my mentors (he blogs here). What I learnt was that before beginning a Security Incident investigation and pressing the panic button, we need to sit down and collect as much corroborating data as possible. To do that, we can follow a series of logical steps:

  1. Collect the Public IP Addresses the machine connects to and find out more about them. Sometimes a simple Google search can help determine the authenticity and reputation of the website/domain or public IP.
  2. Establish Ground Zero for the infection. This can be determined from historical log data. Some SIEM tools do this natively for you. Otherwise, you can use a whiteboard or pencil and paper to draw this visualization yourself.
  3. Gather the IPs of as many infected machines as possible and sort them geographically, by Network Zone and, if possible, by VLAN as well.
  4. Identify any other transitive infections.
  5. Analyze one of the infected machines to gather Malware analysis data.
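Step 3 above can be sketched in a few lines of Python. This is a minimal sketch; the zone map, zone names, and IP addresses below are hypothetical placeholders for whatever your environment actually uses:

```python
import ipaddress
from collections import defaultdict

# Hypothetical zone map: network prefix -> Network Zone name (all assumptions).
ZONE_MAP = {
    ipaddress.ip_network("10.1.0.0/16"): "DC-Internal",
    ipaddress.ip_network("10.2.0.0/16"): "Branch-APAC",
    ipaddress.ip_network("192.168.10.0/24"): "User-VLAN-10",
}

def group_by_zone(infected_ips):
    """Sort infected machine IPs into Network Zones (step 3 above)."""
    groups = defaultdict(list)
    for raw in infected_ips:
        ip = ipaddress.ip_address(raw)
        # First matching prefix wins; anything unmatched lands in "Unknown".
        zone = next((name for net, name in ZONE_MAP.items() if ip in net), "Unknown")
        groups[zone].append(raw)
    return dict(groups)
```

The same grouping idea extends to geography or VLAN if you can map prefixes to those attributes.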

Next Steps – Remediation, Control, etc.

  • Quarantine all the infected machines.
  • In the case of a server, ensure that all instances (if any) of the infected file are cleaned.
  • In the case of a database, if we know the exact infection source and the possible file names, try purging them from the database contents.
  • Submit samples of the Malware to the Security Vendor and ask them to start working on a Signature or definition.
  • Reverse engineer the Malware to identify specific Network connections/behaviors so that they can go into our prevention systems in the Network (read Network IPS, Host IPS, Proxy, etc.). Reverse Engineering is a good skill to have because, for Zero-day scenarios, vendors are slow to react. If a Reverse Engineer is available, getting a control measure in place (albeit a crude one) will be quicker.
  • Document the characteristics of the malware, the response taken, and the findings during the course of this entire incident.

And finally, if you did all of the above after following the Security Investigation Series blog, say thanks!!!

What is TOR? How can I Un-TOR?

TOR, or The Onion Router, is one of the most widely used Anonymity Networks on the "World Wide Web". Before we look at what TOR Networks are, it's important to understand the basic technology involved. This is key to defining a Detection/Prevention/Control strategy in Enterprise Networks. The basic technology behind TOR is Onion Routing.
Onion routing is a technique for anonymous communication over a Network. Data Packets are repeatedly encrypted and then sent through several onion routers. Like someone peeling an Onion, each onion router removes a layer of encryption to uncover routing instructions, and sends the message to the next router, where this is repeated. With this approach, each node in the chain is ideally aware of only 2 other nodes:

  1. The preceding node from which the onion was transmitted.
  2. The following node to which the onion should next be transmitted.

Now how is this achieved? Just as a Routed Network has routing tables, the Onion Network has directory servers that maintain the list of TOR Nodes and their Public Keys (remember, TOR uses Asymmetric Crypto). Whenever a request is made, the client crafts the data packet in layers: the outermost layer of encryption is removed by the First Onion Router and the innermost layer by the Last Onion Router. The peeling away of each layer of the onion makes it difficult or impossible to track the onion, and hence the name Onion Routing.
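The layering can be illustrated with a toy sketch. Note the hedge: base64 merely stands in for real per-hop encryption here (a real TOR client negotiates a symmetric key with each relay and uses proper ciphers); the point is only to show how each peel reveals exactly one next hop:

```python
import base64

# Toy illustration only: base64 stands in for real per-hop encryption.
# Each layer carries only the next hop's name plus the still-wrapped remainder.

def wrap(payload: bytes, route: list) -> bytes:
    """Build the onion: the innermost layer is meant for the last router."""
    onion = payload
    for node in reversed(route):
        onion = base64.b64encode(node.encode() + b"|" + onion)
    return onion

def unwrap(onion: bytes):
    """Peel one layer: the router learns only the next hop, not the full route."""
    next_hop, rest = base64.b64decode(onion).split(b"|", 1)
    return next_hop.decode(), rest
```

Unwrapping three times with the route ["A", "B", "C"] reveals one hop name at a time and, only at the end, the payload.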

Beyond the basic technology, it is worth understanding how TOR operates in practice and how hard it makes life for Security Professionals trying to identify and control it.

  • Firstly, TOR relay Nodes have to be public; their IPs cannot be hidden. Here is a sample list of TOR IP Addresses. This could potentially serve as a blacklisting source in Enterprise Firewalls/IPS/IDS/Proxy.
  • TOR clients can use Bridges to connect to the TOR Network. Bridges are nothing but Relay IP addresses that help a client reach the TOR Network when the public entry nodes are blocked. Bridge/Relay IP Addresses make it difficult to identify TOR Entry and Exit nodes. Any user can install “Vidalia” and set up a Bridge relay to help TOR users whose access to the public TOR nodes is blocked by an ISP or Enterprise. These bridge IP addresses are handed out randomly per request received. There are several such Relays, and they are hidden from the Public IP Pool.
  • TOR traffic is encrypted, and hence detection using IDS/IPS will be difficult.
  • TOR Clients can use SOCKS to set up connections, and hence differentiating SOCKS traffic that carries TOR from SOCKS traffic that does not is a great challenge.
  • Several Torrent applications can natively communicate over TOR. Identifying such machines will be a challenge in an Enterprise with a distributed setup, Remote Access setup, etc.

Now that we know what TOR is, we need to know how to control it in the Network.
Analysis and Control of TOR can be done as follows:

1. Blacklist all known TOR IP addresses – this is not fool-proof, for the Bridging reasons mentioned above.
2. Write a custom script to poll Bridge IPs and keep adding them to the IP Blacklist. The script can regularly query the Bridge Mail ID to get the random list of Bridge IP addresses.
3. If your enterprise uses HTTP Proxies only, then the SOCKS Protocol should not be available in your network. Identifying a user doing SOCKS can help identify possible TOR clients.
4. P2P traffic should be blocked in the enterprise, as P2P and TOR go hand in hand. The key thing to look for is Browser Plugins for P2P that mask themselves behind HTTP and HTTPS Requests. This is quite an interesting development as far as identifying P2P users in your network is concerned.
5. If Traffic Analysis or Flow Analysis is available in your network, you can profile your Network segments for all the Application protocols in use. Unless you are using TLS/IPsec throughout your network, chances are that very little encrypted traffic will be found. Filtering out the chaff of known encrypted traffic should narrow you down to a list of machines that generate encrypted traffic but are not supposed to, or that otherwise deviate from normal.
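Point 1 can be sketched as follows. This is a sketch under assumptions: the file path, one-IP-per-line format, and the "src"/"dst" log record fields are hypothetical stand-ins for whatever your TOR node list export and proxy/firewall logs actually look like:

```python
# Sketch, assuming a TOR node list has been exported to a local text file
# (one IP per line) and that proxy/firewall log records expose src/dst fields.

def load_blacklist(path):
    """Read one IP per line, skipping blank lines and '#' comments."""
    with open(path) as fh:
        return {line.strip() for line in fh if line.strip() and not line.startswith("#")}

def flag_tor_connections(log_records, blacklist):
    """Return (client, destination) pairs where the destination is a known TOR node."""
    return [(rec["src"], rec["dst"]) for rec in log_records if rec["dst"] in blacklist]
```

A set lookup keeps this fast even with tens of thousands of blacklisted IPs, which matters when you re-run it against a day's worth of proxy logs.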

I welcome readers to share their experiences working with TOR and to let me know of any other methods for identifying and analyzing TOR in Enterprise networks.

High Log Volume – What to Filter and What to Keep?

What and How much to Collect concentrated on giving a starting point for log collection and usage. Here, I am going to talk about Log Filtering for Security Investigation and Analysis purposes. If you are collecting logs from a troubleshooting perspective, you can collect without filtering, but for Security analysis we need less haystack so that we have a better chance of finding the needle.
In organizations where several thousand devices are sending Syslog, managing them is a real nightmare. In such scenarios, more focus is required on Log Filtering. Log Filtering for Security use is one of the most ignored aspects of Log Management, yet it is the most important aspect of clean, efficient log management and Security Event Analysis.

Log Filtering can be done at the Source or at the Syslog Server Location.

  1. Filtering at the Source is the best approach when it comes to Log Filtering. It provides better control of your infrastructure, since you know exactly what is being logged and what is not. Indirectly, it also reduces the utilization of System Resources. The downside is that it is the most difficult to implement. Network devices have very little capability for Source filtering: Firewalls offer a lot of Logging options, but Routers and Switches offer very few. UNIX offers basic filtering natively, and third-party tools like rsyslog and syslog-ng offer granular filtering capabilities. As is well known, there is no native Syslog capability in Windows, and it typically needs a third-party client to fill this gap. This actually works in favor of filtering, because third-party tools generally provide filtering options.
  2. Filtering at the Destination is the most practical and easiest to implement. Here, Log Management Solutions or SIEM Tools take care of filtering what is needed and what is not. However, your Source Devices will still be generating tons of logs, consuming System resources as well as Bandwidth.
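As an example of source-side filtering, a minimal rsyslog snippet might look like the following. This is a sketch only, assuming rsyslog v7+ RainerScript syntax; the file name and collector IP are placeholders:

```
# /etc/rsyslog.d/50-siem-filter.conf -- sketch, not a complete policy
# Drop debug-level noise at the source, before it ever leaves the box:
if $syslogseverity >= 7 then stop
# Forward authentication events to the central SIEM collector over TCP:
auth.*;authpriv.* @@192.0.2.10:514
```

The same two moves (discard known noise, forward the Events of Interest) apply whatever tool you use; only the syntax changes.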

Based on your organization's Architecture, you can decide which approach is easier to implement and manage. Both have their pros and cons, and it's up to the Security Teams to decide. Once the filtering approach is decided, it is time to move on to the “Filtering” itself. This is the hard part. Before you start filtering, understand the device families present in the Environment. Every Vendor logs events differently, and the same event can mean different things across versions. So take care that we don't end up filtering based on one version's log format when in reality we have another.

Example Log Filtering for Cisco ASA:
Let's take the example of a Cisco ASA Firewall. The Cisco Adaptive Security Appliance (Cisco ASA for short) Operating System generates several thousand distinct log messages. Out of these, the most important events from a Security perspective are the ones given in the table below. These events are built-in protection defaults for Cisco ASA appliances. Apart from these, we can pick and choose Traffic-based events for analysis purposes. If you are interested in the full list, please comment below and I will put that list up too.

%PIX|ASA-2-106016 Deny IP spoof from (IP_address) to IP_address on interface interface_name. – Spoofing Detection
%PIX|ASA-2-106017 Deny IP due to Land Attack from IP_address to IP_address – Spoofing Detection
%PIX|ASA-2-106020 Deny IP teardrop fragment (size = number, offset = number) from IP_address to IP_address – Teardrop Attack
%PIX|ASA-1-106021 Deny protocol reverse path check from source_address to dest_address on interface interface_name – Protocol Reverse Attack
%PIX|ASA-1-106022 Deny protocol connection spoof from source_address to dest_address on interface interface_name – Spoofing Detection
%PIX|ASA-3-109010 Auth from inside_address/inside_port to outside_address/outside_port failed (too many pending auths) on interface interface_name. (floodguard enable)
%PIX|ASA-4-109017 User at IP_address exceeded auth proxy connection limit (max) – DOS Attack Possible
%PIX|ASA-4-109022 exceeded HTTPS proxy process limit  – DOS Attack Possible
%PIX|ASA-5-111001 Begin configuration: IP_address writing to device – Privilege Use
%PIX|ASA-5-111003 IP_address Erase configuration – Privilege Use
%PIX|ASA-6-113006 User user locked out on exceeding number successive failed authentication attempts – Brute Force
%PIX|ASA-3-201002 Too many TCP connections on {static|xlate} global_address! econns nconns – DOS Attack Possible Behavior
%PIX|ASA-3-201004 Too many UDP connections on {static|xlate} global_address! udp connections limit – DOS Attack Possible Behavior
%PIX|ASA-4-209003 Fragment database limit of number exceeded: src = source_address, dest = dest_address, proto = protocol, id = number  – DOS Attack Possible Behavior
%PIX|ASA-4-209004 Invalid IP fragment, size = bytes exceeds maximum size = bytes: src = source_address, dest = dest_address, proto = protocol, id = number – Possible Intrusion Event
%PIX|ASA-4-209005 Discard IP fragment set with more than number elements: src = Too many elements are in a fragment set  – Possible Intrusion Event
%PIX|ASA-3-210011 Connection limit exceeded cnt/limit for dir packet from sip/sport to dip/dport on interface if_name  – DOS Attack Possible Behavior
%PIX|ASA-4-405001 Received ARP {request | response} collision from IP_address/MAC_address on interface interface_name to IP_address/MAC_address on interface interface_name – ARP Poisoning
%PIX|ASA-4-405002 Received mac mismatch collision from IP_address/MAC_address for authenticated host – ARP Spoofing
%PIX|ASA-2-410002 Dropped num DNS responses with mis-matched id in the past sec second(s): from src_ifc:sip/sport to dest_ifc:dip/dport – Possible DNS Attack Detected
%PIX|ASA-4-412002 Detected bridge table full while inserting MAC MAC_address on interface interface. Number of entries = num – Possible L2 Attack Detected
%ASA-4-424001 Packet denied protocol_string intf_in:src_ip/src_port intf_out:dst_ip/dst_port. [Ingress|Egress] interface is in a backup state – Possible Intrusion
%ASA-4-424002 Connection to the backup interface is denied: protocol_string intf:src_ip/src_port intf:dst_ip/dst_port – Possible Intrusion
%PIX|ASA-6-605004 Login denied from source-address/source-port to interface:destination/service for user “username” – Possible Intrusion
%PIX|ASA-5-304001 user@source_address Accessed {JAVA URL|URL} dest_address: url
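A destination-side filter for the messages above can be sketched in a few lines of Python. The subset of message IDs retained here is purely illustrative; in practice you would load the full Events-of-Interest list from the table:

```python
import re

# Cisco ASA/PIX syslog lines carry "%ASA-<severity>-<message_id>" (or "%PIX-...").
ASA_ID = re.compile(r"%(?:PIX|ASA)-(\d)-(\d{6})")

# Illustrative subset of the Events of Interest from the table above.
EVENTS_OF_INTEREST = {"106016", "106017", "106021", "113006", "410002"}

def keep(line: str) -> bool:
    """True if the line carries one of the message IDs we want to retain."""
    match = ASA_ID.search(line)
    return bool(match) and match.group(2) in EVENTS_OF_INTEREST
```

Keying off the six-digit message ID rather than the free-text description keeps the filter stable even when the message wording changes between OS versions.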

If you look carefully at the structure of the Event messages, you will notice that almost all of the Syslog Messages I have chosen have severities that range from 1 to 5. These Event IDs can be used for the analysis of different scenarios. Here are some pointers on how to select these Events of Interest. Some approaches that we can use are:

  1. Focus on the Utility of an Event rather than the data it generates. The description of an Event ID can often be misleading, so looking at the data the Event actually generates is important.
  2. Focus on the events that will help you trace back to the actual Attacker/User/Service responsible for triggering the Event in the first place. Such an event, independently or in tandem with another, should be able to help you do this.
  3. Look for the Event Collection best suited to your environment in terms of Security Monitoring. For Example, on an Internet Firewall, Spoofing events don't make sense to collect, whereas internal firewalls do need spoofing events.
  4. Look for Optimal Event ID Selection for auditing, so that you get the right amount of data from one Event ID rather than many. This may be difficult with some devices, but wherever possible it should be employed.
  5. Finally, see if your SIEM tools can parse these events properly for you to analyze.

Now let us look at Juniper NetScreen Firewall Events using the same approach as we used for Cisco ASA.

Critical (00031) Message 〈string〉 detected an IP conflict (IP 〈IP address〉, MAC %m) on interface 〈string〉
Notification (00031) Message ARP detected IP conflict: IP address 〈ip〉 changed from interface 〈if_old〉 to interface 〈if_new〉
Notification (00051) Message Static ARP entry added to interface 〈string〉 with IP 〈IP address〉 and MAC %m
Notification (00052) Message Static ARP entry deleted from interface 〈string〉 with IP address 〈IP address〉 and MAC address %m
Emergency Message Ping of Death! From 〈src_ip〉 to 〈dst_ip〉, proto 1 (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Emergency Message SYN flood! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto TCP (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Emergency Message Teardrop attack! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto { TCP | UDP | 〈number1〉 } (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number2〉 times.
Alert Message Address sweep! From 〈src_ip〉 to 〈dst_ip〉, proto 1 (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Alert Message ICMP flood! From 〈src_ip〉 to 〈dst_ip〉, proto 1 (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Alert Message IP spoofing! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto {TCP | UDP | 〈number1〉 } (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Alert Message Land attack! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto TCP (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Alert Message Port scan! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto { TCP | UDP | 〈number1〉 } (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number2〉 times.
Alert Message Source Route IP option! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto { TCP | UDP | 〈number1〉 } (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number2〉 times.
Alert Message UDP flood! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto UDP (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Alert Message WinNuke attack! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:139, proto TCP (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Critical Message Bad IP option! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto { TCP | UDP | 〈number1〉 } (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number2〉 times.
Critical Message EXE file blocked! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto { TCP | UDP | 〈number1〉 } (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number2〉 times.
Critical Message FIN but no ACK bit! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto TCP (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.
Critical Message SYN and FIN bits! From 〈src_ip〉:〈src_port〉 to 〈dst_ip〉:〈dst_port〉, proto TCP (zone 〈zone_name〉, int 〈interface_name〉). Occurred 〈number〉 times.

As you can see, in the case of Juniper, Alert and Critical messages are sent to Syslog, and these events are not always identifiable using Event IDs; from a Syslog severity standpoint, all of them would still be at Level 5. These events are also the default Security Messages generated by Juniper. In addition to these messages, we would collect the Traffic Logs (Event ID 000257) from Juniper, which help in Security analysis and correlation.
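Since these NetScreen attack alerts carry no handy numeric ID, a filter can key off the severity keyword and the attack phrase instead. A sketch, based on the message templates above (real log lines will have timestamps and device prefixes in front, which `search` tolerates):

```python
import re

# NetScreen attack messages end the attack name with '!', e.g. "Port scan!".
NS_ALERT = re.compile(r"(Emergency|Alert|Critical)\s+Message\s+(.+?)!")

def parse_netscreen(line: str):
    """Return (severity, attack_name) for an attack alert, else None."""
    match = NS_ALERT.search(line)
    return (match.group(1), match.group(2).strip()) if match else None
```

Anything that parses to a tuple is an Event of Interest; Notification-level housekeeping messages fall through as None.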

This same approach can be used for Windows, UNIX, Security Appliances, etc. For more details on filtering for a specific Device type, please request it in the comments section and I will post the filtering for the same.

If you effectively identify the Events of Interest for your various devices and filter out the chaff, you will be able to harness the power of Syslog for your Security Investigations. SIEM is getting bigger by the day in terms of data usage, and the world is moving towards big data collection. But the key thing to note is that, in spite of data growth, the quality of SIEM log analysis has not been going North. Quality does not necessarily come with Quantity. Filtering will always be needed, no matter how many logs we collect.

Remember “Logs don’t lie”