SIEM Technology – The Good, The Bad and The Ugly
SIEM is one of those technologies most of the organizations adopt in the wake of Security Log Analysis/Incident/Event Reporting requirements. If you already know what SIEM technology and want to get into the domain, these are the things to know (SIEM – What you need to know). If you don’t know what SIEM is, read it nevertheless!!! This blog post is to talk about SIEM technology by analyzing it critically (even though I am a big fan of SIEM, I believe that maturity comes from review and feedback). Almost a decade ago, SIEM started gaining traction and has come a long way since. Now, I think is a good time to review the technology from a critical view point. So here is my blog on The Good, The Bad and The Ugly!!! This will be a 2 part post, with Part 1 concentrating on Introducing SIEM and then highlighting what it has and has not achieved. Part 2 will concentrate on a proposal/vision on how SIEM should move ahead in the coming years
SIEM is data driven. Data in the form of logs from IT Infrastructure is the key driver for SIEM tools to perform their so called “magic”. Logs have been around in IT for a long time. Logs have been one of the main tools to troubleshoot programs/operating systems etc since long. Gradually, Security gained importance and because of an established logging platform available across IT landscape, Security Events also slowly started to trickle into Logs. With time, along came several compliance and Audit requirements that were driving the Security Log Management domain. Then gradually there arose a need to analyze Log Data and based on the analysis, perform an action. This is where SIM tools gained prominence. This later started to get focussed on Security related incidents and diverged as SIEM. If you look at the pro genesis of SIEM, it has all to do with Data. That is why in today’s world, where Data is exploding in the Internet, it is of utmost importance to understand a technology as SIEM and improve it with time.
What SIEM has accomplished?
For more than a Decade, SIEM has done a lot of things for IT folks. When there was no capability to analyze lines and lines of Log files, SIEM was our savior. SIEM gave us the following capabilities right off the bat:
- Process Log Streams from Various Products and standardize them into a single Application data set.
- Provide capabilities to work with several thousands of events per second and still give what we need in terms of searching and querying Log data
- Provide capabilities to co-relate data from different entities so that we can trace the progeny of an issue
- Provide nice Alerts/Reports/Dashboards/Summary for the IT Log data
- Finally, a Incident and Event Management Workflow to make it operational.
Several vendors of SIEM (SIM/SEM also used interchangeably but SIEM is becoming standard) exist and google searches will give you more than 20 in number. The SIEM market today has grown into a Multi-Billion dollar market and companies, people etc are all embracing the change.
While SIEM is a lucrative segment to be in, the problem is that the technology is not mature and has some gaping holes. The technology instead of solving a problem for good, fixed some and introduced several other collateral issues. Let us look at some of them below:
- Log Management as a technology, as a solution was never mature. We never had good enterprise wide Log Management technologies and tools around before SIEM arrived.
- Several Log Management issues still exist. These are around Big Data Sets, Standard Log Format Specifications, Integration of Log Sources, Standardization of Applications logging with respect to Security etc. Instead of focussing on fixing these issues, we jumped into SIEM solutions (Log Management + Event Management).
- SIEM came packaged with Log Management solutions as well, but they were not as efficient as they should be. SIEM came packaged with Event Management Solutions as well, but what is good Event Management, when Log Management is not efficient.
- Sample this, Windows Logs are resident files in a proprietary format. All Network devices send Syslog messages using the same RFC, but content is varied. Database Audit logs are a mix of Table Data and File Audit Data. When we have a variety of such logs from vendors, there is no way we can effectively perform Log Management and subsequently Event Management
- One of the best and easiest solution for Log Management was that SIEM vendors packaged a client that can collect and normalize the data into its proprietary format. Then the processed data was sent to a Central Manager where all Event Management capabilities existed.
- The problem with the above approach is, different data sources need different processing and hence a different client for every data source. Though this seems to be a simple solution at the outset, it adds a layer of complexity in terms of managing the Clients themselves. Imagine this problem for a huge enterprise and you know what a pain point this is for SIEM solutions.
- Client management is a decentralized approach and hence a failure. Monitoring the health of the client is one of the management headaches one has to bear with. Patching them, updating properties, remote management etc are all points of failure, Not to mention keeping them up and running with constant care and feed like a new born.
- Since the log standardization in SIEM is in proprietary format, migrating from one system to another, one vendor to another is a pain point. This would require client re-installation and data re-processing. This is a problem where you are stuck with a product for life. Inter-operability between systems has been always a problem for Vendors in IT space. This while protects their business, limits the capability of the end user to get what he wants. The solution cannot be more and more new products, new projects to replace existing SIEM solutions etc. It has to be more robust than that.
- Searching data across TBs (terabytes) of data is the most important problem every organization faces. How do our SIEM solutions solve this? By using some sort of Databasing and Indexing. All the databases today (Read Oracle/SQL/MYSQL/PGSQL) are all limited in terms of handling such randomly formatted, high volume feeds, thereby rendering long term searches, trend analysis etc a slow, frustrating and time consuming job.
- Client Server Models implemented by SIEM does not scale for BIG DATA!!! Let me tell you how:
- Most of the SIEM solutions I have worked with have 3 layers of architecture – Data Collector Layer (Event Collectors), Data Storage Layer (Event Indexers/Storage) and the Data Processing Layer (Event Management/Administration/Web Console/Server).
- In the above architecture, Data Collection and Data Storage is High Volume ranging up to 100K events per second. However, for Data Processing Layer or the Manager Layer, there is a limit of how much it can process (typically in 1/10 – 1/20 of collected data)
- If the effective use of Log data is only going to be 10-20%, what about the rest?
- People say aggregation and filtering is done to consolidate the data to be within the 10-20% range. Filtering and Aggregation have their own pros and cons but the end result is what you collect, is not used entirely.
- Managing SIEM solutions (from architecting, implementing, integrating, customization, event management, content development, maintenance etc) is not a simple task and usually requires huge investments in people and training. The vendors make money with this I know, but honestly, being a User, you know that “If it is complicated, adoption will be difficult”
- Most SIEM solutions are not integrated with ITIL process of Incident and Event Management (A rather standard form for IT framework used across the industry) thereby limiting deployments that should be a seamless transition.
I have to be honest about the fact that the above list is not comprehensive and there are several points you as readers would like to point out as far as the Positives and Negatives of SIEM. Please comment on and I will update the post with your views and comments. Part 2 will be discussing about the various options for SIEM to learn and improve based on Industry feedback and User feedback.