Incident Response Engineer
Role and (mostly) Responsibilities
Before we go any further, note the word ‘engineer’ in the title. One of the Oxford dictionary’s definitions of ‘engineer’ is: ‘Skilfully arrange for (something) to occur.’ With that in mind, let’s get on with it.
We’re discussing an InfoSec Incident Response Engineer in this article. If along the way at any point you feel like this article is more applicable to a manager role, you’re wrong. Every member of the IR team is a manager: they need to manage their part in the response, orchestrate it and be accountable for their actions according to their level of involvement.
Contrary to the mainstream belief that an IR engineer needs to be someone good at detecting or hunting for threats, the real purpose of an IR engineer is to respond to incidents in an efficient, impactful and compliant way, following clearly defined protocols and an even more clearly defined scope, after the incident has occurred. This definition usually gets lost in the name of proactive engagements, efficient time management and whatnot. Basically, that means ‘we don’t have enough “active incidents” to keep our entire team occupied’. Which can only mean one of two things: you are not detecting all the bad things happening in your network OR you are not detecting all the bad things happening in your network. :)
A good IR engineer needs to tick two boxes:
- Basic technical abilities: Good technical understanding of the environment they need to respond in and thorough practical knowledge of the toolset required to do the job. ‘Basic’ means they need to be across every technical aspect required for the role, not just bits and pieces. This is a term that has been used rather loosely in recruitment posts and wrongly perceived by prospective candidates for quite some time now. Basic means you meet ALL the minimum requirements for the role. In this article, we are not talking about what you need to get a job in IR; we’re discussing what you need to do the job.
- Advanced abilities: These abilities are required to ‘engineer’ desired outcomes in a timely, efficient and impactful manner.
Trust me, the first one is the easier of the two as it can be taught.
Basic technical abilities
Andrew is an IR engineer who has been called in to respond to a security incident in a large Windows environment. In this case, Andrew needs to be familiar with the operating system and, at the very least, all the basic protocols and technical aspects of conducting a response exercise in a Windows-based environment. He needs to be able to perform basic triage, gather evidence and then classify artefacts correctly. In order to carry out these steps effectively, he needs to be across all the tools required to assist with the process thus far. At this point, he should have clear knowledge of what the next steps are and what other team(s) he needs to engage in order to progress the response process to the next step.
It takes a lot of experience and hard work to get to a stage where you are confident in all technical aspects of the process. You need to have gone through sufficient training and practical experience with the tools of the trade.
An IR engineer doesn’t necessarily need to be an expert in the advanced technical skills required to investigate an incident, but they need to be comfortable enough to get their hands into the technical aspects of the incident. They need to have a good understanding of the environments they’re working in and also of the common techniques used for acquisitions (disk, memory etc), forensics and advanced analysis leading to actionable information and artefacts. For highly advanced tasks such as reverse engineering, they should be able to engage the right team for further actions.
Now let’s move on to the second part.
Other than technical knowledge and experience, both of which serve as the bedrock for a good IR engineer, there are other attributes that make an IR engineer a good IR engineer. Let’s start with an example and then we’ll discuss each of them.
Helen is a commercial airline pilot with many years of experience flying large passenger aircraft with commercial airlines. She has flown around the world and has exceptional technical knowledge when it comes to piloting aeroplanes. This example helps us understand what makes her a good pilot: On one of the long haul flights under Helen's command, there is an incident on the aircraft. Working in conjunction with the crew, Helen quickly determines it is a serious incident and declares an emergency. The decision on what needs to happen next is now with Helen. Helen quickly assesses the situation and makes the decision that they need to land the aircraft as soon as possible. She quickly delegates the first set of tasks to her crew. One of the tasks is to find the nearest airport that they can make an emergency landing at. Another very important task is that of communications with the relevant authorities on the ground. Once they have that information, Helen quickly asks for all information on that location. Once she has acquired all the information on the environment they are about to face, she gets her team to work on the different aspects of the new location such as type of airport, type of runway, headwind, ground facilities etc. All of this information needs to be clearly inspected, analysed and then laid out in a simple but comprehensive format so that the entire team can use it for further decision-making.
After going over the information, Helen makes a few more decisions and quickly asks her team to check her decisions and see if there's anything that she has missed. All this time, there is a clear and direct line of communication open with the ground authorities where all updates regarding the situation onboard the aircraft are relayed flawlessly to the air traffic control team.
After all the calculations have been made, Helen approaches the runway and, under clear guidance from the team on the ground, the aircraft lands without any incident. Moreover, because of the clear line of communications between the aircraft and the ground team, emergency services are already on the runway, ready to start helping the crew and the passengers. Helen's job here is to land the aircraft safely and she has the technical and operational experience to do that. But as you can see above, this is not her only job. She needs to make sure they pick the right airport. She needs to make sure that they land in the right airport (in the right country) that has all the facilities required for such an event. She needs to make sure that they have an emergency team deployed on the ground when the aircraft lands (for this she needs to make sure that they know what the situation is on the aircraft so they get the right people deployed on the ground). She needs to make sure everyone on the aircraft remains calm. There are lots of decisions she needs to make in the moment and she needs to make sure they're all the right ones. She runs her decisions past others in the team to make sure she hasn't missed anything. She needs to delegate tasks, spreading the workload and also making sure she uses the different skillsets within the team. She needs to be assertive while dealing with other teams such as air traffic control. Most importantly, she doesn't panic under stress. She didn't just land the aircraft in an unfamiliar location, she engineered a perfect emergency landing.
I like using the above example as it is very similar to what IR people have to go through when working on an active security incident. Obviously, the comparison is only an analogy. The similarities are striking though. Decisions need to be made, external and internal parties need to be engaged, tasks need to be delegated, different skillsets need to be utilised and so on.
In order to look at some of these factors, let’s break them down into points and discuss in detail.
Don’t panic
Many a time I have seen experienced IR engineers starting to panic when engaged on a high severity security incident. When people see cyber attacks in action, especially against their own networks, stress levels peak very fast. In these situations, the worst thing to do is to panic. That, quite literally, renders you (somewhat) useless at that point. All your technical know-how and expertise is more or less irrelevant at this point and it definitely doesn’t prepare you for these high-stress situations. You need to have your faculties in total control in order to be productive and useful. This is not something anyone can teach you. You need to condition yourself for this, although having this ability naturally is best. I like to tell people that this is what they’ve been trained for, this is their job: showing up for situations like this and then helping out. Security incidents happen and when they do, IR folks spring into action and take care of the situation. Think of it this way: if there were no incidents, your role wouldn’t exist! This usually helps with people taking control and getting on with the job :)
Let’s take a look at another example from a different industry.
Cameron is an experienced and highly capable paramedic with years of experience in high-stress active duty. When on shift, Cameron has to deal with anything from an elderly person suffering a domestic accident to drug overdoses and serious assaults. When faced with a situation where lives are at risk, one thing Cameron and his team can't afford to have in the mix is panic. Lives depend on their ability to handle the situation without any panic, by following the protocol to the letter, while using their own judgement and experience to make sure they take actions best suited to the condition of the people they're trying to help. This saves lives.
How do you get to this point in your career? Hands-on experience. Lots and lots of it. BUT you need to be a certain type of person who can handle stress in a positive way. That is hard to teach. It has to come from within. Even experienced people panic and make wrong decisions.
What makes Cameron a good paramedic in this example is his natural ability to handle stress well, combined with his practical field experience.
Again, this example is more of an analogy; obviously, paramedics deal with life or death situations on a regular basis. But then again, depending on when and where, InfoSec IR could in fact also be about lives at risk.
Although it is best to have this ability come to you naturally, there are ways to condition yourself to a level where you don’t panic easily. Some of the other points discussed below in this article go a long way if followed. Also, knowledge is power. The more you know what to do, the less you feel the stress. Work as a team, help each other and win.
Panic is a strict NO-GO ZONE.
Delegate
The whole point of having a team is to have a pool of skillsets, both technical and non-technical. You need to delegate tasks or reach out for help (another way of delegating). You will have someone on your team who is better at doing a certain task than you (if you don’t, stop reading this and go and build a better team), so reach out to them and delegate that task to them. They’ll end up doing it faster and better than you. You should only ever work on the aspects of an incident that you are familiar with and experienced with. This, like this entire article, applies to every member of the team, regardless of whether or not you are a manager. You need to be able to break down your workload into separate tasks and then identify the right people to do those tasks.
As you go deeper into the incident, you’ll come across new aspects of the incident which may or may not fall into your area of expertise. Contrary to common understanding, delegating is not a manager-only task. Delegating tasks is something that highly successful, efficient teams do within themselves, which, in turn, makes them successful and efficient in the first place!
Make better decisions
Regardless of what level you’re engaged at with the incident, you’ll need to make decisions. Lots of decisions. Whether it is about what tool to use for a certain task or what to include in your update to the execs, it is absolutely imperative that you make the right decisions. Wrong decisions can alter the path of the entire incident. They can affect outcomes and also have a negative impact on your response.
So what does it mean to make better decisions? At a basic level, it means you need to make decisions based on facts and your own judgement backed by past experience in similar situations. You need to make sure your decisions are not made based on wrong information, unverified facts or impulse. At times, you will need to go with your gut and that’s fine, as long as you’ve set all other factors discussed in this article in place. For example, you need to make sure that you are encouraging your team to challenge your decisions (discussed further down in this article) so that if you miss something, they can point it out to you.
Ask questions and don’t be afraid to ask them
By this point we have established beyond doubt that you need to possess the basic technical skillset and knowledge that’s needed for you to be on the IR team¹. Moving forward, there will be situations (on almost every single incident) that require you to learn on the go. The vast majority of this learning depends on, and is directly proportional to, the number and frequency of questions you ask as you go along. There will be new teams that you need to work with as part of a security incident. There will be new protocols and guidelines that you’ll need to become familiar with within a short period of time. You’ll need to deal with unfamiliar infrastructure and new terminology as you go deeper into an incident. All of this can be fast-tracked if you ask the right questions at the right time. NEVER hold back a question when you don’t know something about the incident that you’re dealing with. Remember, it is not expected of you to know everything; that’s one of the reasons you have access to other teams and their knowledge while responding to an incident.
No question is a silly question. Unless it really is a silly question, that is.
Communicate clearly
Consider this. You are dealing with a security incident at your company and an external response team has been engaged to help with the incident. During the course of the incident response, they end up asking a few questions that are not very clear. You answer them to the best of your knowledge but in the end it turns out that the information they really needed was not something that they directly asked for. Needless to say, this problem will end up delaying proper response to the incident. Which is something that you would not appreciate.
When engaged in a response task, you need to make sure of two things when it comes to communication. Firstly, make sure there’s regular communication between all parties involved in the response process. Secondly, and this is more important, make sure that all communication is clear, precise and to the point.
There is nothing worse than miscommunication during an active incident. It not only slows the response process down, it can even result in wrong actions that could derail the entire response process.
If you do not have a dedicated communications team/personnel, make sure you follow the keep-it-simple rule.
- Use bullet points where applicable
- Assign tasks clearly to individuals if possible
- Make sure recipients know how to contact relevant parties and when
- Always include a TL;DR at the start of all communications
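If you like having a template for this, here is a minimal sketch in Python of what an update following the rule above could look like. Everything in it is an assumption made for illustration: the class names (IncidentUpdate, Task, Contact), the field names and the sample values are hypothetical and not taken from any real tool or incident. It simply puts the TL;DR first, uses bullet points, assigns each task to a named individual and tells recipients who to contact and when.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    owner: str    # a named individual, not a team alias
    action: str   # what they are expected to do


@dataclass
class Contact:
    name: str
    channel: str  # how to reach them
    when: str     # when it is appropriate to reach them


@dataclass
class IncidentUpdate:
    """Hypothetical container for one status update during an incident."""
    tldr: str
    facts: List[str] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    contacts: List[Contact] = field(default_factory=list)

    def render(self) -> str:
        # The TL;DR always comes first so nobody has to read further to get the gist.
        lines = [f"TL;DR: {self.tldr}"]
        if self.facts:
            lines += ["", "What we know:"]
            lines += [f"- {fact}" for fact in self.facts]
        if self.tasks:
            lines += ["", "Who is doing what:"]
            lines += [f"- {t.owner}: {t.action}" for t in self.tasks]
        if self.contacts:
            lines += ["", "Who to contact and when:"]
            lines += [f"- {c.name} via {c.channel} ({c.when})" for c in self.contacts]
        return "\n".join(lines)


# Placeholder values only, purely to show the shape of the output.
example = IncidentUpdate(
    tldr="One host isolated, investigation ongoing, no action needed from execs yet.",
    facts=["Suspicious activity confirmed on one workstation"],
    tasks=[Task(owner="Responder A", action="collect memory image")],
    contacts=[Contact(name="Incident lead", channel="phone", when="if severity changes")],
)
print(example.render())
```

The design choice here is only that the structure forces the habit: you cannot build an update without a TL;DR, and every task has exactly one owner.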
Make your team challenge your decisions
This is the trickiest one in this list! The purpose here is not to get into long arguments and debates. On the contrary, the goal of this point is to make sure there’s enough camaraderie and understanding between the team members to be able to positively challenge decisions when required with the mutual aim of getting through the process with confidence, certainty and easy flow — start to finish.
The whole idea of having a team working on an incident is to make sure multiple brains are tackling the problem rather than one. It is very easy to get sucked into a certain direction when dealing with a high stress situation and miss some important things. When you end up in that situation, you’d definitely want someone in your team to ‘correct’ you or challenge your decisions when needed. The key here is to cultivate a culture where your team finds it easy to do that and also to make sure that they understand the value of coming to the table with not just a problem that they see with your decision, but also a solution to that problem. This helps reduce time wasted while working on an incident and also helps create an open and progressive culture in the team.
Don’t panic
You’re right, I did mention this one before! The reason you’re seeing this here again is that I really want to stress on how important this point is.
Panic during an active incident means you’ve already lost half of the battle. Wrong decisions will be made, tasks will be delayed, delegation will take a hit and eventually the chances of the entire process being handled badly end up being quite high. Not the result you want for a security response.
If it helps, keep reminding yourself regularly during the incident not to panic. If it gets too much, take a break; that’ll be far more useful in the end than carrying on in a panicked state of mind.
And remember: it’s your job to respond, it’s what you’re trained for, there’s no time to panic!
In conclusion, there are a lot of technical skills that are required to be able to do basic IR tasks, and some of them take years to acquire. To take it to the next level and become a great IR engineer, one has to learn how to deal with difficult, stressful situations in a professional, efficient and effective way. A great IR engineer is much more than a tech expert: they are someone with the ability to ‘engineer’ a successful response to complex security incidents.
It is not possible to put everything in one article when describing a certain role in any industry. This article is based on my own experiences and those I’ve been fortunate enough to witness and learn from others. There will be many more aspects to what makes a great IR engineer and if you’d like to add those, please feel free to put them in the comments below, it’ll be interesting to read those.
Also, please note that these are all my personal opinions/views.
¹ Note that I always put forward any idea in terms of a team. IR is not a solo gig. At least not in the enterprise scheme of things. If you’re working as a single engineer on a major security incident, you’re bound to fail or at the very least, not succeed within an acceptable time frame. Even if you possess ALL the skills to respond to the incident technically, you won’t be able to deliver on time.