Welcome to Cash in the Cyber Sheets. I'm your host, James Bowers, and together we'll work with business leaders and industry experts to dive into the misunderstood business of cybersecurity and compliance to learn how to start making money from being secure and compliant. Welcome to Cash in the Cyber Sheets.
Hey everyone, welcome to episode 15 of Cash in the Cyber Sheets. I'm your host, James Bowers, Chief Security and Compliance Architect here at Input Output. Very happy to have you all on with us today. So over the past few weeks, we've been talking about business continuity plans, and we're going to stay in that same realm. Last week we talked about how to actually put one together.
This week, I want to talk about the different types of testing, some of the different methodologies, and quickly the pros and cons of each: where you'd use some, and why you would use some over the others. That way, once we've got our plan, we know how to test it really well. Interestingly, in audits that we do, even recent ones, this is an area that is typically lacking in quite a few companies. And it's not normally that it's overlooked, just that it's done at a smaller scale.
So we'll talk about like a lot of companies are just doing walkthrough exercises or very limited tabletop and they may not have the resources. But I want to talk about where that can leave everything, where that can introduce a lot of risk, because as we're going through our audits, inevitably, we're finding a lot of things with the continuity plans, things, unfortunately, that are missed and God forbid, when disaster actually strikes, that's when everything really kind of comes to light. So you don't want to do your main testing during a disaster.
You don't want that to be your primary continuity test. So we're going to go through those here today. Before jumping in, before going any farther, please, please click that subscribe button, click the follow wherever you're listening to us at, send us some comments.
And if there are certain topics that you'd like to discuss, some things that you're having trouble with in your information security program, give us a shout. We'd love to have it on there. So jumping into the different types of continuity plan tests, and for those of you that don't take complete notes, that's OK, we've got a companion article on our website, right on our blog, with all of this information.
So you'll be able to go there, check it out. But in any case, we're going to go through each of those. So the main continuity plan test types that we're going to talk about today start with a walkthrough test, sometimes just described as a step-by-step brainstorming.
Sometimes it's described as a very light tabletop. Sometimes it's used interchangeably with a tabletop exercise, and the tabletop exercise is number two. Going up, we've also got parallel testing.
We've got full interruption testing and then sandbox testing. These are not necessarily in an order of impact or danger, or one being better than the other, because they each have pros and cons. And that's what we're going to talk about here.
So our very first one is walkthrough testing. Now, like I said, sometimes companies will just mash all this together, or people talk about these as a tabletop exercise, and that's definitely not a bad way to look at it.
The reason why I make that distinction, and why I like having it in there, is because not all tabletop exercises are created equal.
The walkthrough is what I look at as kind of a lighter version, and the biggest thing about a walkthrough is that it's really what it sounds like. We're just going to walk through all the different things that can happen. We're going to bring in some of the key players involved and just talk about it: hey, if this system went down, what would happen? What would we need to do to keep things running? OK, what do we have in place?
At the heart of it, that's really it. It's a brainstorming session where we're going to talk through all of those different little nuances: discuss the plan, discuss what could happen. And through that process, we're going to identify some different things here and there that maybe we didn't consider. Maybe there's another system that we need to bring up.
Or some other things that we need to put into place, or perhaps we should bring in somebody else to get their perspective, because they would have better input than who we have currently. That's basically all a walkthrough is, though: a really nice brainstorming session and, honestly, just dedicated time to actually go through the continuity plan, which, like ripping off the Band-Aid, is something most people don't like to do.
So you could probably describe the walkthrough as time-set-aside testing. The pros of it: very, very inexpensive and relatively easy to do. Really, all you're doing is scheduling a little bit of time and getting everybody to engage. The con is that it's very limited.
You're typically not getting all the parties involved. You're not actually bringing anything down, so there's a lot of assumptions, and you know what happens when you assume. There's a lot of assumptions in this type of continuity test.
So when there's an actual disaster event, if all you've done is these walkthroughs, these very limited tabletops, chances are you're going to find some things you didn't consider. Or some of the assumptions you had in place, say that you had adequate backup systems, you're going to find out that you really didn't.
It's not going to operate exactly as you thought it would, and that could leave you in a bad spot. So walkthrough testing is great to get your continuity plans built, run them through some revisions, and get input from some of the SMEs and other stakeholders, especially the ones that have limited time available.
And then from there, you can look at some of the other tests, maybe with a more limited team. So, again, pros and cons: it's very easy to execute, and it typically should be a part of every annual business continuity plan test, or at least a part of it. So the next one, and like I said, kind of in tandem, is the tabletop exercise.
And what this is, is really a more robust kind of walkthrough. Essentially, you're bringing in all of the different participants, all of the people that would be involved with the continuity plan during a disaster. And you're actually going to walk through different scenarios.
You're going to create a scenario that could happen, and a lot of times that's actually breaking it down into stages. So let's say, in Florida, OK, we have a hurricane. We've detected it off the coast of Florida in the Atlantic, and it looks like it's going to come in and hit us. What would we do at that point? Go around the table, everybody gives their input, and you look at the plan. OK, do we have the things in place? Who's going to execute? What vendors do we need, and do we have them in place? And you're going to go a bit deeper with it. In some cases, you actually treat it just like doing an audit, right? You said you're doing this. OK, show me the evidence that you could execute.
You know, do you have these names and numbers on your contact list? Then we could further discuss, OK, when it makes landfall, what's that going to look like? What are each of the teams going to do? How are we going to communicate to all of our employees? Are we going to communicate to our clients? So on and so forth. And then if this causes a power disruption or a prolonged outage, what's going to happen then? Are we going to help employees get resources they need? Are we going to move to remote only? What are we going to do? And it's typically a much more structured process. I mean, you schedule all of these, but here you'll have a very regimented program and agenda in place.
OK, here's where we're going to discuss the event. Here's where we're going to go around and essentially role play with everybody. Here's where we're going to try to pick that apart and not in a bad way.
That's just everybody coming up with different ideas of perhaps how that wouldn't work or how that wouldn't best support the company. So that way you can find ways to improve it. And once that's all completed, then you can even do a postmortem on it and then make any changes to the continuity plans that you need.
So there's a lot of similarity between what we defined as the tabletop and what we defined as the walkthrough, and that's because they're essentially the same thing. The tabletop is just much more structured.
The walkthrough, again, the best way to look at that is just time set aside to actually go through the business continuity plan.
As far as pros and cons of the tabletop go, the biggest pro is that it's going deeper. You're getting a lot of input.
And the main focus is to try and pick the plan apart so you can see how we can improve it. The cons are really the same as the walkthrough. We're not actually bringing any systems down, so we're still making assumptions.
And when the time comes that we need to put this plan into place, we're probably going to find that there are things we missed, or things aren't going to operate as we expected them to, and it's going to cause us the same problem.
I guess if we're listing all of the cons, another one could be that it is more formal. You're needing to put more time into planning, you're needing to bring in more people, and it's typically longer than the simple walkthrough is. So as far as personnel resource impact, there's definitely a higher level there, but it's not bringing any systems down.
So to the company, it's not really too big of an impact, and we'll throw that in the pro column.
These next few actually start impacting systems. This is where we're going to really start simulating an actual event. So the first one here is a parallel test.
And what that's doing is running all of those continuity systems in parallel to your current systems. So you may not actually bring anything down, but we're actually going to say spool up our backup internet, make sure it's working, make sure we can operate on it. Maybe we'll throw a few people on there to make sure that it works.
We may actually go to our warm site or hot site and take a few systems there, maybe take an employee or two, and make sure that it's working. Hey, can we actually operate like we thought we could? Let's have somebody spend a day there, work through this process, and see where the snags are, and that way we can actually improve it. Like the name implies, it's not typically bringing anything offline.
Again, it's just working in tandem with your current systems, but because you're deviating from how you typically work, there is that potential for productivity impact. I think that can be very well mitigated, especially if it's planned well. But the really nice thing about a parallel test is that here we're actually testing the systems.
We're not making assumptions that they're going to work. We're putting them through their paces to make sure that, yes, they will operate the way that we thought. And typically during the parallel test, where we find some things that didn't, we can address them now. So that way, when disaster hits, we're much better prepared.
Now, pros and cons of it. Actually, I think we just talked about the pros right there. We're really defining all that.
We're seeing the different ways that things could not work as expected and getting them addressed. Now, the cons: it's a much deeper exercise that is going to impact productivity in some way. It takes a lot more planning.
It can typically be more expensive, because now we're not just operating one set of systems or processes. We're operating two or three, or whatever the case is for your continuity plan. The point is, all of those systems and processes take resources to run, and now we're doing that on top of what we're already doing. So that goes beyond the other ones, which are just a personnel time investment. I don't want to say time suck; investment's a much better word there.
This one, the parallel test, actually starts to cost the company money to execute. So again, there's not just planning the test, but also planning for management authorizations. I guess a good way to look at it is there's typically more bureaucracy involved in getting this one scheduled and executed.
Taking the parallel test quite a few steps further is a full interruption test, right? This is another one of the continuity plan tests, and a full interruption test is just like it sounds. We're going to take all of our production systems, all of our current systems that could be impacted by the disaster we're looking at.
We're going to bring them offline, and we're going to work just from the continuity plan. We're going to execute this just like a disaster happened right now. And everybody's going to do what they're supposed to do.
The great thing about a full interruption test is that not only are you testing the supporting systems, not only are you testing the plan, but you're also testing every stakeholder's, every person's involvement. You're making sure that they all know what to do, and you're helping to train them. So everyone from IT and senior management, all the different stakeholders, all the way down to frontline employees is now having to operate under this continuity plan.
And this is getting them familiar with it, understanding what they have to do, understanding the steps, and then also identifying where there are gaps so that you can address those. So part of this is also, in itself, a very deep training exercise. I think there are two primary benefits. Benefit one, you're getting everybody trained.
You're getting everybody trained and walking through this so they know what to do when things happen. Number two, far and above any of the other testing options, you're typically going to find every single opportunity for improvement: where the systems don't work, where the communication is a bit rocky, where the processes could use some refinement.
Every part of this, you're going to find ways to improve the continuity plan. And this is really the deepest, best way to test the continuity plan because you're actually using it. That's also where the biggest impact, the biggest con, comes in: we're bringing everything offline to run off this continuity plan.
Continuity plans by definition are not production. They're not running at full capacity. You're limping along as best you can while you get things put back together.
So just running this test, just operating in the business continuity plan itself, is going to impact productivity because we're going to have reduced functionality. There's also the impact to all of the employees that need to be trained prior, that need to be involved.
We need to coordinate with clients. We need to coordinate being able to manage all the things that we have to manage while this is going on. There's money involved because now we're actually taking off production systems.
So we're essentially perhaps losing some revenue, but we're also having to run these other systems, which typically costs money. So as far as actual resource expenses, a full interruption test is typically very intrusive, very expensive, and requires a lot of planning. However, with the benefits that it brings as far as training and being prepared, it really can't be overlooked.
And especially for systems where you need high availability, you may not want to do a full interruption test, because you don't want to take those offline. But doing full parallel testing, and then trying a full interruption after those parallel tests, means you have a pretty good idea of how everything's going to work before you actually, really test it. Those can fit very well together.
The final one, which I think is really just a different way to do a parallel or a full interruption test, is sandbox testing. Essentially, we're going to create a duplicate environment and test our continuity plan there. So we're either doing a parallel test in that sandbox, or we're doing a full interruption test in that sandbox, while production over here is unaffected.
This will typically pull some resources, sometimes personnel, over into that sandbox environment, but it's a way to really have your cake and eat it too when it comes to parallel and full interruption testing. We get that really deep review, that really deep analysis and those improvement opportunities, without impacting production, without having such an impact on the organization. I would say the biggest con is really cost: creating a duplicate environment as best you can is expensive. The other con is that, try as you might, when you create another environment you're either going to miss things, or because of resource constraints there are certain things you won't put in. What can also happen is that when you put your sandbox together, everything gets put together the right way, say all the configurations, so things that you missed in your production environment and haven't yet realized are working in your sandbox environment.
So when we do our sandbox test, everything looks great over here. It's all working great. Everything executed just like we thought it would, or it didn't,
and we found ways to improve it. But when we have an actual disaster, things don't work as expected because, oh my God, we didn't have certain things configured correctly in our live environment, and now we're discovering it. So again, it's just like everything in security.
It's that seesaw: pros and cons, ups and downs. But as far as when and how to use these, they should all be used at different times. So when we're building our plan, when we're getting everything together, the walkthroughs, those limited tabletops, are excellent because they allow us to build everything out, basically our core framework, the main blueprint. Then we can graduate to a much deeper tabletop where, all right, now we're going to bring all the teams together and we're going to walk through this, and here are some example disaster scenarios of what's going to happen at these different times.
And does everybody know what they're doing? And, Oh, okay. We didn't consider that. Let's get that put in.
And then for more critical systems, more critical areas, let's start with some parallel testing if the environment allows it. Let's go ahead and run those continuity systems, those continuity processes, to see what works, what doesn't, and what we can improve. And then for those critical systems that we cannot have down for an extended amount of time, let's try our full interruption test to make sure that we can switch over and operate effectively, perhaps even doing that, or a parallel test, in a sandbox environment. And then finally, when we think we've got everything together, when it looks like it's all buttoned up and tightened down, during non-peak hours when we can, we do that full interruption to see, okay, have we gotten everything put together like we thought we did? Is this going to operate when we need it to? So those are the different ways that you can use these different testing methodologies for your continuity plan.
And they're each going to provide different benefits. They're each going to have some drawbacks to them. And as long as you go into it, understanding what those are, what you'll be able to see and what you can't, that's a great way to manage the risk, the expectations, and then be able to manage the limited resources, I say limited because it's not infinite.
We all have limited resources, especially when it comes to our information security programs, our compliance programs, you'll be able to more effectively manage those limited resources. And then when things happen and they will, when you have some sort of disaster, some sort of incident, you'll have everything in place that you need to be able to keep the business running. And that's ultimately what this is all about.
So I think that is a good place to wrap it up for today with our different continuity plan tests. As a reminder, we do have this on our blog, so go ahead and hop over to inputoutput.com/blog, and it'll be right there. It should be right at the top because it's coming out at the same time as this podcast. It gives each of the different types in a little bit more detail: the pros and cons, some other considerations, and just some other good information.
And I think eventually, I don't know if we're going to put some playbooks in there for that. I don't know if that's, I don't want to say behind a paywall. I like giving a lot of stuff away for free.
Some of our partners like to slap my hand over that. But we'll see what we can get out there. I'll see what I can squeeze through.
So before you go, please hit that subscribe button, hit that follow button, and leave us some comments. And as we always say, if you'd like to be on the show, I'd love to hear your story. I'd love to hear how you've managed your risk, some of the hardships you've gone through, and how you got your business from that little seed getting started to where it is now, or how you did that in the company or organization that you're managing.
So we'd love to hear your story, but thanks so much for joining us today on Cash in the Cyber Sheets. Very happy we're here. We will see you next Thursday at 10 a.m.