Welcome to Cash in the Cyber Sheets. I'm your host, James Bowers, and together we'll work with business leaders and industry experts to dive into the misunderstood business of cybersecurity and compliance to learn how to start making money from being secure and compliant. Welcome to Cash in the Cyber Sheets.
Hey everybody, welcome back to Cash in the Cyber Sheets, the podcast where we dig deep into the challenges, risks, and strategies of information security, business practices, and regulatory compliance.
I'm your host, James Bowers, Chief Security and Compliance Architect here at Input Output, and very excited to continue our journey through the Dirty 13. Now, for those of you just joining us, the Dirty 13 is our series highlighting the most common audit findings we uncover during our information security gap assessments. Today, we're tackling a topic that often goes overlooked: poor data classification.
In this episode, we're going to explore why data classification is so vital to your organization, why failing to classify your data properly can leave you vulnerable, and the steps you can take to build an effective classification framework. Along the way, I'll share our own Input Output classification structure as an example of how to get it right. So let's get started.
Before we jump in, though, please click subscribe, hit that like button, and let all of your friends and colleagues know about Cash in the Cyber Sheets so they can sign up, too. You can find us on Apple, Spotify, or YouTube.
And with that said, let's go ahead and jump in. So why does data classification even matter? It seems straightforward. You sort data into categories and protect it accordingly.
But in practice, many organizations either skip this step entirely or implement it poorly. The consequences of neglecting data classification are serious. Without a clear framework, you don't know where your data is stored, who's accessing it, or how to prioritize its protection.
These blind spots can have a domino effect across your entire security strategy. For example, if you can't identify where your sensitive data resides, you can't create accurate data flow or network diagrams. Now, these tools are essential for understanding how data moves through your systems and for identifying vulnerabilities.
And let's not forget the financial side of things. Companies without proper classification often overspend on security measures that don't address the real risk or worse, they miss opportunities to optimize their data management entirely. Fixing inefficiencies and aligning protection to actual data sensitivity can save significant resources while reducing exposure to breaches.
Now, proper classification also lays the foundation for advanced solutions like data loss prevention, DLP, or data leak prevention, however you want to say it, and data de-identification, which we'll discuss later. But the bottom line is this: if you don't classify your data, your organization is left exposed, and you won't even really know where.
Now, let's talk about the consequences of getting this wrong. Poor or completely non-existent data classification leads to cascading issues. First, without classification, tools like data loss prevention (DLP) can't function effectively.
DLP relies on knowing what data is sensitive so it can monitor and prevent unauthorized transfers. Also, data masking and data de-identification are completely off the table if you don't know which data needs these protections. These technologies allow you to reduce the risk without sacrificing the usability of your data, which is crucial for modern businesses.
There's also the issue of compliance. If you can't demonstrate that you're protecting sensitive information properly, you're inviting regulatory scrutiny. This can lead to lawsuits, reputational harm, and some serious, serious fines.
Finally, breaches caused by poor classification don't just impact the bottom line, they can derail the entire business. Imagine losing critical intellectual property or customer trust due to preventable oversights. That's a position that no company wants to be in.
Now, let's shift gears and look at what a good data classification framework actually looks like. To illustrate this, I want to walk you through our own Input Output classification structure. At Input Output, we use a five-tier system to match the sensitivity and audience of our data.
Each of these levels defines how data should be handled, shared, and protected. Now, it's tied loosely to the Traffic Light Protocol (TLP), and I'm going to show you exactly how that works here. So our first level is TLP White, or public data.
This is data that's intended for everybody: marketing materials, publicly available reports, blog posts, all of that. It's freely shareable. But there's still risk if it's accessed from internal systems during a breach.
Really, the only issue there is that it creates a potential reportable event while we investigate. But overall, if public data is breached, it's not that big of a deal to us. The next step up is TLP Green, or internal data.
This is information that's not intended for the public, but it isn't highly sensitive either. It might include internal policies or operational updates. NDAs are advised when sharing these, but they're not always required.
We'd prefer not to distribute these publicly, but ultimately it's not a big deal if they get out. The third level, and this is where most companies operate, is TLP Amber, or confidential data. This includes highly important data that requires strict controls.
For example, financial records. It could be certain intellectual property. Any internal company data typically falls into confidential.
Sharing this type of data requires NDAs; we cannot share it with external parties without executed NDAs. And breaches can actually lead to regulatory fines and reputational damage.
The next highest step is TLP Red, or restricted. This is highly sensitive data like personally identifiable information (PII) or protected health information (PHI). Access to this is strictly need-to-know, and it's the most heavily regulated.
Any breach of TLP Red or restricted data is a serious event that most often requires disclosure to affected parties, and it typically involves remediation payments and all kinds of other costly activities. Where Input Output differs slightly is that we have one extra level, which we classify as purple; there's no official TLP color for it. We call it critical, or top secret. This is our most sensitive category, covering information that's essential to the organization's operation.
A breach at this level would be catastrophic. Typically, this type of information includes trade secrets: things that don't have a patent, but that run the company. Classic examples are Coca-Cola's recipe or KFC's 11 herbs and spices.
These things getting out could have a serious impact on the continued operation of the business. For this level, access is extremely limited and sharing is highly controlled. There are very, very strict controls around this.
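To make those five tiers concrete, here's a minimal sketch in Python of how the levels and their handling rules might be encoded. The level names, TLP labels, and the `nda_required` and `need_to_know` flags are my illustrative assumptions, not Input Output's actual implementation:

```python
from enum import Enum

class Classification(Enum):
    """Five-tier scheme loosely tied to the Traffic Light Protocol.
    Labels and rules are illustrative, not an official mapping."""
    PUBLIC = "TLP:WHITE"        # freely shareable
    INTERNAL = "TLP:GREEN"      # internal audience; NDA advised
    CONFIDENTIAL = "TLP:AMBER"  # NDA required before sharing
    RESTRICTED = "TLP:RED"      # strictly need-to-know (PII/PHI)
    CRITICAL = "TLP:PURPLE"     # trade secrets; access extremely limited

# Hypothetical handling rules keyed by level.
HANDLING = {
    Classification.PUBLIC:       {"nda_required": False, "need_to_know": False},
    Classification.INTERNAL:     {"nda_required": False, "need_to_know": False},
    Classification.CONFIDENTIAL: {"nda_required": True,  "need_to_know": False},
    Classification.RESTRICTED:   {"nda_required": True,  "need_to_know": True},
    Classification.CRITICAL:     {"nda_required": True,  "need_to_know": True},
}

def may_share(level: Classification, nda_signed: bool, on_access_list: bool) -> bool:
    """Return True if sharing is permitted under the rules above."""
    rules = HANDLING[level]
    if rules["nda_required"] and not nda_signed:
        return False
    if rules["need_to_know"] and not on_access_list:
        return False
    return True

print(may_share(Classification.PUBLIC, nda_signed=False, on_access_list=False))     # True
print(may_share(Classification.RESTRICTED, nda_signed=True, on_access_list=False))  # False
```

The point of encoding the rules as data rather than prose is that every downstream tool, from DLP policies to access reviews, can read the same table instead of reinterpreting a policy document.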
Having this structured framework ensures that every piece of data is protected in a way that aligns with its sensitivity and impact. So how can your organization start building a robust data classification framework? It might seem daunting, but it doesn't have to be. Step one is to inventory your data.
What types of information do you handle? What are their sensitivity levels? Start categorizing based on factors like regulatory requirements, operational needs, potential impact, and exposure. I'd also recommend splitting out PII and PHI into their own bucket; that will help you figure out which controls you need where.
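As a rough sketch of that inventory step, here's how you might automatically flag records containing PII so they land in their own bucket. The regex patterns below are simplified assumptions for illustration; real DLP tooling uses far more robust detection:

```python
import re

# Hypothetical, simplified patterns for spotting common PII during inventory.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def tag_record(text: str) -> set[str]:
    """Return the set of PII types found, so records can be bucketed separately."""
    return {name for name, pattern in PII_PATTERNS.items() if pattern.search(text)}

sample = "Contact jane.doe@example.com, SSN 123-45-6789"
print(tag_record(sample))  # {'email', 'ssn'} (order may vary)
```

Even a crude first pass like this tells you where your PII bucket actually lives, which is the prerequisite for everything that follows.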
Step two is to define your classification levels. Use a framework like ours at Input Output as a guide, but tailor it to your organization. Make sure it's clear.
Make sure it's actionable. Step three is to implement controls and assign protections and access restrictions to each classification level. This ensures that your data is only accessible to the right people under the right conditions.
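That controls step can be sketched as a simple access matrix mapping each classification level to the groups allowed to see it. The level names and group names here are hypothetical placeholders; your own matrix would reflect your org chart and your framework:

```python
# Hypothetical mapping of classification levels to the groups allowed access.
ACCESS_MATRIX = {
    "public":       {"everyone"},
    "internal":     {"employees"},
    "confidential": {"employees"},      # external parties only under NDA
    "restricted":   {"privacy-team"},   # strictly need-to-know
    "critical":     {"executives"},     # extremely limited
}

def can_access(level: str, user_groups: set[str]) -> bool:
    """True if any of the user's groups is authorized for this level."""
    allowed = ACCESS_MATRIX[level]
    return "everyone" in allowed or bool(allowed & user_groups)

print(can_access("restricted", {"employees"}))      # False
print(can_access("restricted", {"privacy-team"}))   # True
```

The value of writing the matrix down explicitly is that it's auditable: you can hand it to an assessor, diff it over time, and feed it directly into your identity and access tooling.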
Finally, train your team and invest in tools to automate classification, such as DLP and data classification tools. Automation can help you maintain consistency and compliance without overburdening your staff.
So that's it for today's episode of Cash in the Cyber Sheets. We covered why data classification is so important, how Input Output's framework works, and the steps you need to take to build your own data classification framework. If you're realizing that your organization has a classification gap, don't worry.
It's never too late to start. Take inventory, build a framework, and put the right protections in place. Thanks for tuning in.
If you found this episode helpful, share it with your colleagues or anyone responsible for data security at your organization or any of your partner organizations. Until next time, thanks for listening.
Thanks for joining us today. Don't forget, click that subscribe button, leave us a review, and share it with your network. Remember, security and compliance aren't just about avoiding risk. They're about unlocking your business's full potential. So stay secure, stay compliant, and we'll catch you next week on Cash in the Cyber Sheets. Goodbye for now.