On one of my adventures navigating the world of data protection and organizational security, a client’s simple ask to implement Data Loss Prevention controls in Microsoft 365 led to a surprising revelation. The system, designed to safeguard sensitive information, stumbled when faced with policy-related documents. This prompted an examination of the controls we were building, and showed that there were no filters in place that would recognize what a policy even looked like. Reflecting on the many different types of materials within the organization that might contain sensitive information, and on the lack of clarity about what lives where, sparked both a recognition of the problem and a commitment to taking control of the data scattered across the organization by classifying it in a consistent way.
Making this discovery when we did also illuminated and averted a potential crisis that no one saw coming. Because the organization had never formally classified all of its data, some of the higher-security initiatives it had in place, like the aforementioned Data Loss Prevention, could not be fully used if the need arose. Left unchecked, this could have led to unintended document exfiltration, meaning that bad actors could get access to documents that should never leave the organization’s electronic perimeter, and quickly cause a breach.
Let me be clear: this organization cared about security! They were already investing their time, energy, and resources in valuable security initiatives. The problem is that they built their house with a key piece of the foundation missing.
Data classification is an integral, but easily overlooked, part of managing and ultimately protecting data. In fact, it may be the single most important factor that drives an information security program. Data classification is important because it allows people and organizations to understand the types of information they are processing and storing. It makes employees aware of the types of information they handle and the data’s value. Once data’s value is understood, all the other aspects of information security come into play to secure it from external threats and make it available to the people, organizations, and processes that rely on it.
Data classification is the process or method for defining and labeling data according to its type, sensitivity, and business value so that informed choices can be made about how it is managed, protected, and shared, both inside and outside your organization. For each pool of data held in an organization, an owner should be named who is familiar with the contents of the data and who will decide who can access it.
The first step in the data classification process is to define a person or group that will be responsible for the ownership of the data that needs classification.
Data classification can be a massive undertaking for established organizations that have never done it before. It is most effective when it has buy-in from top-level executives. In structured organizations, this responsibility may fall to a data governance committee. The data governance committee should consist of an executive leader, the owners of all the data pools, the IT staff who will build out the controls for regulating the data, and a manager for those who will review the events generated when data shows up somewhere it wasn’t expected.
Data classification in government organizations commonly includes five levels: Top Secret, Secret, Confidential, Sensitive, and Unclassified. In commercial organizations, there are typically four levels: Restricted, Confidential, Internal, and Public.
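To make the four commercial levels concrete, here is a minimal Python sketch that models them as an ordered set of labels. The numeric values, the `Classification` name, and the `stricter` helper are my own illustrative assumptions, as is the rule that combined data takes the more sensitive of two labels; your organization may define its levels and rules differently.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Commercial levels from the text, ordered least to most sensitive.

    The numeric values are an assumption used only to make labels comparable.
    """
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


def stricter(a: Classification, b: Classification) -> Classification:
    """Return the more sensitive of two labels (hypothetical rule for combined data)."""
    return max(a, b)


# Combining Internal and Confidential data yields a Confidential result.
print(stricter(Classification.INTERNAL, Classification.CONFIDENTIAL).name)
```

Ordering the labels this way lets later controls ask simple questions such as "is this at least Confidential?" instead of comparing strings.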
Once the classifications are identified and communicated, the second step will be for the data owners to categorize the data pools.
By organizing data into categories, organizations gain more control and make data easier to locate and retrieve, which is of particular importance when it comes to risk management, compliance, and data security. Depending on the nature of the data, it may fall under regulatory control, such as the Gramm-Leach-Bliley Act (GLBA) for financial institutions or the Health Insurance Portability and Accountability Act (HIPAA) for healthcare organizations. In the US, each of these regulations has specific requirements for how sensitive data must be safeguarded. Breaches of those safeguards that result in data disclosure can lead to fines of millions of dollars.
The third step is for the assigned data owners to decide how, to whom, and how much of the data can be disseminated.
Organizations must build an effective data governance strategy that considers people, processes, and technology. Many safeguards rely on a combination of these elements. At the core, people are both the owners and protectors of the data. The processes are designed to protect access to the data through authentication and authorization mechanisms. Finally, technology is used to control access through mechanisms such as encryption, tagging, and process denial. If a party has no right to access a particular data element, then they should never be granted access to it, and if they try to circumvent those controls, then technical safeguards should engage to block access.
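The "no right, no access" rule can be sketched as a deny-by-default check. This is only an illustration: the pool names, the group names, and the `APPROVED_GROUPS` table are hypothetical, and a real environment would enforce this through its identity and access management tooling rather than a lookup table.

```python
# Hypothetical policy: the groups each data owner has approved for a data pool.
APPROVED_GROUPS = {
    "payroll": {"hr", "finance"},
    "press-releases": {"everyone"},
}


def can_access(user_groups: set[str], pool: str) -> bool:
    """Deny by default: unknown pools and unapproved groups get no access."""
    approved = APPROVED_GROUPS.get(pool, set())
    # Access is granted only if the user shares at least one approved group.
    return bool(user_groups & approved)


print(can_access({"finance"}, "payroll"))    # approved group
print(can_access({"marketing"}, "payroll"))  # not approved by the owner
print(can_access({"finance"}, "unlisted"))   # pool with no policy is blocked
```

The design choice worth noting is the default: a pool that nobody has written a policy for is blocked rather than open.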
Throughout the process described above, another aspect that should always be included is digital asset management, which involves cataloging the data by identifying where it is stored, who owns it, how it is classified, and what its purpose is. Without solid data management, data assets can become liabilities. The full life cycle of data asset management includes the collection or creation of data, the data’s storage and use, and the data’s eventual disposal.
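A catalog like the one described above can start out very simply. The sketch below records the four attributes named in the text (location, owner, classification, purpose) and flags entries that are missing an owner or a label. The `CatalogEntry` and `find_gaps` names and the sample file paths are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntry:
    location: str                         # where the data is stored
    purpose: str                          # why the organization holds it
    owner: Optional[str] = None           # named data owner, if assigned
    classification: Optional[str] = None  # e.g. "Confidential"; None if unlabeled


def find_gaps(catalog: list[CatalogEntry]) -> list[str]:
    """Return the locations that lack an owner or a classification."""
    return [
        entry.location
        for entry in catalog
        if entry.owner is None or entry.classification is None
    ]


# Hypothetical sample catalog with one complete and one incomplete entry.
catalog = [
    CatalogEntry("//fileserver/hr", "payroll records",
                 owner="HR director", classification="Restricted"),
    CatalogEntry("//fileserver/tmp", "unknown exports"),
]
print(find_gaps(catalog))
```

Running a gap check like this on a schedule is one way to notice data pools that slipped past the classification effort.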
As you can see, data classification is an indispensable cornerstone for a strong information security program. Without data classification, there cannot be proper data management, security, or controlled dissemination. Once data is classified, additional controls can be developed that allow for automatic detection of sensitive documents, and those documents can be prevented from going outside the organization.
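To illustrate how classification enables automatic detection, here is a toy outbound check. It is not how Microsoft 365 or any particular Data Loss Prevention product works. It assumes a hypothetical convention in which every document carries a line such as "Classification: Confidential", and it assumes that only Public documents may leave the organization.

```python
import re

# Hypothetical label format: a line such as "Classification: Confidential".
LABEL_PATTERN = re.compile(
    r"^Classification:[ \t]*(\w+)[ \t]*$", re.IGNORECASE | re.MULTILINE
)

# Assumption: only these labels may be sent outside the organization.
EXTERNALLY_SHAREABLE = {"public"}


def may_leave_organization(document_text: str) -> bool:
    """Return True only if the document's label permits external sharing.

    Unlabeled documents are blocked, reflecting the point that a control
    cannot make an informed decision about data that was never classified.
    """
    match = LABEL_PATTERN.search(document_text)
    if match is None:
        return False
    return match.group(1).lower() in EXTERNALLY_SHAREABLE


print(may_leave_organization("Classification: Public\nPress release text"))
print(may_leave_organization("Classification: Restricted\nSalary data"))
print(may_leave_organization("A document nobody ever labeled"))
```

Blocking unlabeled documents is a deliberately strict choice for this sketch; some organizations may prefer to alert on them instead while classification work is still under way.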
Returning to the example of my client: With control of their data’s classification, they were able to effectively implement the powerful security controls that they wanted. They confidently put the pieces of a strong cybersecurity program in place, including a Data Governance committee and a fraud department that checks alerts from the Data Loss Prevention system. They are now well guarded against breaches and prepared for a resilient recovery if one ever happens.
It is well worth the investment and overhead to classify all data. An organization may be surprised to find hidden weak points, and will then be in a position to shore them up before they cause a problem. Many times, when a full data classification effort is done, we find sensitive data sitting in temporary repositories or file shares where it is not supposed to be, or discover users who were not following good practices for handling sensitive data. When these mistakes are caught before they cause harm, they can be powerful opportunities to model and train your staff in proper data caretaking using a real-world example, with an emphasis on empowerment rather than reactivity.
For Kuma’s clients, these are the things we emphasize and build on as part of growing an organization’s security maturity. They are also controls that we continually test to ensure they are working as expected. Finally, these policies, controls, and tests lead to not only a stronger and more secure organization, but also a fully compliant organization, whether that be HITRUST or another industry-required certification.