Classifying Data by Sensitivity
Data classification hinges on one question: What would be the consequences if this data leaked?
Applying this question to your company’s data yields three main categories:
High Impact
This category includes personal information whose exposure could breach data protection laws or put people at risk of identity fraud. It also includes sensitive corporate documents such as confidential reports and strategy documents.
Moderate Impact
This includes information that you’d rather keep private, but which poses no immediate risk. For example, B2B invoices and supplier agreements may fall into this category, as well as personal information that doesn’t identify an individual.
Low Impact
This information won’t hurt your business if leaked, and much of it may already be available to the public. Press releases, white papers, and non-proprietary corporate information all fall into this category.
Companies can use this system to create a classification taxonomy for data. Some of the more common taxonomies use the labels Public, Internal, Confidential, and Restricted. You can also create further compartments within these generally accepted categories.
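As a minimal sketch of such a taxonomy, the three impact levels and the four common labels could be modeled as follows. The mapping from labels to impact levels is an assumption made for illustration, as are the names `Impact`, `LABEL_IMPACT`, and `highest_impact`; your own policy decides where each label sits.

```python
from enum import IntEnum


class Impact(IntEnum):
    """Outcome-based impact levels, ordered so that higher means more severe."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Hypothetical mapping from the four common labels to impact levels.
# This assignment is illustrative only; adjust it to match your policy.
LABEL_IMPACT = {
    "Public": Impact.LOW,
    "Internal": Impact.MODERATE,
    "Confidential": Impact.HIGH,
    "Restricted": Impact.HIGH,
}


def highest_impact(labels):
    """Return the most severe impact level among a data set's labels."""
    return max(LABEL_IMPACT[label] for label in labels)
```

A data set that mixes labels inherits its most severe level under this sketch, so `highest_impact(["Public", "Confidential"])` evaluates to `Impact.HIGH`.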
How to classify personal data
Most privacy laws define personally identifiable information (PII) as information that could potentially reveal someone’s identity. This clearly includes direct identifiers such as:
- Name
- Address
- Date of birth
- Login credentials
- Social Security number
- IP address
- Biometric information
This doesn’t mean that every record associated with an individual automatically counts as PII. For example, a register of login times for a user account is personal information, but on its own it does not necessarily identify anyone.
That said, data owners must bear in mind that minor pieces of data can reveal someone’s identity when combined. A widely cited study by researcher Latanya Sweeney found that the combination of gender, date of birth, and five-digit ZIP code is enough to uniquely identify 87 percent of U.S. residents [7].
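One rough way to check this effect in your own data is to measure how many records become unique once a set of fields is combined. The sketch below assumes records are plain dictionaries; the function name `unique_fraction`, the field names, and the sample rows are all made up for illustration.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers):
    """Return the fraction of records whose combined field values are unique."""
    if not records:
        return 0.0
    # Build one key per record from the selected fields.
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


# Invented sample rows, for illustration only.
people = [
    {"gender": "F", "dob": "1980-02-14", "zip": "02139"},
    {"gender": "F", "dob": "1980-02-14", "zip": "02139"},
    {"gender": "M", "dob": "1975-07-01", "zip": "02139"},
    {"gender": "F", "dob": "1991-11-30", "zip": "60614"},
]

print(unique_fraction(people, ["gender"]))                # 0.25
print(unique_fraction(people, ["gender", "dob", "zip"]))  # 0.5
```

In this toy sample, gender alone singles out one record in four, while the three fields together single out half of them. A high fraction on real data suggests the combination should be handled as identifying.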
When in doubt, it’s best to assume that all personal records count as PII until you’re sure otherwise.
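This default-to-PII rule can be expressed as a deny-by-default check: a field counts as PII unless someone has reviewed it and cleared it. The sketch below is illustrative only; the field names in both sets are hypothetical and would come from your own schema and review process.

```python
# Hypothetical field names; replace these with fields from your own schema.
KNOWN_PII = {
    "name", "address", "date_of_birth", "password",
    "ssn", "ip_address", "fingerprint",
}

# Fields that a reviewer has explicitly cleared as non-identifying.
REVIEWED_NOT_PII = {"order_total", "product_sku"}


def is_pii(field_name):
    """Treat a field as PII unless it has been explicitly reviewed and cleared."""
    name = field_name.strip().lower()
    if name in KNOWN_PII:
        return True
    # Unknown fields fall through to True: assume PII until cleared.
    return name not in REVIEWED_NOT_PII
```

With these sets, `is_pii("last_login_time")` returns `True` because nobody has cleared the field yet, while `is_pii("order_total")` returns `False`.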
Expanding your data classifications
The system above is outcome-based: it classifies data by the consequences of a leak. Some organizations may choose to add extra layers of detail to create a more expressive taxonomy that describes multiple types of risk.
Some of the extra factors to consider are:
- Frequency of movement: Data is at greater risk when it moves frequently between locations. Conversely, the risk decreases when the data remains encrypted in a secure repository and rarely moves.
- Encryption and password protection: Additional measures can help lower risk, such as password-protecting files or encrypting them in transit. It’s not always possible to encrypt data while it is in use, so data that is actively being processed carries a higher potential risk.
- Access level: The more people with access, the greater the risk. If data rests in a highly restrictive environment, it’s low risk. Data on a live system with multiple users is at a much higher risk.
- Compliance impact: Some organizations choose to classify data according to legislation. Health data poses a high risk of HIPAA breaches, while personal data about E.U. residents could lead to a GDPR issue.
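The factors above could be folded into a simple additive score layered on top of the outcome-based impact level. This is only a sketch: the `DataAsset` fields, the one-point weights, and the ten-user threshold are arbitrary assumptions chosen for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class DataAsset:
    name: str
    base_impact: int         # 1 = low, 2 = moderate, 3 = high
    moves_frequently: bool   # frequency of movement
    encrypted: bool          # encryption or password protection in place
    user_count: int          # number of people with access
    regulations: tuple = ()  # e.g. ("HIPAA",) or ("GDPR",)


def risk_score(asset):
    """Add one point per risk factor to the base impact (weights are arbitrary)."""
    score = asset.base_impact
    if asset.moves_frequently:
        score += 1
    if not asset.encrypted:
        score += 1
    if asset.user_count > 10:  # arbitrary threshold for "many users"
        score += 1
    if asset.regulations:
        score += 1
    return score


archive = DataAsset("press_archive", 1, False, True, 3)
patients = DataAsset("patient_records", 3, True, False, 50, ("HIPAA",))

print(risk_score(archive))   # 1
print(risk_score(patients))  # 7
```

An additive score keeps each factor visible and easy to audit. A real policy might instead weight compliance exposure more heavily or treat it as an automatic escalation.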
Classifying data helps to support data security while also improving performance. If you arrive at a set of definitions that meets your business needs, you can make sure that highly sensitive data always has the best possible protection. Then you can focus on improving processing efficiency for low-risk data.