Managing privacy in a Big Data environment can seem overwhelming, yet Data Privacy challenges can be overcome, said Sunil Soares, founder of Information Asset, speaking at the DATAVERSITY® Enterprise Data Governance Online 2017 Conference: “This is not a secret sauce! A lot of this you’re going to do anyway from an Enterprise Data Management perspective.” By using a step-by-step Data Governance plan or “playbook,” a business can ensure that its policies and procedures keep the entire organization in compliance with regulations.
In most companies, the Data Privacy Office, the Information Security Office, and the Data Governance Office tend to operate in silos, he said. “We believe that’s a real challenge. If you look at global Data Privacy, it’s multi-dimensional.” There are multiple subject types, such as customer and employee, as well as emerging data types from sources such as the Internet of Things (IoT) and wearable devices.
“I’m wearing a Fitbit right now and it’s got all kinds of information about how much I walked, how much I weigh, when did I sleep – that’s highly sensitive data. If it’s combined with my personal profile, it certainly has huge privacy ramifications.”
Privacy regulations have been changing rapidly and vary based on jurisdiction, he said:
“You’ve got the EU’s General Data Protection Regulation (GDPR), you’ve got CASL, which is the Canadian anti-spamming legislation, and you’ve got HIPAA, and HITECH from a US health care privacy perspective, so you’ve really got to think through how all these dimensions interact with each other.”
Multiple subject areas, emerging data types, multiple jurisdictions, and rapidly changing regulations create a challenge, he said. “The privacy folks have a pretty good handle on the regulations, and the Enterprise Data Management and Data Governance folks really have a handle on the data.” It’s the combination of these two silos that’s critical across these four dimensions.
Guardrails and Ethics
Soares presented a slide outlining legal “guardrails.” Anti-discrimination laws prohibit uses of mental or physical health information, and in the US, there is specific legislation regarding use of genetic information. Workplace monitoring laws cover privacy, correspondence, employment, social media passwords, and employee access to personal records.
The National Labor Relations Board (NLRB) regulates concerted activities and information collected about employee activities outside of the workplace, and the European Union regulates use of Big Data and Predictive Analytics, he said. Regulations address notice, choice, use limitation, access, correction, integrity, accuracy, quality, minimization, retention, security, monitoring, and enforcement: “Everything from a Data Privacy perspective comes down to the Fair Information Principles (FIPs).”
Ethical considerations based on the values of your organization must always be considered. Soares presented the following questions to keep in mind when formulating policies. “Even if it’s legally allowed for you to use certain information, is it ethical?”
- Is this the type of organization we want to be?
- Is this activity consistent with our core values as a company?
- How would our employees feel if they learned about this activity?
- How would our customers feel if they learned about this activity?
- And more importantly, would we be comfortable if this were on the front page of the newspaper? Would we be embarrassed or would we be able to support our actions?
To put this into practice from a Data Governance Standards and Processes perspective, Soares created a tool he calls a Data Governance Playbook, which consists of 16 steps that serve as the foundation for a Data Privacy program. “The playbook really gets down into parent-child relationship between policies, standards, and controls,” all of which should be customized to fit an organization’s needs, he said. As he went through the slides detailing the different sections of the playbook, he referenced specific articles of the General Data Protection Regulation (GDPR) where applicable.
Data Governance to Support Privacy Compliance: The Playbook
1. Develop Policies, Standards, and Controls
Policies are high-level statements about how data should be handled, similar to a Vision Statement. Standards outline what rules are in play to put policies into action, and controls provide specific instructions about how to implement a standard, he said.
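The parent-child relationship between policies, standards, and controls can be pictured as a small nested data structure. The following is a minimal sketch, not the playbook's actual format: the class names, fields, and example wording are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Control:
    """Specific instruction for implementing a standard."""
    description: str


@dataclass
class Standard:
    """Rule that puts a policy into action."""
    name: str
    controls: list[Control] = field(default_factory=list)


@dataclass
class Policy:
    """High-level statement about how data should be handled."""
    statement: str
    standards: list[Standard] = field(default_factory=list)

    def all_controls(self) -> list[str]:
        """Flatten every control that sits beneath this policy."""
        return [c.description for s in self.standards for c in s.controls]


# Hypothetical example content, not taken from the playbook itself.
policy = Policy(
    statement="Sensitive personal data is protected throughout its lifecycle.",
    standards=[
        Standard(
            name="Data masking standard",
            controls=[
                Control("Mask social security numbers in non-production systems."),
                Control("Legal signs off before unmasked data is shared externally."),
            ],
        )
    ],
)

for line in policy.all_controls():
    print(line)
```

Keeping controls attached to their parent standard makes it possible to trace any control back to the policy it supports.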
2. Create Data Taxonomy
(GDPR Article 9 – Processing of Special Categories of Data)
Collaborate with Enterprise Data Architecture to classify data into consistent categories and subcategories.
3. Confirm Data Owners
Identify who will do the work – Data Stewards or Data Owners – and ensure ownership is documented for clarity.
4. Identify Critical Data Elements (CDEs) and Datasets
(GDPR Article 9)
Soares distinguished between critical data elements (CDE) and critical datasets as a method to prioritize for documentation and governance.
“A critical data element might be a social security number. A critical dataset might be Facebook data, or Fitbit, or Instagram. These are datasets that are tied to regulations or have a major impact on the financial report. They tend to have elements of higher risk.”
Soares presented a method for classifying CDEs. Level One is customer/employee data; Level Two encompasses contacts, identity, geolocation, and web and social media; Level Three further breaks down information from contacts and social media into specific information sources, like chat logs, or email addresses. “When you’re defining your acceptable use standards and your data collection standards, they could be at any of these levels.”
He then walked through an example of how CDEs could be tracked using Orchestra Networks, and pointed out that Excel, Word, or SharePoint would work as well. He continued with some data mapping, identifying some of the places where this sensitive data is sitting. “If you’re a CIO or a Chief Privacy Officer, this is a great place to see an intersection between Enterprise Data Management and Data Privacy.”
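Whatever tool holds it, the three-level classification in this step amounts to a nested mapping. The sketch below is illustrative only: the category names follow the examples from the talk, but the exact nesting (for instance, placing social security number under identity) and the `paths_for` helper are assumptions.

```python
# Hypothetical taxonomy: Level One -> Level Two -> Level Three.
# Category names follow the talk's examples; the exact nesting is assumed.
TAXONOMY = {
    "customer": {
        "contacts": ["email address", "chat log"],
        "identity": ["social security number"],
        "geolocation": [],
        "web and social media": ["social media post"],
    },
    "employee": {
        "contacts": ["email address"],
        "identity": ["social security number"],
    },
}


def paths_for(element: str) -> list[tuple[str, str, str]]:
    """Return every (level one, level two, level three) path ending in element."""
    return [
        (level_one, level_two, level_three)
        for level_one, level_twos in TAXONOMY.items()
        for level_two, level_threes in level_twos.items()
        for level_three in level_threes
        if level_three == element
    ]


print(paths_for("social security number"))
```

Because a standard can be defined at any of the three levels, a lookup like this shows which higher-level categories a given element inherits rules from.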
5. Establish Data Collection Standards
Soares says it doesn’t matter where you manage your collection standards, as long as they’re well-defined. “I recommend you manage it in your Data Governance Tool, but you could document it in Excel or in SharePoint and potentially link that into your Data Governance Tool.”
6. Define Acceptable Use Standards
(GDPR Article 6 – Lawfulness of Processing, GDPR Article 7 – Conditions for Consent)
Different jurisdictions treat acceptable use in different ways, but generally, the key questions become “Is it lawful for us to process this data?” and “Do we have the appropriate consent?” For example, an email address can be used for outbound marketing in the US, as long as the customer hasn’t opted out. In the EU and Canada, he said, it can be used if you have opt-in consent from the customer.
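The email marketing example can be written as a simple rule. This is a sketch of that one example under simplifying assumptions: the jurisdiction codes, the function name, and the deny-by-default behavior for unlisted jurisdictions are hypothetical, and real acceptable use standards would carry far more conditions.

```python
from enum import Enum


class ConsentModel(Enum):
    OPT_OUT = "opt-out"  # use is allowed unless the customer has opted out
    OPT_IN = "opt-in"    # use is allowed only with opt-in consent


# Simplified mapping based only on the email marketing example from the talk.
CONSENT_MODEL = {
    "US": ConsentModel.OPT_OUT,
    "EU": ConsentModel.OPT_IN,
    "CA": ConsentModel.OPT_IN,
}


def may_send_marketing_email(jurisdiction: str, opted_in: bool, opted_out: bool) -> bool:
    """Apply the jurisdiction's consent model to one customer's consent flags."""
    model = CONSENT_MODEL.get(jurisdiction)
    if model is None:
        return False  # unknown jurisdiction: deny by default (an assumption)
    if opted_out:
        return False
    if model is ConsentModel.OPT_IN:
        return opted_in
    return True


print(may_send_marketing_email("US", opted_in=False, opted_out=False))
print(may_send_marketing_email("EU", opted_in=False, opted_out=False))
```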
7. Establish Data Masking Standards
Determine the proper masking protocols based on the level of sensitivity of the data.
“You might hide the data with random characters, which would be masking the sensitive data elements, or mask the identifiers. You might want to de-identify the information by removing certain identifiers, or by anonymizing it completely.”
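Two of the options mentioned here, hiding values behind random characters and removing identifiers, might look like the following. This is a minimal sketch using only the Python standard library; the function names, field names, and sample record are hypothetical, and which technique applies to which sensitivity level is left to the organization's own standard.

```python
import secrets
import string


def mask_with_random_characters(value: str) -> str:
    """Replace each letter or digit with a random one of the same kind, keeping the layout."""
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(secrets.choice(string.digits))
        elif ch.isalpha():
            masked.append(secrets.choice(string.ascii_letters))
        else:
            masked.append(ch)  # keep separators such as "-" so the format survives
    return "".join(masked)


def de_identify(record: dict, identifiers: set[str]) -> dict:
    """Drop the listed identifier fields and keep everything else."""
    return {key: value for key, value in record.items() if key not in identifiers}


# Hypothetical record for illustration only.
record = {"name": "Jane Doe", "ssn": "123-45-6789", "steps_walked": 8042}

print(mask_with_random_characters(record["ssn"]))
print(de_identify(record, identifiers={"name", "ssn"}))
```

Random masking as written is not reversible and does not preserve uniqueness, so it suits test data better than cases where records must still be joined.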
8. Conduct Data Protection Impact Assessments
(GDPR Article 35)
Whenever a new data type is brought into the organization or when working with a new application, it’s important to conduct a data protection impact assessment, he said. “What we’ve seen with many organizations is that the data protection impact assessment refers to certain Metadata,” but the data protection impact assessment itself is sitting in a Word document on SharePoint. There’s no linkage between the impact assessment and the Metadata in the Data Governance Tool, “so connecting the two is valuable,” he said.
9. Conduct Vendor Risk Assessments
(GDPR Article 28 – Processor)
The GDPR requires risk assessments on vendors so that downstream data use is properly protected.
10. Improve Data Quality
(GDPR Article 16 – Right to Rectification)
The right to rectification requires data collectors to give subjects the ability to fix inaccurate information, and it addresses remediation.
11. Data Lineage
(GDPR Article 30 – Records of Processing Activities)
“I would say the biggest [issue] that’s come up lately around Data Privacy is the GDPR article 30 that talks about records of processing activities. What they want to know is, where did this data come from, where is it going, what happens to it along the way, [and] what is the impact?”
The Data Lineage needs to be shown for any data sent to a vendor – not just within the firewalls of an organization, but to any processors, or vendors, or sub-processors, he said. “I need to manage the Data Lineage all the way through.”
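Tracing lineage “all the way through” is essentially a graph walk from a source system to every downstream recipient, including vendors and sub-processors. The sketch below assumes lineage has already been captured as a mapping of system to recipients; the system names and the `downstream` helper are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each system maps to the systems it sends data to.
LINEAGE = {
    "crm": ["data_warehouse"],
    "data_warehouse": ["marketing_vendor", "risk_model"],
    "marketing_vendor": ["email_subprocessor"],
    "risk_model": [],
    "email_subprocessor": [],
}


def downstream(source: str) -> list[str]:
    """Breadth-first walk returning every system that receives data from source."""
    seen = {source}
    order = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for target in LINEAGE.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


print(downstream("crm"))
# ['data_warehouse', 'marketing_vendor', 'risk_model', 'email_subprocessor']
```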
12. Model Governance
(GDPR Article 22 – Automated Individual Decision-Making)
Many privacy laws require organizations to disclose any automated processing and to give subjects access to the results. He cited a white paper written by the Federal Trade Commission (FTC) about privacy impacts of Big Data, defining “disparate treatment” and “disparate impact.” Civil rights regulations prohibit discrimination against individuals in protected classes (disparate treatment), and in most cases, he said, organizations comply. “But what happens when you take that data out and put it into a risk model and then use the outcome for a certain purpose? You might end up with what is called ‘disparate impact.’”
He used an example of a company that wants to reduce churn rates by hiring candidates whose profiles indicate they are less likely to leave.
“What you’ll find generally is that employees who live closer to their place of work are less likely to leave because their commute is shorter,” which could inadvertently lead to redlining certain zip codes that might contain minorities, he said. “Even though zip code by itself is not one of the protected categories, you might end up with disparate impact because you’re discriminating against minorities.”
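One way to see this effect is to compare selection rates across groups after applying a rule that never looks at group membership. The sketch below uses made-up candidates and an assumed commute cutoff purely for illustration; it is not a legal test for disparate impact.

```python
from collections import defaultdict

# Hypothetical candidates: (group label, commute distance in miles).
CANDIDATES = [
    ("group_a", 3), ("group_a", 5), ("group_a", 8), ("group_a", 20),
    ("group_b", 4), ("group_b", 15), ("group_b", 18), ("group_b", 25),
]

MAX_COMMUTE_MILES = 10  # assumed cutoff used by the hiring model


def selection_rates(candidates, max_commute):
    """Share of each group that passes the commute filter."""
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, miles in candidates:
        totals[group] += 1
        if miles <= max_commute:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


rates = selection_rates(CANDIDATES, MAX_COMMUTE_MILES)
print(rates)  # {'group_a': 0.75, 'group_b': 0.25}
```

The filter uses only commute distance, yet the two groups are selected at very different rates, which is the pattern a model governance review would want to surface.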
13. Manage End User Computing (EUCs)
He defined end user computing as anything that’s outside the control of IT. “Think of this as spreadsheets, MS Access databases, Word docs, SharePoint files,” or any file that could contain sensitive data, such as SSN, DOB, he said. Some companies use an inventory or a catalog that pertains to EUCs, some have an honor system done manually, and some use tools that crawl through databases and search for documents containing sensitive data, he said.
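A crawling tool of the kind described here typically looks for patterns that resemble sensitive values. The following is a minimal sketch that flags text containing strings in the common NNN-NN-NNNN social security number layout; the document names and contents are hypothetical, and a real scanner would use richer rules and read actual file formats.

```python
import re

# Matches the common NNN-NN-NNNN layout only; real scanners use richer rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text: str) -> list[str]:
    """Return every SSN-like string found in the text."""
    return SSN_PATTERN.findall(text)


def flag_documents(documents: dict[str, str]) -> list[str]:
    """Return the names of documents that contain at least one SSN-like string."""
    return sorted(name for name, text in documents.items() if find_ssn_like(text))


# Hypothetical end user files, represented here as plain text.
documents = {
    "payroll.xlsx": "Employee 17, SSN 123-45-6789, DOB 1980-02-14",
    "meeting_notes.docx": "Agenda: review Q3 retention guidelines",
}

print(flag_documents(documents))  # ['payroll.xlsx']
```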
14. Govern the Lifecycle of Information
(GDPR Article 17 – Right to be Forgotten)
Companies need to plan for the lifecycle of information, which includes “the right to be forgotten.” These are situations where data subjects – customers or employees – petition you to remove their information, he said. “This is a curious artifact of the EU, but even in the US, having an Information Lifecycle Management program in place with retention guidelines is important.”
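Retention guidelines and erasure requests can both feed a single deletion check. The sketch below is an assumption-laden illustration: the categories, retention periods, and function name are hypothetical, and it ignores the legal exceptions under which an erasure request may be refused.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
RETENTION_DAYS = {"marketing_contact": 365, "employment_record": 7 * 365}


def is_due_for_deletion(category: str, collected_on: date, today: date,
                        erasure_requested: bool = False) -> bool:
    """True if the subject asked for erasure or the retention period has passed."""
    if erasure_requested:
        return True
    limit = RETENTION_DAYS.get(category)
    if limit is None:
        return False  # no guideline defined; left for manual review (an assumption)
    return today - collected_on > timedelta(days=limit)


print(is_due_for_deletion("marketing_contact", date(2016, 1, 1), date(2017, 6, 1)))
print(is_due_for_deletion("employment_record", date(2016, 1, 1), date(2017, 6, 1)))
```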
15. Create Data Sharing Agreements
(GDPR Article 28 – Processor, GDPR Article 46 – Transfers Subject to Appropriate Safeguards)
It’s not enough to have acceptable use standards that only apply within your organization, he said. You also need to know what happens when that data gets propagated outside your organization. “Data sharing agreements are a good way to make sure that you can enforce a level of commitment with the recipients of the data.”
16. Enforce with Compliance Controls
Compliance controls can be set up to ensure that legal signs off on necessary events, for example, or any other contingency where compliance is important, he said.
By using these 16 steps, Soares said, “We’re not trying to be privacy professionals, but as I said earlier, you really need to think about how Data Governance can support privacy compliance,” and the playbook can help make that support a reality.