The Cool Kids Corner: Equity and Privacy

By co-authors Mark Horseman and Karen Lopez

Hello! I’m Mark Horseman, and welcome to The Cool Kids Corner. This is my monthly check-in to share the people and ideas I encounter as a data evangelist with DATAVERSITY. This month, we’re talking about personally identifiable information (PII) and the privacy of that data, but also about the ways organizations use personal data to provide better, more personal service to customers. Specifically, we’ll look at data as it relates to equity and inclusion for under-represented groups. When does our desire to do good in an ethical way run up against the expectation of privacy, which ultimately must be respected? We’ll dive into some examples of how this becomes a grey area, and ultimately we’ll see what the cool kids are saying. 

Early in my career, I was tasked with managing the security and organizational definition of self-identified Indigenous students at a higher-education institution in Canada. We encouraged students to self-identify so that we could analyze retention rates and other “student success” metrics. This wasn’t my first encounter with privacy and security, thankfully, but I quickly had the epiphany (helped along by the wisdom of our Institutional Analysis team) that with a much smaller population, reporting on a small number of students in a program became a privacy risk. While our goal was to improve the student experience for an under-represented population at the university, we had to be careful we weren’t identifying individual students who might have been struggling in a program. 
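One common safeguard for this small-population problem is cell suppression: refuse to report any metric computed over fewer than k people. A minimal sketch of the idea follows; the threshold and the sample data are illustrative assumptions, not the institution's actual rules.

```python
# Minimal sketch of cell suppression for small-group reporting.
# The threshold (K = 5) and the sample data are illustrative only.
K = 5  # minimum group size before a metric may be published

def safe_retention_report(groups):
    """Return retention rates per group, suppressing small cells."""
    report = {}
    for name, students in groups.items():
        if len(students) < K:
            # Too few people: publishing a rate would expose individuals.
            report[name] = "suppressed (n < 5)"
        else:
            retained = sum(1 for s in students if s["retained"])
            report[name] = round(retained / len(students), 2)
    return report

groups = {
    "Program A": [{"retained": True}] * 40 + [{"retained": False}] * 10,
    "Program B": [{"retained": False}, {"retained": True}],  # only 2 students
}
print(safe_retention_report(groups))
# Program A reports 0.8; Program B is suppressed rather than exposing
# two identifiable students.
```

The same check has to run on every report and dashboard, not just the source table, or the suppression can be undone downstream.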

This set my brain on the path of thinking about equity data and where our desire to do good can run up against people’s expectations of privacy. Extend this to other verticals: for example, a marketing campaign based on population segmentation by area gets into this grey area very quickly. It’s difficult to stay ethical while protecting privacy, advancing equity, and doing good for society all at once. Imagine an insurance vendor that markets renters’ insurance to a set of geocoded locations known to rent, not realizing that the population selected for that campaign over-selects from a minority group. The reputational risk of not being vigilant about how data is used is real.  

As I have been thinking about this for many years, it naturally came up in conversation with my good friend Karen Lopez, who is an authority on privacy, security, and data modeling. In a bit of a departure from the normal Cool Kids format, I asked Karen to write her thoughts below. 

I love this topic! Speaking of higher education, I remember that in university our grades would be posted outside our professor’s door, listed by our Student ID, which was also our social security number (SSN). In those days, SSNs were issued locally, and one could tell approximately where someone grew up from the first three digits of their number. If you were in a small class, you could pretty much figure out half the class’s grades based on their SSN listing. As a special note, foreign students would get an artificial SSN starting with “9.” That made tracking the grades of foreign students easier too. 

Of course, now we don’t use SSNs as other IDs (if you still do, STOP). However, there are other ethical conversations to be had about the use of obscured data and how one might derive the underlying data just by matching it with other data. In one case, the masks on data across several systems were applied differently, so one could pull together three reports to see the complete data. 

Good Data Governance requires us to govern data across projects, keeping an eye on what each project is doing to ensure we aren’t leaking data the way my TSA-safe liquid containers do on every trip. Many people see Data Governance groups as just a “get into production with a checkmark” function. We know better. It’s not just about compliance; it’s our ethical mandate to govern our data so that we do no harm. That requires checking reports, not just persisted data. It requires reviewing data uses to confirm they comply with whatever consent we collected. It requires diligence. A few things we can do: 


  1. Ensure that management knows that we are trying to protect data from all kinds of harms. 
  2. Ensure that management knows we are trying to manage ROI – the Risk of Incarceration (theirs). 
  3. Be ready to escalate issues with how data is collected and used. 
  4. Be ready to call “stop all work on this” even if you don’t have the authority to stop work. 
  5. Understand how some data protection methods only work if they are used everywhere the data appears. 

Ethical Data Governance gives us the responsibility to safeguard the data of the world; we have to be able to call out when harm is imminent. 

Check out what Karen Lopez is up to: 

Remember that you can meet and join Cool Kids like Karen Lopez at DATAVERSITY events (use code COOLKIDS to save 15%): 

Want to become one of the Cool Kids? All you need to do is share your ideas with the community! To be active in the community, come to DATAVERSITY webinars, participate in events, and network with like-minded colleagues. 

Next month, we’ll be taking a look at data leadership!