The Larger Stakes Behind Big Data Ethics

August 20, 2014

by James Kobielus

Ethics isn’t just a bunch of guidelines to help us refrain from spying on and stalking each other. But you wouldn’t know that if you’re following popular discussions of so-called “big data ethics.” As far as most commentators on the subject are concerned, the ethical dilemmas surrounding big data can all be nailed to a single issue: privacy.

That’s reductionist in the extreme, especially when you consider the wider perspective. For starters, ethics is an ancient branch of philosophy that established itself long before privacy, as currently understood by modern Westerners, emerged as a cause célèbre. Also, privacy issues surrounding data, analytics, and decision support existed long before what we now term “big data” came into vogue. Allowing privacy to exclude other ethical dimensions from discussions around big data tends to distract from the broader value context of practices that have been stigmatized.

In the broader context, ethics is an investigation into the normative principles that guide human decisions. This plays, of course, into the central purpose of most business analytics applications: data-driven decision support. These applications go by various names, including performance management, prescriptive analytics, next best action, recommendation engines, and the like. What they all have in common is a focus on how data-driven applications clarify the “should vs. shouldn’t” choices that guide human action.

Clearly, ethical decision support has as many application domains as there are facets to our lives, and as many root principles as all the world’s religions and philosophies combined. As this Wikipedia entry makes clear, there are business ethics, medical ethics, science ethics, political ethics, military ethics, and on and on. Some of these may make you snicker, inasmuch as you believe that the people in these fields of endeavor are unprincipled and self-serving. But even a casual glance at this list should convince you that privacy considerations, though integral to many of these domains, are far from the sole focus.

To speak of “big data ethics” implies that “big data” is an application domain, a la business or medicine. But it’s not. Speaking of “big data ethics” is a bit like talking about “mathematics ethics.” We can’t attribute ethical sensitivities to any big data analytics practice until we specify the sphere of application. Indeed, big data is all about deriving differentiated insights from advanced analytics on data at scale, regardless of the application domain under consideration.

We can speak meaningfully about big data applications in finance, engineering, manufacturing, logistics, and other domains that barely touch on privacy issues. Certainly, we can speak of the larger ethical issues associated with the application of big data in any of these domains. Many of these issues might cross over into domains in economic philosophy, such as fairness, equity, distributive justice, and utilitarianism. But they’re ethics issues at heart because they concern the normative principles that should guide legislators, regulators, judges, business executives, and others in positions of authority.

But to give privacy advocates their due, it’s true that many business big data initiatives focus on an application domain that raises privacy concerns: customer relationship management (CRM). Indeed, you could substitute the term “scalable CRM analytics” for “big data” to describe the precise application domain that has privacy advocates up in arms. CRM covers the principal customer-facing applications: marketing, sales, and customer service. So, essentially, it makes more sense to speak of the ethics of scalable CRM analytics applications than to posit some non-domain-specific concept of “big data ethics.”

In that light, it’s true that customer data is the heart and soul of many big data repositories, as well as online transaction processing systems, data warehouses, and data-mining repositories. Most of these databases hold customer records that constitute personally identifiable information. Hence, they are indeed privacy-sensitive and should be managed in compliance with all applicable laws and regulations.

It’s in this limited context that I agree with the authors of this recent article, who state that the core ethical issues at stake in many big data analytics initiatives are fourfold: privacy, confidentiality, transparency and identity.

And it’s certainly true that the larger CRM issues at stake are indeed, as they state, “money and power.” These are the larger stakes behind “big data ethics,” as it’s commonly understood in the CRM arena.

As I stated in a recent post about utilities’ smart-metering initiatives, data is often a principal weapon in the balance of power between consumers and businesses, and also between business competitors. “Privacy is a big concern [in smart metering],” I stated, “in large part because visibility into your energy-utilization patterns can give others insights into when, where, and how you’re using that energy–in other words, into every aspect of your lifestyle. The same concern applies to access to data on commercial energy consumption. If your competitor can see that you’re using twice as much electricity in your factory now as you did last quarter, they can in principle figure out the extent to which you’ve increased production and, through that datum, how fast your booked business is growing.”

Target marketing is another privacy-sensitive scalable-CRM analytic application that has a direct bearing on the balance of money and power in business-to-consumer relationships. The power to aggregate, correlate, analyze, and act on larger pools of consumer data gives some merchants an advantage in acquiring us as customers, holding onto us, and selling us lots of stuff. Lack of those capabilities may spell the difference between life and death for businesses that could otherwise have served consumers well. And consumers suffer if it becomes excessively difficult and costly to switch to alternative vendors for the things upon which they depend most.

When you transpose this issue from the consumer to the business realm, it’s not really about privacy as normally understood. The larger ethical issue concerns the principles governing control over the data resources upon which so much of modern life depends.

At heart, ethics separates the shoulds from the shouldn’ts. Who should be able to own, access, use, and profit from the data in our lives? Who shouldn’t? For what purposes? And to whose benefit?

About the author

James Kobielus, Wikibon, Lead Analyst

Jim is Wikibon's Lead Analyst for Data Science, Deep Learning, and Application Development. Previously, Jim was IBM's data science evangelist. He managed IBM's thought leadership, social and influencer marketing programs targeted at developers of big data analytics, machine learning, and cognitive computing applications. Prior to his 5-year stint at IBM, Jim was an analyst at Forrester Research, Current Analysis, and the Burton Group. He is also a prolific blogger, a popular speaker, and a familiar face from his many appearances as an expert on theCUBE and at industry events.

  • Many of the people who are writing about the ethics of big data would agree with you that privacy is only one facet of the issue. Beyond the questions that you end with, there are more: for example, whose data gets included? Who often gets left out of “big data” pictures? Academics and other researchers are dealing with the “larger stakes,” too–you just have to be willing to join that conversation.

  • Well said. We similarly wrestled with definitions in our Big Data Ethics paper. When should one refer to “big data” or to “big data analytics”? What is meant by “ethics,” “privacy” and “identity”? We settled on four principles of Privacy (redefined as information rules), Confidentiality, Transparency and Identity to try and expand the debate past privacy (as right to be left alone) and begin thoughtful dialog like you bring forward here.
