Is Data the Achilles Heel of AI?

By Amy Brown

As Benjamin Franklin once said, “Nothing is certain except death and taxes.” Add this 21st-century irrefutable fact: The world can’t live without data. With its numbers, characters, facts, and statistics – all collected, stored, and analyzed – data has become an irreplaceable facet of daily life.

We use data to identify strengths and weaknesses. It helps businesses establish baselines, benchmarks, and goals to drive growth. It empowers people to make informed decisions, formulate strategies, and solve complex problems. And it enables leadership to understand behavior and address needs.

But organizations face a serious challenge: data silos. A Forrester study showed that cross-organizational, external, and internal data silos slowed machine learning (ML) deployments and outcomes, with 38% of respondents agreeing they needed to break down data silos across the organization and partners.

Enter artificial intelligence (AI), offering potential insights and opportunities across all industries, including health care – and the power to desilo data. But organizations often get caught up in the idea of deploying AI and then fail to take advantage of it.

In legend, Achilles was nearly invulnerable. Yet his one exposed heel left the mighty warrior vulnerable in battle. Is AI similarly vulnerable, or can organizations protect themselves from the challenges posed by valuable untapped data?

Data: Dangerous Obstacle or Opportunity?

More than one industry oracle has predicted that data could become AI’s Achilles heel. Data literacy lags because many companies lack the people and financial resources to review business challenges and identify opportunities for leveraging data to resolve those challenges.

But even those companies with resources to be data-literate face other battles. Data silos, for example, inhibit AI from doing its job. Cross-organizational, external, and internal data silos can slow ML deployments and outcomes. With silos in place, the ability to derive actionable insights decreases, inhibiting operationalization. Companies need to break down these silos across an organization and its partners.

Data transparency is also difficult but necessary to achieve. Without transparency, executive leadership and other decision-makers struggle to see the ROI after their companies adopt AI/ML solutions – and without a clear connection, buy-in decreases. Without transparency, it’s also more difficult to safeguard against bias, misguided information, and potential harm resulting from algorithms that inherit bias from their training data.

But What About Dark Data?

As the name implies, dark data is stealthy: while it’s stored securely, it’s rarely used for other business purposes. Several factors contribute to its inaccessibility and the difficulty people have in analyzing and using it. It’s typically high volume, exists in a variety of file formats, and is generated at such a high velocity that no one human can manage it.

Working with diverse, messy (and large) dark data sets isn’t easy, but is it an Achilles heel? It doesn’t have to be. 

Tapping into Dark Data

According to Gartner, dark data comprises the majority of an enterprise’s information universe. Companies able to identify the data they have, store it securely while maintaining accessibility, and tap into it to gain and leverage insights to benefit the business position themselves well to counteract any data weakness. 

But because dark data lives in company archives, it becomes too easy to ignore or forget, like Achilles’ overlooked heel. Unstructured data hasn’t become dark – yet. But because it’s often difficult to access, analyze, and understand, it doesn’t take much for it to age, grow dark, and become obsolete.

Organizations must bring meaning, structure, and visibility to that unstructured data – and that’s where AI and ML shine, demystifying and unlocking data. Companies can more effectively manage their data by:

  • Reviewing existing data inventory generated from current operations and processes, identifying and deleting stale data that won’t generate any valuable insights
  • Identifying and establishing the context where the data was generated
  • Structuring and analyzing unstructured data via natural language processing (NLP) and other automation tools to extract insights
  • Optimizing processes to streamline and reduce places where data is generated and stored
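
The first step in the list above, reviewing an inventory and flagging stale data, can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the `DataAsset` record, the `find_stale` helper, the sample inventory, and the roughly two-year cutoff are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DataAsset:
    """Hypothetical inventory record for one stored data set."""
    name: str
    last_accessed: date


def find_stale(assets, today, max_age_days=730):
    """Return assets not accessed within max_age_days (an assumed cutoff)."""
    cutoff = today - timedelta(days=max_age_days)
    return [a for a in assets if a.last_accessed < cutoff]


# Illustrative inventory; names and dates are made up for the example.
inventory = [
    DataAsset("call_recordings_2019", date(2019, 6, 1)),
    DataAsset("support_emails_current", date(2024, 1, 15)),
]
stale = find_stale(inventory, today=date(2024, 3, 1))
print([a.name for a in stale])  # ['call_recordings_2019']
```

In practice, whether a flagged data set is deleted or kept would depend on retention rules and on whether it could still yield insights, so a list like this is a starting point for review, not an automatic deletion queue.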

Helping Humans Understand Humans

Conversational data helps humans make sense of what’s most important. AI enables organizations to sift through the noise to find themes, patterns, and important or urgent things that matter to humans. 

Mining unstructured data isn’t easy. With all its formats (or lack thereof), it poses quite a challenge to harness. But information like unstructured conversations hiding in recorded phone conversations, emails, files, or other communications yields rich insights to help organizations:

  • Understand the voice of their customers and employees for unsolicited feedback
  • Identify disruptions, recurring patterns, and positive trends
  • Smash silos to build cross-functional teams sharing one source of data insights
  • Pinpoint pain points and training opportunities
  • Make data-backed decisions supported by statistically significant sample sets
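
To make "recurring patterns" concrete, here is a deliberately simple Python sketch that counts terms appearing in more than one message. It is a toy stand-in for real NLP, not the kind of domain-trained AI discussed in this article; the `recurring_terms` helper, the short stop-word list, and the sample messages are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "to", "my", "is", "i", "and", "was",
              "for", "on", "about", "again"}


def recurring_terms(messages, min_messages=2):
    """Count how many messages mention each term; keep terms seen in at least min_messages."""
    counts = Counter()
    for text in messages:
        # Use a set so a term counts once per message, however often it repeats.
        words = set(re.findall(r"[a-z']+", text.lower())) - STOP_WORDS
        counts.update(words)
    return {term: n for term, n in counts.items() if n >= min_messages}


# Made-up sample messages for the example.
messages = [
    "My refill was delayed again",
    "I waited on hold to ask about my refill",
    "The billing portal is confusing",
]
print(recurring_terms(messages))  # {'refill': 2}
```

Even this crude count surfaces "refill" as a repeated topic, which hints at why richer, context-aware analysis of conversations can reveal pain points that no one reported directly.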

The most effective way to extract meaning from unstructured data is to use a domain-specific, vertically and expertly trained AI that focuses only on highly sensitive, highly complex conversations. When you can extract meaning for users in a highly contextualized, relevant way, dark data doesn’t become an Achilles heel.

The “Real” Trojan Horse

Perhaps the real Trojan Horse is ChatGPT (and others like it). These complex chatbots are trained on an enormous dataset drawn from myriad documents on the internet, whether the data is accurate or not. But while these bots have value for answering some questions for humans, the problem remains that those conversations generate more unstructured data silos. Organizations seeking to improve will still need to bring structure, visibility, and meaning to their unstructured data.

The most effective, advanced solutions will focus on domain specificity – like the health care space – and leverage highly accurate, reliable training data. The arrival of ChatGPT and other bots won’t change the fact that human leaders in corporate America are empowered and entrusted to understand the customers they serve – and there are consumers (like health care patients) desperate to be heard and understood. Whether communicating via chatbot, interactive voice response (IVR), email and text, or old-fashioned human-operated call centers, they’re communicating, nonetheless.

If dark data is indeed the Achilles heel of AI, humans can still better leverage AI by championing the effective use of unstructured data to inform and develop actionable insights that drive business outcomes.