Citizen Data Scientists: Where Do They Belong?

The phenomenal growth of data technology in recent years has led to the rise of the citizen data scientist (CDS). Developments in augmented analytics and artificial intelligence (AI) automation have made it possible for ordinary business employees to conduct advanced analytics or business intelligence (BI) tasks that would have required expert knowledge only a few years ago.

Gartner research VP Alexander Linden describes a citizen data scientist as someone:

“Who creates or generates models that use advanced diagnostic analytics or predictive and prescriptive capabilities, but whose primary job function is outside the field of statistics and analytics.”

According to Gartner, automated business analytics solutions and tools powered by AI and machine learning (ML) are likely to expand the community of CDSs this year. Augmented analytics is currently a popular buzzword across the global industry, and it is pushing the “unicorns” toward high-profile, special projects. Five years from now, you may see CDSs handling all of the routine and even some advanced analytics tasks with automated analytics platforms, while data scientists get the cream-of-the-crop projects. You might also see ML used at a scale where small and mid-sized businesses have gained access to ML-powered, augmented analytics platforms that “amplify” human knowledge and skills. This signals an era of heightened human-computer collaboration.

A webinar, Designing a Successful Governed Citizen Data Science Strategy, which aired in March 2019, shares proven strategies for beginning the “journey into citizen data science with existing analytics talent.”

Who are the largest beneficiaries of the current enterprise analytics trends? Data scientists, of course, as they are already in short supply. With CDSs handling all routine Data Modeling and analytics tasks, data scientists will be free to pursue more complex projects.

In a recent IT Briefing, the author reminds readers that Gartner’s Predicts 2017: Analytics Strategy and Technology report forecast the following:

  1. Data Science “will explode through automation by 2020.”
  2. By 2020, well over 40 percent of Data Science tasks will be automated.
  3. By 2020, Data Science solutions will cater to citizen data scientists (CDSs) to enhance vendor reach across the enterprise and to reduce enterprise analytics skill gaps.
  4. The CDSs will most likely “outnumber the amount of data scientists for the amount of analysis produced.”
  5. The CDSs will have a unique role to play in this environment, connecting the self-service analytics platform with highly advanced analytics conducted by data scientists.

You are probably witnessing the outcome of all those predictions now.

The Subject Matter Experts (SMEs), Data Scientists, and the Citizen Data Scientists: A Natural Team

In a Forbes post, the author explains that data scientists, CDSs, and SMEs complement each other in advanced analytics. While SMEs provide the “context” for advanced data analysis, CDSs use ready-made tools to “create more ‘ah ha’ moments using data, algorithms and models.” However, CDSs cannot fulfill the requirements of highly complex analysis, and data scientists must be present to contribute to advanced Data Modeling activities. As data scientists become increasingly expensive and scarce, only a few may be available in the future to support large teams of SMEs and CDSs.

The Data Scientist vs. the Citizen Data Scientist

With CDSs growing “five times faster in numbers” than data scientists, there is a clear indication that automated ML packages and augmented analytics platforms are working. Author Kartik Patel, in Analytics Translator vs. Citizen Data Scientist: What is the Difference?, reiterates Gartner’s point that the CDS creates models for predictive or prescriptive analytics, but that this person’s main job function lies outside the realm of analytics or statistics. The CDS is not trained to be a computer scientist or data analyst, but advanced data technologies and tools empower this role to analyze data and make data-driven business decisions. The growing number of self-serve analytics and BI platforms with automated data preparation, built-in ML models, and augmented analytics capabilities enables the CDS to conduct routine analytics and BI tasks without a data scientist present.

Here are the major differences between the two roles:

  • CDSs can be trained using technology to handle routine analytics tasks while data scientists have additional skills that allow them to tackle challenging analytics projects.
  • In a theoretical sense, the CDS stands for data democracy whereas the trained data scientist stands for controlled data access.
  • CDSs have limited working knowledge of Data Science, whereas the applied knowledge and experience of a trained data scientist is indispensable.
  • Citizen Data Science “augments Data Discovery and simplifies Data Science.”

The Blending of Citizen Data Scientists with Strategic Data Management

A good way to understand the collaboration between data scientists and CDSs is probably to visualize a close integration between the best Data Management platforms, the wisdom of data scientists, and the functional knowledge of CDSs. As CDSs gain more data-related knowledge, they will depend less on data scientists. Automated data platforms will augment CDSs’ business-process knowledge to facilitate superior Data Management tasks. Specialized data scientists will shrink in number, and only a few will be available to an enterprise for “validating and operationalizing models, findings, and applications.” Gartner recommends developing the appropriate supportive infrastructure to make the collaboration between data scientists and CDSs work successfully.

A Datanami author argues for humanized machine learning (ML). This concept highlights the use of “augmented intelligence” in a human team of business analysts. In a humanized ML platform, human data scientists work side by side with ML tools to “explore their data and easily deploy models to unlock the value it holds.” The author of this post, Nathan Korda, Director of Research at the University of Oxford’s ML spin-off, Mind Foundry, explains how human data scientists will be able to “input, cleanse, and visualize data in minutes” for further data exploration. This kind of augmented analytics platform may enable business owners to connect ML capabilities to actual business value. In addition, the growth of CDSs is allowing smaller businesses to harness the power of ML for profitable business insights.

The Future of Citizen Data Scientists in Business Intelligence

In the world of BI, the CDS has already played a complementary role to data scientists. An e-book, The History of Business Intelligence and its Evolution, reveals how organizations are gradually trusting CDSs to use automated BI tools for data-driven insights. In other words, CDSs have been empowered to do “cool things with data” without having a sound knowledge of statistics or mathematics. Modern AI-powered BI platforms and tools facilitate advanced analytics tasks via automation. The e-book explains that one benefit of developing a “large pool of citizen data scientists” who have access to business data is that it creates an “opportunity for user collaboration to create shared, immersive analytic experiences.” In the near future, CDSs will play a leadership role by influencing their fellow employees to enhance organizational Data Literacy and analytics maturity.
