As data continues to grow in volume and variety, and businesses become more data-centric, the decision-making power of advanced algorithms is now welcomed in enterprises of all sizes. The average business user can now carry on even when a knowledgeable Data Scientist is not around: technology innovations are increasingly equipping ordinary staff with tools to conduct analytics on the fly and extract insights. With Artificial Intelligence (AI) and Machine Learning (ML) gaining prime attention in the Analytics and BI markets, the traditional roles of Data Scientists are about to change, as discussed in the DATAVERSITY® article Will Data Scientists Automate Themselves Out of Jobs?
In popular industry literature, there has been much discussion about Data Scientists soon becoming obsolete. However, some fundamentally wrong assumptions have led industry reviewers to jump to such conclusions. Data Scientists bring a bundle of skills: computer science, programming, mathematics, statistics, and domain knowledge. These skills are not easy to replicate through automated tools. Moreover, real-life Data Science projects require “collaboration,” which cannot happen without human intervention.
Recent KDnuggets poll results, shared in the post titled Data Scientists Automated and Unemployed by 2025?, reveal that about 51 percent of respondents believe that the full automation of Data Science will happen within the next 10 years. However, about 25 percent of respondents think this change will either take 50 years or never happen.
Sebastian Raschka, researcher of applied Machine Learning and Deep Learning at Michigan State University, thinks that the future of Data Science does not point to machines taking over from humans, but rather to human data professionals embracing open-source technologies.
It is commonly understood that future Data Science projects, thanks to advanced tools, will scale to new heights, where more human experts will be required to handle highly complex tasks efficiently. However, according to McKinsey Global Institute (MGI), the next decade will witness a sharp shortage of around 250,000 Data Scientists in the U.S. alone. The question is whether machines can ever enable seamless collaboration between technologies, tools, processes, and end users. Automated tools and assistants can help the human mind accomplish tasks more quickly and accurately, but machines cannot be expected to substitute for human thinking. The core of problem-solving is intellectual thinking, which no machine, however sophisticated, can replicate.
Widespread ML Automation is Inevitable in the Near Future
What the current generation of Data Scientists cannot escape is the all-pervasive automation of ML-powered business systems, where many laborious human tasks will be routinely handled by tools or bots. As far as Data Scientists are concerned, that is good news, because human minds will be left free to pursue complex problem-solving.
Forrester’s report Massive Machine-Learning Automation is the Future of Data Science implies that although modern organizations are overjoyed by Machine Learning systems that reveal actionable insights, predict customer behavior, and aid better decision-making, these systems are too often hard to crack. The general understanding is that as ML starts delivering automated models, the learning curve for Analytics users will shorten substantially.
Many businesses either cannot afford to keep a sufficient number of Data Scientists or simply cannot find experts with the right balance of skills. In such scenarios, automated Analytics and BI platforms will empower “skilled information analysts” to conduct daily Data Science tasks. This will mean broader access to data sources, data types, and Analytics capabilities.
IT and Data Education Predictions: Democratizing Tech Skills suggests that while automated systems partially close the supply-demand gap in the field of Data Science, academia has to keep pace with technology vendors to turn out qualified Data Science professionals in the future.
The Golden Era of Citizen Data Scientists is Imminent
Widespread automation of business Analytics and BI systems will encourage more business users to pursue data-technology tasks on their own. Gartner Says the Age of the Citizen Data Scientist Is Dawning states that this automation brings huge financial relief to businesses. Data Scientists typically cost a lot, so getting the work done with fewer “unicorns” and more automated tools will be a welcome change.
Alexander Linden, the Research Vice President at Gartner, says:
“Making Data Science products easier for citizen Data Scientists to use will increase vendors’ reach across the enterprise as well as help overcome the skills gap. The key to simplicity is the automation of tasks that are repetitive, manual intensive and don’t require deep data science expertise.”
In keeping with this belief, Gartner’s review of Self-Service Analytics makes a firm prediction: that Citizen Data Scientists will produce more Analytics than the real experts by 2019. However, this press report also warns that the success of “self-service” will depend heavily on the robustness of Data and Analytics Governance.
Self-service, by definition, implies free-form exploration of data, which only a highly flexible yet effective governance framework may be capable of handling. This is where the Data Scientists of the future will come in. They will introduce ordinary business users to self-service through formal “on-boarding” programs.
Are Data Scientists Needed in the Self-Service Analytics World? suggests that only a qualified Data Scientist can unravel the mystery hidden behind the flashing dashboards. The average business user, while capable of simple filtering or grouping of data, will never be able to conduct advanced data visualization.
Automated Systems Cannot Replace Data Scientists
Why Automation Won’t Replace Data Scientists Yet from Cloud Computing News confirms that Data Science is set to be the primary differentiator for business success, and thus all major Data Analytics vendors are now focusing on simplifying their systems for broader and faster adoption.
What does that mean for Data Scientists? If systems perform the rote tasks of data cleaning, data integration, and basic data modeling, then the Data Scientists will have plenty of time to concentrate on complex algorithms that machines cannot deliver.
Which Data Science Tasks Cannot be Automated?
There is an impressive list of tasks, such as Data Cleaning, Data Integration, and routine Data Modeling, that ML-enabled systems are handling well. Yet there is a lot more to Data Science. Take the example of Data Wrangling, which involves converting raw data into a machine-readable form. This process requires keen human judgment, which machines cannot be trusted with. The VentureBeat post 4 Reasons Bots Won’t Replace Data Scientists Anytime Soon discusses why Data Wrangling is not a machine-driven task.
Another good example is Data Visualization, where a data expert guides C-Suite executives or other business users through personal interpretations to arrive at good decisions. Data Interpretation and Visualization are still very much the domain of Data Scientists.
The Innovation Enterprise article Expert View: Can Data Science Be Automated? provides an insider view of the future Data Science industry, and includes a curated set of opinions from industry leaders.
Major Trends in Data Science for 2018
A blog post from Data Science predicts that the Data Science projects implemented in global enterprises this year will be more complex, but more collaborative in nature. The inherent nature of these projects will necessitate these changes in the enterprise:
- The Chief Data Officer (CDO), who is a seasoned Data Scientist, will feature in every organization to design, develop, and manage the enterprise-wide data strategy. The CDO will directly report to the CEO of an organization.
- Data Scientists in enterprises will embrace open-source tools to pursue day-to-day Data Science activities.
- AI projects will turn attention to crowd-sourced data for enabling useful solutions like road-accident prevention and flood warnings, which means Data Scientists will be required in the teams.
- Data Security regulations (GDPR) will be at the forefront of enterprise operations. Data Scientists will have to be trained in GDPR requirements for companies to stay in business.
Forrester states that by 2020, data-driven businesses will be “collectively worth $1.2 trillion, up from $333 billion in 2015.” Governing and managing these huge data troves will require the participation of seasoned Data Science professionals. Thus, Data Scientists are here to stay and take on new challenges.