by Charles Roe
The reports are complete and the data is out there for everyone in the industry to read: there is an increasing shortage of data professionals trained with the analytical skills necessary to effectively deal with Big Data and its offspring. According to the now oft-quoted 2011 McKinsey Global Institute (MGI) study, the USA will face a shortage of well over 140,000 analytics professionals by 2018, and that number increases to over 1 million when managers/administrators are added into the equation.
Data is not going anywhere but up. The levels of data humans are creating on a daily basis are growing at near exponential rates due to the growth of unstructured formats, social media, and smart devices, along with traditional relational structures.
In a recent DATAVERSITY™ interview, Deepak Advani, IBM’s Vice President of Business Analytics Products & Solutions, addressed the problem of Big Data:
“The 2.5 quintillion bytes of ‘Big Data’ we produce every day is everywhere – it is in every industry and touches all layers of a business, from the C-suite to entry-level. And this information overload shows no sign of stopping. In fact, IDC predicts total data volume will reach 35,000 exabytes in 2020, compared to 1,200 exabytes in 2010, representing a 29 fold increase over ten years.”
Businesses worldwide need to be able to capture, organize, examine, and evaluate all that data so that they can gain competitive advantages in their particular market sphere, develop more efficient operations, increase customer satisfaction, deepen customer relationships, gain new insights and opportunities, and ultimately utilize all those terabytes, exabytes and zettabytes effectively. According to Brian Hopkins at Forrester, “[W]e estimate that firms effectively utilize less than 5% of available data.” Much of this underuse is due to a lack of trained employees who can deal with such massive volumes of data proficiently.
Analytics and Data Education
Universities around the world are now moving forward with new Master’s Degree programs in Analytics and other related disciplines. IBM has been working with many institutions to help train their students with the requisite skill sets to move into the world of Big Data after graduation. As stated by Mr. Advani:
“Since 2002, IBM has nurtured an academic program by partnering with more than 6,000 universities across the globe to help meet this demand and keep the evolving global workforce strong. In the area of analytics, specifically, IBM is currently working with more than 200 academic organizations globally to expand and strengthen analytics curricula to meet the growing demand of highly skilled analytics business workers of the future. What’s unique about IBM’s initiatives is that we are working with Business Schools to develop new course curricula and merge business and IT skills.”
Such schools in the United States include the University of Rochester Simon School of Business, Northwestern McCormick School of Engineering, North Carolina State Institute for Advanced Analytics, Rutgers Discovery Informatics Institute, University of Louisville, Rensselaer Polytechnic Institute, and Yale School of Management.
The push is focused on giving students practical experience in the field of analytics as well as the business skills necessary to succeed in the contemporary world of Big Data. Success in the new Data Management industry requires students to have many seemingly disparate skills: mathematics and statistics, an understanding of business operations, advanced computer programming, data mining, economics, complex reasoning, and data modeling. Universities are working with IBM, as well as other companies like SAS Institute and Teradata, to properly train students for jobs in Big Data. According to Diego Klabjan, associate professor and director of Northwestern’s Master of Science in Analytics (MSiA), “[F]or anyone with even basic knowledge of Hadoop and Cassandra, getting a job is a piece of cake and there is a shortage of people with deep knowledge of these things.”
A Discussion of Curriculum
A quick look at the coursework of any of the new Analytics programs clarifies what the push into the future of Big Data is all about: analytics and data science. Incoming students will most likely hold Bachelor’s Degrees in Computer Science, Mathematics, or similar fields, but many of the new programs are attempting to entice Business and Economics students as well, since the industry needs professionals who have strong backgrounds in both areas. Northwestern’s MSiA program covers all areas of analytics, including prescriptive, descriptive and predictive, plus other courses such as:
- Statistical Methods for Data Mining
- Decision Analysis
- Introduction to Databases
- Data Management (which includes information on ETL, Governance, Stewardship and Metadata)
- Big Data Analytics
- Data Mining
- Possible electives cover a range of areas such as Social Networks, Analytics for Finance, Supply Chain Management, Healthcare, Energy and so forth
Rutgers Graduate Certificate Program in Computational and Data-Enabled Science and Engineering (CDS&E) “is a cross disciplinary graduate program administered by RDI2 and the Master of Business and Science (MBS) program. The goal of the program is to provide the necessary structures, learning opportunities, and experiences, beyond the more traditional university curriculum, that are necessary to drive science, engineering, and business using advances in cyber infrastructure.”
Rutgers also offers a Graduate Certificate in Discovery Informatics as well as the Master of Business and Science (MBS) in Discovery Informatics and Data Sciences. Each of those programs was created to cover both ends of the enterprise spectrum – the business and the technical. A few of the courses for the various programs (though each differs significantly) include:
- Fundamentals of Analytics
- Machine Learning
- Database Design and Management
- Parallel and Distributed Computing
- Programming Methodologies for Numerical Computing and Computational Finance
- Principles of Communication & Professional Development for Science & Technology Management
- Ethics for Science & Technology Management
- Principles of Finance and Accounting
The University of North Texas recently started the iCAMP (Information: Curate, Archive, Manage, Preserve) project focused on “educating librarians and researchers for digital curation and Data Management.” The program shares many of the emphases of the Northwestern and Rutgers programs listed above, but with a primary focus on preparing data management professionals who want careers in the academic world. The first courses started in early June 2012 and include:
- Fundamentals I – Cyberinfrastructure Fundamentals and Data Management
- Fundamentals II – Technology Infrastructure, Tools, and Applications for Digital Curation
North Carolina State’s Master of Science in Analytics (MSA) “is an integrated curriculum” that focuses on deep analytics skills along with coursework and skill building necessary for the real world of Data Management (the points listed below are areas covered in various courses):
- Marketing Science and Customer Analytics
- Teamwork and Conflict Resolution
- Technical Writing
- Risk Analytics
- Fraud Detection
- Legal Issues and Responsibilities
- Linear and Matrix Algebra
- Consulting Skills
The four universities detailed in this article are only a few of the many hundreds worldwide that are adding new degrees, new coursework, and new concentrations to their programs in an attempt to train and educate data professionals in the modern exigencies of Data Management, Data Science, Analytics, and Big Data. These programs are transforming their curricula so that the entire gamut of required aptitudes is covered. “As universities prepare students for future job opportunities, it is important to look beyond traditional computer science and engineering courses,” said IBM’s Deepak Advani. “Rather, incorporating analytics and evidence-based reasoning across all areas of business ranging from marketing to economics and brand development to entrepreneurship allows students [to] combine their technical savvy and business skills.”
As the prognostications of industry experts keep coming in, as the world’s data expands to over 7.9 zettabytes by 2015, as enterprises begin to utilize more than the 5% of available data that Brian Hopkins and Forrester currently estimate, and as more governments understand the need for Big Data Initiatives that, by McKinsey Global Institute’s estimate, could save them more than $100 billion per year in administrative costs, the jobs for well-educated, trained, and highly skilled data professionals are going to expand as well. The shortage of 140,000 to 190,000 analytics professionals may increase twofold or even threefold over the next two decades. Universities and high tech companies are working together to fill the gaps.
The National Science Foundation has taken up the call as well, and recently awarded UC Berkeley a $10 million grant to continue its work on Big Data analysis. The Cloud and Autonomic Computing (CAC) Center now counts many universities among its members, including the University of Florida, the University of Arizona, Rutgers (the State University of New Jersey), and Mississippi State University, as well as a host of top technology companies and organizations such as Intel, Microsoft, Raytheon, Xerox, the US Army Corps of Engineers’ Engineer Research and Development Center (ERDC), and Avirtec. More initiatives and programs are created all the time, and new cognitive systems are being built and perfected. While Big Data might scare some people right now, with so many resources and so much focus, it’s only a matter of time before Big Data becomes just another fish in the pond, well fed and happy, no longer a shark lurking about waiting for its next easy meal.