The Data Scientist is the quintessential “go-between,” carrying the problems of the business folks to the IT department, and then bringing back the technological solutions to the business units. A good Data Scientist not only explores, designs, and engineers business solutions, but also determines the best technological option at hand to arrive at such solutions. In that sense, this data magician is the data explorer, investigator, innovator, and the inventor all rolled into one!
The article It’s More Than Science explains that many of the available academic programs in Data Science are designed to teach the hard skills, which range from statistics and mathematics to Machine Learning and computer programming. However, the “softer” skill sets required for a Data Science job are as important as, or even more important than, the hard skills listed in a typical job opening. Data Science is a lot more than simply mathematics or statistics. The successful Data Scientist is expected to comprehend and analyze the daily business needs, visualize an appropriate technological solution, and then pair the vision with an appropriate technological method to arrive at an effective solution.
Though Data Scientists work with high-end technologies, the beneficiary is the business as a whole. The primary business expectation from a Data Scientist is a reliable IT system that delivers data-driven decisions for day-to-day business problems. So, when thinking of businesses, one has to keep in mind that the aim of a business solution is not to showcase technology, but to solve the identified business problem.
Thus, to articulate the required list of hard and soft skills for a Data Scientist can be a tricky proposition. The Data Scientist on the job typically begins with an exploration of business needs and gaps while keeping a firm eye on where the business is headed in the future. In other words, the Data Scientist’s technological goals must be aligned with the overall business goals of an enterprise to reach effective solutions.
An older McKinsey Report about the future of IT stated:
“The United States alone faces a shortage of 140,000 to 200,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of Big Data.”
Thus, the tremendous need for qualified data professionals has been clearly felt throughout the global business landscape. Data Science Skills Must Haves points out that in March 2015, about 60,000 Data Science job postings were found on LinkedIn, and another 250,000 individuals on LinkedIn listed themselves as Data Science professionals.
The Aspiring Data Scientist? by Saranya Anandh calls the Data Scientist “the sexiest job of the 21st century.” This post further notes that data technologies like Big Data, Hadoop, Cloud Computing, and Data Visualization are finally empowering IT professionals to wrestle with data variety and volume never encountered before.
So, Who Exactly is this Data Scientist?
According to Hilary Mason, a reputed Data Scientist at Accel:
“A Data Scientist is someone who can obtain, scrub, explore, model, and interpret data, blending hacking, statistics and Machine Learning. Data Scientists not only are adept at working with data, but appreciate data itself as a first-class product.”
The visible warning emanating from most recent Data Scientist job descriptions is that many different business roles are implied under this single “job title.” At one end of the spectrum, an enterprise can empower a Data Scientist to simply visualize and design data-centric systems to deliver data-driven decisions to young start-ups. At the other end of the spectrum, a Data Scientist could be smack in the middle of data product evolution, where superior statistics, Machine Learning, and computer science skills are required to communicate with colleagues and engineer master products.
The Essential Skills Set for a Data Science Job
The 2015 article titled The Hard and Soft Skills of a Data Scientist explains that in the current marketplace, it is hard to identify a Data Scientist with the exact set of skills required for a given job title. In situations where a Data Scientist is part of a large team equipped with Data Analysts and Data Engineers, the Data Scientist simply takes the output from these colleagues and makes informed business decisions on behalf of the top management or department heads. In that setting, the Data Scientist’s data analysis skills come more into play. This business publication has also sorted through thousands of Data Science job descriptions to collect and organize the information related to essential skill sets for the job. This article claims that to be a top performer, the Data Scientist must demonstrate a wide range of both hard and soft skills.
What are the Required Hard Skills for Data Scientists?
When the major placement services or job applicants sift through the keywords related to Data Science jobs, they are likely to find terms such as MA or PhD in Computer Science, computer engineering, data mining, statistics, mathematics, business analytics, Big Data, Hadoop, or MapReduce. Along with that, a long list of programming language requirements like Python, Ruby, SAS, PHP, or SPSS will also appear. Thus, the clear indication is that a Data Scientist must possess all, or at least some, of the hard skills specified in a job description, which may include experience with large databases such as Teradata or Oracle, matched with five or six computer programming languages and basic computer science skills. But that’s an ideal listing, because one cannot expect the beginner Data Scientist fresh out of college to possess such diverse skill sets. Another danger in hard-core academic programs in Data Science is that they are often conceived by repackaging existing syllabi from different programs available elsewhere. Also, hard skills per se do not have any value until they help deliver results in the workplace.
A growing challenge among academic campuses offering Data Science curricula is keeping the bright students in school. Typically during academic programs, many Bachelor’s- or Master’s-level students are tempted to drop out of their coursework as they are hounded by companies with very high-paying job openings. Some of the starting positions offer above $200,000! Even summer internships offer anywhere between $6,000 and $10,000 a month.
To counter this problem, many reputed businesses and industry leaders offer crash programs to prepare and train Data Scientists that usually attract a lot of participants. An example of this trend is found in MapR Expands Free Hadoop and Spark Training as Participants Surpass 50,000.
Another problem: although academic programs offer basic foundation courses, how do students accumulate those eclectic skills like Machine Learning, Predictive or Prescriptive Analytics, Big Data, and Business Intelligence, without which a Data Scientist’s professional life will remain incomplete? Thus, one has to appreciate and understand that Data Science jobs begin with a set of expressed hard skills, but the actual journey of a Data Science job involves ongoing learning and application of many other hard skills and soft business skills. See the DATAVERSITY® article The Big Data Skills Gap Is Getting Bigger to get an understanding of this ongoing skill gap.
One such path is discussed in the article titled DAMA’s New Certified Data Management Practitioners (CDMP) Exam. Originating in 2004, the CDMP exam has been accepted as the industry standard in the field of Data Management. DATAVERSITY® conducted an interview of DAMA’s President to get a glimpse of the new exam format and its significance in the Data Management industry. There are numerous other certifications and training programs available for new Data Scientists wishing to expand their skill sets while working within the industry.
What are the Required Soft Skills for Data Scientists?
When it comes to soft skills, most enterprises face a problem in trying to define the skills required to conduct daily business at the workplace. Thus, these “soft skills” are not easily identifiable in the posted Data Science job descriptions. Nowadays, professional talent acquisition specialists probably spend a lot of time researching the industry keywords and requirements for job titles before drafting the list of soft skills for a given job title.
Some common and oft-used keywords articulating the soft skill requirements for Data Scientist positions include:
- Curious and explorative mindset
- Ability to question existing practices and devise alternatives
- Strong analytical skills
- Effective communication skills for diverse audiences
- Business problem-solving skills
- Cross-functional team management skills
The Masters in Data Science article titled Data Scientist Skills states that apart from the usually listed hard skills, the most important skill for a Data Scientist is keen investigative ability, which enables the person to ask the right questions about a given business situation to discover the root causes and the extent of a problem. Additionally, the Data Scientist must use effective inter-personal skills to enlist the support of key personnel who can aid the solution identification and discovery process from beginning to end. This article hints at a combined “investigative and inter-personal” mindset that promises success on the ground. Please also review the past Udacity blog post to get a more rounded view of this job title.
In 5 Things You Should Know Before Getting a Degree in Data Science, Dr. Tara Sinclair, the Chief Economist at Indeed.com, notes that although the volume of Data Science job postings has increased substantially on a year-over-year basis (57% in 2015), employers do not make it clear what they require from applicants for these positions. There is an implicit hint that a degree, combined with a boot camp or crash course in hard and soft skills, may be ideal training for a beginner Data Scientist. The job, this article claims, is really a “mashup” of diverse skill sets, and it may therefore be very difficult to locate candidates with all the required skills. So, looking for candidates who fulfill part of a long list of skill sets may be a more realistic approach to hiring! In this context, the article Data Scientist Skill Development evaluates other important observations within this complicated discussion.
Getting Started in the Data Science Profession
KDNuggets has some interesting facts to share about the Data Science profession. In Data Science Skills, this industry watcher shares some great news for aspiring professionals. Individuals who already have an algorithmic mindset, coupled with a zeal to compete, may stand a good chance of succeeding as Data Scientists. For starters, these aspiring data professionals may take a shot at the Kaggle website and see how they perform. Regardless of prior experience and education, if such individuals fare well on the projects available on this site, then that should be a good indication that they are cut out for a career in Data Science. According to KDNuggets, Kaggle provides the best preparation ground for future Data Scientists.
Some other pointers are also available in this article: 5 Tips for Getting Started as a Data Scientist.
Linda Burtch of Burtch Works, the author of The Must have Skills You Need to Become a Data Scientist, feels that many aspiring professionals eager to enter the Data Science profession are still completely at a loss about where to begin. Linda has compiled a list of technical and non-technical skills that she has identified as essential for Data Scientists. She has broken down these skills into three groups: Analytics, Computer Science, and Non-Technical Skills. She offers this list to help aspiring Data Scientists position themselves in the market with the right profile for open positions. She warns that because individual companies may have esoteric needs, this list is by no means exhaustive, but it can certainly help to develop a strong profile.