Data scientists are highly educated. In our study of data scientists, we found that over half of them, both men and women, hold either a Masters or PhD degree and about a quarter of them hold a 4-year degree. The level of educational attainment is related to proficiency in data science skills (more advanced degrees are associated with greater proficiency), but only for specific types of data scientists.
Data scientists have the hottest jobs in America this year. Due to a shortage of these data-savvy people, data scientists are in high demand and well-paid. Because of this demand, many universities are now offering graduate degrees in the field of big data and data science (see here for a recent ranking of the top 50 Big Data graduate programs). But how much education does a person need to help their company successfully solve problems using data? Working data professionals offer their answers to this question (see this thread on Quora). We sought to take a more rigorous approach to answering this question. In our ongoing study of data scientists, we examined whether data professionals' education level was related to their proficiency across different data science skills.
In our study, we asked data professionals to rate their proficiency in 25 specific data science skills across five broad skill areas: Business, Technology, Programming, Math & Modeling and Statistics. We also asked these professionals to indicate their job role: Business Management, Developer, Creative or Researcher. Finally, we asked about their level of education: high school / tech / 2-year degree, 4-year degree, Masters degree or PhD degree. The results appear below.
Educational Attainment by Job Role
We looked at educational attainment by job role. The results are presented in Figure 1. On the whole, the data science professionals in this sample are highly educated. Irrespective of their job role, most of these professionals held at least a 4-year college degree (87% to 97%).
Data professionals who self-identified as Business Management or Creatives held the most Masters degrees (58% held this degree).
Most of the data professionals who self-identified as Developers held either a 4-year degree (43%) or a Masters degree (52%). None of them held a PhD.
Data professionals who self-identified as Researchers held the largest share of PhD degrees (41% of them held this degree) compared to data professionals in other data science roles. Researchers were also the group most likely to hold an advanced degree (80% held either a PhD or Masters degree).
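For readers who want to run this kind of breakdown on their own survey data, here is a minimal sketch of how the share of each degree within a job role could be tabulated. The function name and the respondent records are hypothetical and are not data from our study.

```python
from collections import Counter, defaultdict


def education_share_by_role(respondents):
    """Return {role: {education: percent of that role's respondents}}."""
    counts = defaultdict(Counter)
    for role, education in respondents:
        counts[role][education] += 1
    return {
        role: {edu: round(100 * n / sum(c.values())) for edu, n in c.items()}
        for role, c in counts.items()
    }


# Hypothetical (role, education) pairs, for illustration only.
respondents = (
    [("Developer", "4-year")] * 3
    + [("Developer", "Masters")] * 1
    + [("Researcher", "PhD")] * 2
    + [("Researcher", "Masters")] * 2
)
shares = education_share_by_role(respondents)
```

Each percentage is computed within a job role, so the values for one role add up to roughly 100 after rounding.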
Impact of Education on Skill Proficiency
For our study, skill proficiency could vary from 0 (none) to 100 (expert). A skill proficiency rating of 60 indicated that the data professional could essentially perform the skill independently, without help from others (see dotted line in Figure 2). See this blog post for a more complete discussion of the rating scale.
Data professionals with a PhD degree were proficient enough (proficiency rating greater than 60) to perform more skills independently (10 skills) than data professionals with other degrees (Masters degree: 5 skills; 4-year degree: 2 skills; High school degree: 1 skill).
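The counting rule above can be sketched in a few lines. This sketch assumes that a skill counts for an education group when the group's mean rating is greater than 60; the skill names and ratings are hypothetical and are not data from our study.

```python
from statistics import mean

INDEPENDENT_THRESHOLD = 60  # ratings above this mean "can perform the skill without help"


def count_independent_skills(responses, threshold=INDEPENDENT_THRESHOLD):
    """Return {education: number of skills whose mean rating exceeds threshold}."""
    by_group = {}
    for response in responses:
        skills = by_group.setdefault(response["education"], {})
        for skill, rating in response["ratings"].items():
            skills.setdefault(skill, []).append(rating)
    return {
        education: sum(1 for ratings in skills.values() if mean(ratings) > threshold)
        for education, skills in by_group.items()
    }


# Hypothetical survey responses (0 = none, 100 = expert), for illustration only.
responses = [
    {"education": "PhD", "ratings": {"Machine Learning": 80, "Data Mining": 70, "Front-end Programming": 30}},
    {"education": "PhD", "ratings": {"Machine Learning": 70, "Data Mining": 60, "Front-end Programming": 40}},
    {"education": "4-year", "ratings": {"Machine Learning": 40, "Data Mining": 65, "Front-end Programming": 50}},
    {"education": "4-year", "ratings": {"Machine Learning": 50, "Data Mining": 75, "Front-end Programming": 55}},
]
counts = count_independent_skills(responses)
```

In this toy data the PhD group clears the threshold on two skills and the 4-year group on one.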
While a PhD degree can be granted in a variety of disciplines, a common theme among them is that they are research-intensive: PhD candidates learn how to conduct research, typically culminating in an independent research project focused on the candidate's area of interest. Not surprisingly, we found that data professionals with a PhD were generally more proficient in skills related to math and statistics, skills necessary to conduct scientific research.
We saw no significant differences (or only very small differences) across educational attainment levels in data science skills related to programming and technology. The only exception is that data professionals with a PhD had greater proficiency in machine learning than the rest (however, prior research showed that machine learning skill is more closely associated with statistics/math than technology).
Impact of Education Depends on Data Science Job Role
Next, we compared data skills proficiency across education levels for the different job roles. To simplify the results, we averaged the proficiency ratings of specific skills into their respective skill areas (i.e., Business, Technology, Programming, Math & Modeling and Statistics).
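As a rough illustration of that averaging step, the sketch below rolls one respondent's skill ratings up into skill-area scores. The skill-to-area mapping and the ratings are hypothetical; the 25 skills in our survey are not listed here.

```python
from statistics import mean

# Hypothetical mapping from skill area to specific skills, for illustration only.
SKILL_AREAS = {
    "Statistics": ["Statistical Modeling", "Data Mining"],
    "Technology": ["Databases", "Cloud Management"],
}


def area_scores(ratings, skill_areas=SKILL_AREAS):
    """Average one respondent's skill ratings within each skill area."""
    scores = {}
    for area, skills in skill_areas.items():
        rated = [ratings[skill] for skill in skills if skill in ratings]
        if rated:  # skip areas where the respondent rated no skills
            scores[area] = mean(rated)
    return scores


# Hypothetical ratings on the 0 (none) to 100 (expert) scale.
ratings = {
    "Statistical Modeling": 70,
    "Data Mining": 50,
    "Databases": 40,
    "Cloud Management": 20,
}
scores = area_scores(ratings)
```

Averaging in this way treats every skill in an area as equally important, which is a simplifying assumption.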
Even though four different job roles were included in the study, we only had enough data to examine three of them: Business Management, Developers and Researchers. Due to small sample sizes (n ≤ 20), we were unable to study some educational segments (e.g., high school / tech / 2-year degree and PhD) for many of the comparisons. The results appear in Figure 3.
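The sample-size rule can be expressed as a simple filter. This is only a sketch: the function name and the respondent counts are hypothetical, and the cutoff is taken from the n ≤ 20 rule stated above.

```python
from collections import Counter

MIN_N = 20  # segments with this many respondents or fewer are left out


def reportable_segments(respondents, min_n=MIN_N):
    """Return {(role, education): n} for segments with more than min_n respondents."""
    counts = Counter((r["role"], r["education"]) for r in respondents)
    return {segment: n for segment, n in counts.items() if n > min_n}


# Hypothetical respondents, for illustration only.
respondents = (
    [{"role": "Developer", "education": "Masters"}] * 25
    + [{"role": "Developer", "education": "PhD"}] * 20
)
segments = reportable_segments(respondents)
```

Here the Developer / PhD segment has exactly 20 respondents, so it is excluded from any comparison.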
Developers. Educational attainment (at least for 4-year and Masters degrees) does not appear to have an impact on skill proficiency for Developers. No significant differences in skill proficiency were found between educational levels.
Business Management. Educational attainment (at least for 4-year and Masters degrees) does have an impact on skill proficiency for data professionals who self-identify as Business Management. Business Management professionals with a Masters degree possessed greater (statistically significant at p < .05) proficiency in skills related to Business, Math & Modeling and Statistics compared to their Business Management counterparts with a 4-year degree.
Researchers. For data professionals who are Researchers, educational attainment (for 4-year, Masters and PhD degrees) does have an impact on skill proficiency. Not surprisingly, more advanced degrees translated into greater proficiency (statistically significant at p < .05) in skills around Math & Modeling and Statistics.
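To make the idea of a statistically significant difference at p < .05 concrete, here is a minimal sketch of one way to compare mean proficiency between two education groups. It uses an exact permutation test on the difference in means, chosen only because it needs nothing beyond the Python standard library; it is not necessarily the test used in our analysis, and the ratings are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    splits = 0
    # Enumerate every way of assigning n_a of the pooled ratings to group A.
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        difference = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        # Small tolerance so the observed split always counts despite float rounding.
        if difference >= observed - 1e-9:
            extreme += 1
    return extreme / splits


# Hypothetical Statistics proficiency ratings, for illustration only.
masters = [70, 75, 65, 80, 72, 68]
four_year = [50, 55, 45, 60, 52, 48]
p_value = permutation_p_value(masters, four_year)
significant = p_value < 0.05
```

Full enumeration is only practical for small groups like these. With segments of more than 20 respondents, one would sample random permutations or use a conventional t-test instead.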
Education level is related to proficiency in data science skills for some data science roles. In general, research data scientists who hold a PhD are more proficient in Statistics and Math & Modeling than their counterparts who hold a Masters or 4-year college degree. Also, business management data scientists who hold a Masters degree are more proficient than their peers who hold a 4-year degree in Business, Math & Modeling and Statistics. We did not, however, see any skill difference between Developers who hold a Masters degree and those who hold a 4-year degree.
If you are a recruiter or manager of data science teams, the study results suggest that you need to consider applicants' formal educational background. Education does play a role in how well data professionals can do their jobs, but only for specific types of data science roles. Getting a PhD degree is beneficial for the data scientist who is a researcher. These researchers are better equipped to independently perform tasks in math and statistics compared to their counterparts. For the data scientist who specializes in business management, a Masters degree sets them apart with respect to their business skills compared to their peers who only possess a 4-year degree. For the data scientist who is a developer, getting a Masters degree does not necessarily give them an advantage over developers who possess a 4-year degree.
There are many ways to acquire data science skills. While universities are offering graduate degrees in data science, education companies are offering immersive courses in the same areas that require significantly less time (see Galvanize, Udacity, Coursera). Future research needs to understand how graduates of these programs compare to other data professionals who have taken a more traditional route to learning data science skills.