The value of data is measured by what you do with it, and organizations are relying on data scientists to extract that value. I recently conducted a survey of data professionals to better understand what it means to be a data scientist. I discovered a few things that can help organizations optimize the value of their data. While I wrote about these findings in prior posts, I want to summarize the major points here, in a more concise way.
Facts about Data Science
While some of these points below seem rather mundane or obvious, it's important to note that these ideas are no longer only opinions; they are backed up by empirical data. This is how data science really works.
1. There are a handful of different skills that make up the field of data science. While we measured five distinct skill types, a factor analysis of proficiency ratings of these five skills resulted in three distinct skill types:
- Business
- Technology / Programming
- Statistics / Math
2. There are different kinds of data scientists. Our study examined four distinct job roles among these data professionals:
- Developer (e.g., developer, engineer)
- Researcher (e.g., researcher, scientist, statistician)
- Creative (e.g., Jack of all trades, artist, hacker)
- Business Management (e.g., leader, business person, entrepreneur)
Respondents were asked to select which of these job roles best described their work. They could choose one or any combination of job roles. The correlation across job roles (1 = selected; 0 = not selected) was quite low (average r was -.07; highest r was -.30), suggesting that these four job roles are distinct from each other.
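To illustrate how a correlation across 0/1 job-role selections can be computed, here is a minimal sketch in Python. It is not the analysis code used in the study. The response matrix is invented for illustration, and the helper name `role_correlations` is hypothetical. The only assumption carried over from the text is the coding (1 = selected; 0 = not selected).

```python
import numpy as np

# Hypothetical responses, invented for illustration only:
# one row per respondent, one column per job role (1 = selected, 0 = not selected).
roles = ["Developer", "Researcher", "Creative", "Business Management"]
responses = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
])

def role_correlations(data):
    """Return the role-by-role Pearson correlation matrix and its unique pairs."""
    # rowvar=False treats each column (job role) as a variable.
    corr = np.corrcoef(data, rowvar=False)
    # Keep only the entries above the diagonal: one value per pair of roles.
    upper = np.triu_indices(corr.shape[0], k=1)
    return corr, corr[upper]

corr_matrix, pairwise = role_correlations(responses)
print(f"average r = {pairwise.mean():.2f}")
print(f"largest |r| = {np.abs(pairwise).max():.2f}")
```

With four job roles there are six unique pairs. Correlations near zero mean that selecting one role says little about whether a respondent also selected another.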
3. Different job roles require different skill sets. Data professionals in different job roles have different skill sets. Not surprisingly, data professionals who identified as Developers reported the highest levels of proficiency in Technology and Programming skills compared to their counterparts. Additionally, Researchers reported the highest levels of proficiency in Statistics and Math, while data professionals who identified as Business Management reported the highest levels of proficiency in Business. Finally, data professionals who identified as Creative reported moderate ratings across all skill sets, suggesting they are indeed jacks-of-all-trades.
4. The scientific method is an effective way to approach data-intensive projects. Scientists have been getting insight from data for centuries using the scientific method. Formally defined, the scientific method is a body of techniques for objectively investigating phenomena, acquiring new knowledge, or correcting and integrating previous knowledge. The scientific method includes the collection of empirical evidence, subject to specific principles of reasoning. The application of the scientific method helps us be honest with ourselves and minimizes the chances of us arriving at the wrong conclusion. The scientific method plays a critical role in understanding any data, irrespective of their size or speed or variety.
5. Statistics skills, compared to other data skills, are good predictors of success of analytics projects. We found that, of the 25 data skills studied, proficiency in Data Mining and Visualization Tools was among the top 4 skills correlated with satisfaction with project success across the four different job roles; no matter what your job role is, a solid understanding of data mining and visualization tools will improve your success and satisfaction in analytics projects. Additionally, for data professionals in Business Management roles, proficiency in business skills was among the weakest predictors of their project success, while proficiency in Statistics skills (e.g., statistics and statistical thinking, data mining and visualization tools, science/scientific method) was among the strongest predictors of project success.
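As a rough sketch of how skills can be ranked by how strongly they correlate with satisfaction, consider the Python example below. It is not the study's analysis. All ratings are invented, the "Business development" skill label and the function name `rank_skills_by_correlation` are hypothetical, and a simple Pearson correlation is assumed as the measure of association.

```python
import numpy as np

# Hypothetical proficiency ratings (0-100) for six respondents; invented for illustration.
proficiency = {
    "Data mining and visualization tools": [80, 60, 40, 90, 30, 70],
    "Business development": [50, 70, 60, 40, 55, 65],
    "Statistics and statistical thinking": [75, 55, 35, 85, 45, 60],
}
# Hypothetical satisfaction-with-project-outcome ratings for the same six respondents.
satisfaction = [8, 6, 4, 9, 4, 7]

def rank_skills_by_correlation(proficiency, satisfaction):
    """Return (skill, r) pairs sorted from the highest Pearson r to the lowest."""
    results = []
    for skill, ratings in proficiency.items():
        # corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
        r = np.corrcoef(ratings, satisfaction)[0, 1]
        results.append((skill, float(r)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

ranked = rank_skills_by_correlation(proficiency, satisfaction)
for skill, r in ranked:
    print(f"{skill}: r = {r:.2f}")
```

Running the same ranking separately for each job role would show whether a skill is a strong correlate of satisfaction across roles or only within one of them.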
6. Finding a data professional who is proficient in all data science skill areas is extremely difficult. Data professionals rarely possess proficiency in all five skill areas at the level needed to be successful at work. In fact, the chance of finding a data professional with expert skills in all five data science skills is akin to finding a unicorn; they just don't exist.
7. A team approach is an effective way of approaching data science projects. We found that data professionals who worked with other data professionals who had complementary skills were more satisfied with their work than when they did not work with another data professional. For example, Business Management professionals were more satisfied with the outcome of their work when they had quantitative-minded experts on their team (e.g., Math & Modeling and Statistics) compared to when they did not have them on their team. Also, Researchers were more satisfied with their work outcome when they were paired with experts in Business and Math & Modeling. Developers were more satisfied with their work outcomes when paired with an expert in Business. Creatives’ satisfaction with their work product was not affected by the presence of other experts; this finding is likely due to the fact that Creatives are not able to contribute sufficiently to teamwork success because they are not highly proficient in any of the data skills (see point 3 above).
8. Gender diversity is lacking among data science professionals. Women make up about 25% of data scientists. Women and men data scientists possess comparable skills and achieve the same level of education, a majority of them with Master's (~50%) and PhD (~20%) degrees. Most women worked in research roles (63%), while men worked mostly in research (36%) or business management (32%) roles.
9. You can find out what kind of data scientist you are for free. As part of our study of data scientists, we at AnalyticsWeek developed the Data Skills Scoring System (DS3), a free web-based self-assessment survey that measures proficiency across five broad data science skills: business, technology, math and modeling, programming and statistics. The DS3 takes less than 5 minutes to complete. Our hope is that the DS3 can optimize the value of data by improving how data professionals work together. If you are a data professional, the DS3 can help you:
- identify your analytics strengths
- understand where you need to improve your analytics skill set
- capitalize on development opportunities offered by current or prospective companies
Find out your data skills score by taking the free Data Skills Scoring System Survey: http://pxl.me/awrds3
We are studying the practice of data science by studying those who practice it. We have collected information from 607 data professionals, asking them about their skills, job roles and more. These data helped us learn a lot about the field of data science. The results have helped us clarify how we think and talk about the terms "data science" and "data scientist." We will continue to share our research on data science, helping data professionals and the organizations that employ them understand how to maximize the value of their data.
The practice of data science requires proficiency in a handful of specific data skills, including business acumen, technology / programming and statistics / math. Different data professionals report vastly different proficiency levels across these skills. Because data professionals tend to specialize in only one or two skill areas, organizations have a better chance of extracting value from their data when they adopt a team approach consisting of data scientists who have complementary skill sets. Finally, the DS3 provides a reliable and valid way of measuring proficiency across different data skills. My hope is that our research will help companies optimize the value of their data.
Measuring proficiency in data skills is useful for many constituencies, including budding data scientists and companies who employ them. Companies, getting feedback from their data professionals about their skills, can better manage their data science teams and identify and close data science talent gaps. Chief data/analytics officers need to know the proficiency of their data professionals to ensure they staff data-intensive projects with the correct mix of data professionals. Recruiters need to know the skills of their current data professionals to effectively market to and recruit data professionals who have the right skills to fill specific roles. Knowing the data science capabilities of data professionals is a good first step to help organizations improve the value of their data.
An earlier version of this article appeared in CustomerThink.