Each year, I write a summary of my most popular blog posts. This year-end summary is a rather simple exercise: I count the pageviews of my blog posts and present the top 10 posts of the year. This year's BOB blog count, however, only goes from January through June 2016. Since June, I have been writing primarily on the Appuri blog. This summer, I joined Appuri, a fantastic startup where we help companies integrate their data silos and apply machine learning capabilities to the consolidated set of metrics to improve their customer retention efforts. As their Chief Research Officer, I direct research on customer analytics best practices to help companies improve their customer experience and customer success initiatives. If you want to learn more about what we do, please contact me.
In 2016, 9 of the top 10 BOB blog posts were about my research on data science. The research focused on the meaning of data science and how organizations can best leverage data science skills. The other top post covers the statistics topic of sampling error (which is one thing everybody needs to know).
As always, thanks for reading. The top 10 B.O.B. blog posts for 2016 were:
- Top 10 Skills in Data Science: Data professionals report that they are most proficient in such skills as communication, managing structured data, math, project management and data mining and visualization tools. Top skills varied by data science roles.
- The Most Important Skill in Data Science: Mining and Visualizing your Data: The top skill that drives project success is knowing data mining and viz tools. If you're proficient at analyzing data, you're more likely to have successful project outcomes.
- Making Sense of Our Big Data World: Samples, Populations and Sampling Error: While we live in a Big Data world, we still work with relatively small data sets. In those cases, you need to appreciate the concept of sampling error to help you distinguish signal from noise.
- 10 Data Science Skills You Need to Improve Project Success: In this post, I looked at the skills that are predictive of the quality of projects. Many of the important data science skills are highly quantitative in nature; in fact, 8 of the top 10 skills are quantitative, including data mining and viz tools, statistics and statistical modeling, science/scientific method, and algorithms and simulations.
- Empirically-Based Approach to Understanding the Structure of Data Science: This blog was a fun one to write and visualize. I conducted a principal components analysis of 25 data science skills. The results clearly showed that the skills fall into three general skill areas: 1) subject matter expertise, 2) technology and 3) statistics.
- For Data Scientists, Big Data is not so Big: I found that only about a third of data scientists possessed skills needed to handle big and distributed data.
- When Does Education Level Matter in Data Science?: Data scientists are highly educated. I found that over half of them hold either a Master's or PhD degree and about a quarter of them hold a 4-year degree. The level of educational attainment is related to proficiency in data science skills (more advanced degrees are associated with greater proficiency), but only for specific types of data scientists.
- Industry Differences in Data Science Roles, Skills and Project Outcomes: The Information Technology industry is home to most data scientists. Data scientists who are Researchers outnumber the other types of data scientists in six of the top 10 industries that employ data scientists. Data scientists in the Professional Services industry, compared to data scientists in other industries, are highly proficient in all three data science skill areas: Business, Technology and Math/Statistics.
- How Data Scientists Extract Value from Data [Infographic]: I put this infographic together to summarize some of my research to illustrate the necessity of building data science teams who employ the scientific method.
- Are You Using the Right Data Science Tools for Your Big Data Projects?: I found that, on average, data professionals use 5 data tools. No single data science tool will do it all. One way to help in your tool selection process is to identify the tools that are commonly used together and use them.
In 2017, I am continuing my research on best practices in customer analytics, data science and how companies can get the most value from their data. If you want to keep up with my latest thinking on these topics and more, be sure to read my posts at Appuri.