The proverbial "Data Science Unicorn" is someone who has mastered software development, data science, and business in equal measure. Seeking one is like seeking Major League Baseball prospects who are both outstanding infielders and outstanding pitchers. While there may be the rare bird who can both pitch and hit (Babe Ruth of the New York Yankees, Shohei Ohtani of the Los Angeles Angels), it is more common for successful Major League Baseball teams to draft the BEST infielder and the BEST pitcher and have them fulfil different roles on the team.
So here is what we teach our future business and society leaders about Data Science:
1) Data Science is a Team Sport
Data science is a team sport played by Data Engineers, Data Scientists and Business Stakeholders. Just as a baseball team can't function effectively with only shortstops and catchers, a data science initiative MUST clearly articulate the roles, responsibilities and expectations of the Data Engineers, Data Scientists and Business Stakeholders (see Figure 1).
Figure 1: Data Science Team Roles
If the goal of your organization is to become more effective at leveraging data and analytics to power your business models and drive digital transformation, you can't win that game with a team full of pitchers.
See the blog "A Winning Game Plan For Building Your Data Science Team" for more details on creating a winning data science team.
2) Embrace "Thinking Like A Data Scientist"
It is critical to the effectiveness of your data science strategy that your business stakeholders not only intimately understand the business, but also know how to "think like a data scientist." That is, the business stakeholders must understand the data science process in order to not only collaborate, but ultimately to lead the data science efforts to ensure that the precious data science resources are focused on the most important business opportunities (see Figure 2).
Figure 2: The "Thinking Like A Data Scientist" Process
See the blog "Refined Thinking like a Data Scientist Series" for more details on each of the steps in the Thinking Like A Data Scientist (TLADS) process.
Note: as you can see from the differences in Figure 2 versus the process laid out in the blog, we continue to refine and update the TLADS process based upon customer engagements as well as class work. That refinement effort has led to the following development: the Hypothesis Development Canvas.
3) Hypothesis Development Canvas Is Your Monetization Guide
Our most recent development, and the capstone of the "Thinking Like A Data Scientist" Process, is the Hypothesis Development Canvas. The Hypothesis Development Canvas uses a common design thinking technique to summarize in a single document all the critical aspects of the TLADS process (see Figure 3).
Figure 3: Hypothesis Development Canvas Ties Business Strategy with Data Science Execution
And while we are still in the early stages of testing and refining the canvas, the early feedback and results are very encouraging. See the blog "Data Science 'Paint by the Numbers' with the Hypothesis Development..." for more details on the Hypothesis Development Canvas, as well as the Business Model Canvas and the modified Machine Learning Canvas.
4) Gain High-level Understanding of Advanced Analytics
While we don't expect business students to master data science, it is very important that they understand what data science (and advanced analytics) can do to power the organization's business models. Let's start the advanced analytics conversation with an overly simplified definition of Artificial Intelligence or AI:
AI is about codifying customer, product, operational or market patterns and relationships in order to learn, act and/or automate.
The supporting advanced analytics can then be categorized or layered into the 3 levels of advanced analytics (see Figure 4).
Figure 4: Three Levels of Advanced Analytics
Components of Advanced Analytics include:
1. Level 1: Statistics & Predictive Analytics quantifies the strength of relationships between variables (correlation coefficient, which measures association rather than proving cause-and-effect) and goodness of fit (Chi-squared test)
2. Level 2: Deep Learning (Neural Networks) learns from a training data set and then applies those learnings to new data sets (photos, images, audio, handwriting)
3. Level 2: Supervised Machine Learning identifies known unknown relationships that drive "labeled" or known outcomes (e.g., fraud, attrition, product failure, spam)
4. Level 2: Unsupervised Machine Learning identifies unknown unknown relationships - clusters, segments, associations hidden in the data
5. Level 3: Reinforcement Learning & Artificial Intelligence learns (mostly through trial and error) and adapts in order to operate within a continuously changing environment (robots, autonomous vehicles)
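To make Level 1 a little more concrete for business readers, here is a minimal Python sketch of the two measures named in the list: a correlation coefficient and a Chi-squared goodness-of-fit statistic. The data (ad spend, sales, store visits) and the function names pearson_r and chi_squared_statistic are made up for illustration and are not taken from the figures or the class material.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def chi_squared_statistic(observed, expected):
    """Chi-squared goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example 1: does weekly ad spend move together with sales?
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 44, 67, 85, 104]
r = pearson_r(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")  # close to 1

# Hypothetical example 2: are visits spread evenly across four stores?
# Under the "evenly spread" assumption we expect 25 visits per store.
observed_visits = [18, 22, 20, 40]
expected_visits = [25, 25, 25, 25]
chi2 = chi_squared_statistic(observed_visits, expected_visits)
print(f"chi-squared statistic: {chi2:.2f}")

# With 4 categories there are 3 degrees of freedom; the 5% critical
# value for 3 degrees of freedom is about 7.81.
if chi2 > 7.81:
    print("visits do not look evenly spread across stores")
```

A correlation near 1 says the two series move together; it does not by itself say that ad spend causes sales. The Chi-squared statistic says how far the observed counts sit from what the "evenly spread" assumption predicts.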