Are you currently pursuing your master's in Data Science? Overwhelmed by buzzwords and information? Don't know where or how to start studying? Then start with this article and the starter kit provided, but learn for excellence and not just for the exams.
Dear future Masters of Data Science,
Many of you may have recently started your master's studies in data science, data analytics or business analytics, and may be a little overwhelmed by buzzwords (like Big Data, Machine Learning, Artificial Intelligence) and by information coming from every possible channel: friends, social media, blogs and websites. Everybody is talking about it and you must be thinking, "I'm doing my Masters in Data Science and I want to learn everything about it, but where should I actually start, and how?"
First of all, welcome! You have entered the fascinating world of Data Science, which combines science, technology and creativity. From a career perspective, you made the right decision at the right time, but believe me, "This is not a joy ride". We humans are on our way to the fourth industrial revolution, which will lead to a dramatic transformation of our world. It will change the way we transact, communicate and make decisions. Data science has opened tremendous possibilities and opportunities in every aspect of business and our daily lives. Soon, machines will take over low-skilled jobs, bots will become smarter and smarter, and the only job of humans will be to teach these robots using algorithms, mathematics and other basic sciences. The data science world is evolving and moving forward so fast that the only option that remains is to adapt to it as fast as you can, which, I believe, is only possible if you make your foundation in data science very strong. Programming languages and frameworks will come and go, but the basics will always be the same. What I'm trying to say is: don't just learn to pass the exams or to add a degree to your LinkedIn profile; learn it to excel in tomorrow's competitive world and also to contribute to it. Bottom line: utilise your academic time wisely to learn data science thoroughly and make your foundation rock solid.
Across the world, universities offer various courses in data science under different names and with different combinations of modules, e.g. MS in Data Science, MS in Business Analytics, MS in Data Analytics etc. There are reasons behind this, based on the expertise you will earn and the profile you will work in. That would be an altogether different topic if discussed further; what is important is to get to the crux of it. As we all know from the Venn diagram of data science below, it requires skills in mathematics, computer science and a particular domain. (You can refer to different versions of the Data Science Venn diagram in a previous KDnuggets post: /2016/10/battle-data-science-venn-diagrams.html ) As the diagram below is very generic in nature, it is important to know the specifics from an academic perspective. I'll not use the names of modules from any particular university or college, but will try to generalise module names in the context of academia.
Mathematics Skills: Mathematics is the most important fundamental science (after Physics), and one which many of us hated in our schools and colleges. Yes, I know that smile :) from all the engineering students, but guys, this maths is really going to help you mint $s. Statistics, an important branch of mathematics, is the starting point for your journey in the data science world. As a starter kit, a list of selected videos and links is provided at the end of this article to guide you through the core concepts of statistics.
Computer Science Skills: In the context of data science, the important modules from computer science are programming, database and data warehouse, data mining and data visualization. Programming is a must-have and very critical skill in data science, and it is the best way to automate the tasks in your analysis. Database skills are also very important for understanding how data is stored and retrieved in structured as well as unstructured formats. A data warehouse is an advanced form of database that is designed around specific business needs.
And here comes the creative part of data science: data mining and visualization. For me, data science is fascinating because of these two modules. Data mining is about analyzing data and drawing meaningful insights from it using different scientific methods, machine learning algorithms and creative thinking. Data mining can also be done manually. For example, say you have your monthly expenses data for the last few years. If you plot a graph of months vs expenses, you will find some interesting insights, such as in which months you spent or saved less money, and why. How can you save money in future by understanding how you saved in the past? Did you notice that in this example, while understanding what data mining is, we also talked about data visualization (the graph of months vs expenses)? Like data mining, the data visualization module also needs creative thinking; it is where you learn how to visualize data so that insights can be found easily by visual exploration, and how to communicate your findings in an intuitive way to different people. Please check the useful links and videos provided at the end of this article, which will help you study these subjects thoroughly.
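The monthly expenses example above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the figures are made up, the function names `text_chart` and `summarize` are hypothetical, and a simple text bar chart stands in for a real plotting library.

```python
from statistics import mean

# Hypothetical monthly expenses (made-up numbers, for illustration only).
expenses = {
    "Jan": 1200, "Feb": 950, "Mar": 1100, "Apr": 1020,
    "May": 980, "Jun": 1300, "Jul": 1450, "Aug": 1010,
    "Sep": 990, "Oct": 1080, "Nov": 1150, "Dec": 1700,
}


def text_chart(data, scale=50):
    """Return one line per month with a bar of '#' characters (one per `scale` units)."""
    return [f"{month} {'#' * (amount // scale)} {amount}" for month, amount in data.items()]


def summarize(data):
    """Return the highest-spend month, the lowest-spend month and the average expense."""
    highest = max(data, key=data.get)
    lowest = min(data, key=data.get)
    return highest, lowest, mean(data.values())


if __name__ == "__main__":
    # Visualization: a rough "months vs expenses" chart in the terminal.
    for line in text_chart(expenses):
        print(line)
    # Mining: pull out the months that deserve a closer look.
    highest, lowest, average = summarize(expenses)
    print(f"Highest: {highest}, lowest: {lowest}, average: {average:.2f}")
```

The chart is the visualization step and the summary is a very small mining step; asking why the highest and lowest months look the way they do is where the real insight starts.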
Domain Skills: Modules in this group are mostly elective, and their number varies by university or college. These are not the core modules but the applied ones, where you learn domain and business skills and how to apply concepts from the core modules to solve real-world problems in a particular domain or business function, e.g. Marketing Analytics, Financial Analytics, Web Analytics, Social Media Analytics etc. You can choose modules depending on your interest in a particular area.
Research/Industrial Project: The research project is a very important module from an academic as well as a career perspective. You should try to apply as much as possible of what you learnt in the core and other modules to your research project. While finding your research topic, try to read as many articles and academic papers as possible, brainstorm ideas and think out loud with friends, professors and meetup groups.
Apart from academics, soft skills are also important for analytics professionals. Good communication, teamwork, leadership and creativity can't be learned from books or videos; you have to learn them by actually practising them. These days, in every big city, technology meetups, conferences and datathons are regularly arranged by experts and companies to share knowledge and learn from each other. Attending such events helps you to improve your ability to convey your ideas, to learn from others' ideas and mistakes, to build your professional network and ultimately to be better than you were yesterday.
So, learn thoroughly, share globally and grow rapidly.
Data Science Core Modules Starter Kit
This is just a starter kit; you can refer to many other Data Science knowledge resources available on the Internet.
3. Database and Data Warehouse:
4. Data Mining:
5. Data Visualization: