### I Wanted to Learn Machine Learning and Data Science, But Where to Start?

In other words, the world of physical electronics (of which the semiconductor industry forms a central portion) must do more to fully embrace the fruits of information technology and new developments in AI and data science.

I had to start somewhere to learn the basics and then work my way deeper. The obvious choice was MOOCs (Massive Open Online Courses). I am still very much in the learning phase, but I believe I have gathered enough experience to choose the right MOOCs for this path. In this article, I want to share my insights on that aspect.

Therefore, it was important for me to quickly jump into the packages and methods used most widely for data science - NumPy, Pandas, and Matplotlib.

- Data Science Orientation: Discusses the everyday life of a typical data scientist and touches upon the core skills one is expected to have in this role along with the basic introduction to the constituting subjects.
- Introduction to Python for Data Science: Teaches the basics of Python - data structures, loops, functions, and then introduces NumPy, Matplotlib, and Pandas.
- Introduction to Data Analysis using Excel: Teaches basic and a few advanced data analysis functions, plotting, and tools in Excel (e.g. pivot tables, Power Pivot, and the Solver plug-in).
- Introduction to R for Data Science: Introduces R syntax, data types, vector and matrix operations, factors, functions, data frames, and graphics with ggplot2.

If you are allowed to read only one book in your lifetime to learn machine learning and nothing else, pick this book and read all the chapters, no exception. By the way, there is no neural network or deep learning material in this book, so there's that...

- Statistical Thinking for Data Science and Analytics (Columbia Univ.): Foundational statistics course from Columbia University, part of their Data Science Executive certificate program on edX. Rigorous, but it drills the concepts in very well in a structured manner.
- Computational Probability and Inference (MIT): This is a hard one from MIT, be aware! It covers advanced topics like Bayesian models and Graphical models in unparalleled depth.
- Statistics with R Specialization (Duke Univ.): This is a 5-course specialization from Duke University (the last course is a capstone project, which you can ignore) that strengthens your statistics foundation along with hands-on programming exercises. Recommended for its balanced difficulty level and rigor.
- LAFF: Linear Algebra - Foundations to Frontiers (UT Austin): This is an amazing foundational course in linear algebra (along with a deep discussion of high-performance computing of linear algebra routines) that you must try. It is offered by the University of Texas at Austin on the edX platform. Trust me when I say that after taking this course, you will never want to invert a matrix to solve a linear system of equations, however tempting and easy to understand that is. Instead, you will look for a QR factorization or a Cholesky decomposition to reduce the computational complexity.
- Optimization Methods in Business Analytics (MIT): This is a course from MIT on optimization and operations research methods for business analytics. I signed up because this was the only highly rated course on a good platform (edX) that I could find about linear and dynamic programming techniques. I believed that learning those techniques could be immensely helpful, as optimization problems turn up in almost every machine learning algorithm.
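The LAFF takeaway above (prefer a factorization over an explicit matrix inverse when solving a linear system) can be sketched with NumPy. This is a minimal illustration, not material from the course: the helper names (`forward_substitution`, `back_substitution`, `solve_cholesky`, `solve_qr`) are hypothetical, the test matrix is made symmetric positive definite by construction so that Cholesky applies, and the triangular solves are written as plain loops for readability rather than speed.

```python
import numpy as np


def forward_substitution(L, b):
    """Solve L y = b for a lower-triangular matrix L."""
    n = L.shape[0]
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y


def back_substitution(U, y):
    """Solve U x = y for an upper-triangular matrix U."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x


def solve_cholesky(A, b):
    """Solve A x = b for symmetric positive definite A via A = L L^T."""
    L = np.linalg.cholesky(A)          # lower-triangular factor
    y = forward_substitution(L, b)     # L y = b
    return back_substitution(L.T, y)   # L^T x = y


def solve_qr(A, b):
    """Solve A x = b for a square nonsingular A via A = Q R."""
    Q, R = np.linalg.qr(A)
    return back_substitution(R, Q.T @ b)  # R x = Q^T b


# Assumed test problem: a random symmetric positive definite system.
rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)

x_inv = np.linalg.inv(A) @ b   # the tempting approach
x_chol = solve_cholesky(A, b)
x_qr = solve_qr(A, b)

# Residual norms for each approach
for name, x in [("inverse", x_inv), ("Cholesky", x_chol), ("QR", x_qr)]:
    print(name, np.linalg.norm(A @ x - b))
```

In day-to-day work, calling `np.linalg.solve(A, b)` is usually enough, since it also avoids forming the inverse. The explicit factorizations are shown here only to make the idea concrete.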

I took multiple machine learning courses and the aspect I enjoyed most was realizing how the treatment of the same fundamental subject becomes a function of the personality and worldview of different instructors :) This was a fascinating experience.

- Machine Learning (Stanford Univ.): Andrew Ng's widely known course. Talked about it in the paragraph above.
- Machine Learning Specialization (Univ. of Washington): This comes with a different flavor than Ng's. Emily Fox and Carlos Guestrin present the concepts from a statistician's and a practitioner's perspective respectively. I could not install the Python package that Carlos' company offers under a free license, but this specialization is worth completing for its theory lectures alone. The proofs and discussions of fundamental concepts such as the bias-variance trade-off, cost computation, and the comparison of analytic vs. numerical approaches to cost function minimization are presented more intuitively and carefully than even in Prof. Ng's course (and that's saying something given the superb quality of Prof. Ng's teaching).
- Machine Learning for Data Science and Analytics (Columbia Univ.): This course has a somewhat unusual syllabus for a general machine learning course, devoting the full first half to lectures on conventional algorithms. It covers essential sorting, searching, graph traversal, and scheduling algorithms. There is not much one-to-one discussion of exactly how these algorithms are used in machine learning problems, but studying them gives you an idea of the traditional computer science knowledge needed to appreciate how large-scale data science problems are tackled. Think O(n^3) whenever you are about to multiply two n x n matrices, or O(n log n) whenever you are sorting a list. You may not use this knowledge explicitly in your day-to-day job, but knowing these nuts and bolts of computation certainly broadens your view of the problem at hand.
- Data Science: Data to Insights (MIT xPro 6-week online course): This one is among the very few paid courses I have taken (I generally go the audit route for MOOCs). It is not available on the public edX website, although it uses the edX platform to deliver content. The 6-week course is well structured and full of interesting content that opens up the wide world of data science and machine learning to the uninitiated. The case studies are very interesting but reasonably hard and time-consuming to code. The lectures are very engaging and illustrate those case studies. My particular favorite module was the one about recommendation systems. I literally started viewing the Netflix screen on my laptop in terms of an adjacency matrix after taking this class!
- Neural Networks for Machine Learning (Univ. of Toronto): This is a somewhat underrated course on Coursera, even with the neural network pioneer Geoffrey Hinton as the instructor. I realize that Andrew Ng's new Deep Learning specialization will directly compete with this course, and I would not be surprised if Coursera removes it in the near future. However, while it is there, a deep learning enthusiast should sit through this one, even if just to gauge the pattern of the historical development of deep networks.
- Deep Learning Specialization (deeplearning.ai): This is the newest kid on the block, but it stands on the very broad shoulders of Andrew Ng, and therefore boasts very strong legs :) I have finished the 2nd course and am on to the 3rd now. The jury is still out, but you should definitely consider completing this series if you want to brush up on the latest trends in deep learning. Even if the programming assignments look hard and you want to stay out of programming a deep network by hand (you can argue there are always excellent open-source packages like TensorFlow, Keras, and Theano out there to take care of the nuts and bolts under the hood), it is imperative to have a deep understanding of essential concepts such as regularization, exploding gradients, hyperparameter tuning, batch normalization, etc. to use those high-level deep learning frameworks effectively.
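The rule of thumb from the Columbia course entry above (O(n^3) for multiplying two n x n matrices, O(n log n) for sorting) can be checked by counting operations. The sketch below is only an illustration and is not taken from any of the courses: `matmul_count`, `Counted`, and `sort_count` are hypothetical helper names, the matrix product is the naive triple loop (optimized libraries use faster kernels), and the sort is Python's built-in `sorted`.

```python
import math
import random


def matmul_count(A, B):
    """Naive product of two n x n lists of lists.

    Returns the product and the number of scalar multiplications used.
    """
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults


class Counted:
    """Wraps a value and counts every '<' comparison made on it."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def sort_count(values):
    """Sort values with the built-in sort and return (sorted list, comparisons)."""
    Counted.comparisons = 0
    ordered = sorted(Counted(v) for v in values)
    return [c.value for c in ordered], Counted.comparisons


random.seed(0)

# Matrix multiplication: the multiplication count is exactly n^3.
for n in (5, 10, 20):
    A = [[random.random() for _ in range(n)] for _ in range(n)]
    B = [[random.random() for _ in range(n)] for _ in range(n)]
    _, mults = matmul_count(A, B)
    print(f"n={n}: {mults} multiplications, n^3 = {n ** 3}")

# Sorting: the comparison count grows roughly like n * log2(n).
for n in (256, 1024, 4096):
    data = [random.random() for _ in range(n)]
    _, comps = sort_count(data)
    print(f"n={n}: {comps} comparisons, n*log2(n) = {n * math.log2(n):.0f}")
```

Doubling n multiplies the matrix work by eight, while the sorting work only slightly more than doubles. That gap is the practical content of the two big-O expressions.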

- Data Science Specialization (Johns Hopkins Univ.): This is a well-known 10-course specialization offered on Coursera. Not every course will appeal to every learner. I personally completed only 5 of the 10. The key thing is the timing, i.e. when to start this specialization. It often comes up at the top of the Google results when one researches MOOCs for data science, and therefore it becomes the first MOOC for many new learners. Personally, I would have had a problem getting the full value from this specialization if I had done that. The introductory Microsoft and Udemy courses on R and a few statistics and linear algebra courses I took beforehand helped me immensely to extract the full benefit from this set of courses. As the specialization is taught by professors from the biostatistics department of JHU, one gets an excellent treatment of two aspects of data science that are often under-represented in many curricula: research study and design of experiments.
- Data Science MicroMasters certificate program (UC San Diego): I have just enrolled and started the 1st of the 4 courses in this certificate program. I like the fact that it is similar in breadth and goals to the Johns Hopkins specialization, except that it chooses Python as the working language for the hands-on portion. The structure and content seem well thought out, covering the basics of Python, Git, and Jupyter all the way up to big data processing with the Apache Spark framework (with statistics and machine learning courses in the middle). The case studies and hands-on examples are drawn from real-world applications of data science such as wildfire modeling, cholera outbreaks, and world development indicator analysis. One of the lead instructors is Ilkay Altintas, who has created an amazing platform for predicting wildfire dynamics and is putting the fruits of data science research to work for societal good. I am sure my journey with this program will be an exciting and rewarding one. You are welcome to join the party!