Beginner's guide to learning Data Science

Learning anything new can be quite daunting. But, with some determination and time, almost anything can be learnt.

Two ways to learn Data Science

There are two ways to learn anything new: the top-down approach and the bottom-up approach.

Data Science is no different. If you follow the bottom-up approach, you will learn linear algebra, probability, calculus, statistics, multivariate statistics, information theory, etc. before you even get started on data science projects.

On the other hand, using the top-down approach, you learn the process based on what other people have already done, learn the tools of the trade, and start solving problems as early in the process as you can. Later on, you can get down to the details, understand how the algorithms work, and implement them to get a better understanding of the underlying principles.

The approach that worked for me

When I was doing some research, I came across Jason Brownlee's blog. I was impressed with the work Jason had done and bought his books to get started.

Jason's Machine Learning with Python book gave me a good head start. Later on, I wanted to build more expertise in Python and Pandas and to solve real-world problems.

For Python, I used the following resources:

  1. Dataquest.io
  2. Intro to pandas data structures

For hands-on experience, I took several courses. Although I'm very familiar with the use cases at our company, I wanted to understand the other popular use cases in Machine Learning, so I signed up for free courses online and completed these:

  1. Machine Learning Foundations: A Case Study Approach. I'm a fan of Turi (recently acquired by Apple).
  2. Microsoft: DAT210x Programming with Python for Data Science (course sponsored by Microsoft, available on edX.org). The assignments are really interesting and you'll learn a lot.
  3. Python for Data Science and Machine Learning Bootcamp, Udemy. Jose Portilla is a great teacher who explains everything in detail.
  4. Computational Investing, offered by Georgia Tech via Coursera. Really fun course to play with financial data.

You will quickly build confidence in solving problems you thought you couldn't solve before.

Other accessible resources that I found helpful:

  1. Think Stats, by Allen Downey
  2. Principles of Data Mining
  3. Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management
  4. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking