There are amazing introductions, courses and blog posts on Deep Learning. But this is a different kind of introduction.
But why weird? Maybe because it won't follow the "normal" structure of a Deep Learning post, where you start with the math, then go into the papers, the implementation and then to applications.
Why am I making this introduction?
Sometimes it is important to have a written backup of your thoughts. I tend to talk a lot and take part in several presentations and conferences, and this is my way of contributing a little knowledge to everyone.
Deep Learning (DL) is such an important field for Data Science, AI, technology and our lives right now, and it deserves all of the attention it is getting. Please don't say that deep learning is just adding a layer to a neural net, and that's it, magic! Nope. I'm hoping that after reading this you will have a different perspective on what DL is.
Deep Learning Timeline
I created this timeline based on several papers and other timelines so that everyone can see that Deep Learning is much more than just Neural Networks. There have been real theoretical advances, as well as software and hardware improvements, that were necessary for us to get to this day. If you want it, just ping me and I'll send it to you (find my contact at the end of the article).
What is weird about Deep Learning?
Deep Learning has been around for quite a while now. So why did it become so relevant so fast in the last 5–7 years?
As I said before, until the late 2000s we were still missing a reliable way to train very deep neural networks. Nowadays, with the development of several simple but important theoretical and algorithmic improvements, the advances in hardware (mostly GPUs, now TPUs), and the exponential generation and accumulation of data, DL has naturally filled this gap and transformed the way we do machine learning.
Deep Learning is an active field of research too. Nothing is settled or closed: we are still searching for the best models, the best network topologies, the best ways to optimize their hyperparameters and more. As in any other active field of science, it is very hard to keep up to date with the research, but it's not impossible.
Methods from algebraic topology have only recently emerged in the machine learning community, most prominently under the term topological data analysis (TDA). Since TDA enables us to infer relevant topological and geometrical information from data, it can offer a novel and potentially beneficial perspective on various machine learning problems.
Luckily for us, there are lots of people helping us understand and digest all of this information through courses like Andrew Ng's, blog posts and much more.
This, for me, is weird or uncommon, because normally you have to wait for some time (sometimes years) to be able to digest difficult and advanced information in papers or research journals. Of course, most areas of science now move quickly from a paper to a blog post that tells you what you need to know, but in my opinion DL has a different feel.
Breakthroughs of Deep Learning and Representation Learning
We are working with something very exciting. Most people in the field are saying that the latest ideas in deep learning papers (specifically new topologies and configurations for neural networks, or algorithms to improve their usage) are the best ideas in Machine Learning in decades (remember that DL is a subfield of ML).
I've used the word learning a lot in this article so far. But what is learning?
In the context of Machine Learning, the word "learning" describes an automatic search process for better representations of the data you are analyzing and studying (please keep this in mind: it is not about making a computer learn).
This is a very important word for this field, REP-RE-SEN-TA-TION. Don't forget about it. What is a representation? It's a way to look at data.
Let me give you an example. Let's say I ask you to draw a line that separates the blue circles from the green triangles in this plot:
Ian Goodfellow et al. (Deep Learning, 2016)
This example is from the book Deep Learning by Ian Goodfellow et al. (2016).
So, if you want to use a line, this is what the authors say:
"... we represent some data using Cartesian coordinates, and the task is impossible."
This is impossible if we remember the concept of a line:
A line is a straight one-dimensional figure having no thickness and extending infinitely in both directions. From Wolfram MathWorld.
So is the case lost? Actually, no. We just need to find a different way of representing this data, one in which we can draw a straight line to separate the two types of data. This is something that math taught us hundreds of years ago. In this case what we need is a coordinate transformation, so we can plot or represent this data in a way that lets us draw this line. If we look at the polar coordinate transformation, we have the solution:
Ian Goodfellow et al. (Deep Learning, 2016)
And that's it, now we can draw a line:
So, in this simple example we found and chose the transformation that gives a better representation by hand. But if we create a system, a program that can search over different representations (in this case a coordinate change) and then calculate the percentage of points being classified correctly with each new representation, in that moment we are doing Machine Learning.
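To make that idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption of mine for illustration and does not come from the book: the toy data (a blob around the origin and a ring around it), the helper names `make_points` and `best_line_accuracy`, and the very limited "classifier", which only tries vertical or horizontal lines. The program scores each candidate representation by the fraction of points one such line classifies correctly, then keeps the best one.

```python
import math
import random


def make_points(n_per_class=100, seed=0):
    """Hypothetical toy data: class 0 is a blob around the origin, class 1 a ring around it."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (r_min, r_max) in enumerate([(0.0, 1.0), (2.0, 3.0)]):
        for _ in range(n_per_class):
            r = rng.uniform(r_min, r_max)
            angle = rng.uniform(0.0, 2.0 * math.pi)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels


def cartesian(point):
    """Identity representation: keep (x, y) as it is."""
    return point


def polar(point):
    """Polar representation: (radius, angle)."""
    x, y = point
    return (math.hypot(x, y), math.atan2(y, x))


def best_line_accuracy(features, labels):
    """Best fraction of points classified correctly by one vertical or horizontal line."""
    n = len(labels)
    best = 0.0
    for axis in (0, 1):
        for threshold in sorted({f[axis] for f in features}):
            correct = sum((f[axis] > threshold) == (y == 1)
                          for f, y in zip(features, labels))
            # A line can call either side "class 1", so keep the better labeling.
            best = max(best, correct / n, (n - correct) / n)
    return best


points, labels = make_points()
candidates = {"cartesian": cartesian, "polar": polar}

# The "search": score every candidate representation and keep the best one.
scores = {
    name: best_line_accuracy([transform(p) for p in points], labels)
    for name, transform in candidates.items()
}
best_representation = max(scores, key=scores.get)
print(scores, best_representation)
```

With only two candidates the search is trivial, but the structure is the point: a set of possible representations, a score for each one, and an automatic choice of the winner.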
This is something very important to keep in mind: deep learning is representation learning using different kinds of neural networks, optimizing the parameters and hyperparameters of the net to get (learn) the best representation for our data.
This wouldn't be possible without the amazing breakthroughs that led us to the current state of Deep Learning. Here I name some of them:
2. Idea: Better initialization of the parameters of the nets. Something to remember: The initialization strategy should be selected according to the activation function used (next).
capsule-networks - A PyTorch implementation of the NIPS 2017 paper "Dynamic Routing Between Capsules".
And there are many others but I think those are really important theoretical and algorithmic breakthroughs that are changing the world, and that gave momentum for the DL revolution.
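The note above about choosing the initialization according to the activation function can be sketched in a few lines. The pairing I use here (Glorot/Xavier scaling for tanh and sigmoid, He scaling for ReLU) is a common convention and an assumption on my part, since the list above does not name specific schemes. The function names are hypothetical, and real frameworks ship their own initializers.

```python
import math
import random


def glorot_std(fan_in, fan_out):
    """Glorot/Xavier scaling: variance 2 / (fan_in + fan_out)."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def he_std(fan_in, fan_out):
    """He scaling: variance 2 / fan_in (fan_out is unused, kept for a uniform signature)."""
    return math.sqrt(2.0 / fan_in)


# Assumed pairing of activation function to initialization scale.
STD_FOR_ACTIVATION = {"tanh": glorot_std, "sigmoid": glorot_std, "relu": he_std}


def init_weights(fan_in, fan_out, activation, seed=0):
    """Draw a fan_in x fan_out weight matrix from a zero-mean Gaussian."""
    rng = random.Random(seed)
    std = STD_FOR_ACTIVATION[activation](fan_in, fan_out)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


def sample_std(weights):
    """Empirical standard deviation of all entries in the matrix."""
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)
    return math.sqrt(sum((w - mean) ** 2 for w in flat) / len(flat))


relu_weights = init_weights(256, 128, "relu")
tanh_weights = init_weights(256, 128, "tanh")
print(round(sample_std(relu_weights), 3), round(sample_std(tanh_weights), 3))
```

The only thing that changes between the two layers is the scale of the random weights, and that scale is looked up from the activation function.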
How to get started with Deep Learning?
It's not easy to get started, but I'll try my best to guide you through this process. Check out these resources, but remember, this is not only about watching videos and reading papers; it's about understanding, programming, coding, failing and then making it happen.
Deep Learning from deeplearning.ai. If you want to break into AI, this Specialization will help you do so. Deep...
Siraj Raval: He's amazing. He has the power to explain hard concepts in a fun and easy way. Follow him on his YouTube channel. Specifically these playlists:
As you know by now, machine learning is a subfield in Computer Science (CS). Deep learning, then, is a subfield of... www.datacamp.com
Distributed Deep Learning
Deep Learning is one of the most important tools and theories a Data Scientist should learn. We are so lucky to see amazing people creating research, software, tools and hardware specific to DL tasks.
DL is computationally expensive, and even though there have been advances in theory, software and hardware, we need the developments in Big Data and Distributed Machine Learning to improve performance and efficiency. Great people and companies are making amazing efforts to join the distributed frameworks (Spark) and DL libraries (TF and Keras).
Here's an overview:
1. Databricks: Deep Learning Pipelines (soon to be merged into Spark)
Apache Spark and TensorFlow are both open-source projects that have made significant impact in the world of enterprise...
Getting stuff done with Deep Learning
As I've said before, one of the most important moments for this field was the creation and open-sourcing of TensorFlow.
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them.
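To get a feel for what "nodes are operations, edges carry arrays" means, here is a toy data flow graph in plain Python. This is not TensorFlow's API: the classes `Placeholder`, `Constant` and `Op` and the helpers `add` and `scale` are hypothetical names I made up for the sketch, and the "tensors" are just Python lists.

```python
class Placeholder:
    """Graph input: its value is supplied when the graph is evaluated."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]


class Constant:
    """Graph node that always produces the same array."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value


class Op:
    """Graph node for an operation; its inputs are the edges that carry data to it."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def evaluate(self, feed):
        # Evaluate upstream nodes first, then apply this node's operation.
        return self.fn(*(node.evaluate(feed) for node in self.inputs))


def add(a, b):
    return Op(lambda x, y: [xi + yi for xi, yi in zip(x, y)], a, b)


def scale(a, factor):
    return Op(lambda x: [factor * xi for xi in x], a)


# Building the graph computes nothing yet: y = 2 * (x + b)
x = Placeholder("x")
b = Constant([1.0, 1.0, 1.0])
y = scale(add(x, b), 2.0)

# Data only flows through the graph when we evaluate it with concrete inputs.
result = y.evaluate({"x": [1.0, 2.0, 3.0]})
print(result)  # [4.0, 6.0, 8.0]
```

Separating "describe the computation" from "run it with data" is the design choice that lets a library inspect, optimize or distribute the graph before any numbers move.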
Tensors, defined mathematically, are simply arrays of numbers, or functions, that transform according to certain rules under a change of coordinates.
But in the scope of Machine Learning and Deep Learning a tensor is a generalization of vectors and matrices to potentially higher dimensions. Internally, TensorFlow represents tensors as n-dimensional arrays of base datatypes.
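A quick way to see this "generalization of vectors and matrices" is to look at the shape of a few nested lists. This is only a sketch in plain Python: the `shape` and `rank` helpers are hypothetical, they assume rectangular (non-ragged) nesting, and in practice a library such as NumPy or TensorFlow does this for you.

```python
def shape(tensor):
    """Shape of a rectangular nested list; a plain number has the empty shape ()."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        if not tensor:  # empty list: nothing deeper to inspect
            break
        tensor = tensor[0]
    return tuple(dims)


def rank(tensor):
    """Number of dimensions (axes) of the tensor."""
    return len(shape(tensor))


scalar = 3.0                                  # rank 0
vector = [1.0, 2.0, 3.0]                      # rank 1
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # rank 2
# A made-up batch of 2 images, each 2 x 2 pixels with 3 color channels: rank 4
images = [[[[0.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("images", images)]:
    print(name, rank(t), shape(t))
```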
We use tensors heavily all the time in DL, but you don't need to be an expert in them to use them. You may need to understand a little bit about them, so here I list some good resources:
Deep Learning 101: Demystifying Tensors
Tensors and new machine learning tools such as TensorFlow are hot topics these days, especially among people looking...
Welcome to part four of Learning AI if You Suck at Math. If you missed parts 1, 2, 3, 5, 6 and 7 be sure to check them...
After you check those out, along with the breakthroughs I mentioned before and programming frameworks like TensorFlow or Keras (for more on Keras go here), I think you will have an idea of what you need to understand and work with Deep Learning.
But what have we achieved so far with DL? To name a few (from François Chollet's book on DL):
Near-human level image classification.
Near-human level speech recognition.
Near-human level handwriting transcription.
Improved machine translation.
Improved text-to-speech conversion.
Digital assistants such as Google Now or Amazon Alexa.
Near-human level autonomous driving.
Improved ad targeting, as used by Google, Baidu, and Bing.
Improved search results on the web.
Answering natural language questions.
Superhuman Go playing.
And much more. Here's a list of 30 great and fun applications of DL:
Over the last few years Deep Learning was applied to hundreds of problems, ranging from computer vision to natural...
Thinking about the future of Deep Learning (for programming or building applications), I'll repeat what I said in other posts.
I really think GUIs and AutoML are the near future of getting things done with Deep Learning. Don't get me wrong, I love coding, but I think the amount of code we will be writing in the coming years will decrease.
We cannot spend so many hours worldwide programming the same stuff over and over again, so I think these two features (GUIs and AutoML) will help Data Scientists become more productive and solve more problems.
One of the best free platforms for doing these tasks in a simple GUI is Deep Cognition. Their simple drag & drop interface helps you design deep learning models with ease. Deep Learning Studio can automatically design a deep learning model for your custom dataset, thanks to their advanced AutoML feature, with nearly one click.
Design, Train, and Deploy Deep Learning Models without Coding. Deep Learning Studio simplifies and accelerates the...
Take a look at the prices :O, it's freeeee :)
I mean, it's amazing how fast the development in the area is right now, that we can have simple GUIs to interact with all the hard and interesting concepts I talked about in this post.
One of the things I like about that platform is that you can still code and interact with TensorFlow, Keras, Caffe, MXNet and much more with the command line or their Notebook without installing anything. You have both the notebook and the CLI!
I take my hat off to them and their contribution to society.
Other interesting applications of deep learning that you can try for free or for little cost are (some of them are in private beta):
Thanks for reading this weird introduction to Deep Learning. I hope it helped you get started in this amazing area, or maybe just discover something new.