We’ve previously said that deep learning is a subfield of machine learning (both are subfields of artificial intelligence) and can be analogized to a layered cake. In keeping with our initial (and delicious) premise, here’s a more thorough explanation of deep learning:
Imagine you’re making a multi-layered cake, like a towering wedding cake. Each layer of the cake represents a layer in a neural network.
Simple neural networks (let’s compare them to a single-layered cake) follow a straightforward process: input goes in, gets transformed, and output comes out. It’s akin to having just one layer of sponge in our cake.
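To make the single-layer case concrete, here is a minimal sketch in PyTorch (our examples assume PyTorch; the sizes, 784 inputs and 10 outputs, are made up for illustration, as if classifying small images of handwritten digits):

```python
import torch
import torch.nn as nn

# A minimal sketch of the "single-layer cake": one transformation
# from input to output. The sizes are hypothetical.
simple_network = nn.Linear(in_features=784, out_features=10)

x = torch.randn(1, 784)  # one flattened input "goes in"
y = simple_network(x)    # gets transformed; "output comes out"
print(y.shape)           # torch.Size([1, 10])
```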
Now, “deep learning” is like our multi-layered wedding cake. Instead of one layer, we have many, many layers stacked on top of each other. Each layer contributes to the overall flavor and structure of the cake. In neural networks, having multiple layers allows for more complex transformations of the input data, enabling the network to understand intricate patterns and details.
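Here is what that stacking looks like in the same sketch form (the layer sizes are illustrative assumptions, not a prescribed architecture):

```python
import torch.nn as nn

# A sketch of the "wedding cake": several layers stacked, each one
# transforming the output of the layer below it.
deep_network = nn.Sequential(
    nn.Linear(784, 256),  # first layer of sponge
    nn.ReLU(),            # a nonlinearity between layers
    nn.Linear(256, 128),  # a deeper layer, building on the first
    nn.ReLU(),
    nn.Linear(128, 10),   # the top layer produces the final output
)
```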
Why is depth important? Well, just as each layer of our cake can have a different flavor or texture, adding depth (or layers) to our network allows it to capture distinctive features and nuances in the data. For instance, in image recognition, initial layers might identify simple patterns like edges or colors, while deeper layers could recognize more complex features like shapes, and even deeper layers might identify objects or scenes. There are different hidden layer “recipes” for each kind of deep learning application (image recognition would use different stacks of layers than natural language processing, for example).
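As a hedged sketch of one such image-recognition “recipe” (the channel counts are assumptions, and we assume 3-channel, 32×32 images), convolutional layers sit early in the stack, with deeper layers combining their output into higher-level features:

```python
import torch.nn as nn

# Early convolutional layers are suited to spotting edges, colors,
# and textures; deeper layers combine those into shapes and objects.
image_recognizer = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early: edges, colors
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # deeper: shapes
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                    # deepest: objects/scenes
)
```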
As with creating the most delicious cake, intuition and trial and error go into building the layers. A good baker can predict that a cake with alternating layers of raspberry and chocolate will turn out better than one with layers of lemon and chocolate. Similarly, a good AI engineer has some intuition as to which layer designs are more likely to work. But, just as with fusion cuisine, delightful surprises abound when new combinations are tried, and the reasons for the results are not always clear. The number of layers and how they are combined are some of the model’s hyperparameters: the pieces of the model (cake) you set yourself, as opposed to the parameters, or weights, that the network learns during training.
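One last sketch illustrates that split between what you set and what the network learns (the names and values below are illustrative assumptions, not a recommended recipe):

```python
import torch.nn as nn

# Hyperparameters: the pieces of the recipe you choose up front.
hidden_size = 128   # hyperparameter: how "thick" each hidden layer is
depth = 3           # hyperparameter: how many hidden layers to stack

layers = [nn.Linear(784, hidden_size), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(hidden_size, hidden_size), nn.ReLU()]
layers.append(nn.Linear(hidden_size, 10))
model = nn.Sequential(*layers)

# Parameters (weights): what training adjusts; you never set these by hand.
print(sum(p.numel() for p in model.parameters()), "learned parameters")
```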