Investigating Deep Learning

The term “deep learning” was coined sometime in the 1980s, but its place in the common vernacular of 2023 is usually attributed to Geoffrey Hinton. Its first academic use is reported to have been by Rina Dechter in 1986, in the paper where the term was born.

I believe her work was far ahead of its time. In those days the data wasn’t available at scale, and the computational cost of Deep Learning wasn’t economically feasible. Even in 2023 it’s still expensive, so a project of that size back then would probably have needed government sponsorship.

It took continuous effort and innovation from Hinton to bring the practice to a wider audience, and by now it’s in some way a part of most technology we interact with. Potentially even most media, because the articles we read, the videos we watch, and the recommendations on entertainment platforms are driven by these processes.

OK, but what is Deep Learning?

High-Level View

Deep Learning is just one of the many branches of machine learning. It uses neural networks that have multiple interconnected layers. Most of the neural networks that I have read about have more than one layer, but that could also be because I’m late to the game. I’m sure as I research more I’ll find uses for single-layer networks (a single-layer perceptron, for example).

Meant to emulate animal brains, these synthetic networks contain interconnected nodes that pass information from one layer of nodes to the next, until it completes its journey and terminates at the ‘outputs’.

It’s called ‘deep’ because of the hidden layers stacked between the network’s inputs and outputs. Those layers give the model capabilities that enable progressive learning, as well as other useful but sometimes puzzling features.
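The layered structure described above can be sketched in a few lines of plain Python. This is only an illustration, not how any particular library does it: the layer sizes, the weights, and the choice of a sigmoid activation are all assumptions made up for the example.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """Each node takes a weighted sum of every input, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the inputs through each layer in turn and return the final outputs."""
    activation = inputs
    for weights, biases in layers:
        activation = dense_layer(activation, weights, biases)
    return activation

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 2 hidden nodes -> 1 output.
# Each entry is (weights, biases); the numbers are arbitrary, not trained values.
network = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.05, -0.05]),
    ([[0.6, -0.6]], [0.0]),
]

output = forward([1.0, 0.5], network)
print(output)
```

The two middle entries in `network` are the hidden layers: nothing outside the network ever sees their values directly, only the final output.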

With enough layers and complexity, the researchers and engineers building the network are fairly likely to be unable to explain how the system produces its predictions or outputs, even when those are very accurate.

Unique Training Procedures

Typically Deep Learning is used on massive data sets. It’s possible to effectively use it on smaller sets of data, but I think it’s generally used on sets of data that are very difficult or time-consuming for humans to classify or label. That’s one part of its value.

Another part of its value is that it adjusts its own parameters while it’s actively training. In some setups (unsupervised or self-supervised learning, for example) it can identify relationships in the data with no labeling at all. In every case it produces an intermediate output, evaluates it, and uses that evaluation to tune its own parameters.

It uses Backpropagation and Gradient Descent to tune the weights on the connections inside the model. Deep Learning can become adept at tasks ranging from Classification to Clustering and Anomaly Detection. Engineers and researchers sometimes struggle or fail to prove that the correlations the model finds are actually meaningful, even when they show up consistently across a given data set.
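To make Backpropagation and Gradient Descent a little less abstract, here is a minimal sketch in plain Python. Everything specific in it is an assumption chosen for illustration: the toy data, the starting weights, the learning rate, and the tiny shape of the network (one input, one hidden tanh node, one linear output). Real frameworks compute these gradients automatically; here they are written out by hand with the chain rule.

```python
import math

# Toy data (an assumption for illustration): learn y = tanh(2x) from five points.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [math.tanh(2.0 * x) for x in xs]

# Parameters of a tiny network: input -> one hidden tanh node -> linear output.
params = {"w1": 0.5, "b1": 0.0, "w2": 0.5, "b2": 0.0}
learning_rate = 0.1

def train_step(p):
    """One forward pass, one backward pass, one gradient descent update.

    Returns the mean squared error measured before the update.
    """
    n = len(xs)
    grads = {"w1": 0.0, "b1": 0.0, "w2": 0.0, "b2": 0.0}
    loss = 0.0
    for x, y in zip(xs, ys):
        # Forward pass: compute the prediction.
        h = math.tanh(p["w1"] * x + p["b1"])
        y_hat = p["w2"] * h + p["b2"]
        loss += (y_hat - y) ** 2 / n

        # Backward pass: chain rule from the loss back to each parameter.
        d_y_hat = 2.0 * (y_hat - y) / n
        grads["w2"] += d_y_hat * h
        grads["b2"] += d_y_hat
        d_z = d_y_hat * p["w2"] * (1.0 - h * h)  # derivative of tanh is 1 - tanh^2
        grads["w1"] += d_z * x
        grads["b1"] += d_z

    # Gradient descent: nudge every parameter against its gradient.
    for name in p:
        p[name] -= learning_rate * grads[name]
    return loss

losses = [train_step(params) for _ in range(200)]
print(f"loss before training: {losses[0]:.4f}")
print(f"loss after training:  {losses[-1]:.4f}")
```

The backward pass is the Backpropagation part (working out how much each weight contributed to the error), and the final loop in `train_step` is the Gradient Descent part (moving each weight a small step in the direction that reduces the error).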

Deep Learning Use Cases

The versatility of Deep Learning makes it powerful as a general prediction engine or solution generator. The areas where it has had some of its greatest successes are:

  1. Computer Vision

  2. Natural Language Processing

  3. Recommendation Systems

  4. Anomaly Detection

  5. Predictive Analytics

It’s absolutely pervasive, and for good reason. In my opinion, this is a tool that helps drive human creativity by taking redundant and dreadful work off people’s plates. It allows people to focus on using the insights it produces and making them beneficial to the society we have built.

Conclusion

Understanding what Deep Learning is was a requirement for me before continuing through the TensorFlow tutorials. I was sick of watching these systems just start interpreting symbols from the MNIST data set while I had literally no idea what in the world was going on.

This post does not attempt to teach how to do Deep Learning. Rather, it is meant to show the scope of Deep Learning and what kinds of problems it should be used to solve. It also shows that I am a fan of it, and that I am excited to move on to the next steps and become an engineer capable of handily building Deep Learning systems that generate insights.
