What is Deep Learning and How Does It Work?

OpenAI is one of the world’s top AI companies.  Elon Musk, Sam Altman and other Silicon Valley luminaries founded the company in 2015.  Since then, OpenAI has raised substantial amounts of money, including $1 billion from Microsoft in 2019.

One of the company’s innovations is GPT-3, which was trained on roughly 300 billion tokens (words and word fragments) of content.  The system can create human-sounding text like short stories, press releases, songs and even recipes.

So how is this possible?  Well, at the heart of GPT-3 is deep learning.  This is a form of AI that uses complex neural networks – loosely modeled on the brain – to detect patterns in huge datasets.

Keep in mind that deep learning is not new.  The origins actually go back to the 1940s when computers were in the nascent stages.

But it was in the 1980s that deep learning started to show promise.  This was the result of the pioneering work of academics like Yann LeCun, Geoffrey Hinton and Yoshua Bengio.

It would not be until the last decade, though, that deep learning was widely commercialized.  This was because of access to massive datasets, the rapid advances in GPUs (Graphics Processing Units) and the development of new algorithms.

So then, what is deep learning, exactly?  How does it work?  Well, as with anything in the AI world, it’s complex.  But the main concepts can be explained at a high level.

What is deep learning?

First, a deep learning model ingests data and processes it through nodes, which are called neurons.  The neurons are organized into three kinds of layers:  the input layer, one or more hidden layers and the output layer.  It is the stack of hidden layers that makes the network “deep.”

Let’s say we want a deep learning model to recognize the number “3.”  The data would be the pixels of an image, and their values (typically scaled to range from 0 to 1) are placed into the input layer.  Each connection between neurons is assigned a weight, which determines how strongly one neuron’s value influences the next.  The weights usually start out as small random numbers, and training adjusts them so that the pixels that matter most for recognizing a “3” carry the most influence.
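To make the idea of weighted inputs concrete, here is a minimal sketch of a single neuron in plain Python.  The 2x2 “image,” the weight values, the bias and the sigmoid activation are all assumptions chosen for illustration; a real digit recognizer would have hundreds of inputs and learn its weights from data.

```python
import math

def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical 2x2 image, flattened to four pixel intensities in [0, 1].
pixels = [0.0, 1.0, 1.0, 0.0]
# Illustrative weights: positive values reward a bright pixel,
# negative values penalize it.
weights = [-1.0, 2.0, 2.0, -1.0]
bias = -1.0

activation = neuron_output(pixels, weights, bias)
print(f"activation = {activation:.3f}")  # activation = 0.953
```

The neuron responds strongly here because the bright pixels line up with the positive weights.  An all-dark image would leave only the bias and give a much weaker activation.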

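The data will then move through the hidden layers to the output layer through a process called forward propagation, which produces a prediction (for example, the probability that the image is a “3”).  But this is not the end of the process.  The prediction is compared with the correct answer, and the error is sent backward through the network to adjust the weights (this is called backpropagation).  The cycle is repeated again and again over the training data so as to get more accurate results.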
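The training cycle can be sketched in a few dozen lines of plain Python.  This is only an illustration under stated assumptions: the toy dataset (the logical OR of two inputs), the network size (two inputs, two hidden neurons, one output), the starting weights, the learning rate and the squared-error loss are all made up for the example and are not taken from any particular system.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy dataset (an assumption): inputs and the logical OR of those inputs.
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
        ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

def make_network():
    """Fixed, arbitrary starting weights: 2 inputs -> 2 hidden -> 1 output."""
    return {
        "w_hidden": [[0.5, -0.4], [0.3, 0.8]],  # one row per hidden neuron
        "b_hidden": [0.0, 0.0],
        "w_out": [0.7, -0.6],
        "b_out": 0.0,
    }

def forward(net, x):
    """Forward propagation: input -> hidden activations -> output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    output = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden))
                     + net["b_out"])
    return hidden, output

def mean_squared_error(net):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in DATA) / len(DATA)

def train(net, epochs=2000, lr=0.5):
    for _ in range(epochs):
        for x, y in DATA:
            hidden, output = forward(net, x)
            # Backpropagation: push the output error back through each layer.
            delta_out = (output - y) * output * (1.0 - output)
            delta_hidden = [delta_out * w * h * (1.0 - h)
                            for w, h in zip(net["w_out"], hidden)]
            # Gradient descent: nudge each weight against its gradient.
            for j, h in enumerate(hidden):
                net["w_out"][j] -= lr * delta_out * h
                net["b_hidden"][j] -= lr * delta_hidden[j]
                for i, xi in enumerate(x):
                    net["w_hidden"][j][i] -= lr * delta_hidden[j] * xi
            net["b_out"] -= lr * delta_out

net = make_network()
loss_before = mean_squared_error(net)
train(net)
loss_after = mean_squared_error(net)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Each pass over the data is one round of forward propagation followed by backpropagation, and the printed loss should shrink as the weights settle.  Production frameworks compute these gradients automatically, so the hand-written update rules here are only meant to show what is happening underneath.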

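A key advantage to deep learning – especially in comparison to more traditional forms of machine learning – is that there is not as much need for hand-crafting the features of the data, which can be a tedious and time-consuming process.  The reason is that the models essentially find the underlying patterns on their own.  Some models, such as GPT-3, also learn from raw text without manually labeled examples, although many deep learning applications still depend on labeled data.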

Different types of deep learning models

Now there are different types of these models to choose from.  Two of the main ones are the following:

  • Convolutional Neural Network (CNN): This system is used primarily for computer vision applications, such as detecting patterns in 2D images.  To do this, it applies “convolutions,” which slide small filters across the image to pick out local features like edges and shapes.
  • Recurrent Neural Network (RNN): This is for sequential data, where the order of the items matters, such as text, audio or time series.  It is often used for speech recognition or even forecasting stock prices.  The RNN keeps a memory of what it has seen so far and uses it to predict the next word or number in a series.
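The core operation behind each of the two model types can be sketched in plain Python.  Everything here is an assumption for illustration: the 4x4 image, the hand-picked edge filter, and the single-unit recurrent step with made-up weights.  Real CNNs and RNNs learn their filters and weights during training and use many of them at once.

```python
import math

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    This is the cross-correlation form that deep learning libraries
    commonly call a convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# 4x4 image: dark left half, bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# Hand-picked filter that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1],
               [-1, 1]]
feature_map = convolve2d(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: mix the current input with the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.9, b=0.0):
    """Feed a sequence through a single recurrent unit, one item at a time."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

# The input is non-zero only at the first step, yet the later states
# stay positive because the unit carries its memory forward.
states = run_rnn([1.0, 0.0, 0.0])
print([round(s, 3) for s in states])
```

The feature map lights up only in the column where the image changes from dark to bright, which is the sense in which a convolution detects a local pattern.  In the recurrent example, the hidden state is what lets earlier items in a series influence later predictions.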

And yes, there are a myriad of use cases for deep learning.  Here are just some examples:

  • Self-Driving Cars: This is perhaps the best-known use case.  Deep learning allows a car to “see” pedestrians, understand when a light changes or detect a stop sign.
  • Medicine: A sophisticated deep learning model can help analyze X-rays to diagnose cancer.  In some cases, the accuracy rates have been higher than for trained physicians.
  • Chatbots: Deep learning models like RNNs are critical for applications like Siri, Cortana and Alexa to recognize speech.
  • AIOps (Artificial Intelligence for IT Operations): This is an emerging category that uses AI to monitor the performance of IT assets.  Here, deep learning can help find patterns quickly and accurately, which reduces costs and helps to improve the performance of the systems.

As with any technology, deep learning has its problems and challenges.  The models can take considerable time to train and the compute costs can be high.  There is also the problem of explainability.  In other words, it can be extremely difficult to understand why a model is getting certain results (this is called the “black box” problem).  As a result, there could be less trust in the AI or even problems with regulatory authorities.

But despite these problems, it has proven to be quite powerful and versatile.  All in all, the technology will continue to gain momentum and become a greater part of the AI megatrend.
