Today, neural networks are used for solving many business problems such as sales forecasting, customer research, data validation, and risk management. For example, at Statsbot we apply neural networks for time series predictions, anomaly detection in data, and natural language understanding.
In this post, we’ll explain what neural networks are, the main challenges beginners face when working with them, popular types of neural networks, and their applications. We’ll also describe how you can apply neural networks in different industries and departments.
Recently there has been a great deal of buzz around the term “neural network” in computer science, and it has attracted attention from many people. But what is this all about, how do these networks work, and are they really beneficial?
Essentially, neural networks are composed of layers of computational units called neurons, with connections between neurons in adjacent layers. These networks transform data until they can classify it as an output. Each neuron multiplies an initial value by some weight, sums the result with other values coming into the same neuron, adjusts the resulting number by the neuron’s bias, and then normalizes the output with an activation function.
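The computation of a single neuron described above (weighted sum, bias, activation) can be sketched in a few lines. This is a minimal illustration, not any particular library’s implementation; the input values, weights, and bias below are arbitrary example numbers, and the sigmoid is just one common choice of activation function:

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of inputs plus bias, squashed by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid normalizes output to (0, 1)

# Example: a neuron with two inputs
out = neuron([0.5, -1.0], [0.8, 0.2], bias=0.1)
```

Because the sigmoid squashes any weighted sum into (0, 1), the output can be read as a soft “activation level” of the neuron.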
A key feature of neural networks is an iterative learning process in which records (rows) are presented to the network one at a time, and the weights associated with the input values are adjusted each time. After all cases are presented, the process is often repeated. During this learning phase, the network trains by adjusting the weights to predict the correct class label of input samples.
Advantages of neural networks include their high tolerance to noisy data, as well as their ability to classify patterns on which they have not been trained. The most popular neural network algorithm is the backpropagation algorithm.
Once a network has been structured for a particular application, that network is ready to be trained. To start this process, the initial weights (described in the next section) are chosen randomly. Then the training (learning) begins.
The network processes the records in the “training set” one at a time, using the weights and functions in the hidden layers, then compares the resulting outputs against the desired outputs. Errors are then propagated back through the system, causing the system to adjust the weights for application to the next record.
This process occurs repeatedly as the weights are tweaked. During the training of a network, the same set of data is processed many times as the connection weights are continually refined.
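The training loop described above (forward pass, compare against the desired output, propagate the error back, adjust the weights, repeat over the same data) can be sketched with a single sigmoid neuron learning an OR-like mapping. This is a toy illustration of the idea, not the full backpropagation algorithm for a multi-layer network; the dataset, learning rate, and epoch count are arbitrary choices:

```python
import math
import random

random.seed(0)

# Tiny "training set": two inputs, target output in {0, 1} (logical OR)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w = [random.uniform(-1, 1) for _ in range(2)]  # initial weights chosen randomly
b = 0.0
lr = 0.5

def forward(x):
    return 1.0 / (1.0 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))

for epoch in range(2000):            # the same records are processed many times
    for x, target in data:           # records are presented one at a time
        y = forward(x)               # forward pass with the current weights
        err = y - target             # compare output against the desired output
        grad = err * y * (1 - y)     # error propagated back through the sigmoid
        w[0] -= lr * grad * x[0]     # weights adjusted before the next record
        w[1] -= lr * grad * x[1]
        b -= lr * grad
```

After training, the neuron’s outputs move toward the desired class labels for each input pattern.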
One of the challenges for beginners in learning neural networks is understanding what exactly goes on at each layer. We know that after training, each layer extracts higher and higher-level features of the dataset (input), until the final layer essentially makes a decision about what the input features refer to. How does this happen?
Instead of exactly prescribing which feature we want the network to amplify, we can let the network make that decision. Let’s say we simply feed the network an arbitrary image or photo and let the network analyze the picture. We then pick a layer and ask the network to enhance whatever it detected. Each layer of the network deals with features at a different level of abstraction, so the complexity of features we generate depends on which layer we choose to enhance.
In this post on neural networks for beginners, we’ll look at autoencoders, convolutional neural networks, and recurrent neural networks.
Autoencoders were historically used for layer-wise pre-training, based on the observation that random initialization is a bad idea and that pre-training each layer with an unsupervised learning algorithm can allow for better initial weights. Deep Belief Networks are one example of such unsupervised pre-training. There have been a few recent research attempts to revive this area, for example, using variational methods for probabilistic autoencoders.
Today, autoencoders are rarely used as a pre-training step in practical applications: batch normalization has allowed for much deeper networks, and with residual learning we can train arbitrarily deep networks from scratch. Still, with appropriate dimensionality and sparsity constraints, autoencoders can learn data projections that are more interesting than those from PCA or other basic techniques.
Let’s look at two interesting practical applications of autoencoders:
• Data denoising: a denoising autoencoder built from convolutional layers can be used for efficient denoising of medical images. A stochastic corruption process randomly sets some of the inputs to zero, forcing the autoencoder to predict the missing (corrupted) values for randomly selected subsets of the inputs.
• Dimensionality reduction for data visualization: methods such as Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) reduce data to a small number of dimensions. Used in conjunction with neural network training, they can increase model prediction accuracy. That said, MLP prediction accuracy depends greatly on the network architecture, the pre-processing of the data, and the type of problem the network was developed for.
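To make the dimensionality-reduction idea concrete, here is a minimal PCA sketch: it finds the direction of maximum variance in a small 2-D dataset via power iteration and projects each point onto it, reducing the data to one dimension. This is a hand-rolled stand-in for library routines such as scikit-learn’s PCA, and the data points are arbitrary examples:

```python
# Small 2-D dataset (the two coordinates are positively correlated)
pts = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0), (2.3, 2.7)]
n = len(pts)

# Center the data
mx = sum(p[0] for p in pts) / n
my = sum(p[1] for p in pts) / n
centered = [(x - mx, y - my) for x, y in pts]

# 2x2 covariance matrix entries
cxx = sum(x * x for x, _ in centered) / n
cyy = sum(y * y for _, y in centered) / n
cxy = sum(x * y for x, y in centered) / n

# Power iteration: repeatedly applying the covariance matrix converges on
# the direction of maximum variance (the first principal component)
v = (1.0, 0.0)
for _ in range(50):
    vx = cxx * v[0] + cxy * v[1]
    vy = cxy * v[0] + cyy * v[1]
    norm = (vx * vx + vy * vy) ** 0.5
    v = (vx / norm, vy / norm)

# Project each point onto the component: 2-D data reduced to 1-D scores
scores = [x * v[0] + y * v[1] for x, y in centered]
```

For visualization, one would keep the top two or three components and scatter-plot the scores; t-SNE serves the same purpose but preserves local neighborhood structure instead of global variance.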
ConvNets derive their name from the “convolution” operator. The primary purpose of convolution in the case of a ConvNet is to extract features from the input image. Convolution preserves the spatial relationship between pixels by learning image features using small squares of input data. ConvNets have been successful in such fields as:
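The convolution operation itself, sliding a small square of weights across the image and taking weighted sums, can be sketched directly. In a real ConvNet the kernel values are learned during training; the fixed edge-detecting kernel and the toy 5×5 “image” below are just illustrative choices:

```python
# Toy grayscale "image": dark on the left, bright on the right
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]  # responds strongly to vertical edges

def convolve(img, k):
    """Slide the kernel over the image, taking a weighted sum at each position."""
    kh, kw = len(k), len(k[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(k[a][b] * img[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

feature_map = convolve(image, kernel)
```

The resulting feature map is near zero over the flat regions and large exactly where the dark-to-bright edge sits, which is the sense in which convolution “extracts features” while preserving the spatial relationship between pixels.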
In face-detection work, a CNN cascade has been used for fast face detection. The detector evaluates the input image at low resolution to quickly reject non-face regions, then carefully processes the challenging regions at higher resolution for accurate detection.
Calibration nets were also introduced in the cascade to accelerate detection and improve bounding box quality.
In self-driving cars, depth estimation is an important consideration, as it ensures the safety of the passengers and of other vehicles. CNNs have been applied this way in projects like NVIDIA’s autonomous car.
The layered structure of CNNs makes them extremely versatile, since each layer learns its own set of parameters for processing the input. Related network types include deep belief networks (DBNs). Convolutional neural networks are traditionally used for image analysis and object recognition.
And for fun, a link to use CNNs to drive a car in a game simulator and predict steering angle.
RNNs can be trained for sequence generation by processing real data sequences one step at a time and predicting what comes next. Here is the guide on how to implement such a model.
Assuming the predictions are probabilistic, novel sequences can be generated from a trained network by iteratively sampling from the network’s output distribution, then feeding in the sample as input at the next step. In other words, by making the network treat its inventions as if they were real, much like a person dreaming.
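The generate-by-sampling loop described above can be sketched with a toy stand-in for the network: here a simple lookup table gives the probability of the next character, where a trained RNN would compute that distribution from its hidden state. The two-character alphabet and the probabilities are arbitrary illustrative values:

```python
import random

random.seed(42)

# Toy stand-in for a trained network's output distribution:
# probability of the next character given the current one
next_char_probs = {
    "a": {"b": 0.9, "a": 0.1},
    "b": {"a": 0.8, "b": 0.2},
}

def sample(dist):
    """Draw one character from a {char: probability} distribution."""
    r, total = random.random(), 0.0
    for ch, p in dist.items():
        total += p
        if r < total:
            return ch
    return ch  # fall back to the last key if rounding leaves a tiny gap

seq = ["a"]
for _ in range(10):
    # Feed the sample back in as the next input: the network treats its
    # own "inventions" as if they were real data
    seq.append(sample(next_char_probs[seq[-1]]))
generated = "".join(seq)
```

Swapping the lookup table for a real recurrent network’s softmax output gives exactly the sampling procedure used for text and handwriting generation.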
Can we learn to generate handwriting for a given text? To meet this challenge a soft window is convolved with the text string and fed as an extra input to the prediction network. The parameters of the window are output by the network at the same time as it makes the predictions, so that it dynamically determines an alignment between the text and the pen locations. Put simply, it learns to decide which character to write next.
A neural network can be trained to produce the outputs that are expected for a particular input. If we have a network that models a known sequence of values well, we can use it to predict future results. An obvious example is stock market prediction.
Neural networks are broadly used for real world business problems such as sales forecasting, customer research, data validation, and risk management.
Target marketing involves market segmentation, where we divide the market into distinct groups of customers with different consumer behavior.
Neural networks are well-equipped to carry this out by segmenting customers according to basic characteristics including demographics, economic status, location, purchase patterns, and attitude towards a product. Unsupervised neural networks can be used to automatically group and segment customers based on the similarity of their characteristics, while supervised neural networks can be trained to learn the boundaries between customer segments based on a group of customers.
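The unsupervised segmentation described above is often illustrated with clustering. As a minimal sketch, here is k-means grouping customers by two hypothetical, pre-scaled features (say, income and purchase frequency); the customer data, the feature choice, and k = 2 are all assumptions for illustration, and a self-organizing map or other unsupervised network would play the same role in practice:

```python
import random

random.seed(0)

# Hypothetical customers described by two scaled features
customers = [(0.1, 0.2), (0.15, 0.25), (0.2, 0.1),   # low-spend profile
             (0.8, 0.9), (0.85, 0.8), (0.9, 0.95)]   # high-spend profile

def kmeans(points, k, iters=20):
    centers = random.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each customer to the nearest segment center
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                              + (p[1] - centers[c][1]) ** 2)
            groups[idx].append(p)
        # Move each center to the mean of its assigned customers
        centers = [(sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
                   if g else centers[i] for i, g in enumerate(groups)]
    return centers, groups

centers, segments = kmeans(customers, k=2)
```

Each resulting segment can then be targeted with its own marketing strategy, which is the point of the market-segmentation step.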
Neural networks can simultaneously consider multiple variables such as market demand for a product, a customer’s income, population, and product price. Supermarket sales forecasting, for example, can benefit greatly from this.
If there is a relationship between two products over time, say within 3–4 months of buying a printer the customer returns to buy a new cartridge, then retailers can use this information to contact the customer, decreasing the chance that the customer will purchase the product from a competitor.
Neural networks have been applied successfully to problems like derivative securities pricing and hedging, futures price forecasting, exchange rate forecasting, and stock performance. Traditionally, statistical techniques have driven the software. These days, however, neural networks are the underlying technique driving the decision making.
Neural networks are a trending research area in medicine, and it is believed that they will see extensive application in biomedical systems in the next few years. At the moment, research mostly focuses on modelling parts of the human body and recognising diseases from various scans.
Perhaps NNs can, though, give us some insight into the “easy problems” of consciousness: how does the brain process environmental stimulation? How does it integrate information? But, the real question is, why and how is all of this processing, in humans, accompanied by an experienced inner life, and can a machine achieve such a self-awareness?
It makes us wonder whether neural networks could become a tool for artists — a new way to remix visual concepts — or perhaps even shed a little light on the roots of the creative process in general.
All in all, neural networks have made computer systems more useful by making them more human. So next time you think you might like your brain to be as reliable as a computer, think again — and be grateful you have such a superb neural network already installed in your head!
I hope that this introduction to neural networks for beginners will help you build your first project with NNs.