How to Train an Artificial Neural Network Part 2 by Ray Sulewski
Published on: 10/16/2018
Author: Ray Sulewski, Senior Consultant, Award Solutions
In Part 1 of this four-part series, we focused on the data used to train our critter ANN model. In Part 2, we will look at the methodology for training the model based on the critter input data.
We selected supervised learning as the training method. It requires that we provide the correct result (the “label”) for each training sample. The output (“prediction”) of the Critters ML model is compared to the label for each training sample, and the difference is used to send adjustments back into the model to help it learn. This is called backpropagation.
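The comparison between prediction and label can be sketched in a few lines of Python. This is only an illustration: the one-hot encoding of the label, the function names, and the example prediction values are assumptions, not part of the Critters model itself.

```python
# Hypothetical sketch: comparing a prediction with its label.
CLASSES = ["dog", "cat", "squirrel"]

def one_hot(label):
    """Encode a class name as a target vector, e.g. 'cat' -> [0.0, 1.0, 0.0]."""
    return [1.0 if name == label else 0.0 for name in CLASSES]

def prediction_error(prediction, label):
    """Return (target - prediction) for each class output."""
    target = one_hot(label)
    return [t - p for t, p in zip(target, prediction)]

# Assumed example: the model leans towards "dog", but the sample is a cat.
prediction = [0.7, 0.2, 0.1]
error = prediction_error(prediction, "cat")
```

A large error on a class output signals that the weights feeding that output need a larger adjustment.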
What are these adjustments?
There is a weight applied to each neuron-to-neuron connection in an ANN. In this model, the only weights are those on the connections between the neurons in the input layer and the neurons in the next layer, which here is the output layer. The neurons in the output layer apply an activation function to the data and the associated weights to determine how relevant each feature is to classifying the critter. (In most ANNs, activation functions are also used in the hidden layers.) This is where the machine learning comes in. A single processing pass through the ANN with a complete set of training data (which could be thousands of examples describing the classes that you want the ANN to learn) is called an epoch. After each epoch completes, the error is checked and calculated from the machine's output, and the weights are adjusted accordingly. Training an ANN could require hundreds of epochs or more.
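As a rough sketch of what an output neuron does with the data and weights, the Python below computes a weighted sum of eight feature values for each class and passes it through an activation function. The sigmoid activation, the feature values, and the weight values are all assumptions chosen for illustration; the activation function used for the critter model is covered in Part 3.

```python
import math

def weighted_sum(features, weights):
    """Multiply each feature value by its connection weight and add them up."""
    return sum(x * w for x, w in zip(features, weights))

def sigmoid(z):
    """Assumed activation function: squashes any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(features, weights_by_class):
    """One forward pass: one output value per critter class."""
    return {
        name: sigmoid(weighted_sum(features, weights))
        for name, weights in weights_by_class.items()
    }

# Hypothetical values: eight features, one weight per feature-to-class connection.
features = [1, 0, 1, 1, 0, 0, 1, 0]
weights_by_class = {
    "dog":      [0.5, -0.2, 0.1, 0.3, 0.0, -0.4, 0.2, 0.1],
    "cat":      [-0.3, 0.4, 0.2, -0.1, 0.6, 0.1, -0.2, 0.0],
    "squirrel": [0.1, 0.1, -0.5, 0.2, -0.3, 0.7, 0.0, -0.1],
}
outputs = forward(features, weights_by_class)
```

Each class output depends only on the weights on its own eight connections, which is why adjusting a weight changes how strongly that one feature counts towards that one class.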
What are the weights?
Before we start training the machine, we need to apply an initial random weight, wDi (for Dog), wCi (for Cat), or wSi (for Squirrel), to each critter feature for each of the critter classes. The weight value is the key to the learning method. Each weight value will be adjusted after an epoch is completed.
After the completion of each epoch during training, a delta weight value, ∆w, for each critter feature is calculated and propagated back. This is used to adjust the weights used in the prior epoch. The new weights will be used in the next epoch. This training method is separate from the architecture of the trained network after training is completed.
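The per-epoch adjustment can be sketched as a loop that accumulates a ∆w for every weight over the whole training set and applies it only once the epoch is finished. This is a minimal sketch, assuming a single output neuron with no activation function and a simple delta rule (∆w = learning rate × error × feature value); the learning rate, the sample data, and the function name are hypothetical and not taken from the critter model.

```python
def train(samples, weights, learning_rate=0.1, epochs=50):
    """Adjust the weights once per epoch; return final weights and error history."""
    weights = list(weights)
    history = []
    for _ in range(epochs):
        delta_w = [0.0] * len(weights)   # one delta weight per feature
        total_sq_error = 0.0
        for features, label in samples:
            prediction = sum(x * w for x, w in zip(features, weights))
            error = label - prediction
            total_sq_error += error ** 2
            for i, x in enumerate(features):
                delta_w[i] += learning_rate * error * x
        # The epoch is complete: apply the deltas to get the next epoch's weights.
        weights = [w + dw for w, dw in zip(weights, delta_w)]
        history.append(total_sq_error)
    return weights, history

# Hypothetical toy data: two features, label is 1.0 whenever the first feature is set.
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
weights, history = train(samples, [0.3, -0.6])
```

In this toy run the squared error recorded in `history` shrinks from epoch to epoch, which is the behavior the repeated weight adjustments are meant to produce.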
With three classes (dog, cat and squirrel) and eight identifying features, this requires the initial input of 24 random weight values: one for each feature-to-class connection.
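Generating the 24 starting values could look like the following sketch. The range of -15 to 82 is borrowed from the extremes the article mentions for its sample weights and is otherwise an assumption, as are the uniform distribution, the seed, and the function name.

```python
import random

CLASSES = ["dog", "cat", "squirrel"]
NUM_FEATURES = 8

def init_weights(low=-15.0, high=82.0, seed=None):
    """Draw one random weight for every feature-to-class connection."""
    rng = random.Random(seed)
    return {
        name: [rng.uniform(low, high) for _ in range(NUM_FEATURES)]
        for name in CLASSES
    }

weights = init_weights(seed=42)
total_weights = sum(len(w) for w in weights.values())  # 3 classes x 8 features
```

Passing a fixed seed makes the "random" starting point repeatable, which is convenient when comparing training runs.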
Here are the sample initial random weights that we used:
Note that in this example, some of the initial random weights are negative; they range from as low as -15 to as high as 82. To keep the summation and activation formulas working properly, these weights are normalized to values between -1 and 1. For this we use a normalization algorithm shown here for each feature weight:
After applying the weight normalization formula, the normalized initial weights are:
Normalization of the weights is only done before the start of machine learning. It is not done during the actual machine learning.
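One common way to bring a set of values into the range -1 to 1 is min-max scaling, sketched below in Python. This is an assumption for illustration and may differ from the normalization formula used for the critter weights; the function name and the three sample values are hypothetical.

```python
def normalize_weights(weights):
    """Min-max scale values to [-1, 1]: w_norm = 2 * (w - min) / (max - min) - 1."""
    low, high = min(weights), max(weights)
    if high == low:
        # All weights are identical, so there is no spread to scale.
        return [0.0 for _ in weights]
    return [2.0 * (w - low) / (high - low) - 1.0 for w in weights]

# Assumed sample: the largest value maps to 1, the smallest to -1.
normalized = normalize_weights([82.0, -15.0, 33.5])
```

With this scheme the highest raw weight (82) becomes 1, the lowest (-15) becomes -1, and the midpoint (33.5) becomes 0.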
Now what does the critters machine look like with the training data and weights?
wi1,D denotes the Feature 1 data value together with its associated weight value for a dog. The other connections carry similar weight values for cat and squirrel.
At this stage we have defined the training data and decided to use a supervised learning approach to train the model. We have also set the initial weights of the model that are the starting point of the training process. One of the key elements of a neuron is the activation function. The activation function will be explored in more detail in Part 3.
About the Author
Ray Sulewski is a Senior Consultant at Award Solutions. He joined Award Solutions in 2006, bringing his expertise in CDMA technologies and overall experience in real-time product development, delivery and support of wireless telecommunications systems. Ray has over 36 years of experience in the wireless telecom industry.
About Award Solutions, Inc.
Award Solutions is the trusted training partner to the world's best networks. We help companies tackle new technologies by equipping their teams with knowledge and skills. Award Solutions invests heavily in technology, research, engineering, and labs to ensure our customers make the most of their resource and network investments.
Award has expertise across all technologies that touch wireless: 5G, Artificial Intelligence, Machine Learning, Network Virtualization, Data Visualization, Data Manipulation, 4G LTE, and more.