How to Train a Neural Network Part 4

This is Part 4 of a four-part series on training an ANN model. In Part 1 of this series, we explored the data used to train the model. In Part 2, we looked at the training methodology. In Part 3, we explored the role of the activation function in the model and the role of backpropagation in training the weights. In this final part, we will see the model being trained. Training means adjusting the weights so that the model's predictions are as accurate as possible.

Training the Model

When we train the model, we provide an input file with the training data set, the correct results for training (“labels”), and random initial weights. Alternatively, the random initial weights could be calculated at runtime. We also indicate how many times we want to run through the training data; in other words, how many training cycles, or epochs. In addition, we provide a learning rate (“Alpha”) that acts as a multiplier on the size of each weight adjustment.
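To make these inputs concrete, here is a minimal sketch in Python of how such a training setup might look. The file name, feature layout, class encoding, and random seed are hypothetical illustrations, not values taken from the model in this series.

    import numpy as np

    ALPHA = 0.5    # learning rate: multiplier on the size of each weight adjustment
    EPOCHS = 100   # number of passes through the training data

    # Hypothetical input file: 15 training samples, each row holding numeric
    # feature values followed by a label (0 = cat, 1 = dog, 2 = squirrel).
    data = np.loadtxt("training_data.csv", delimiter=",")
    features, labels = data[:, :-1], data[:, -1].astype(int)

    # Random initial weights: one weight vector per output class.
    rng = np.random.default_rng(seed=42)
    weights = rng.uniform(-0.5, 0.5, size=(3, features.shape[1]))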

Here are two examples: one run of 100 training epochs with a learning rate (“Alpha”) of 0.5, and another with an Alpha of 3.0. Note the difference in the number of epochs needed before a low error rate was reached.
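The sketch below continues the hypothetical setup above and shows one way such a comparison could be run: a single-layer network with a sigmoid activation and a squared-error gradient update, which is a plausible arrangement given Part 3 but not necessarily the exact formulas used in this series. Each weight adjustment is scaled by Alpha, so a larger Alpha produces larger per-epoch changes.

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train(features, labels, alpha, epochs):
        """Train a single-layer network; return final weights and errors per epoch."""
        weights = rng.uniform(-0.5, 0.5, size=(3, features.shape[1]))
        targets = np.eye(3)[labels]  # one-hot targets, one column per class
        errors_per_epoch = []
        for epoch in range(epochs):
            for x, t in zip(features, targets):
                y = sigmoid(weights @ x)
                # Gradient of the squared error through the sigmoid,
                # scaled by the learning rate (Alpha).
                grad = (y - t) * y * (1.0 - y)
                weights -= alpha * np.outer(grad, x)
            predictions = sigmoid(features @ weights.T).argmax(axis=1)
            errors_per_epoch.append(int((predictions != labels).sum()))
        return weights, errors_per_epoch

    for alpha in (0.5, 3.0):
        _, errs = train(features, labels, alpha, EPOCHS)
        best = min(errs)
        print(f"Alpha={alpha}: low of {best} errors first reached in epoch {errs.index(best) + 1}")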

Training Run using Alpha = 0.5

Here are some of the results from training with an Alpha of 0.5 for 100 epochs. In this training run, the results converged to a low of 2 errors out of 15 training samples by epoch 80.

Training Run

Weight Adjustments, Alpha = 0.5

The charts below show the new weight value for each feature after the adjustments applied in each epoch.

Note the changes in the adjusted weights over the first 4 epochs and at the 40th and 80th epochs.

Dog Feature Weights

Since we used an Alpha of 0.5, the learning rate produced small adjustments in the dog feature weight values between epochs, averaging less than 0.02 per epoch, before the run converged after 80 epochs.

Cat Feature Weights

While it might appear that all weight adjustments move in one direction, the Cat feature weights show that the adjustments, though minor, changed direction during training for the Indoor/outdoor, Shoulder height, and Tail features.

Squirrel Feature Weights

In this example, all of the squirrel feature weights adjusted gradually in the same direction.

Training Run using Alpha = 3.0

Here are some of the results from training with an Alpha of 3.0 for 100 epochs. In this training run, the results converged to a low of 2 errors out of 15 training samples by epoch 14, likely due to the higher Alpha value.

Training Run

Weight Adjustments, Alpha=3.0

Note that in this example the weight adjustments range between 0.02 and 0.2 per epoch, larger than those produced with an Alpha of 0.5. The training results begin to converge in epoch 14, where 2 errors out of 15 are reached.

Dog Feature Weights

The feature weight for Temperament stayed in a narrow range while the other weights gradually moved in the same direction until convergence.

Cat Feature Weights

The Cat feature weights for Hair and Shoulder height stayed in a narrow range, while the Indoor/outdoor weight had two adjustments in different directions before settling into a narrow range.

Squirrel Feature Weights

The squirrel feature weight for Tail moved in both positive and negative directions, while the other feature weights progressed gradually in one direction until convergence.

Note: This is just a simple example showing how weights are adjusted. The weight value for each feature is adjusted independently of the other feature weights. Depending on the initial starting weights, the final adjusted weight values could differ for the same number of epochs. Results will also differ for more complex ANNs with different formulas and larger sample data sets.

End of Training

Training ends whenever you see satisfactory results, that is, when the training results converge to a small number of errors. At that point, you are ready to use the trained model.

The final weight values from the epoch where you are satisfied with the error rate are then used to run all test data through the trained ANN. The test data, along with its associated labels, must be a different set from the training data.
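Continuing the hypothetical sketch from earlier, evaluating the final weights on a held-out test set might look like the following; the test file name is an assumption, and final_weights would be the weights returned by train() at the epoch you were satisfied with.

    # Rerun training to the chosen epoch and keep the final weights.
    final_weights, _ = train(features, labels, alpha=0.5, epochs=80)

    # Hypothetical held-out test set, separate from the training data.
    test = np.loadtxt("test_data.csv", delimiter=",")
    test_features, test_labels = test[:, :-1], test[:, -1].astype(int)

    # Run all test data through the trained network with the final weights.
    test_predictions = sigmoid(test_features @ final_weights.T).argmax(axis=1)
    test_errors = int((test_predictions != test_labels).sum())
    print(f"Test errors: {test_errors} out of {len(test_labels)}")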

At this point, the model could be deployed and used to take new data and classify it as a cat, dog, or squirrel.
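For deployment, classifying a single new observation could then be as simple as the sketch below; the feature values shown are made up, and the input vector must have the same number of features as the training data.

    CLASS_NAMES = ["cat", "dog", "squirrel"]  # must match the label encoding above

    def classify(x, weights):
        """Return the class name with the highest network output."""
        return CLASS_NAMES[int(sigmoid(weights @ x).argmax())]

    # Made-up feature values for one new animal (e.g., hair, shoulder height,
    # indoor/outdoor, tail, temperament).
    new_animal = np.array([0.7, 0.3, 0.9, 0.5, 0.2])
    print(classify(new_animal, final_weights))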

Using this model as a guide, you should be able to create your own ANN that classifies objects from similar data.

About the Author

Ray Sulewski is a Senior Consultant at Award Solutions. He joined Award Solutions in 2006, bringing his expertise in CDMA technologies and overall experience in real-time product development, delivery and support of wireless telecommunications systems. Ray has over 36 years of experience in the wireless telecom industry.

About Award Solutions, Inc.

Award Solutions is the trusted training partner to the world's best networks. We help companies tackle new technologies by equipping their teams with knowledge and skills. Award Solutions invests heavily in technology, research, engineering, and labs to ensure our customers make the most of their resource and network investments.

Award has expertise across all technologies that touch wireless: 5G, Artificial Intelligence, Machine Learning, Network Virtualization, Data Visualization, Data Manipulation, 4G LTE, and more.

Don’t forget to connect with Award Solutions on Twitter and LinkedIn. You can also check out Parts 1, 2, and 3 of this blog series at www.awardsolutions.com.