Argentum Solutions, Inc.

    Sterling guidance on corrosion and materials degradation


 


TUTORIAL ON ARTIFICIAL NEURAL NETWORKS

David C. Silverman


Table of Contents

Overview of Tutorial
Artificial Neural Network Background
The Back-propagation Computing Element
The Back-propagation Artificial Neural Network
Training the Back-Propagation Neural Network
Example of Back-propagation Artificial Neural Network
Radial Basis Function Artificial Neural Network
Probabilistic Artificial Neural Network
General Regression Artificial Neural Network
Modular Artificial Neural Network

Radial Basis Function Artificial Neural Network

While the back-propagation neural network is the one most commonly implemented for classification problems using supervised training, other technologies exist. Radial basis functions are one such technology. Neural networks containing radial basis functions can be used in many of the same situations in which back-propagation networks are used. This section briefly describes radial basis functions and then compares results for the simple example of rounding the sum of the squares of two numbers, the same example used for the back-propagation neural network, the probabilistic neural network, the general regression neural network, and the modular neural network.

Radial basis functions were first reported as another type of artificial neural network in the late 1980s. Several articles provide the early background for this technology (J. Moody and C. J. Darken, "Fast Learning in Networks of Locally Tuned Processing Units", Neural Computation, 1, p281-294, 1989 and J. A. Leonard and M. A. Kramer, "Radial Basis Functions for Classifying Process Faults", IEEE Control Systems, April, 1991). An example of an artificial neural network containing radial basis functions is shown in the accompanying figure. The figure shows a network with two inputs and one output. It corresponds to the example discussed below. An additional summation node would be present for each additional output. Each hidden computing element would have a different radial basis function.

Radial basis functions tend to be embedded in a two-layer neural network in which each hidden computing unit has a radially symmetric activation function. The hidden layer uses radially symmetric computing elements with radially bounded transfer functions. Each output unit forms its output as a weighted sum of the outputs from the hidden units. In pattern classification, as required for the simple example, the inputs represent feature entries while each output corresponds to a class. The hidden units correspond to subclasses.
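
The two-layer structure just described can be sketched in a few lines. This is a minimal illustration, not the implementation used for the example in this tutorial. It assumes Gaussian hidden units of the form exp(-||x - center||^2 / (2 sigma^2)), and the names gaussian_rbf, rbf_forward, and the optional bias term are hypothetical.

```python
import math

def gaussian_rbf(x, center, sigma):
    """Response of one hidden unit: depends only on the distance from x to its center."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))

def rbf_forward(x, centers, sigmas, weights, bias=0.0):
    """Single-output RBF network: weighted sum of the hidden-unit responses.

    The bias term is an assumption of this sketch, not something stated in the text.
    """
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, sigmas)]
    return bias + sum(w * h for w, h in zip(weights, hidden))

# Two inputs, two hidden units, one output.
centers = [[0.5, 0.5], [0.0, 0.0]]
sigmas = [0.2, 0.2]
weights = [1.0, 0.0]
output = rbf_forward([0.5, 0.5], centers, sigmas, weights)
```

An input that sits exactly on a center drives that hidden unit to its maximum response of 1, and the response falls off as the input moves away. A network with more outputs would repeat the weighted sum once per output, each with its own weights.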

A number of algorithms exist to train this type of network. One example is an algorithm with two main steps. The first step is a clustering step in which the incoming weights from the input layer become centers of clusters of input vectors. One algorithm often used for centering is the k-means clustering algorithm. The second step finishes the training by setting the radii of the Gaussian functions centered at the cluster centers. These radii encompass the information in each cluster that is most likely related.

The k-means clustering algorithm computes the Euclidean distance between each input vector and each cluster center (the input weights), which have been initialized randomly. In the first step, the algorithm determines the closest center. A significant number of training sets are required for this algorithm to train successfully. For example, five hundred were used in the example below. The radial basis function most often used in neural networks is the Gaussian:

    \phi(\mathbf{x}) = \exp\left( -\frac{\|\mathbf{x} - \boldsymbol{\mu}\|^{2}}{2\sigma^{2}} \right)                    (5)

where x is the vector of input values, μ is the mean location (the cluster center), σ is the standard deviation (cluster width), and ||x - μ|| is the Euclidean distance between the input vector and the center of the cluster. The dimension of x and μ is the size of the input vector; each input element is one component of that vector. Equation (5) is the activation function for the hidden node. The algorithm determines the centers, μ, so that the sum of the squares of the distances between each training vector, x, and its closest center is a local minimum. One equation of the form of equation (5) is written for each center, and each center appears as a node in the hidden layer. In the second step the algorithm determines the value of σ. Note that this entire self-organizing stage is accomplished in the absence of output information. That is, the hidden layer activation function has been determined completely by self-organization of the input values.
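
The two self-organizing steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm used to produce the results in this tutorial. The initial centers are drawn from the training vectors themselves, and the width rule (root-mean-square distance of a cluster's members to its center) is an assumed heuristic, since the text does not say how σ is set. All function names are hypothetical.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))

def kmeans(points, k, iters=100, seed=0):
    """Plain k-means: iterate toward a local minimum of the summed squared distance."""
    rng = random.Random(seed)
    # Assumption: random initialization by picking k distinct training vectors.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment: each training vector joins its closest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update: each center moves to the mean of its members.
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(c)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # assignments no longer change
        centers = new_centers
    return centers

def cluster_widths(points, centers, min_sigma=1e-6):
    """Assumed heuristic: sigma = RMS distance of a cluster's members to its center."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for p in points:
        j = nearest(p, centers)
        sums[j] += sq_dist(p, centers[j])
        counts[j] += 1
    # The floor min_sigma avoids a zero width for single-member clusters.
    return [max(math.sqrt(s / n), min_sigma) if n else min_sigma
            for s, n in zip(sums, counts)]
```

Neither function looks at the desired outputs, which is the point made above: the hidden layer is fixed from the input values alone. Because k-means only reaches a local minimum, a different seed can give different centers.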

Upon completion of the self-organizing step, the output layer can be trained, i.e. its weights determined, using the delta rule learning algorithm as in the back-propagation network. In this case, the square of the difference between the desired output and the calculated output is minimized. This mapping is linear because the output is the sum of the products of the weights and the outputs of the hidden layer containing the radial basis functions. These weights, when multiplied by the activation function for each hidden node, determine which nodes influence the output (in which class the input vector resides) and hence the output value to be predicted. An additional hidden layer could be inserted between the existing hidden layer and the output layer if needed.
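
A minimal sketch of delta-rule training for a single linear output node is given below. It assumes the hidden-layer outputs have already been computed for every training vector. The learning rate, the number of passes, the bias term, and the name train_output_weights are illustrative choices, not values taken from this tutorial.

```python
def train_output_weights(hidden_rows, targets, lr=0.1, epochs=200):
    """Delta-rule (least-mean-squares) training of one linear output node.

    hidden_rows[n] holds the hidden-layer outputs for training vector n,
    and targets[n] is the desired output. Returns (weights, bias).
    """
    weights = [0.0] * len(hidden_rows[0])
    bias = 0.0  # assumption: the output node has a bias term
    for _ in range(epochs):
        for h, t in zip(hidden_rows, targets):
            y = bias + sum(w * hi for w, hi in zip(weights, h))
            err = t - y  # desired minus calculated output
            # Move each weight in proportion to the error and to its hidden output.
            weights = [w + lr * err * hi for w, hi in zip(weights, h)]
            bias += lr * err
    return weights, bias

# Toy case: hidden unit 0 fires for class 1, hidden unit 1 fires for class 0.
weights, bias = train_output_weights([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

Because the output is linear in the weights, each update reduces the squared error for the vector just presented, and only the output layer changes; the centers and widths of the hidden layer stay fixed.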

The following example is used for illustration. It is identical to the one used for the back-propagation neural network example, the probabilistic neural network example, the general regression neural network example, and the modular neural network example and is provided to show that a radial basis function neural network can sometimes be used in place of these others.
  1. Square two random numbers each between 0 and 1 and add the results together.
  2. Train a radial basis function neural network to decide how to round the number. If the sum is greater than or equal to 0.5, round to 1; if it is less than 0.5, round to 0.
This problem is a simple classification decision problem in which the neural network is presented with 2 numbers as input and outputs a value of 0 or 1 depending on the value of the sum of the squares. To make the example more realistic, the inputs are the non-squared values of the two random numbers and the output is 0 or 1. The network has to learn the relationship between the two numbers and from that the decision on whether the output value is 0 or 1. The actual values of the outputs are not important, only whether or not they are greater than, equal to, or less than 0.5. This type of decision represents a very typical real life decision in which several independent observables are present and a decision has to be made from their relationship without knowing anything about their relationship.
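
A minimal sketch of how such input-output combinations could be generated is shown below. The function names and the balancing scheme are assumptions for illustration; the text does not say how the data were actually produced. Plain uniform sampling would put only about 39% of the points (a fraction π/8 of the unit square) in the class that rounds to 0, so the sketch redraws points to obtain an even split between the two classes.

```python
import random

def label(a, b):
    """Desired output: 1 if a^2 + b^2 >= 0.5, otherwise 0."""
    return 1 if a * a + b * b >= 0.5 else 0

def make_examples(n, seed=0, balanced=True):
    """Return n pairs ((a, b), label). The inputs are a and b, not their squares."""
    rng = random.Random(seed)
    remaining = {0: n // 2, 1: n - n // 2}  # quota per class when balanced
    examples = []
    while len(examples) < n:
        a, b = rng.random(), rng.random()
        y = label(a, b)
        if balanced:
            if remaining[y] == 0:
                continue  # this class is full; draw another point
            remaining[y] -= 1
        examples.append(((a, b), y))
    return examples

training = make_examples(500)
```

The network is given only the pair (a, b) and the 0 or 1 label, so it has to discover the sum-of-squares relationship on its own.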

Five hundred training sets were created randomly, about evenly divided between those that round to 1 and those that round to 0. Since the appropriate number of hidden nodes could not be determined beforehand, networks were designed with 50, 25, 10, 5, and 2 hidden nodes (i.e. 50, 25, 10, 5, and 2 possible centers). The network with 2 hidden nodes failed to train. The others trained to about the same error, measured as the sum of the squares of predicted minus actual values. Since the goal was to use the simplest network, only the one with 5 hidden nodes was examined. Networks with 3 and 4 hidden nodes were not constructed.

A different set of 100 randomly generated input-output combinations was used to test the trained networks. The calculated outputs were between about -0.1 and +1.1. The strategy used to assess error was to assume that if the value is less than 0.5, the prediction is 0, and if the value is greater than or equal to 0.5, the prediction is 1. These predictions were compared to the expected outputs to assess error. No attempt was made to compare actual values because that information did not enter into the decision. The accompanying figure shows correct and incorrect responses for the radial basis function neural network. Only one point was in error and that point was at the boundary. Classification was very good.
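
The error assessment described above amounts to thresholding each calculated output at 0.5 and counting mismatches. A minimal sketch, with hypothetical function names and made-up output values:

```python
def classify(output, threshold=0.5):
    """Round a calculated output to a class: 1 at or above the threshold, else 0."""
    return 1 if output >= threshold else 0

def count_errors(outputs, expected):
    """Number of test cases whose thresholded output differs from the expected class."""
    return sum(1 for o, e in zip(outputs, expected) if classify(o) != e)

# Illustrative values only: outputs outside [0, 1] are still classified correctly.
errors = count_errors([-0.1, 0.49, 0.5, 1.1], [0, 0, 1, 1])
```

Only the side of 0.5 on which an output falls matters, which is why outputs such as -0.1 or 1.1 do not count as errors.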

  • The network based on radial basis functions generalized to about the same accuracy as the back-propagation network with three hidden nodes. It also trained to about the same accuracy as the probabilistic neural network, the general regression neural network, and the modular neural network. This result agrees with the earlier comment that classification problems that can be generalized by back-propagation neural networks can sometimes be generalized by neural networks using radial basis functions, especially if enough data points are available.
  • The error is at the boundary. This observation is not limited to this example. Dividing information among classes becomes more difficult the closer one is to the boundary between the classes.








    David C. Silverman, Ph.D. - Primary Consultant
    E-Mail:     dcsilverman@argentumsolutions.com
    Phone:     314-576-3586
    Fax:         314-754-9825
    Address:   The Argentum House
                    14314 Strawbridge Ct.
                    Chesterfield, MO 63017