Argentum Solutions, Inc.

    Sterling guidance on corrosion and materials degradation


 


TUTORIAL ON ARTIFICIAL NEURAL NETWORKS

David C. Silverman


Table of Contents

Overview of Tutorial
Artificial Neural Network Background
The Back-propagation Computing Element
The Back-propagation Artificial Neural Network
Training the Back-Propagation Neural Network
Example of Back-propagation Artificial Neural Network
Radial Basis Function Artificial Neural Network
Probabilistic Artificial Neural Network
General Regression Artificial Neural Network
Modular Artificial Neural Network

Modular Artificial Neural Network

A modular artificial neural network is a neural network that decomposes the classification problem into parts and assigns those parts to individual computing elements or to groups of computing elements. The problem is divided according to one or more parameters determined during training. Taken together, these parameters characterize the overall problem structure. The task of training the network is thereby broken into training subtasks. A separate architecture, or module, can be developed to solve each subtask, with the best architecture being employed for each. Each module operates independently, with no intercommunication. These modules act as "local experts" that compete to "learn" their respective relationships. The outputs from these modules are mediated by another module that controls the competition and does not feed information back to the modules. It acts as a "gating" network (R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, "Adaptive Mixtures of Local Experts", Neural Computation, Vol. 3, 1991, p. 79).

The modular architecture combines supervised and competitive learning schemes. The supervised learning scheme trains the modules. The gating network learns, in a competitive training mode, to assign different patterns to different modules, and so acts as a "mediator". The local expert modules can themselves be self-contained neural networks. A general schematic of such a network is shown in this figure. Not all connections and modules are shown. Note that radial basis functions are not appropriate as local experts because their learning is self-organizing rather than driven by propagation of errors. They could instead be used at a lower level to reorganize the input data before it is passed to the local experts.

Training of the local expert modules (or computing elements) and the gating network can be done using back-propagation of the error. Each module works as a feed-forward neural network; every module receives the same inputs and has the same number of outputs. The gating network also functions as a feed-forward network and receives the same inputs from the input layer. The final output from the overall network is given by

$$\mathbf{y} = \sum_j g_j \mathbf{y}_j \qquad (11)$$

where $\mathbf{y}$ is the vector of outputs from the network, $g_j$ is the activation value of the output from the gating layer to local expert $j$, and $\mathbf{y}_j$ is the output from local expert $j$. The activation values are often normalized. The modules and the gating network are trained by minimization of the error. In the formulation of Jacobs et al. cited above, the error function for each module $j$ is

$$E_j = \frac{1}{2} \lVert \mathbf{y}^* - \mathbf{y}_j \rVert^2 \qquad (12)$$

where $E_j$ is the local error function for module $j$, $\mathbf{y}^*$ is the desired output, and $\mathbf{y}_j$ is the calculated output. The concept is to maximize the objective function

$$Obj = \sum_j g_j \exp(-E_j) \qquad (13)$$

where $Obj$ is the objective function, $E_j$ is defined by equation (12), and $g_j$ is the activation value for that module. The objective function is often used in logarithmic form (the logarithm of the summation). Each module is assumed to have a probability density function proportional to $\exp(-E_j)$.
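The combination of expert outputs, the local error, and the objective can be illustrated numerically. The following is a minimal sketch rather than the implementation used for this tutorial. It assumes scalar outputs, a squared-error local error E_j = (1/2)(y* - y_j)^2, an objective equal to the logarithm of the gate-weighted sum of exp(-E_j), and gating activations normalized with a softmax. All function names and numerical values are hypothetical.

```python
import math

def softmax(scores):
    """Normalize raw gating scores so the activations g_j sum to 1 (assumed normalization)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combined_output(gates, expert_outputs):
    """Gate-weighted sum of the local expert outputs (assumed form of equation 11)."""
    return sum(g * y for g, y in zip(gates, expert_outputs))

def local_error(target, expert_output):
    """Squared-error local error for one module (assumed form of equation 12)."""
    return 0.5 * (target - expert_output) ** 2

def log_objective(gates, expert_outputs, target):
    """Logarithm of the gate-weighted sum of exp(-E_j) (assumed log form of equation 13)."""
    return math.log(sum(g * math.exp(-local_error(target, y))
                        for g, y in zip(gates, expert_outputs)))

gates = softmax([2.0, 0.0])   # hypothetical gating scores for two local experts
outputs = [0.9, 0.2]          # hypothetical outputs of the two local experts
y = combined_output(gates, outputs)
obj = log_objective(gates, outputs, target=1.0)
```

Under these assumptions the objective can never exceed zero, and it rises when the gating network gives more weight to the expert whose output lies closer to the desired value.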

The error function of the gating network is different, and minimizing it requires a different technique. For each training set, one module comes closer to producing the desired output than the others. If system performance improves for a given training set, the weights of the gating network are adjusted to move the output for the winning module closer to 1 and the outputs for all other modules closer to 0. If system performance does not improve, all weights are moved closer to a neutral value. The error function in equation (13) is compared for each pair of steps in time for each input vector, and a new error function is written as a weighted sum of the previous and present error functions. The gating network error function is a complex function of the individual gating functions gj calculated at each time step and of constants determined by the proximity of the output of each module to the desired output for the training sets.
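The winner-takes-more adjustment described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual gating error function: it moves the gating outputs directly instead of adjusting the gating network weights, it takes the neutral value to be 1/n for n modules, and the function names and the rate parameter are hypothetical.

```python
def gating_targets(local_errors, improved, neutral=None):
    """Return the value each gating output should move toward.

    local_errors: one error per local expert for the current training set.
    improved: True if overall system performance improved on this set.
    neutral: value used when there is no improvement (assumed to be 1/n).
    """
    n = len(local_errors)
    if neutral is None:
        neutral = 1.0 / n
    if not improved:
        return [neutral] * n
    # The winner is the module whose output is closest to the desired output.
    winner = min(range(n), key=lambda j: local_errors[j])
    return [1.0 if j == winner else 0.0 for j in range(n)]

def move_toward(gates, targets, rate=0.1):
    """Move each gating output a fraction of the way toward its target."""
    return [g + rate * (t - g) for g, t in zip(gates, targets)]

targets = gating_targets([0.30, 0.05, 0.20], improved=True)
new_gates = move_toward([0.4, 0.3, 0.3], targets)
```

In a full network the same targets would drive a back-propagated update of the gating weights; updating the outputs directly only shows the direction of the change.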

The following example is used for illustration. It is identical to the one used for the back-propagation neural network, the radial basis function neural network, the probabilistic neural network, and the general regression neural network. The example is provided to show that a modular neural network can sometimes be used in place of the others for classification. This type of network may be more appropriate for more complex situations as long as enough training data sets are available. Otherwise, the networks might not generalize on the information.
  1. Square two random numbers each between 0 and 1 and add the results together.
  2. Train a modular neural network to decide how to round the number. If the sum is greater than or equal to 0.5, round to 1; if it is less than 0.5, round to 0.
This problem is a simple classification decision problem in which the neural network is presented with 2 numbers as input and outputs a value of 0 or 1 depending on the value of the sum of the squares. To make the example more realistic, the inputs are the non-squared values of the two random numbers and the output is 0 or 1. The network then has to learn the relationship between the two numbers and from that the decision on whether the output value is 0 or 1. This type of decision represents a very typical real life decision in which several independent observables are present and a decision has to be made from their relationship without knowing anything about their relationship.
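The data for this example can be generated as in the sketch below. It is an illustration only: the helper names are hypothetical, and the way the classes are balanced (drawing random pairs until each class has its share) is an assumption, since the tutorial does not say how the even split was obtained.

```python
import random

def make_example(rng):
    """Draw two random numbers in [0, 1) and label the pair by the rounded sum of squares."""
    x1, x2 = rng.random(), rng.random()
    label = 1 if x1 * x1 + x2 * x2 >= 0.5 else 0
    # The network sees the non-squared inputs; only the label uses the squares.
    return (x1, x2), label

def make_balanced_set(n, seed=0):
    """Collect n examples split evenly between the two classes (assumed balancing scheme)."""
    rng = random.Random(seed)
    remaining = {0: n // 2, 1: n - n // 2}
    data = []
    while len(data) < n:
        inputs, label = make_example(rng)
        if remaining[label] > 0:  # discard draws from a class that is already full
            remaining[label] -= 1
            data.append((inputs, label))
    return data

training = make_balanced_set(500)
```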
Five hundred training sets were created, about evenly divided between those that round to 1 and those that round to 0. Two networks were constructed, one with two gates and two local expert modules and one with three gates and three local expert modules. No additional hidden nodes were added. Only one output was used. The network constructed for the case of two local experts is shown in this figure. Both networks were trained. A different set of 100 randomly generated input-output combinations was used to test the trained networks. The calculated outputs were between about -0.1 and +1.1. The strategy used to assess error was to assume that if the value is less than 0.5, the prediction would have been 0, and if the value is greater than or equal to 0.5, the prediction would have been 1. These values were compared to the actual outputs to determine error. No attempt was made to compare actual values because that information did not enter into the decision. This figure shows correct and incorrect responses for the neural network with one output. Four points, three rounding to 0 and one rounding to 1, were in error, and all four were at the boundary. They were the same for both networks. Adding an additional hidden layer eliminated two of the error points. For this simple example, additional complexity did not change the results much. An additional hidden layer might be useful in more complex situations.
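The error assessment described above, in which each calculated output is converted to 0 or 1 before comparison, can be sketched as follows. The function names and the sample values are hypothetical and are not the outputs obtained in this example.

```python
def to_class(value, threshold=0.5):
    """Convert a calculated output to a class: 1 if at or above the threshold, else 0."""
    return 1 if value >= threshold else 0

def misclassified(raw_outputs, actual_classes):
    """Return the indices of test points whose thresholded output differs from the actual class."""
    return [i for i, (raw, actual) in enumerate(zip(raw_outputs, actual_classes))
            if to_class(raw) != actual]

raw = [-0.08, 0.47, 0.52, 1.06, 0.50]  # hypothetical calculated outputs
actual = [0, 1, 1, 1, 0]               # hypothetical actual classes
wrong = misclassified(raw, actual)
```

Only the class assignment is compared, so an output of 1.06 counts as a correct prediction of 1 even though it overshoots the target value.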

  1. The network based on the modular neural network generalized to about the same accuracy as the back-propagation neural network, the radial basis function neural network, the probabilistic neural network, and the general regression neural network. This result is in agreement with the concept that classification problems that can be generalized by back-propagation neural networks can often be generalized by neural networks based on the modular neural network. This agreement does not mean that the modular neural network is a direct substitute for the back-propagation network. The ability to develop more complex structures with the modular neural network can sometimes make this network structure more versatile.
  2. The errors are congregated at the boundary. This observation is not limited to this example. Dividing information among classes becomes more difficult the closer one is to the boundary between the classes.








David C. Silverman, Ph.D. - Primary Consultant
E-Mail:     dcsilverman@argentumsolutions.com
Phone:     314-576-3586
Fax:         314-754-9825
Address:   The Argentum House
                14314 Strawbridge Ct.
                Chesterfield, MO 63017