TUTORIAL ON ARTIFICIAL NEURAL NETWORKS
David C. Silverman
Modular Artificial Neural Network
A modular artificial neural network is a neural network that decomposes the
classification problem into parts and assigns those parts to individual or groups
of computing elements. Division of the problem is according to one or more
parameters determined during training. All of these parameters taken together
characterize the overall problem structure.
The task of training the network is then broken into training subtasks.
Separate architectures or modules can be developed to solve each subtask with
the best architecture being employed for each. Each module operates independently
with no intercommunication. These networks act as "local experts" that compete
to "learn" their respective relationships. The outputs from these modules are
mediated by another module that controls the competition and does not feed
information back to the modules. It acts as a "gating" network (R. A.
Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, "Adaptive Mixtures
of Local Experts", Neural Computation, Vol. 3, 1991, p. 79).
The modular architecture combines supervised and competitive learning schemes.
The supervised learning scheme trains the modules. The gating network learns
to assign different patterns to a module in a competitive training mode. That
network acts as a "mediator". The local expert modules can themselves be self-contained
neural networks. A general schematic of such a network is shown in
this figure. Not all connections and modules are shown. Note that
radial basis functions are not appropriate as local experts because their
learning is not driven by propagation of errors. Their learning is instead
self-organizing. They could be used at a lower level to reorganize the input
data before it is passed to the local experts.
Training of the local expert modules (or computing elements) and the gating
network can be done using back-propagation of the error. Each module works as
a feed forward neural network and each receives the same inputs and has the
same number of outputs. The gating network also functions as a feed forward
network and receives the same inputs from the input layer.
The final output from the overall network is given by
$$\mathbf{y} = \sum_{j} g_j \, \mathbf{y}_j \qquad (11)$$
where $\mathbf{y}$ is the vector of outputs from the network, $g_j$ is the activation
value of the output from the gating layer to local expert $j$, and $\mathbf{y}_j$ is
the output from local expert $j$. The activation values are often normalized
(for example, so that they sum to 1).
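The gate-weighted combination described above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names `softmax` and `combine_experts` are hypothetical, and softmax is assumed here as one common way to normalize the gating activations.

```python
import numpy as np

def softmax(z):
    """Normalize raw gate activations so they are positive and sum to 1."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def combine_experts(gate_raw, expert_outputs):
    """Return the normalized gates and the gate-weighted sum of expert outputs.

    gate_raw       : shape (n_experts,), raw gating activations (assumed input)
    expert_outputs : shape (n_experts, n_outputs), one row per local expert
    """
    g = softmax(gate_raw)
    y = g @ np.asarray(expert_outputs, dtype=float)
    return g, y

# Two experts with equal raw gate activations: each contributes half.
g, y = combine_experts([0.0, 0.0], [[1.0], [0.0]])
print(g, y)
```

With equal gate activations the network output is simply the average of the two expert outputs; as training sharpens the gates, the output approaches that of the favored expert.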
The training of the modules and gating function is accomplished by minimization
of the error. In the formulation of Jacobs et al. cited above, the error
function for each module $j$ is
$$E_j = \tfrac{1}{2} \, \lVert \mathbf{y}^* - \mathbf{y}_j \rVert^2 \qquad (12)$$
where $E_j$ is the local error function for module $j$, $\mathbf{y}^*$
is the desired output, and $\mathbf{y}_j$ is the calculated output. The concept is to
maximize the objective function
$$\mathrm{Obj} = \ln \sum_{j} g_j \, e^{-E_j} \qquad (13)$$
where Obj is the objective function, $E_j$ is defined by equation (12),
and $g_j$ is the activation value for that module. The objective function
is often logarithmic (the logarithm of the summation). Each of the modules is
assumed to have a probability density function equal, up to a normalizing
constant, to $e^{-E_j}$.
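The sketch below evaluates a local error and a logarithmic objective for made-up numbers. It assumes the forms used in the cited Jacobs et al. paper: a half squared error for each module and the logarithm of the gate-weighted sum of exp(-E_j). The function names and the sample values are hypothetical.

```python
import numpy as np

def local_errors(y_target, expert_outputs):
    """Half squared error of each expert against the desired output (assumed form)."""
    diff = np.asarray(expert_outputs, dtype=float) - np.asarray(y_target, dtype=float)
    return 0.5 * np.sum(diff ** 2, axis=1)

def objective(g, errors):
    """Logarithm of the gate-weighted sum of exp(-E_j) (assumed form)."""
    g = np.asarray(g, dtype=float)
    return float(np.log(np.sum(g * np.exp(-np.asarray(errors, dtype=float)))))

experts = np.array([[0.9], [0.2]])  # made-up outputs of two local experts
target = np.array([1.0])            # desired output
E = local_errors(target, experts)

obj_good_gate = objective([0.9, 0.1], E)  # gate favors the more accurate expert
obj_bad_gate = objective([0.1, 0.9], E)   # gate favors the less accurate expert
print(E, obj_good_gate, obj_bad_gate)
```

Under these assumptions the objective is larger when the gate gives more weight to the expert with the smaller error, which is the behavior training tries to reinforce.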
The error function of the gating network is different and minimization of the error
requires a different technique. One module comes closer to producing the desired
output than the others for each training set. If present system performance is
improved for a given training set then the weights of the gating network are
adjusted to make the output of the winner closer to 1 and the outputs of all
others closer to 0. If system performance does not improve, then all weights
are moved closer to a neutral value. The error function in equation (13) is
compared for each pair of steps in time for each input vector and a new error
function is written as a weighted sum of the previous and present error function.
The error function of the gating network is a complex function of the individual
gating functions gj calculated at each time step and of constants
determined by the proximity of the output of each module to the desired output
for the training sets.
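The competitive idea, in which the module closest to the desired output gains gate activation at the expense of the others, can be illustrated with a deliberately simplified sketch. It is not the weighted-sum-over-time error function described above. It assumes the responsibility rule from the cited Jacobs et al. paper (credit proportional to g_j exp(-E_j)) and updates the gate activations directly, whereas a real gating network would adjust its weights by back-propagation. The names `responsibilities` and `gate_step` and the `rate` parameter are hypothetical.

```python
import numpy as np

def responsibilities(g, errors):
    """Share of credit per expert: g_j * exp(-E_j), renormalized to sum to 1."""
    w = np.asarray(g, dtype=float) * np.exp(-np.asarray(errors, dtype=float))
    return w / w.sum()

def gate_step(g, errors, rate=0.5):
    """Move the gate activations part of the way toward the responsibilities."""
    g = np.asarray(g, dtype=float)
    return g + rate * (responsibilities(g, errors) - g)

g = np.array([0.5, 0.5])     # start with no preference between two experts
E = np.array([0.005, 0.32])  # made-up local errors; expert 0 is the "winner"
for _ in range(20):
    g = gate_step(g, E)
print(g)
```

Repeating the step drives the winner's activation toward 1 and the other toward 0 while the two activations continue to sum to 1.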
The following example is used for illustration. It is identical to the one
used for the back-propagation neural network,
the radial basis function neural network,
the probabilistic neural network,
and the general regression neural network.
The example is provided to show that a modular neural network can
sometimes be used in place of the others for classification. This type of network
may be more appropriate for more complex situations as long as enough
training data sets are available. Otherwise, the networks might not
generalize on the information.
- Square two random numbers, each between 0 and 1, and add the results together.
- Train a modular neural network to decide how to round the number.
If the sum is greater than or equal to 0.5, round to 1; if it is less than 0.5, round to 0.
This problem is a simple classification decision problem in which the neural
network is presented with 2 numbers as input and outputs a value of 0 or 1
depending on the value of the sum of the squares. To make the example more
realistic, the inputs are the non-squared values of the two random numbers and
the output is 0 or 1. The network then has to learn the relationship between
the two numbers and from that the decision on whether the output value is 0 or 1.
This type of decision represents a very typical real life decision in which several
independent observables are present and a decision has to be made from their
relationship without knowing anything about their relationship.
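A data set of this kind can be generated as in the sketch below. The function name `make_dataset` and the seeds are hypothetical. Plain uniform sampling is assumed, which puts only about 39% of the points (pi/8) in the class that rounds to 0, so reproducing a roughly even split between the classes would need an extra resampling step that is not shown.

```python
import numpy as np

def make_dataset(n, seed=0):
    """Generate n input pairs in [0, 1) and their 0/1 class labels."""
    rng = np.random.default_rng(seed)
    x = rng.random((n, 2))           # unsquared inputs presented to the network
    s = np.sum(x ** 2, axis=1)       # hidden quantity: sum of the squares
    labels = (s >= 0.5).astype(int)  # round to 1 at or above 0.5, else 0
    return x, labels

x_train, y_train = make_dataset(500, seed=0)  # training sets
x_test, y_test = make_dataset(100, seed=1)    # separate test sets
print(x_train.shape, y_train.mean())
```

The network sees only the two unsquared numbers and the 0/1 label, so it has to discover the sum-of-squares relationship on its own.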
Five hundred training sets were created about evenly divided between those
that round to 1 and those that round to 0. Two networks were constructed,
one with two gates and two local expert modules and one with three gates and
three local expert modules. No additional hidden nodes were added. Only
one output was used. The network constructed for the case of 2 local experts
is shown in this figure.
Both networks were trained. A different set of 100 randomly generated
input-output combinations was used to test the trained networks.
The calculated outputs were between about -0.1 and +1.1.
The strategy used to assess error was to assume that if the value
is less than 0.5, the prediction would have been zero and if the value
is greater than or equal to 0.5, the prediction would have been 1.
These values were compared to the actual outputs to determine error.
No attempt was made to compare actual values because that information
did not enter into the decision.
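The assessment strategy amounts to thresholding the raw outputs at 0.5 and counting disagreements with the actual classes. A minimal sketch follows; the function name `score` and the sample values are made up for illustration and are not results from the networks described here.

```python
import numpy as np

def score(raw_outputs, actual, threshold=0.5):
    """Threshold raw network outputs and count misclassifications."""
    predicted = (np.asarray(raw_outputs, dtype=float) >= threshold).astype(int)
    wrong = predicted != np.asarray(actual)
    return predicted, int(wrong.sum())

raw = np.array([-0.1, 0.49, 0.5, 1.1, 0.3])  # made-up raw outputs
actual = np.array([0, 1, 1, 1, 0])           # made-up actual classes
predicted, n_wrong = score(raw, actual)
print(predicted, n_wrong)
```

Only the side of the threshold matters: an output of 1.1 and an output of 0.5 both count as a prediction of 1.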
This figure
shows correct and incorrect responses for the neural network with one output.
Four points, three rounding to 0 and one rounding to 1, were in error and were
at the boundary. They were the same for both networks. Adding an additional
hidden layer eliminated two of the error points. For this simple example,
additional complexity did not change the results much. An additional hidden layer
might be useful in more complex situations.
- The network based on the modular neural network generalized
to about the same accuracy as the
back-propagation neural network,
the radial basis function neural network,
the probabilistic neural network,
and the general regression neural
network. This result is in agreement with the concept that classification
problems that can be generalized by back-propagation neural networks can often
be generalized by neural networks based on the modular neural network.
This agreement does not mean that the modular neural network is a direct
substitute for the back-propagation network. The ability to develop more complex
structures with the modular neural network can sometimes make this network structure
more versatile.
- The errors are concentrated at the boundary. This observation is not limited
to this example. Dividing information among classes becomes more difficult the
closer one is to the boundary between the classes.
David C. Silverman, Ph.D. - Primary Consultant
E-Mail: dcsilverman@argentumsolutions.com
Phone: 314-576-3586
Fax: 314-754-9825
Address: The Argentum House
14314 Strawbridge Ct.
Chesterfield, MO 63017