Argentum Solutions, Inc.

    Sterling guidance on corrosion and materials degradation


 



TUTORIAL ON ARTIFICIAL NEURAL NETWORKS

David C. Silverman


Table of Contents

Overview of Tutorial
Artificial Neural Network Background
The Back-propagation Computing Element
The Back-propagation Artificial Neural Network
Training the Back-Propagation Neural Network
Example of Back-propagation Artificial Neural Network
Radial Basis Function Artificial Neural Network
Probabilistic Artificial Neural Network
General Regression Artificial Neural Network
Modular Artificial Neural Network

Training the Back-Propagation Neural Network

Once the network is constructed, it has to be trained. As mentioned in the discussion of the back-propagation neural network, the artificial neural network most often applied in corrosion (and elsewhere) has been the back-propagation network. In corrosion applications, it has been used either to solve complex pattern-matching problems or to fit relationships among variables for which explicit functions cannot be written. This type of network is shown in this figure. The training algorithm is summarized below. The actual equations for the algorithm are available in a number of textbooks, for example "Artificial Intelligence, A Modern Approach", S. J. Russell and P. Norvig, Prentice Hall, 1996 and "Neural Networks: Algorithms, Applications, and Programming Techniques", J. A. Freeman and D. M. Skapura, Addison-Wesley, 1992.

The network learns ("fits" might be an alternative description) a predefined set of input-output pairs known as the training set. The methodology is a two-phase cycle consisting of propagation and adaptation. The algorithm is known as the generalized delta rule. A number of variations exist.

A set of weights is arbitrarily chosen for each processing element. After the input is applied to the first layer of network units, the output is propagated to the next layer as an input to that layer, and so on, until an output from the network is calculated at the output layer. This output pattern is compared to the desired output. An error is calculated as the sum of the squares of the differences between each calculated and desired output. Nothing magical exists with respect to the error function. Cubic and fourth-power differences have also been used.
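The propagation phase and the error calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the logistic (sigmoid) transfer function, the bias weight on each element, the layer sizes, and all function names are choices made for the sketch, not details taken from this tutorial.

```python
import math
import random


def sigmoid(x):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_out, rng):
    """Arbitrary small starting weights; the extra column is an assumed bias weight."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_output(weights, inputs):
    """Output of one layer: each element squashes its weighted input sum."""
    extended = inputs + [1.0]  # constant 1.0 feeds the bias weight
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in weights]


def forward(layers, inputs):
    """Propagate an input pattern layer by layer until the output layer is reached."""
    signal = inputs
    for weights in layers:
        signal = layer_output(weights, signal)
    return signal


def sum_squared_error(calculated, desired):
    """Sum of the squares of the differences between calculated and desired outputs."""
    return sum((c - d) ** 2 for c, d in zip(calculated, desired))


rng = random.Random(0)  # fixed seed so the arbitrary weights are repeatable
layers = [init_weights(2, 3, rng), init_weights(3, 1, rng)]  # hypothetical 2-3-1 network
output = forward(layers, [0.0, 1.0])
error = sum_squared_error(output, [1.0])
print(output, error)
```

Replacing the exponent 2 in sum_squared_error with 3 (on absolute differences) or 4 would give the cubic and fourth-power variants mentioned above.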

This error is sent backward from the output layer to each computing element that contributed to the output (the next lower hidden layer). The total error is divided among those computing elements according to their relative contributions to the original output. Once this is completed, the process is repeated for each layer of computing elements. The weights for each pathway to each computing element are updated. The network converges toward a state that should enable all training patterns to be coded properly. Back-propagation is a fairly robust form of non-linear regression.
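One propagation-adaptation cycle might look like the following sketch for a network with a single hidden layer. The sigmoid transfer function, the bias weights, the learning rate of 0.5, and the names used are assumptions made for illustration; the textbooks cited above give the full equations and their variations.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hidden, w_out, x):
    """Return hidden and output activations; the last weight in each row is a bias."""
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    out = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w_out]
    return hidden, out


def backprop_step(w_hidden, w_out, x, target, rate=0.5):
    """One propagation-adaptation cycle for a single input-output pair.

    Returns the sum-of-squares error measured before the weights were changed.
    """
    hidden, out = forward(w_hidden, w_out, x)
    # Output-layer error terms: difference times the slope of the sigmoid.
    # (The factor of 2 from the squared error is absorbed into the rate.)
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, out)]
    # Each hidden element receives a share of the error in proportion to the
    # weights through which it contributed to the outputs.
    delta_hidden = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j, h in enumerate(hidden)
    ]
    # Adaptation: update the weight on every pathway.
    hb = hidden + [1.0]
    xb = x + [1.0]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += rate * delta_out[k] * hb[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * delta_hidden[j] * xb[i]
    return sum((t - o) ** 2 for t, o in zip(target, out))


rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]  # 2 inputs + bias
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(3)]]  # 2 hidden elements + bias
errors = [backprop_step(w_hidden, w_out, [0.0, 1.0], [1.0]) for _ in range(200)]
print(errors[0], errors[-1])
```

Repeating the cycle on one pattern, as in the usage lines, only shows the error shrinking; real training cycles through the whole training set.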

As the network trains, the computing elements in the intermediate layer(s) organize themselves so that different computing elements "learn" to recognize different features in the input space. At some point, often at an arbitrarily chosen minimum sum of squares, the training is deemed complete. Another set of input-output patterns, the test set, is then fed to the network to determine how well it trained. If the new input contains features that a computing element "recognizes" (i.e., features that resemble those it learned during training), the element responds with an active output. If the new input does not contain features that the computing element recognizes (i.e., features that do not resemble those it learned during training), its response is inhibited (zero). The goal is to ensure that the training set encompasses the test set.
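The stopping rule and the test-set check can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the function y = x*x standing in for the unknown relationship, the three hidden elements, the target sum of squares of 0.01, the cap of 5000 epochs, and the learning rate. The test inputs are placed between the training inputs so that the training set encompasses the test set.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(net, x):
    """One input, one hidden layer, one output; returns (output, hidden activations)."""
    w_hidden, w_out = net
    hidden = [sigmoid(w * x + b) for w, b in w_hidden]
    out = sigmoid(sum(v * h for v, h in zip(w_out[:-1], hidden)) + w_out[-1])
    return out, hidden


def sum_squares(net, pairs):
    return sum((y - predict(net, x)[0]) ** 2 for x, y in pairs)


def train(net, pairs, target_sse=0.01, max_epochs=5000, rate=0.5):
    """Repeat propagation and adaptation until the chosen sum of squares is reached."""
    w_hidden, w_out = net
    for epoch in range(1, max_epochs + 1):
        for x, y in pairs:
            out, hidden = predict(net, x)
            d_out = (y - out) * out * (1.0 - out)
            d_hid = [d_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
            for j, h in enumerate(hidden):
                w_out[j] += rate * d_out * h
            w_out[-1] += rate * d_out  # output bias weight
            for j, d in enumerate(d_hid):
                w_hidden[j][0] += rate * d * x
                w_hidden[j][1] += rate * d  # hidden bias weight
        if sum_squares(net, pairs) <= target_sse:
            break  # training is deemed complete
    return epoch


def f(x):
    return x * x  # hypothetical relationship to be fitted


rng = random.Random(1)
n_hidden = 3
net = (
    [[rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)] for _ in range(n_hidden)],
    [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)],
)
training_set = [(i / 10, f(i / 10)) for i in range(11)]
test_set = [(i / 10 + 0.05, f(i / 10 + 0.05)) for i in range(10)]

initial_sse = sum_squares(net, training_set)
epochs_used = train(net, training_set)
final_sse = sum_squares(net, training_set)
test_sse = sum_squares(net, test_set)
print(epochs_used, initial_sse, final_sse, test_sse)
```

If the test-set error is much larger than the training error, the network has probably not generalized; if the loop stops at the epoch cap, the threshold was not reached.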

Additional Considerations

Following are issues that should be considered when designing and using back-propagation artificial neural networks.
  1. Expressiveness or How large should the network be?
     Neural networks provide attribute representation, not logical representation. The class of multilayer networks taken together can represent any desired function of a set of attributes, but any particular network may have too few hidden units. So the question is: how many layers and nodes are enough? One source ("Artificial Intelligence, A Modern Approach", S. J. Russell and P. Norvig, Prentice Hall, 1996) has stated that 2^n/n hidden units are needed to represent all Boolean functions of n inputs. Such a network would have O(2^n) weights. In practice, however, smaller networks have sufficed. Statements have been made that three layers (an input layer, a hidden layer, and an output layer) would suffice for most situations. As mentioned in the description of the back-propagation neural network, one hidden layer should be able to represent a continuous function, and two a discontinuous function.

     The question of how many nodes or computing elements are required is not straightforward. One important point is to use as few hidden nodes (hidden computing elements) as possible because of computation time and the need for generalization (see below). A reasonable philosophy is that if a network fails to converge the user should increase the number of nodes. If the network converges, decrease the number of nodes until the network fails to converge. In addition, one can selectively remove connections to determine if certain nodes or links are important.

  2. Computation time
     As mentioned above, the number of hidden nodes directly affects the computation time required to train the network. For n examples and W weights, each epoch takes O(nW) time. An epoch is one presentation of the training input-output pairs to the network during a training cycle; it usually comprises the entire training set. A significant fraction of the computational research effort on feed-forward artificial neural networks has gone into designing training algorithms that converge more quickly. Local minima in the error surface can cause convergence to the wrong point, much as in conventional non-linear regression.

  3. Generalization vs. Memorization
     When constructed properly, artificial neural networks generalize nicely. The concept of generalization is important. Generalization means that given several input-output combinations all belonging to the same class, the artificial neural network "learns" (fits) the significant similarities of the input data. It will be able to produce sensible output to a previously unseen input in the same class. Memorization means that the network has learned the specific input-output combinations and not the significant similarities of the input data. The difference may loosely be thought of as learning the structure of the function y=f(x) (generalization) and not the specific (x,y) pairs used in place of the function to define it (memorization). In the case of the back-propagation artificial neural network, the number of connections vs. the number of input-output pairs can have a direct effect on the ability of the network to generalize vs. memorize. The effect is similar to having too many constants relative to data pairs in non-linear regression using polynomials. In the absence of other information, one rule of thumb is that generalization requires that the number of input-output combinations used in training be 3 to 5 times the total number of connections in the network.

  4. Transparency or The Black Box
     Back-propagation neural networks are black boxes. An input is fed into them and an output retrieved using a structure that provides no understanding of why the output is correct. The network cannot be used to explain the output. Physical meaning cannot be given to the weights or hidden nodes.
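The sizing figures quoted in items 1 to 3 reduce to simple arithmetic, sketched below. The assumptions are labeled in the comments: the worst-case hidden-unit figure from item 1 is read as 2^n/n (two to the power n, divided by n), connections are counted for a fully connected network without bias weights, and the layer sizes and example count are hypothetical.

```python
def worst_case_hidden_units(n_inputs):
    """Worst-case figure quoted in item 1, read as 2^n / n for n inputs."""
    return 2 ** n_inputs / n_inputs


def count_connections(layer_sizes):
    """Weights in a fully connected feed-forward network (bias weights ignored)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def training_pairs_rule_of_thumb(layer_sizes, low=3, high=5):
    """Range of training pairs suggested by the 3-to-5-times rule in item 3."""
    w = count_connections(layer_sizes)
    return low * w, high * w


def epoch_cost(n_examples, layer_sizes):
    """Weight visits per epoch, proportional to the O(nW) estimate in item 2."""
    return n_examples * count_connections(layer_sizes)


layers = [4, 3, 1]  # hypothetical network: 4 inputs, 3 hidden nodes, 1 output
print(worst_case_hidden_units(4))            # 4.0
print(count_connections(layers))             # 15
print(training_pairs_rule_of_thumb(layers))  # (45, 75)
print(epoch_cost(60, layers))                # 900
```

For this small hypothetical network, the rule of thumb already asks for 45 to 75 training pairs, which is one reason to keep the number of hidden nodes as low as convergence allows.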








David C. Silverman, Ph.D. - Primary Consultant
E-Mail:     dcsilverman@argentumsolutions.com
Phone:     314-576-3586
Fax:         314-754-9825
Address:   The Argentum House
                14314 Strawbridge Ct.
                Chesterfield, MO 63017