Table of Contents

I. Introductory Information
 A. System Requirements
 B. Getting Started
 C. Menus

II. File Types
 A. Summary of File Types
 B. The MLP Network Structure File
 C. Training and Testing Data Files
 D. Data Files Included With This Package

III. Data Pre-Processing
 A. Summary
 B. Data Format Program
 C. Time Series Program 
 D. Data Compression Program
 E. Feature Selection

IV. Multilayer Perceptron Networks
 A. MLP Network Limitations and Characteristics
 B. Multilayer Perceptron (MLP) Options
 C. Error Functions for Training MLP Networks
 D. MLP Network Sizing
 E. Fast Training Program
 F. Network Modeling and Pruning Program
 G. Automated MLP Design
 H. Processing Data with a Trained MLP Network

V. Functional Link Training and Testing Program
 A. Program Purpose
 B. Network Characteristics
 C. Files Needed or Produced
 D. Example Run of Functional Link Program
 E. Error Functions for Functional Link Net

VI. Modular Network Training and Testing Program
 A. Program Purpose
 B. Network Characteristics
 C. Files Needed or Produced
 D. Example Run of Modular Net Program
 E. Error Functions for Modular Net

VII. Unsupervised Learning
 A. Input Data Formats 
 B. Output Data Format
 C. Available Algorithms 
 D. Error Function for Unsupervised Learning
 E. Conventional Clustering Comments 
 F. Demo Run for Conventional Clustering
 G. Self-Organizing Map Comments 
 H. Demo Run for Self-Organizing Map
 I. Classify Vectors

VIII. Utilities

Appendix 1. Terminology
Appendix 2. Frequently Asked Questions

***************************************************************************
I. Introductory Information
 A. System Requirements
  1. Machine; 486 (66 MHz) or Pentium PC with 16 MB of RAM.
  2. Operating systems; Windows 95 or Windows NT
  3. 12 megabytes of disk space
  4. VGA monitor and graphics card.

 B. Getting Started
  1. Processing Examples
     An example run can be made for each processing option in 
     this package, using the data files in directory 
     \nnmap\data. These processing examples are run as follows.
   a. From Windows Explorer or from the Start Menu on the Task Bar 
      run NNMap.exe.
   b. You will see the Neural Networks for Mapping pull down menu.
   c. Go to the option you are interested in. However, under 
      "Multilayer Perceptron Processing", you cannot "Analyze
       a Trained MLP" until you have designed one.
   d. Choose the "Demo" option. You will view a parameter
      file, which contains responses to the program's requests.
   e. Click on "Continue". The program will now run using responses in 
       the parameter file. 
   f. Choose the "Examine Program Log" option, if you like.
   g. For further details on the examples, see the help files for
      the option you are interested in.

  2. Designing an MLP Neural Net for Your Own Data
   a. Obtain an ascii data file and put it in the proper format. File
      twod.tra is a good example. Figure out how many inputs or features
      your data has and how many outputs you have.
   b. Decide how many hidden layers (1 or 2) you want and how many 
      hidden units (up to 40 per hidden or output layer and up to 
      100 for the input layer) you want. One way to do this 
      is to try "MLP Sizing" under the "Multilayer Perceptron"
      option. Create a network structure file by editing a copy of
      two.top or during the manual run of the fast training program.
   c. Go to the "Batch Processing" option under "Fast Training", 
      which is under the "Multilayer Perceptron Processing" menu.
   d. Alter the keyboard response file, following the choices you
      have made, and exit the EDIT program. The training program will
      now run.
       If you delete a parameter file and then attempt to use the "Batch 
       Processing" option, you need to make a fresh copy of the parameter 
       file from the original package to your current directory.

   e. Choose the "Examine Program Log" option if you like.


 C. Menus
  1. Using Menus
   a. Use mouse or arrow keys to move cursor. Press "enter" to make
      a choice, or click the mouse.

  2. Menu Options
   a. Data Pre-Processing; 
       Training Data Format; Put data into correct format, calculate
                     input means and standard deviations, calculate
                     output means and standard deviations
       Create Time Series Training Data; Create a training data file 
                     from a file of columnar time series data
       Data Compression; Compress or expand training data file inputs
                     or outputs using KLT
   b. MLP Nets
       MLP Sizing from Data; Estimate size of MLP from training data file
       Fast Training; Design MLP neural nets via fast training
       Modeling and Pruning a Trained Net; Analyze and Prune a trained MLP
       Automated MLP Design; Design an MLP with little user input
       Process Data Using a Trained Net; Process a data file using a 
                     neural net
       Generate a Formatted Weight File; Create a formatted weight file
   c. Functional Link Nets; Design and apply functional link nets
   d. Modular Nets; Design and apply modular nets
   e. Unsupervised Learning 
       Cluster data file using
         Conventional Clustering (Sequential Leader or K-Means) or
         Neural Clustering (Kohonen's Self-Organizing Map)
       Save Clusters
       Examine Program Log
       Classify patterns; Assign them to clusters using a very simple
         nearest neighbor classifier  
   f. Code Generation
       Generate a subroutine that implements the KLT, MLP, Functional
       Link Processor, or modular net.
   g. Utilities; 
        split a file (randomly split a file into two new ones), combine
        files (combine arbitrary columns from one or more files into a
         new file), and examine a file (calculate means, standard 
        deviations, histograms, and plots of columns in a data file)
   h. Help
        System requirements; Required main memory and disk space, etc. 
        Terminology; Definitions of technical terms used in neural nets
                     and in this package 
        File formats; File formats for network structure, weights, training
                     data, testing data, and network outputs
        Main menu choices; This file
        Getting started; Information on how to use this package and how to
                     run demos
        Frequently asked questions; Questions that people have asked or should 
                     ask about this software package, along with the answers
        View manual; View the manual, which is a document composed of the 
                     help files



II. File Types

Outline

 A. Summary of File Types
 B. The MLP Network Structure File
 C. Training and Testing Data Files
 D. Data Files Included With This Package



 A. Summary of File Types
    The MLP and functional link programs typically have five types 
    of files associated with them. These five types are: 
  1. The network structure file. For the MLP, this ASCII file specifies
     the number of network layers, the number of artificial neurons 
     (called units) in each layer, and the earliest layer to which the 
     third and fourth (if present) layers connect back.
     For the functional link net, this file contains the network degree P
     (usually an integer between 1 and 5), the number of network inputs N,
     the number of outputs, and the dimension of the multinomial vector, 
     which is L = (N+P)!/(N!P!).

  2. The weight file, which gives the gains or coefficients along 
     paths connecting the various units. This file is unformatted.

  3. The training or testing data file, which gives example inputs 
     and outputs for network learning, or for testing after learning.
     These files are always formatted.
  4.  The output file, which is a log of the user's activity while
      running the program. This ASCII file can be used to remind the user
      what was done earlier, and can be edited for use in plots. 
  5.  The result file, which has the extension res. This ASCII file
      stores training results, such as the minimum mean-square error 
      attained during training.
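
     The multinomial vector dimension given above for the functional
     link net, L = (N+P)!/(N!P!), is just the binomial coefficient
     C(N+P, P). A minimal sketch (Python; the function name is ours,
     not part of the package):

```python
from math import comb

def multinomial_dim(n_inputs, degree):
    # L = (N+P)!/(N!*P!): the number of monomials of degree P or less
    # in N variables, i.e. the binomial coefficient C(N+P, P).
    return comb(n_inputs + degree, degree)
```

     For example, a degree-2 net with 8 inputs has a multinomial
     vector of dimension C(10, 2) = 45.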
 B. The MLP Network Structure File
    This file is usually given the extension "top". You can create 
    your own network structure files within the backpropagation, fast 
    training, and functional link programs, if you want. Consider the MLP 
    network structure file GLS.top, shown below. 

           4
           4          20          15           1
           1           1           1

   It has 4 layers. The first layer has 4 inputs, which means that 
   each training or testing pattern has 4 numbers. It has 20 units in 
   the first hidden layer, where "hidden" means that it is not an input 
   or output layer. It has 15 units in the second hidden layer. 
   The output layer has 1 unit. The last line of "1s" means that 
   layers 2, 3, and 4 connect up with layer 1, layers 1 and 2, and 
   layers 1, 2, and 3 respectively. This network is "fully connected",
    meaning that each layer connects with all previous layers. Fully 
    connected networks are more powerful and train faster than networks 
    that are not fully connected, and are almost always smaller than 
    non-fully-connected networks that perform the same operation.
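
    The structure-file layout described above can be parsed in a few
    lines. The sketch below (Python; function and variable names are
    ours, not part of the package) reads the GLS.top contents shown
    earlier:

```python
def read_top(text):
    # Parse an MLP network structure (.top) file:
    #   line 1: number of layers
    #   line 2: units in each layer (inputs first, outputs last)
    #   line 3: earliest layer each later layer connects back to
    lines = [ln.split() for ln in text.strip().splitlines()]
    n_layers = int(lines[0][0])
    units = [int(x) for x in lines[1]]
    back_connections = [int(x) for x in lines[2]]
    return n_layers, units, back_connections

gls_top = """4
4 20 15 1
1 1 1"""
```

    Applied to gls_top, this returns 4 layers, units [4, 20, 15, 1],
    and back-connections [1, 1, 1] (fully connected).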

 C. Training and Testing Data Files
     All data files are in standard form. Standard form means that
     the file is formatted, and that each pattern or vector has inputs
     on the left and desired outputs on the right. You can type 
     out the files to examine them, and you can use these 
     files with other neural net software. For example, consider the 
     training data file, Max, part of which is shown below.

     .5844768      .5359043      .6196933
     .6196933

     .1291312      .4173794      .3405759
     .4173794

     .0472856      .5994965      .5638752
     .5994965

     Each training pattern consists of three random numbers. The fourth 
     number, which is the desired network output, is the maximum of the 
     three inputs. 
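
     A file like Max is easy to regenerate. The sketch below (Python;
     illustrative only, not part of the package) builds patterns in the
     same standard form, three random inputs followed by their maximum:

```python
import random

def make_max_patterns(n_patterns, seed=0):
    # Each pattern in standard form: 3 random inputs on the left,
    # the desired output (their maximum) on the right.
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_patterns):
        x = [rng.random() for _ in range(3)]
        patterns.append(x + [max(x)])
    return patterns
```

     Writing 300 such patterns to disk, one pattern per group of lines,
     reproduces the shape of the MAX data file described below.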

 D. Data Files Included With This Package

    The MAX data file, which corresponds to calculating the maximum
       of 3 random numbers, has 300 patterns, each of which has 3 inputs
       and 1 desired output.
    The twod.tra data file has 1,768 patterns with 8 inputs and 7 desired 
       outputs. The data comes from a remote sensing problem.


III. Data Pre-Processing

 A. Summary
  1. Data Format Functions; 
   a. Convert ASCII training or testing data files into standard
      form for use by this software package.
   b. Delete some inputs or outputs.
   c. Calculate input means and standard deviations, calculate
      output means and standard deviations
  2. Time Series Training Data File Creation:
     Given a file of columnar time series data, in which each
     row corresponds to a different time, the program
   a. Asks the user which row and column elements correspond
      to inputs and outputs,
   b. Slides a window down the file and writes inputs and desired
      outputs to a training or testing data file.
  3. Data Compression:
     This program performs the forward or inverse Karhunen Loeve 
     transform (KLT) in order to
   a. Compress the input or desired output vectors in a 
      training data file so that redundancy is reduced and
      smaller networks can be trained,
   b. Expand or inverse transform a file containing network 
      output vectors, so that these vectors are the same size 
      as the original uncompressed desired output vectors.
  4. Feature Selection:
     This program analyzes a training data file for classification 
     or mapping and
    a. Prints out a list of feature numbers, each with a measure of its
       importance. Larger values of the measure denote more importance
   b. Orders the features according to their importance
  5. For information on file formats for this package, see "File Formats"
     under main menu "Help".

 B. Data Format Program
  1. Data Format Functions
   a. Count number of input patterns
   b. Normalize individual features, if desired
   c. Write new data files to disk, if desired
   d. Delete some inputs or outputs.
   e. Calculate input means and standard deviations, calculate
      output means and standard deviations
  2. Data file size
   a. Any number of training or testing patterns
   b. Up to 100 inputs
   c. Up to 40 outputs 
  3. Demo Run of Data Format Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;

1                   ! 1 for input in standard form, 2 if some columns need to be deleted or re-ordered
twod.tra            ! data filename
8                   ! number of inputs per pattern 
7                   ! number of outputs per pattern 
2                   ! 1 to save data to a new file, 2 for input means 
                    !   and standard deviations, 3 for output means and 
                    !   standard deviations, 0 to stop
0                   ! Enter 0 to stop

      Here, "standard form" means that each training pattern 
      or vector consists of N inputs followed by M desired outputs.
      In this run, patterns in file twod.tra are counted and input means
      and standard deviations are calculated.
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>
   e. The program has counted the total number of training
      patterns or vectors.
   f. You can run this program on your own data by editing the parameter
      file under the "Batch" option, or by using the "Manual Run". If the
      data is not in standard form, use the "Manual Run" option.
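
   The statistics computed by the Data Format Program (per-feature
   means and standard deviations) and a typical normalization step can
   be sketched as follows. Python, illustrative only; the package does
   not document its exact normalization formula, so z-scoring is an
   assumption here:

```python
def column_stats(patterns):
    # Per-column mean and standard deviation over all patterns.
    n = len(patterns)
    cols = list(zip(*patterns))
    means = [sum(c) / n for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / n) ** 0.5
            for c, m in zip(cols, means)]
    return means, stds

def normalize(patterns):
    # Z-score each column (an assumed normalization, not necessarily
    # the one the package uses): subtract the mean, divide by the std.
    means, stds = column_stats(patterns)
    return [[(x - m) / s if s else 0.0
             for x, m, s in zip(p, means, stds)] for p in patterns]
```

   Counting patterns and printing the means and standard deviations,
   as in the demo run above, uses only column_stats.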

 C. Time Series Program
  1. Creating a Training Data File from Time Series:
     Given a file of columnar time series data, in which each row 
     corresponds to a different time, the program slides a window 
     down the file and writes inputs and desired outputs to a 
     training data file.
  2. The user chooses some rows and columns in the sliding window
     as inputs, and chooses others as desired outputs. For example, 
      a file of price and volume data on a stock can be processed 
      into a training data file for stock market prediction, if past 
      prices and volumes are inputs, and the present price is the 
      desired output.
  3. Input File Size
   a. Any number of training or testing patterns
   b. Up to 100 inputs
   c. Up to 40 outputs 
  4. Demo Run of Training File Creation From Time Series
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;

Exxon1     ! filename of file containing columns of time series data
4          ! number of columns of data in the file
1          ! number of columns from which desired outputs will be obtained
3          ! columns containing desired outputs (the current price)
2          ! number of columns from which inputs will be obtained 
2  3       ! specific columns from which inputs will be obtained (past prices and volumes)
1  4       ! number of rows back which contain inputs for column 2
1  4       ! number of rows back which contain inputs for column 3
Ex         ! filename chosen for the training data file

      Here, we will use historical stock price and volume data 
      to form training data for stock price prediction.
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>
    e. The program has created a file called "Ex" which has 8 inputs 
       (four past prices and four past volumes) and 1 desired output (a price).
   f. You can run this program on your own data, simply by editing the 
       parameter file in the "Batch Run" option.
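
   The sliding-window step described above can be sketched as follows
   (Python; the function and its argument layout are ours, and the
   rows-back specification is assumed to mean a range of rows, e.g.
   1 through 4 back):

```python
def sliding_window(series, in_spec, out_col):
    # series:  dict column -> list of values over time (row = time)
    # in_spec: dict column -> (first, last) rows back used as inputs
    # out_col: column whose current value is the desired output
    depth = max(last for _, last in in_spec.values())
    patterns = []
    for t in range(depth, len(series[out_col])):
        inputs = [series[c][t - r]
                  for c, (first, last) in in_spec.items()
                  for r in range(first, last + 1)]
        patterns.append(inputs + [series[out_col][t]])
    return patterns
```

   With columns 2 and 3 as inputs, rows 1 through 4 back for each, and
   column 3 as the desired output, this yields the 8-input, 1-output
   patterns of the demo run.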

 D. Data Compression Program
  1. Purpose of the Data Compression Program
     This program performs the forward or inverse Karhunen Loeve 
     transform (KLT) in order to
   a. Compress the input or desired output vectors in a 
      training data file so that redundancy is reduced and
      smaller networks can be trained,
   b. Expand or inverse transform a file containing network 
      output vectors, so that these vectors are the same size 
      as the original uncompressed desired output vectors.
  2. Functions of the Data Compression Program
   a. Reads a KLT matrix from the hard disk or constructs one from
      a data file with a user-chosen number of features. If the
      number of output features is less than the number of input
      features, then the transform matrix is rectangular and will 
      perform compression.
   b. Exits, or
   c. Transforms or compresses some of the elements in a data
      file's pattern vectors using the KLT matrix from part a above, or
   d. Inverse transforms or expands some of the elements in a data
      file's pattern vectors using the KLT matrix from part a above.
  3. Data file size
   a. Any number of training or testing patterns
   b. Up to 200 inputs
   c. Up to 200 outputs 
  4. Demo Run of the Data Compression Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;


TWO.KLT             ! filename for storing the KLT matrix (make up a name if file doesn't exist)
TWOD.TRA            ! enter data filename
15                  ! number of elements per pattern including outputs 
2                   ! 1 if the file is a classification data file 2 else
1  8                ! indices of first and last elements to transform
4                   ! number of features desired in the compressed vector
1                   ! 1 to compress the elements, 2 to expand them, 3 to form a KLT matrix, 4 to stop
TWO                 ! filename for storing output patterns
1                   ! 1 to compress the file which is already opened, 2 to compress a new file
4                   ! 1 to compress the elements, 2 to expand them, 3 to form a KLT matrix, 4 to stop


       Here, we compress the 8 input features of the non-classification
       (mapping in this case) training data file twod.tra down to 4 features.
      "15" denotes the total number of elements per pattern, which includes
      the desired outputs. The KLT matrix is to be stored 
      in file two.klt. The output mapping training data file, two,
      has 4 features.
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>
   e. You can train a neural net using the file, two, if you like.
   f. You can run this program on your own data, simply by editing the 
       parameter file in the "Batch Run" option, or by choosing the manual
      run.
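
   The forward KLT used above amounts to projecting mean-centred
   patterns onto leading eigenvectors of their covariance matrix. A
   minimal sketch (Python; power iteration stands in for the package's
   actual eigensolver, which is not documented here):

```python
def klt_matrix(patterns, n_keep, iters=200):
    # Build a rectangular KLT matrix: rows are the n_keep leading
    # eigenvectors of the data covariance matrix.  Power iteration
    # with deflation stands in for a real eigensolver.
    n, d = len(patterns), len(patterns[0])
    means = [sum(col) / n for col in zip(*patterns)]
    centered = [[x - m for x, m in zip(p, means)] for p in patterns]
    cov = [[sum(row[i] * row[j] for row in centered) / n
            for j in range(d)] for i in range(d)]
    basis = []
    for _ in range(n_keep):
        v = [1.0] * d
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            for b in basis:              # deflate components already found
                dot = sum(wi * bi for wi, bi in zip(w, b))
                w = [wi - dot * bi for wi, bi in zip(w, b)]
            norm = sum(x * x for x in w) ** 0.5 or 1.0
            v = [x / norm for x in w]
        basis.append(v)
    return basis, means

def klt_compress(patterns, basis, means):
    # Project each mean-centred pattern onto the KLT basis rows.
    return [[sum(b[j] * (p[j] - means[j]) for j in range(len(p)))
             for b in basis] for p in patterns]
```

   When n_keep is smaller than the pattern dimension, the matrix is
   rectangular and performs compression, as described in item 2a above.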
 E. Feature Selection
  1. Actions Performed by the Feature Selection Program
     This program performs feature selection on training data files
     used for classification or mapping
   a. The program reads the training data filename, and asks the user
      whether the data is for classification or mapping. For 
      classification, the standard form, where inputs are followed by 
      the correct class number, is necessary. For mapping, the standard 
      form, where inputs are followed by the correct desired outputs,
      is necessary.
    b. The program analyzes the training data file and prints out a 
       list of feature numbers, each with a measure of its importance.
       Larger values of the measure denote more importance. The program
       then orders the features according to their importance.
  2. Data file size
   a. Any number of training or testing patterns
   b. Up to 100 inputs
   c. Up to 40 outputs or classes 
  3. Demo Run of the Feature Selection Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;

twod.tra            ! enter data filename
2                   ! 1 for classification training data, 2 for mapping training data
8                   ! number of input features per pattern
7                   ! number of desired outputs per pattern


      Here, we analyze training data file twod.tra which has 8 input 
      features and 7 desired outputs. 
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>. Note 
      that the re-ordered features have the same order as the original
      features. This means that feature number 1 was the most important 
      one, feature number 2 was the 2nd most important one, etc.
   e. You can run this program on your own data, simply by editing the 
       parameter file in the "Batch Run" option, or by choosing the manual
      run.
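
   The manual does not state which importance measure the feature
   selection program uses. As a hedged stand-in, the sketch below
   (Python) ranks input features of a mapping file by their largest
   absolute correlation with any desired output:

```python
def feature_importance(patterns, n_inputs):
    # Rank input features of a mapping training file.  Stand-in
    # measure (assumed, not the package's documented one): the largest
    # absolute correlation between a feature and any desired output.
    n = len(patterns)
    def col(j):
        return [p[j] for p in patterns]
    def corr(a, b):
        ma, mb = sum(a) / n, sum(b) / n
        num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        da = sum((x - ma) ** 2 for x in a) ** 0.5
        db = sum((y - mb) ** 2 for y in b) ** 0.5
        return num / (da * db) if da and db else 0.0
    n_out = len(patterns[0]) - n_inputs
    scores = [(j + 1,                      # features numbered from 1
               max(abs(corr(col(j), col(n_inputs + k)))
                   for k in range(n_out)))
              for j in range(n_inputs)]
    return sorted(scores, key=lambda t: -t[1])
```

   The output is a list of (feature number, importance) pairs in
   decreasing order of importance, mirroring step b above.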

  
IV. Multilayer Perceptron Networks

    Outline

 A. MLP Network Limitations and Characteristics
 B. Multilayer Perceptron (MLP) Options
 C. Error Functions for Training MLP Networks
 D. MLP Network Sizing
 E. Fast Training Program
 F. Network Modeling and Pruning Program
 G. Automated MLP Design 
 H. Processing Data with a Trained MLP 


 A. MLP Network Limitations and Characteristics
  1. There is no limitation on data file size.
  2. MLP neural nets are limited to 40 or fewer units for hidden or 
     output layers and 100 units for the input layer. 
  3. Activation Functions; Sigmoidal ( Out = 1/(1 + exp(-Net)) ) hidden 
        units and output units.
  4. Hidden layers allowed; 1 or 2
  5. Connectivity; Each hidden or output layer connects fully to the 
         previous layer and back to some user-chosen earlier layer; in 
         particular, full connectivity is allowed.
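
  The activation and connectivity rules above can be sketched as a
  forward pass (Python; illustrative only, names are ours). Each layer
  receives the concatenated outputs of all previous layers, matching
  the fully connected case:

```python
import math

def sigmoid(net):
    # Out = 1 / (1 + exp(-Net)), as used by hidden and output units.
    return 1.0 / (1.0 + math.exp(-net))

def mlp_forward(x, weights, thresholds):
    # weights[l][u]: input weights of unit u in layer l, one weight per
    # element of the concatenated outputs of ALL earlier layers;
    # thresholds[l][u]: that unit's threshold (bias).
    prev = list(x)                    # concatenated outputs so far
    for layer_w, layer_t in zip(weights, thresholds):
        outs = [sigmoid(sum(w * p for w, p in zip(row, prev)) + t)
                for row, t in zip(layer_w, layer_t)]
        prev = prev + outs            # later layers see everything
    return outs
```

  A non-fully-connected net would simply feed each layer a slice of
  `prev` starting at its chosen back-connection layer.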

 B. Multilayer Perceptron (MLP) Options
  1. Network Sizing. Estimates attainable training error for an
     MLP or functional link network. Estimates the numbers of 
     hidden layers and hidden units, given a training data file.
  2. Fast training of MLP networks. Trains networks one or two 
     orders of magnitude faster than BP.
  3. Analyze and prune trained MLPs from BP or fast training.
     Produces weight and network structure files for the pruned network,
     which can be saved to disk, in the non-demo version.
  4. Automated MLP Design. Given a training data file and its number of
     inputs and desired outputs, an MLP is sized, designed and pruned.
  5. Process data using a trained MLP. Data may or may not include 
     desired outputs.
  6. Create formatted weight file. Given a network structure file and a 
     weight file, creates a formatted weight file that clearly shows 
     the different connections and their weights and thresholds.

 C. Error Functions for Training MLP Networks
  1. The error function that is being minimized during fast training is

                   Nout      
    MSE = (1/Npat) SUM MSE(k)     where
                   k=1  

              Npat              2
    MSE(k) =  SUM [ Tpk - Opk ]
              p=1  

     where Npat is the number of training patterns, Nout is the number 
     of network output nodes, Tpk is the desired output for the pth
     training pattern and the kth output, and Opk is the actual output 
     for the pth training pattern and the kth output. MSE is printed
     for each iteration.
  2. Additional errors printed out are defined as follows.
     The rms error of the kth output, RMS(k), is SQRT( MSE(k)/Npat ),
     where SQRT means square root. The kth output's Relative RMS Error is

    R(k) = SQRT( MSE(k)/E(k) ) where

            Npat           2
    E(k) =  SUM [ Opk-Mk ]      and
            p=1  

                  Npat 
    Mk = (1/Npat) SUM  Opk 
                  p=1  

     The kth output's Error Variance is MSE(k)/Npat.
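
  The error definitions above can be computed directly. A sketch
  (Python; illustrative only, with Tpk and Opk passed as per-pattern
  rows):

```python
def error_report(desired, actual):
    # desired[p][k] = Tpk, actual[p][k] = Opk.  Returns the overall MSE
    # plus per-output MSE(k), RMS(k), relative RMS R(k), and error
    # variance, matching the definitions above.
    npat, nout = len(desired), len(desired[0])
    mse_k, rms_k, rel_k, var_k = [], [], [], []
    for k in range(nout):
        t = [row[k] for row in desired]
        o = [row[k] for row in actual]
        msek = sum((tp - op) ** 2 for tp, op in zip(t, o))
        mk = sum(o) / npat
        ek = sum((op - mk) ** 2 for op in o)
        mse_k.append(msek)
        rms_k.append((msek / npat) ** 0.5)
        rel_k.append((msek / ek) ** 0.5 if ek else 0.0)
        var_k.append(msek / npat)
    return sum(mse_k) / npat, mse_k, rms_k, rel_k, var_k
```
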

 D. MLP Network Sizing
  1. Functions of the Network Sizing Program
   a. Reads a training data file
   b. Requests a minimum acceptable training error 
      from the user,
   c. Requests a maximum number of iterations from the user.
      A typical number is 10.
   d. Estimates the required network structure for an MLP to attain 
      the desired mean-squared error (MSE). 
   e. Plots network diagrams and corresponding mapping errors 
      (1 and 2 hidden layer cases) versus number of hidden units.
   f. Notifies user if the network will generalize or memorize
  2. Demo Run of the Network Structure Estimation Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;


TWOD.TRA            ! input training data filename
8                   ! number of inputs per pattern 
7                   ! number of outputs per pattern 
.2                  ! maximum acceptable mean-square training error
10                  ! maximum allowable number of iterations
                    ! press <Enter> to continue
3                   ! 1 or 2 to generate a structure file, 3 to stop

      Here, we will estimate the required size of an MLP for 
      a remote sensing problem.
   c. Click on "Continue" and observe the program running. 
      Press <Pause> to get a look at the screen if it changes too fast.
   d. Go to the "Examine Program Output" option and press <ret>
    e. The program predicts that a network with 28 hidden units can be
       trained with an MSE of .33, and that networks with the proposed
       structure should successfully generalize.
   f. You can run this program on your own data, simply by editing the 
       parameter file in the "Batch Run" option.



 E. Fast Training Program
  1. Purpose;
   a. Initialize an MLP using random initial weights
   b. Train an MLP network using a method much faster than BP
  2. Features;
   a. Uses a batching approach, so the order of training 
      patterns is unimportant
   b. Has adaptive learning factor
   c. Shows training MSE
    d. Has a separating mean option when specifying a new network structure.
       The mean of selected inputs can be subtracted from some of the inputs
       and some of the outputs. This is useful in image processing and 
       waveform processing. 
   e. Does not save weights to the disk, in the demo version
  3. Demo Run of Fast Training Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;


15, .2                ! Enter number of iterations, MSE threshold
2                     ! Enter 1 for old weights, 2 to initialize with random weights
two.top               ! file storing network structure
twod.tra              ! filename for training data 
1                     ! 1 if the data file contains desired outputs, 2 else
.03                   ! learning factor
two.wts               ! filename for saving the trained weights
4                     ! 1 to continue training, 2 to start new network, 3 for a new data file, 4 to stop


       The program will read all patterns from the file twod.tra, and 
       train an MLP using the network structure file two.top, which is 
      shown below.

           3
           8          20           7
           1           1


      The network will have 3 layers including 8 inputs, 20 hidden units
      in one hidden layer, and 7 outputs. In addition, layers 2 and
      3 connect to all previous layers. Training will stop
      after 15 iterations, or when the MSE reaches .2 . The final
      network weights will be stored in the file two.wts. 
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>
   e. You can run this program on your own data, simply by editing the 
       parameter file in the "Batch Run" option, or by choosing "Manual Run".



 F. Network Modeling and Pruning Programs
  1. Given a trained MLP network;
    a. The pruning program measures the network's performance as a function of 
      the number of hidden units, and prunes the network of useless 
      hidden units. A network diagram is plotted, as is the mapping 
      error versus number of hidden units.
    b. The modeling program models the network with polynomials of 
      varying degree, allowing one to determine the effective degree 
      (the degree of a polynomial which models the network) or the 
      amount of nonlinearity of the network. Mapping error is plotted 
      versus network degree. First degree networks are virtually
      linear and do not require hidden units.
  2. The non-demo version of the pruning program; 
      Prunes the net with a user-chosen amount of error and saves the 
      pruned net to disk
  3. Demo Run of Network Pruning Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;

two.top             ! structure file of trained network
twod.tra            ! data file
two.wts             ! weight file of trained network
1.14                ! Maximum acceptable ratio of new MSE to old MSE
tw.top              ! structure filename for the pruned network
tw.wts              ! weight filename for the pruned network


      The program will prune the network from the fast training program,
      which has network structure file two.top and weight file two.wts,
       so that its MSE is at most 1.14 times the original one.
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>
   e. The program has created a new network, having 11 hidden units,
      which performs about as well as the original network.
   f. You can run this program on your own data, simply by editing the 
       parameter file in the "Batch Run" option. Realistically, it is 
      more productive to use the "Manual Run" with this program.
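
   The manual does not spell out the pruning algorithm itself. One
   plausible reading, sketched below (Python; greedy removal is our
   assumption), drops whichever hidden unit hurts the MSE least until
   the error ratio would exceed the user's threshold:

```python
def prune_hidden_units(n_hidden, mse_fn, max_ratio):
    # Greedy pruning (our assumption; the manual does not spell out the
    # algorithm): repeatedly drop the hidden unit whose removal hurts
    # the MSE least, while the MSE stays within max_ratio of original.
    # mse_fn(active) returns the MSE using only the units in `active`.
    active = set(range(n_hidden))
    base = mse_fn(active)
    while len(active) > 1:
        best = min(active, key=lambda u: mse_fn(active - {u}))
        if mse_fn(active - {best}) > max_ratio * base:
            break
        active.discard(best)
    return sorted(active)
```

   With max_ratio = 1.14, as in the demo run, useless hidden units are
   removed while the surviving network performs about as well as the
   original.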



 G. Automated MLP Design
  1. Basic Idea
   a. To generate a mapping MLP with little or no user input
   b. The user must know the training data file's name and its
      numbers of inputs and desired outputs
  2. Processing
    a. Generates an MLP network structure file
   b. Trains an MLP
   c. Prunes the MLP
   3. Sub-Menu Options
   a. Fully Automated: Inputs are the training data filename and the
      numbers of inputs and desired outputs. Everything is done 
      automatically with no further user input.
   b. Semi Automated: Inputs are same as in Fully Automated case. 
      However, the parameter files for performing sizing, training, and
      pruning can be edited by the user to change the number of training
      iterations, the threshold pruning error, and the final output 
      network structure and weight filenames.
    c. Demo: This is a demo of the Fully Automated option.
   4. Demo Software Operation
      If the software package is the demo version:
   a. The Automated MLP Design operates in non-demo mode if the training
      data file has 8 inputs and 7 desired outputs.
   b. Otherwise, the Automated MLP Design operates in demo mode, which means
      that;
   (1) the sizing program always "automatically" generates network
       structure files with the same number of hidden units,
   (2) the training program does not save weights, and
   (3) pruning is not used.
  5. Demo Run of Automated MLP Design 
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;


c:\neuron\batchm\dat\twod.tra     ! training data filename
8     7                           ! Numbers of inputs and desired outputs


       The program will process training file twod.tra and produce an
       MLP with the structure and weight files junk.top and junk.wts.
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Log" option and press <ret>
   e. You can run this program on your own data, simply by using the
      Semi Automated mode, where the parameter files for sizing, training,
      and pruning can be edited.


 H. Processing Data with a Trained Network
  1. Files Read
   a. Network structure file
   b. Weight file for a trained network
   c. Input data filename and type
  2. Processing;
   a. Processes inputs into output layer activations
   b. Saves network outputs to disk. Can save desired outputs as well.
   c. Total MSE and MSE per output are calculated. 
  3. Demo Run of MLP Processing Program
   a. Go to the "Demo" option and press <ret>
   b. Observe the parameter file with commented keyboard responses;


twod.out            ! Filename for storing network Outputs 
two.top             ! Network structure filename
twod.tra            ! input data filename
1                   ! 1 if file includes desired outputs, 0 else 
TWO.wts             ! Filename containing weights
1                   ! 1 to save desired output vector followed by actual output
                    !     vector for each pattern, 0 for actual outputs only
2                   ! 1 to continue, 2 to stop


      The program will process training file twod.tra using the network
      with structure file two.top and weight file two.wts.
   c. Click on "Continue" and observe the program running
   d. Go to the "Examine Program Output" option and press <ret>
   e. You can run this program on your own data, simply by editing the 
      parameter file in the "batch Run" option.




V. Functional Link Training and Testing Program

    Outline

 A. Program Purpose 
 B. Network Characteristics
 C. Files Needed or Produced
 D. Example Run of Functional Link Program
 E. Error Functions for Functional Link Net



 A. Program Purpose 
  1. Initialize and train a functional link mapping network using 
      a fast training method.
  2. Process a data file having no desired outputs.
  3. Non-demo version saves weights to a disk file.
 
 B. Network Characteristics
  1. Activation Functions; Linear output units 
  2. Net Functions; polynomial functions of the inputs, with 
        user-chosen degrees of 1 to 5.

 C. Files Needed or Produced
  1. The network structure file; stores the number of inputs and outputs, 
        and the polynomial degree.
  2. The training or testing data file, which gives example inputs 
     and outputs for network learning, or for testing after learning.
     All data files are in formatted, standard form, which means that 
     each pattern or feature vector is followed by the desired outputs.
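     As a sketch of the standard form, the following pure-Python reader
     (illustrative only, not part of the package) splits each formatted
     pattern into its inputs and desired outputs:

```python
def read_standard_form(lines, n_inputs, n_outputs):
    """Split each formatted pattern line into (inputs, desired_outputs)."""
    patterns = []
    for line in lines:
        values = [float(v) for v in line.split()]
        assert len(values) == n_inputs + n_outputs, "malformed pattern"
        patterns.append((values[:n_inputs], values[n_inputs:]))
    return patterns

# A toy 2-input, 1-output file with two patterns:
demo = ["0.5 1.0  0.25", "1.5 2.0  0.75"]
pats = read_standard_form(demo, 2, 1)
```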

 D. Example Run of Functional Link Program
  1. Go to the "Demo" option and press <ret>
  2. Observe the parameter file with commented keyboard responses;

1                   ! Enter 1 to train a network, 2 to test a network
1                   ! Enter 1 to use an old network structure, 2 for a new one
two.tp              ! old network structure filename
twod.tra            ! data filename
0                   ! Enter number of patterns to read (0 for all) 
1, 2                ! Enter numbers of first and last patterns to examine
2                   ! 1 for old weights, 2 for new random initial weights
two.wt              ! filename for saving trained weights
3                   ! 1 to start new network, 2 to test the network on a data file, 3 to stop


     The program will read all patterns from the file twod.tra, and train a
     functional link net using the network structure file two.tp, which
     is shown below.

           3           8           7
         165           7

     The network will be 3rd degree with 8 inputs and 7 outputs. The 
     final network weights will not be stored in the demo version.
  3. Click on "Continue" and observe the program running
  4. Go to the "Examine Program Output" option and press <ret>
  5. You can run this program on your own data, simply by editing the 
     parameter file in the "batch Run" option.
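     Per the glossary, the structure file also records the dimension of
     the multinomial vector. The 165 shown above matches the number of
     multinomials of degree at most 3 in 8 variables (including the
     constant term), C(N+d, d); assuming that interpretation, a quick
     check:

```python
from math import comb

def multinomial_count(n_inputs, degree):
    # Multinomials of degree <= `degree` in `n_inputs` variables,
    # counting the constant term: C(n_inputs + degree, degree).
    return comb(n_inputs + degree, degree)

count = multinomial_count(8, 3)   # the demo net: 8 inputs, degree 3
```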

 E. Error Functions for Functional Link Net
  1. The error function that is being minimized during functional link
     training is

                   Nout      
    MSE = (1/Npat) SUM MSE(k)     where
                   k=1  

              Npat              2
    MSE(k) =  SUM [ Tpk - Opk ]
              p=1  

     where Npat is the number of training patterns, Nout is the number 
     of network output nodes, Tpk is the desired output for the pth
     training pattern and the kth output, and Opk is the actual output
     for the pth training pattern and the kth output. MSE is printed
     for each iteration.
  2. Additional errors printed out are defined as follows.
     The rms error of the kth output, RMS(k), is SQRT( MSE(k)/Npat ),
     where SQRT means square root. The kth output's Relative RMS Error is

    R(k) = SQRT( MSE(k)/E(k) ) where

            Npat           2
    E(k) =  SUM [ Opk-Mk ]      and
            p=1  

                  Npat 
    Mk = (1/Npat) SUM  Opk 
                  p=1  

     The kth output's Error Variance is MSE(k)/Npat.
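     These statistics follow directly from the definitions above; a
     pure-Python illustration (with made-up targets and outputs, not the
     package's own code):

```python
from math import sqrt

def error_stats(targets, outputs):
    """targets, outputs: one list per pattern, each holding Nout values.
    Returns (MSE, MSE_k, RMS_k, R_k) per the definitions above."""
    npat, nout = len(targets), len(targets[0])
    mse_k = [sum((t[k] - o[k]) ** 2 for t, o in zip(targets, outputs))
             for k in range(nout)]
    mse = sum(mse_k) / npat
    rms_k = [sqrt(m / npat) for m in mse_k]
    r_k = []
    for k in range(nout):
        m_k = sum(o[k] for o in outputs) / npat   # mean actual output Mk
        e_k = sum((o[k] - m_k) ** 2 for o in outputs)
        r_k.append(sqrt(mse_k[k] / e_k))          # relative rms error
    return mse, mse_k, rms_k, r_k
```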



VI. Modular Network Training and Testing Program

    Outline

 A. Program Features 
 B. Network Characteristics
 C. Files Needed or Produced
 D. Example Run of Modular Net Program
 E. Error Functions for Modular Net



 A. Program Features 
  1. Initialize and train a modular mapping network using 
      a fast training method. Plots network diagram and 
      training error versus number of iterations.
  2. Process a data file with or without desired outputs.
  3. Non-demo version saves weights to a disk file.
  4. Has separating mean option when specifying a new network structure.
     The mean of some inputs can be subtracted from some of the inputs
     and some of the outputs. This is useful in image processing and 
     waveform processing. 
 
 B. Network Characteristics
  1. Maximum number of inputs; 100.  Maximum number of outputs; 40.
     Maximum number of modules; 20.
  2. Given an input vector and a trained network, the modular net 
     switches on the appropriate module, which then processes the
     input vector into the output vector.

 C. Files Needed or Produced
  1. The network weight file; stores the number of inputs, outputs, 
     modules. Also stores the weights. The weight file is produced by
     the program during training. The file is requested by the program
     during testing.
  2. The training or testing data file, which gives example inputs 
     and outputs for network learning, or for testing after learning.
     All data files are in formatted, standard form, which means that 
     each pattern or feature vector is followed by the desired outputs.

 D. Example Run of Modular Net Program
  1. Go to the "Demo" option and press <ret>
  2. Observe the parameter file with commented keyboard responses;


twod.tra            ! data filename
8 7                 ! Enter number of inputs and desired outputs in the file
15                  ! Enter number of training iterations
10                  ! Enter maximum number of modules to use
1                   ! Enter 1 for regular network, 2 for separating mean network
0                   ! 1 to continue training, 2 to start a new network, 3 to 
                    ! apply the network to a data file, 4 to save weights, 
                    ! 5 to open a new data file, 0 to stop


     The program will read all patterns from the file twod.tra, and train a
     modular net having 8 inputs, 7 outputs, and 10 modules, using 15
     training iterations. The final network weights will not be 
     stored in the demo version.
  3. Click on "Continue" and observe the program running
  4. Go to the "Examine Program Output" option and press <ret>
  5. You can run this program on your own data, simply by editing the 
     parameter file in the "batch Run" option.

 E. Error Functions for Modular Net 
  1. The error function that is being minimized during modular net
     training is

                   Nout      
    MSE = (1/Npat) SUM MSE(k)     where
                   k=1  

              Npat              2
    MSE(k) =  SUM [ Tpk - Opk ]
              p=1  

     where Npat is the number of training patterns, Nout is the number 
     of network output nodes, Tpk is the desired output for the pth
     training pattern and the kth output, and Opk is the actual output
     for the pth training pattern and the kth output. MSE is printed
     for each iteration.
  2. Additional errors printed out are defined as follows.
     The rms error of the kth output, RMS(k), is SQRT( MSE(k)/Npat ),
     where SQRT means square root. The kth output's Relative RMS Error is

    R(k) = SQRT( MSE(k)/E(k) ) where

            Npat           2
    E(k) =  SUM [ Opk-Mk ]      and
            p=1  

                  Npat 
    Mk = (1/Npat) SUM  Opk 
                  p=1  

     The kth output's Error Variance is MSE(k)/Npat.




VII. Unsupervised Learning

 A. Input Data Formats 
  1. Each pattern must have inputs followed by 0 or more 
     outputs. Therefore, training data files will work.
  2. Training data for classification typically has N features 
     followed by the class id. 
  3. Training data for mapping typically has N
     features followed by several desired output values. 

 B. Output Data Format
    The formatted output file from clustering includes the number 
    of clusters, followed by the cluster vectors themselves.

 C. Available Algorithms 
  1. Cluster a data file using Sequential Leader or 
     K-Means Clustering. 
  2. Cluster a data file using Kohonen's Self-Organizing 
     Feature Map.
  3. Classify a data file using clusters from K-Means or Self-Organizing 
     Map clustering. The number of the cluster to which each vector is
     closest is determined.

 
 D. Error Function for Unsupervised Learning

   The error function that is being minimized during K-Means 
   clustering and self-organizing map training is

                    N      
    MSE = (1/Npat) SUM MSE(k)     where
                   k=1  

              Npat                      2
    MSE(k) =  SUM [ x(p,k) - m(i(p),k) ]  ,
              p=1  

    Npat is the number of training patterns, N is the number 
    of inputs per pattern, x(p,k) is the kth input sample from the
    pth pattern, m(i,k) is the kth sample from the ith cluster, and
    i(p) is the index of the cluster to which the pth pattern
    belongs.
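
    In other words, the error is the total squared distance between
    each pattern and its assigned cluster center, divided by Npat; a
    minimal sketch (illustrative, not the package's implementation):

```python
def clustering_mse(patterns, centers, assignment):
    """patterns: input vectors x(p); centers: cluster means m(i);
    assignment[p] = i(p), the cluster the pth pattern belongs to."""
    total = 0.0
    for x, i in zip(patterns, assignment):
        total += sum((xk - mk) ** 2 for xk, mk in zip(x, centers[i]))
    return total / len(patterns)
```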

 E. Conventional Clustering Comments 
  1. Cluster a data file using Sequential Leader or 
     K-Means Clustering. 
  2. Desired outputs, if any, can be ignored.
  3. Plots a network diagram. For sequential leader clustering 
     the accumulated clustering error is plotted versus pattern 
     number. For K-means clustering, clustering error is plotted 
     versus iteration number.

 F. Demo Run of Conventional Clustering
  1. Under the "Conventional Clustering" option, choose "Demo"
  2. From the parameter file,

15      ! number of elements in a pattern (inputs plus outputs)
7       ! number of outputs in a pattern (class id not used)
Twod.tra ! filename for training set
1       ! Enter 1 to start clustering, 2 to read old clusters
1       ! Enter 1 for sequential leader clustering, 2 to refine the clusters using K-Means, 3 to stop
15.     ! threshold for sequential leader clustering
        ! Enter <return> to proceed
2       ! refine the clusters using K-Means
10      ! number of K-Means iterations
        ! Enter <return> to proceed
3       ! stop clustering
1       ! save clusters
cl      ! filename for saved clusters

     we see that the program will apply sequential leader clustering
     to the file Twod.tra, with a threshold of 15. Then 10 iterations of
     K-Means clustering will be used. The clusters will be saved in 
     a file called cl.
  3. Click on "Continue" to exit and observe the program running.
  4. After running the program, we can "Examine Program Output",
     where we observe that the normalized clustering error is 3.200854.

 G. Self-Organizing Map Comments 
  1. Cluster a data file using Kohonen's Self-Organizing 
     Feature Map.
  2. Desired outputs, if any, can be ignored.
  3. Plots a 2-D projection of the clusters and the clustering error 
     versus iteration number. 

 H. Demo Run of Self-Organizing Map
  1. Under the "Neural Clustering" option, choose "Demo"
  2. From the parameter file,

15      ! number of elements in a pattern (inputs plus outputs)
7       ! number of outputs in a pattern (class id not used)
Twod.tra ! filename for shape recognition training set
2       ! Enter 1 to display a network diagram, 2 to show 
        !   self-organizing clusters 
1       ! Enter 1 to randomly initialize clusters, 2 to read old cluster file
36      ! pick 36 as the number of clusters
20      ! number of iterations
2       ! use linearly decreasing learning factor and neighborhoods
.8  5   ! initial learning factor and half-neighborhood size
        ! Enter <return> to proceed
3       ! Enter 1 to save clusters, 2 to re-initialize them, 3 to
        ! continue clustering, 4 to stop
5       ! number of iterations
2       ! use linearly decreasing learning factor and neighborhoods
.04  0  ! initial learning factor and half-neighborhood size
        ! Enter <return> to proceed
1       ! Enter 1 to save clusters, 2 to re-initialize them,  3 to
        ! continue clustering, 4 to stop
sm      ! filename for saved clusters
4       ! 4 to stop

     we see that the program will apply Self-Organizing Map clustering
     to the file Twod.tra with 20 iterations. The number of random initial 
     clusters is 36. The initial learning factor and half-neighborhood
     size are respectively .8 and 5, and linearly decreasing neighborhoods 
     and learning factor are chosen. After 20 iterations, 5 additional 
     iterations are specified. The clusters will be saved in a file 
     called sm.
  3. Click on "Continue" to leave the parameter file and run the program.
  4. After running the program, we can "Examine Program Output",
     where we observe that the normalized clustering error is 2.84466.
  5. You can run this program on your own data, simply by editing the 
     parameter file in the "batch Run" option.


 I. Classify Vectors 
  1. Opens a data file 
  2. Opens a cluster file generated by SOM or K-means clustering
  3. Classifies all vectors in the file
  4. Saves the resulting cluster numbers to a formatted output file
  5. After running the program, you can "Examine Program Output"


VIII. Utilities
 A. View a File
     View a user-chosen ASCII file
 B. Combine Files
     Combine user-chosen columns from one or more files into a new file
 C. Split Files
     Randomly split a file into two new ones, given the input filename,
     two output filenames, and a probability of assignment to the first
     output file. This allows the user to create training and testing
     files from a single file.
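     The random split can be sketched as follows (illustrative pure
     Python, not the utility's own code):

```python
import random

def split_file(lines, p_first, seed=None):
    """Assign each line to the first output with probability p_first,
    otherwise to the second; returns the two line lists."""
    rng = random.Random(seed)
    first, second = [], []
    for line in lines:
        (first if rng.random() < p_first else second).append(line)
    return first, second

# e.g. an 80/20 training/testing split of 100 patterns:
train, test = split_file(["pat%d" % i for i in range(100)], 0.8, seed=1)
```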
 D. Examine a File
     Given a user-chosen ASCII file's name, calculate means, standard 
     deviations, histograms, and plots of columns in the file.


Appendix 1. Terminology


activation function - a usually nonlinear function that processes 
  a unit's net function into the unit's output. (See "multilayer 
  perceptron" and "net function") The sigmoid activation is

           1
  O = -------------
      1 + exp(-net)
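
  In code, using the conventional negated net in the exponent:

```python
from math import exp

def sigmoid(net):
    # Conventional sigmoid: maps any net value into (0, 1),
    # with sigmoid(0) == 0.5 and sigmoid(net) -> 1 as net -> +inf.
    return 1.0 / (1.0 + exp(-net))
```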


classification - the process of making one of a fixed number of
  possible decisions, given a fixed number of numerical inputs. 
  The output of classification is an integer which indicates 
  the class decision. A network for classifying images of 
  handprinted numerals (0 through 9) would have 10 outputs 
  (in uncoded format). A classifier for processing stock market 
  data could make buy/sell decisions but would not predict
  future prices.

clustering - see unsupervised learning

coded format outputs - in a classification network, coded output 
  format means that the number of outputs is Nout = ceil(Log2(Nc)), 
  where Nc is the number of classes. The Nc desired output vectors 
  are then just the Nout-bit binary representations of the integers 
  0 through Nc-1. 

error function - the function which is minimized during neural net
  training or unsupervised learning. The specific error functions
  minimized in this software package are given in the help files
  for the algorithm in question.

Euclidean distance - given two vectors x and y with N elements each,
  the Euclidean distance is the square root of 

        N               2
       Sum (y(i) - x(i))
       i=1

functional link net - A functional link net is a network in which
  (1) nonlinear functions of the inputs are formed to augment or
  add to the input vector, and (2) the outputs are linear functions
  of the augmented input vector. In the most common form of the
  functional link net, the augmented inputs are multinomials 
  formed from the original inputs. Since linear equations can 
  be solved for the output weights, functional link net training 
  is multidimensional polynomial regression. One problem with 
  this type of network is that it suffers from combinatorial 
  explosion. In other words, the number of possible multinomials 
  grows explosively with the network degree.
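
  The augmented input vector can be built by enumerating multinomials
  up to the chosen degree; a sketch using Python's itertools
  (illustrative only, not the package's code):

```python
from itertools import combinations_with_replacement
from math import comb, prod

def multinomial_features(x, degree):
    """Augment input vector x with every multinomial of degree <= degree;
    the empty combination yields the constant term 1."""
    feats = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            feats.append(prod(x[i] for i in combo))
    return feats

feats = multinomial_features([2.0, 3.0], 3)
# Term count is C(N + degree, degree); it explodes as the degree grows.
expected = comb(2 + 3, 3)
```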

KLT - Karhunen-Loeve transform. A linear, orthogonal transform in which
  the rows of the N by N (for compression, M by N where M is less than
  N) transformation matrix are eigenvectors of the N by N autocovariance
  matrix of the N-dimensional input vectors to be transformed. The KLT
  transformation matrix is the transpose of the U matrix from the singular
  value decomposition (SVD) of the autocovariance matrix. The KLT is 
  the optimal transform for compressing data, if your goal is to later
  reconstruct the data with the least mean-square error, using a 
  given number of KLT coefficients. The KLT does not, in general, 
  optimally compress neural net inputs, since the KLT transformation
  matrix is not designed using information from the desired output
  vectors. However, it can optimally compress the desired output
  vectors if it was designed using them instead of the input vectors.
  Another name for the KLT is principal components.
    
k-means clustering - given Nc initial clusters, which could come from
  sequential leader clustering, k-means iteratively (1) calculates a new
  mean vector for each cluster (necessary if any input vectors have changed
  clusters) and (2) reclassifies the input vectors to their nearest
  cluster. The sum of the distances between the input vectors and the 
  closest mean vectors is reduced. A distance measure, usually the 
  Euclidean distance, is used. In adaptive k-means, the reclassification 
  and mean calculation steps are performed during one pass through the 
  data. 
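
  The two alternating steps can be sketched as follows (a toy
  pure-Python illustration, not this package's implementation):

```python
def kmeans_step(patterns, centers):
    """One k-means iteration: reclassify each pattern to its nearest
    center (squared Euclidean distance), then recompute each center
    as the mean of its members. Returns (new_centers, assignment)."""
    def d2(x, m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    assignment = [min(range(len(centers)), key=lambda i: d2(x, centers[i]))
                  for x in patterns]
    new_centers = []
    for i, m in enumerate(centers):
        members = [x for x, a in zip(patterns, assignment) if a == i]
        if members:  # an empty cluster keeps its old center
            m = [sum(col) / len(members) for col in zip(*members)]
        new_centers.append(m)
    return new_centers, assignment
```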

layers - An MLP in this software package can have 2 to 4 layers, 
  including the input and output layers. Therefore, a 3-layer network 
  has one hidden layer and a 4-layer network has two hidden layers.

mapping - In mapping, you process numerical inputs into real-valued 
  (floating point) outputs. A mapper for processing stock market data 
  could predict future prices, but would not make a buy/sell decision.

modular network - a neural network which consists of several networks
  connected together. The modules may all work in parallel with their
  answers then combined, in series, or in parallel with only one at a
  time being switched on. We use the latter scheme in this software 
  package. 

MSE threshold - one of the two stopping parameters. It is a threshold 
  on the MSE used in training a functional link net or MLP. If the MSE 
  falls below this threshold, training is stopped. To disable this 
  parameter, use a negative value for it such as -1.

multilayer perceptron (MLP) - An MLP, sometimes called a backpropagation
  neural network, is a feedforward (usually) network in which outputs
  are algebraic, nonlinear functions of inputs. The MLP has at least
  two layers of units or artificial neurons, the input and output layers. 
  Additional layers, which make the network nonlinear, are called 
  hidden layers. In each hidden layer or output layer unit, an inner 
  product of weights and signals from previous layers, called a net 
  function, is formed. The unit's output is formed by putting the net 
  function through the activation function, which is usually nonlinear. 
  (See "activation function" and "net function" )

multinomial - a one-term polynomial in two or more variables whose 
  coefficient is 1, such as 
         2    3
     (x1) (x2)

net function - In the multilayer perceptron, a unit's first operation 
  is to form a number, called the net, using an inner product of 
  weights or coefficients multiplied by signals or unit outputs 
  from previous layers. The net function is then fed into the activation 
  function, yielding the unit's output. (See "activation function" and 
  "multilayer perceptron")
 
network degree - the degree of a polynomial that approximates a 
  given MLP network with a given amount of approximation error. As
  the network degree increases, this approximation error decreases.
  Network degree estimation is performed by program Tmapc2.

network structure file - a file that specifies the structure of 
  a network. For the MLP, this file stores the number of network 
  layers, units per layer, and connectivity between layers.
  For a functional link net, this file specifies the network degree, 
  numbers of inputs and outputs, and the dimension of the multinomial 
  vector.

number of iterations - one of two stopping parameters used in 
  functional link nets and MLPs. This is the maximum number of 
  iterations that can be performed, and is user-chosen.

pruning - finding and eliminating the less useful hidden units in a
  trained MLP.

program log - This is a log of the tasks performed by, and the files
  processed by, each program in this package.

self-organizing map (SOM) - given Nc initial random clusters, the SOM
  performs an adaptive k-means clustering, except that when a cluster mean
  is updated, its nearest neighbors are also updated. There is a learning 
  factor and a distance threshold which decrease as clustering progresses.
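
  A single SOM update on a one-dimensional map might look like the
  following sketch (the package's map geometry and decay schedules
  may differ):

```python
def som_update(centers, x, winner, lr, half_nbhd):
    """Move the winning cluster mean, and its map neighbors within
    half_nbhd positions of it, toward input x by learning factor lr."""
    for i, m in enumerate(centers):
        if abs(i - winner) <= half_nbhd:
            centers[i] = [mk + lr * (xk - mk) for mk, xk in zip(m, x)]
    return centers
```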

separating mean processing - the mean of some of the inputs is subtracted
  from some inputs and some outputs before training takes place. In a 
  trained network, the separated mean can be added back to the appropriate
  outputs.  This is useful in image processing and waveform processing. 
  The MLP and modular net training programs allow the user to choose 
  separating mean processing. In the MLP training program, this is a 
  "special option" when a new network structure file is being produced.

sequential leader (SL) clustering - In SL clustering we are given some
  input vectors, a distance threshold, and one cluster which is the 
  first input vector to be processed. As each subsequent input vector 
  is processed, it is either (1) assigned to the cluster it is closest 
  to, if the distance is below the threshold, or (2) used as the 
  center vector of a new cluster.
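
  In sketch form (illustrative pure Python; cluster centers stay fixed
  at the vector that created them):

```python
from math import sqrt

def sequential_leader(patterns, threshold):
    """Assign each pattern to the nearest existing cluster if its
    distance is below `threshold`; otherwise start a new cluster with
    the pattern as its center vector. Returns the list of centers."""
    centers = []
    for x in patterns:
        if centers:
            dists = [sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))
                     for m in centers]
            if min(dists) < threshold:
                continue  # absorbed by the nearest existing cluster
        centers.append(list(x))
    return centers
```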

sizing - determining the required size of an MLP from a training data
  file.

standard form - All training data files are in standard form, which 
  means that the file is formatted, and that each pattern or vector 
  has inputs on the left and desired outputs on the right. You can type 
  out the files to examine them, and you can use these files with 
  other neural net software. If a testing data file includes desired 
  outputs, then it too will be in standard form.

stopping parameters - parameters that specify how training 
  will end. See number of iterations and MSE threshold.

testing data file - the same as a training data file except that 
  (1) it is used to test the performance of a trained network and 
  (2) it may or may not have desired outputs. 

training data file - a formatted file with Nv vectors or patterns.
  Each vector includes N inputs and Nout desired outputs. In 
  classification training data files, the correct class id, which
  is an integer, is stored rather than the Nout desired outputs. 
  See standard form.

training parameters - the learning factor (Z in this software 
  package), and the momentum factor alpha.

uncoded format outputs - in a classification network, uncoded output 
  format means that the number of outputs is Nout = Nc where 
  Nc is the number of classes. The desired output can then be 1 for 
  the correct class and 0 for the others, or 0 for the correct class 
  and 1 for the others (inverted uncoded format). The Nc desired 
  output vectors are then just Nc-bit binary numbers. In the 
  classification network package Neucls.zip, inverted uncoded format 
  and coded format are available.
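
  The formats can be illustrated for Nc = 4 classes (a sketch; class
  ids here are taken to be 0-based):

```python
def uncoded(class_id, nc, inverted=False):
    # One output per class: 1 at the correct class and 0 elsewhere
    # (or the reverse, for inverted uncoded format).
    v = [1 if k == class_id else 0 for k in range(nc)]
    return [1 - b for b in v] if inverted else v

def coded(class_id, nc):
    # ceil(log2(nc)) outputs: the class id as a binary number.
    nout = max(1, (nc - 1).bit_length())
    return [(class_id >> b) & 1 for b in reversed(range(nout))]

u = uncoded(2, 4)          # uncoded desired outputs for class 2
ui = uncoded(2, 4, True)   # inverted uncoded format
c = coded(2, 4)            # coded format
```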

units - artificial neurons used in the MLP network.

unsupervised learning - Unsupervised learning or clustering is the 
  process of organizing a set of vectors into groups of similar 
  vectors. In many clustering algorithms, each cluster is 
  characterized using a mean or center vector. Unsupervised learning 
  algorithms usually use a distance measure, such as the Euclidean
  distance, to measure the closeness of a data vector to a cluster
  or mean or center vector.

weight file - an unformatted file which gives the gains or coefficients
  along paths connecting the various units.





