A simple Neural Net in PB

doctornash
Enthusiast
Posts: 130
Joined: Thu Oct 20, 2011 7:22 am

A simple Neural Net in PB

Post by doctornash »

This is a little two-layer neural net from scratch in 'simple code' which is supposed to produce the result y = (a*b) + c when presented with three inputs a, b, c, where 0 < (y, a, b, c) < 1. It consists of 3 neurons in a hidden layer and one neuron at the output, each with a sigmoid activation function. The network learns by back propagation: the weights, initially set randomly, are updated after each training example. I have attached images which show how the forward and back propagation formulae are derived. The training data consists of 9 examples of y = (a*b) + c. For instance, with reference to the data set up in the 2D array in the code:
(InputArr(0,1)*InputArr(1,1)) + InputArr(2,1) = target01(1)
i.e. (0.8*0.8) + 0.1 = 0.74.
The validation data also consists of 9 examples (to test the accuracy of the model on data it has never seen before).

The issue (as can be seen when the code is run) is that the model produces virtually no error on the training data, but doesn't do well on the validation data. Any advice as to what's going on here and how to resolve it? Could this be a case of 'overfitting' - where a model is great at memorizing, but not at learning?

[Attached images: derivation of the forward and back propagation formulae]
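In text form, the update rules the code implements come straight from the chain rule, assuming the squared-error loss E = 1/2*(Out01 - target)^2 that the gradient terms imply:

Code:

;Output-layer weight, e.g. w10:
;  dE/dw10 = (Out01 - target) * Out01*(1-Out01) * Outh1
;Hidden-layer weight, e.g. w1:
;  dE/dw1 = (Out01 - target) * Out01*(1-Out01) * w10 * Outh1*(1-Outh1) * ival(0)
;Gradient descent step applied to every weight:
;  w = w - mu * dE/dw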

Code:

Global Dim InputArr.d(2,8) ;3 inputs x 9 examples
Global Dim target01.d(8)   ;one target per example
Global Dim iVal.d(2)       ;inputs of the current example

Global mu.d = 0.2 ;Learning Rate
Global NumEpochs = 1000000 ;Number of iterations
  
;Hidden layer: weighted sums (Neth) and sigmoid outputs (Outh)
Global Neth1.d
Global Neth2.d
Global Neth3.d
Global Outh1.d
Global Outh2.d
Global Outh3.d
  
;12 weights, randomly initialised in the range 0.01 to 1.00
;Note the /100.0 - an integer divide (/100) would truncate nearly every weight to 0
Global w1.d = (Random(99)+1)/100.0
Global w2.d = (Random(99)+1)/100.0
Global w3.d = (Random(99)+1)/100.0
Global w4.d = (Random(99)+1)/100.0
Global w5.d = (Random(99)+1)/100.0
Global w6.d = (Random(99)+1)/100.0
Global w7.d = (Random(99)+1)/100.0
Global w8.d = (Random(99)+1)/100.0
Global w9.d = (Random(99)+1)/100.0
Global w10.d = (Random(99)+1)/100.0
Global w11.d = (Random(99)+1)/100.0
Global w12.d = (Random(99)+1)/100.0

  ;Training data: 9 examples, one per column - a in row 0, b in row 1, c in row 2
  InputArr(0,0) = 0.2
  InputArr(0,1) = 0.8
  InputArr(0,2) = 0.6
  InputArr(0,3) = 0.1
  InputArr(0,4) = 0.5
  InputArr(0,5) = 0.7
  InputArr(0,6) = 0.1
  InputArr(0,7) = 0.8
  InputArr(0,8) = 0.3
  
  InputArr(1,0) = 0.1
  InputArr(1,1) = 0.8
  InputArr(1,2) = 0.9
  InputArr(1,3) = 0.5
  InputArr(1,4) = 0.1
  InputArr(1,5) = 0.3
  InputArr(1,6) = 0.9
  InputArr(1,7) = 0.6
  InputArr(1,8) = 0.2
  
  InputArr(2,0) = 0.3
  InputArr(2,1) = 0.1
  InputArr(2,2) = 0.4
  InputArr(2,3) = 0.8
  InputArr(2,4) = 0.2
  InputArr(2,5) = 0.7
  InputArr(2,6) = 0.2
  InputArr(2,7) = 0.1
  InputArr(2,8) = 0.6
  
  ;Targets: y = (a*b) + c for each example
  target01(0) = 0.32
  target01(1) = 0.74
  target01(2) = 0.94
  target01(3) = 0.85
  target01(4) = 0.25
  target01(5) = 0.91
  target01(6) = 0.29
  target01(7) = 0.58
  target01(8) = 0.66

For n = 0 To NumEpochs

  For v = 0 To 8
    For t = 0 To 2
      iVal(t) = InputArr(t,v)
    Next

    ;Forward pass: hidden layer
    Neth1 = (iVal(0)*w1)+(iVal(1)*w2)+(iVal(2)*w3)
    Neth2 = (iVal(0)*w4)+(iVal(1)*w5)+(iVal(2)*w6)
    Neth3 = (iVal(0)*w7)+(iVal(1)*w8)+(iVal(2)*w9)

    Outh1 = 1/(1+Exp(-Neth1))
    Outh2 = 1/(1+Exp(-Neth2))
    Outh3 = 1/(1+Exp(-Neth3))

    ;Forward pass: output layer
    Net01.d = (Outh1*w10)+(Outh2*w11)+(Outh3*w12)
    Out01.d = 1/(1+Exp(-Net01))

    ;Back propagation: output-layer weights
    w10 = w10 - mu*(Out01-target01(v))*Out01*(1-Out01)*Outh1
    w11 = w11 - mu*(Out01-target01(v))*Out01*(1-Out01)*Outh2
    w12 = w12 - mu*(Out01-target01(v))*Out01*(1-Out01)*Outh3

    ;Back propagation: hidden-layer weights
    w1 = w1 - mu*(Out01-target01(v))*Out01*(1-Out01)*w10*Outh1*(1-Outh1)*iVal(0)
    w2 = w2 - mu*(Out01-target01(v))*Out01*(1-Out01)*w10*Outh1*(1-Outh1)*iVal(1)
    w3 = w3 - mu*(Out01-target01(v))*Out01*(1-Out01)*w10*Outh1*(1-Outh1)*iVal(2)

    w4 = w4 - mu*(Out01-target01(v))*Out01*(1-Out01)*w11*Outh2*(1-Outh2)*iVal(0)
    w5 = w5 - mu*(Out01-target01(v))*Out01*(1-Out01)*w11*Outh2*(1-Outh2)*iVal(1)
    w6 = w6 - mu*(Out01-target01(v))*Out01*(1-Out01)*w11*Outh2*(1-Outh2)*iVal(2)

    w7 = w7 - mu*(Out01-target01(v))*Out01*(1-Out01)*w12*Outh3*(1-Outh3)*iVal(0)
    w8 = w8 - mu*(Out01-target01(v))*Out01*(1-Out01)*w12*Outh3*(1-Outh3)*iVal(1)
    w9 = w9 - mu*(Out01-target01(v))*Out01*(1-Out01)*w12*Outh3*(1-Outh3)*iVal(2)

  Next

  If n = NumEpochs
    Debug "Weights after " + Str(NumEpochs) + " Epochs:"
    Debug StrD(w1) + ";" + StrD(w2) + ";" + StrD(w3) + ";" + StrD(w4) + ";" + StrD(w5) + ";" + StrD(w6) + ";" + StrD(w7) + ";" + StrD(w8) + ";" + StrD(w9) + ";" + StrD(w10) + ";" + StrD(w11) + ";" + StrD(w12)
  EndIf
Next

 Debug "******************************************"  

;*************Now compare results over the training data (loss wrt training data)******************

            Debug "[target;result] for training data:"
            For v = 0 To 8
                      For t = 0 To 2
                        ival(t) = inputarr(t,v)
                      Next
  
                  Neth1 = (ival(0)*w1)+(ival(1)*w2)+(ival(2)*w3)
                  Neth2 = (ival(0)*w4)+(ival(1)*w5)+(ival(2)*w6)
                  Neth3 = (ival(0)*w7)+(ival(1)*w8)+(ival(2)*w9)
                  
                  Outh1 = 1/(1+Exp(-Neth1))
                  Outh2 = 1/(1+Exp(-Neth2))
                  Outh3 = 1/(1+Exp(-Neth3))
                  
                  Net01 = (Outh1*w10)+(Outh2*w11)+(Outh3*w12)
                  Out01 = 1/(1+Exp(-Net01))
                  Debug StrD(target01(v)) + ";" + StrD(Out01)
                  
            Next      
            
 Debug "******************************************"           
            
;**************Now compare results over VALIDATION data (loss wrt validation data)***********************
  
  ;Validation data: 9 unseen examples (reusing the same arrays)
  InputArr(0,0) = 0.5
  InputArr(0,1) = 0.1
  InputArr(0,2) = 0.4
  InputArr(0,3) = 0.9
  InputArr(0,4) = 0.2
  InputArr(0,5) = 0.6
  InputArr(0,6) = 0.7
  InputArr(0,7) = 0.2
  InputArr(0,8) = 0.8
  
  InputArr(1,0) = 0.8
  InputArr(1,1) = 0.7
  InputArr(1,2) = 0.6
  InputArr(1,3) = 0.5
  InputArr(1,4) = 0.8
  InputArr(1,5) = 0.3
  InputArr(1,6) = 0.2
  InputArr(1,7) = 0.3
  InputArr(1,8) = 0.4
  
  InputArr(2,0) = 0.3
  InputArr(2,1) = 0.6
  InputArr(2,2) = 0.5
  InputArr(2,3) = 0.25
  InputArr(2,4) = 0.2
  InputArr(2,5) = 0.32
  InputArr(2,6) = 0.1
  InputArr(2,7) = 0.4
  InputArr(2,8) = 0.4
  
  ;Validation targets: y = (a*b) + c
  target01(0) = 0.7
  target01(1) = 0.67
  target01(2) = 0.74
  target01(3) = 0.7
  target01(4) = 0.36
  target01(5) = 0.5
  target01(6) = 0.24
  target01(7) = 0.46
  target01(8) = 0.72
  
        Debug "[target;result] for validation data:"
        For v = 0 To 8
          For t = 0 To 2
            ival(t) = inputarr(t,v)
          Next
  
                  Neth1.d = (ival(0)*w1)+(ival(1)*w2)+(ival(2)*w3)
                  Neth2.d = (ival(0)*w4)+(ival(1)*w5)+(ival(2)*w6)
                  Neth3.d = (ival(0)*w7)+(ival(1)*w8)+(ival(2)*w9)
                  
                  Outh1.d = 1/(1+Exp(-Neth1))
                  Outh2.d = 1/(1+Exp(-Neth2))
                  Outh3.d = 1/(1+Exp(-Neth3))
                  
                  Net01.d = (Outh1*w10)+(Outh2*w11)+(Outh3*w12)
                  Out01.d = 1/(1+Exp(-Net01))
                  Debug StrD(target01(v)) + ";" + StrD(Out01)
         Next    
doctornash
Enthusiast
Posts: 130
Joined: Thu Oct 20, 2011 7:22 am

Re: A simple Neural Net in PB

Post by doctornash »

In answer to my own question: yes, the model is overfitting. When that happens, one of the things to try adjusting is the Learning Rate - start coarse and then go granular (whereas I started way too granular). Check the error on the validation data on each iteration and look for the lowest level (a sketch of that check follows below). Doing this, I found that a Learning Rate of 0.7 and only 4500 iterations gave target vs actual within 10% on most of the training AND validation data, per below. This shows the model can actually learn, ie generalize :D
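A minimal sketch of that per-epoch check, assuming the validation set is kept in its own arrays (valArr() and valTarget01() are hypothetical names - the code above reuses InputArr() and target01()):

Code:

;Sketch: after each epoch, measure the summed squared error on the
;validation set and remember the epoch where it was lowest
Global bestErr.d = 1000000 ;start higher than any plausible error
Global bestEpoch

For n = 0 To NumEpochs
  ;... one training pass over the 9 training examples, as in the code above ...
  valErr.d = 0
  For v = 0 To 8
    ;... forward pass on valArr(0..2, v) giving Out01, as in the code above ...
    valErr + Pow(Out01 - valTarget01(v), 2)
  Next
  If valErr < bestErr
    bestErr = valErr
    bestEpoch = n   ;could also snapshot w1..w12 here
  EndIf
Next

Debug "Lowest validation error " + StrD(bestErr) + " at epoch " + Str(bestEpoch)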
To get higher accuracy, one obvious thing to do is use more training data. Apparently there is a 'rule of thumb' which says number of training examples = number of weights in the network / permitted error. So for a 5% error, one should have around 12/0.05 = 240 training examples.
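And a minimal sketch of generating such a training set instead of hard-coding it (trainArr() and trainTarget() are hypothetical names; examples are redrawn until the target stays below 1, since a sigmoid output can never reach 1):

Code:

;Sketch: generate 240 random training examples of y = (a*b) + c
#NumExamples = 240 ;12 weights / 0.05 permitted error

Global Dim trainArr.d(2, #NumExamples - 1)
Global Dim trainTarget.d(#NumExamples - 1)

For v = 0 To #NumExamples - 1
  Repeat
    a.d = (Random(99) + 1) / 100.0   ;0.01 .. 1.00
    b.d = (Random(99) + 1) / 100.0
    c.d = (Random(99) + 1) / 100.0
  Until (a*b) + c < 1.0              ;keep the target inside (0,1)
  trainArr(0, v) = a
  trainArr(1, v) = b
  trainArr(2, v) = c
  trainTarget(v) = (a*b) + c
Next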

[target;result] for training data:
0.3200000000;0.2596030106
0.7400000000;0.7184651704
0.9400000000;0.8752468075
0.8500000000;0.8368902800
0.2500000000;0.2661795673
0.9100000000;0.8649186409
0.2900000000;0.2887290556
0.5800000000;0.5903850258
0.6600000000;0.7082227154

[target;result] for validation data:
0.7000000000;0.7577143425
0.6700000000;0.7005485057
0.7400000000;0.8114189230
0.7000000000;0.6827623134
0.3600000000;0.3916939342
0.5000000000;0.5645791491
0.2400000000;0.2855101837
0.4600000000;0.4897463133
0.7200000000;0.7371721062
djes
Addict
Posts: 1806
Joined: Sat Feb 19, 2005 2:46 pm
Location: Pas-de-Calais, France

Re: A simple Neural Net in PB

Post by djes »

Interesting! Thank you for sharing.