matlab - Using a single weight matrix for Back-Propagation in Neural Networks
In my neural network I have combined all of the weight matrices into one large matrix. For example, a 3-layer network has 3 weight matrices, w1, w2 and w3, one for each layer. I have created one large weight matrix called w, in which w2 and w3 are appended onto the end of w1. If w1 has 3 columns, w2 has 3 columns, and w3 has 2 columns, my matrix w will have 8 columns.
The number of layers and the number of inputs/outputs are stored as global variables.
This means I can use feedforward code with only 2 input arguments, because the feedforward code splits w into w1, w2, w3, etc. inside the function.
output_of_neural_net = feedforward(input_to_neural_net,w)
I store the training data as a global variable too. This means I can use a cost function with only 1 input argument.
cost = costfn(w)
The purpose of this is so that I can use built-in MATLAB functions to minimise the cost function and therefore obtain the w that gives the network that best approximates my training data.
I have tried fminsearch(@costfn,w) and fminunc(@costfn,w). Both give mediocre results for the function I am trying to approximate, although fminunc is better.
I now want to try back-propagation to train this network, to see if it does a better job. However, the implementations I have found are for networks with multiple weight matrices, which makes it more complicated.
My question is: will I be able to implement back-propagation with my single appended weight matrix, and how can I do this?
I feel that using a single weight matrix should make the code simpler, but I can't work out how to implement it, as the other examples I have seen use multiple weight matrices.
Additional information
The network will be a function approximator with between 8 and 30 inputs, and 3 outputs. The function it is approximating is quite complicated and involves the inverse of elliptic integrals (and so has no analytical solution). The inputs and outputs of the network will be normalised so that they are between 0 and 1.
There are several problems with the approach you are describing.
First, from what you have described, there is no real simplification of the feed-forward code or the backpropagation code. Combining the three weight matrices into one allows the feedforward and costfn functions to take fewer arguments, but you still have to unpack w inside those functions to implement the forward and backpropagation logic. That logic requires evaluating the activation function and its derivative in each layer, so you can't represent it as a simple matrix multiplication.
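To make the first point concrete, here is a minimal sketch of one forward and backward pass over a column-appended w. It is not the asker's feedforward or costfn code: the layer sizes, the logistic activation, the squared-error cost and the single made-up training pair are all assumptions, and biases are left out. The unpacking step and the per-layer activation derivative are needed regardless of how the weights are stored.

```matlab
% Hypothetical sizes: m inputs, n neurons in every layer (biases ignored).
m = 4;  n = 3;
rng(0);                               % reproducible random weights
w1 = randn(n, m);  w2 = randn(n, n);  w3 = randn(n, n);
w  = [w1, w2, w3];                    % column-appended matrix, n x (m + 2n)

sigma = @(z) 1 ./ (1 + exp(-z));      % logistic activation (an assumption)
x = rand(m, 1);  t = rand(n, 1);      % one made-up training pair

% Unpack w into per-layer matrices using the known column counts.
cols  = [m, n, n];
edges = [0, cumsum(cols)];
W = cell(1, 3);
for l = 1:3
    W{l} = w(:, edges(l)+1 : edges(l+1));
end

% Forward pass, keeping each layer's activation for the backward pass.
a = cell(1, 4);
a{1} = x;
for l = 1:3
    a{l+1} = sigma(W{l} * a{l});
end
cost = 0.5 * sum((a{4} - t).^2);      % squared-error cost (an assumption)

% Backward pass. The logistic derivative is a .* (1 - a).
grad  = zeros(size(w));               % gradient stored in the same layout as w
delta = (a{4} - t) .* a{4} .* (1 - a{4});
for l = 3:-1:1
    grad(:, edges(l)+1 : edges(l+1)) = delta * a{l}';
    if l > 1
        delta = (W{l}' * delta) .* a{l} .* (1 - a{l});
    end
end
```

Because grad has the same shape as w, a plain gradient-descent step would be w = w - eta * grad, where eta is a hypothetical learning rate.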
The second issue is that you are constraining the structure of your neural network by packing the three weight matrices into one by appending columns. The number of rows and columns in a weight matrix correspond to the number of neurons and inputs in the layer, respectively. Suppose you have m inputs to your network and n neurons in the first layer. Then w1 will have shape (n, m). In general, for a fully-connected network, the layer 2 weights (w2) will have shape (k, n), where n is the number of inputs (which is constrained by the number of outputs from the first layer) and k is the number of neurons in the second layer.
The problem is that since you are creating one combined weight matrix by appending columns, k (the number of rows in the second weight matrix) has to be the same as the number of rows/neurons in the first layer, and so on for successive layers. In other words, your network will have shape m x n x n x n (m inputs, then n neurons in each layer). This is a bad constraint to have on your network, since you typically don't want the same number of neurons in the hidden layers as in the output layer.
Note that for simplification I have ignored bias inputs, but the same issues exist if they are included.
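The shape constraint can be seen directly in a short sketch. The layer sizes below are hypothetical (8 inputs and 3 outputs, matching the question, with 10 and 6 hidden neurons picked arbitrarily), and biases are again ignored. The second half shows one common alternative, which is not something proposed above: flatten each matrix into a column vector and restore it with reshape, so that a single parameter argument can still be passed around.

```matlab
% Hypothetical layer sizes: 8 inputs, 10 and 6 hidden neurons, 3 outputs.
sizes   = [8, 10, 6, 3];
nLayers = numel(sizes) - 1;
W = cell(1, nLayers);
for l = 1:nLayers
    W{l} = randn(sizes(l+1), sizes(l));   % shape (neurons, inputs)
end

% Appending columns fails here because the row counts differ (10, 6, 3).
try
    wBad = [W{1}, W{2}, W{3}];
    appendWorked = true;
catch
    appendWorked = false;
end

% Alternative: pack every matrix into one long column vector.
w = [];
for l = 1:nLayers
    w = [w; W{l}(:)];
end

% Unpack with reshape, using the stored layer sizes.
U = cell(1, nLayers);
pos = 0;
for l = 1:nLayers
    count = sizes(l+1) * sizes(l);
    U{l}  = reshape(w(pos+1 : pos+count), sizes(l+1), sizes(l));
    pos   = pos + count;
end
```

With this layout the layers can have any sizes, and a packed gradient vector of the same length as w could be built the same way.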