# Mathematics of Signal Processing/Neural Networks


A few miscellaneous remarks:

## Reply to a newsgroup

During the training phase (a gradient backpropagation, for instance), the contribution of an input pair ${\displaystyle (x_{i},y_{i})}$ to a gradient step is (${\displaystyle \Psi }$ denotes the neural network):

${\displaystyle \delta _{i}=2\,\nabla \Psi _{|i}\cdot (\Psi (x_{i})-y_{i})}$

which has to be normalized by the contributions of all the other points:

${\displaystyle {\rm {Contrib}}_{i}=\ {\left|\nabla \Psi _{|i}\cdot (\Psi (x_{i})-y_{i})\right| \over \sum _{n}\left|\nabla \Psi _{|n}\cdot (\Psi (x_{n})-y_{n})\right|}}$

You can see that the absolute values ${\displaystyle |\cdot |}$ introduce a real bias into this measurement.
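As a rough illustration of the measure above (my own sketch, not from the original post: the tiny tanh network, the finite-difference gradient, and all variable names are assumptions), one can compute ${\rm Contrib}_i$ for a handful of samples and check that the contributions sum to one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny scalar-output network: Psi(x) = w2 . tanh(W1 x + b1)
W1 = rng.standard_normal((4, 2))
b1 = rng.standard_normal(4)
w2 = rng.standard_normal(4)

def psi(theta, x):
    W1, b1, w2 = theta
    return float(w2 @ np.tanh(W1 @ x + b1))

def flatten(theta):
    return np.concatenate([p.ravel() for p in theta])

def unflatten(v, like):
    out, i = [], 0
    for p in like:
        out.append(v[i:i + p.size].reshape(p.shape))
        i += p.size
    return out

def grad_psi(theta, x, eps=1e-6):
    """Central finite-difference gradient of Psi w.r.t. all parameters."""
    v = flatten(theta)
    g = np.empty_like(v)
    for k in range(v.size):
        vp, vm = v.copy(), v.copy()
        vp[k] += eps
        vm[k] -= eps
        g[k] = (psi(unflatten(vp, theta), x)
                - psi(unflatten(vm, theta), x)) / (2 * eps)
    return g

def contributions(theta, xs, ys):
    """Contrib_i = |grad Psi_i (Psi(x_i)-y_i)| / sum_n |grad Psi_n (Psi(x_n)-y_n)|."""
    terms = np.array([np.linalg.norm(grad_psi(theta, x) * (psi(theta, x) - y))
                      for x, y in zip(xs, ys)])
    return terms / terms.sum()

xs = rng.standard_normal((5, 2))
ys = rng.standard_normal(5)
c = contributions((W1, b1, w2), xs, ys)
print(c)  # non-negative weights that sum to 1
```

Because of the normalization, the denominator cancels any common scale (including the factor 2 in ${\displaystyle \delta _{i}}$), which is why it does not appear in ${\rm Contrib}_i$.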

Besides, you have to take into account the contributions across all the backpropagation steps...

You can use Amari's approach (Natural Gradient Works Efficiently in Learning (1998) --- Shun-Ichi Amari) and follow the path of the parameters of your NN during the learning phase. This amounts to a curvilinear integral along the learning trajectories.
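A crude discretization of that idea might look as follows (my own sketch, not Amari's natural gradient itself; the linear "network", learning rate, and weighting scheme are all assumptions): follow the parameter path ${\displaystyle \theta _{0},\theta _{1},\dots }$ produced by plain gradient descent and accumulate each sample's share of the length of every step, approximating the curvilinear integral along the trajectory.

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.standard_normal((6, 3))
ys = rng.standard_normal(6)

theta = np.zeros(3)          # linear "network": Psi(x) = theta . x, so grad Psi = x
lr, n_steps = 0.05, 50
accumulated = np.zeros(len(xs))

for _ in range(n_steps):
    residuals = xs @ theta - ys                # Psi(x_i) - y_i
    per_sample = 2 * residuals[:, None] * xs   # delta_i = 2 grad Psi (Psi(x_i) - y_i)
    step = per_sample.mean(axis=0)             # full-batch gradient step direction
    # each sample's share of the arc length of this step of the path
    weights = np.linalg.norm(per_sample, axis=1)
    accumulated += weights / weights.sum() * np.linalg.norm(lr * step)
    theta -= lr * step

total_contrib = accumulated / accumulated.sum()
print(total_contrib)  # per-sample contribution integrated along the trajectory
```

The per-step shares are the same ${\rm Contrib}_i$ as above; weighting them by the length of each step turns the sum into a discrete line integral along the learning trajectory.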

You can also use my description of neural networks (published only for perceptrons at this stage: Initialization of Piecewise Affine Neural Networks for nonlinear control (1998) --- Charles-Albert Lehalle and Robert Azencott) for a more direct approach. It describes the effect of training as translations of hyperplanes, so you can quantify the contribution of a data point to those translations.