Complexity (hidden-units, weight-decay)

FAQ - Frequently asked questions

How do I choose the number of hidden units (e.g., in a [12-6-1-6-12] network architecture)?

How do I control the flexibility of a component curve, from almost linear (under-fitting) to very complex (over-fitting)?

Several techniques are used to control neural network complexity: the number of hidden nodes, the number of training iterations (early stopping), and so on.

In nonlinear PCA, the most important complexity parameter is the weight-decay coefficient.

The number of hidden units is not well suited to controlling complexity because it is a discrete scale: the difference between 2 and 3 nodes is large, with no option in between. It is best to use a reasonably large (or even too large) number of hidden units and control complexity by weight-decay alone. If not specified by the 'units_per_layer' option, the number of hidden units is set automatically to best fit the data dimensionality.

A large number of hidden units provides the capacity required for nonlinear mappings, while weight-decay offers finer, more sensitive control for defining the right model complexity.

You can try different weight-decay values to check for and avoid over-fitting.

[pc, net, network] = nlpca(data, 1, 'weight_decay_coefficient', 0.01)
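Conceptually, weight-decay adds a penalty proportional to the sum of squared network weights to the reconstruction error, so large weights (and hence sharply curved mappings) become expensive. The following is a minimal Python sketch of that penalized objective; the function name, the coefficient name `nu`, and the exact form used inside the toolbox are assumptions for illustration, not the toolbox's actual code.

```python
import numpy as np

def penalized_error(residuals, weights, nu=0.01):
    """Reconstruction error plus a weight-decay penalty.

    residuals : reconstruction errors (data minus network output)
    weights   : flattened vector of all network weights
    nu        : weight-decay coefficient (assumed name, cf. 0.01 default)
    """
    mse = np.mean(residuals ** 2)          # plain reconstruction error
    penalty = nu * np.sum(weights ** 2)    # decay term shrinks weights toward 0
    return mse + penalty
```

With nu = 0 the objective reduces to the plain reconstruction error; increasing nu pushes the optimizer toward smaller weights and thus smoother, more nearly linear component curves.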


A low or zero value imposes no restriction on complexity, which can lead to over-fitting.

A very high value (maximum 1) leads to under-fitting, with the effect that we obtain only a linear component, as in standard PCA.

The optimum usually lies somewhere in the range 0.001 to 0.1; from experience, a good choice and the default is 0.01.
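The shrinking effect of increasing the decay coefficient can be seen in a small, self-contained Python sketch. It uses ridge regression on polynomial features as a stand-in for a network's nonlinear capacity (an analogy chosen for illustration, not the toolbox's training procedure): the same squared-weight penalty appears in closed form, and a larger coefficient yields smaller weights, i.e., a smoother, more nearly linear fit.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(50)

# Polynomial features play the role of the network's nonlinear capacity.
X = np.vander(x, 8)

def ridge_fit(X, y, nu):
    # Closed-form minimizer of ||X w - y||^2 + nu * ||w||^2 (weight-decay penalty).
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + nu * np.eye(n_features), X.T @ y)

for nu in (0.0, 0.01, 1.0):
    w = ridge_fit(X, y, nu)
    print(f"nu = {nu:5.2f}  ->  weight norm {np.linalg.norm(w):.3f}")
```

The printed weight norms decrease as nu grows: nu = 0 leaves the flexible fit unconstrained (over-fitting risk), while a large nu drives the solution toward an almost linear fit (under-fitting), mirroring the behaviour described above.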

See also

Validation of nonlinear PCA

http://www.nlpca.org/validation-of-nonlinear-PCA.html


Inverse NLPCA and the hierarchical order of components also have an important impact on controlling curve complexity.

See Section 5.5.1 (page 51) of my PhD thesis:

http://opus.kobv.de/ubp/volltexte/2006/783/pdf/scholz_diss.pdf