Complexity (hidden-units, weight-decay)

How to choose the number of hidden units [12-6-1-6-12]?

How to control the flexibility of a curve: from almost linear components (under-fitting) to very complex curves (over-fitting)?

Several ways are used to control neural network complexity: number of hidden nodes, number of iterations (early stopping), etc.

In nonlinear PCA the most important parameter is weight-decay.

The number of hidden units is a poor control for complexity because its scale is discrete: going from 2 to 3 nodes is a big step, with nothing in between. It is better to use a reasonably large (or even too large) number of hidden units and control complexity through weight-decay alone. If 'units_per_layer' is not specified, the number of hidden units is set automatically to suit the data dimension.
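As a minimal sketch of this advice, the call below sets a generous architecture explicitly and leaves complexity control to weight-decay. It assumes the NLPCA toolbox is on the MATLAB path, that data is arranged as variables x samples, and that 'units_per_layer' accepts a row vector of layer sizes with the input layer first; the random data matrix is only a placeholder.

```matlab
% Sketch only: requires the NLPCA toolbox (nlpca.m) on the MATLAB path.
% Assumptions: data is variables x samples, and 'units_per_layer' takes a
% row vector of layer sizes (input layer first, output layer last).
data  = randn(12, 100);      % placeholder: 12 variables, 100 samples
units = [12 6 1 6 12];       % generous hidden layers (6 units each), 1 component

% Complexity is controlled by the weight-decay value, not by 'units'.
[pc, net, network] = nlpca(data, 1, ...
    'units_per_layer', units, ...
    'weight_decay_coefficient', 0.01);
```

The first and last entries of the vector match the number of variables, and the middle entry matches the number of components requested in the second argument.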

A large number of hidden units provides the capacity needed to model nonlinear structure, while weight-decay gives much finer control for setting the right model complexity.

You can choose different weight-decay values to check and avoid over-fitting.

[pc,net,network]=nlpca(data,1, 'weight_decay_coefficient',0.01 )

A low or zero value means no restriction of complexity, which can lead to over-fitting.

A very high value (max 1) leads to under-fitting with the effect that we get only a linear component as in standard PCA.
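For orientation, weight-decay in its usual textbook form adds a penalty on the squared network weights to the reconstruction error. Whether the toolbox uses exactly this form is an assumption here, as are the symbols: x_n is a data sample, \hat{x}_n its reconstruction, w_i a network weight, and \nu the weight-decay coefficient.

```latex
E_{\text{total}} = \sum_{n} \lVert x_n - \hat{x}_n \rVert^2 + \nu \sum_{i} w_i^2
```

With \nu = 0 only the reconstruction error is minimized, so the curve is free to follow noise. With a large \nu the weights are pushed toward zero, where sigmoidal hidden units behave almost linearly, which is consistent with the linear, PCA-like result described above.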

The optimum usually lies within the range 0.1 to 0.001; from experience, 0.01 is a good choice and is the default.
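One way to explore this range is to train one model per weight-decay value and compare the resulting curves. The loop below is a sketch under the same assumptions as the call shown earlier on this page (NLPCA toolbox on the path, data as variables x samples); the placeholder data and the variable names decay_values, pcs and nets are illustrative only.

```matlab
% Sketch only: requires the NLPCA toolbox (nlpca.m) on the MATLAB path.
data = randn(12, 100);               % placeholder; substitute your own data matrix
decay_values = [0.1 0.01 0.001];     % range suggested in the text
pcs  = cell(size(decay_values));     % component values per setting
nets = cell(size(decay_values));     % trained networks per setting

for i = 1:numel(decay_values)
    % Train one model per weight-decay value; all other options stay at default.
    [pc, net, network] = nlpca(data, 1, ...
        'weight_decay_coefficient', decay_values(i));
    pcs{i}  = pc;
    nets{i} = net;
end
```

A curve that bends to follow individual noisy samples at 0.001 but is nearly straight at 0.1 suggests the useful setting lies between the two. Comparing the models on data held out from training is a more objective check than visual inspection.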
