Initializers

A collection of initialization functions. Parameter initialization schemes can be very important for deep neural networks, because the initial distribution of the weights helps determine whether activations change in mean and variance as the signal moves through the network. If the activations are not stable, the network will not learn effectively. The “best” initialization scheme changes depending on the activation functions being used, which is why a variety of initializations are necessary. You can reduce the importance of the initialization by using normalization after your hidden layers.

normal_init function

Initialize from a normal distribution, with scale = sqrt(1 / fan_in).

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| keyword-only | | |
| fan_in | int | Usually the number of inputs to the layer. If -1, the second dimension of the shape is used. |
| RETURNS | FloatsXd | The initialized array. |
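For example, the initializer can be called directly with an ops object and a shape. A minimal sketch, where the (4, 8) shape and the explicit fan_in are arbitrary values chosen for illustration:

```python
from thinc.api import NumpyOps, normal_init

ops = NumpyOps()
# Draw a (4, 8) weight matrix; fan_in=8 matches the second dimension,
# so values are sampled with scale = sqrt(1 / 8).
W = normal_init(ops, (4, 8), fan_in=8)
print(W.shape)  # (4, 8)
```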

glorot_uniform_init function

Initialize from a uniform distribution with scale parameter computed by the method introduced by Xavier Glorot ([Glorot and Bengio, 2010](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf)): scale = sqrt(6.0 / (data.shape[0] + data.shape[1])). Usually used in ReLU layers.

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |

glorot_normal_init function

Initialize from a normal distribution with scale parameter computed by the method introduced by Xavier Glorot ([Glorot and Bengio, 2010](http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf)): scale = sqrt(2.0 / (data.shape[0] + data.shape[1])).

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |

he_uniform_init function

Initialize from a uniform distribution with scale parameter computed by the method introduced in [He et al., 2015](https://arxiv.org/abs/1502.01852): scale = sqrt(6.0 / data.shape[1]).

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |

he_normal_init function

Initialize from a normal distribution with scale parameter computed by the method introduced in [He et al., 2015](https://arxiv.org/abs/1502.01852): scale = sqrt(2.0 / data.shape[1]).

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |

lecun_uniform_init function

Initialize from a uniform distribution with scale parameter computed as: scale = sqrt(3.0 / data.shape[1]).

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |

lecun_normal_init function

Initialize from a normal distribution with scale parameter computed as: scale = sqrt(1.0 / data.shape[1]).

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |
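As a quick sanity check, the standard deviation of the drawn values should be close to the scale parameter. A minimal sketch, assuming lecun_normal_init can be imported from thinc.initializers and using an arbitrary (512, 256) shape:

```python
import numpy
from thinc.api import NumpyOps
from thinc.initializers import lecun_normal_init

ops = NumpyOps()
W = lecun_normal_init(ops, (512, 256))
# Compare the empirical std with the nominal scale sqrt(1.0 / 256) = 0.0625.
print(float(W.std()), numpy.sqrt(1.0 / 256))
```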

zero_init function

Initialize a parameter with zero weights. This is usually used for output layers and for bias vectors.

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| RETURNS | FloatsXd | The initialized array. |

uniform_init function

Initialize values from a uniform distribution. This is usually used for word embedding tables.

| Argument | Type | Description |
| --- | --- | --- |
| ops | Ops | The backend object, e.g. model.ops. |
| shape | Shape | The data shape. |
| keyword-only | | |
| lo | float | The minimum of the uniform distribution. |
| hi | float | The maximum of the uniform distribution. |
| RETURNS | FloatsXd | The initialized array. |
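For example, to fill a small embedding table with values drawn uniformly from [-0.1, 0.1] (the vocabulary and vector sizes below are arbitrary):

```python
from thinc.api import NumpyOps, uniform_init

ops = NumpyOps()
# A hypothetical 1000-word vocabulary with 64-dimensional vectors.
vectors = uniform_init(ops, (1000, 64), lo=-0.1, hi=0.1)
print(float(vectors.min()), float(vectors.max()))  # both within [-0.1, 0.1]
```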

Usage via config and function registry

Since the initializers need to be called with data, defining them in the config will return a configured function: a partial with only the settings (the keyword arguments) applied. Within your script, you can then pass in the data, and the configured function will be called using the settings defined in the config.

Most commonly, the initializer is passed as an argument to a layer, so it can be defined as its own config block nested under the layer settings:

config.cfg

```ini
[model]
@layers = "Linear.v1"
nO = 10

[model.init_W]
@initializers = "normal_init.v1"
fan_in = -1
```

Usage

```python
from thinc.api import registry, Config

config = Config().from_disk("./config.cfg")
resolved = registry.resolve(config)
model = resolved["model"]
```
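The configured initializer is then applied when the model is initialized with sample data. A minimal sketch, where the sample arrays are hypothetical and only serve to infer the missing input dimension:

```python
import numpy

# Hypothetical sample data: 2 examples, 5 input features, 10 outputs (nO = 10).
X = numpy.zeros((2, 5), dtype="f")
Y = numpy.zeros((2, 10), dtype="f")
model.initialize(X=X, Y=Y)  # the configured init_W is used to set the W parameter
```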

You can also define it as a regular config setting and then call the configured function in your script:

config.cfg

```ini
[initializer]
@initializers = "uniform_init.v1"
lo = -0.1
hi = 0.1
```

Usage

```python
from thinc.api import registry, Config, NumpyOps

config = Config().from_disk("./config.cfg")
resolved = registry.resolve(config)
initializer = resolved["initializer"]
weights = initializer(NumpyOps(), (3, 2))
```
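If you don't need a config file, you can also look up the registered function in the registry directly. A minimal sketch using the registry's get method, with the same settings as above:

```python
from thinc.api import NumpyOps, registry

# Fetch the registered initializer and call it with the settings inline.
uniform_init = registry.initializers.get("uniform_init.v1")
weights = uniform_init(NumpyOps(), (3, 2), lo=-0.1, hi=0.1)
```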