Layers
Weights layers, transforms, combinators and wrappers

This page describes functions for defining your model. Each layer is implemented in its own module in thinc.layers and can be imported from thinc.api. Most layer files define two public functions: a creation function that returns a Model instance, and a forward function that performs the computation.


Weights layers	Layers that use an internal weights matrix for their computations.
Reduction operations	Layers that perform rank reductions, e.g. pooling from word to sentence vectors.
Combinators	Layers that combine two or more existing layers.
Data type transfers	Layers that transform data to different types.
Wrappers	Wrapper layers for other libraries like PyTorch and TensorFlow.

Weights layers

CauchySimilarity function

Input: Tuple[Floats2d, Floats2d] (batch_size, nI)
Output: Floats1d (batch_size)
Parameters: W (1, nI)

Compare input vectors according to the Cauchy similarity function proposed by Chen (2013). Primarily used within siamese neural networks.

Argument	Type	Description
`nI`	`Optional[int]`	The size of the input vectors.
RETURNS	`Model[Tuple[Floats2d, Floats2d], Floats1d]`	The created similarity layer.

 View on GitHub thinc/layers/cauchysimilarity.py
  

Dish function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with the Dish activation function. Dish or “Daniël’s Swish-like activation” is an activation function with a non-monotonic shape similar to GELU, Swish and Mish. However, Dish does not rely on elementary functions like exp or erf, making it much faster to compute in most cases.

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `he_normal_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/dish.py
  

Dropout function

Input: ArrayXd / Sequence[ArrayXd] / Ragged / Padded
Output: ArrayXd / Sequence[ArrayXd] / Ragged / Padded
Attrs: dropout_rate float

Helps prevent overfitting by adding a random distortion to the input data during training. Specifically, cells of the input are zeroed with probability determined by the dropout_rate argument. Cells which are not zeroed are rescaled by 1 / (1 - dropout_rate), so the expected value of each cell is unchanged. When not in training mode, the distortion is disabled (see Hinton et al., 2012).

Example

import numpy
from thinc.api import chain, Linear, Dropout

model = chain(Linear(10, 2), Dropout(0.2))
model.initialize()
# A batch of four input vectors of width nI=2.
X = numpy.zeros((4, 2), dtype="f")
Y, backprop = model(X, is_train=True)
# Configure dropout rate via the dropout_rate attribute.
for node in model.walk():
    if node.name == "dropout":
        node.attrs["dropout_rate"] = 0.5
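
The masking and rescaling described above can be sketched in plain Python. This is an illustration only, not Thinc's implementation: the helper name `apply_dropout` is hypothetical, and the sketch assumes the common inverted-dropout convention, in which surviving cells are divided by `1 - rate`.

```python
import random


def apply_dropout(row, rate, is_train, rng):
    """Zero each cell with probability `rate` and rescale the survivors."""
    if not is_train or rate <= 0.0:
        # Outside of training the distortion is disabled.
        return list(row)
    if rate >= 1.0:
        return [0.0 for _ in row]
    keep_scale = 1.0 / (1.0 - rate)
    return [x * keep_scale if rng.random() >= rate else 0.0 for x in row]


rng = random.Random(0)
row = [1.0, 2.0, 3.0, 4.0]
dropped = apply_dropout(row, 0.5, True, rng)
unchanged = apply_dropout(row, 0.5, False, rng)
```

With a rate of 0.5, every surviving cell is doubled, which keeps the expected value of each cell equal to its input.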

Argument	Type	Description
`dropout_rate`	`float`	The probability of zeroing the activations (default: 0). Higher dropout rates mean more distortion. Values around `0.2` are often good.
RETURNS	`Model[T, T]`	The created dropout layer.

 View on GitHub thinc/layers/dropout.py
  

Embed function

Input: Union[Ints1d, Ints2d] (n,)
Output: Floats2d (n, nO)
Parameters: E (nV, nO)
Attrs: column int, dropout_rate float

Map integers to vectors, using a fixed-size lookup table. The input to the layer should be a two-dimensional array of integers, one column of which the embeddings table will slice as the indices.
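
As a rough sketch of the lookup, the following plain-Python snippet slices one column of a two-dimensional integer input and uses it to index a table. The function name `embed_lookup` and the table values are made up for illustration, and the sketch ignores dropout and out-of-range handling.

```python
def embed_lookup(table, ids, column=0):
    """Return one table row per input row, indexed by ids[i][column]."""
    return [list(table[row[column]]) for row in ids]


# A table with nV=3 vectors of width nO=2.
E = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
# Two-dimensional integer input: each row holds several features per item.
ids = [[2, 7], [0, 9], [1, 4]]
vectors = embed_lookup(E, ids, column=0)
```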

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nV`	`int`	Number of input vectors. Defaults to `1`.
keyword-only
`column`	`int`	The column to slice from the input, to get the indices.
`initializer`	`Optional[Callable]`	A function to initialize the internal parameters. Defaults to `uniform_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting (default `None`).
RETURNS	`Model[Union[Ints1d, Ints2d], Floats2d]`	The created embedding layer.

 View on GitHub thinc/layers/embed.py
  

HashEmbed function

Input: Union[Ints1d, Ints2d] (n,)
Output: Floats2d (n, nO)
Parameters: E (nV, nO)
Attrs: seed Optional[int], column int, dropout_rate float

An embedding layer that uses the “hashing trick” to map keys to distinct values. The hashing trick involves hashing each key four times with distinct seeds, to produce four likely differing values. Those values are modded into the table, and the resulting vectors summed to produce a single result. Because it’s unlikely that two different keys will collide on all four “buckets”, most distinct keys will receive a distinct vector under this scheme, even when the number of vectors in the table is very low.
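
The hashing trick can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions rather than Thinc's code: `bucket_for` and `hash_embed` are hypothetical names, and SHA-256 stands in for whatever hash function the library actually uses.

```python
import hashlib


def bucket_for(key, seed, n_rows):
    """Hash an integer key with a seed and mod the result into the table."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf8")).digest()
    return int.from_bytes(digest[:8], "little") % n_rows


def hash_embed(table, key, seed=0):
    """Sum the four rows selected by four differently seeded hashes."""
    buckets = [bucket_for(key, seed + i, len(table)) for i in range(4)]
    width = len(table[0])
    return [sum(table[b][j] for b in buckets) for j in range(width)]


# A deliberately tiny table: nV=5 rows of width nO=3.
E = [[float(r * 10 + c) for c in range(3)] for r in range(5)]
vector = hash_embed(E, key=12345)
```

Two keys only end up with the same vector if they select the same combination of rows across all four seeds, which is why a small table can still separate many keys.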

Argument	Type	Description
`nO`	`int`	The size of the output vectors.
`nV`	`int`	Number of input vectors.
keyword-only
`seed`	`Optional[int]`	A seed to use for the hashing.
`column`	`int`	The column to select features from.
`initializer`	`Optional[Callable]`	A function to initialize the internal parameters. Defaults to `uniform_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting (default `None`).
RETURNS	`Model[Union[Ints1d, Ints2d], Floats2d]`	The created embedding layer.

 View on GitHub thinc/layers/hashembed.py
  

LayerNorm function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nI)
Parameters: b (nI,), G (nI,)

Perform layer normalization on the inputs (Ba et al., 2016). This layer does not change the dimensionality of the vectors.
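
A minimal plain-Python sketch of the computation is shown below. Each vector is normalized to zero mean and unit variance, then scaled by the gain `G` and shifted by the bias `b`. The `eps` constant and the initial values of `G` and `b` are assumptions made for the example.

```python
import math


def layer_norm(row, G, b, eps=1e-8):
    """Normalize one vector, then apply the gain G and the bias b."""
    mean = sum(row) / len(row)
    var = sum((x - mean) ** 2 for x in row) / len(row)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (x - mean) * inv_std + shift for x, g, shift in zip(row, G, b)]


X = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 30.0]]
G = [1.0] * 4  # gain, set to ones in this sketch
b = [0.0] * 4  # bias, set to zeros in this sketch
Y = [layer_norm(row, G, b) for row in X]
```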

Argument	Type	Description
`nI`	`Optional[int]`	The size of the input vectors.
RETURNS	`Model[Floats2d, Floats2d]`	The created normalization layer.

 View on GitHub thinc/layers/layernorm.py
  

Linear function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

The Linear layer multiplies inputs by a weights matrix W and adds a bias vector b. In PyTorch this is called a Linear layer, while Keras calls it a Dense layer.

Example

from thinc.api import Linear

model = Linear(10, 5)
model.initialize()
Y = model.predict(model.ops.alloc2f(2, 5))
assert Y.shape == (2, 10)

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Callable`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init`.
`init_b`	`Callable`	A function to initialize the bias vector. Defaults to `zero_init`.
RETURNS	`Model[Floats2d, Floats2d]`	The created `Linear` layer.

 View on GitHub thinc/layers/linear.py
  

Sigmoid function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A linear (aka dense) layer, followed by a sigmoid activation. This is usually used as an output layer for multi-label classification (in contrast to the Softmax layer, which is used for problems where exactly one class is correct per example).

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
RETURNS	`Model[Floats2d, Floats2d]`	The created sigmoid layer.

 View on GitHub thinc/layers/sigmoid.py
  

sigmoid_activation function

Input: FloatsXd (batch_size, nI)
Output: FloatsXd (batch_size, nO)

Apply the sigmoid logistic function as an activation to the inputs. This is often used as an output activation for multi-label classification, because each element of the output vectors will be between 0 and 1.
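
To make the multi-label use concrete, here is a small plain-Python sketch of the logistic function applied element-wise. The 0.5 decision threshold is an assumption for the example, not something the layer applies itself.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# One row of raw scores for three independent labels.
scores = [-2.0, 0.0, 3.0]
probs = [sigmoid(s) for s in scores]
# Multi-label decision: every label above the assumed threshold is switched on.
labels = [p > 0.5 for p in probs]
```

Because each output is squashed independently, the values in a row do not need to sum to 1, unlike the output of a softmax.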

Argument	Type	Description
RETURNS	`Model[Floats2d, Floats2d]`	The created `sigmoid_activation` layer.

 View on GitHub thinc/layers/sigmoid_activation.py
  

LSTM and BiLSTM function

Input: Padded
Output: Padded
Parameters: depth int, dropout float

An LSTM recurrent neural network. The BiLSTM is bidirectional: that is, each layer concatenates a forward LSTM with an LSTM running in the reverse direction. If you are able to install PyTorch, you should usually prefer to use the PyTorchLSTM layer instead of Thinc’s implementations, as PyTorch’s LSTM implementation is significantly faster.

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`bi`	`bool`	Use BiLSTM.
`depth`	`int`	Number of layers (default `1`).
`dropout`	`float`	Dropout rate to avoid overfitting (default `0`).
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
RETURNS	`Model[Padded, Padded]`	The created LSTM layer(s).

 View on GitHub thinc/layers/lstm.py
  

Maxout function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO*nP, nI), b (nO*nP,)

A dense layer with a “maxout” activation (Goodfellow et al., 2013). Maxout layers require a weights array of shape (nO, nP, nI) in order to compute outputs of width nO given inputs of width nI. The extra multiple, nP, determines the number of “pieces” that the piecewise-linear activation will consider.
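
The role of the pieces can be sketched in plain Python: for every output unit, compute nP linear functions of the input and keep the largest. The nested-list layout and the bias shape (nO, nP) are assumptions chosen to keep the example readable.

```python
def maxout(x, W, b):
    """x: (nI,), W: (nO, nP, nI), b: (nO, nP) -> output of width nO."""
    out = []
    for W_o, b_o in zip(W, b):
        # Each output unit has nP linear "pieces"; keep the largest one.
        pieces = [
            sum(w * xi for w, xi in zip(w_p, x)) + b_p
            for w_p, b_p in zip(W_o, b_o)
        ]
        out.append(max(pieces))
    return out


# nO=2 outputs, nP=3 pieces, nI=2 inputs.
W = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
    [[2.0, 0.0], [0.0, 0.5], [0.0, 0.0]],
]
b = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
y = maxout([3.0, -1.0], W, b)
```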

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
`nP`	`int`	Number of maxout pieces (default: 3).
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created maxout layer.

 View on GitHub thinc/layers/maxout.py
  

Mish function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with Mish activation (Misra, 2019).

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/mish.py
  

Swish function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with the Swish activation function (Ramachandran et al., 2017). Swish is a self-gating non-monotonic activation function similar to GELU: whereas GELU uses the CDF of the Gaussian distribution Φ for self-gating, x * Φ(x), Swish uses the logistic CDF σ, x * σ(x). It is sometimes referred to as “SiLU”, for “Sigmoid Linear Unit”.
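
The activation itself is a one-liner. The plain-Python sketch below covers only the element-wise function x * σ(x), not the dense layer around it, and is intended for moderate input values (very large negative inputs would overflow `math.exp`).

```python
import math


def swish(x):
    """Swish / SiLU: the input gated by the logistic CDF, x * sigma(x)."""
    return x / (1.0 + math.exp(-x))


values = [swish(x) for x in (-4.0, -1.0, 0.0, 1.0, 4.0)]
```

The non-monotonic shape shows up on the negative side: the output at -1 is lower than the output at -4, which is already close to zero.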

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `he_normal_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/swish.py
  

Gelu function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with the GELU activation function (Hendrycks and Gimpel, 2016). The GELU or “Gaussian Error Linear Unit” is a self-gating non-monotonic activation function similar to Swish: whereas GELU uses the CDF of the Gaussian distribution Φ for self-gating, x * Φ(x), the Swish activation uses the logistic CDF σ and computes x * σ(x). Various approximations exist, but Thinc implements the exact GELU. GELU is popular within transformer feed-forward blocks.
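
For comparison, the sketch below computes the exact form x * Φ(x) with `math.erf` next to the widely used tanh approximation. Both functions are plain-Python illustrations of the element-wise activation only; `gelu_tanh` is included purely as a reference point and is not claimed to be part of Thinc.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi


def gelu_tanh(x):
    """Common tanh-based approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


gaps = [abs(gelu(x) - gelu_tanh(x)) for x in (-3.0, -1.0, 0.0, 1.0, 3.0)]
```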

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `he_normal_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/gelu.py
  

ReluK function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with the ReLU activation function where the maximum value is clipped at k. A common choice is k=6, introduced for convolutional deep belief networks (Krizhevsky, 2010). The resulting function, relu6, is commonly used in low-precision scenarios.

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
`k`	`float`	Maximum value. Defaults to `6.0`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/clipped_linear.py#L132
  

HardSigmoid function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with hard sigmoid activation function, which is a fast linear approximation of sigmoid, defined as max(0, min(1, x * 0.2 + 0.5)).

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/clipped_linear.py#L90
  

HardTanh function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with hard tanh activation function, which is a fast linear approximation of tanh, defined as max(-1, min(1, x)).

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/clipped_linear.py#L111
  

ClippedLinear function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer implementing a flexible clipped linear activation function of the form max(min_val, min(max_val, x * slope + offset)). It is used to implement the ReluK, HardSigmoid, and HardTanh layers.
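
The relationship between the general form and the three derived activations can be sketched in plain Python. The function names and the default argument values are illustrative; the slope, offset and clip values for each derived function follow the formulas given in the sections above.

```python
def clipped_linear(x, slope=1.0, offset=0.0, min_val=0.0, max_val=1.0):
    """max(min_val, min(max_val, x * slope + offset))"""
    return max(min_val, min(max_val, x * slope + offset))


def relu_k(x, k=6.0):
    # ReLU with the maximum value clipped at k.
    return clipped_linear(x, slope=1.0, offset=0.0, min_val=0.0, max_val=k)


def hard_sigmoid(x):
    # max(0, min(1, x * 0.2 + 0.5))
    return clipped_linear(x, slope=0.2, offset=0.5, min_val=0.0, max_val=1.0)


def hard_tanh(x):
    # max(-1, min(1, x))
    return clipped_linear(x, slope=1.0, offset=0.0, min_val=-1.0, max_val=1.0)
```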

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
`slope`	`float`	The slope of the linear function: `input * slope`.
`offset`	`float`	The offset or intercept of the linear function: `input * slope + offset`.
`min_val`	`float`	Minimum value to clip to.
`max_val`	`float`	Maximum value to clip to.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/clipped_linear.py
  

HardSwish function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer implementing the hard Swish activation function, which is a fast linear approximation of Swish: x * hard_sigmoid(x).

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `he_normal_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/hard_swish.py
  

HardSwishMobileNet function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer implementing a variant of the fast linear hard Swish activation function used in MobileNetV3 (Howard et al., 2019), defined as x * (relu6(x + 3) / 6).
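
The two hard Swish variants differ only in the gate they multiply the input by. The plain-Python sketch below places the MobileNetV3 form next to the x * hard_sigmoid(x) form from the HardSwish section; it covers the element-wise activations only, and the function names are chosen for the example.

```python
def relu6(x):
    return min(max(x, 0.0), 6.0)


def hard_swish_mobilenet(x):
    """x * (relu6(x + 3) / 6), the variant used in MobileNetV3."""
    return x * (relu6(x + 3.0) / 6.0)


def hard_swish(x):
    """x * hard_sigmoid(x), with hard_sigmoid(x) = max(0, min(1, 0.2 * x + 0.5))."""
    return x * max(0.0, min(1.0, 0.2 * x + 0.5))
```

The MobileNetV3 gate saturates at x = -3 and x = 3, while the hard sigmoid gate saturates at x = -2.5 and x = 2.5, so the two functions agree outside those ranges and differ slightly in between.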

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `he_normal_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created dense layer.

 View on GitHub thinc/layers/hard_swish_mobilenet.py
  

MultiSoftmax function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

Neural network layer that predicts several multi-class attributes at once. For instance, we might predict one attribute with six possible values and another with five. The layer computes the 11 outputs required for this, then applies softmax separately so that columns 0 to 5 form one probability distribution and columns 6 to 10 form another.
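
The segment-wise normalization can be sketched in plain Python as follows. Only the activation step is shown, applied to a single row of 11 scores with block sizes of six and five; the helper names are hypothetical.

```python
import math


def softmax(scores):
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_softmax(row, nOs):
    """Apply a separate softmax to each consecutive block of the row."""
    out, start = [], 0
    for size in nOs:
        out.extend(softmax(row[start:start + size]))
        start += size
    return out


row = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 0.0, -2.0, 1.0, 0.5]
probs = multi_softmax(row, nOs=(6, 5))
```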

Argument	Type	Description
`nOs`	`Tuple[int, …]`	The sizes of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
RETURNS	`Model[Floats2d, Floats2d]`	The created multi softmax layer.

 View on GitHub thinc/layers/multisoftmax.py
  

ParametricAttention function

Input: Ragged
Output: Ragged
Parameters: Q (nO,)

A layer that uses the parametric attention scheme described by Yang et al. (2016). The layer learns a parameter vector that is used as the keys in a single-headed attention mechanism.
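
One plausible reading of this scheme is sketched below in plain Python. It assumes that each row is scored by its dot product with the learned vector `Q`, that scores are normalized with a softmax within each sequence, and that each row is then scaled by its weight. These details, the flat rows-plus-lengths layout, and the function name are assumptions for illustration and are not taken from the Thinc source.

```python
import math


def parametric_attention(rows, lengths, Q):
    """Weight each row by a softmax over its sequence of dot(row, Q)."""
    out, start = [], 0
    for length in lengths:
        seq = rows[start:start + length]
        scores = [sum(x * q for x, q in zip(row, Q)) for row in seq]
        shift = max(scores) if scores else 0.0
        exps = [math.exp(s - shift) for s in scores]
        total = sum(exps)
        for row, e in zip(seq, exps):
            out.append([x * (e / total) for x in row])
        start += length
    return out


# Two sequences of lengths 2 and 1, stored as one flat list of rows.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lengths = [2, 1]
uniform = parametric_attention(rows, lengths, [0.0, 0.0])
skewed = parametric_attention(rows, lengths, [2.0, 0.0])
```

The output keeps one row per input row, which matches the layer's Ragged-to-Ragged signature; a reduction such as a sum over each sequence would typically follow.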

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
RETURNS	`Model[Ragged, Ragged]`	The created attention layer.

 View on GitHub thinc/layers/parametricattention.py
  

ParametricAttention_v2 function

Input: Ragged
Output: Ragged
Parameters: Q (nO,)

A layer that uses the parametric attention scheme described by Yang et al. (2016). The layer learns a parameter vector that is used as the keys in a single-headed attention mechanism.

Argument	Type	Description
`key_transform`	`Optional[Model[Floats2d, Floats2d]]`	Transformation to apply to the key representations. Defaults to `None`.
`nO`	`Optional[int]`	The size of the output vectors.
RETURNS	`Model[Ragged, Ragged]`	The created attention layer.

 View on GitHub thinc/layers/parametricattention_v2.py
  

Relu function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with Relu activation.

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `glorot_uniform_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`dropout`	`Optional[float]`	Dropout rate to avoid overfitting.
`normalize`	`bool`	Whether or not to apply layer normalization. Defaults to `False`.
RETURNS	`Model[Floats2d, Floats2d]`	The created Relu layer.

 View on GitHub thinc/layers/relu.py
  

Softmax function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

A dense layer with a softmax activation. This is usually used as a prediction layer. Vectors produced by the softmax function sum to 1, and have values between 0 and 1, so each vector can be interpreted as a probability distribution.

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `zero_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
RETURNS	`Model[Floats2d, Floats2d]`	The created softmax layer.

 View on GitHub thinc/layers/softmax.py
  

Softmax_v2 function

Input: Floats2d (batch_size, nI)
Output: Floats2d (batch_size, nO)
Parameters: W (nO, nI), b (nO,)

Softmax_v2 supports outputting unnormalized probabilities during inference by using normalize_outputs=False as an argument. This is useful when we are only interested in finding the top-k classes, but not their probabilities. Computing unnormalized probabilities is faster, because it skips the expensive normalization step.

The temperature argument of Softmax_v2 provides control of the softmax distribution. Values larger than 1 increase entropy and values between 0 and 1 (exclusive) decrease entropy of the distribution. The default temperature of 1 will calculate the unmodified softmax distribution. temperature is not used during inference when normalize_outputs=False.
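
Both effects can be checked with a short plain-Python sketch of the softmax function itself. It assumes that the unnormalized scores are the raw logits; the helper names `entropy` and `top_k` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by the temperature."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def top_k(scores, k):
    """Indices of the k largest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


logits = [2.0, 0.5, -1.0, 1.0]
sharp = softmax(logits, temperature=0.5)
plain = softmax(logits)
flat = softmax(logits, temperature=2.0)
```

Because softmax preserves the ordering of its inputs, ranking the raw logits yields the same top-k classes as ranking the normalized probabilities, which is why the normalization step can be skipped when only the ranking matters.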

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`nI`	`Optional[int]`	The size of the input vectors.
keyword-only
`init_W`	`Optional[Callable]`	A function to initialize the weights matrix. Defaults to `zero_init` when set to `None`.
`init_b`	`Optional[Callable]`	A function to initialize the bias vector. Defaults to `zero_init` when set to `None`.
`normalize_outputs`	`bool`	Return normalized probabilities during inference. Defaults to `True`.
`temperature`	`float`	Temperature to divide logits by. Defaults to `1.0`.
RETURNS	`Model[Floats2d, Floats2d]`	The created softmax layer.

 View on GitHub thinc/layers/softmax.py
  

SparseLinear function

Input: Tuple[ArrayXd, ArrayXd, ArrayXd]
Output: ArrayXd
Parameters: W (nO*length,), b (nO,), length int

A sparse linear layer using the “hashing trick”. Useful for tasks such as text classification. Inputs to the layer should be a tuple of arrays (keys, values, lengths), where the keys and values are arrays of the same length, describing the concatenated batch of input features and their values. The lengths array should have one entry per sequence in the batch, and the sum of the lengths should equal the length of the keys and values array.
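
The input layout and the hashing idea can be sketched in plain Python. This is not Thinc's implementation: the `bucket` function uses SHA-256 as a stand-in hash, and the way a (key, class) pair is mapped to a slot of the flat weights vector is an assumption made for the example.

```python
import hashlib


def bucket(key, clas, n_slots):
    """Hypothetical stand-in hash mapping a (key, class) pair to a weight slot."""
    digest = hashlib.sha256(f"{key}:{clas}".encode("utf8")).digest()
    return int.from_bytes(digest[:8], "little") % n_slots


def sparse_linear(keys, values, lengths, W, b):
    """keys/values are flat and concatenated; lengths gives features per example."""
    if sum(lengths) != len(keys) or len(keys) != len(values):
        raise ValueError("lengths must sum to len(keys) == len(values)")
    nO = len(b)
    out, start = [], 0
    for n in lengths:
        scores = list(b)
        for key, value in zip(keys[start:start + n], values[start:start + n]):
            for clas in range(nO):
                scores[clas] += value * W[bucket(key, clas, len(W))]
        out.append(scores)
        start += n
    return out


# Two examples with 3 and 2 features, concatenated into flat arrays.
keys = [10, 20, 30, 10, 40]
values = [1.0, 1.0, 0.5, 2.0, 1.0]
lengths = [3, 2]
W = [0.01 * i for i in range(16)]  # flat weights vector
b = [0.1, -0.1]                    # nO=2 output classes
Y = sparse_linear(keys, values, lengths, W, b)
```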

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`length`	`int`	The size of the weights vector, to be tuned empirically.
RETURNS	`Model[Tuple[ArrayXd, ArrayXd, ArrayXd], ArrayXd]`	The created layer.

 View on GitHub thinc/layers/sparselinear.pyx
  

SparseLinear_v2 function
New: v8.1.6

Input: Tuple[ArrayXd, ArrayXd, ArrayXd]
Output: ArrayXd
Parameters: W (nO*length,), b (nO,), length int

Argument	Type	Description
`nO`	`Optional[int]`	The size of the output vectors.
`length`	`int`	The size of the weights vector, to be tuned empirically.
RETURNS	`Model[Tuple[ArrayXd, ArrayXd, ArrayXd], ArrayXd]`	The created layer.

 View on GitHub thinc/layers/sparselinear.pyx
  

Reduction operations

reduce_first function

Input: Ragged
Output: ArrayXd (batch_size, nO)

Pooling layer that reduces the dimensions of the data by selecting the first item of each sequence. This is most useful after multi-head attention layers, which can learn to assign a good feature representation for the sequence to one of its elements.
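
As an illustration, the sketch below selects the first row of each sequence from a ragged batch, with the last-item selection of reduce_last shown alongside for comparison. It assumes a ragged batch is represented as a flat list of rows plus a list of sequence lengths, and that every sequence is non-empty; the helper names are hypothetical.

```python
def split_ragged(rows, lengths):
    """Yield one list of rows per sequence from a flat, concatenated batch."""
    start = 0
    for length in lengths:
        yield rows[start:start + length]
        start += length


def reduce_first(rows, lengths):
    return [seq[0] for seq in split_ragged(rows, lengths)]


def reduce_last(rows, lengths):
    return [seq[-1] for seq in split_ragged(rows, lengths)]


# Three sequences of lengths 2, 1 and 3, stored as one flat (6, 2) batch.
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 0.0], [1.0, 1.0]]
lengths = [2, 1, 3]
first = reduce_first(rows, lengths)
last = reduce_last(rows, lengths)
```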

Argument	Type	Description
RETURNS	`Model[Ragged, ArrayXd]`	The created pooling layer.

 View on GitHub thinc/layers/reduce_first.py
  

reduce_last function

Input: Ragged
Output: ArrayXd (batch_size, nO)

Pooling layer that reduces the dimensions of the data by selecting the last item of each sequence. This is typically used after multi-head attention or recurrent neural network layers such as LSTMs, which can learn to assign a good feature representation for the sequence to its final element.

Argument	Type	Description
RETURNS	`Model[Ragged, ArrayXd]`	The created pooling layer.

 View on GitHub thinc/layers/reduce_last.py
  

reduce_max function

Input: Ragged
Output: Floats2d (batch_size, nO)

Pooling layer that reduces the dimensions of the data by selecting the maximum value for each feature. A ValueError is raised if any element in lengths is zero.

Argument	Type	Description
RETURNS	`Model[Ragged, Floats2d]`	The created pooling layer.

 View on GitHub thinc/layers/reduce_max.py
  

reduce_mean function

Input: Ragged
Output: Floats2d (batch_size, nO)

Pooling layer that reduces the dimensions of the data by computing the average value of each feature. Zero-length sequences are reduced to the zero vector.

Argument	Type	Description
RETURNS	`Model[Ragged, Floats2d]`	The created pooling layer.

 View on GitHub thinc/layers/reduce_mean.py
  

reduce_sum function

Input: Ragged
Output: Floats2d (batch_size, nO)

Pooling layer that reduces the dimensions of the data by computing the sum for each feature. Zero-length sequences are reduced to the zero vector.

Argument	Type	Description
RETURNS	`Model[Ragged, Floats2d]`	The created pooling layer.
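
The max, mean and sum reductions can be compared side by side on the same ragged batch. This is a minimal sketch, assuming Thinc and NumPy are installed; the values and dtypes are illustrative.

```python
import numpy
from thinc.api import Ragged, reduce_max, reduce_mean, reduce_sum

# Two sequences: rows 0-1 belong to the first, rows 2-4 to the second.
data = numpy.arange(10, dtype="float32").reshape(5, 2)
lengths = numpy.asarray([2, 3], dtype="int32")
Xr = Ragged(data, lengths)

maxes = reduce_max().predict(Xr)   # per-feature maximum of each sequence
means = reduce_mean().predict(Xr)  # per-feature average of each sequence
sums = reduce_sum().predict(Xr)    # per-feature sum of each sequence
```

Each result has one row per sequence and one column per feature, so all three outputs here have shape (2, 2).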

 View on GitHub thinc/layers/reduce_sum.py
  

Combinators

Combinators are layers that express higher-order functions: they take one or more layers as arguments and express some relationship or perform some additional logic around the child layers. Combinators can also be used to overload operators. For example, binding chain to >> allows you to write Relu(512) >> Softmax() instead of chain(Relu(512), Softmax()).
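
As a short sketch of the operator binding described above, assuming Thinc is installed: the binding is scoped to a with-block via Model.define_operators, and the layer sizes are illustrative.

```python
from thinc.api import Model, Relu, Softmax, chain

# Bind the >> operator to chain for the duration of the with-block.
with Model.define_operators({">>": chain}):
    model = Relu(512) >> Softmax()

# The same model written without operator overloading.
explicit = chain(Relu(512), Softmax())
```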

add function

Compose two or more models f, g, etc, such that their outputs are added, i.e. add(f, g)(x) computes f(x) + g(x).

Argument	Type	Description
`*layers`	`Model[Any, ArrayXd]`	The models to compose.
RETURNS	`Model[Any, ArrayXd]`	The composed model.
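
A minimal sketch of the f(x) + g(x) behavior, assuming Thinc and NumPy are installed. The two Linear layers and their sizes are arbitrary choices for illustration.

```python
import numpy
from thinc.api import Linear, add

f = Linear(nO=3, nI=4)
g = Linear(nO=3, nI=4)
model = add(f, g)
model.initialize()

X = numpy.random.uniform(-1, 1, (2, 4)).astype("float32")
Y = model.predict(X)
# The child layers are shared with the combinator, so they can be called directly.
expected = f.predict(X) + g.predict(X)
```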

 View on GitHub thinc/layers/add.py
  

bidirectional function

Stitch two RNN models into a bidirectional layer. Expects squared sequences.

Argument	Type	Description
`l2r`	`Model[Padded, Padded]`	The first model.
`r2l`	`Optional[Model[Padded, Padded]]`	The second model.
RETURNS	`Model[Padded, Padded]`	The composed bidirectional layer.

 View on GitHub thinc/layers/bidirectional.py
  

chain function

Compose two or more models such that they become layers of a single feed-forward model, e.g. chain(f, g) computes g(f(x)).

Argument	Type	Description
`layer1`	`Model`	The first model to compose.
`layer2`	`Model`	The second model to compose.
`*layers`	`Model`	Any additional models to compose.
RETURNS	`Model`	The composed feed-forward model.
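
A minimal sketch of a chained feed-forward model, assuming Thinc and NumPy are installed. Only the output sizes are given; the input sizes are inferred from sample data at initialization. The sizes and random input are illustrative.

```python
import numpy
from thinc.api import Relu, Softmax, chain

model = chain(Relu(nO=8), Softmax(nO=3))

# Passing sample data lets each layer infer its missing input dimension.
X = numpy.random.uniform(-1, 1, (4, 5)).astype("float32")
model.initialize(X=X)

Y = model.predict(X)  # Softmax(Relu(X)): one probability row per input row
input_widths = [layer.get_dim("nI") for layer in model.layers]
```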

 View on GitHub thinc/layers/chain.py
  

clone function

Construct n copies of a layer, with distinct weights. For example, clone(f, 3)(x) computes f(f'(f''(x))).

Argument	Type	Description
`orig`	`Model`	The layer to copy.
`n`	`int`	The number of copies to construct.
RETURNS	`Model`	A composite model containing two or more copies.
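
A minimal sketch of cloning a layer three times, assuming Thinc and NumPy are installed. The layer type and sizes are illustrative. Because each copy has its own weights, the first copy can end up with a different input width than the others.

```python
import numpy
from thinc.api import Linear, clone

# Three stacked Linear layers with separate parameters.
model = clone(Linear(nO=4), 3)

X = numpy.zeros((2, 6), dtype="float32")
model.initialize(X=X)

# The first copy maps 6 -> 4, the remaining copies map 4 -> 4.
W_shapes = [layer.get_param("W").shape for layer in model.layers]
```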

 View on GitHub thinc/layers/clone.py
  

concatenate function

Compose two or more models f, g, etc, such that their outputs are concatenated, i.e. concatenate(f, g)(x) computes hstack(f(x), g(x)).

Argument	Type	Description
`*layers`	`Model`, …	The models to compose.
RETURNS	`Model`	The composed model.
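
A minimal sketch of the hstack behavior for array inputs, assuming Thinc and NumPy are installed. The two Linear layers and their sizes are illustrative.

```python
import numpy
from thinc.api import Linear, concatenate

f = Linear(nO=2, nI=4)
g = Linear(nO=3, nI=4)
model = concatenate(f, g)
model.initialize()

X = numpy.random.uniform(-1, 1, (3, 4)).astype("float32")
Y = model.predict(X)
# The output width is the sum of the child output widths: 2 + 3 = 5.
expected = numpy.hstack([f.predict(X), g.predict(X)])
```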

 View on GitHub thinc/layers/concatenate.py
  

map_list function

Map a child layer across list inputs.

Argument	Type	Description
`layer`	`Model[InT, OutT]`	The child layer to map.
RETURNS	`Model[List[InT], List[OutT]]`	The composed model.

 View on GitHub thinc/layers/map_list.py
  

expand_window function

Input: Floats2d, Ragged (batch_size, nI)
Output: Floats2d, Ragged (batch_size, nO)
Attrs: window_size int

For each vector in an input, construct an output vector that contains the input and a window of surrounding vectors. This is one step in a convolution. If the window_size is three, the output size nO will be nI * 7 after concatenating three contextual vectors from the left, and three from the right, to each input vector. In general, nO equals nI * (2 * window_size + 1).

Argument	Type	Description
`window_size`	`int`	The window size (default 1) that determines the number of surrounding vectors.
RETURNS	`Model[T, T]`	The created layer for adding context to vectors.
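
The size formula nO = nI * (2 * window_size + 1) can be checked on a small input. This is a minimal sketch, assuming Thinc and NumPy are installed; the input values are illustrative and float32 is an assumed dtype.

```python
import numpy
from thinc.api import expand_window

nI, window_size = 3, 1
X = numpy.arange(12, dtype="float32").reshape(4, nI)

Y = expand_window(window_size=window_size).predict(X)

# Each output row is [left neighbor, the vector itself, right neighbor].
# Missing neighbors at the sequence boundaries are filled with zeros.
nO = nI * (2 * window_size + 1)
```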

 View on GitHub thinc/layers/expand_window.py
  

noop function

Transform a sequence of layers into a null operation.

Argument	Type	Description
`*layers`	`Model`	The models to compose.
RETURNS	`Model`	The composed model.

 View on GitHub thinc/layers/noop.py
  

residual function

Input: List[FloatsXd] / Ragged / Padded / FloatsXd Floats1d Floats2d Floats3d Floats4d
Output: List[FloatsXd] / Ragged / Padded / FloatsXd Floats1d Floats2d Floats3d Floats4d

A unary combinator creating a residual connection. This converts a layer computing f(x) into one that computes f(x)+x. Gradients flow through residual connections directly, helping the network to learn more smoothly.

Argument	Type	Description
`layer`	`Model[T, T]`	A model with the same input and output types.
RETURNS	`Model[T, T]`	A model with the unchanged input and output types.
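
A minimal sketch of the f(x) + x behavior for array inputs, assuming Thinc and NumPy are installed. The wrapped Linear layer is an illustrative choice; its input and output widths must match for the addition to work.

```python
import numpy
from thinc.api import Linear, residual

inner = Linear(nO=4, nI=4)
model = residual(inner)
model.initialize()

X = numpy.random.uniform(-1, 1, (2, 4)).astype("float32")
Y = model.predict(X)
expected = inner.predict(X) + X
```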

 View on GitHub thinc/layers/residual.py
  

tuplify function

Give each child layer a separate copy of the input, and then combine the outputs of the child layers into a tuple. Useful for providing original and modified input to a downstream layer.

On the backward pass the loss from each child is added together, so when using custom datatypes they should define an addition operator.

Argument	Type	Description
`*layers`	`Model[Any, T] …`	The models to compose.
RETURNS	`Model[Any, Tuple[T]]`	The composed feed-forward model.
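
A minimal sketch of the tuple output, assuming Thinc and NumPy are installed. The two Linear layers and their sizes are illustrative.

```python
import numpy
from thinc.api import Linear, tuplify

model = tuplify(Linear(nO=2, nI=4), Linear(nO=3, nI=4))
model.initialize()

X = numpy.random.uniform(-1, 1, (5, 4)).astype("float32")
# Both children receive the same X; the result holds one output per child.
Y = model.predict(X)
shapes = [y.shape for y in Y]
```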

 View on GitHub thinc/layers/tuplify.py
  

siamese function

Combine and encode a layer and a similarity function to form a siamese architecture. Typically used to learn symmetric relationships, such as redundancy detection.

Argument	Type	Description
`layer`	`Model`	The layer to run over the pair of inputs.
`similarity`	`Model`	The similarity layer.
RETURNS	`Model[Tuple, ArrayXd]`	The created siamese layer.

 View on GitHub thinc/layers/siamese.py
  

uniqued function

Group inputs to a layer, so that the layer only has to compute for the unique values. The data is transformed back before output, and the same transformation is applied for the gradient. Effectively, this is a cache local to each minibatch. The uniqued wrapper is useful for word inputs, because common words are seen often, but we may want to compute complicated features for the words, using e.g. a character LSTM.

Argument	Type	Description
`layer`	`Model`	The layer.
keyword-only
`column`	`int`	The column. Defaults to `0`.
RETURNS	`Model[Ints2d, Floats2d]`	The composed model.
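
The caching idea can be sketched with plain NumPy. This is not the uniqued implementation: it works on a one-dimensional array of IDs rather than a column of an Ints2d input, and expensive_features is a hypothetical stand-in for a costly layer.

```python
import numpy

word_ids = numpy.asarray([7, 3, 7, 7, 3, 9])
calls = []

def expensive_features(ids):
    # Hypothetical costly computation; records how many rows it was asked for.
    calls.append(len(ids))
    return numpy.stack([ids * 1.0, ids * 2.0], axis=1)

# Compute features once per unique ID, then scatter them back to every position.
uniq, inverse = numpy.unique(word_ids, return_inverse=True)
uniq_feats = expensive_features(uniq)
feats = uniq_feats[inverse]
```

Here the costly function sees three rows instead of six, and feats still has one row per original input.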

 View on GitHub thinc/layers/uniqued.py
  

Data type transfers

array_getitem, ints_getitem, floats_getitem function

Input: ArrayXd
Output: ArrayXd

Index into input arrays, and return the subarrays. Multi-dimensional indexing can be performed by passing in a tuple, and slicing can be performed using the slice object. For instance, X[:, :-1] would be (slice(None, None), slice(None, -1)).

Argument	Type	Description
`index`	`Union[Union[int, slice, Sequence[int]], Tuple[Union[int, slice, Sequence[int]], …]]`	A valid numpy-style index.
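
A minimal sketch of the X[:, :-1] case described above, assuming Thinc and NumPy are installed and that array_getitem is importable from thinc.api. The input values are illustrative.

```python
import numpy
from thinc.api import array_getitem

X = numpy.arange(6, dtype="float32").reshape(2, 3)

# Equivalent to X[:, :-1]: keep all rows, drop the last column.
drop_last = array_getitem((slice(None, None), slice(None, -1)))
Y = drop_last.predict(X)
```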

 View on GitHub thinc/layers/array_getitem.py
  

list2array function

Input: List2d
Output: Array2d

Transform sequences to ragged arrays if necessary and return the concatenated data array. If sequences are already ragged, do nothing. A ragged array is a tuple (data, lengths), where data is the concatenated data.

Argument	Type	Description
RETURNS	`Model[List2d, Array2d]`	The layer to compute the transformation.

 View on GitHub thinc/layers/list2array.py
  

list2ragged function

Input: ListXd
Output: Ragged

Transform sequences to ragged arrays if necessary and return the ragged array. If sequences are already ragged, do nothing. A ragged array is a tuple (data, lengths), where data is the concatenated data.

Argument	Type	Description
RETURNS	`Model[ListXd, Ragged]`	The layer to compute the transformation.
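
A minimal round-trip sketch from a list of arrays to Ragged and back, assuming Thinc and NumPy are installed. The array contents and sizes are illustrative.

```python
import numpy
from thinc.api import Ragged, list2ragged, ragged2list

# Two sequences of different lengths with the same width.
Xs = [
    numpy.zeros((2, 4), dtype="float32"),
    numpy.ones((3, 4), dtype="float32"),
]

Xr = list2ragged().predict(Xs)    # concatenated data plus per-sequence lengths
back = ragged2list().predict(Xr)  # split the data again using the lengths
```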

 View on GitHub thinc/layers/list2ragged.py
  

list2padded function

Input: List2d
Output: Padded

Create a layer to convert a list of array inputs into Padded.

Argument	Type	Description
RETURNS	`Model[List2d, Padded]`	The layer to compute the transformation.
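
A minimal round-trip sketch from a list of arrays to Padded and back, assuming Thinc and NumPy are installed. The array contents and sizes are illustrative.

```python
import numpy
from thinc.api import list2padded, padded2list

Xs = [
    numpy.zeros((2, 4), dtype="float32"),
    numpy.ones((3, 4), dtype="float32"),
]

Xp = list2padded().predict(Xs)
# The padded data is laid out as (longest sequence, batch size, width).
padded_shape = Xp.data.shape

back = padded2list().predict(Xp)  # original order and lengths are restored
```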

 View on GitHub thinc/layers/list2padded.py
  

ragged2list function

Input: Ragged
Output: ListXd

Transform sequences from a ragged format into lists.

Argument	Type	Description
RETURNS	`Model[Ragged, ListXd]`	The layer to compute the transformation.

 View on GitHub thinc/layers/ragged2list.py
  

padded2list function

Input: Padded
Output: List2d

Create a layer to convert a Padded input into a list of arrays.

Argument	Type	Description
RETURNS	`Model[Padded, List2d]`	The layer to compute the transformation.

 View on GitHub thinc/layers/padded2list.py
  

remap_ids function

Input: Union[Sequence[Hashable], Ints1d, Ints2d]
Output: Ints2d

Remap a sequence of strings, integers or other hashable inputs using a mapping table, usually as a preprocessing step before embeddings. The input can also be a two-dimensional integer array, in which case the column attribute tells the remap_ids layer which column of the array to map with the mapping_table. Both attributes can be passed on initialization, but because the layer retrieves them from model.attrs during forward, they can be set at any time before calling forward and can also be changed between calls. Before calling forward, the mapping_table has to be set, and for 2D inputs the column is also required.

Argument	Type	Description
`mapping_table`	`Dict[Any, int]`	The mapping table to use. Can also be set after initialization by writing to `model.attrs["mapping_table"]`.
`default`	`int`	The default value if the input does not have an entry in the mapping table.
`column`	`int`	The column to apply the mapper to in case of 2D input.
RETURNS	`Model[Union[Sequence[Hashable], Ints1d, Ints2d], Ints2d]`	The layer to compute the transformation.
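
A minimal sketch of remapping a sequence of strings, assuming Thinc is installed. The table contents are hypothetical, and both attributes are passed at construction here rather than set through model.attrs.

```python
from thinc.api import remap_ids

# Hypothetical vocabulary; unknown inputs fall back to the default value 0.
table = {"apple": 1, "banana": 2}
model = remap_ids(table, default=0)

ids = model.predict(["banana", "apple", "cherry"])
```

The result is a two-dimensional integer array with one row per input item.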

 View on GitHub thinc/layers/remap_ids.py
  

strings2arrays function

Input: Sequence[Sequence[str]]
Output: List[Ints2d]

Transform a sequence of string sequences to a list of arrays.

Argument	Type	Description
RETURNS	`Model[Sequence[Sequence[str]], List[Ints2d]]`	The layer to compute the transformation.

 View on GitHub thinc/layers/strings2arrays.py
  

with_array function

Input / output: Union[Padded, Ragged, ListXd, ArrayXd]

Transform sequence data into a contiguous array on the way into and out of a model. Handles a variety of sequence types: lists, padded and ragged. If the input is an array, it is passed through unchanged.

Argument	Type	Description
`layer`	`Model[ArrayXd, ArrayXd]`	The layer to wrap.
keyword-only
`pad`	`int`	The padding. Defaults to `0`.
RETURNS	`Model`	The wrapped layer.
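
A minimal sketch of applying an array-based layer to list input, assuming Thinc and NumPy are installed. The wrapped Linear layer and the sequence sizes are illustrative.

```python
import numpy
from thinc.api import Linear, with_array

model = with_array(Linear(nO=2, nI=4))
model.initialize()

Xs = [
    numpy.zeros((2, 4), dtype="float32"),
    numpy.ones((3, 4), dtype="float32"),
]
# The list is concatenated, passed through Linear once, then split again.
Ys = model.predict(Xs)
shapes = [y.shape for y in Ys]
```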

 View on GitHub thinc/layers/with_array.py
  

with_array2d function

Input / output: Union[Padded, Ragged, List2d, Array2d]

Transform sequence data into a contiguous two-dimensional array on the way into and out of a model. In comparison to the with_array layer, the behavior of this layer mostly differs on Padded inputs, as this layer merges the batch and length axes to form a two-dimensional array. Handles a variety of sequence types: lists, padded and ragged. If the input is a two-dimensional array, it is passed through unchanged.

Argument	Type	Description
`layer`	`Model[Array2d, Array2d]`	The layer to wrap.
keyword-only
`pad`	`int`	The padding. Defaults to `0`.
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_array2d.py
  

with_flatten function

Input: Sequence[Sequence[Any]]
Output: ListXd

Flatten nested inputs on the way into a layer and reverse the transformation over the outputs.

Argument	Type	Description
`layer`	`Model[Sequence[Any], ArrayXd]`	The layer to wrap.
RETURNS	`Model[Sequence[Sequence[Any]], ListXd]`	The wrapped layer.

 View on GitHub thinc/layers/with_flatten.py
  

with_flatten_v2 function (New: v8.1.6)

Input: List[List[InItemT]]
Output: List[List[OutItemT]]

Flatten nested inputs on the way into a layer and reverse the transformation over the outputs.

Argument	Type	Description
`layer`	`Model[List[InItemT], List[OutItemT]]`	The layer to wrap.
RETURNS	`Model[List[List[InItemT]], List[List[OutItemT]]]`	The wrapped layer.

 View on GitHub thinc/layers/with_flatten_v2.py
  

with_padded function

Input / output: Union[Padded, Ragged, List2d, Floats3d, Tuple[Floats3d, Ints1d, Ints1d, Ints1d]]

Convert sequence input into the Padded data type on the way into a layer and reverse the transformation on the output.

Argument	Type	Description
`layer`	`Model[Padded, Padded]`	The layer to wrap.
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_padded.py
  

with_ragged function

Input / output: Union[Padded, Ragged, ListXd, Floats3d, Tuple[Floats2d, Ints1d]]

Convert sequence input into the Ragged data type on the way into a layer and reverse the transformation on the output.

Argument	Type	Description
`layer`	`Model[Ragged, Ragged]`	The layer to wrap.
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_ragged.py
  

with_list function

Input / output: Union[Padded, Ragged, List2d]

Convert sequence input into lists on the way into a layer and reverse the transformation on the outputs.

Argument	Type	Description
`layer`	`Model[List2d, List2d]`	The layer to wrap.
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_list.py
  

with_getitem function

Input: Tuple
Output: Tuple

Transform data on the way into and out of a layer by plucking an item from a tuple.

Argument	Type	Description
`idx`	`int`	The index to pluck from the tuple.
`layer`	`Model[ArrayXd, ArrayXd]`	The layer to wrap.
RETURNS	`Model[Tuple, Tuple]`	The wrapped layer.
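
A minimal sketch of transforming one item of a tuple while passing the rest through, assuming Thinc and NumPy are installed. The wrapped Linear layer and the extra tuple item are illustrative.

```python
import numpy
from thinc.api import Linear, with_getitem

# Apply the Linear layer to item 0 of the tuple only.
model = with_getitem(0, Linear(nO=2, nI=4))
model.initialize()

X = numpy.zeros((3, 4), dtype="float32")
extra = numpy.arange(3)
Y, passed = model.predict((X, extra))
```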

 View on GitHub thinc/layers/with_getitem.py
  

with_reshape function

Input: Array3d
Output: Array3d

Reshape data on the way into and out from a layer.

Argument	Type	Description
`layer`	`Model[Array2d, Array2d]`	The layer to wrap.
RETURNS	`Model[Array3d, Array3d]`	The wrapped layer.

 View on GitHub thinc/layers/with_reshape.py
  

with_debug function

Input: Any
Output: Any

Debugging layer that wraps any layer and allows executing callbacks during the forward pass, backward pass and initialization. The callbacks will receive the same arguments as the functions they’re called in and are executed before the function runs.

Example

from thinc.api import Linear, with_debug

def on_init(model, X, Y):
    # Called before initialization runs; X and Y are None if no data was passed.
    print(f"X: {type(X)}, Y: {type(Y)}")

model = with_debug(Linear(2, 5), on_init=on_init)
model.initialize()

Argument	Type	Description
`layer`	`Model`	The layer to wrap.
`name`	`Optional[str]`	Optional name for the wrapped layer, will be prefixed by `debug:`. Defaults to name of the wrapped layer.
keyword-only
`on_init`	`Callable[[Model, Any, Any], None]`	Function called on initialization. Receives the model and the `X` and `Y` passed to `Model.initialize`, if available.
`on_forward`	`Callable[[Model, Any, bool], None]`	Function called at the start of the forward pass. Receives the model, the inputs and the value of `is_train`.
`on_backprop`	`Callable[[Any], None] = do_nothing`	Function called at the start of the backward pass. Receives the gradient.
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_debug.py
  

with_nvtx_range function

Input: Any
Output: Any

Layer that wraps any layer and marks the forward and backprop passes as an NVTX range. This can be helpful when profiling GPU performance of a layer.

Example

from thinc.api import Linear, with_nvtx_range

model = with_nvtx_range(Linear(2, 5))
model.initialize()

Argument	Type	Description
`layer`	`Model`	The layer to wrap.
`name`	`Optional[str]`	Optional name for the wrapped layer. Defaults to the name of the wrapped layer.
keyword-only
`forward_color`	`int`	Identifier of the color to use for the forward pass
`backprop_color`	`int`	Identifier of the color to use for the backward pass
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_nvtx_range.py
  

with_signpost_interval function (New: v8.1.1)

Input: Any
Output: Any

Layer that wraps any layer and marks the init, forward and backprop passes as a (macOS) signpost interval. This can be helpful when profiling the performance of a layer using macOS Instruments.app. Use of this layer requires that the os-signpost package is installed.

Example

from os_signpost import Signposter
from thinc.api import Linear, with_signpost_interval

signposter = Signposter("com.example.my_subsystem",
    Signposter.Category.DynamicTracing)

model = with_signpost_interval(Linear(2, 5), signposter)
model.initialize()

Argument	Type	Description
`layer`	`Model`	The layer to wrap.
`signposter`	`os_signpost.Signposter`	`Signposter` object to log the interval with.
`name`	`Optional[str]`	Optional name for the wrapped layer. Defaults to the name of the wrapped layer.
RETURNS	`Model`	The wrapped layer.

 View on GitHub thinc/layers/with_signpost_interval.py
  

Wrappers

PyTorchWrapper, PyTorchRNNWrapper function

Input: Any
Output: Any

Wrap a PyTorch model so that it has the same API as Thinc models. To optimize the model, you’ll need to create a PyTorch optimizer and call optimizer.step after each batch. The PyTorchRNNWrapper has the same signature as the PyTorchWrapper and lets you pass in a custom sequence model that has the same input and output behavior as a torch.nn.RNN object.

Your PyTorch model’s forward method can take arbitrary positional arguments and keyword arguments, but must return either a single tensor as output or a tuple. You may find PyTorch’s register_forward_hook helpful if you need to adapt the output. The convert functions are used to map inputs and outputs to and from your PyTorch model. Each function should return the converted output, and a callback to use during the backward pass:

Xtorch, get_dX = convert_inputs(X)
Ytorch, torch_backprop = model.shims[0](Xtorch, is_train)
Y, get_dYtorch = convert_outputs(Ytorch)

To allow maximum flexibility, the PyTorchShim expects ArgsKwargs objects on the way into the forward and backward passes. The ArgsKwargs objects will be passed straight into the model in the forward pass, and straight into torch.autograd.backward during the backward pass.

Argument	Type	Description
`pytorch_model`	`Any`	The PyTorch model.
`convert_inputs`	`Callable`	Function to convert inputs to PyTorch tensors (same signature as `forward` function).
`convert_outputs`	`Callable`	Function to convert outputs from PyTorch tensors (same signature as `forward` function).
RETURNS	`Model[Any, Any]`	The Thinc model.

 View on GitHub thinc/layers/pytorchwrapper.py
  

TorchScriptWrapper_v1 function (New: v8.1.6)

Input: Any
Output: Any

Wrap a TorchScript model so that it has the same API as Thinc models. To optimize the model, you’ll need to create a PyTorch optimizer and call optimizer.step after each batch.

Your TorchScript model’s forward method can take arbitrary positional arguments and keyword arguments, but must return either a single tensor as output or a tuple. The convert functions are used to map inputs and outputs to and from your TorchScript model. Each function should return the converted output, and a callback to use during the backward pass:

Xtorch, get_dX = convert_inputs(X)
Ytorch, torch_backprop = model.shims[0](Xtorch, is_train)
Y, get_dYtorch = convert_outputs(Ytorch)

To allow maximum flexibility, the TorchScriptShim expects ArgsKwargs objects on the way into the forward and backward passes. The ArgsKwargs objects will be passed straight into the model in the forward pass, and straight into torch.autograd.backward during the backward pass.

Note that the torchscript_model argument can be None. This is useful for deserialization since serialized TorchScript contains both the model and its weights.

A PyTorch wrapper can be converted to a TorchScript wrapper using the pytorch_to_torchscript_wrapper function:

from thinc.api import PyTorchWrapper_v2, pytorch_to_torchscript_wrapper
import torch

nI, nO = 16, 8  # illustrative input and output widths
model = PyTorchWrapper_v2(torch.nn.Linear(nI, nO)).initialize()
script_model = pytorch_to_torchscript_wrapper(model)

Argument	Type	Description
`torchscript_model`	`Optional[torch.jit.ScriptModule]`	The TorchScript model.
`convert_inputs`	`Callable`	Function to convert inputs to PyTorch tensors (same signature as `forward` function).
`convert_outputs`	`Callable`	Function to convert outputs from PyTorch tensors (same signature as `forward` function).
`mixed_precision`	`bool`	Enable mixed-precision training.
`grad_scaler`	`Optional[PyTorchGradScaler]`	Gradient scaler to use during mixed-precision training.
`device`	`Optional[torch.device]`	The Torch device to execute the model on.
RETURNS	`Model[Any, Any]`	The Thinc model.

 View on GitHub thinc/layers/torchscriptwrapper.py
  

TensorFlowWrapper function

Input: Any
Output: Any

Note: In Thinc v8.2+, TensorFlow support is not enabled by default. To enable TensorFlow:

from thinc.api import enable_tensorflow
enable_tensorflow()

Wrap a TensorFlow model, so that it has the same API as Thinc models. To optimize the model, you’ll need to create a TensorFlow optimizer and call optimizer.apply_gradients after each batch. To allow maximum flexibility, the TensorFlowShim expects ArgsKwargs objects on the way into the forward and backward passes.

Argument	Type	Description
`tensorflow_model`	`Any`	The TensorFlow model.
RETURNS	`Model[Any, Any]`	The Thinc model.

 View on GitHub thinc/layers/tensorflowwrapper.py
  

MXNetWrapper function

Input: Any
Output: Any

Note: In Thinc v8.2+, MXNet support is not enabled by default. To enable MXNet:

from thinc.api import enable_mxnet
enable_mxnet()

Wrap an MXNet model, so that it has the same API as Thinc models. To optimize the model, you’ll need to create an MXNet optimizer and call optimizer.step() after each batch. To allow maximum flexibility, the MXNetShim expects ArgsKwargs objects on the way into the forward and backward passes.

Argument	Type	Description
`mxnet_model`	`Any`	The MXNet model.
RETURNS	`Model[Any, Any]`	The Thinc model.

 View on GitHub thinc/layers/mxnetwrapper.py
  
