{"componentChunkName":"component---src-templates-docs-js","path":"/docs/usage-training","result":{"data":{"site":{"siteMetadata":{"sidebar":[{"label":"Get started","items":[{"text":"Introduction","url":"/docs/"},{"text":"Concept & Design","url":"/docs/concept"},{"text":"Installation & Setup","url":"/docs/install"},{"text":"Examples & Tutorials","url":"https://github.com/explosion/thinc/#-selected-examples-and-notebooks"},{"text":"Backprop 101","url":"/docs/backprop101"}]},{"label":"Usage","items":[{"text":"Configuration System","url":"/docs/usage-config"},{"text":"Defining & Using Models","url":"/docs/usage-models"},{"text":"Training Models","url":"/docs/usage-training"},{"text":"PyTorch, TensorFlow etc.","url":"/docs/usage-frameworks"},{"text":"Variable-length Sequences","url":"/docs/usage-sequences"},{"text":"Type Checking","url":"/docs/usage-type-checking"}]},{"label":"API","items":[{"text":"Model","url":"/docs/api-model"},{"text":"Layers","url":"/docs/api-layers"},{"text":"Optimizers","url":"/docs/api-optimizers"},{"text":"Initializers","url":"/docs/api-initializers"},{"text":"Schedules","url":"/docs/api-schedules"},{"text":"Losses","url":"/docs/api-loss"},{"text":"Config & Registry","url":"/docs/api-config"},{"text":"Types & Dataclasses","url":"/docs/api-types"},{"text":"Backends & Math","url":"/docs/api-backends"},{"text":"Utilities & Extras","url":"/docs/api-util"}]}]}},"markdownRemark":{"htmlAst":{"type":"root","children":[{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"Thinc provides a fairly minimalistic approach to training, leaving you in\ncontrol to write the training loop. 
The library provides a few utilities for\nminibatching, hyperparameter scheduling, loss functions and weight\ninitialization, but does not provide abstractions for data loading, progress\ntracking or hyperparameter optimization."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"h2","properties":{"id":"training-loop"},"children":[{"type":"text","value":"The training loop "}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"Thinc assumes that your model will be trained using some form of "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"minibatched\nstochastic gradient descent"}]},{"type":"text","value":". On each step of a standard training loop, you’ll\nloop over batches of your data and call\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-model#begin_update"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Model.begin_update"}]}]},{"type":"text","value":" on the inputs of the batch,\nwhich will return a batch of predictions and a backpropagation callback. You’ll\nthen calculate the gradient of the loss with respect to the output, and provide\nit to the backprop callback which will increment the gradients of the model\nparameters as a side-effect. 
You can then pass an optimizer function into the\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-model#finish_update"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Model.finish_update"}]}]},{"type":"text","value":" method to update the\nweights."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Basic training loop"},"children":[{"type":"text","value":"for i in range(10):\n    for X, Y in train_batches:\n        Yh, backprop = model.begin_update(X)\n        loss, dYh = get_loss_and_gradient(Yh, Y)\n        backprop(dYh)\n        model.finish_update(optimizer)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"You’ll usually want to make some additions to the loop to save out model\ncheckpoints periodically, and to calculate and report progress statistics. Thinc\nalso provides ready access to "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"lower-level details"}]},{"type":"text","value":", making it easy to\nexperiment with arbitrary training variations. You can accumulate the gradients\nover multiple batches before calling the optimizer, call the "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"backprop"}]},{"type":"text","value":" callback\nmultiple times (or not at all if the update is small), and inject arbitrary code\nto change or report gradients for particular layers. 
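For example, accumulating the\ngradients over several batches before calling the optimizer only takes a small\nchange to the basic loop – a sketch, reusing the names from the first example:"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Gradient accumulation (sketch)"},"children":[{"type":"text","value":"for i, (X, Y) in enumerate(train_batches):\n    Yh, backprop = model.begin_update(X)\n    loss, dYh = get_loss_and_gradient(Yh, Y)\n    backprop(dYh)\n    # Gradients accumulate across calls; only apply them every four batches\n    if (i + 1) % 4 == 0:\n        model.finish_update(optimizer)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"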
The implementation is quite\ntransparent, so you’ll find it easy to implement such arbitrary modifications if\nyou need to."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"h2","properties":{"id":"batching"},"children":[{"type":"text","value":"Batching "}]},{"type":"text","value":"\n"},{"type":"element","tagName":"infobox","properties":{},"children":[{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"A “minibatch” (or simply “batch” – we use the terms interchangeably) is just a\ngroup of samples that you update or predict over together. Batching the data is\nvery important: most neural network models converge much faster and achieve\nbetter accuracy when the gradients are calculated using multiple samples."}]},{"type":"text","value":"\n"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"Thinc implements two batching helpers via the backend object\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-backends#ops"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Ops"}]}]},{"type":"text","value":", typically used via "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"model.ops"}]},{"type":"text","value":". 
They should\ncover the most common batching needs for training and evaluation."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"ol","properties":{},"children":[{"type":"text","value":"\n"},{"type":"element","tagName":"li","properties":{},"children":[{"type":"element","tagName":"a","properties":{"href":"/docs/api-backends#minibatch"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"minibatch"}]}]},{"type":"text","value":": Iterate slices from a sequence,\noptionally shuffled."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"li","properties":{},"children":[{"type":"element","tagName":"a","properties":{"href":"/docs/api-backends#multibatch"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"multibatch"}]}]},{"type":"text","value":": Minibatch one or more\nsequences and yield lists with one batch per sequence."}]},{"type":"text","value":"\n"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Example"},"children":[{"type":"text","value":"batches = model.ops.minibatch(128, data, shuffle=True)\nbatches = model.ops.multibatch(128, train_X, train_Y, shuffle=True)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"The batching methods take sequences of data and process them as a stream. 
They\nreturn a "},{"type":"element","tagName":"a","properties":{"href":"/docs/api-types#sizedgenerator"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"SizedGenerator"}]}]},{"type":"text","value":", a simple custom\ndataclass for generators that has a "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"__len__"}]},{"type":"text","value":" and can repeatedly call the\ngenerator function. This also means that the batching works nicely with progress\nbars like "},{"type":"element","tagName":"a","properties":{"href":"https://github.com/tqdm/tqdm"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"tqdm"}]}]},{"type":"text","value":" and similar tools\nout-of-the-box."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"With progress bar","highlight":"1,4"},"children":[{"type":"text","value":"from tqdm import tqdm\n\ndata = model.ops.multibatch(128, train_X, train_Y, shuffle=True)\nfor X, Y in tqdm(data, leave=False):\n    Yh, backprop = model.begin_update(X)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"element","tagName":"a","properties":{"href":"/docs/api-types#sizedgenerator"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"SizedGenerator"}]}]},{"type":"text","value":" objects hold a reference to\nthe generator function and "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"call it repeatedly"}]},{"type":"text","value":", i.e. every time the sized\ngenerator is executed. 
This also means that the sized generator is "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"never\nconsumed"}]},{"type":"text","value":". If you like, you can define it once outside your training loop, and\non each iteration, the data will be "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"rebatched and reshuffled"}]},{"type":"text","value":"."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Option 1"},"children":[{"type":"text","value":"for i in range(10):\n    for X, Y in model.ops.multibatch(128, train_X, train_Y, shuffle=True):\n        # Update the model here\n    for X, Y in model.ops.multibatch(128, dev_X, dev_Y):\n        # Evaluate the model here\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Option 2"},"children":[{"type":"text","value":"train_data = model.ops.multibatch(128, train_X, train_Y, shuffle=True)\ndev_data = model.ops.multibatch(128, dev_X, dev_Y)\nfor i in range(10):\n    for X, Y in train_data:\n        # Update the model here\n    for X, Y in dev_data:\n        # Evaluate the model here\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"The "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"minibatch"}]},{"type":"text","value":" and "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"multibatch"}]},{"type":"text","value":" methods also support a "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"buffer"}]},{"type":"text","value":" argument, which\nmay be useful to 
promote better parallelism. If you’re using an engine that\nsupports asynchronous execution, such as PyTorch or\n"},{"type":"element","tagName":"a","properties":{"href":"https://github.com/google/jax"},"children":[{"type":"text","value":"JAX"}]},{"type":"text","value":", an unbuffered stream could cause the\nengine to block unnecessarily. If you think this may be a problem, try setting a\nhigher buffer, e.g. "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"buffer=500"}]},{"type":"text","value":", and see if it helps. You could\nalso simply consume the entire generator by calling "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"list()"}]},{"type":"text","value":" on it."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"Finally, "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"minibatch"}]},{"type":"text","value":" and "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"multibatch"}]},{"type":"text","value":" support "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"variable-length batching"}]},{"type":"text","value":",\nbased on a schedule you can provide as the "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"batch_size"}]},{"type":"text","value":" argument. Simply pass in\nan iterable. 
Variable-length batching is non-standard, but we regularly use it\nfor some of "},{"type":"element","tagName":"a","properties":{"href":"https://spacy.io"},"children":[{"type":"text","value":"spaCy"}]},{"type":"text","value":"’s models, especially the parser and entity\nrecognizer."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python"},"children":[{"type":"text","value":"from thinc.api import compounding\n\nbatch_size = compounding(1.0, 16.0, 1.001)\ntrain_data = model.ops.multibatch(batch_size, train_X, train_Y, shuffle=True)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"img","properties":{"src":"/b36f66116494729df412707bf30f8961/schedules_custom2.svg","alt":""},"children":[]},{"type":"text","value":"\n"},{"type":"element","tagName":"grid","properties":{},"children":[{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-ini"],"lang":"ini","title":"config","small":"true"},"children":[{"type":"text","value":"[batch_size]\n@schedules = \"compounding.v1\"\nstart = 1.0\nstop = 16.0\ncompound = 1.001\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Usage","small":"true"},"children":[{"type":"text","value":"from thinc.api import Config, registry\n\nconfig = Config().from_disk(\"./config.cfg\")\nresolved = registry.resolve(config)\nbatch_size = resolved[\"batch_size\"]\n"}]}]},{"type":"text","value":"\n"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"hr","properties":{},"children":[]},{"type":"text","value":"\n"},{"type":"element","tagName":"h2","properties":{"id":"evaluation"},"children":[{"type":"text","value":"Evaluation 
"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"Thinc does not provide utilities for calculating accuracy scores over either\nindividual samples or whole datasets. In most situations, you will make a loop\nover batches of your inputs and targets, "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"calculate the accuracy"}]},{"type":"text","value":" on the batch\nof data, and then "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"keep a tally of the scores"}]},{"type":"text","value":"."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python"},"children":[{"type":"text","value":"def evaluate(model, batch_size, Xs, Ys):\n    correct = 0.\n    total = 0.\n    for X, Y in model.ops.multibatch(batch_size, Xs, Ys):\n        correct += (model.predict(X).argmax(axis=0) == Y.argmax(axis=0)).sum()\n        total += X.shape[0]\n    return correct / total\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"During evaluation, take care to run your model "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"in a prediction context"}]},{"type":"text","value":" (as\nopposed to a training context), by using either the\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-model#predict"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Model.predict"}]}]},{"type":"text","value":" method, or by passing the\n"},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"is_train=False"}]},{"type":"text","value":" flag to 
"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-model#call"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Model.__call__"}]}]},{"type":"text","value":". Some layers\nmay behave differently during training and prediction in order to provide\nregularization. Dropout layers are the most common example."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"hr","properties":{},"children":[]},{"type":"text","value":"\n"},{"type":"element","tagName":"h2","properties":{"id":"losses"},"children":[{"type":"text","value":"Loss calculators "}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"When training your Thinc models, the most important loss calculation is not a\nscalar loss, but rather the "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"gradient of the loss with respect to your model\noutput"}]},{"type":"text","value":". That’s the figure you have to pass into the backprop callback. You\nactually don’t need to calculate the scalar loss at all, although it’s often\nhelpful as a diagnostic statistic."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"Thinc provides a few "},{"type":"element","tagName":"a","properties":{"href":"/docs/api-losses"},"children":[{"type":"text","value":"helpers for common loss functions"}]},{"type":"text","value":". Each\nhelper is provided as a class, so you can pass in any settings or\nhyperparameters that your loss might require. The helper class can be used as a\ncallable object, in which case it will return both the scalar loss and the\ngradient of the loss with respect to the outputs. 
You can also call the\n"},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"get_grad"}]},{"type":"text","value":" method to just get the gradients, or the "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"get_loss"}]},{"type":"text","value":" method to just\nget the scalar loss."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"grid","properties":{},"children":[{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Example","small":"true"},"children":[{"type":"text","value":"from thinc.api import CategoricalCrossentropy\nloss_calc = CategoricalCrossentropy()\ngrad, loss = loss_calc(guesses, truths)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-ini"],"lang":"ini","title":"config.cfg","small":"true"},"children":[{"type":"text","value":"[loss]\n@losses = \"CategoricalCrossentropy.v1\"\nnormalize = true\n"}]}]},{"type":"text","value":"\n"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"hr","properties":{},"children":[]},{"type":"text","value":"\n"},{"type":"element","tagName":"h2","properties":{"id":"schedules"},"children":[{"type":"text","value":"Setting learning rate schedules "}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"A common trick for stochastic gradient descent is to "},{"type":"element","tagName":"strong","properties":{},"children":[{"type":"text","value":"vary the learning rate or\nother hyperparameters"}]},{"type":"text","value":" over the course of training. 
Since there are many\npossible ways to vary the learning rate, Thinc lets you implement hyperparameter\nschedules as instances of the "},{"type":"element","tagName":"a","properties":{"href":"/docs/api-schedules#schedule"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Schedule"}]}]},{"type":"text","value":" class.\nThinc also provides a number of "},{"type":"element","tagName":"a","properties":{"href":"/docs/api-schedules"},"children":[{"type":"text","value":"popular schedules"}]},{"type":"text","value":"\nbuilt-in."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"You can use schedules directly by calling the schedule with the "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"step"}]},{"type":"text","value":" keyword\nargument and using it to update hyperparameters in your training loop. Since\nschedules are particularly common for optimization settings, the\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-optimizers"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Optimizer"}]}]},{"type":"text","value":" object accepts floats, lists, iterators, and\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-schedules#schedule"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Schedule"}]}]},{"type":"text","value":" instances for most of its parameters.\nWhen you call "},{"type":"element","tagName":"a","properties":{"href":"/docs/api-optimizers#step_schedules"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Optimizer.step_schedules"}]}]},{"type":"text","value":",\nthe optimizer will increase its step count and pass it to the schedules. 
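To call a schedule\ndirectly, just pass in the step – a quick sketch, reusing the compounding\nschedule from the batching section above:"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Calling a schedule directly (sketch)"},"children":[{"type":"text","value":"from thinc.api import compounding\n\nbatch_size = compounding(1.0, 16.0, 1.001)\nfor step in range(5):\n    # A schedule maps a step count to a value, compounding from 1.0 here\n    size = batch_size(step=step)\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"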
For\ninstance, this is how one creates an instance of the "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"Adam"}]},{"type":"text","value":" optimizer with a\ncustom learning rate schedule:"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Custom learning rate schedule"},"children":[{"type":"text","value":"from thinc.api import Adam, Schedule\n\ndef cycle():\n    values = [0.001, 0.01, 0.1]\n    all_values = values + list(reversed(values))\n    return Schedule(\"cycle\", _cycle_schedule, attrs={\"all_values\": all_values})\n\ndef _cycle_schedule(schedule: Schedule, step: int, **kwargs) -> float:\n    all_values = schedule.attrs[\"all_values\"]\n    return all_values[step % len(all_values)]\n\noptimizer = Adam(learn_rate=cycle())\nassert optimizer.learn_rate(optimizer.step) == 0.001\noptimizer.step_schedules()\nassert optimizer.learn_rate(optimizer.step) == 0.01\noptimizer.step_schedules()\nassert optimizer.learn_rate(optimizer.step) == 0.1\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"img","properties":{"src":"/dbd27f11a38d31bc8e818bb5f5c2737c/schedules_custom1.svg","alt":""},"children":[]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"You’ll often want to describe your optimization schedules in your configuration\nfile. 
That’s also very easy: you can use the\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/api-config#registry"},"children":[{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"@thinc.registry.schedules"}]}]},{"type":"text","value":" decorator to register\nyour function, and then refer to it in your config as the "},{"type":"element","tagName":"code","properties":{},"children":[{"type":"text","value":"learn_rate"}]},{"type":"text","value":" argument\nof the optimizer. Check out the\n"},{"type":"element","tagName":"a","properties":{"href":"/docs/usage-config"},"children":[{"type":"text","value":"documentation on config files"}]},{"type":"text","value":" for more examples."}]},{"type":"text","value":"\n"},{"type":"element","tagName":"grid","properties":{},"children":[{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-python"],"lang":"python","title":"Registered function","small":"true"},"children":[{"type":"text","value":"@thinc.registry.schedules(\"cycle.v1\")\ndef cycle(values):\n    all_values = values + list(reversed(values))\n    return Schedule(\"cycle\", _cycle_schedule, attrs={\"all_values\": all_values})\n\ndef _cycle_schedule(schedule: Schedule, step: int, **kwargs) -> float:\n    all_values = schedule.attrs[\"all_values\"]\n    return all_values[step % len(all_values)]\n"}]}]},{"type":"text","value":"\n"},{"type":"element","tagName":"pre","properties":{},"children":[{"type":"element","tagName":"code","properties":{"className":["language-ini"],"lang":"ini","title":"config.cfg","small":"true"},"children":[{"type":"text","value":"[optimizer]\n@optimizers = \"Adam.v1\"\n\n[optimizer.learn_rate]\n@schedules = \"cycle.v1\"\nvalues = [0.001, 0.01, 
0.1]\n"}]}]},{"type":"text","value":"\n"}]},{"type":"text","value":"\n"},{"type":"element","tagName":"hr","properties":{},"children":[]},{"type":"text","value":"\n"},{"type":"element","tagName":"h2","properties":{"id":"distributed"},"children":[{"type":"text","value":"Distributed training "}]},{"type":"text","value":"\n"},{"type":"element","tagName":"p","properties":{},"children":[{"type":"text","value":"We expect to recommend "},{"type":"element","tagName":"a","properties":{"href":"https://ray.io/"},"children":[{"type":"text","value":"Ray"}]},{"type":"text","value":" for distributed training. Ray\noffers a clean and simple API that fits well with Thinc’s model design. Full\nsupport is still under development."}]}],"data":{"quirksMode":false}},"frontmatter":{"title":"Training Models","teaser":null,"next":"/docs/usage-frameworks"}},"allMarkdownRemark":{"nodes":[{"fields":{"slug":"/docs/api-initializers"},"frontmatter":{"title":"Initializers"}},{"fields":{"slug":"/docs/api-config"},"frontmatter":{"title":"Config & Registry"}},{"fields":{"slug":"/docs/api-loss"},"frontmatter":{"title":"Loss Calculators"}},{"fields":{"slug":"/docs/api-model"},"frontmatter":{"title":"Model"}},{"fields":{"slug":"/docs/api-optimizers"},"frontmatter":{"title":"Optimizers"}},{"fields":{"slug":"/docs/api-schedules"},"frontmatter":{"title":"Schedules"}},{"fields":{"slug":"/docs/api-types"},"frontmatter":{"title":"Types & Dataclasses"}},{"fields":{"slug":"/docs/api-util"},"frontmatter":{"title":"Utilities & Extras"}},{"fields":{"slug":"/docs/backprop101"},"frontmatter":{"title":"Backpropagation 101"}},{"fields":{"slug":"/docs/concept"},"frontmatter":{"title":"Concept and Design"}},{"fields":{"slug":"/docs/"},"frontmatter":{"title":"Introduction"}},{"fields":{"slug":"/docs/install"},"frontmatter":{"title":"Installation & Setup"}},{"fields":{"slug":"/docs/usage-config"},"frontmatter":{"title":"Configuration System"}},{"fields":{"slug":"/docs/usage-frameworks"},"frontmatter":{"title":"PyTorch, 
TensorFlow & MXNet"}},{"fields":{"slug":"/docs/usage-models"},"frontmatter":{"title":"Defining and Using Models"}},{"fields":{"slug":"/docs/usage-sequences"},"frontmatter":{"title":"Variable-length sequences"}},{"fields":{"slug":"/docs/usage-training"},"frontmatter":{"title":"Training Models"}},{"fields":{"slug":"/docs/usage-type-checking"},"frontmatter":{"title":"Type Checking"}},{"fields":{"slug":"/docs/api-layers"},"frontmatter":{"title":"Layers"}},{"fields":{"slug":"/docs/api-backends"},"frontmatter":{"title":"Backends & Math"}}]},"headerTopRight":{"childImageSharp":{"fluid":{"base64":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAeCAYAAAAsEj5rAAAACXBIWXMAAAsTAAALEwEAmpwYAAAE5ElEQVRIx53Uf1DTZRwH8Pc2yj+6K//oyq46vbzqss660zO6FMW8y0KZBekhJGE/+akJIZAwkCnboo1tbEO+m9v4NVB+OYYwhuD4LUj+CAQFxnJM9t2GRhooyp7O/VNKf8jed597nr9e93mez3MP8J/ww8U4GiaBSFXP5GaUIoH89XTF8BVLxdUBoh26PN9H2z0z92c8t+dm7rdfc5DSPnthcY8Nlf12v2rrbSyIF/xKju8RjFgEL5XXdKRr+waIpuvSfHHPZU9Z7++keWjMo6u7+CBHYiYCmTmaJzWDLzOz/h8MEyGHLWQd3XwUvB3CmuyIfFLVMjDX1j1GznRcI2c6RzynW6940pNOzif/UEzS4nX+KTElSIkpZeZdGoT40uCjoGCPBPwDGmZmAgVBTuWX4t2y6bayPs9A88j8hYYhz2DLKGkt65+PD5KS6I/zXPFB+ctitorx3YZcBnVlGJHGHuajYIQY06IWGFuHodaeha2gvf96QTuxKto8FkXbg0l111zTz/q7X/v/QmI3iroPb9LiCCHg6cxeaH/nb491GCnBJPc07LxGjPIbnnEcNqjoLMMfdJbhjiOzjvyZXU+qvi0ncYHy7rQt8ncSNknBKWz2c884EWjoRGbPxeQF93jrWDum8814UHQeLoUZd8I1flPJ1W+6D1azb6bUZNcl1iYRNQd7PypAUXLVU3sJDUIIK7X7ojbE0EkWgFaqHZN8I2yyVowV9+DeTgo3kyrh/qkK02nVKN2tAz3S+Ar3ixK/Do4eu8j0q3FdF84FVbeRdzN09xaATk4d6CwDnBl6796VrocrrZbhSK3y6+XKcC21dMV4YeWaicRyWKPLN9T82kCHqprIGwc0sx9uFizs8PFMpdTAEV0OS7gSljAK4zuppYQQuLcrVzh2qKb/DqLIWfaxu1sDBSR9i9iIJ8lYgByWPSpYIpSIMklBB1FwslXrncFKMsVWEet2iii3KaR40njB3UpvXQ+hYA9Vwsl+WKoMF1s5aA85HlXjL3w4HCwqdlsT3KvlcK+SwemvwK1VMtij1HAVnEL/pvxnyXvCxYE3XCZM0M2YmDoDW0Yxw2rQw5pdtsoSqZKMrZetHAuULx60DzVikphh6zDAVnsqwiqqEI0naJZPfVMEXZzab3FHpk1wkl7cIwT2YeMhm8kQZ6Eq4EovxzZEIne/xocOaRPsTtOSG3bTCltDHd5CCD5/LhZZ6zjLc7NO7ls86DR51xt2E2zGOmb+vuMswR4ZBFGKYF6U4u6iQW+5TbBPNDGss2qM3K/CrDESNYQwue9zl
sKXjDuUGKeVsDgob4071QgQaplCRiR8Bi0PQZqChS5cMmrTLovo7QC12If9L6jCOK1i2mgK12mKc5U+TpL6zx3REOLjkWkVRq0qBiFazBLtB/oTiuEtB7Rka3HzpM/gxIwSrfUFDMtVJYBcxK/lxXz6mfSmT+CF88dQX5SPCp4QjQV5rF5Syfrk5VR0rRG96BOYiu3gBnCRuTIJnBfiwU/XeX+aJiT6NpQf8Ta4G7mMzNcSkfF87EsCbuXBEpfz9RIy5xvIj5SBF57P4u2SgBcm4eSESYmooFHNSynxDRRTDRBTjQzpibOQVppX54TmlWcH5wYcXsfxDcyT6iEpaYG0toMhPWkGP0qB7HApstYewj+BxMBZvJr8CAAAAABJRU5ErkJggg==","aspectRatio":0.6756756756756757,"src":"/static/a592cfceebdbf105bac40baa898f12a9/53f65/landing_top-right.png","srcSet":"/static/a592cfceebdbf105bac40baa898f12a9/a90ce/landing_top-right.png 125w,\n/static/a592cfceebdbf105bac40baa898f12a9/002c1/landing_top-right.png 250w,\n/static/a592cfceebdbf105bac40baa898f12a9/53f65/landing_top-right.png 500w,\n/static/a592cfceebdbf105bac40baa898f12a9/f26e3/landing_top-right.png 750w,\n/static/a592cfceebdbf105bac40baa898f12a9/5d2c5/landing_top-right.png 1000w,\n/static/a592cfceebdbf105bac40baa898f12a9/6050d/landing_top-right.png 1200w","sizes":"(max-width: 500px) 100vw, 500px"}}}},"pageContext":{"slug":"/docs/usage-training"}},"staticQueryHashes":["34836940","3699375715"]}