Schedules
Schedules are generators that provide different rates, schedules, decays or series. They’re typically used for batch sizes or learning rates. You can easily implement your own schedules as well: just write your own Schedule implementation that produces whatever series of values you need. A common use case for schedules is within Optimizer objects, which accept iterators for most of their parameters. See the training guide for details.
Schedule class (New: v9)
Class for implementing Thinc schedules.
Typing
Schedule can be used as a generic type with one parameter. This parameter specifies the type that is returned by the schedule. For instance, Schedule[int] denotes a scheduler that returns integers when called. A mismatch will cause a type error. For more details, see the docs on type checking.
from thinc.api import Schedule
def my_function(schedule: Schedule[int]):
...
Attributes
Name | Type | Description |
---|---|---|
name | str | The name of the scheduler type. |
Properties
Name | Type | Description |
---|---|---|
attrs | Dict[str, Any] | The scheduler attributes. You can use the dict directly and assign to it – but you cannot reassign schedule.attrs to a new variable: schedule.attrs = {} will fail. |
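A minimal sketch of why `schedule.attrs = {}` fails while in-place mutation works: exposing the dict through a getter-only property. This is an illustration of the documented behavior, not Thinc's actual implementation.

```python
class MySchedule:
    """Illustrative stand-in for a schedule with a read-only attrs property."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    @property
    def attrs(self):
        # Getter only: no setter is defined, so reassignment raises.
        return self._attrs


s = MySchedule({"rate": 0.1})
s.attrs["rate"] = 0.2   # mutating the dict in place works
try:
    s.attrs = {}        # reassignment fails with AttributeError
except AttributeError:
    pass
```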
Schedule.__init__ method
Initialize a new schedule.
Example:
schedule = Schedule(
    "constant",
    constant_schedule,
    attrs={"rate": rate},
)
Argument | Type | Description |
---|---|---|
name | str | The name of the schedule type. |
schedule | Callable | Function to compute the schedule value for a given step. |
keyword-only | ||
attrs | Dict[str, Any] | Dictionary of non-parameter attributes. |
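Putting the constructor arguments above together, a custom schedule starts from a plain value function. The function below is illustrative (a hypothetical step decay); the Schedule wrapping is shown only as a comment, since the exact calling convention Thinc uses for the wrapped callable is not spelled out above.

```python
# Hypothetical schedule function: halve a base rate every 1000 steps.
# The (step, **kwargs) signature mirrors Schedule.__call__ as documented;
# the convention Thinc uses internally for the wrapped callable may differ.
def step_decay(step, **kwargs):
    base_rate = 0.01
    return base_rate * 0.5 ** (step // 1000)


assert step_decay(0) == 0.01
assert step_decay(1000) == 0.005

# Wrapping for Thinc (sketch; requires thinc installed):
# from thinc.api import Schedule
# schedule = Schedule("step_decay", step_decay, attrs={"base_rate": 0.01})
```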
Schedule.__call__ method
Call the schedule function, returning the value for the given step. The step
positional argument is always required. Some schedules may require additional
keyword arguments.
Example:
from thinc.api import constant
schedule = constant(0.1)
assert schedule(0) == 0.1
assert schedule(1000) == 0.1
Argument | Type | Description |
---|---|---|
step | int | The step to compute the schedule for. |
**kwargs | | Optional arguments passed to the schedule. |
RETURNS | Any | The schedule value for the step. |
Schedule.to_generator method
Turn the schedule into a generator by passing a monotonically increasing step count into the schedule.
Example:
from thinc.api import constant
g = constant(0.1).to_generator()
assert next(g) == 0.1
assert next(g) == 0.1
Argument | Type | Description |
---|---|---|
start | int | The initial schedule step. Defaults to 0. |
step_size | int | The amount to increase the step by for each generated value. Defaults to 1. |
**kwargs | | Optional arguments passed to the schedule. |
RETURNS | Generator[OutT, None, None] | The generator. |
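The documented behavior can be sketched in a few lines of pure Python (an illustration, not Thinc's implementation): feed a monotonically increasing step count into the schedule and yield each value.

```python
def to_generator(schedule, start=0, step_size=1, **kwargs):
    # Sketch: call the schedule with an ever-increasing step count.
    step = start
    while True:
        yield schedule(step, **kwargs)
        step += step_size


# Illustrative constant schedule as a plain callable:
g = to_generator(lambda step: 0.1)
assert next(g) == 0.1
assert next(g) == 0.1
```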
constant function
Yield a constant rate.
from thinc.api import constant
batch_sizes = constant(0.001)
batch_size = batch_sizes(step=0)
Config:
[batch_size]
@schedules = "constant.v1"
rate = 0.001
Argument | Type |
---|---|
rate | float |
YIELDS | float |
constant_then function
Yield a constant rate for N steps, before starting a schedule.
from thinc.api import constant_then, decaying
learn_rates = constant_then(
0.005,
1000,
decaying(0.005, 1e-4)
)
learn_rate = learn_rates(step=0)
Config:
[learn_rates]
@schedules = "constant_then.v1"
rate = 0.005
steps = 1000
[learn_rates.schedule]
@schedules = "decaying.v1"
base_rate = 0.005
decay = 1e-4
Argument | Type |
---|---|
rate | float |
steps | int |
schedule | Schedule[float] |
YIELDS | float |
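A minimal pure-Python sketch of the behavior described above: hold the constant rate for the first steps, then defer to the wrapped schedule. Whether Thinc passes the wrapped schedule the global step or an offset step is not specified here; this sketch passes the global step.

```python
def constant_then_value(rate, steps, wrapped, step):
    # Constant before `steps`, then values from the wrapped schedule.
    if step < steps:
        return rate
    return wrapped(step)


# Illustrative wrapped schedule: the decaying formula from this page.
decay = lambda step: 0.005 / (1 + 1e-4 * step)
assert constant_then_value(0.005, 1000, decay, 0) == 0.005
assert constant_then_value(0.005, 1000, decay, 999) == 0.005
assert constant_then_value(0.005, 1000, decay, 1000) == decay(1000)
```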
decaying function
Yield an infinite series of linearly decaying values, following the schedule base_rate * 1 / (1 + decay * t).
from thinc.api import decaying
learn_rates = decaying(0.005, 1e-4)
learn_rate = learn_rates(step=0)  # 0.005
learn_rate = learn_rates(step=1)  # ~0.0049995
Config:
[learn_rate]
@schedules = "decaying.v1"
base_rate = 0.005
decay = 1e-4
t = 0
Argument | Type |
---|---|
base_rate | float |
decay | float |
keyword-only | |
t | int |
YIELDS | float |
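The formula above can be computed directly. This sketch assumes the keyword-only t simply offsets the step count.

```python
def decaying_value(base_rate, decay, step, t=0):
    # base_rate * 1 / (1 + decay * t), with t offsetting the step.
    return base_rate / (1.0 + decay * (step + t))


assert decaying_value(0.005, 1e-4, 0) == 0.005
assert abs(decaying_value(0.005, 1e-4, 1) - 0.005 / 1.0001) < 1e-12
```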
compounding function
Yield an infinite series of compounding values. Each time the generator is called, a value is produced by multiplying the previous value by the compound rate.
from thinc.api import compounding
batch_sizes = compounding(1.0, 32.0, 1.001)
batch_size = batch_sizes(step=0) # 1.0
batch_size = batch_sizes(step=1) # 1.0 * 1.001
Config:
[batch_size]
@schedules = "compounding.v1"
start = 1.0
stop = 32.0
compound = 1.001
t = 0
Argument | Type |
---|---|
start | float |
stop | float |
compound | float |
keyword-only | |
t | int |
YIELDS | float |
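A pure-Python sketch of the series, assuming start <= stop and that stop acts as a ceiling on the compounded value (the clipping direction is an assumption, not stated above).

```python
def compounding_value(start, stop, compound, step, t=0):
    # Multiply start by compound once per step, clipping at stop.
    value = start * compound ** (step + t)
    return min(value, stop)


assert compounding_value(1.0, 32.0, 1.001, 0) == 1.0
assert compounding_value(1.0, 32.0, 1.001, 1) == 1.001
```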
warmup_linear function
Generate a series that ramps up to an initial rate over a warmup period and then declines linearly. Used for learning rates.
from thinc.api import warmup_linear
learn_rates = warmup_linear(0.01, 3000, 6000)
learn_rate = learn_rates(step=0)
Config:
[learn_rate]
@schedules = "warmup_linear.v1"
initial_rate = 0.01
warmup_steps = 3000
total_steps = 6000
Argument | Type |
---|---|
initial_rate | float |
warmup_steps | int |
total_steps | int |
YIELDS | float |
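A sketch of the common warmup-then-linear-decay shape implied above: ramp linearly from zero to initial_rate over warmup_steps, then decline linearly to zero at total_steps. Thinc's exact boundary handling may differ.

```python
def warmup_linear_value(initial_rate, warmup_steps, total_steps, step):
    if step < warmup_steps:
        # Linear warmup: 0 -> initial_rate.
        factor = step / max(1, warmup_steps)
    else:
        # Linear decay: initial_rate -> 0 at total_steps.
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return initial_rate * factor


assert warmup_linear_value(0.01, 3000, 6000, 0) == 0.0
assert warmup_linear_value(0.01, 3000, 6000, 3000) == 0.01
assert warmup_linear_value(0.01, 3000, 6000, 6000) == 0.0
```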
slanted_triangular function
Yield an infinite series of values according to Howard and Ruder’s (2018) “slanted triangular learning rate” schedule.
from thinc.api import slanted_triangular
learn_rates = slanted_triangular(0.1, 5000)
learn_rate = learn_rates(step=0)
Config:
[learn_rate]
@schedules = "slanted_triangular.v1"
max_rate = 0.1
num_steps = 5000
cut_frac = 0.1
ratio = 32
decay = 1.0
t = 0.1
Argument | Type |
---|---|
max_rate | float |
num_steps | int |
keyword-only | |
cut_frac | float |
ratio | int |
decay | float |
t | float |
YIELDS | float |
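The slanted triangular schedule can be sketched as follows; this follows the formulation in Howard and Ruder (2018) rather than Thinc's exact code, and omits the decay and t parameters for brevity.

```python
import math


def slanted_triangular_value(max_rate, num_steps, step, cut_frac=0.1, ratio=32):
    # Short linear increase for cut_frac of the steps, then a long
    # linear decay; the rate spans max_rate / ratio ... max_rate.
    cut = math.floor(num_steps * cut_frac)
    if step < cut:
        p = step / cut
    else:
        p = 1 - (step - cut) / (cut * (1 / cut_frac - 1))
    return max_rate * (1 + p * (ratio - 1)) / ratio


# The peak rate is reached at the cut point:
assert slanted_triangular_value(0.1, 5000, 500) == 0.1
```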
cyclic_triangular function
Yield a rate that increases linearly and then decreases linearly within each cycle.
from thinc.api import cyclic_triangular
learn_rates = cyclic_triangular(0.001, 0.005, 1000)
learn_rate = learn_rates(step=0)
Config:
[learn_rate]
@schedules = "cyclic_triangular.v1"
min_lr = 0.001
max_lr = 0.005
period = 1000
Argument | Type |
---|---|
min_lr | float |
max_lr | float |
period | int |
YIELDS | float |
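A sketch of the triangular cyclic policy (after Smith, 2017). It assumes period is the half-cycle length, i.e. the number of steps to go from min_lr up to max_lr; whether Thinc defines period this way is an assumption.

```python
import math


def cyclic_triangular_value(min_lr, max_lr, period, step):
    # Rise from min_lr to max_lr over `period` steps, then fall back,
    # repeating (triangular cyclic learning rate).
    cycle = math.floor(1 + step / (2 * period))
    x = abs(step / period - 2 * cycle + 1)
    return min_lr + (max_lr - min_lr) * max(0.0, 1 - x)


assert cyclic_triangular_value(0.001, 0.005, 1000, 0) == 0.001
assert abs(cyclic_triangular_value(0.001, 0.005, 1000, 1000) - 0.005) < 1e-12
```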
plateau function (New: v9)
Yields values from the wrapped schedule, exponentially scaled by the number of times optimization has plateaued. The caller must pass model evaluation scores through the last_score argument for the scaling to be adjusted. The last evaluation score is passed as a tuple (last_score_step, last_score), indicating when the model was last evaluated (last_score_step) and with what score (last_score).
from thinc.api import constant, plateau
schedule = plateau(2, 0.5, constant(1.0))
assert schedule(step=0, last_score=(0, 1.0)) == 1.0
assert schedule(step=1, last_score=(1, 1.0)) == 1.0
assert schedule(step=2, last_score=(2, 1.0)) == 0.5
assert schedule(step=3, last_score=(3, 1.0)) == 0.5
assert schedule(step=4, last_score=(4, 1.0)) == 0.25
Config:
[learn_rate]
@schedules = "plateau.v1"
scale = 0.5
max_patience = 2
[learn_rate.schedule]
@schedules = "constant.v1"
rate = 1.0
Argument | Type | Description |
---|---|---|
max_patience | int | Number of evaluations without an improvement to consider the model to have plateaued. |
scale | float | Scale to apply to the learning rate when the model has plateaued. This scale is cumulative – if the model plateaued n times, the effective scale is scale**n . |
schedule | Schedule[float] | The schedule to wrap. |
RETURNS | Schedule[float] | The wrapped schedule. |
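The plateau-counting logic can be sketched in pure Python (an illustration of the behavior described above, not Thinc's implementation): track the best score seen so far, and each time max_patience evaluations pass without an improvement, count one plateau. The effective scale is then scale ** n_plateaus.

```python
def count_plateaus(scores, max_patience):
    # Each run of `max_patience` evaluations without improvement over
    # the best score so far counts as one plateau.
    best = float("-inf")
    patience = 0
    plateaus = 0
    for score in scores:
        if score > best:
            best = score
            patience = 0
        else:
            patience += 1
            if patience >= max_patience:
                plateaus += 1
                patience = 0
    return plateaus


# With a flat score and max_patience=2, five evaluations yield two
# plateaus, so the effective scale is 0.5 ** 2 = 0.25, matching the
# example above.
assert count_plateaus([1.0, 1.0, 1.0, 1.0, 1.0], 2) == 2
```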