MultiSteps & state.steps & warmup #324
Hello,
Two questions regarding MultiSteps:
Replies: 1 comment 1 reply
Hello @agemagician!
Taking a look at:
https://github.com/deepmind/optax/blob/3fb68179604e349c3083ad12cd2e38ff8713f613/optax/_src/wrappers.py#L184
It looks like the inner optimizer is only called each time "final_step" is called. Since optax works by chaining together GradientTransformations, and usually the step count is used by GradientTransformations (such as the learning rate schedule), the answer to your question depends on how the GradientTransformations are chained together:
e.g. if the schedule is applied before the multi-step transformation, it will be applied once on every step (whether or not accumulation is done), but if it's chained after the multi-step transformation, it will be applied once on every accumulated step. Does this help?
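For concreteness, here is a minimal sketch of the two placements (the `k=4` and schedule values are purely illustrative). In the first, the schedule is chained before the MultiSteps transformation, so its step count advances on every micro-step; in the second, the schedule lives inside the optimizer that MultiSteps wraps, so it only runs when an accumulated update is actually applied:

```python
import optax

k = 4  # illustrative: accumulate gradients over 4 micro-steps
schedule = optax.linear_schedule(init_value=1e-3, end_value=0.0, transition_steps=10_000)

# Schedule chained before the multi-step transformation: its update runs on
# every micro-step, so its step count advances whether or not an accumulated
# update is applied on that step.
opt_before = optax.chain(
    optax.scale_by_schedule(schedule),
    optax.MultiSteps(
        optax.chain(optax.scale_by_adam(), optax.scale(-1.0)),
        every_k_schedule=k,
    ).gradient_transformation(),
)

# Schedule inside the optimizer that MultiSteps wraps: the inner update (and
# hence the schedule's step count) only runs when MultiSteps performs a real
# update, i.e. once every k micro-steps.
opt_inside = optax.MultiSteps(
    optax.chain(
        optax.scale_by_adam(),
        optax.scale_by_schedule(schedule),
        optax.scale(-1.0),
    ),
    every_k_schedule=k,
)
```

In both cases you use `init`/`update` as usual (either on the MultiSteps object or on the chained transformation); the only difference is which step counter the schedule reads.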