
lr_warmup should not be passed when adafactor is used as the optimizer #617

@martianunlimited

Description


It's not really a major bug, more of an inconvenience: the GUI passes lr_warmup to the training command whenever it is non-zero, which causes library/train_util.py to raise a ValueError if the optimizer is AdaFactor. The error is raised midway through the run, after the latents are cached and just before the dataloader is created, rather than at the start of train_network. This wastes time and makes it harder for users who are not used to reading stack traces to pick out what caused the error.

ValueError: adafactor:0.0001 does not require num_warmup_steps. Set None or 0.

Suggested fixes, in order of preference:
a) Pop up a warning in the GUI if the user did not set lr_warmup to 0 when the optimizer is set to AdaFactor (recommended)
or
b) Raise an error in train_network.py when an invalid combination of optimizer and lr_warmup is used (see the sketch below)
c) Change train_util.py to raise a warning and ignore the lr_warmup value (not recommended)

Unless there are differing opinions as to why a) and b) are not the way to go, I will go ahead and send a pull request for a) and b) over the weekend with said "fix".
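For reference, here is a rough sketch of what the early check in b) could look like. The function name and the argument names (optimizer_type, lr_warmup_steps) are assumptions about the training script's argparse setup, not the actual implementation:

```python
# Hypothetical fail-fast check, to be called near the top of train_network.py
# before latents are cached. Argument names are assumed, not taken from the repo.
def validate_warmup_args(args):
    optimizer = (getattr(args, "optimizer_type", "") or "").lower()
    warmup_steps = getattr(args, "lr_warmup_steps", 0) or 0
    if optimizer == "adafactor" and warmup_steps > 0:
        raise ValueError(
            "AdaFactor does not use num_warmup_steps; "
            "set lr_warmup to 0 or choose a different optimizer."
        )
```

This would leave the existing ValueError in train_util.py untouched and simply surface the same problem before any time is spent on caching.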
