Loading models with Hugging Face AutoModel.from_pretrained

sachinruk · February 10, 2021, 6:05am

I have a model that looks like the following:

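from typing import Callable, Dict

import pytorch_lightning as pl
from torchmetrics import Metric  # assumption: older Lightning versions exposed this as pl.metrics.Metric
from transformers import AutoModel, AutoTokenizer

class Model(pl.LightningModule):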
    def __init__(
        self,
        base: str,
        embedding_dim: int,
        num_classes: int,
        n_hidden: int,
        loss_fn: Callable,
        lr: float,
        metrics: Dict[str, Metric],
    ):
        super().__init__()
        self.base = AutoModel.from_pretrained(base)
        self.tokenizer = AutoTokenizer.from_pretrained(base)

I can see how to load a model using the docs. However, since the base parameter is a string (the name of a HF model), I am a bit worried that __init__ will run the AutoModel.from_pretrained(base) line and start downloading the Hugging Face model before the weights from the state_dict are loaded over it.

The pod that I’m using does not have access to the internet, so this will fail in prod. Regardless, I don’t want it to download a model every time this pod spins up (since it’s an hourly job). Any thoughts on this?

goku · February 15, 2021, 7:12pm

I remember there is a method called AutoModel.from_config that builds the model from a configuration object and initializes it with random weights, so no pretrained weights are downloaded. Then you can simply use Lightning’s load_from_checkpoint to load the trained weights.
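
A minimal sketch of that idea, in case it helps. It assumes the model's config.json has been saved to a local directory ahead of time. The tiny BertConfig and the temporary directory below are stand-ins chosen only so the snippet runs without network access; they are not from the original post.

```python
import tempfile

from transformers import AutoConfig, AutoModel, BertConfig

# One-time step, done wherever the config is available: save only the small
# config file. A tiny BertConfig is used here as a hypothetical stand-in for
# the real base model's config.
tiny_config = BertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
config_dir = tempfile.mkdtemp()  # hypothetical local path baked into the pod
tiny_config.save_pretrained(config_dir)  # writes config.json

# In the offline pod: read the config from the local directory, then build
# the architecture with randomly initialized weights. No weights are fetched.
config = AutoConfig.from_pretrained(config_dir)
base = AutoModel.from_config(config)

num_params = sum(p.numel() for p in base.parameters())
print(type(base).__name__, num_params)
```

Inside Model.__init__, the from_pretrained(base) call would then become something like AutoModel.from_config(AutoConfig.from_pretrained(config_dir)), and load_from_checkpoint would overwrite the random weights with the trained ones from the checkpoint's state_dict. The tokenizer presumably needs the same treatment: save it once with tokenizer.save_pretrained to a local directory and point AutoTokenizer.from_pretrained at that path.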