Pickling error - cannot load on transformers==4.37.0.dev0

#3
by danielhanchen - opened

Great work again on TinyLlama!

I tried loading this on transformers==4.37.0.dev0 and sadly it doesn't work :( The Chat model loads successfully, though!

```
config.json: 100% 560/560 [00:00<00:00, 36.3kB/s]
pytorch_model.bin: 100% 4.40G/4.40G [00:18<00:00, 239MB/s]

---------------------------------------------------------------------------

UnpicklingError                           Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in load_state_dict(checkpoint_file)
    518 
--> 519         return torch.load(checkpoint_file, map_location=map_location, weights_only=True)
    520     except Exception as e:

6 frames

UnpicklingError: Weights only load failed. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution.Do it only if you get the file from a trusted source. WeightsUnpickler error: Unsupported operand 149


During handling of the above exception, another exception occurred:

UnicodeDecodeError                        Traceback (most recent call last)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 70: invalid start byte


During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)

/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in load_state_dict(checkpoint_file)
    533                     ) from e
    534         except (UnicodeDecodeError, ValueError):
--> 535             raise OSError(
    536                 f"Unable to load weights from pytorch checkpoint file for '{checkpoint_file}' "
    537                 f"at '{checkpoint_file}'. "

OSError: Unable to load weights from pytorch checkpoint file for '/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin' at '/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
```
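
For what it's worth, the failure reproduces outside transformers by calling torch.load directly on the cached file with the safe loader. A minimal sketch (the path is the cache path from the traceback above):

```
import torch

path = ("/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T"
        "/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin")

try:
    # Mirrors the call transformers 4.37 makes internally; fails the same way.
    torch.load(path, map_location="cpu", weights_only=True)
except Exception as e:
    print("weights_only load failed:", e)

# The unsafe load succeeds, but it can execute arbitrary pickle code,
# so only use it on files from a trusted source.
state_dict = torch.load(path, map_location="cpu", weights_only=False)
```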

I also tried from_tf = True:
```

OSError Traceback (most recent call last)

in <cell line: 25>()
     23     bnb_4bit_compute_dtype = dtype,
     24 )
---> 25 model = AutoModelForCausalLM.from_pretrained(
     26     model_name,
     27     device_map = "sequential",

1 frames

/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3351                 )
   3352             else:
-> 3353                 raise EnvironmentError(
   3354                     f"{pretrained_model_name_or_path} does not appear to have a file named"
   3355                     f" {_add_variant(WEIGHTS_NAME, variant)}, {TF2_WEIGHTS_NAME}, {TF_WEIGHTS_NAME} or"

OSError: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
```
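
For context, here is a reconstruction of the notebook cell behind that traceback. Only the fragments visible above come from the original; model_name, dtype, and the remaining BitsAndBytesConfig fields are assumptions:

```
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_name = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
dtype = torch.bfloat16

# 4-bit quantized load; only the compute-dtype line appears in the traceback,
# so the other fields are a plausible reconstruction.
bnb_config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_compute_dtype = dtype,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map = "sequential",
    quantization_config = bnb_config,
    from_tf = True,  # the attempted workaround; fails because no TF checkpoint exists
)
```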


Again great work!

Can you share the code you used for loading the model? :)

Interestingly, I just noticed that the stable release, transformers==4.36.2, works fine! It might be an issue only with 4.37.

Oh, my code is literally the standard loading code:

```
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T",
    torch_dtype = torch.bfloat16,
)
```

I'm getting the same issue.

@manishiitg I reuploaded it at https://huggingface.co/unsloth/tinyllama to work with the transformers dev branch!

This is caused by the safe-loading feature introduced in transformers==4.37, which adds the weights_only=True argument to the torch.load call. To fix this issue, you can simply load and re-save the model using the latest PyTorch:

```
import torch

# A plain load succeeds here; re-saving with a recent PyTorch writes the
# current serialization format, which the weights_only unpickler accepts.
model = torch.load('pytorch_model.bin')
torch.save(model, 'pytorch_model.bin')
```
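
To check the fix, the re-saved file should now load under the same safe loader that transformers 4.37 uses. A minimal sketch:

```
import torch

# If this succeeds, from_pretrained on transformers 4.37 will accept the file too,
# since it calls torch.load with weights_only=True internally.
state_dict = torch.load('pytorch_model.bin', map_location='cpu', weights_only=True)
print(f"OK, loaded {len(state_dict)} entries")
```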

I can confirm the issue, and can confirm that loading and re-saving fixes it.
