fix tokenizer initialization issue AttributeError: can't set attribute

#99
by katuni4ka - opened
No description provided.
katuni4ka changed pull request status to closed
katuni4ka changed pull request status to open

After applying the update above, I get this error. Can you help me? Thank you!

[INFO|trainer.py:1721] 2023-11-15 17:52:14,029 >> Number of trainable parameters = 14,680,064
0%| | 0/3000 [00:00<?, ?it/s]
11/15/2023 17:52:15 - WARNING - transformers_modules.chatglm-6b.modeling_chatglm - use_cache=True is incompatible with gradient checkpointing. Setting use_cache=False...
Traceback (most recent call last):
  File "main.py", line 470, in <module>
    main()
  File "main.py", line 401, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\transformers\trainer.py", line 1553, in train
    return inner_training_loop(
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\transformers\trainer.py", line 1835, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\transformers\trainer.py", line 2679, in training_step
    loss = self.compute_loss(model, inputs)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\transformers\trainer.py", line 2704, in compute_loss
    outputs = model(**inputs)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\peft\peft_model.py", line 977, in forward
    return self.base_model(
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\peft\tuners\tuners_utils.py", line 106, in forward
    return self.model.forward(*args, **kwargs)
  File "C:\Users\PC/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 1190, in forward
    transformer_outputs = self.transformer(
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\PC/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 985, in forward
    layer_ret = torch.utils.checkpoint.checkpoint(
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\utils\checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\autograd\function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\utils\checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\PC/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 627, in forward
    attention_outputs = self.attention(
  File "C:\Users\PC\.conda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\PC/.cache\huggingface\modules\transformers_modules\chatglm-6b\modeling_chatglm.py", line 461, in forward
    position_ids = position_ids[:, 0, :].transpose(0, 1).contiguous()
IndexError: too many indices for tensor of dimension 2
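
A side note on this traceback: the IndexError is a separate problem from the tokenizer fix. The failing line indexes position_ids with three indices, so the loaded model code expects ChatGLM's 2D positional encoding, i.e. a position_ids tensor of shape (batch, 2, seq_len), while a plain (batch, seq_len) tensor was passed in. A minimal sketch of the shape mismatch (the sizes below are illustrative assumptions, not values from this training run):

import torch

seq_len = 8
position_ids_2d = torch.arange(seq_len).unsqueeze(0)  # shape (1, 8)
# position_ids_2d[:, 0, :]  # IndexError: too many indices for tensor of dimension 2

# The failing line only works on a 3-D tensor of shape (batch, 2, seq_len):
position_ids_3d = torch.stack(
    [torch.arange(seq_len), torch.zeros(seq_len, dtype=torch.long)]
).unsqueeze(0)  # shape (1, 2, 8)
print(position_ids_3d[:, 0, :].transpose(0, 1).contiguous().shape)  # torch.Size([8, 1])

One plausible cause is a mismatch between the data preprocessing and the cached modeling_chatglm.py revision, with one side producing one id per token and the other expecting the older 2D format.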

For me, checking whether the key exists before popping it avoids the error in some cases.
chatglm2-6b$ diff -ur tokenization_chatglm.py.backup tokenization_chatglm.py
--- tokenization_chatglm.py.backup 2023-11-12 21:27:40.798629427 +0800
+++ tokenization_chatglm.py 2023-11-13 18:05:34.877616175 +0800
@@ -70,6 +70,12 @@
         self.vocab_file = vocab_file
         self.tokenizer = SPTokenizer(vocab_file)
+        if "eos_token" in kwargs:
+            kwargs.pop("eos_token")
+        if "pad_token" in kwargs:
+            kwargs.pop("pad_token")
+        if "unk_token" in kwargs:
+            kwargs.pop("unk_token")
         self.special_tokens = {
             "<bos>": self.tokenizer.bos_id,
             "<eos>": self.tokenizer.eos_id,

@iampeterchen The dict.pop(key, default_value) call used in this pull request is equivalent to what you wrote, just shorter.
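
For anyone comparing the two forms, a quick illustration of what the default argument changes:

kwargs = {"eos_token": "</s>"}
kwargs.pop("eos_token", None)  # returns "</s>" and removes the key
kwargs.pop("pad_token", None)  # key absent: returns None instead of raising
# kwargs.pop("pad_token")      # key absent, no default: raises KeyError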

I wasn't aware of the difference between kwargs.pop("pad_token") and kwargs.pop("pad_token", None) before. Thank you for the shorter form.
