About self.scale_mask_softmax in modeling_chatglm.py

#91
by shl97 - opened

What is the meaning and definition of self.scale_mask_softmax in modeling_chatglm.py?
It is used at lines 300-302:
if self.scale_mask_softmax:
    self.scale_mask_softmax.scale = query_key_layer_scaling_coeff
    attention_probs = self.scale_mask_softmax(attention_scores, attention_mask.contiguous())
So self.scale_mask_softmax is expected to have a scale attribute and to be callable like a function, but the only other place it appears is line 378, where it is assigned as self.scale_mask_softmax = None.
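
To make the pattern concrete, here is a minimal plain-Python sketch of an object that carries a scale attribute, can be called like a function, and is skipped entirely when the attribute is left as None. The class names ScaleMaskSoftmax and Attention, the return value "fallback path", the mask convention (True means the position is masked out) and the -10000.0 fill value are all assumptions made for illustration; none of this is taken from modeling_chatglm.py.

```python
import math


class ScaleMaskSoftmax:
    """Hypothetical stand-in: a callable object that also carries a `scale` attribute."""

    def __init__(self, scale=None):
        self.scale = scale

    def __call__(self, scores, mask):
        # scores: list of floats; mask: list of bools, True = masked out (assumption).
        if self.scale is not None:
            scores = [s * self.scale for s in scores]
        # Push masked positions to a very negative value so they get ~0 probability.
        scores = [-10000.0 if m else s for s, m in zip(scores, mask)]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


class Attention:
    """Hypothetical owner of the attribute, mirroring the `if` check quoted above."""

    def __init__(self, softmax_module=None):
        # With the default of None, the `if self.scale_mask_softmax:` branch never runs.
        self.scale_mask_softmax = softmax_module

    def probs(self, scores, mask, coeff):
        if self.scale_mask_softmax:
            self.scale_mask_softmax.scale = coeff
            return self.scale_mask_softmax(scores, mask)
        return "fallback path"  # placeholder for whatever the other branch does
```

With Attention() the check is falsy and the fallback is returned; with Attention(ScaleMaskSoftmax()) the scale is set first and the object is then called. If the attribute is only ever assigned None, as described above, the quoted branch would never execute.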
