Config does not align with original paper

by cbock90 - opened

Hey there,

I see that the config differs significantly from the one in the paper:

  • d_model: Config: 4096, Paper: 1024
  • #heads: Config: 64, Paper: 128
  • d_ff: Config: 10240, Paper: 65536
  • d_kv: Config: 64, Paper: 128

Is there any insight into why the checkpoint's config differs from the paper?
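
For reference, here is a minimal sketch of the comparison, using only the numbers listed in this post. The dictionary keys (`d_model`, `num_heads`, `d_ff`, `d_kv`) and the helper `diff_configs` are hypothetical names chosen for illustration; nothing is loaded from the actual checkpoint. It also prints `num_heads * d_kv`, the combined width of all attention heads, which may help when comparing the two setups.

```python
# Values copied from the list in this post; key names are illustrative assumptions.
checkpoint_config = {"d_model": 4096, "num_heads": 64, "d_ff": 10240, "d_kv": 64}
paper_config = {"d_model": 1024, "num_heads": 128, "d_ff": 65536, "d_kv": 128}


def diff_configs(config, reference):
    """Return {key: (config_value, reference_value)} for every key whose values differ."""
    return {
        key: (config.get(key), reference.get(key))
        for key in sorted(set(config) | set(reference))
        if config.get(key) != reference.get(key)
    }


def attention_inner_dim(config):
    """Combined width of all attention heads: num_heads * d_kv."""
    return config["num_heads"] * config["d_kv"]


mismatches = diff_configs(checkpoint_config, paper_config)
for key, (config_value, paper_value) in mismatches.items():
    print(f"{key}: config={config_value}, paper={paper_value}")

print("attention inner dim (config):", attention_inner_dim(checkpoint_config))
print("attention inner dim (paper): ", attention_inner_dim(paper_config))
```

With these numbers, all four entries differ, and the combined attention width is 4096 for the config versus 16384 for the paper values.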
