SGPT models sequence length

#19
by rdroste - opened

Hi, thanks for the great benchmark. Quick question: the sequence lengths of the SGPT models are stated here as between 2048 and 4096 tokens. However, the SGPT paper states that they were trained with sequence lengths of up to 300 tokens (see e.g. Section 4.2.1, https://arxiv.org/pdf/2202.08904.pdf). Is the 300-token number from the paper correct? Are the models nonetheless expected to perform well on sequences of 2048 tokens? Thanks a lot!
(CC: @Muennighoff)

Massive Text Embedding Benchmark org

Great point. The numbers are correct: they were only trained on short sequences, but you can theoretically use them with much longer sequences. I haven't done any extensive testing on longer sequences, so I'm not sure how well they would perform.

Also see this issue: https://github.com/Muennighoff/sgpt/issues/23
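
For anyone wanting to try this, below is a minimal sketch of raising the truncation length when encoding with sentence-transformers. The checkpoint name is one of the SGPT bi-encoders, and it is assumed to ship a sentence-transformers config; as noted above, quality on sequences far beyond the ~300-token training length is untested.

```python
# Minimal sketch: encode longer inputs with an SGPT checkpoint by raising the
# truncation limit. The model name is an assumption for illustration.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Muennighoff/SGPT-125M-weightedmean-msmarco-specb-bitfit")

print(model.max_seq_length)   # truncation length set by the loader
model.max_seq_length = 2048   # raise it, within the model's position-embedding limit

embeddings = model.encode(["A fairly long document ..."])
print(embeddings.shape)
```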

Thanks a lot for the reply. I'll try to test it out.
