Asymmetric search models with longer max seq length? #23
Hi

Some great work in this repo. I've been trying to get it to work in my asymmetric search application, which is basically a document retrieval application.

Currently I use one of the sentence-transformer models trained by UKPLab, with a max sequence length of 512 tokens, but most of my documents are quite a bit longer.

I was wondering whether any of the SGPT models that you or anyone else might have trained have a longer max length. Most of what I see on Hugging Face has a max length of 75 or 300.

Thanks

Comments
For most models you can significantly increase the sequence length. If you load via SentenceTransformer, you can do the following after loading the model:
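A minimal sketch of that approach, assuming the standard sentence-transformers API (the checkpoint name is the SGPT model referenced in the next comment):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit")
# Raise the truncation limit; inputs longer than this many tokens are cut off.
model.max_seq_length = 2048

embeddings = model.encode(["a document that is much longer than 300 tokens ..."])
```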
The maximum sequence length for most SGPT models on the Hub is 2048; you can always check the config (https://huggingface.co/Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit/blob/2dbba11efed19bb418811eac04be241ddc42eb99/config.json#L19). Note, though, that the models weren't trained / evaluated on examples that long, so I'm not sure how well it performs. Would be interesting to hear your experience!
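One way to check that limit programmatically (a sketch using transformers' AutoConfig; the attribute name differs across architectures, so it is looked up defensively):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit")
# GPT-J-style configs store the limit as n_positions and usually alias it to
# max_position_embeddings; fall back if either name is missing.
limit = getattr(cfg, "max_position_embeddings", getattr(cfg, "n_positions", None))
print(limit)  # 2048 for this checkpoint
```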
Thanks for the reply.
Yeah, the reason the sequence length is set to 300 during training is that it saves a lot of memory, and for many cases 300 tokens are enough to determine similarity even if the actual texts are longer. I think the model can handle longer inputs, but I also haven't tried it - I pasted some code in this issue that might work: #19 (comment)
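On the plain transformers path, the general pattern is to pass a larger limit at tokenization time; a sketch under that assumption (not necessarily the exact code from #19):

```python
import torch
from transformers import AutoModel, AutoTokenizer

name = "Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

# Truncate at 2048 tokens instead of the 300 used during training.
batch = tokenizer(["a long document ..."], padding=True, truncation=True,
                  max_length=2048, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # pool these states to get an embedding
```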
@regstuff did you get good results when bumping the sequence length? |
Frankly, I haven't been able to figure out a sensible way of measuring the quality. Any ideas welcome.
To confirm, is there a difference between sequence length and token length? Or do they mean the same? |
It's the same, i.e. sequence length is measured in tokens.
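To make that concrete, you can check a text's sequence length with the model's tokenizer (a small sketch; the sample sentence is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Muennighoff/SGPT-5.8B-weightedmean-msmarco-specb-bitfit")
text = "Sequence length is measured in tokens, not characters or words."
token_ids = tokenizer(text)["input_ids"]
print(len(token_ids))  # this is the text's sequence length
```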