zlacker

[return to "Mistral 7B Fine-Tune Optimized"]
1. averev+Lo 2023-12-20 22:13:37
>>tosh+(OP)
not a bad model, becomes incoherent above 8k tokens, and it's not helped by the fact that it's very verbose, but it seems very coherent and stays closely on topic until then: https://chat.openai.com/share/089d1b8c-3467-4c01-af9f-6568c0...

fails at math of course, even if the problem is very easy, like all Mistrals. good for generation, probably not the best for RAG; there are Mistral tunes that stay coherent to 16k tokens, and that cuts down chunking significantly
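
to make the chunking point concrete, a rough sketch of what I mean by a token budget per chunk (assuming the Hugging Face tokenizer; the model name and budget are just placeholders, not the fine-tune from the thread):

  # split a document into chunks that fit a token budget, so each chunk
  # plus the question stays under the model's usable context
  from transformers import AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

  def chunk_by_tokens(text, max_tokens=7000):  # headroom under an 8k limit
      ids = tokenizer.encode(text, add_special_tokens=False)
      return [tokenizer.decode(ids[i:i + max_tokens])
              for i in range(0, len(ids), max_tokens)]

a tune that stays coherent to 16k roughly halves the number of chunks you need for the same corpus.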

2. jahsom+py 2023-12-20 23:17:33
>>averev+Lo
~Is the 'k' in your token sizes a typo?~

Edit: mistook tokens for parameters for a moment there. Keeping up with AI jargon is exhausting for an idiot like me.

3. averev+9z 2023-12-20 23:22:30
>>jahsom+py
No, it's the sequence length, i.e. how long the string in the prompt is, so to speak: at 8192 tokens it starts losing coherence, and by 10000 tokens it was emitting gibberish, like empty lines and half words; I didn't put the worst part into the link. What do you mean by ELII?
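
if it helps, "8192 tokens" is just the length of the tokenized prompt, nothing to do with the model's parameter count; something like this (model name only as an example, assuming the HF tokenizer):

  # count how many tokens the prompt occupies in the model's context window
  from transformers import AutoTokenizer

  tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
  prompt = "the whole conversation so far..."
  n = len(tok.encode(prompt))
  print(n, "tokens;", "ok" if n <= 8192 else "past where it stays coherent")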