zlacker

[parent] [thread] 3 comments
1. valine+(OP)[view] [source] 2023-11-20 20:59:51
Maybe. Goliath 120B took two different Llama variants and interwove their layers. Surprisingly, Goliath 120B quantized to 2-bit is outperforming Llama 70B at 4-bit on many benchmarks.

https://www.reddit.com/r/LocalLLaMA/comments/17vcr9d/llm_com...
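
Roughly the idea behind the interleave, as a minimal sketch (assuming two Llama-architecture checkpoints of the same width; the model names and layer ranges below are placeholders, not Goliath's actual recipe):

    # Minimal sketch of a layer interleave ("frankenmerge"). Placeholder
    # names and ranges; both parents must share hidden size and vocab.
    import torch.nn as nn
    from transformers import AutoModelForCausalLM

    model_a = AutoModelForCausalLM.from_pretrained("parent-model-a")  # hypothetical
    model_b = AutoModelForCausalLM.from_pretrained("parent-model-b")  # hypothetical

    # Take alternating, overlapping slices of decoder layers from each parent.
    slices = [(model_a, 0, 16), (model_b, 8, 24), (model_a, 17, 32), (model_b, 25, 40)]

    merged = []
    for src, start, end in slices:
        merged.extend(src.model.layers[start:end])

    # Reuse one parent as the scaffold: swap in the interleaved stack and
    # patch the config to match the new depth.
    model_a.model.layers = nn.ModuleList(merged)
    model_a.config.num_hidden_layers = len(merged)
    model_a.save_pretrained("interleaved-model")

Note the merge itself involves no training; the slices are just copied and stitched together.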

replies(1): >>ghotli+oz
2. ghotli+oz[view] [source] 2023-11-21 00:02:44
>>valine+(OP)
Do you happen to have a link to where that interwoven-layers bit is described? As far as I can tell it's not clear on the model cards.
replies(1): >>valine+LP
3. valine+LP[view] [source] [discussion] 2023-11-21 01:55:41
>>ghotli+oz
The model page is the only source of info I've found on it. As far as I can tell, there's no paper published on the technique.

In the “Merge Process” section they at least give the layer ranges.

https://huggingface.co/alpindale/goliath-120b
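
For intuition on how two 70B parents end up around 120B: each (start, end) slice contributes end - start layers, so overlapping slices make the child deeper than either 80-layer parent. A toy example (values are placeholders, not the ones from the card):

    # Toy ranges in the style of the card's "Merge Process" section; the
    # real values are on the Hugging Face page above.
    ranges = [
        ("parent_a", 0, 16),
        ("parent_b", 8, 24),
        ("parent_a", 17, 32),
        ("parent_b", 25, 40),
    ]

    # Overlapping slices stack: 16 + 16 + 15 + 15 = 62 layers here.
    print(sum(end - start for _, start, end in ranges))  # 62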

replies(1): >>ghotli+RX
4. ghotli+RX[view] [source] [discussion] 2023-11-21 02:47:43
>>valine+LP
Ah, reviewing that more closely, I actually found a link to it in the acknowledgements.

https://github.com/cg123/mergekit
