1. ronyfa+(OP)
2023-09-12 20:14:06
It still does a much better job at translation than even Llama 2 70B, at only 6.7B params
replies(1):
>>two_in+z4
2. two_in+z4
2023-09-12 20:32:07
>>ronyfa+(OP)
If it's MoE, that may explain why it's faster and better...
replies(1):
>>yumraj+Ae
3. yumraj+Ae
2023-09-12 21:10:16
>>two_in+z4
MOE?
replies(1):
>>sartha+yg
4. sartha+yg
2023-09-12 21:17:40
>>yumraj+Ae
Mixture of Experts Model -
https://en.wikipedia.org/wiki/Mixture_of_experts
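Roughly, the idea is that a router picks a few expert sub-networks per token, so only a fraction of the total parameters run on each forward pass. A minimal toy sketch of top-k routing (a generic illustration, not the routing scheme of any particular model):

    # Toy mixture-of-experts layer: the router scores experts per token and
    # only the top-k experts actually run, so compute per token stays small
    # even when the total parameter count is large.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoELayer(nn.Module):
        def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            self.router = nn.Linear(d_model, n_experts)  # one score per expert, per token
            self.experts = nn.ModuleList(
                nn.Sequential(
                    nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
                )
                for _ in range(n_experts)
            )

        def forward(self, x):                              # x: (tokens, d_model)
            scores = self.router(x)                        # (tokens, n_experts)
            weights, idx = scores.topk(self.top_k, dim=-1) # keep only the top-k experts
            weights = F.softmax(weights, dim=-1)           # renormalize over the chosen experts
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, slot] == e               # tokens routed to expert e in this slot
                    if mask.any():
                        out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
            return out

    layer = MoELayer()
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token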