zlacker

huac (OP) 2023-12-20 21:41:58
the BTL (Bradley-Terry-Luce) model is just a way to infer 'true' skill levels from a list of head-to-head comparisons. the comparisons themselves are the most important input, and in this case they come from GPT-4 itself. so take any subsequent score with all the grains of salt you can muster.
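
to make that concrete, here is a rough sketch of a BTL fit using the standard minorization-maximization update. the win counts are made up for illustration and `fit_btl` is a hypothetical helper, not anything from the work being discussed. the point is that the scores are purely a function of the comparison table, so whatever bias the judge has goes straight into them.

```python
def fit_btl(wins, n_items, iters=200):
    """Estimate BTL strengths from pairwise win counts.

    wins[(i, j)] is the number of times item i beat item j.
    Returns strengths normalized to sum to 1.
    """
    p = [1.0] * n_items
    for _ in range(iters):
        new_p = []
        for i in range(n_items):
            # Total wins recorded for item i.
            w_i = sum(c for (a, _b), c in wins.items() if a == i)
            # Sum over opponents of n_ij / (p_i + p_j).
            denom = 0.0
            for j in range(n_items):
                if j == i:
                    continue
                n_ij = wins.get((i, j), 0) + wins.get((j, i), 0)
                if n_ij:
                    denom += n_ij / (p[i] + p[j])
            new_p.append(w_i / denom if denom else p[i])
        total = sum(new_p)
        p = [x / total for x in new_p]
    return p


# Hypothetical judge verdicts: 10 comparisons per pair of 3 models.
example_wins = {
    (0, 1): 7, (1, 0): 3,
    (1, 2): 6, (2, 1): 4,
    (0, 2): 8, (2, 0): 2,
}
scores = fit_btl(example_wins, 3)
print(scores)
```

swap in a different judge's verdicts and the 'true' skill levels change with them; the model has no way to tell a good judge from a bad one.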

their methodology also appears to be 'try 12 different models and hope 1 of them wins out.' multiple-hypothesis corrections come to mind here :)
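
back-of-the-envelope version, assuming the 12 tries are independent tests at a 0.05 significance level (both of those are my assumptions, only the 12 comes from their setup):

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / m


fwer = familywise_error_rate(0.05, 12)
threshold = bonferroni_threshold(0.05, 12)
print(f"chance of a spurious 'winner': {fwer:.2f}")
print(f"Bonferroni per-test threshold: {threshold:.4f}")
```

so under those assumptions there is close to a coin flip's chance that at least one of the 12 looks like a winner by luck alone, unless each individual comparison is held to a much stricter threshold.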
