Norwegian language model evaluation with lm-eval-harness, using our NorEval benchmark.
Task selection follows HPLT-E: only benchmarks that pass all enabled criteria are aggregated.
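
The selection rule can be illustrated with a minimal sketch. This is not the actual HPLT-E or NorEval implementation; the record layout, the criterion names ("monotonicity", "low_noise"), and the helper names `passes_enabled` and `aggregate` are all hypothetical and chosen only for illustration. It assumes each benchmark has one score plus a pass/fail flag per criterion, and that the aggregate is an unweighted mean.

```python
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical record layout (an assumption, not the NorEval format):
#   results[benchmark] = {"score": float, "checks": {criterion: bool}}
Result = Dict[str, object]


def passes_enabled(checks: Dict[str, bool], enabled: Iterable[str]) -> bool:
    """Return True only if every enabled criterion is marked as passed.

    A criterion missing from `checks` is treated as failed.
    """
    return all(checks.get(name, False) for name in enabled)


def aggregate(
    results: Dict[str, Result], enabled: Iterable[str]
) -> Tuple[Optional[float], List[str]]:
    """Average the scores of benchmarks that pass all enabled criteria.

    Returns (mean score, sorted names of kept benchmarks), or (None, [])
    when no benchmark qualifies. The unweighted mean is an assumption.
    """
    enabled = list(enabled)
    kept = {
        name: float(record["score"])
        for name, record in results.items()
        if passes_enabled(record["checks"], enabled)
    }
    if not kept:
        return None, []
    return mean(kept.values()), sorted(kept)
```

Disabling a criterion simply means leaving it out of `enabled`, so a benchmark that fails only that criterion is included again.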