rvnx 10 hours ago

> As of 1 Oct 2024, this model is #1 on all three automatic alignment benchmarks (verified tab for AlpacaEval 2 LC), edging out strong frontier models such as GPT-4o and Claude 3.5 Sonnet.

> this model can correctly the question How many r in strawberry? without specialized prompting or additional reasoning tokens

Can be tested here: https://huggingface.co/chat/models/nvidia/Llama-3.1-Nemotron...

  • not_your_vase 10 hours ago

    > this model can correctly the question
    
    Yeah, but I guess it still can't put a verb in a sentence, so you win some, you lose some. It accidentally the whole statement.