Nearly 1,000 researchers worldwide created "Humanity's Last Exam," a 2,500-question test designed to challenge advanced AI systems by including only questions that current models cannot reliably answer. Early results show leading AI systems achieved accuracy rates between 2.7 percent and 50 percent, revealing a significant gap between AI capabilities and expert-level human knowledge across specialized academic fields.