The best AI of our time competed with leading mathematicians

Over the past few years, artificial intelligence has advanced in remarkable ways. Today, machines can not only solve complex problems but also develop their own proof strategies. But are they really that smart? In a new study, leading mathematicians put cutting-edge AI systems to the test. The paper, which has not yet been peer-reviewed, is available on the preprint server arXiv.

Ability of AI to solve problems

AI's ability to solve problems such as those in the GSM8K set (about 8.5 thousand grade-school math word problems that require multiple steps to solve) or at the International Mathematical Olympiad is impressive. Still, these are not the most advanced areas of mathematics: they reflect the level of an advanced school student rather than the limits of human knowledge in the field.

In addition, there is a shortage of new problems on which to test AI programs.

A significant problem when assessing large language models (LLMs) is data contamination, that is, the unintentional inclusion of test problems in the training data, the researchers write.

As a result, the models' success rates are inflated, much like the score of a student who knows the test answers in advance, and their true reasoning abilities are obscured.
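One common way to look for this kind of contamination is to measure how many word sequences from a test problem also appear verbatim in the training text. The sketch below illustrates that idea only; the function names, the whitespace tokenization, and the default window of 8 words are assumptions made for this example, not the method used in the study.

```python
def ngrams(text, n=8):
    """Return the set of n-word sequences in text (lowercased, split on whitespace)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(problem, corpus_docs, n=8):
    """Fraction of the problem's n-grams that also occur somewhere in the corpus.

    Hypothetical helper: a value near 1.0 suggests the problem text was
    probably seen during training; 0.0 means no verbatim overlap was found.
    """
    problem_grams = ngrams(problem, n)
    if not problem_grams:
        # The problem is shorter than n words, so there is nothing to compare.
        return 0.0
    corpus_grams = set()
    for doc in corpus_docs:
        corpus_grams |= ngrams(doc, n)
    return len(problem_grams & corpus_grams) / len(problem_grams)
```

A check like this only catches verbatim reuse. A paraphrased or translated copy of a problem would slip through, which is one reason benchmark authors prefer problems that have never been published.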

This is no empty promise: the project involved Fields Medalists, including some who contributed problems to the data set, and mathematicians at the graduate level and above from universities around the world.

The proposed problems had to satisfy four criteria: be original, meaning that their solution required genuine mathematical insight rather than adapting known problems; be testable without guesswork; be computationally solvable; and be quickly and automatically verifiable. Once the problems had been checked against all these criteria, they were peer-reviewed, given difficulty ratings, and submitted to the AI.

Could today's programs handle them? Alas, no.
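The requirement that answers be checked quickly and automatically can be illustrated with a small sketch. It assumes, purely for the example, that each problem has a single exact answer expressible as a rational number; the function name `verify_answer` is hypothetical, and the article does not describe the checker the researchers actually used.

```python
from fractions import Fraction


def verify_answer(submitted: str, expected: Fraction) -> bool:
    """Return True only if the submitted text parses to exactly the expected value.

    Assumption for this sketch: answers are exact rationals, so the comparison
    uses exact arithmetic and a near miss does not pass.
    """
    try:
        value = Fraction(submitted.strip())
    except (ValueError, ZeroDivisionError):
        # Unparseable text or a zero denominator counts as a wrong answer.
        return False
    return value == expected
```

With exact comparison, `verify_answer("0.75", Fraction(3, 4))` passes while `verify_answer("0.7499", Fraction(3, 4))` does not. If the expected answers are also large or unusual values, a lucky guess is very unlikely to match.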

The solutions are so complex that they would require large amounts of training data that does not exist in practice, notes Fields Medalist Terry Tao. The authors see this as a temporary limitation, however: as AI systems improve, the situation should change.
