Exploring the mathematical thinking of humans and LLMs
o3 can easily solve the reasoning tasks:
https://chatgpt.com/share/680eb4b9-da30-800e-9fb0-2897e460aeec
https://chatgpt.com/share/680eb58e-aab0-800e-9792-d32ac492d68d
And the abstraction tasks:
https://chatgpt.com/share/680eb613-e890-800e-99d0-f99efa522040
https://chatgpt.com/share/680eb653-0928-800e-b735-5ac3335a7ffb
That's great! Especially the new language one, although that also got the logical chain wrong.
You really should be using reasoning models for all of your tests here. The tests you are showing are at this point ~6 months out of date, and a lot has happened in that time frame.
That's good to know. I'm planning a follow-up and will use reasoning models there.