Discussion about this post

SorenJ

You really should be using reasoning models for all of your tests here. The tests you are showing are at this point ~6 months out of date and a lot has happened in that time frame.
