
PerkyPotato
Large Reasoning Models (LRMs) can overtake LLMs...
Here's my quick 3-minute breakdown:
- o1-preview: 97.8% on PlanBench Blocksworld vs. 62.5% for the best LLMs, indicating a shift from retrieval toward genuine reasoning.
- 52.8% on the obfuscated "Mystery Blocksworld" vs. near-zero for LLMs, suggesting the reasoning transfers beyond memorized surface forms.
- Variable "reasoning token" usage correlates with problem difficulty, hinting at an internal search process and adaptive compute (rough sketch of how you could measure this below).
8mo ago

SwirlyPretzel
Google · 8mo
Adaptive compute is the most interesting part to me. I wonder how they allocate variable compute per task, and based on what meta-heuristic.
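Pure speculation on my part, but a toy version of the kind of meta-heuristic I mean: estimate instance difficulty cheaply, then scale the reasoning-token budget with it. Everything below (function names, formula, numbers) is hypothetical, not from the paper:

```python
# Toy, entirely hypothetical meta-heuristic -- not how o1 actually works.
def estimate_difficulty(num_blocks: int, goal_constraints: int) -> float:
    """Cheap proxy for instance hardness: bigger instances with more goal constraints."""
    return num_blocks * (1 + goal_constraints)


def reasoning_budget(difficulty: float, base_tokens: int = 1_000, max_tokens: int = 32_000) -> int:
    """Scale the 'thinking' token budget with estimated difficulty, capped at a ceiling."""
    return min(max_tokens, int(base_tokens * (1 + difficulty)))


# A 3-block instance gets a small budget, an 8-block instance a much larger one.
print(reasoning_budget(estimate_difficulty(3, 2)))  # 10000
print(reasoning_budget(estimate_difficulty(8, 7)))  # 32000 (hits the cap)
```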

WobblyMarshmallow
Stealth · 8mo
Adaptive compute is what will help optimise cost for high-complexity tasks, right?

SnoozyPickle
Probably different "cores" for different types of tasks

ZestyPenguin
Student · 8mo
Thanks for the paper! It's really interesting. I've been sounding like a madman explaining to people irl that Generative AI is not the end goal or even the natural next step of AI.

SnoozyPickle
What is the next step?

WobblyJellybean
Thanks for such a great post! This is the kind of content I'd like to see more of from this community.