
Language models can reason better if they write down intermediate steps. A new study shows how such "System 2 Reasoning" can, at least partially, be trained into language models.


In recent years, AI methods like Chain-of-Thought Prompting or Branch-Solve-Merge have demonstrated that large language models achieve better results when they are made to generate their answers in multiple steps.

This multi-step process can be seen as an expression of Daniel Kahneman's "System 2" thinking, in which information is processed slowly and deliberately. Its counterpart, "System 1", is a fast, unconscious, and automatic mode of thinking.

Researchers from Meta AI have now developed a method to "distill" the computationally intensive "System 2 Reasoning" of AI models into the parameters of a language model. The results show that the resulting "System 1" model in some cases matches the performance of the original multi-step process, at significantly lower computational cost.


The process works as follows: First, a "System 2" method is applied to a large amount of example data. Then the answers are filtered, for example by keeping only consistent results. Finally, this filtered data is used to fine-tune the language model. In essence, the team uses System 2 prompts to generate synthetic training data, then fine-tunes the LLM to skip the intermediate steps and answer directly, as the sketch below illustrates.
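To make the pipeline concrete, here is a minimal sketch in Python. It is an illustration, not the paper's code: `model.generate` is a placeholder for any LLM API, `extract_final_answer` is a hypothetical parsing helper, and the filtering rule shown (self-consistency via majority voting over sampled chain-of-thought outputs) is one example of the consistency filtering the article describes.

```python
from collections import Counter

def extract_final_answer(output: str) -> str:
    # Hypothetical helper: in practice the final answer is parsed from the
    # model output, e.g. the text after an "Answer:" marker or the last line.
    return output.strip().splitlines()[-1]

def system2_answer(model, question: str, n_samples: int = 8):
    """Run a System 2 method (here: sampled chain of thought) and keep the
    result only if the samples agree by strict majority (self-consistency)."""
    answers = [
        extract_final_answer(model.generate(f"{question}\nLet's think step by step."))
        for _ in range(n_samples)
    ]
    answer, count = Counter(answers).most_common(1)[0]
    # Filtering step: discard examples without a clear majority answer.
    return answer if count > n_samples // 2 else None

def build_distillation_dataset(model, questions):
    """Collect (question, direct answer) pairs with the intermediate reasoning
    stripped, so fine-tuning teaches the model to answer in a single step."""
    return [
        {"prompt": q, "completion": a}
        for q in questions
        if (a := system2_answer(model, q)) is not None
    ]
```

The resulting dataset is then used for standard supervised fine-tuning; because the completions contain only the final answers, the model learns to produce them without generating the reasoning steps first.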

Chain-of-thought remains out of reach

The researchers applied the method to four different "System 2" approaches and five task types. They found that distillation works in many, but not all cases.

For methods such as System 2 Attention, which removes biased or irrelevant information from the input, and Rephrase and Respond, which improves answers by first restating the question, the resulting "System 1" models delivered results comparable to their "System 2" counterparts while generating significantly fewer tokens.
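The token savings are easy to see in the Rephrase and Respond case: the System 2 variant needs two generation calls per query, while the distilled model answers in one. The sketch below is schematic; the prompt wording and the `model.generate` interface are placeholder assumptions, not the paper's exact setup.

```python
def rephrase_and_respond(model, question: str) -> str:
    """System 2 (before distillation): rephrase the question, then answer it.
    Two calls, and the rephrased text adds to the generated token count."""
    rephrased = model.generate(
        "Rephrase and expand the following question to make it clearer:\n" + question
    )
    return model.generate(f"{rephrased}\n\nNow answer the rephrased question.")

def distilled_answer(model, question: str) -> str:
    """System 1 (after distillation): a single direct call. Fine-tuning on
    filtered System 2 outputs aims to match the quality at far fewer tokens."""
    return model.generate(question)
```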

However, the distillation failed for complex mathematical reasoning with Chain-of-Thought Prompting. The researchers suspect that some tasks are simply too complex for "System 1" thinking, especially since the models repeatedly fail at such reasoning tasks even with CoT.

Nevertheless, the researchers see their method as a promising approach for developing powerful AI systems that can then focus on the genuinely challenging problems using methods like CoT.

Summary
  • Meta AI researchers have developed a method to "distill" the computationally intensive "System 2 Reasoning" of AI models into the parameters of a language model. In some cases, the resulting "System 1" model achieves similarly good results with significantly less computational effort.
  • To do this, a "System 2" method is first applied to sample data, the responses are filtered, and the language model is then fine-tuned on this synthetic training data.
  • Distillation works with methods such as System 2 Attention and Rephrase and Respond, but fails for complex chain-of-thought prompting on mathematical reasoning. Nevertheless, the researchers see it as a promising approach for developing powerful AI systems that can focus on the truly challenging problems.