
One Step Closer to AGI: The 'Tree of Thoughts'

Inspired by reinforcement learning, a new framework helps LLMs make huge performance gains.
As we venture further into the realms of Artificial Intelligence, researchers are relentlessly seeking to improve the capabilities of Large Language Models (LLMs). One such endeavor involves designing new methodologies for LLMs to solve complex problems.
Humans have long been exceptional at such tasks thanks to our inherent ability to explore a "tree" of possible solutions, with each branch representing a different path toward a potential answer. This exploration is generally guided by heuristics, or rules of thumb. However, the prevalent models for language understanding and problem-solving have not incorporated this trait of human problem-solving: planning, exploring different paths, and backtracking.

Tree of Thoughts

Researchers have recognized this gap and developed a novel approach called "Tree of Thoughts" (ToT) (you can read the paper here). The ToT methodology frames problem-solving as a search through a tree of potential solutions. It is designed to emulate the human approach to problem-solving and enhances the LLM's ability to tackle complex problems.

Key Questions and System Components


The application of ToT necessitates answering four fundamental questions:
How can the problem-solving process be divided into smaller "thought steps"?
How can potential thoughts be generated from each state in the tree?
How can the potential of each state be evaluated using heuristics?
What kind of search algorithm should be used to explore the tree?

These questions are addressed in the following simplified breakdown of the ToT approach:
Thought decomposition: This step involves breaking down the problem-solving process into manageable steps or "thoughts". For example, solving a crossword puzzle might be approached by generating a few words at a time, while a writing task could involve crafting an entire paragraph.
Thought generator: This stage proposes potential next steps or thoughts. Two strategies are generally employed. The first samples thoughts independently of one another, which works well when the space of possible thoughts is rich (for example, whole paragraphs). The second proposes thoughts sequentially within a single prompt, which helps avoid duplicates when the space is more constrained (for example, a single word or line).
State evaluator: Once potential thoughts are generated, their viability is evaluated to decide which options should be explored further. This can be done by assessing each option independently or by comparing all the options and selecting the best one. The approach shares similarities with reinforcement learning (RL), where agents learn to perform actions that maximize some notion of cumulative reward. Normally such an evaluator would need to be trained, as in RL; however, existing LLMs were used very successfully to evaluate the quality of a thought (a generator and evaluator are sketched in code after this list).
Search algorithm: In the final step, a search algorithm is employed to traverse the different options. Two common types tested are breadth-first search (BFS), which explores all the options at each level of the tree before moving to the next level, and depth-first search (DFS), which explores one branch of the tree as far as possible before backtracking and trying a different branch.
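To make the generator and evaluator concrete, here is a minimal Python sketch. The `llm()` helper, function names, and prompt wording are all hypothetical illustrations, not the paper's actual implementation; the sketch only shows the shape of the two components.

```python
def llm(prompt: str) -> str:
    """Hypothetical helper: send a prompt to your LLM provider and return its text reply."""
    raise NotImplementedError  # plug in an actual API call here


def generate_thoughts(state: str, k: int = 5) -> list[str]:
    """Thought generator: independently sample k candidate next steps from a state."""
    return [
        llm(f"Given the partial solution:\n{state}\nPropose the next step:")
        for _ in range(k)
    ]


def evaluate_state(state: str) -> float:
    """State evaluator: ask the LLM itself to score how promising a state is."""
    reply = llm(
        f"Rate from 0 to 10 how likely this partial solution leads to a "
        f"correct answer:\n{state}\nScore:"
    )
    try:
        return float(reply.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0  # treat unparseable replies as unpromising
```

Note that no model training is involved: both components are just prompts to an off-the-shelf LLM, which is exactly the convenience the authors emphasize.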

Simplicity Advantage

Importantly, the authors argue that the ToT approach is general, modular (allowing for independent changes to different parts), adaptable to different problems, and convenient since it doesn't require training a new language model. Instead, an existing LLM can be leveraged. Furthermore, the approach allows for trade-offs between time, resources, and the robustness of heuristics, through multiple prompts to the LLM.
Figure: Comparison to existing inference paradigms

ToT Algorithms

Two main algorithms associated with the ToT methodology are Breadth-First Search (ToT-BFS) and Depth-First Search (ToT-DFS). Here is a simplified explanation of each:
Algorithm 1: ToT-BFS
ToT-BFS explores all options at a particular level (breadth) of the tree before proceeding to the next level. At each step, it expands every current state into candidate successors and keeps only a fixed number of the highest-valued states according to the state evaluator. The process repeats until all steps are completed, after which the best remaining state is selected and a final thought is generated from it as the output.
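A sketch of that loop, reusing the hypothetical `generate_thoughts`, `evaluate_state`, and `llm` helpers from above; in effect it is a beam search with beam width `b`:

```python
def tot_bfs(initial_state: str, steps: int = 3, b: int = 5) -> str:
    states = [initial_state]
    for _ in range(steps):
        # Expand: generate candidate successors for every surviving state.
        candidates = [
            s + "\n" + t for s in states for t in generate_thoughts(s)
        ]
        # Select: keep only the b candidates the evaluator scores highest.
        candidates.sort(key=evaluate_state, reverse=True)
        states = candidates[:b]
    # Emit a final thought from the best state at the last level.
    best = max(states, key=evaluate_state)
    return llm(f"Given the partial solution:\n{best}\nWrite the final answer:")
```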
Algorithm 2: ToT-DFS
ToT-DFS goes as deep as possible into one path of the tree before backtracking and trying a different path. If the current depth exceeds a set limit, a final thought is generated from the current state as output. Otherwise, for each potential next state whose value (according to the state evaluator) exceeds a set threshold, the algorithm recurses into that state at the next step; states below the threshold are pruned, and when no viable child remains, the search backtracks.
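The same idea as a recursive sketch, under the same assumptions as the helpers above (the depth limit, threshold, and prompt wording are illustrative defaults):

```python
def tot_dfs(state: str, step: int, max_steps: int = 3,
            threshold: float = 5.0) -> str | None:
    if step > max_steps:
        # Depth limit reached: emit a final thought from this state.
        return llm(f"Given the partial solution:\n{state}\nWrite the final answer:")
    # Score each candidate child and visit the most promising first.
    scored = [
        (evaluate_state(state + "\n" + t), state + "\n" + t)
        for t in generate_thoughts(state)
    ]
    for value, child in sorted(scored, reverse=True):
        if value <= threshold:
            break  # remaining children score even lower: prune them
        result = tot_dfs(child, step + 1, max_steps, threshold)
        if result is not None:
            return result  # success somewhere down this branch
    return None  # no viable child: the caller backtracks
```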

Both algorithms aim to find the best sequence of thoughts leading to the solution of the given problem. The difference lies in the exploration of the tree of potential solutions: BFS does it level-by-level, while DFS explores as deep as possible in one direction before backtracking.
Figure: Overview of BFS and DFS

One Step Closer to AGI

The Tree of Thoughts (ToT) represents a significant leap forward in enhancing the problem-solving capabilities of language models. By borrowing from human cognition and problem-solving methodologies, ToT provides LLMs with a more strategic and exploratory approach to tackling problems. Its versatility and modularity open a new frontier for applying LLMs in various complex tasks. As research in this area continues, we can anticipate further fascinating developments in AI.

Yannic Kilcher on ToT

For more information, here's a great video explanation by Yannic Kilcher on just this topic.


Tags: ML News