
Tree of Thoughts, Sophia, Goat, QLoRA, and Other ML News

Here's a round-up of Tree of Thoughts, Second-order Clipped Stochastic Optimization (Sophia), GOod at Arithmetic Tasks (Goat), QLoRA, and other ML news.
Created on May 26 | Last edited on May 26

Tree of Thoughts



Tree of Thought prompting is another method where, instead of prompting the model for a single response (or a single chain of thoughts leading up to a response), the model is prompted to generate several candidate initial thoughts. These thoughts are evaluated to see which are most promising; the more favorable thoughts/steps are then expanded further, while the unfavorable ones are pruned. More details are in Yannic's awesome video!
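As a rough illustration, here is a minimal breadth-first sketch of this search in Python. The `propose_thoughts` and `score_thought` callables are hypothetical stand-ins for calls to a language model; they are not from the paper or any specific library.

```python
from typing import Callable, List

def tree_of_thoughts_search(
    problem: str,
    propose_thoughts: Callable[[str, List[str]], List[str]],  # LLM proposes next thoughts
    score_thought: Callable[[str, List[str]], float],          # LLM scores a partial chain
    depth: int = 3,    # number of reasoning steps to take
    breadth: int = 2,  # chains kept after pruning at each level
) -> List[str]:
    """Breadth-first Tree-of-Thoughts-style search; returns the best chain found."""
    frontier: List[List[str]] = [[]]  # each entry is a partial chain of thoughts
    for _ in range(depth):
        candidates = []
        for chain in frontier:
            for thought in propose_thoughts(problem, chain):
                candidates.append(chain + [thought])
        # Evaluate every candidate chain and prune all but the most promising ones.
        candidates.sort(key=lambda c: score_thought(problem, c), reverse=True)
        frontier = candidates[:breadth] or frontier
    return frontier[0] if frontier else []
```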


Sophia

Stanford researchers recently published a paper introducing Sophia (Second-order Clipped Stochastic Optimization), a new optimizer for language model pre-training. They found that Sophia not only converges faster but achieves roughly a 2x speedup over AdamW, the standard LLM optimizer, in wall-clock time, number of steps, and total compute.

The core update rule is sketched below (the Hessian estimator subroutines are listed in the paper).
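Here is a hedged sketch of the per-coordinate update in Python. The diagonal Hessian estimate h is refreshed every k steps by a separate estimator (the paper proposes Hutchinson and Gauss-Newton-Bartlett variants), which is omitted here; the names and default values are illustrative, not the authors' reference code.

```python
import torch

def sophia_step(param, grad, m, h, lr=1e-4, beta1=0.96, rho=0.04,
                weight_decay=0.1, eps=1e-12):
    """One Sophia-style update for a single tensor (sketch, not reference code).

    m: EMA of gradients; h: EMA of a diagonal Hessian estimate, refreshed
    every k steps by a separate estimator that is omitted here.
    """
    # Momentum: exponential moving average of gradients.
    m.mul_(beta1).add_(grad, alpha=1 - beta1)
    # Decoupled weight decay, as in AdamW.
    param.mul_(1 - lr * weight_decay)
    # Pre-conditioned step, clipped element-wise to [-1, 1] so that no
    # coordinate moves by more than lr even when h is tiny or stale.
    update = torch.clamp(m / torch.clamp(rho * h, min=eps), min=-1.0, max=1.0)
    param.add_(update, alpha=-lr)
    return param, m
```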


Goat

Goat, or GOod at Arithmetic Tasks, is a fine-tuned LLaMA model that outperforms GPT-4 on a number of arithmetic tasks. Goat's success comes from the way it decomposes complex arithmetic tasks into more manageable ones based on arithmetic principles. The authors began their experiments by categorizing tasks as learnable or unlearnable.

LLaMA's strong arithmetic capabilities, as the authors note, seem to stem from LLaMA's consistent tokenization of numbers. Their paper tackles four arithmetic tasks: addition, subtraction, multiplication, and division. Addition and subtraction appear to be directly learnable, while multiplication and division require Chain-of-Thought (CoT) style decomposition, specifically dividing the arithmetic task into sub-components.
For multi-digit multiplication (division follows a similar CoT approach), this divide-and-conquer strategy is (sketched in code after this list):
  • extract the arithmetic expression from the natural language
  • split the smaller number into parts (e.g. by place value)
  • expand the sum using the distributive property
  • compute each product
  • add the results term by term
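To make this concrete, here is a small Python sketch that emits intermediate, CoT-style steps for multi-digit multiplication. The splitting and formatting choices are illustrative and are not the paper's exact prompt format.

```python
def multiply_with_steps(a: int, b: int) -> list[str]:
    """Emit CoT-style intermediate steps for a * b (illustrative sketch)."""
    big, small = max(a, b), min(a, b)
    # Split the smaller number into place-value parts, e.g. 24 -> [20, 4].
    parts = [int(d) * 10 ** i for i, d in enumerate(reversed(str(small))) if d != "0"]
    parts.reverse()
    # Expand the product with the distributive property.
    steps = [f"{a} * {b} = {big} * ({' + '.join(map(str, parts))})"]
    products = [big * p for p in parts]
    steps.append(
        " + ".join(f"{big} * {p}" for p in parts)
        + " = "
        + " + ".join(map(str, products))
    )
    # Add term by term.
    steps.append(f"= {sum(products)}")
    return steps

print("\n".join(multiply_with_steps(24, 137)))
# 24 * 137 = 137 * (20 + 4)
# 137 * 20 + 137 * 4 = 2740 + 548
# = 3288
```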
This paper introduces a method to break down difficult arithmetic tasks like multi-digit multiplication and division. Future work may integrate this approach into existing LLMs to enhance their arithmetic reasoning capabilities.

QLoRA

QLoRA, developed by researchers at the University of Washington (UW), is an efficient finetuning approach for LLMs. They demonstrate this memory- and time-efficient finetuning method on over a thousand models.

Three novel methods make up QLoRA:
  • 4-bit NormalFloat quantization
  • Double Quantization, which quantizes the quantization constants themselves in addition to the weights
  • Paged optimizers, which leverage NVIDIA's unified memory feature
These methods aim to reduce the memory footprint of finetuning while preserving fidelity to the original weight values. The authors released a family of models called Guanaco alongside their new finetuning method. All the code can be found in their GitHub repo!
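For reference, here is a minimal sketch of how these pieces fit together with the Hugging Face stack, assuming recent versions of transformers, peft, and bitsandbytes; the model name and LoRA hyperparameters are illustrative, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NormalFloat quantization with double quantization of the
# quantization constants; compute happens in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# The model name is illustrative; any causal LM supported by bitsandbytes works.
model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters are trained in higher precision on top of the frozen 4-bit base model.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Paged optimizers (NVIDIA unified memory) are available via bitsandbytes,
# e.g. bnb.optim.PagedAdamW32bit, or optim="paged_adamw_32bit" in
# transformers' TrainingArguments.
```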

Other News

  • Meta announces they are developing custom chips tailored to AI
  • Zoom and Anthropic are partnering to bring Claude to Zoom
  • Nvidia AI Enterprise is being integrated into Microsoft Azure's ML ecosystem

Tags: ML News