
L2P: A Strong Method For Continual Learning

L2P is a newly proposed continual learning method that outperforms comparable approaches, including rehearsal-based methods.
For supervised classification, a model is typically trained to predict from a pre-defined set of classes. Broadening what a trained model can do by adding new classes or tasks over time is called continual learning.

The difficulties of continual learning

Continual learning is plagued by the major issue of "catastrophic forgetting": when a model is taught something new, it forgets how to do the task it could handle before. This usually happens because the model is first trained on data X until all of its weights are tuned to accomplish task X, and is then trained on data Y, retuning those same weights for task Y and overwriting the configuration that solved task X.
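To make that failure mode concrete, here is a minimal PyTorch sketch (illustrative data and dimensions, not from the paper): the same weights are tuned first for task X and then for task Y, so the second phase overwrites what the first one learned.

```python
import torch
import torch.nn as nn

# Minimal illustration of catastrophic forgetting: one small classifier,
# trained naively on task X and then on task Y. Every weight is updated in
# both phases, so the second phase overwrites the first.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def train_on(data, labels, epochs=20):
    for _ in range(epochs):
        optimizer.zero_grad()
        loss_fn(model(data), labels).backward()
        optimizer.step()

# Task X: classes 0-4, task Y: classes 5-9 (random stand-in data)
x_x, y_x = torch.randn(256, 32), torch.randint(0, 5, (256,))
x_y, y_y = torch.randn(256, 32), torch.randint(5, 10, (256,))

train_on(x_x, y_x)   # model now fits task X
train_on(x_y, y_y)   # the same weights get retuned for task Y;
                     # accuracy on task X collapses
```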
There are some existing solutions to this conundrum. The most common uses a "rehearsal buffer": instances of data X are peppered into the training process for data Y, effectively letting the model train on data X and data Y at the same time. This approach can lengthen training, bloat the overall memory footprint, and requires managing an ever-growing store of rehearsal data.
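In code, a rehearsal buffer usually looks something like the sketch below (class and variable names are my own, not from any particular library): a small fixed-capacity store of old-task samples that gets mixed back into every new-task batch.

```python
import random
import torch

# Illustrative rehearsal buffer: a fixed-capacity store of (x, y) pairs from
# earlier tasks, sampled back into each new-task batch.
class RehearsalBuffer:
    def __init__(self, capacity=200):
        self.capacity = capacity
        self.samples = []                      # (x, y) pairs from old tasks

    def add(self, xs, ys):
        for x, y in zip(xs, ys):
            if len(self.samples) < self.capacity:
                self.samples.append((x, y))
            else:                              # replace a random old entry once full
                self.samples[random.randrange(self.capacity)] = (x, y)

    def sample(self, n):
        batch = random.sample(self.samples, min(n, len(self.samples)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)

# During task-Y training, each batch mixes new data with replayed task-X data:
#   x_old, y_old = buffer.sample(32)
#   x_mix = torch.cat([x_y_batch, x_old])
#   y_mix = torch.cat([y_y_batch, y_old])
```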
Another solution is to segment the model into parts dedicated to particular tasks, though doing this balloons the model's size as tasks accumulate.
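A sketch of that second idea (names and sizes are illustrative): a shared trunk plus one new head per task, which is exactly why the parameter count keeps growing as tasks are added.

```python
import torch.nn as nn

# Illustrative parameter-isolation model: a shared trunk plus a separate head
# for every task. Each new task adds parameters, which is the scaling problem
# mentioned above.
class MultiHeadModel(nn.Module):
    def __init__(self, in_dim=32, feature_dim=64):
        super().__init__()
        self.feature_dim = feature_dim
        self.trunk = nn.Sequential(nn.Linear(in_dim, feature_dim), nn.ReLU())
        self.heads = nn.ModuleList()                 # one classifier per task

    def add_task(self, num_classes):
        self.heads.append(nn.Linear(self.feature_dim, num_classes))

    def forward(self, x, task_id):
        return self.heads[task_id](self.trunk(x))
```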

L2P is a very promising method for continual learning

The rehearsal-based method is reliable, but researchers set out to find a solution that works even better. In a paper from Google Research, "Learning to Prompt for Continual Learning," Wang et al. explore a continual learning method called "Learning to Prompt," or L2P.
L2P takes inspiration from prompting techniques used in natural language processing, and it marks the first time prompting has been applied to the problems continual learning faces.
Most approaches feed the raw input straight into the model and train it so that the weights of the whole system are fine-tuned to complete the task. L2P instead first feeds the input into a pre-trained embedding layer, then uses that same input to select a prompt from a prompt pool and prepends the prompt to the embedding layer's output. From there, the data passes through the rest of the pre-trained model and into the classifier for prediction.
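A simplified sketch of that flow is below. The dimensions, top-k selection, and query feature are assumptions on my part meant to mirror the paper's description, not the authors' actual code: each prompt in the pool has a learnable key, the input's query feature picks the closest keys, and the chosen prompts are prepended to the embedded tokens before they enter the frozen backbone.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Simplified L2P-style prompt pool (illustrative dimensions and names).
# Each prompt has a learnable key; the input's query feature selects the
# nearest keys, and those prompts are prepended to the embedded tokens.
class PromptPool(nn.Module):
    def __init__(self, pool_size=10, prompt_len=5, embed_dim=768, top_k=5):
        super().__init__()
        self.top_k = top_k
        self.keys = nn.Parameter(torch.randn(pool_size, embed_dim))
        self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, embed_dim))

    def forward(self, query, embedded_tokens):
        # query: (batch, embed_dim) summary of the raw input
        # embedded_tokens: (batch, seq_len, embed_dim) from the frozen embedding layer
        sim = F.cosine_similarity(query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
        top_idx = sim.topk(self.top_k, dim=-1).indices        # (batch, top_k)
        chosen = self.prompts[top_idx].flatten(1, 2)          # (batch, top_k * prompt_len, embed_dim)
        return torch.cat([chosen, embedded_tokens], dim=1)    # prompts prepended

pool = PromptPool()
tokens = torch.randn(4, 196, 768)    # stand-in for patch embeddings from a frozen backbone
query = tokens.mean(dim=1)           # stand-in query feature
extended = pool(query, tokens)       # shape (4, 25 + 196, 768), fed to the rest of the model
```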

The pretrained sections are left untouched during training; the only parts that are modified are the prompt pool and the classifier. This means the model's improvement comes down largely to how prompts are selected and handled based on the input data.
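In framework terms, that freezing can look like the sketch below (stand-in modules and names, not the authors' training code): the backbone's parameters are excluded from optimization, so gradients only flow into the prompt pool and the classification head.

```python
import torch
import torch.nn as nn

# Illustrative freezing setup: the pretrained backbone's weights stay fixed,
# and only the prompt pool and the classifier head are optimized.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True),
    num_layers=2,
)
prompt_pool = nn.Parameter(torch.randn(10, 5, 768))   # 10 prompts of length 5
classifier = nn.Linear(768, 100)                       # e.g. 100 classes seen so far

for p in backbone.parameters():
    p.requires_grad = False                            # pretrained weights left untouched

# The optimizer only ever sees the prompt pool and the classifier.
optimizer = torch.optim.Adam([prompt_pool] + list(classifier.parameters()), lr=1e-3)
```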

L2P compared to other methods

In their findings, L2P consistently outperformed all other continual learning methods tested, including the commonly used rehearsal approach. L2P produced more accurate predictions and suffered less from forgetting than the other methods. Notably, L2P does not use a rehearsal buffer, but one could be added to the process for even higher success rates.

Find out more

For the full details and benchmark results, see the paper "Learning to Prompt for Continual Learning" by Wang et al.
