HuggingTweets - Generate Tweets with HuggingFace
Introduction
In this project, we'll show you how to fine-tune a pre-trained transformer on anyone's tweets using HuggingFace's transformers library, a collection of popular model architectures for natural language processing including BERT, GPT-2, RoBERTa, T5, and hundreds of others.
We're also going to use the new Weights & Biases integration to log model performance and model predictions automatically.
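As a minimal sketch of that integration (not the notebook's exact code), the transformers Trainer reports metrics to Weights & Biases when `report_to="wandb"` is set; the run name, output path, and hyperparameters below are placeholders:

```python
# Sketch: enable automatic W&B logging from the transformers Trainer.
# Requires `pip install transformers wandb` and a W&B account.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output",       # placeholder path for checkpoints
    report_to="wandb",         # send training metrics to Weights & Biases
    run_name="huggingtweets",  # how the run appears in the W&B dashboard
    logging_steps=10,          # log training loss every 10 steps
    num_train_epochs=4,        # illustrative value, not a tuned setting
)
```

Passing these arguments to a `Trainer` is enough to get loss curves and system metrics in the W&B dashboard without any manual logging calls.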
Try it yourself →
Disclaimer: this demo is not intended for publishing false generated information; it exists to support research on Natural Language Generation (NLG).
The Model Predictions
Without further ado, let's look at the predictions our model makes. In the next sections, we'll walk you through how to do this yourself.
See if you can spot your favorite AI researcher in the table below. 🤗
My favorite sample is definitely from Andrej Karpathy. I gave it the prompt "I don't like", and it helpfully responded:
I don't like this :) 9:20am: Forget this little low code and preprocessor optimization. Even if it's neat, for top-level projects. 9:27am: Other useful code examples? It's not kind of best code, :) 9:37am: Python drawing bug like crazy, restarts regular web browsing ;) 9:46am: Okay, I don't mind. Maybe I should try that out! I'll investigate it :) 10:00am: I think I should try Shigemitsu's imgur page. Or the minimalist website if you're after 10/10 results :) Also maybe Google ImageNet on "Yelp" instead :) 10:05am: Looking forward to watching it talk!
I had a lot of fun running predictions on other people too!
Fine-Tuning The HuggingFace Model Yourself
Fine-tuning a HuggingFace transformer to generate tweets in the style of your favorite people, and visualizing its performance and predictions in Weights & Biases, is simple!
If you just want to test the demo, click on the link below and share your predictions on Twitter with #huggingtweets!
To understand how the model works, check out huggingtweets.ipynb or use the following link.
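Before fine-tuning, the scraped tweets have to be assembled into a single training text. The sketch below shows one plausible way to do this for a GPT-2-style model: drop retweets and duplicates, then join tweets with GPT-2's `<|endoftext|>` token. The separator and cleaning rules are assumptions for illustration, not the exact steps used by the huggingtweets notebook.

```python
# Sketch: build a GPT-2-style training corpus from raw tweets.
EOS = "<|endoftext|>"  # GPT-2's end-of-text token

def build_corpus(tweets):
    """Join cleaned tweets into one training string, separated by EOS."""
    seen = set()
    kept = []
    for t in tweets:
        t = t.strip()
        if not t or t.startswith("RT @"):  # drop empty tweets and retweets
            continue
        if t in seen:                      # drop exact duplicates
            continue
        seen.add(t)
        kept.append(t)
    return EOS.join(kept)

corpus = build_corpus([
    "Deep nets are fun",
    "RT @someone: not my words",
    "Deep nets are fun",
    "Attention is all you need?",
])
print(corpus)  # → Deep nets are fun<|endoftext|>Attention is all you need?
```

The resulting string can then be tokenized and split into fixed-length blocks for language-model fine-tuning.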
Future research
There's a lot more research to do:
- test training top layers vs. bottom layers to see how this affects learning of the lexical field (subject of content) vs. word prediction, and memorization vs. creativity;
- losses are not the same across people (Karpathy is the hardest to predict);
- data pre-processing can be optimized (padding, end tokens…);
- I could augment the text data;
- do I keep @handles, or is it better to use names?
- what about hashtags? #ConvNets #iloveGANs
- I need to test more models and do some fine-tuning;
- pre-train on a large Twitter dataset of many people, then fine-tune on a single user!
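On the @handles question above, one hypothetical pre-processing experiment is to map known handles to display names and mask the rest behind a generic token; the helper below, including its `<user>` placeholder, is an assumption for illustration rather than part of the project:

```python
import re

# Sketch: replace @handles with display names when known, else a generic token.
HANDLE_RE = re.compile(r"@(\w+)")

def replace_handles(text, names):
    """Substitute each @handle using a handle-to-name lookup table."""
    def sub(match):
        return names.get(match.group(1).lower(), "<user>")
    return HANDLE_RE.sub(sub, text)

print(replace_handles(
    "thanks @karpathy and @random_person!",
    {"karpathy": "Andrej Karpathy"},
))  # → thanks Andrej Karpathy and <user>!
```

Comparing models trained on raw handles vs. names vs. masked tokens would show how much the model leans on handles as context.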
Share your results
If you get an interesting result, we'd absolutely love to see it! 🤗
Please tweet us at @weights_biases and @huggingface.
Resources to dive further
Got questions?
If you have any questions about using W&B to track your model performance and predictions, please reach out in our Slack community. Our team would love to make your experience a good one.
More Resources
- A Step by Step Guide: Track your Hugging Face model performance with Weights & Biases
- Does model size matter? A comparison of BERT and DistilBERT using Sweeps
- Who is "Them"? Text Disambiguation with Transformers