Predicting Disaster Tweets
Using HuggingFace and Weights & Biases to predict whether or not a Tweet is about a disaster
Created on July 3 | Last edited on July 31
Contents: Intro · Use of Weights & Biases · Exploratory Data Analysis · A first attempt... · Sweep · Relevant Code · Sweep Results · Results & Next Steps · Code Sources
Intro
Given the widespread use of smartphones, individuals can promptly report emergencies they witness. As a result, disaster relief organizations and news agencies are interested in programmatically monitoring social media for immediate updates on disasters.
To address this need, our project leverages BERT to predict disaster-related Tweets. By accurately identifying such tweets, this project aims to provide timely and crucial information to aid organizations and news agencies, facilitating effective disaster response and reporting.
BERT is a transformer-based model that learns contextualized word representations by considering bidirectional context, enabling it to capture rich language understanding and achieve state-of-the-art performance on various natural language processing tasks.
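As a rough sketch of how a BERT checkpoint can be loaded for this binary classification task with HuggingFace Transformers (the bert-base-uncased checkpoint and the example tweet below are assumptions, not necessarily what was used here):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a pretrained BERT checkpoint with a 2-class head (disaster vs. not disaster).
# "bert-base-uncased" is an assumed default; any BERT-style checkpoint works the same way.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Tokenize a made-up example tweet into the tensors the model expects
inputs = tokenizer(
    "Huge wildfire spreading near the highway, evacuations underway",
    truncation=True,
    padding="max_length",
    max_length=250,
    return_tensors="pt",
)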
Use of Weights & Biases
- HuggingFace integration
- Reports
- Logging Plotly figures
- W&B Tables
- Sweeps
Exploratory Data Analysis
This set of panels contains runs from a private project, which cannot be shown in this report
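The EDA panels above are Plotly figures logged to W&B. A minimal sketch of that logging pattern, using a tiny stand-in dataframe (the chart choice is an assumption; the text/target columns follow the Kaggle disaster-tweets dataset):

import pandas as pd
import plotly.express as px
import wandb

# Tiny stand-in for the training dataframe; the real Kaggle data has the same columns
train_df = pd.DataFrame({
    "text": ["Wildfire spreading near the highway", "I love this sunny weather"],
    "target": [1, 0],
})

run = wandb.init(entity="uma-wandb", project="disaster", job_type="eda")

# Log an interactive Plotly figure and the raw rows as a W&B Table
train_df["tweet_length"] = train_df["text"].str.len()
fig = px.histogram(train_df, x="tweet_length", color="target")
run.log({
    "tweet_length_distribution": fig,
    "train_sample": wandb.Table(dataframe=train_df),
})
run.finish()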
A first attempt...
As an initial attempt at model training, I decided to use a modest batch size of 8, a learning rate of 0.001, and a weight decay of 0.0001.
Code to initialize the wandb run and set up model training:
# Imports (wandb plus the HuggingFace Trainer utilities used below)
import wandb
from transformers import TrainingArguments, Trainer
from transformers.integrations import WandbCallback

# Initialize wandb run
wandb.init(entity='uma-wandb', project='disaster')

# Hyperparameters - adjustable parameters of a model that influence model training
BATCH_SIZE = 8
EPOCHS = 10
LEARNING_RATE = 0.001
WEIGHT_DECAY = 0.0001
MAXLEN = 250

# Set training arguments
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=EPOCHS,
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE,
    learning_rate=LEARNING_RATE,
    weight_decay=WEIGHT_DECAY,
    evaluation_strategy='epoch',
    remove_unused_columns=False,
    report_to='wandb',  # This line logs metrics to wandb
    logging_steps=10,
)

# Define training loop
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=valid_dataset,
    compute_metrics=compute_metrics,
    optimizers=(optim, None),
    callbacks=[WandbCallback()],  # Not strictly necessary since report_to='wandb' already logs metrics
)

trainer.train()
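The compute_metrics function and the optim optimizer passed to the Trainer above aren't shown in this report; a minimal sketch of what they might look like (accuracy/F1 and plain AdamW are assumptions):

import numpy as np
from torch.optim import AdamW
from sklearn.metrics import accuracy_score, f1_score

# Turn the Trainer's eval logits into class predictions and report accuracy and F1
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(labels, preds), "f1": f1_score(labels, preds)}

# Optimizer passed as `optim`; if omitted, the Trainer builds its own AdamW
# from the learning_rate and weight_decay training arguments
optim = AdamW(model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY)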
This set of panels contains runs from a private project, which cannot be shown in this report
Sweep
Problem: My initial run produced subpar results; I need to try different hyperparameters and don't know where to start.
Solution: Automate the process of trying a bunch of different hyperparameter combos by using W&B Sweeps!
Relevant Code
sweep_config = {
    "method": "random",
    "name": "disaster-sweep",
    "metric": {"goal": "minimize", "name": "train/loss"},
    "parameters": {
        "epochs": {"values": [5, 10]},
        "batch_size": {"values": [8, 16, 32, 64]},
        "learning_rate": {"values": [0.005, 0.0001, 0.00005]},
        "weight_decay": {"values": [0.0001, 0.1]},
    },
}

def train(config=None):
    with wandb.init(config=config):
        # set sweep configuration
        config = wandb.config

        # Set training arguments
        training_args = TrainingArguments(
            output_dir='./results',
            report_to='wandb',  # Turn on Weights & Biases logging
            num_train_epochs=config.epochs,
            learning_rate=config.learning_rate,
            weight_decay=config.weight_decay,
            per_device_train_batch_size=config.batch_size,
            per_device_eval_batch_size=config.batch_size,
            save_strategy='epoch',
            evaluation_strategy='epoch',
            logging_strategy='epoch',
            load_best_model_at_end=True,
            remove_unused_columns=False,
        )

        # Define training loop
        trainer = Trainer(
            model=model,
            args=training_args,
            train_dataset=train_dataset,
            eval_dataset=valid_dataset,
            compute_metrics=compute_metrics,
        )

        # start training loop
        trainer.train()

sweep_id = wandb.sweep(sweep_config, project='disaster-sweep')
wandb.agent(sweep_id, train, count=10)
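Once the agent has finished its 10 trials, the best hyperparameter combination can be pulled back out programmatically. A small sketch using the W&B public API (the entity name is assumed to match the earlier runs):

import wandb

api = wandb.Api()
# Sweep path format is "<entity>/<project>/<sweep_id>"; entity assumed from the earlier runs
sweep = api.sweep(f"uma-wandb/disaster-sweep/{sweep_id}")

# best_run() returns the run that optimizes the metric named in sweep_config ("train/loss")
best_run = sweep.best_run()
print(best_run.name, best_run.config)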
Sweep Results
After running a random search over the batch size, learning rate, number of epochs, and weight decay, I was able to find a combination of hyperparameters that actually worked.
This set of panels contains runs from a private project, which cannot be shown in this report
Results & Next Steps
We were able to achieve a maximum accuracy of 81% on our validation dataset, and the train/loss panel shows that the model is now training properly, unlike the initial run.
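As a quick sanity check on unseen text, the fine-tuned model can be run on a new tweet. A minimal inference sketch (the tweet is made up, and tokenizer, model, and MAXLEN are assumed to come from the training code above; in the Kaggle dataset, label 1 marks a real disaster):

import torch

text = "Magnitude 6.1 earthquake reported, buildings collapsed downtown"  # hypothetical tweet

inputs = tokenizer(text, truncation=True, max_length=MAXLEN, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}

model.eval()
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.argmax(dim=-1).item()
print("disaster" if pred == 1 else "not a disaster")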
This set of panels contains runs from a private project, which cannot be shown in this report
Next Steps
Code Sources
I used the code from https://www.kaggle.com/code/datafan07/disaster-tweets-nlp-eda-bert-with-transformers as a basis for my EDA (but implemented the Plotly graphs myself).
This set of panels contains runs from a private project, which cannot be shown in this report