
Combatting credit card fraud with machine learning

In this article, learn how machine learning combats credit card fraud through practical model training.



Introduction

Fighting credit card fraud has always been difficult. Fraudsters update their methods constantly, and the sheer volume of credit card transactions overwhelms many traditional approaches.
In this piece, we'll look at how machine learning can help solve this problem. We'll show you how to use a public dataset to teach models what to look for when spotting fraud, covering everything from preparing the data to choosing the right models for the job.



The nature of credit card fraud

Types of credit card fraud

Credit card fraud takes several forms, but unauthorized use and identity theft are the most prevalent.
Unauthorized use involves transactions made without the cardholder's consent, often stemming from the loss or theft of the physical card. Identity theft occurs when fraudsters obtain sensitive personal information, enabling them to open new accounts or take over existing ones in the victim's name.
Another common method, known as carding, involves testing stolen card data on websites to verify that it still works. Phishing attacks, skimming devices, and data breaches are typical avenues through which criminals access card details and personal information, further diversifying the landscape of credit card fraud.

Challenges in detecting fraudulent transactions

Detecting and preventing credit card fraud poses significant challenges, primarily due to the rapidly evolving tactics employed by fraudsters. As security measures advance, so too do the methods used to circumvent them, creating a constant arms race between criminals and financial institutions.
The sheer volume of transactions processed daily exacerbates this issue, necessitating sophisticated algorithms capable of identifying fraudulent activity in real time without impeding legitimate transactions. The requirement for high-speed, accurate detection systems underscores the complexity of effectively combating credit card fraud.

Impact on consumers and financial institutions

The ramifications of credit card fraud extend beyond immediate financial losses, which can be considerable for both consumers and institutions.
For individuals, the experience can bring long-term damage to credit scores, loss of access to financial products, and the emotional distress associated with identity theft. For financial institutions, fraud not only entails direct financial losses but also erodes consumer trust, a fundamental component of financial service relationships.
The reputation damage can deter potential customers and strain existing client relationships. Additionally, regulatory implications may arise, with institutions facing potential sanctions for failing to adequately protect consumer information or prevent fraud. Together, these factors illustrate the multifaceted impact of credit card fraud, highlighting the importance of ongoing efforts to enhance security and detection measures.

An overview of anomaly detection in fraud prevention

Understanding anomalies in transaction data

In the context of credit card transactions, anomalies are activities that starkly contrast with a user's regular spending behavior. A few examples:
  • Geographical anomalies: If a cardholder typically uses their card in New York and there's suddenly a flurry of transactions in Paris within hours, this geographical inconsistency could signal unauthorized use.
  • Frequency anomalies: A card regularly used for a couple of transactions per day suddenly incurs dozens of transactions in a short timeframe, suggesting potential fraud.
  • Value anomalies: If a user's transactions usually hover around $50, but there's an abrupt purchase of $5,000, this unusual spike in transaction value could be a red flag. This is the main focus of the example model in this article, and the short sketch after this list makes the idea concrete.
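
To make the value-anomaly idea concrete before we get to the models, here's a minimal, hypothetical rule-based sketch. The function name and the three-sigma threshold are illustrative choices; the models later in this article learn such boundaries from data rather than hard-coding them:
import numpy as np

def is_value_anomaly(history, amount, z_threshold=3.0):
    # Flag an amount that sits far outside the user's historical spending
    mean, std = np.mean(history), np.std(history)
    if std == 0:
        return amount != mean
    return abs(amount - mean) / std > z_threshold

history = [45.0, 52.5, 49.0, 55.0, 48.5]  # a user who usually spends ~$50
print(is_value_anomaly(history, 5000.0))  # True: the $5,000 purchase is flagged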

Role of anomaly detection in fraud prevention

The primary goal of anomaly detection in the context of fraud prevention is to differentiate between legitimate transactions and potential fraudulent activities accurately and swiftly.
By identifying transactions that fall outside of established patterns of behavior, financial institutions can flag these anomalies for further investigation, halt transactions in real time, or even block cards to prevent further unauthorized use.
This not only helps in minimizing financial losses but also protects consumers from the broader implications of credit card fraud, including identity theft and credit score damage.

Conceptual approach to detecting fraud

Detecting fraud through anomaly detection involves several conceptual approaches that collectively enhance the accuracy and efficiency of fraud prevention measures:
  • Behavioral profiling: Consider a cardholder, John, who uses his card mostly for groceries, online subscriptions, and occasional dining out. A behavioral profile built on this data helps detect deviations, like sudden luxury item purchases or gambling charges, which are atypical for John.
  • Pattern recognition: If a pattern emerges where several cardholders report fraudulent transactions shortly after using their cards at a particular merchant, future transactions at this merchant could be scrutinized more closely or flagged for review.
  • Anomaly identification: Using machine learning, a system learns that cardholders rarely make several large electronic purchases back-to-back. When Jane, who has a history of small, scattered purchases, suddenly buys three high-end laptops within an hour, this anomaly is flagged for review.

What are the best machine learning models for fraud detection?

LightGBM (LGBM)

LightGBM, short for Light Gradient Boosting Machine, is a highly efficient model known for its speed and accuracy. It works by building trees in a gradient-boosting framework, focusing on errors from previous trees to improve its predictions.
What makes LightGBM particularly suited for fraud detection is its ability to handle large datasets and work with features of varying importance, quickly identifying patterns that may indicate fraudulent activity.

XGBoost

XGBoost stands for Extreme Gradient Boosting and is another model that thrives in the gradient boosting family. It's celebrated for its performance and speed, making it a go-to choice for many data scientists.
XGBoost improves fraud detection by carefully adjusting for bias and variance, learning from the mistakes of previous iterations to enhance its predictive power. Its robustness and ability to deal with imbalanced datasets make it invaluable in spotting elusive fraud cases.

RandomForest

The RandomForest model is like assembling a team of decision-makers, where each tree in the forest contributes its vote on whether a transaction is fraudulent or not. This ensemble method is effective because it combines the predictions of numerous decision trees, reducing the risk of overfitting and increasing the model's generalizability.
RandomForest is particularly appreciated for its interpretability and the ability to rank the importance of different features in predicting fraud.

Logistic regression

Logistic Regression might seem simple compared to its more complex counterparts, but it's a staple in the fraud detection toolkit. It estimates the probability that a given transaction is fraudulent, offering a direct and interpretable way to assess risk. This model excels in scenarios where relationships between the features and the outcome are approximately linear, making it a solid baseline for any fraud detection analysis.
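
To make that concrete, here is a tiny illustration of the idea behind logistic regression: a sigmoid over a weighted sum of features. The weights and bias below are made up for demonstration, not learned from our dataset:
import numpy as np

def fraud_probability(features, weights, bias):
    # Sigmoid over a weighted sum: the core of logistic regression
    return 1 / (1 + np.exp(-(np.dot(features, weights) + bias)))

# Made-up weights for [amount, encoded type]; a large transfer scores high
print(fraud_probability([5000.0, 1.0], [0.001, 0.5], -4.0))  # ~0.82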

Understanding our dataset

For the practical section of this article, we'll train our models on the Online Payment Fraud Detection dataset. This dataset contains the following columns:
  • Step: This temporal feature, measured in hours, allows us to analyze transaction patterns over time. For instance, fraudulent activities may occur more frequently during certain hours.
  • Type: The transaction type (e.g., transfer, cash out) can be crucial for identifying fraud, as certain transaction types might be more susceptible to fraudulent activities.
  • Amount: The transaction amount could indicate potential fraud, especially if the amount significantly deviates from a user's typical transaction pattern.
  • NameOrig: The customer initiating the transaction. While the name itself may not directly indicate fraud, analyzing patterns or frequencies of transactions initiated by the same customer could reveal suspicious behaviors.
  • OldbalanceOrg and NewbalanceOrig: The originator's account balance before and after the transaction. Discrepancies here, such as a significant balance left in the account after a supposed transfer, can be indicative of fraud.
  • NameDest: The recipient of the transaction. Similar to the originator, the recipient's transaction history might provide insights into recurring fraudulent patterns.
  • OldbalanceDest and NewbalanceDest: The recipient's account balance before and after the transaction. Large discrepancies or unexpected balance changes could signal fraudulent activity.
  • IsFraud: This binary target variable indicates whether the transaction is fraudulent, serving as the basis for training classification models.

The goal of our model

Our goal is to use this data to teach a model to tell whether a transaction is fraudulent or legitimate. We'll try four different types of machine learning models to see which one does the best job at spotting fraudulent transactions.
As for how the models work internally: they learn to detect fraud from small signals in each data point, such as a massive jump in the transaction amount, the type of the transaction, and so on.
But before we start training these models, we'll first process our data and look at it closely to understand it better. This step helps make sure our models can learn effectively.

Practical guide: Implementing a fraud detection model

Step 1: Upgrading our tools

We start by making sure we have the latest versions of LightGBM and XGBoost. It's like updating our apps to have the coolest new features.
!pip install --upgrade lightgbm xgboost

Step 2: Importing required libraries

In this step, we import the necessary Python libraries for data manipulation (pandas, numpy), visualization (matplotlib, seaborn), experiment tracking (wandb), machine learning models (RandomForestClassifier, LogisticRegression, XGBClassifier, LGBMClassifier), and model evaluation (classification_report, roc_auc_score).
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import wandb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from xgboost import XGBClassifier
import lightgbm as lgb

Step 3: Loading the dataset

Next, we load the fraud detection dataset from a CSV file into a pandas DataFrame for analysis.
df = pd.read_csv('/kaggle/input/online-payments-fraud-detection-dataset/PS_20174392719_1491204439457_log.csv')
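
Before going further, it's worth a quick sanity check of what we just loaded. A minimal sketch, assuming the DataFrame above:
# Inspect the schema and the class balance; fraud should be a small minority
print(df.dtypes)
print(df['isFraud'].value_counts(normalize=True))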

Step 4: Data splitting

We split our data into two parts: one for training our models and the other for testing how well they do. A common split is 80 percent for training and 20 percent for testing. We also drop the two identifier columns and encode the categorical type column as integers so the models can consume it.
# Encode the transaction type as integers (this is why the plots below show encoded types)
df['type'] = LabelEncoder().fit_transform(df['type'])

X = df.drop(['isFraud', 'nameOrig', 'nameDest'], axis=1)
y = df['isFraud']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
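
Because fraudulent transactions are a small minority, you may also want to stratify the split so the training and test sets keep the same fraud ratio. This is an optional variant, not part of the original run:
# Optional variant: preserve the fraud/legitimate ratio in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)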

Step 5: Visualizing our dataset

Good vs. Bad Transactions
This step is important: it lets us study how balanced the dataset is across its two classes.
wandb.init(project="fraud_detection", name="type_distribution")
We then build the chart, save it, and log it to our Weights & Biases project run. The plot relies on a per-type percentage breakdown, which we compute first:
# Percentage of legitimate vs. fraudulent transactions within each type
type_counts = df.groupby('type')['isFraud'].value_counts(normalize=True).unstack(fill_value=0) * 100

type_counts.plot(kind='bar', stacked=True, figsize=(10, 6), color=['green', 'red'])
plt.title('Percentage of Fraudulent and Legitimate Transactions by Type')
plt.xlabel('Transaction Type (Encoded)')
plt.ylabel('Percentage')
plt.legend(['Legitimate', 'Fraudulent'], title='Transaction Type', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()

plt.savefig("type_distribution.png")
wandb.log({"Transaction Type Distribution": wandb.Image("type_distribution.png")})
plt.close()

wandb.finish()


As shown in the graph above, the dataset leans heavily toward legitimate (non-fraud) transactions. This mirrors real life, where normal transactions vastly outnumber fraudulent ones, and it is why some engineers choose to augment this problem with additional synthetic data.
Another thing we'll focus on is the type of each transaction. We'll check the distribution of the five types available in our dataset, which matters because some transaction types are more likely to be fraudulent than others.
wandb.init(project="fraud_detection", name="transaction_type_distribution")
Similarly, we will then save and log the plot into our Weights & Biases project run.
plt.figure(figsize=(10, 6))
sns.countplot(data=df, x='type', hue='isFraud')
plt.title('Transaction Type Distribution')
plt.xlabel('Transaction Type')
plt.ylabel('Count')
plt.savefig("transaction_type_distribution.png")
wandb.log({"Transaction Type Distribution": wandb.Image("transaction_type_distribution.png")})
plt.close()

wandb.finish()



Step 6: Training our models

For our first model, we'll use LightGBM, known for its fast and efficient handling of large datasets. It's especially good at picking up on the subtle signs of fraud, even when fraudulent transactions are much less common than legitimate ones.
lgb_model = lgb.LGBMClassifier(is_unbalance=True, metric='auc', objective='binary', num_leaves=31, learning_rate=0.05, feature_fraction=0.9, bagging_fraction=0.8, bagging_freq=5, verbose=0)
Next up, we have the XGBoost model. It's a powerhouse that's both precise and efficient, making it a favorite for tackling complex challenges like fraud detection. By focusing on reducing errors in successive iterations, XGBoost homes in on the toughest cases of fraud.
xgb_model = XGBClassifier(use_label_encoder=False, eval_metric='logloss')
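LightGBM's is_unbalance flag has no direct XGBoost counterpart. If you want XGBoost to account for the class imbalance as well, a common optional tweak (again, not part of the run above) is scale_pos_weight, set to the ratio of negative to positive training samples:
# Hypothetical variant: weight the rare fraud class by the negative/positive ratio
neg, pos = (y_train == 0).sum(), (y_train == 1).sum()
xgb_model_weighted = XGBClassifier(eval_metric='logloss', scale_pos_weight=neg / pos)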
Following that, we'll explore the RandomForest classifier.
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
Lastly, the Logistic Regression model is our straightforward but effective approach. It's like using a classic detective method to separate fraudulent transactions from legitimate ones based on observed patterns. Simple yet surprisingly powerful, it gives us a clear line between "fraud" and "not fraud."
lr_model = LogisticRegression(max_iter=1000)

Step 7: Function to train and evaluate models

def train_evaluate_model(model, X_train, y_train, X_test, y_test, model_name='Model'):
    # Fit the model, then evaluate it on the held-out test set
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    try:
        # Probability of the positive (fraud) class, needed for ROC AUC
        y_pred_proba = model.predict_proba(X_test)[:, 1]
    except AttributeError:
        # Fallback for models that don't expose predict_proba
        y_pred_proba = [0] * len(y_pred)
    print(f"Results for {model_name}:")
    print(classification_report(y_test, y_pred))
    print("ROC AUC Score:", roc_auc_score(y_test, y_pred_proba))
    print("-" * 60)

# Train and evaluate each model
train_evaluate_model(lgb_model, X_train, y_train, X_test, y_test, 'LightGBM')
train_evaluate_model(xgb_model, X_train, y_train, X_test, y_test, 'XGBoost')
train_evaluate_model(rf_model, X_train, y_train, X_test, y_test, 'Random Forest')
train_evaluate_model(lr_model, X_train, y_train, X_test, y_test, 'Logistic Regression')

Evaluating our results

In this final step, we compare the results of the four models. Although the differences may seem small, in a task as sensitive as detecting fraudulent transactions, where errors map directly to real money, even the smallest improvement in accuracy makes a difference. And although the four models may seem simple, they work with great efficiency on tabular data like ours. One caveat: because the dataset is so heavily imbalanced, accuracy alone can look deceptively high, so the ROC AUC score is the more informative metric here.
Throughout the article, we logged the accuracy and ROC AUC score of each of our models to W&B. The idea is to try different parameters for each model and compare the final results after each round of tuning. Below are the final results we settled on. We logged both evaluation metrics as W&B charts, and we also printed the results for easier visibility.
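
For reference, here is a minimal sketch of how those per-model metrics can be logged to W&B; the run name and metric keys are illustrative, not the exact ones from our runs:
import wandb
from sklearn.metrics import accuracy_score, roc_auc_score

wandb.init(project="fraud_detection", name="model_comparison")
for name, model in [("LightGBM", lgb_model), ("XGBoost", xgb_model),
                    ("Random Forest", rf_model), ("Logistic Regression", lr_model)]:
    # Log each trained model's test-set accuracy and ROC AUC to the same run
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    wandb.log({f"{name}/accuracy": accuracy_score(y_test, y_pred),
               f"{name}/roc_auc": roc_auc_score(y_test, y_proba)})
wandb.finish()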

LGBM

Our first model, LightGBM, didn't return the best accuracy. Its results are impressive in a vacuum, but not when compared to the other models.
Results for LightGBM:
  • Accuracy: 0.9429488166824358
  • ROC AUC Score: 0.9344628948088899


XGBoost

The XGBoost model returned the best accuracy of the four, at nearly 100 percent.
Results for XGBoost:
  • Accuracy: 0.9996754481644354
  • ROC AUC Score: 0.9988293874981029



Random forest

The random forest returned near-optimum results as well.
Results for Random forest:
  • Accuracy: 0.9995960783450842
  • ROC AUC Score: 0.9958275430056476



Logistic regression

As did our logistic regression model:
Results for Logistic regression:
  • Accuracy: 0.9992110168452618
  • ROC AUC Score: 0.9557624827254294


Conclusion

Throughout this article, we've taken a deep dive into the mechanics of using machine learning to detect and prevent credit card fraud, a significant challenge in our increasingly digital world. By walking through each step, from understanding the nature of fraud to training sophisticated models, we've seen how technology can be a powerful ally in identifying suspicious transactions.
While the battle against fraudsters is ongoing, the tools and techniques we've explored offer promising strategies for staying one step ahead. Armed with knowledge and the right algorithms, we're better equipped to protect ourselves and our financial systems from the ever-evolving threat of fraud.