
Building a recommendation engine with Python

Harnessing the power of machine learning: how to create personalized recommendations in Python
Created on January 22|Last edited on January 20
Recommendation engines are the secret sauce behind personalized digital experiences. Whether it’s Netflix suggesting your next binge-worthy series, Amazon curating your shopping list, or Spotify creating a playlist just for you, recommendation engines transform how we interact with technology. These systems use AI to predict user preferences and enhance engagement.
In this guide, we’ll explore what a recommendation engine is, the types of models that power them, and how to build a movie recommendation engine with Python. By the end, you’ll have a solid foundation for creating your own recommendation engine and tracking experiments with Weights & Biases.





What is a recommendation engine?

A recommendation engine is a system that uses data and machine learning algorithms to predict what users are most likely to enjoy or need. By analyzing user behavior, preferences, and patterns from similar users, it delivers personalized suggestions, enhancing engagement and driving business outcomes.
Recommendation engines analyze vast amounts of user data—such as browsing history, purchase records, or content interactions—to identify trends and make predictions. They act as virtual assistants, simplifying decision-making for users while maximizing value for businesses.
Examples include:
  • Streaming platforms like Netflix recommending shows or movies based on your watch history and ratings.
  • E-commerce websites such as Amazon suggesting complementary products or items frequently bought together.
  • Music apps like Spotify creating curated playlists tailored to your listening habits.

Why are recommendation engines essential?

Unlike static lists of popular items, recommendation engines adapt dynamically to user behavior. They scale to analyze millions of users and items in real time, offering:
  • Personalization: Ensuring recommendations are tailored to individual tastes.
  • Efficiency: Helping users discover relevant content or products faster.
  • Continuous learning: Improving accuracy as more data is collected.
By providing personalized, actionable suggestions, recommendation engines are indispensable in modern applications, from entertainment to healthcare, where tailored experiences are critical for success. Whether you’re building one for a small app or a global platform, understanding their mechanics is the foundation of creating meaningful user interactions.

How does a recommendation engine work?

The workflow of a recommendation engine can be broken down into five key steps:
  1. Data collection: Gather user data, such as browsing history, ratings, or purchase records.
  2. Data processing: Clean and preprocess the data to ensure quality, handling missing values and normalizing data where necessary.
  3. Algorithm selection: Choose an appropriate model—collaborative filtering, content-based filtering, or hybrid approaches.
  4. Recommendation generation: Use the model to predict and rank items based on user preferences.
  5. Feedback loop: Continuously refine the system by incorporating user feedback, such as clicks, ratings, or purchases.


Recommendation engine models

Recommendation engine models fall into three categories based on how they generate suggestions:

Collaborative filtering

Collaborative filtering can be likened to asking a friend with similar tastes for recommendations. Just as you might trust a friend who has enjoyed the same movies or books as you to suggest your next watch or read, collaborative filtering uses collective user preferences and behaviors to make personalized suggestions. There are two subcategories of collaborative filtering:
  1. User-based filtering: Compares users with similar tastes. Example: If User A and User B both like action movies, User B’s recent favorite might be suggested to User A.
  2. Item-based filtering: Focuses on item similarity. Example: People who bought "The Da Vinci Code" also bought "Angels & Demons."
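
To make the two subcategories concrete, here is a minimal sketch that computes cosine similarity between users (rows) and between items (columns) of a small rating matrix. The ratings and the helper name cosine_sim_matrix are made up for illustration; real systems typically use more careful similarity measures and neighborhood selection.

```python
import numpy as np

# Hypothetical toy ratings: rows are users, columns are items, 0 means "not rated"
ratings = np.array([
    [5, 4, 0, 1],   # user 0
    [4, 5, 0, 1],   # user 1: tastes close to user 0
    [1, 0, 5, 4],   # user 2: roughly opposite tastes
], dtype=float)

def cosine_sim_matrix(m):
    """Pairwise cosine similarity between the rows of m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = m / norms
    return unit @ unit.T

user_sim = cosine_sim_matrix(ratings)    # user-based: compare rows
item_sim = cosine_sim_matrix(ratings.T)  # item-based: compare columns

# Find the user most similar to user 0, ignoring user 0 itself
others = user_sim[0].copy()
others[0] = -np.inf
nearest_user = int(np.argmax(others))
print(nearest_user)
```

User-based filtering would then suggest items that the nearest user rated highly, while item-based filtering would look up the items most similar to ones the target user already liked.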

Content-based filtering

Content-based filtering is your personal echo chamber. It is like having a personal chef who knows your specific taste preferences and dietary restrictions. This chef doesn't necessarily consider what others are eating. Instead, they focus solely on your unique likes and dislikes.
For instance, if you have shown a preference for Italian cuisine and vegetarian dishes, your chef will recommend dishes like vegetarian lasagna or mushroom risotto, regardless of what others are eating. This method is particularly useful when accurate and detailed information about the content is available and when the user’s preferences are distinct and consistent.
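
The chef analogy can be sketched in a few lines. This is an illustrative example only: the dishes, the three feature columns, and the single liked item are all assumptions. It builds a user profile from the features of liked items and ranks the remaining items by cosine similarity to that profile.

```python
import numpy as np

# Hypothetical item features, one row per dish: [italian, vegetarian, spicy]
item_names = ["vegetarian lasagna", "mushroom risotto", "beef vindaloo", "pepperoni pizza"]
item_features = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
], dtype=float)

liked = [0]  # the user liked "vegetarian lasagna"

# The user profile is the mean feature vector of the liked items
profile = item_features[liked].mean(axis=0)

# Cosine similarity between the profile and every item
scores = item_features @ profile / (
    np.linalg.norm(item_features, axis=1) * np.linalg.norm(profile)
)

# Never recommend what the user already liked
scores[liked] = -np.inf
ranked = [item_names[i] for i in np.argsort(-scores)]
print(ranked[:2])
```

No other user's data appears anywhere in the computation, which is the defining trait of the content-based approach.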

Hybrid models

A hybrid recommendation system is akin to consulting both a well-informed friend and a knowledgeable personal assistant for suggestions. The friend knows your preferences based on your past choices and what similar people have enjoyed (like collaborative filtering), while the personal assistant is aware of your specific tastes and requirements based on the characteristics of items you've liked (like content-based filtering).
Together, they provide recommendations that are both personalized to your taste and informed by broader trends and similarities among other users.
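
One simple way to combine the two signals is a weighted blend of their scores. The sketch below is only one of several possible hybrid strategies; the function names, the weight alpha, and the score values are assumptions chosen for illustration.

```python
import numpy as np

def min_max(x):
    """Rescale scores to [0, 1] so the two sources are comparable."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def hybrid_scores(cf_scores, content_scores, alpha=0.5):
    """Blend the scores: alpha weights collaborative filtering, 1 - alpha weights content."""
    return alpha * min_max(cf_scores) + (1 - alpha) * min_max(content_scores)

cf = [4.2, 3.1, 4.8, 2.0]       # hypothetical predicted ratings per item
content = [0.9, 0.2, 0.1, 0.8]  # hypothetical content similarities per item

blended = hybrid_scores(cf, content, alpha=0.5)
best = int(np.argmax(blended))
print(best)
```

With alpha=1.0 the blend reduces to pure collaborative filtering, and with alpha=0.0 to pure content-based filtering, so the weight can be tuned like any other hyperparameter.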
Let’s summarize the recommendation engine models in a concise table:


Python tutorial: Building a simple recommendation engine

Roll up your sleeves; it's time to build your own recommendation engine in Python. We will implement both collaborative filtering and content-based filtering, and use Weights & Biases for experiment tracking. Let’s dive in, starting with the foundational setup that applies to both methods.

Importing the libraries

This cell imports the libraries we’ll use throughout the project. Libraries like pandas and numpy help manipulate data, while scipy provides tools like Singular Value Decomposition (SVD). wandb allows us to track and log experiments.
# Install and authenticate Weights & Biases (notebook shell commands)
!pip install wandb
!wandb login

import requests
import zipfile
from math import sqrt

import numpy as np
import pandas as pd
import wandb
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import mean_squared_error
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.model_selection import train_test_split

Download and Extract the Dataset

Here, we download the MovieLens 100K dataset, which contains user ratings for movies. After extracting the dataset, we load the ratings into a DataFrame to begin processing.
# Download the dataset
url = 'http://files.grouplens.org/datasets/movielens/ml-100k.zip'
r = requests.get(url, allow_redirects=True)
r.raise_for_status()  # stop early if the download failed
with open('ml-100k.zip', 'wb') as f:
    f.write(r.content)

# Extract the dataset
with zipfile.ZipFile('ml-100k.zip', 'r') as zip_ref:
    zip_ref.extractall('ml-100k')

# Load the ratings (tab-separated, no header row)
df_ratings = pd.read_csv('ml-100k/ml-100k/u.data', sep='\t',
                         names=['user id', 'movie id', 'rating', 'timestamp'], header=None)
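
To see what the ratings file looks like once parsed, the sketch below reads a few tab-separated rows from an in-memory string using the same column names. The three rows are invented for illustration and are not taken from the dataset; the timestamp conversion assumes the values are Unix seconds.

```python
import io
import pandas as pd

# Made-up rows in the same layout: user id, movie id, rating, timestamp
sample = "1\t10\t4\t881250949\n1\t20\t2\t881250950\n2\t10\t5\t881250951\n"

df = pd.read_csv(io.StringIO(sample), sep="\t",
                 names=["user id", "movie id", "rating", "timestamp"], header=None)

# Convert the timestamp (assumed to be Unix seconds) into a readable datetime
df["rated_at"] = pd.to_datetime(df["timestamp"], unit="s")

n_users = df["user id"].nunique()
mean_rating = df["rating"].mean()
print(df.head())
print(n_users, round(mean_rating, 2))
```

Running the same checks on df_ratings (number of users, number of movies, rating distribution) is a quick sanity test before any modeling.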

Set up Weights & Biases for tracking

We initialize a Weights & Biases project to log key metrics like Root Mean Square Error (RMSE) and recommendations generated by the engine.
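# Initialize a new wandb run; name it after the method being tracked
# (use name='Content_Based_Filtering' for the content-based run)
wandb.init(project='movie_recommendation_system', name='Collaborative_Filtering')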

Collaborative filtering

Let's now begin with the implementation of collaborative filtering.

Data Preprocessing

For this step, we load the movie_id and movie_title columns. The goal is to build a matrix in which each row is a user_id, each column is a movie_id, and each value is the user's rating; the titles let us turn predicted movie IDs into readable recommendations.
# Read the movie metadata. The file is '|'-delimited and Latin-1 encoded, so the
# column names are passed explicitly; usecols limits the import to movie IDs and titles.
df_movies = pd.read_csv('ml-100k/ml-100k/u.item', sep='|', encoding='latin-1', header=None,
                        names=['movie id', 'movie_title', 'release_date', 'video_release_date',
                               'IMDb_URL', 'unknown', 'Action', 'Adventure', 'Animation',
                               'Children', 'Comedy', 'Crime', 'Documentary', 'Drama', 'Fantasy',
                               'Film-Noir', 'Horror', 'Musical', 'Mystery', 'Romance', 'Sci-Fi',
                               'Thriller', 'War', 'Western'],
                        usecols=[0, 1])


Splitting the data and creating matrices

The entire data is split into a training and testing set with 80% for training and 20% for testing. This split allows for the evaluation of the model in a realistic scenario, where the goal is to predict user preferences for unseen movies.
# Split the data into training and testing sets
train_data, test_data = train_test_split(df_ratings, test_size=0.2, random_state=42)
Then, we create matrices where each row corresponds to a user and each column to a movie.
# Reshape the training and test sets into user x movie rating matrices
train_matrix = train_data.pivot_table(index='user id', columns='movie id', values='rating', fill_value=0)
test_matrix = test_data.pivot_table(index='user id', columns='movie id', values='rating', fill_value=0)

# Convert the matrices to sparse matrices
train_matrix_sparse = csr_matrix(train_matrix, dtype=np.float32)
test_matrix_sparse = csr_matrix(test_matrix, dtype=np.float32)
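
As a small illustration of what pivot_table produces, the sketch below builds a user-by-movie matrix from four made-up ratings and measures how sparse it is. The toy values are assumptions; the call mirrors the one used on the real data.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, movie) pair
toy = pd.DataFrame({
    "user id": [1, 1, 2, 3],
    "movie id": [10, 20, 10, 30],
    "rating": [5, 3, 4, 2],
})

# Rows become users, columns become movies, missing ratings are filled with 0
matrix = toy.pivot_table(index="user id", columns="movie id", values="rating", fill_value=0)
print(matrix)

# Fraction of user-movie pairs without a rating
sparsity = 1 - (matrix.to_numpy() != 0).sum() / matrix.size
print(round(sparsity, 3))
```

Note that only users and movies present in the input appear in the result, so matrices pivoted from different subsets of the data can end up with different shapes.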

Perform SVD on the training matrix

Singular Value Decomposition (SVD) reduces the dimensionality of the user-item matrix, identifying latent factors that explain patterns in user ratings. These factors help predict ratings for unseen items.
number_of_factors = 50  # Number of latent factors to keep
regularization_strength = 0.1  # Placeholder: logged for tracking only; svds does not use it

# Subtract each user's mean rating (computed over all columns, unrated entries
# counted as 0) so that the factors model deviations from that mean
train_array = train_matrix.to_numpy(dtype=np.float32)
user_ratings_mean_train = train_array.mean(axis=1)
train_demeaned = train_array - user_ratings_mean_train.reshape(-1, 1)

# Perform SVD on the mean-centered training matrix (svds also accepts dense arrays)
U, sigma, Vt = svds(train_demeaned, k=number_of_factors)
sigma = np.diag(sigma)

# Log SVD model parameters and hyperparameters
wandb.config.update({
    "number_of_factors": number_of_factors,
    "regularization_strength": regularization_strength
})

# Add the user means back to get actual rating predictions for the training set
predicted_ratings_train = np.dot(np.dot(U, sigma), Vt) + user_ratings_mean_train.reshape(-1, 1)
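
To build intuition for what keeping k latent factors means, here is a minimal sketch on synthetic data. It uses NumPy's dense np.linalg.svd rather than svds, and the matrix is generated from two hidden factors purely for illustration, so a rank-2 reconstruction should recover it almost exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings generated from 2 latent factors: 6 users x 5 items
user_factors = rng.normal(size=(6, 2))
item_factors = rng.normal(size=(2, 5))
R = user_factors @ item_factors

# Full SVD, then keep only the k largest singular values
U_full, s_full, Vt_full = np.linalg.svd(R, full_matrices=False)

def rank_k(k):
    """Best rank-k approximation of R built from the top k singular triplets."""
    return U_full[:, :k] @ np.diag(s_full[:k]) @ Vt_full[:k, :]

err1 = np.linalg.norm(R - rank_k(1))  # one factor is not enough
err2 = np.linalg.norm(R - rank_k(2))  # two factors capture the structure
print(err1, err2)
```

On real rating data the error does not drop to zero, and the number of factors becomes a trade-off between fitting the training ratings and generalizing to unseen ones.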

Root Mean Square Error (RMSE)

RMSE is a standard way to measure the error of a model in predicting quantitative data. In this context it quantifies the model's prediction accuracy in terms of the ratings given by the users. A lower RMSE value indicates better model performance.
# Function to calculate the RMSE between prediction and ground_truth
def rmse(prediction, ground_truth):
    # Keep only the entries where a ground-truth rating exists (is non-zero)
    mask = ground_truth.nonzero()
    prediction = prediction[mask].flatten()
    ground_truth = ground_truth[mask].flatten()
    # The RMSE is the square root of the mean squared error
    return sqrt(mean_squared_error(ground_truth, prediction))


# Align the test matrix with the users and movies of the training matrix so that
# each prediction is compared with the rating for the same user and movie
test_matrix_aligned = test_matrix.reindex(index=train_matrix.index,
                                          columns=train_matrix.columns, fill_value=0)

# Calculate RMSE for train data
train_rmse = rmse(predicted_ratings_train, train_matrix.to_numpy())
# Calculate RMSE for test data
test_rmse = rmse(predicted_ratings_train, test_matrix_aligned.to_numpy())
# Log the RMSE
wandb.log({"RMSE Train": train_rmse,
           "RMSE Test": test_rmse})
print('SVD-based CF train RMSE: ' + str(train_rmse))
print('SVD-based CF test RMSE: ' + str(test_rmse))
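
A small worked example shows how the masking affects the result. The function name masked_rmse and the numbers are illustrative; only the two positions with a non-zero ground-truth rating contribute to the error.

```python
import numpy as np

def masked_rmse(prediction, ground_truth):
    """RMSE over the positions where a ground-truth rating exists (non-zero)."""
    mask = ground_truth != 0
    diff = prediction[mask] - ground_truth[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

pred = np.array([[4.5, 2.0],
                 [1.0, 3.0]])
truth = np.array([[5.0, 0.0],   # 0.0 marks "no rating", so it is ignored
                  [0.0, 4.0]])

# Errors on rated entries: 4.5 vs 5.0 and 3.0 vs 4.0
# Mean squared error = (0.25 + 1.0) / 2 = 0.625
value = masked_rmse(pred, truth)
print(round(value, 4))  # 0.7906
```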

Get recommendations

This function predicts top movie recommendations for a specific user by sorting predicted ratings and filtering out movies the user has already rated.
# Function to recommend movies for the test dataset
def recommend_movies_test(user_id, num_recommendations=3):
    # Row of the prediction matrix that belongs to this user
    user_row_number = train_matrix.index.get_loc(user_id)

    # Predicted ratings for the user, labeled by movie id
    user_predictions = pd.DataFrame({
        'movie id': train_matrix.columns,
        'Predictions': predicted_ratings_train[user_row_number]
    })

    # Movies the user rated in the test set, highest rating first
    user_data = test_data[test_data['user id'] == user_id]
    user_full = (user_data.merge(df_movies, how='left', on='movie id')
                 .sort_values(['rating'], ascending=False))

    # Recommend the movies with the highest predicted ratings that are not among those
    recommendations = (df_movies[~df_movies['movie id'].isin(user_full['movie id'])]
                       .merge(user_predictions, how='left', on='movie id')
                       .sort_values('Predictions', ascending=False)
                       .iloc[:num_recommendations, :-1])

    return user_full, recommendations
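
The core of the ranking step, stripped of the DataFrame handling, can be sketched as follows. The helper name top_n_unseen and the scores are assumptions for illustration: already-seen items are masked out, and the remaining items are sorted by predicted rating.

```python
import numpy as np

def top_n_unseen(predicted, seen_mask, n=3):
    """Return indices of the n highest-scoring items that the user has not seen."""
    scores = np.where(seen_mask, -np.inf, predicted)
    order = np.argsort(-scores, kind="stable")
    return [int(i) for i in order[:n] if np.isfinite(scores[i])]

predicted = np.array([4.9, 3.2, 4.1, 2.5, 4.6])     # hypothetical predicted ratings
seen = np.array([True, False, False, False, True])  # items 0 and 4 were already rated

print(top_n_unseen(predicted, seen, n=2))  # [2, 1]
```

Items 0 and 4 have the highest predicted ratings, but they are excluded because the user has already rated them.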


Log sample recommendations

We log recommendations for a specific user to Weights & Biases for analysis and evaluation.
# Get recommendations for a specific user
user_id = 5 # Change the user_id to the desired user
actual_movies, recommended_movies = recommend_movies_test(user_id, num_recommendations=3)
print("\nMovies watched by user:")
print(actual_movies)
print("\nRecommended movies:")
print(recommended_movies)


# Log the user and movie metrics
wandb.log({
"user_id": user_id,
"watched_movies": actual_movies['movie_title'].tolist(),
"recommended_movies": recommended_movies['movie_title'].tolist()
})
# Finish the run
wandb.finish()

Evaluation

The table below shows the collaborative filtering runs for three different users. The top three recommended movies are shown along with the movies each user previously watched.

The train and test RMSE are logged in Weights & Biases, as shown below, for each combination of the hyperparameters monitored in the SVD runs, i.e., the number of latent factors and the regularization strength. The x-axis represents the runs, while the y-axis represents the RMSE.


The graph below shows the highlighted test RMSE at its lowest point and the corresponding values of other hyperparameters that were monitored. Logging the results in Weights & Biases helps in efficient monitoring of the model’s statistics which eventually aids in better optimization and analysis.


Content-based filtering

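After importing the required libraries and setting up Weights & Biases, we continue with content-based filtering. Because the collaborative filtering run ended with wandb.finish(), call wandb.init() again to start a new run before logging anything here.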

Define functions to retrieve data for recommendation

We first define helper functions that look up a movie's title from its DataFrame index and its DataFrame index from its movie_id.
# Function to get the title from the DataFrame index
def get_title_from_index(index):
    return df_movies[df_movies.index == index]["movie_title"].values[0]


# Function to get the DataFrame index from the movie id
def get_index_from_movie_id(movie_id):
    matches = df_movies[df_movies['movie id'] == movie_id].index.values
    return matches[0] if len(matches) > 0 else None
The code below shows the function that recommends movies based on one specific movie, the user's highest-rated one. It computes the cosine similarity between that movie's TF-IDF (Term Frequency-Inverse Document Frequency) vector and the vectors of all other movies, then returns the top three most similar movies.
def recommend_movies_based_on_one_movie(user_id, num_recommendations=3):
    # Find the highest-rated movie by the user
    user_movies = df_ratings[df_ratings['user id'] == user_id]
    if user_movies.empty:
        return "No movies found for the given user.", None
    highest_rated_movie_row = user_movies.sort_values(by='rating', ascending=False).iloc[0]
    highest_rated_movie_id = highest_rated_movie_row['movie id']

    # Find the index and title of this movie in df_movies
    movie_index = get_index_from_movie_id(highest_rated_movie_id)
    if movie_index is None:
        return "No movies found for the given user.", None
    movie_title_based_on = get_title_from_index(movie_index)

    # Calculate the cosine similarities with all movies, then drop the movie itself
    # by index (movies with identical genres also score 1.0, so position alone is unreliable)
    cosine_similarities = cosine_similarity(tfidf_matrix[movie_index], tfidf_matrix).flatten()
    similar_movies = [(i, score) for i, score in enumerate(cosine_similarities) if i != movie_index]
    sorted_similar_movies = sorted(similar_movies, key=lambda x: x[1], reverse=True)

    # Get the titles of the most similar movies
    recommended_movies = [get_title_from_index(index)
                          for index, _ in sorted_similar_movies[:num_recommendations]]

    return movie_title_based_on, recommended_movies

Data preprocessing

For content-based filtering, we combine movie genres into a single string for each movie to use as features.
# Read the movie metadata. The file is '|'-delimited and Latin-1 encoded,
# so the column names are passed explicitly.
df_movies = pd.read_csv('ml-100k/ml-100k/u.item', sep='|', encoding='latin-1', header=None,
                        names=['movie id', 'movie_title', 'release_date', 'video_release_date',
                               'IMDb_URL', 'unknown', 'Action', 'Adventure', 'Animation',
                               'Children', 'Comedy', 'Crime', 'Documentary', 'Drama', 'Fantasy',
                               'Film-Noir', 'Horror', 'Musical', 'Mystery', 'Romance', 'Sci-Fi',
                               'Thriller', 'War', 'Western'])


# Combine movie genres into a single string for each movie
genre_columns = ['Action', 'Adventure', 'Animation', 'Children', 'Comedy', 'Crime',
                 'Documentary', 'Drama', 'Fantasy', 'Film-Noir', 'Horror', 'Musical',
                 'Mystery', 'Romance', 'Sci-Fi', 'Thriller', 'War', 'Western']
df_movies['combined_features'] = df_movies[genre_columns].apply(lambda x: ' '.join(x.index[x == 1]), axis=1)

Data vectorization

We use TF-IDF to convert the text-based features into a numerical matrix for similarity calculations.
# Creating a TF-IDF Vectorizer to convert genres to a matrix of TF-IDF features
tfidf_vectorizer = TfidfVectorizer()
tfidf_matrix = tfidf_vectorizer.fit_transform(df_movies['combined_features'])
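
To see how TF-IDF vectors and cosine similarity behave on genre strings, here is a minimal sketch with four made-up movies. The genre combinations are assumptions for illustration; the calls are the same scikit-learn functions used in this tutorial.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical genre strings, one per movie
genres = [
    "Action Adventure",  # movie 0
    "Action Adventure",  # movie 1: same genres as movie 0
    "Comedy Romance",    # movie 2: no genres in common with movie 0
    "Action Thriller",   # movie 3: one genre in common with movie 0
]

tfidf = TfidfVectorizer().fit_transform(genres)

# Similarity of movie 0 to every movie, including itself
sims = cosine_similarity(tfidf[0], tfidf).flatten()
print(sims.round(3))
```

Movies with identical genre strings get a similarity of 1.0 and movies with no shared genre get 0.0, so genre-only features produce many ties. Richer text such as plot summaries or tags would usually separate movies better.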

Get recommendations

The recommendation function defined earlier is now called to get recommendations based on a movie watched by a specific user, user 20 in this case.
# Get recommendations based on one movie for a specific user
user_id = 20 # Change the user_id to the desired user
based_on_movie, recommended_movies = recommend_movies_based_on_one_movie(user_id)

Log sample recommendations

Recommendations by the system are logged into Weights & Biases for efficient comparison and finally the wandb run is finished.
# Log sample recommendations
wandb.log({
"user_id": user_id,
"based_on_movie": based_on_movie,
"recommended_movies": recommended_movies
})
# Finish the run
wandb.finish()

Evaluation

The table below shows the content-based filtering runs for three different users. The top three recommended movies are shown along with the movie used as the basis for the recommendations.


And that concludes our guide to building a recommendation system from scratch. Each model we explored - collaborative filtering and content-based filtering - delivers unique results due to their fundamentally different methodologies. The choice of filtering algorithm should align with your specific task requirements and the nature of your dataset, as each approach excels under different conditions.

Conclusion

Recommendation engines are the backbone of modern personalized experiences, transforming how users interact with content across industries - from streaming platforms and e-commerce to healthcare and education. By leveraging data and machine learning, these systems not only enhance user satisfaction but also drive business growth through increased engagement and retention.
In this guide, we explored two foundational approaches: collaborative filtering and content-based filtering. Each method offers unique strengths and trade-offs, making the choice between them heavily dependent on the specific use case and the nature of the available data. However, combining both approaches into a hybrid recommendation system often results in more accurate and robust solutions, delivering an optimal balance of personalization and scalability.
As AI and machine learning technologies continue to advance, the potential for recommendation engines will only grow. From delivering real-time, tailored suggestions to seamlessly integrating into various aspects of daily life, these systems are reshaping how users discover content, products, and services. Whether it’s a binge-worthy series, a life-saving treatment plan, or a perfectly curated playlist, recommendation engines are poised to play a pivotal role in improving user satisfaction and fostering business success in the digital age.
This step-by-step guide equips you with the foundation to build and experiment with recommendation systems. By combining powerful tools like Weights & Biases for tracking with cutting-edge machine learning techniques, you can develop and refine systems that deliver meaningful, personalized experiences for your users. The future of recommendation engines is one of innovation and boundless opportunity—where will you take it?
