DEV Community

Jessica Amoura

How I Built a Movie Recommendation System Using Python

Introduction
Ever wondered how Netflix knows just what you want to watch? Recommendation systems have become an essential part of the movie industry, helping users discover films they'll love based on their preferences. In this post, I'll walk you through how I built a simple movie recommendation system using Python, leveraging publicly available datasets and libraries. Whether you're a beginner or an experienced developer, this guide will be a fun dive into the world of data and recommendations.

Step 1: Gathering the Data
To build any recommendation system, we first need data. For movies, one of the best datasets available is the MovieLens dataset. It includes information like movie titles, genres, and user ratings.

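Download the dataset: Visit the MovieLens website (grouplens.org/datasets/movielens) and download a release that ships CSV files, such as ml-latest-small, which includes movies.csv and ratings.csv.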
Load the data into Python: Use libraries like Pandas to read the dataset.

    import pandas as pd

    # Load the movies and ratings datasets
    movies = pd.read_csv('movies.csv')
    ratings = pd.read_csv('ratings.csv')

    # Preview the first rows of each table
    print(movies.head())
    print(ratings.head())

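
If you'd like to see the shape of the data before downloading anything, here is a minimal sketch that reads tiny stand-ins for the two files from in-memory strings. The column names follow the MovieLens CSV layout; the rows, and the names `sample_movies`, `sample_ratings` and `rated`, are illustrative values I made up for this example, not real query results.

```python
import io

import pandas as pd

# Tiny stand-ins for movies.csv and ratings.csv (illustrative rows only).
movies_csv = io.StringIO(
    "movieId,title,genres\n"
    "1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy\n"
    "2,Jumanji (1995),Adventure|Children|Fantasy\n"
    "3,Grumpier Old Men (1995),Comedy|Romance\n"
)
ratings_csv = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,1,4.0,964982703\n"
    "1,3,4.5,964982224\n"
    "2,2,3.0,964981247\n"
)

sample_movies = pd.read_csv(movies_csv)
sample_ratings = pd.read_csv(ratings_csv)

# Join ratings to titles on the shared movieId column.
rated = sample_ratings.merge(sample_movies, on="movieId", how="left")
print(rated[["userId", "title", "rating"]])
```

The two tables are linked by movieId, so a merge like this is how you would attach titles and genres to individual ratings.
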
Step 2: Choosing the Recommendation Approach
There are two popular types of recommendation systems:

Content-Based Filtering: Recommends movies similar to what the user has liked before.
Collaborative Filtering: Recommends movies based on what similar users have liked.
For this tutorial, let's use content-based filtering.
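
For contrast, here is a rough sketch of the collaborative idea, which the rest of this post does not use. It assumes a handful of made-up ratings (the `toy_ratings` table is hypothetical) and compares movies by how the same users rated them, rather than by genre.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Made-up ratings for illustration; real ones would come from ratings.csv.
toy_ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3],
    "movieId": [10, 20, 10, 20, 10, 30],
    "rating":  [5.0, 4.0, 4.0, 5.0, 1.0, 5.0],
})

# Rows are movies, columns are users; unrated cells become 0.
item_user = toy_ratings.pivot_table(
    index="movieId", columns="userId", values="rating", fill_value=0
)

# Movies rated similarly by the same users get a high score.
item_sim = pd.DataFrame(
    cosine_similarity(item_user),
    index=item_user.index,
    columns=item_user.index,
)
print(item_sim.round(2))
```

Movies 10 and 20 come out as close neighbours because users 1 and 2 liked both, while movies 20 and 30 share no raters and score zero. Content-based filtering needs no ratings at all, which is why it is the simpler starting point here.
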

Step 3: Building the Model
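We'll use TfidfVectorizer from scikit-learn (sklearn) to turn each movie's genres into a weighted vector, then compare movies with cosine similarity. The MovieLens movies file has no plot descriptions, so the genres column is the only text feature used here.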


    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Vectorize the genres
    tfidf = TfidfVectorizer(stop_words='english')
    movies['genres'] = movies['genres'].fillna('')  # Fill NaN values
    tfidf_matrix = tfidf.fit_transform(movies['genres'])

    # Compute the pairwise cosine similarity matrix (one row and column per movie)
    cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)

    print(cosine_sim.shape)

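
To make the vectorizing step less abstract, here is a small sketch on three hand-written genre strings in the pipe-separated MovieLens style. The strings are examples I chose for illustration; the calls are the same scikit-learn ones used above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Three genre strings in the pipe-separated MovieLens style.
genres = [
    "Adventure|Animation|Children",   # movie 0
    "Adventure|Children|Fantasy",     # movie 1
    "Crime|Thriller",                 # movie 2
]

tfidf = TfidfVectorizer(stop_words="english")
matrix = tfidf.fit_transform(genres)

# The default tokenizer splits on "|" and lowercases each genre.
print(sorted(tfidf.vocabulary_))

# Pairwise similarity: 1.0 on the diagonal, 0.0 when no genre is shared.
sim = cosine_similarity(matrix)
print(sim.round(2))
```

Movies 0 and 1 share two genres and get a positive score, while movie 2 shares none with either and scores zero. One thing to keep in mind: the default tokenizer also splits hyphenated genres, so "Sci-Fi" becomes the two tokens "sci" and "fi".
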
Step 4: Building a Recommendation Function
Now, let's create a function to recommend movies based on a selected title.


    def recommend_movies(title, cosine_sim=cosine_sim):
        # Map each title to its row position, keeping the first row for duplicated titles
        indices = pd.Series(movies.index, index=movies['title'])
        indices = indices[~indices.index.duplicated(keep='first')]
        idx = indices[title]

        # Get pairwise similarity scores, skipping the movie itself
        sim_scores = [(i, s) for i, s in enumerate(cosine_sim[idx]) if i != idx]
        sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)

        # Get top 10 recommendations
        movie_indices = [i for i, _ in sim_scores[:10]]

        return movies['title'].iloc[movie_indices]

    # Example
    print(recommend_movies('Toy Story (1995)'))

Step 5: Testing the Model
Once the function is ready, test it with different movie titles to see if the recommendations align with your expectations.
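
One way to make that check less subjective is to print the genres each recommendation shares with the movie you asked about. The sketch below assumes a four-row toy catalogue (`toy`, with placeholder titles A to D) and a hypothetical helper `shared_genres`; with the real data you would use the `movies` DataFrame instead.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy catalogue (illustrative); swap in the real movies DataFrame.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genres": ["Comedy|Romance", "Comedy|Romance|Drama",
               "Action|Thriller", "Comedy"],
})
sim = cosine_similarity(TfidfVectorizer().fit_transform(toy["genres"]))


def shared_genres(row_a, row_b):
    """Return the genres two rows of `toy` have in common."""
    a = set(toy.loc[row_a, "genres"].split("|"))
    b = set(toy.loc[row_b, "genres"].split("|"))
    return a & b


query = 0  # row of the movie we ask recommendations for
# Rank the other rows by similarity to the query, best first.
ranked = sorted((i for i in range(len(toy)) if i != query),
                key=lambda i: sim[query, i], reverse=True)

for i in ranked:
    print(toy.loc[i, "title"], round(sim[query, i], 2), shared_genres(query, i))
```

If a highly ranked movie shares no genres with the query, that is a sign something is off in the vectorizing or lookup step.
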

Step 6: Deployment (Optional)
If you want to take it further, deploy this model as a simple web application using frameworks like Flask or Django. Here's a snippet for Flask:


    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route('/recommend', methods=['GET'])
    def recommend():
        # Read the movie title from the query string, e.g. /recommend?title=Toy Story (1995)
        title = request.args.get('title')
        recommendations = recommend_movies(title)
        return jsonify(recommendations.tolist())

    if __name__ == '__main__':
        app.run(debug=True)

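
The route above will fail with a server error if the title is missing or unknown. Here is a hedged sketch of one way to handle both cases and try the endpoint with Flask's built-in test client, without starting a server. The `KNOWN` dictionary and `recommend_titles` function are hypothetical stand-ins for `recommend_movies` so the example runs on its own, and the status codes are my choice, not something the original post specifies.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for recommend_movies so the sketch runs on its own.
KNOWN = {"Toy Story (1995)": ["Jumanji (1995)"]}


def recommend_titles(title):
    """Return a list of titles, raising KeyError for unknown movies."""
    return KNOWN[title]


@app.route("/recommend", methods=["GET"])
def recommend():
    title = request.args.get("title")
    if not title:
        return jsonify(error="missing 'title' query parameter"), 400
    try:
        return jsonify(recommend_titles(title))
    except KeyError:
        return jsonify(error=f"unknown title: {title}"), 404


# Exercise the route without starting a server.
client = app.test_client()
ok = client.get("/recommend", query_string={"title": "Toy Story (1995)"})
missing = client.get("/recommend")
unknown = client.get("/recommend", query_string={"title": "Nope"})
print(ok.get_json(), missing.status_code, unknown.status_code)
```

Passing the title through `query_string` lets the test client handle URL encoding of the spaces and parentheses for you.
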
Conclusion
Congratulations! You've just built a basic movie recommendation system using Python. While this is a simple implementation, it opens up possibilities for more complex systems using deep learning or hybrid models.

If you enjoyed this post, feel free to leave a comment or share your ideas for improving the system. Happy coding!

Tags

#movies #python #recommendationsystem #machinelearning #api
