SVD-Based Recommender
Overview
Our SVD-Based Recommender employs Singular Value Decomposition (SVD), a classic matrix factorization technique, to discover latent factors that explain user preferences. SVD decomposes the user-item rating matrix into lower-dimensional components, which can then be used to predict missing ratings.
Technical Implementation
Matrix Construction
Since users haven’t rated every place (creating a sparse matrix), we take a novel approach:
- Category-Based Matrix: We build a pseudo-dense user-item matrix by:
- Using each user’s category averages as a starting point
- Filling in missing values with category averages
- Normalizing ratings across users to account for different rating scales
- Place-to-Category Mapping: Each place is associated with its primary category, allowing us to leverage category-level preferences for specific place recommendations.
SVD Algorithm
We use the Surprise library’s implementation of SVD, which:
- Factorizes the user-item matrix into:
- User latent factor matrix
- Singular values diagonal matrix
- Item latent factor matrix
-
Optimizes these matrices using alternating least squares to minimize prediction error
- Parameters:
n_factors
: 20 (number of latent factors)n_epochs
: 25 (training iterations)lr_all
: 0.005 (learning rate)reg_all
: 0.02 (regularization parameter)
Evaluation
Our validation approach uses five-fold cross-validation to ensure robust performance:
- RMSE: 0.93 ± 0.02
- MAE: 0.74 ± 0.01
These metrics indicate strong predictive performance on held-out data.
Hybrid Scoring
We blend the SVD predictions with venue attributes through a linear combination:
def compute_score(predicted_rating, avg_rating, user_ratings_total, distance):
# Base score from the SVD prediction
score = predicted_rating * 0.6
# Boost by the venue's average community rating
score += (avg_rating / 5.0) * 0.2
# Popularity boost based on review count (log-scaled)
popularity = min(1.0, math.log(user_ratings_total + 1) / 10) if user_ratings_total else 0
score += popularity * 0.1
# Distance penalty (inverse relationship)
distance_factor = max(0, 1 - (distance / 5000)) if distance else 0.5
score += distance_factor * 0.1
return score
This hybrid approach balances collaborative signals (SVD predictions) with content features (venue attributes).
Implementation in Code
The SVDPlaceRecommender class implements this model:
class SVDPlaceRecommender:
def __init__(self, category_to_place_types):
self.model = None
self.category_to_place_types = category_to_place_types
def fit(self, places_df):
# Convert places to user-item matrix and train SVD
# ...
def evaluate_model(self):
# Run cross-validation and return metrics
# ...
def get_recommendations(self, df, user_lat, user_lon, predicted_ratings, top_n=5, max_distance=5):
# Generate recommendations based on SVD predictions
# ...
Advantages
- Proven Technique: SVD is a well-established, theoretically sound recommendation approach
- Efficient Computing: Faster training and inference compared to deep learning methods
- Handles Sparsity: Works well with limited user-item interactions
- Explainable Components: Latent factors can be analyzed to understand recommendation patterns
Limitations and Future Improvements
- Cold-Start Challenge: Limited effectiveness for completely new users or items
- Static Model: Doesn’t automatically adapt to new users/items without retraining
- Future Work:
- Implement incremental model updates
- Explore advanced matrix factorization techniques like BPR or WARP loss
- Integrate temporal dynamics to capture evolving user preferences