Skip to the content.

SVD-Based Recommender

Overview

Our SVD-Based Recommender employs Singular Value Decomposition (SVD), a classic matrix factorization technique, to discover latent factors that explain user preferences. SVD decomposes the user-item rating matrix into lower-dimensional components, which can then be used to predict missing ratings.

Technical Implementation

Matrix Construction

Since users haven’t rated every place (creating a sparse matrix), we take a novel approach:

  1. Category-Based Matrix: We build a pseudo-dense user-item matrix by:
    • Using each user’s category averages as a starting point
    • Filling in missing values with category averages
    • Normalizing ratings across users to account for different rating scales
  2. Place-to-Category Mapping: Each place is associated with its primary category, allowing us to leverage category-level preferences for specific place recommendations.

SVD Algorithm

We use the Surprise library’s implementation of SVD, which:

  1. Factorizes the user-item matrix into:
    • User latent factor matrix
    • Singular values diagonal matrix
    • Item latent factor matrix
  2. Optimizes these matrices using alternating least squares to minimize prediction error

  3. Parameters:
    • n_factors: 20 (number of latent factors)
    • n_epochs: 25 (training iterations)
    • lr_all: 0.005 (learning rate)
    • reg_all: 0.02 (regularization parameter)

Evaluation

Our validation approach uses five-fold cross-validation to ensure robust performance:

These metrics indicate strong predictive performance on held-out data.

Hybrid Scoring

We blend the SVD predictions with venue attributes through a linear combination:

def compute_score(predicted_rating, avg_rating, user_ratings_total, distance):
    # Base score from the SVD prediction
    score = predicted_rating * 0.6
    
    # Boost by the venue's average community rating
    score += (avg_rating / 5.0) * 0.2
    
    # Popularity boost based on review count (log-scaled)
    popularity = min(1.0, math.log(user_ratings_total + 1) / 10) if user_ratings_total else 0
    score += popularity * 0.1
    
    # Distance penalty (inverse relationship)
    distance_factor = max(0, 1 - (distance / 5000)) if distance else 0.5
    score += distance_factor * 0.1
    
    return score

This hybrid approach balances collaborative signals (SVD predictions) with content features (venue attributes).

Implementation in Code

The SVDPlaceRecommender class implements this model:

class SVDPlaceRecommender:
    def __init__(self, category_to_place_types):
        self.model = None
        self.category_to_place_types = category_to_place_types
        
    def fit(self, places_df):
        # Convert places to user-item matrix and train SVD
        # ...
        
    def evaluate_model(self):
        # Run cross-validation and return metrics
        # ...
        
    def get_recommendations(self, df, user_lat, user_lon, predicted_ratings, top_n=5, max_distance=5):
        # Generate recommendations based on SVD predictions
        # ...

Advantages

Limitations and Future Improvements