SVD-Based Recommender

Overview

Our SVD-Based Recommender employs Singular Value Decomposition (SVD), a classic matrix factorization technique, to discover latent factors that explain user preferences. SVD decomposes the user-item rating matrix into lower-dimensional components, which can then be used to predict missing ratings.

Technical Implementation

Matrix Construction

Because most users have rated only a small fraction of places, the raw user-item matrix is sparse. We densify it before factorization:

Category-Based Matrix: We build a pseudo-dense user-item matrix by:
- Using each user’s category averages as a starting point
- Filling in missing values with category averages
- Normalizing ratings across users to account for different rating scales
Place-to-Category Mapping: Each place is associated with its primary category, allowing us to leverage category-level preferences for specific place recommendations.
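
The construction steps above can be sketched in plain Python. This is a minimal illustration, not the project's code: it assumes that a missing rating is filled first from the user's own average for the place's category and, failing that, from the category's average across all users, and that "normalizing" means a per-user z-score. The names fill_matrix and normalize_rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fill_matrix(ratings, place_category):
    """Build a dense {user: {place: rating}} table.

    ratings: list of (user, place, rating) tuples.
    place_category: dict mapping each place to its primary category.
    """
    user_cat = defaultdict(list)    # (user, category) -> that user's ratings
    global_cat = defaultdict(list)  # category -> ratings from all users
    for user, place, rating in ratings:
        cat = place_category[place]
        user_cat[(user, cat)].append(rating)
        global_cat[cat].append(rating)

    overall = mean(r for _, _, r in ratings)  # last-resort fallback
    observed = {(u, p): r for u, p, r in ratings}
    users = sorted({u for u, _, _ in ratings})

    filled = {}
    for u in users:
        row = {}
        for p, cat in place_category.items():
            if (u, p) in observed:
                row[p] = observed[(u, p)]            # real rating
            elif (u, cat) in user_cat:
                row[p] = mean(user_cat[(u, cat)])    # user's category average
            elif global_cat.get(cat):
                row[p] = mean(global_cat[cat])       # assumed: global category average
            else:
                row[p] = overall                     # category never rated by anyone
        filled[u] = row
    return filled


def normalize_rows(filled):
    """Z-score each user's row (assumed normalization scheme)."""
    out = {}
    for u, row in filled.items():
        mu = mean(row.values())
        sigma = pstdev(row.values()) or 1.0  # avoid dividing by zero for flat rows
        out[u] = {p: (v - mu) / sigma for p, v in row.items()}
    return out
```

Keeping the fill and the normalization as separate steps makes it easy to inspect which cells hold real ratings and which hold imputed averages.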

SVD Algorithm

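We use the Surprise library’s SVD algorithm. Despite its name, it is the Funk-style matrix factorization popularized during the Netflix Prize, not a classical three-matrix decomposition, so it has no diagonal matrix of singular values. It:

Approximates the user-item matrix with:
- User latent factor matrix
- Item latent factor matrix
- User and item bias terms, added to the global mean rating
Optimizes these parameters using stochastic gradient descent (SGD) to minimize regularized squared prediction error on the observed ratings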
Parameters:
- n_factors: 20 (number of latent factors)
- n_epochs: 25 (training iterations)
- lr_all: 0.005 (learning rate)
- reg_all: 0.02 (regularization parameter)
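
To show what these hyperparameters control, here is a dependency-free sketch of biased matrix factorization trained with stochastic gradient descent, the same family of model that Surprise's SVD class belongs to. It is an illustration rather than the library's code: the function names are hypothetical, and the prediction rule assumed is global mean + user bias + item bias + dot product of the user and item factor vectors.

```python
import random


def train_biased_mf(ratings, n_factors=20, n_epochs=25, lr=0.005, reg=0.02, seed=0):
    """Fit biases and latent factors to (user, item, rating) tuples with SGD."""
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating

    bu = {u: 0.0 for u in users}  # user biases
    bi = {i: 0.0 for i in items}  # item biases
    # Small random starting factors (normal with std 0.1)
    pu = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    qi = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    for _ in range(n_epochs):          # n_epochs: passes over the ratings
        for u, i, r in ratings:
            dot = sum(p * q for p, q in zip(pu[u], qi[i]))
            err = r - (mu + bu[u] + bi[i] + dot)
            # lr scales each step; reg shrinks parameters toward zero
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                p, q = pu[u][f], qi[i][f]
                pu[u][f] += lr * (err * q - reg * p)
                qi[i][f] += lr * (err * p - reg * q)
    return mu, bu, bi, pu, qi


def predict(model, user, item):
    """Predict a rating; unknown users or items fall back to the global mean plus known biases."""
    mu, bu, bi, pu, qi = model
    est = mu + bu.get(user, 0.0) + bi.get(item, 0.0)
    if user in pu and item in qi:
        est += sum(p * q for p, q in zip(pu[user], qi[item]))
    return est
```

The defaults in the sketch mirror the parameter values listed above; a larger n_factors gives the model more capacity, while a larger reg pulls predictions back toward the global mean.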

Evaluation

We evaluate the model with five-fold cross-validation, so every rating is used for testing exactly once:

RMSE: 0.93 ± 0.02
MAE: 0.74 ± 0.01

Assuming a 1-5 rating scale, an MAE of 0.74 means predictions on held-out ratings are off by roughly three quarters of a point on average.
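
For reference, the two metrics and the fold split can be written in a few lines of plain Python. This is a sketch of the definitions only; the helper names are hypothetical, and a real run would normally shuffle the ratings before splitting, which this version skips for brevity.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: the average size of a miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs; each sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold::k]          # every k-th sample, offset by the fold number
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test
```

RMSE is never smaller than MAE on the same predictions, so the gap between the two reported values reflects how unevenly the errors are spread.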

Hybrid Scoring

We blend the SVD predictions with venue attributes through a linear combination:

import math

def compute_score(predicted_rating, avg_rating, user_ratings_total, distance):
    # Base score from the SVD prediction (raw rating scale, not rescaled to 0-1)
    score = predicted_rating * 0.6

    # Boost by the venue's average community rating, rescaled from 0-5 to 0-1
    score += (avg_rating / 5.0) * 0.2

    # Popularity boost based on review count (log-scaled, capped at 1.0)
    popularity = min(1.0, math.log(user_ratings_total + 1) / 10) if user_ratings_total else 0
    score += popularity * 0.1

    # Proximity bonus: 1.0 at distance 0, falling linearly to 0 at 5000.
    # A missing distance (None) gets a neutral 0.5; an explicit 0 counts as closest.
    distance_factor = max(0, 1 - (distance / 5000)) if distance is not None else 0.5
    score += distance_factor * 0.1

    return score

This hybrid approach balances collaborative signals (SVD predictions) with content features (venue attributes).
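
To see how much each term contributes, the sketch below splits the score into its four weighted parts. The helper score_breakdown is hypothetical and only reuses the weights from compute_score; the input values are made up for illustration.

```python
import math


def score_breakdown(predicted_rating, avg_rating, user_ratings_total, distance):
    """Return each weighted term of the hybrid score separately."""
    popularity = min(1.0, math.log(user_ratings_total + 1) / 10) if user_ratings_total else 0.0
    # Assumption: a missing distance is passed as None and treated as neutral (0.5)
    distance_factor = max(0.0, 1 - distance / 5000) if distance is not None else 0.5
    return {
        "svd": predicted_rating * 0.6,
        "avg_rating": (avg_rating / 5.0) * 0.2,
        "popularity": popularity * 0.1,
        "distance": distance_factor * 0.1,
    }


# Illustrative venue: predicted 4.2, community average 4.5, 1200 reviews, distance 800
parts = score_breakdown(4.2, 4.5, 1200, 800)
total = sum(parts.values())
for name, value in parts.items():
    print(f"{name:>10}: {value:.3f} ({value / total:.0%} of total)")
```

Because the SVD prediction enters on its raw rating scale while the other three terms are capped at 0.2, 0.1 and 0.1, the collaborative signal accounts for most of the final score. Dividing predicted_rating by the maximum rating first would make the 0.6 / 0.2 / 0.1 / 0.1 weights directly comparable, if that is the intended balance.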

Implementation in Code

The SVDPlaceRecommender class implements this model:

class SVDPlaceRecommender:
    def __init__(self, category_to_place_types):
        self.model = None
        self.category_to_place_types = category_to_place_types

    def fit(self, places_df):
        # Convert places to user-item matrix and train SVD
        ...

    def evaluate_model(self):
        # Run cross-validation and return metrics
        ...

    def get_recommendations(self, df, user_lat, user_lon, predicted_ratings, top_n=5, max_distance=5):
        # Generate recommendations based on SVD predictions
        ...
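
The method bodies are elided above, so the following is only a guess at the kind of filtering and ranking that get_recommendations could perform, not the class's actual logic. It assumes max_distance is in kilometers, that each place is a dict with hypothetical keys place_id, lat and lon, and that predicted_ratings maps place_id to a predicted rating.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def rank_nearby(places, user_lat, user_lon, predicted_ratings, top_n=5, max_distance=5):
    """Keep places within max_distance km, then return the top_n ids by predicted rating."""
    candidates = []
    for place in places:
        dist = haversine_km(user_lat, user_lon, place["lat"], place["lon"])
        if dist <= max_distance and place["place_id"] in predicted_ratings:
            candidates.append((predicted_ratings[place["place_id"]], dist, place["place_id"]))
    # Highest predicted rating first; nearer place wins ties
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [place_id for _, _, place_id in candidates[:top_n]]
```

Filtering by distance before ranking keeps far-away venues out of the list no matter how high their predicted rating is.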

Advantages

Proven Technique: SVD is a well-established, theoretically sound recommendation approach
Efficient Computation: Typically faster to train and serve than deep learning methods
Handles Sparsity: Remains usable when each user has rated only a few places, though not when a user has no ratings at all
Explainable Components: Latent factors can be analyzed to understand recommendation patterns

Limitations and Future Improvements

Cold-Start Challenge: Limited effectiveness for completely new users or items
Static Model: Doesn’t automatically adapt to new users/items without retraining
Future Work:
- Implement incremental model updates
- Explore matrix factorization trained with ranking losses such as BPR or WARP
- Integrate temporal dynamics to capture evolving user preferences