Autoencoder-Based Recommender
Overview
Our Autoencoder-Based Recommender uses a neural network architecture to learn latent patterns in user preferences across different place categories. This model excels at uncovering hidden connections, such as how a fondness for modern art might translate into recommendations for contemporary galleries.
Technical Implementation
Architecture
The autoencoder consists of two main components:
- Encoder: Compresses user preference vectors into a lower-dimensional latent space
- Decoder: Reconstructs the full user preference vector from the compressed representation
Input Layer (29 neurons) → Hidden Layer (16 neurons) → Latent Space (8 neurons) → Hidden Layer (16 neurons) → Output Layer (29 neurons)
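A minimal sketch of this architecture in Keras. The layer sizes follow the diagram above; the ReLU hidden activations, sigmoid output, and dropout placement are assumptions, since the source specifies only the layer widths and the dropout rate:

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_CATEGORIES = 29

# Encoder: 29 -> 16 -> 8; Decoder: 8 -> 16 -> 29.
autoencoder = tf.keras.Sequential([
    layers.Input(shape=(NUM_CATEGORIES,)),
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.2),                                 # regularization (rate = 0.2)
    layers.Dense(8, activation="relu"),                  # latent space
    layers.Dense(16, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(NUM_CATEGORIES, activation="sigmoid"),  # outputs in [0, 1]
])
```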
Input Representation
- Each user’s preferences are represented as a vector of ratings across 29 categories
- Each element corresponds to a category (e.g., “restaurants”, “museums”, “parks”)
- Values range from 0 to 5, indicating the user's preference strength for each category
- Unknown preferences are initially set to 0 (see the example vector after this list)
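For illustration, a hypothetical preference vector for a user who has rated only three categories. The category list here is a stand-in; in practice it is the ordered list of the 29 real category names:

```python
import numpy as np

# Stand-in for the real ordered list of 29 category names.
categories = ["restaurants", "museums", "parks"] + [f"category_{i}" for i in range(26)]

user_prefs = np.zeros(len(categories))            # 0 marks an unknown preference
user_prefs[categories.index("restaurants")] = 5.0
user_prefs[categories.index("museums")] = 4.0
user_prefs[categories.index("parks")] = 2.0

provided_mask = user_prefs > 0                    # which entries the user actually rated
```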
Training Process
- Data Preparation:
  - User preference vectors are normalized to the [0, 1] range
  - Known preferences serve as the reconstruction targets
  - Missing preferences are masked out of the loss calculation (see the sketch after this list)
- Optimization:
  - Mean squared error (MSE) loss, computed only over known preferences
  - Adam optimizer with a learning rate of 0.001
  - Early stopping to prevent overfitting
  - Dropout layers (rate = 0.2) for regularization
- Denoising:
  - Random noise is added to the inputs during training
  - This forces the autoencoder to learn robust feature representations
  - It improves generalization to new users
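A sketch of the masked loss and denoising setup, using the `autoencoder` model defined above. The Gaussian noise scale, epoch count, and the placeholder training matrix are assumptions for illustration:

```python
import numpy as np
import tensorflow as tf

def masked_mse(y_true, y_pred):
    # Only explicitly rated categories (non-zero targets) contribute to the loss.
    mask = tf.cast(tf.not_equal(y_true, 0.0), tf.float32)
    squared_error = tf.square((y_true - y_pred) * mask)
    return tf.reduce_sum(squared_error) / (tf.reduce_sum(mask) + 1e-8)

# Placeholder for the real ratings matrix, already scaled to [0, 1].
X_train = np.random.rand(512, NUM_CATEGORIES).astype("float32")

# Denoising: corrupt the inputs, but reconstruct the clean targets.
noisy = np.clip(X_train + np.random.normal(0.0, 0.1, X_train.shape), 0.0, 1.0)

autoencoder.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                    loss=masked_mse)
autoencoder.fit(noisy, X_train,
                epochs=100, batch_size=32, validation_split=0.1,
                callbacks=[tf.keras.callbacks.EarlyStopping(
                    patience=10, restore_best_weights=True)])
```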
Recommendation Generation
The recommendation process follows these steps:
- Preference Prediction:
  - Take the user's explicit preferences as input
  - Pass them through the trained autoencoder
  - The output contains predicted ratings for the categories the user hasn't explicitly rated
- Place Scoring (a sketch follows this list):
  - For each venue, take the predicted score for its primary category
  - Scale it by the venue's average community rating
  - Apply a boost based on total review count
  - Apply a small penalty proportional to distance (computed with the Haversine formula)
- Ranking:
  - Sort places by their final scores
  - Return the top-N recommendations
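A sketch of the scoring step. The Haversine distance is standard; the logarithmic review boost and the linear distance-penalty coefficient are illustrative weightings, not the production values:

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

# Preference prediction: run the scaled vector through the trained model, e.g.
#   predicted = autoencoder.predict(user_prefs_scaled[np.newaxis, :])[0]

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two (lat, lon) points, in kilometers.
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def score_place(predicted_pref, avg_rating, review_count, distance_km):
    # Predicted category preference, scaled by community rating (out of 5),
    # boosted by log-damped review volume, minus a small distance penalty.
    review_boost = 1.0 + 0.1 * np.log1p(review_count)   # illustrative weight
    distance_penalty = 0.05 * distance_km               # illustrative weight
    return predicted_pref * (avg_rating / 5.0) * review_boost - distance_penalty
```

Ranking is then a single descending sort over the scored places, keeping the top N.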
Implementation in Code
The AutoencoderRecommender class implements this model:
```python
class AutoencoderRecommender:
    def __init__(self, auto_model, scaler, places_df,
                 categories_list, category_mappings):
        self.model = auto_model        # Pretrained TensorFlow autoencoder
        self.scaler = scaler           # For normalizing preference vectors
        self.places_df = places_df     # Candidate places with ratings and locations
        self.categories = categories_list                 # The 29 preference categories
        self.category_to_place_types = category_mappings  # Category-to-place-type map

    def get_recommendations(self, user_lat, user_lon, user_prefs,
                            provided_mask, num_recs=5):
        # Implementation of recommendation algorithm
        # ...
```
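A hypothetical call, reusing the vector and mask from the input-representation example; the coordinates are placeholders:

```python
recommender = AutoencoderRecommender(auto_model, scaler, places_df,
                                     categories_list, category_mappings)
recommendations = recommender.get_recommendations(
    user_lat=40.4406, user_lon=-79.9959,   # placeholder coordinates
    user_prefs=user_prefs, provided_mask=provided_mask,
    num_recs=5)
```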
Advantages
- Handles Sparse Input: Produces quality recommendations even when users have only rated a few categories
- Discovers Latent Patterns: Captures non-obvious relationships between preferences
- Personalization: Tailors recommendations to individual preference profiles
- Cold-Start Solution: Can generate recommendations with minimal user input
Limitations and Future Improvements
- Training Data Requirements: Needs substantial user-category ratings for training
- Interpretability: Latent factors are less interpretable than explicit features
- Future Work: We plan to implement a variational autoencoder (VAE) to better model the uncertainty in user preferences