Chapter 19: Building Recommendation Systems
Overview
Recommendation systems are a natural application of Vektagraf's vector search capabilities. This chapter shows how to build a recommendation engine that combines collaborative filtering with content-based filtering, updates in real time as users interact, and tracks recommendation quality with explicit metrics.
Learning Objectives
- Understand recommendation system architectures with Vektagraf
- Implement collaborative and content-based filtering
- Build real-time recommendation pipelines
- Measure and optimize recommendation quality
- Handle cold start problems and data sparsity
Prerequisites
- Completed Chapters 1-5 (Foundations and Core Features)
- Understanding of vector search concepts
- Basic knowledge of machine learning concepts
Core Concepts
Recommendation System Types
Vektagraf supports several recommendation approaches, which can be used on their own or combined into a hybrid system:
- Content-Based Filtering: Recommends items similar to those a user has liked
- Collaborative Filtering: Recommends items based on similar users' preferences
- Hybrid Approaches: Combines multiple techniques for better results
- Context-Aware: Incorporates temporal and situational factors
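One way to picture the hybrid approach is as a weighted blend of per-algorithm scores. The sketch below is illustrative only: `blendScores` and its plain `Map<String, double>` inputs are names assumed for this example, not part of Vektagraf's API, and it assumes every algorithm reports scores on a comparable scale.

```dart
/// Blends per-algorithm scores into a single ranking.
///
/// Each entry in [scoreLists] maps a product ID to one algorithm's score
/// (assumed to be on a comparable scale, for example 0.0 to 1.0).
/// [weights] holds one weight per algorithm, in the same order.
List<MapEntry<String, double>> blendScores(
  List<Map<String, double>> scoreLists,
  List<double> weights,
) {
  if (scoreLists.length != weights.length) {
    throw ArgumentError('Expected one weight per score list');
  }
  final blended = <String, double>{};
  for (var i = 0; i < scoreLists.length; i++) {
    scoreLists[i].forEach((productId, score) {
      // A product suggested by several algorithms accumulates their weighted scores.
      blended[productId] = (blended[productId] ?? 0.0) + weights[i] * score;
    });
  }
  // Highest blended score first.
  return blended.entries.toList()
    ..sort((a, b) => b.value.compareTo(a.value));
}
```

A product that appears in more than one list rises in the ranking, which is usually the intended effect of a hybrid: agreement between independent signals counts as extra evidence.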
Vector Embeddings in Recommendations
Vektagraf treats embeddings as first-class object properties, making it natural to build recommendation systems:
// User and item embeddings are just object properties
class User extends VektaObject {
late String name;
late String email;
late List<double> preferenceVector; // User's preference embedding
late List<String> viewedItems;
late Map<String, double> ratings;
}
class Product extends VektaObject {
late String name;
late String category;
late List<double> contentVector; // Product's content embedding
late List<String> tags;
late double averageRating;
late int viewCount;
}
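Because the embeddings are plain `List<double>` properties, similarity between a user and a product (or between two products) can be computed directly. The function below is a minimal sketch of cosine similarity, the distance metric the schema later in this chapter configures; the name `cosineSimilarity` is an assumption for illustration and is not taken from Vektagraf.

```dart
import 'dart:math' as math;

/// Cosine similarity between two equal-length vectors, from -1.0 to 1.0.
/// Returns 0.0 when either vector has zero magnitude.
double cosineSimilarity(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same number of dimensions');
  }
  var dot = 0.0;
  var normA = 0.0;
  var normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0.0 || normB == 0.0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}
```

In practice the database's vector index performs this comparison during search; a local helper like this is mainly useful for scoring or testing a handful of candidates in application code.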
Practical Examples
Complete Recommendation System Implementation
Let's build a comprehensive e-commerce recommendation system:
1. Schema Definition
{
"name": "RecommendationSystem",
"version": "1.0.0",
"objects": {
"User": {
"properties": {
"name": {"type": "string", "required": true},
"email": {"type": "string", "required": true, "unique": true},
"preferenceVector": {
"type": "vector",
"dimensions": 128,
"algorithm": "hnsw",
"distance": "cosine"
},
"demographics": {
"type": "object",
"properties": {
"age": {"type": "integer"},
"location": {"type": "string"},
"interests": {"type": "array", "items": {"type": "string"}}
}
},
"viewHistory": {"type": "array", "items": {"type": "string"}},
"purchaseHistory": {"type": "array", "items": {"type": "string"}},
"ratings": {"type": "object"},
"lastActive": {"type": "datetime"}
}
},
"Product": {
"properties": {
"name": {"type": "string", "required": true},
"description": {"type": "string"},
"category": {"type": "string", "required": true},
"contentVector": {
"type": "vector",
"dimensions": 128,
"algorithm": "hnsw",
"distance": "cosine"
},
"features": {
"type": "object",
"properties": {
"brand": {"type": "string"},
"price": {"type": "number"},
"tags": {"type": "array", "items": {"type": "string"}},
"specifications": {"type": "object"}
}
},
"metrics": {
"type": "object",
"properties": {
"averageRating": {"type": "number", "default": 0.0},
"ratingCount": {"type": "integer", "default": 0},
"viewCount": {"type": "integer", "default": 0},
"purchaseCount": {"type": "integer", "default": 0}
}
},
"createdAt": {"type": "datetime"}
}
},
"Interaction": {
"properties": {
"userId": {"type": "string", "required": true},
"productId": {"type": "string", "required": true},
"type": {"type": "string", "enum": ["view", "like", "purchase", "rating"]},
"value": {"type": "number"},
"context": {
"type": "object",
"properties": {
"sessionId": {"type": "string"},
"device": {"type": "string"},
"timestamp": {"type": "datetime"},
"duration": {"type": "integer"}
}
},
"timestamp": {"type": "datetime", "required": true}
}
},
"Recommendation": {
"properties": {
"userId": {"type": "string", "required": true},
"productId": {"type": "string", "required": true},
"score": {"type": "number", "required": true},
"algorithm": {"type": "string", "required": true},
"context": {"type": "string"},
"explanation": {"type": "string"},
"generatedAt": {"type": "datetime", "required": true},
"expiresAt": {"type": "datetime"},
"served": {"type": "boolean", "default": false},
"clicked": {"type": "boolean", "default": false}
}
}
},
"relationships": {
"UserInteractions": {
"from": "User",
"to": "Interaction",
"type": "one_to_many",
"foreignKey": "userId"
},
"ProductInteractions": {
"from": "Product",
"to": "Interaction",
"type": "one_to_many",
"foreignKey": "productId"
},
"UserRecommendations": {
"from": "User",
"to": "Recommendation",
"type": "one_to_many",
"foreignKey": "userId"
}
}
}
2. Recommendation Engine Implementation
import 'dart:math' as math; // Needed for math.exp in _getInteractionWeight
class RecommendationEngine {
final VektaDatabase db;
final EmbeddingService embeddingService;
RecommendationEngine(this.db, this.embeddingService);
/// Generate recommendations using hybrid approach
Future<List<Recommendation>> generateRecommendations(
String userId, {
int count = 10,
String context = 'general',
}) async {
final user = await db.users.findById(userId);
if (user == null) throw Exception('User not found');
// Get recommendations from multiple algorithms
final contentBased = await _getContentBasedRecommendations(user, count);
final collaborative = await _getCollaborativeRecommendations(user, count);
final trending = await _getTrendingRecommendations(count ~/ 4);
// Combine and rank recommendations
final combined = _combineRecommendations([
contentBased,
collaborative,
trending,
], weights: [0.4, 0.4, 0.2]);
// Apply business rules and filters
final filtered = await _applyBusinessRules(combined, user, context);
// Store recommendations for tracking
await _storeRecommendations(userId, filtered, context);
return filtered.take(count).toList();
}
/// Content-based recommendations using product similarity
Future<List<Recommendation>> _getContentBasedRecommendations(
User user,
int count,
) async {
// Get user's interaction history
final interactions = await db.interactions
.where('userId', user.id)
.where('type', whereIn: ['purchase', 'like', 'rating'])
.where('value', greaterThan: 3.0) // Only positive interactions; purchases and likes stored without a value above 3.0 are excluded too
.orderBy('timestamp', descending: true)
.limit(50)
.find();
if (interactions.isEmpty) {
return _getColdStartRecommendations(user, count);
}
// Get products from positive interactions
final likedProductIds = interactions.map((i) => i.productId).toSet();
final likedProducts = await db.products
.where('id', whereIn: likedProductIds.toList())
.find();
// Calculate average preference vector
final preferenceVector = _calculatePreferenceVector(likedProducts);
// Find similar products using vector search
final similarProducts = await db.products
.vectorSearch(
'contentVector',
preferenceVector,
limit: count * 3, // Get more for filtering
threshold: 0.7,
)
.where('id', whereNotIn: likedProductIds.toList()) // Exclude already seen
.find();
return similarProducts.map((product) => Recommendation()
..userId = user.id
..productId = product.id
..score = _calculateContentScore(preferenceVector, product.contentVector)
..algorithm = 'content_based'
..explanation = 'Based on items you liked in ${product.category}'
..generatedAt = DateTime.now()
..expiresAt = DateTime.now().add(Duration(hours: 24))
).toList();
}
/// Collaborative filtering using user similarity
Future<List<Recommendation>> _getCollaborativeRecommendations(
User user,
int count,
) async {
// Update user preference vector based on recent interactions
await _updateUserPreferenceVector(user);
// Find similar users using vector search
final similarUsers = await db.users
.vectorSearch(
'preferenceVector',
user.preferenceVector,
limit: 50,
threshold: 0.6,
)
.where('id', notEquals: user.id)
.find();
if (similarUsers.isEmpty) {
return _getPopularRecommendations(count);
}
// Get products liked by similar users
final similarUserIds = similarUsers.map((u) => u.id).toList();
final theirInteractions = await db.interactions
.where('userId', whereIn: similarUserIds)
.where('type', whereIn: ['purchase', 'like', 'rating'])
.where('value', greaterThan: 3.0)
.find();
// Score products based on similar users' preferences
final productScores = <String, double>{};
final productCounts = <String, int>{};
for (final interaction in theirInteractions) {
final userSimilarity = _getUserSimilarity(user, similarUsers
.firstWhere((u) => u.id == interaction.userId));
productScores[interaction.productId] =
(productScores[interaction.productId] ?? 0.0) +
(interaction.value * userSimilarity);
productCounts[interaction.productId] =
(productCounts[interaction.productId] ?? 0) + 1;
}
// Filter out products user has already interacted with
final userProductIds = (await db.interactions
.where('userId', user.id)
.find()).map((i) => i.productId).toSet();
productScores.removeWhere((productId, _) =>
userProductIds.contains(productId));
// Sort by score and get top products
final sortedProducts = productScores.entries.toList()
..sort((a, b) => b.value.compareTo(a.value));
final topProductIds = sortedProducts
.take(count)
.map((e) => e.key)
.toList();
final products = await db.products
.where('id', whereIn: topProductIds)
.find();
return products.map((product) => Recommendation()
..userId = user.id
..productId = product.id
..score = productScores[product.id]! / productCounts[product.id]!
..algorithm = 'collaborative_filtering'
..explanation = 'Users with similar tastes also liked this'
..generatedAt = DateTime.now()
..expiresAt = DateTime.now().add(Duration(hours: 24))
).toList();
}
/// Get trending products for diversity
Future<List<Recommendation>> _getTrendingRecommendations(int count) async {
final trending = await db.products
.orderBy('metrics.viewCount', descending: true)
.where('metrics.averageRating', greaterThan: 4.0)
.where('createdAt', greaterThan: DateTime.now().subtract(Duration(days: 30)))
.limit(count)
.find();
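// Trending recommendations are not user-specific, so userId is not set here;
// the caller must assign it before a recommendation is saved
return trending.map((product) => Recommendation()
..productId = product.id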
..score = product.metrics.averageRating *
(product.metrics.viewCount / 1000.0)
..algorithm = 'trending'
..explanation = 'Trending now'
..generatedAt = DateTime.now()
..expiresAt = DateTime.now().add(Duration(hours: 6))
).toList();
}
/// Handle cold start problem for new users
Future<List<Recommendation>> _getColdStartRecommendations(
User user,
int count,
) async {
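// Use demographic-based recommendations for half of the slots
final demographicCount = count ~/ 2;
final demographicProducts = await _getDemographicRecommendations(user, demographicCount);
// Fill the remaining slots with popular products in the user's interests
// (count - demographicCount avoids losing a slot when count is odd)
final interestProducts = await _getInterestBasedRecommendations(user, count - demographicCount);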
return [...demographicProducts, ...interestProducts];
}
/// Update user preference vector based on interactions
Future<void> _updateUserPreferenceVector(User user) async {
final recentInteractions = await db.interactions
.where('userId', user.id)
.where('timestamp', greaterThan: DateTime.now().subtract(Duration(days: 30)))
.find();
if (recentInteractions.isEmpty) return;
// Get products from interactions (deduplicated IDs)
final productIds = recentInteractions.map((i) => i.productId).toSet().toList();
final products = await db.products
.where('id', whereIn: productIds)
.find();
final productsById = {for (final p in products) p.id: p};
// Calculate weighted average of product vectors
final weightedVectors = <List<double>>[];
final weights = <double>[];
for (final interaction in recentInteractions) {
// Skip interactions whose product no longer exists instead of throwing
final product = productsById[interaction.productId];
if (product == null) continue;
final weight = _getInteractionWeight(interaction);
weightedVectors.add(product.contentVector);
weights.add(weight);
}
if (weightedVectors.isEmpty) return;
user.preferenceVector = _calculateWeightedAverage(weightedVectors, weights);
await db.users.save(user);
}
/// Calculate interaction weight based on type and recency
double _getInteractionWeight(Interaction interaction) {
final typeWeights = {
'view': 1.0,
'like': 3.0,
'purchase': 5.0,
'rating': interaction.value,
};
final baseWeight = typeWeights[interaction.type] ?? 1.0;
// Apply recency decay
final daysSince = DateTime.now().difference(interaction.timestamp).inDays;
final recencyFactor = math.exp(-daysSince / 30.0); // 30-day time constant (half-life is about 20.8 days)
return baseWeight * recencyFactor;
}
}
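The engine above calls `_calculateWeightedAverage` and applies a recency decay without showing either in isolation. The following is a minimal sketch of both steps under stated assumptions: the function names `recencyFactor` and `weightedAverage` are hypothetical, and the decay uses the same `exp(-days / 30)` form as `_getInteractionWeight`, which falls to 1/e (about 0.37) after 30 days.

```dart
import 'dart:math' as math;

/// Exponential recency factor: 1.0 for an interaction made today, falling to
/// 1/e (about 0.37) after [timeConstantDays] days.
double recencyFactor(int daysSince, {double timeConstantDays = 30.0}) {
  return math.exp(-daysSince / timeConstantDays);
}

/// Weighted element-wise mean of [vectors].
/// Returns an empty list when there is nothing to average.
List<double> weightedAverage(List<List<double>> vectors, List<double> weights) {
  if (vectors.length != weights.length) {
    throw ArgumentError('Expected one weight per vector');
  }
  final totalWeight = weights.fold<double>(0.0, (sum, w) => sum + w);
  if (vectors.isEmpty || totalWeight == 0.0) return <double>[];

  final dims = vectors.first.length;
  final result = List<double>.filled(dims, 0.0);
  for (var v = 0; v < vectors.length; v++) {
    if (vectors[v].length != dims) {
      throw ArgumentError('All vectors must have the same dimensions');
    }
    for (var d = 0; d < dims; d++) {
      result[d] += vectors[v][d] * weights[v];
    }
  }
  // Divide by the total weight so the result stays on the scale of the inputs.
  for (var d = 0; d < dims; d++) {
    result[d] /= totalWeight;
  }
  return result;
}
```

For example, a purchase (base weight 5.0) made 30 days ago contributes about 5.0 × 0.37 ≈ 1.84, still more than a view (1.0) made today. If a true 30-day half-life is wanted, `math.pow(0.5, daysSince / 30.0)` could be used in place of the exponential.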
3. Real-Time Recommendation Updates
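import 'dart:async'; // Provides StreamController
class RealTimeRecommendationService {
final VektaDatabase db; // Used by handleInteraction and _updateUserRecommendations
final RecommendationEngine engine;
final StreamController<RecommendationUpdate> _updateController;
RealTimeRecommendationService(this.db, this.engine)
: _updateController = StreamController.broadcast();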
Stream<RecommendationUpdate> get updates => _updateController.stream;
/// Handle real-time user interactions
Future<void> handleInteraction(Interaction interaction) async {
await db.interactions.save(interaction);
// Update recommendations if significant interaction
if (_isSignificantInteraction(interaction)) {
await _updateUserRecommendations(interaction.userId);
}
// Update product metrics
await _updateProductMetrics(interaction.productId, interaction);
// Trigger model updates if needed
if (_shouldUpdateModel(interaction)) {
await _scheduleModelUpdate();
}
}
/// Update recommendations for a specific user
Future<void> _updateUserRecommendations(String userId) async {
try {
// Invalidate old recommendations before generating new ones:
// generateRecommendations() stores its results, so expiring afterwards
// would also expire the recommendations that were just created
await db.recommendations
.where('userId', userId)
.where('expiresAt', greaterThan: DateTime.now())
.update({'expiresAt': DateTime.now()});
// Generate fresh recommendations (already stored by the engine)
final newRecommendations = await engine.generateRecommendations(
userId,
context: 'real_time_update',
);
_updateController.add(RecommendationUpdate(
userId: userId,
recommendations: newRecommendations,
trigger: 'interaction',
));
} catch (e) {
print('Error updating recommendations for user $userId: $e');
}
}
bool _isSignificantInteraction(Interaction interaction) {
return interaction.type == 'purchase' ||
interaction.type == 'like' ||
(interaction.type == 'rating' && interaction.value >= 4.0);
}
}
A/B Testing Framework
class RecommendationABTesting {
final VektaDatabase db;
RecommendationABTesting(this.db);
/// Run A/B test comparing different algorithms
Future<ABTestResult> runAlgorithmTest({
required List<String> userIds,
required Map<String, RecommendationAlgorithm> algorithms,
required Duration testDuration,
}) async {
final testId = VektaId.generate();
final startTime = DateTime.now();
// Randomly assign users to test groups
final assignments = _assignUsersToGroups(userIds, algorithms.keys.toList());
// Store test configuration
await _storeTestConfig(testId, algorithms, assignments, testDuration);
// Generate recommendations for each group
for (final entry in assignments.entries) {
final userId = entry.key;
final algorithmName = entry.value;
final algorithm = algorithms[algorithmName]!;
final recommendations = await algorithm.generateRecommendations(userId);
// Tag recommendations with test info
for (final rec in recommendations) {
rec.context = 'ab_test:$testId:$algorithmName';
await db.recommendations.save(rec);
}
}
// Monitor test progress
return _monitorTest(testId, testDuration);
}
/// Analyze A/B test results
Future<ABTestAnalysis> analyzeTest(String testId) async {
final testConfig = await _getTestConfig(testId);
final recommendations = await db.recommendations
.where('context', startsWith: 'ab_test:$testId:') // Trailing colon avoids matching other test IDs that share this prefix
.find();
final metrics = <String, TestMetrics>{};
for (final algorithmName in testConfig.algorithms.keys) {
final algorithmRecs = recommendations
.where((r) => r.context.endsWith(':$algorithmName'))
.toList();
metrics[algorithmName] = await _calculateMetrics(algorithmRecs);
}
return ABTestAnalysis(
testId: testId,
duration: testConfig.duration,
metrics: metrics,
winner: _determineWinner(metrics),
significance: await _calculateStatisticalSignificance(metrics),
);
}
Future<TestMetrics> _calculateMetrics(List<Recommendation> recommendations) async {
final served = recommendations.where((r) => r.served).length;
final clicked = recommendations.where((r) => r.clicked).length;
final clickThroughRate = served > 0 ? clicked / served : 0.0;
// Calculate conversion rate (purchases after recommendation)
final conversions = await _calculateConversions(recommendations);
final conversionRate = served > 0 ? conversions / served : 0.0;
// Calculate diversity metrics
final diversity = _calculateDiversity(recommendations);
return TestMetrics(
served: served,
clicked: clicked,
clickThroughRate: clickThroughRate,
conversionRate: conversionRate,
diversity: diversity,
);
}
}
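The framework above calls `_calculateStatisticalSignificance` without defining it. One common choice for comparing two click-through rates is a pooled two-proportion z-test; the sketch below assumes that test and uses hypothetical names (`twoProportionZ`, `isSignificantAt5Percent`). It is not presented as the method the chapter's framework uses.

```dart
import 'dart:math' as math;

/// z-statistic for the difference between two click-through rates,
/// using the pooled two-proportion z-test.
/// Returns 0.0 when a group has no impressions or the rates cannot differ.
double twoProportionZ({
  required int clicksA,
  required int servedA,
  required int clicksB,
  required int servedB,
}) {
  if (servedA == 0 || servedB == 0) return 0.0;
  final rateA = clicksA / servedA;
  final rateB = clicksB / servedB;
  // Pooled rate under the null hypothesis that both groups share one CTR.
  final pooled = (clicksA + clicksB) / (servedA + servedB);
  final standardError =
      math.sqrt(pooled * (1 - pooled) * (1 / servedA + 1 / servedB));
  if (standardError == 0.0) return 0.0;
  return (rateA - rateB) / standardError;
}

/// True when |z| exceeds 1.96, the two-sided critical value at the 5% level.
bool isSignificantAt5Percent(double z) => z.abs() > 1.96;
```

With 120 clicks out of 1,000 impressions for one algorithm and 100 out of 1,000 for the other, the statistic is roughly 1.43, which is below 1.96: a two-point CTR gap at that sample size is not yet enough to declare a winner.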
Best Practices
1. Vector Quality and Maintenance
class VectorQualityManager {
final VektaDatabase db;
VectorQualityManager(this.db);
/// Validate stored vectors and repair any that fail.
/// Returns true only if every vector was already valid.
Future<bool> validateVectors() async {
var allValid = true;
final products = await db.products.find();
for (final product in products) {
// Regenerate vectors with the wrong dimensions or NaN/infinite values
if (product.contentVector.length != 128 ||
product.contentVector.any((v) => !v.isFinite)) {
allValid = false;
await _regenerateVector(product);
continue; // The magnitude check below would run against the stale vector
}
// Check vector magnitude
final magnitude = _calculateMagnitude(product.contentVector);
if (magnitude < 0.1 || magnitude > 10.0) {
allValid = false;
await _normalizeVector(product);
}
}
return allValid;
}
/// Periodically update vectors based on user interactions
Future<void> updateVectorsFromInteractions() async {
final products = await db.products
.where('metrics.viewCount', greaterThan: 100)
.find();
for (final product in products) {
final interactions = await db.interactions
.where('productId', product.id)
.where('timestamp', greaterThan: DateTime.now().subtract(Duration(days: 30)))
.find();
if (interactions.length > 10) {
// Update vector based on user behavior patterns
final behaviorVector = await _calculateBehaviorVector(interactions);
product.contentVector = _combineVectors(
product.contentVector,
behaviorVector,
weights: [0.7, 0.3],
);
await db.products.save(product);
}
}
}
}
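The helpers `_calculateMagnitude` and `_normalizeVector` are referenced above but not shown. A minimal sketch of L2 normalization follows, with hypothetical names `magnitude` and `normalize`. It returns `null` for a zero or non-finite vector, since dividing by a zero magnitude would produce NaN values; a vector in that state would need regenerating rather than rescaling.

```dart
import 'dart:math' as math;

/// Euclidean (L2) length of [v].
double magnitude(List<double> v) =>
    math.sqrt(v.fold<double>(0.0, (sum, x) => sum + x * x));

/// Returns a unit-length copy of [v], or null when [v] cannot be normalized
/// (zero magnitude, or a NaN/infinite component).
List<double>? normalize(List<double> v) {
  final m = magnitude(v);
  if (m == 0.0 || !m.isFinite) return null;
  return [for (final x in v) x / m];
}
```

Normalizing to unit length does not change cosine similarity, but it keeps magnitudes in a predictable range for checks such as the 0.1 to 10.0 bound used in `validateVectors`.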
2. Performance Optimization
class RecommendationOptimizer {
final RecommendationEngine engine;
/// Cache of recommendations keyed by user and context, plus the time each entry was stored
final Map<String, List<Recommendation>> _cache = {};
final Map<String, DateTime> _cachedAt = {};
final Duration _cacheExpiry = const Duration(minutes: 30);
RecommendationOptimizer(this.engine);
Future<List<Recommendation>> getCachedRecommendations(
String userId,
String context,
) async {
final cacheKey = '$userId:$context';
final cached = _cache[cacheKey];
final cachedAt = _cachedAt[cacheKey];
// Check the stored timestamp rather than cached.first, which throws on an empty list
if (cached != null &&
cachedAt != null &&
DateTime.now().difference(cachedAt) < _cacheExpiry) {
return cached;
}
// Generate fresh recommendations
final recommendations = await engine.generateRecommendations(
userId,
context: context,
);
_cache[cacheKey] = recommendations;
_cachedAt[cacheKey] = DateTime.now();
return recommendations;
}
/// Batch process recommendations for multiple users
Future<Map<String, List<Recommendation>>> batchGenerateRecommendations(
List<String> userIds,
) async {
final results = <String, List<Recommendation>>{};
// Process in batches to avoid memory issues
const batchSize = 50;
for (int i = 0; i < userIds.length; i += batchSize) {
final batch = userIds.skip(i).take(batchSize).toList();
final futures = batch.map((userId) async {
try {
final recommendations = await engine.generateRecommendations(userId);
return MapEntry(userId, recommendations);
} catch (e) {
print('Error generating recommendations for $userId: $e');
return MapEntry(userId, <Recommendation>[]);
}
});
final batchResults = await Future.wait(futures);
for (final entry in batchResults) {
results[entry.key] = entry.value;
}
// Small delay to prevent overwhelming the system
await Future.delayed(Duration(milliseconds: 100));
}
return results;
}
}
3. Quality Metrics and Monitoring
class RecommendationMetrics {
final VektaDatabase db;
RecommendationMetrics(this.db);
/// Calculate recommendation quality metrics
Future<QualityMetrics> calculateQualityMetrics(
String userId,
Duration period,
) async {
final endTime = DateTime.now();
final startTime = endTime.subtract(period);
final recommendations = await db.recommendations
.where('userId', userId)
.where('generatedAt', between: [startTime, endTime])
.find();
final interactions = await db.interactions
.where('userId', userId)
.where('timestamp', between: [startTime, endTime])
.find();
return QualityMetrics(
precision: await _calculatePrecision(recommendations, interactions),
recall: await _calculateRecall(recommendations, interactions),
diversity: _calculateDiversity(recommendations),
novelty: await _calculateNovelty(recommendations, userId),
coverage: await _calculateCoverage(recommendations),
serendipity: await _calculateSerendipity(recommendations, userId),
);
}
Future<double> _calculatePrecision(
List<Recommendation> recommendations,
List<Interaction> interactions,
) async {
final recommendedIds = recommendations.map((r) => r.productId).toSet();
final positiveInteractions = interactions
.where((i) => i.type == 'purchase' ||
(i.type == 'rating' && i.value >= 4.0))
.map((i) => i.productId)
.toSet();
final relevantRecommended = recommendedIds
.intersection(positiveInteractions)
.length;
return recommendedIds.isEmpty ? 0.0 : relevantRecommended / recommendedIds.length;
}
double _calculateDiversity(List<Recommendation> recommendations) {
if (recommendations.length < 2) return 0.0;
final products = recommendations.map((r) => r.productId).toList();
double totalDistance = 0.0;
int comparisons = 0;
for (int i = 0; i < products.length; i++) {
for (int j = i + 1; j < products.length; j++) {
// Calculate distance between products (simplified)
totalDistance += 1.0; // Placeholder - would use actual product similarity
comparisons++;
}
}
return comparisons > 0 ? totalDistance / comparisons : 0.0;
}
}
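`_calculateDiversity` above uses a placeholder distance of 1.0 for every pair, and `_calculateRecall` is not shown. The sketch below illustrates one possible way to fill both in: intra-list diversity as the mean pairwise cosine distance between product vectors, and recall as the share of relevant products that were recommended. The function names and the choice of cosine distance are assumptions for illustration.

```dart
import 'dart:math' as math;

/// Cosine distance (1 - cosine similarity); 1.0 if either vector is all zeros.
double cosineDistance(List<double> a, List<double> b) {
  var dot = 0.0;
  var normA = 0.0;
  var normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0.0 || normB == 0.0) return 1.0;
  return 1.0 - dot / (math.sqrt(normA) * math.sqrt(normB));
}

/// Mean pairwise cosine distance across the recommended products' vectors.
/// 0.0 means all items point the same way; higher values mean a more varied list.
double intraListDiversity(List<List<double>> vectors) {
  if (vectors.length < 2) return 0.0;
  var total = 0.0;
  var pairs = 0;
  for (var i = 0; i < vectors.length; i++) {
    for (var j = i + 1; j < vectors.length; j++) {
      total += cosineDistance(vectors[i], vectors[j]);
      pairs++;
    }
  }
  return total / pairs;
}

/// Fraction of relevant products that appear among the recommended ones.
double recall(Set<String> recommended, Set<String> relevant) {
  if (relevant.isEmpty) return 0.0;
  return recommended.intersection(relevant).length / relevant.length;
}
```

The pairwise loop is quadratic in list length, which is fine for a page of 10 to 20 recommendations but worth sampling for much longer lists.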
Advanced Topics
Context-Aware Recommendations
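class ContextAwareRecommendations {
final VektaDatabase db;
final RecommendationEngine engine;
ContextAwareRecommendations(this.db, this.engine);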
/// Generate recommendations based on current context
Future<List<Recommendation>> getContextualRecommendations(
String userId, {
required RecommendationContext context,
}) async {
final user = await db.users.findById(userId);
if (user == null) return [];
// Adjust recommendations based on context
switch (context.type) {
case ContextType.timeOfDay:
return _getTimeBasedRecommendations(user, context);
case ContextType.location:
return _getLocationBasedRecommendations(user, context);
case ContextType.device:
return _getDeviceBasedRecommendations(user, context);
case ContextType.social:
return _getSocialBasedRecommendations(user, context);
default:
return engine.generateRecommendations(userId);
}
}
Future<List<Recommendation>> _getTimeBasedRecommendations(
User user,
RecommendationContext context,
) async {
final hour = DateTime.now().hour;
String timeCategory;
if (hour >= 6 && hour < 12) {
timeCategory = 'morning';
} else if (hour >= 12 && hour < 18) {
timeCategory = 'afternoon';
} else if (hour >= 18 && hour < 22) {
timeCategory = 'evening';
} else {
timeCategory = 'night';
}
// Get products popular during this time
final timeBasedProducts = await db.products
.join('interactions')
.where('interactions.timestamp',
between: [_getTimeRangeStart(timeCategory), _getTimeRangeEnd(timeCategory)])
.groupBy('products.id')
.orderBy('COUNT(interactions.id)', descending: true)
.limit(20)
.find();
return _scoreProducts(timeBasedProducts, user, 'time_based');
}
}
Multi-Armed Bandit for Exploration
import 'dart:math'; // Provides Random
class MultiArmedBanditRecommender {
final Map<String, BanditArm> _arms = {};
final double _epsilon = 0.1; // Exploration rate
final Random _random = Random(); // Reused across calls instead of creating one per draw
/// Select recommendation algorithm using epsilon-greedy strategy
Future<String> selectAlgorithm(String userId) async {
if (_arms.isEmpty) {
await _initializeArms();
}
// Epsilon-greedy selection
if (_random.nextDouble() < _epsilon) {
// Explore: random selection
final algorithms = _arms.keys.toList();
return algorithms[_random.nextInt(algorithms.length)];
} else {
// Exploit: select best performing algorithm
return _arms.entries
.reduce((a, b) => a.value.averageReward > b.value.averageReward ? a : b)
.key;
}
}
/// Update algorithm performance based on user feedback
Future<void> updateReward(String algorithm, double reward) async {
final arm = _arms[algorithm];
if (arm != null) {
arm.totalReward += reward;
arm.pullCount++;
arm.averageReward = arm.totalReward / arm.pullCount;
// Store updated metrics
await _storeArmMetrics(algorithm, arm);
}
}
}
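To see the epsilon-greedy strategy without the database dependencies, the following self-contained sketch keeps the arms in memory. The `Arm` and `EpsilonGreedy` classes are hypothetical stand-ins for `BanditArm` and the recommender above, and the injectable `Random` is an assumption added so that behaviour can be made repeatable in tests.

```dart
import 'dart:math';

/// Running reward statistics for one algorithm.
class Arm {
  double totalReward = 0.0;
  int pullCount = 0;
  double get averageReward => pullCount == 0 ? 0.0 : totalReward / pullCount;
}

/// In-memory epsilon-greedy selector over a fixed set of named arms.
class EpsilonGreedy {
  EpsilonGreedy(Iterable<String> names, {this.epsilon = 0.1, Random? random})
      : _random = random ?? Random(),
        _arms = {for (final name in names) name: Arm()} {
    if (_arms.isEmpty) throw ArgumentError('At least one arm is required');
  }

  final double epsilon;
  final Random _random;
  final Map<String, Arm> _arms;

  /// With probability [epsilon] picks a random arm (explore);
  /// otherwise picks the arm with the highest average reward (exploit).
  String select() {
    final names = _arms.keys.toList();
    if (_random.nextDouble() < epsilon) {
      return names[_random.nextInt(names.length)];
    }
    return names.reduce((best, name) =>
        _arms[name]!.averageReward > _arms[best]!.averageReward ? name : best);
  }

  /// Records the reward observed after serving recommendations from [name].
  void update(String name, double reward) {
    final arm = _arms[name];
    if (arm == null) throw ArgumentError('Unknown arm: $name');
    arm.totalReward += reward;
    arm.pullCount++;
  }
}
```

A reward could be 1.0 for a click and 0.0 otherwise, so that each arm's average reward is its observed click-through rate. Setting `epsilon` to 0.0 disables exploration entirely, which is useful for checking the exploit path but risks locking onto an early leader in production.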
Summary
This chapter showed how to build a hybrid recommendation system on top of Vektagraf's vector search capabilities. Key takeaways include:
- Hybrid Approaches: Combine content-based, collaborative, and trending recommendations
- Real-Time Updates: Handle user interactions and update recommendations dynamically
- Quality Metrics: Measure precision, recall, diversity, and other quality indicators
- A/B Testing: Systematically test and improve recommendation algorithms
- Context Awareness: Adapt recommendations based on time, location, and other factors
- Performance Optimization: Cache results and batch process for scalability
The vector-first approach of Vektagraf makes it particularly well-suited for recommendation systems, as embeddings are treated as first-class object properties rather than separate entities.
Next Steps
- Chapter 20: Document and Content Management - Learn semantic search patterns
- Chapter 22: AI/ML Integration Patterns - Explore advanced ML integration
- Part VII: Reference documentation for complete API coverage
Related Resources
- Vector Search Documentation (Chapter 5)
- Graph Operations (Chapter 6)
- Performance Optimization (Chapter 7)
- Multi-Tenant Architecture (Chapter 11)