
Scaling Social Discovery: A Technical Guide to Building Friend-Driven Content Features

A detailed technical guide to building social discovery features (like Facebook's Friend Bubbles) that scale to billions of users, covering ML models, platform differences, and the finding that recency matters more than frequency.

Published 2026-05-18 18:25:48 • Ehedrick Staff

Overview

When Facebook introduced Friend Bubbles on Reels, the feature looked deceptively simple: it highlighted Reels that a user's friends had watched and reacted to. Under the hood, however, building a social discovery engine that serves billions of users required deep engineering across machine learning, distributed systems, and platform-specific optimizations. This tutorial walks through the core architectural decisions, the evolution of the recommendation model, and the unexpected breakthrough that made the feature truly click, all based on lessons shared by engineers on the Meta Tech Podcast.

[Figure: Scaling Social Discovery: A Technical Guide to Building Friend-Driven Content Features. Source: engineering.fb.com]

By the end of this guide, you'll understand how to design, scale, and fine-tune a social discovery feature that leverages friend interactions, handles cross-platform behavior differences, and maintains low latency at global scale.

Prerequisites

Technical background

Familiarity with recommendation systems (collaborative filtering, embedding models)
Experience with distributed data processing (e.g., Apache Spark, Flink)
Understanding of mobile platform constraints (iOS vs. Android)
Basic knowledge of A/B testing and online evaluation

Tools & infrastructure

ML framework: PyTorch or TensorFlow for model training
Feature store: e.g., Feast or custom key-value store
Real-time serving: inference infrastructure (e.g., TorchServe, TF Serving)
Monitoring: metrics pipeline for engagement and latency

Step-by-Step Instructions

1. Define the social signal

Friend Bubbles relies on explicit and implicit signals from a user's social graph. Start by collecting:

Explicit reactions: likes, comments, shares on Reels by friends.
Implicit engagement: watch time, replay count, completion rate.
Affinity weight: prioritize signals from close friends (e.g., based on interaction frequency).

Log each interaction as a raw event, then derive time-decayed aggregates from those events. Example event schema:

user_id, friend_id, reel_id, reaction_weight, timestamp
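As a rough illustration of how such events could be folded into a time-decayed aggregate, here is a minimal sketch. The class names, the per-(user, Reel) aggregation key, and the half-life parameter are assumptions made for this example, not details from the podcast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FriendSignal:
    """One row of the event schema above (timestamp in Unix seconds)."""
    user_id: int
    friend_id: int
    reel_id: int
    reaction_weight: float
    timestamp: float


@dataclass
class DecayedAggregate:
    value: float = 0.0
    last_update: float = 0.0


def decay(value: float, elapsed_s: float, half_life_s: float) -> float:
    """Halve `value` for every `half_life_s` seconds elapsed."""
    return value * 0.5 ** (max(elapsed_s, 0.0) / half_life_s)


def fold_signal(aggregates: dict, signal: FriendSignal, half_life_s: float) -> DecayedAggregate:
    """Fold one event into the running aggregate for (user_id, reel_id)."""
    key = (signal.user_id, signal.reel_id)
    agg = aggregates.setdefault(key, DecayedAggregate(0.0, signal.timestamp))
    if signal.timestamp >= agg.last_update:
        # Age the stored value up to the new event, then add the new weight.
        elapsed = signal.timestamp - agg.last_update
        agg.value = decay(agg.value, elapsed, half_life_s) + signal.reaction_weight
        agg.last_update = signal.timestamp
    else:
        # Late-arriving event: age the incoming weight instead.
        lateness = agg.last_update - signal.timestamp
        agg.value += decay(signal.reaction_weight, lateness, half_life_s)
    return agg
```

Storing only the decayed value and the time of the last update keeps each aggregate constant-size, so the raw event log does not need to be rescanned at serving time.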

2. Build the recommendation model

The ML model evolved from a simple collaborative filter to a multi-task learning architecture that predicts both engagement and social relevance. Key components:

User & Reel embeddings: trained via matrix factorization, enriched with social graph features.
Friend influence embedding: learn a representation of how a user's friends influence their content preference.
Loss function: combine engagement prediction (binary cross-entropy) and friend relevance score (ranking loss).

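Example PyTorch-style pseudocode for the training loop:

for batch in dataloader:
    user_emb = user_encoder(batch.user)
    reel_emb = reel_encoder(batch.reel)
    friend_influence = friend_aggregator(batch.friend_signals)
    # Per-example dot product of the friend-adjusted user vector and the Reel vector
    score = ((user_emb + friend_influence) * reel_emb).sum(dim=-1)
    loss = engagement_loss(score, batch.label) + ranking_loss(score, batch.friend_scores)
    # Standard update: clear old gradients, backpropagate, then step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()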
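The two loss terms can be made concrete with a small, framework-free sketch. The source only says "binary cross-entropy" and "ranking loss", so the pairwise hinge formulation, the margin, and the `ranking_weight` knob below are assumptions chosen for illustration.

```python
import math


def bce_with_logits(score: float, label: float) -> float:
    """Binary cross-entropy on a raw score (logit), in a numerically stable form."""
    return max(score, 0.0) - score * label + math.log1p(math.exp(-abs(score)))


def pairwise_hinge(scores, friend_scores, margin: float = 1.0) -> float:
    """Penalize pairs where the Reel with the higher friend relevance
    does not outscore the other by at least `margin`."""
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if friend_scores[i] > friend_scores[j]:
                total += max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


def combined_loss(scores, labels, friend_scores, ranking_weight: float = 1.0) -> float:
    """Mean engagement loss plus a weighted ranking term (weight is hypothetical)."""
    engagement = sum(bce_with_logits(s, y) for s, y in zip(scores, labels)) / len(scores)
    return engagement + ranking_weight * pairwise_hinge(scores, friend_scores)
```

In a real training job the same two terms would be computed on tensors so gradients can flow. The scalar version is only meant to show what each term rewards.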

3. Handle platform differences (iOS vs. Android)

One surprising discovery was that engagement patterns differ significantly between iOS and Android users. iOS users tended to interact more with Reels from close friends, while Android users showed broader social exploration. To account for this:

Platform-specific modeling: train a separate model for each OS, or share one model and pass the platform in as a side feature.
A/B test different thresholds: tune the friend-weighting parameter per platform.
Client-side logic: adjust bubble display position and animation based on OS-specific UI guidelines.
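One lightweight way to express the first two options is a per-platform configuration plus a one-hot side feature. The weights below are hypothetical placeholders that an A/B test would tune. They are not values reported by the engineers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformConfig:
    friend_weight: float  # share of the final score that comes from friend signals


# Hypothetical starting points; tune per platform with A/B tests.
PLATFORM_CONFIGS = {
    "ios": PlatformConfig(friend_weight=0.7),
    "android": PlatformConfig(friend_weight=0.5),
}
DEFAULT_CONFIG = PlatformConfig(friend_weight=0.6)
PLATFORMS = ("ios", "android")


def platform_one_hot(platform: str) -> list:
    """Side feature for a shared model: one slot per known platform."""
    return [1.0 if platform.lower() == p else 0.0 for p in PLATFORMS]


def blended_score(platform: str, friend_score: float, general_score: float) -> float:
    """Mix friend-driven and general relevance using the platform's weight."""
    cfg = PLATFORM_CONFIGS.get(platform.lower(), DEFAULT_CONFIG)
    return cfg.friend_weight * friend_score + (1.0 - cfg.friend_weight) * general_score
```

Keeping the weight in configuration rather than in model code means each platform's threshold can be changed by an experiment without retraining.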

4. Scale to billions of users

Real-time inference pipeline

Use a two-stage retrieval-reranking architecture:

Retrieval stage: precompute top-N candidate Reels for each user using approximate nearest neighbor search (e.g., FAISS) on friend-influenced embeddings. Update every few hours.
Reranking stage: serve a lightweight ML model (e.g., 1-2 layer neural net) at request time to personalize the final 10-20 bubbles.
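The shape of the two-stage flow can be sketched in a few lines. To stay self-contained, this example uses exact brute-force cosine similarity as a stand-in for an approximate nearest neighbor index such as FAISS, and the reranker is passed in as a plain scoring function. Both are simplifications assumed for illustration.

```python
import heapq
import math


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(user_vec, reel_vecs: dict, top_n: int) -> list:
    """Stage 1: pick the top-N Reel IDs by embedding similarity.
    Exact search here; a production system would use an ANN index."""
    return heapq.nlargest(top_n, reel_vecs, key=lambda rid: cosine(user_vec, reel_vecs[rid]))


def rerank(candidates: list, rerank_score, k: int) -> list:
    """Stage 2: reorder the small candidate set with a request-time scorer."""
    return sorted(candidates, key=rerank_score, reverse=True)[:k]


# Usage with toy data: retrieval narrows the pool, reranking picks the final bubbles.
reel_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
candidates = retrieve([1.0, 0.0], reel_vecs, top_n=2)
fresh_signal = {"a": 0.1, "c": 0.9}  # stand-in for the lightweight model's output
bubbles = rerank(candidates, lambda rid: fresh_signal[rid], k=1)
```

The split matters because the expensive search over the full catalog runs offline, while the request path only scores a handful of candidates.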

Data locality

Partition user embeddings by geographic region to reduce the memory footprint of each serving node. Use sharded feature stores, and replicate the most frequently accessed friend clusters across shards.
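A minimal sketch of region-aware placement follows. The shard naming scheme, the replica count, and the `is_hot` flag are all hypothetical. The only firm design point is the use of a stable hash, since Python's built-in `hash()` is salted per process for strings and would route the same key differently on different machines.

```python
import hashlib


def shard_index(key, num_shards: int) -> int:
    """Stable shard index derived from a SHA-256 digest of the key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_for(user_id, region: str, shards_per_region: int) -> str:
    """Route a user's embedding to one shard inside their home region."""
    return f"{region}-{shard_index(user_id, shards_per_region)}"


def replica_shards(cluster_key, region: str, shards_per_region: int,
                   is_hot: bool, hot_replicas: int = 3) -> list:
    """Cold clusters live on one shard; hot clusters are copied to neighbors."""
    primary = shard_index(cluster_key, shards_per_region)
    count = min(hot_replicas, shards_per_region) if is_hot else 1
    return [f"{region}-{(primary + i) % shards_per_region}" for i in range(count)]
```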

5. The breakthrough: recency over frequency

The engineers on the Meta Tech Podcast recounted that the feature finally clicked when they realized that the recency of a friend's interaction mattered more than its frequency. A friend who watched a Reel 10 minutes ago had much higher influence than one who watched it a week ago, even if the second friend was more active overall. This led to:

Introducing a time-decay factor with an exponential half-life of 1 hour.
Adding a momentum feature: if multiple friends watch the same Reel within a short window, boost it significantly.
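To see why recency dominates under a 1-hour half-life, consider the sketch below. The half-life comes from the text above. The 15-minute momentum window and the size of the boost are assumptions picked for illustration.

```python
HALF_LIFE_S = 3600.0  # 1-hour half-life, as described above


def recency_weight(age_s: float, half_life_s: float = HALF_LIFE_S) -> float:
    """Influence of a single friend watch that happened `age_s` seconds ago."""
    return 0.5 ** (max(age_s, 0.0) / half_life_s)


def friend_score(watch_ages_s, momentum_window_s: float = 900.0,
                 momentum_boost: float = 0.5) -> float:
    """Sum of decayed watches, boosted when several fall in a short window.
    The window and boost values are hypothetical."""
    base = sum(recency_weight(age) for age in watch_ages_s)
    recent = sum(1 for age in watch_ages_s if age <= momentum_window_s)
    multiplier = 1.0 + momentum_boost * max(recent - 1, 0)
    return base * multiplier


ten_minutes, one_week = 600.0, 7 * 24 * 3600.0
one_recent_watch = friend_score([ten_minutes])
five_old_watches = friend_score([one_week] * 5)
```

With these numbers, a single watch from 10 minutes ago keeps about 89% of its weight, while a week-old watch has been halved 168 times and is effectively zero, so no realistic amount of stale activity can outweigh one fresh signal.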

6. Evaluation and monitoring

Track both online metrics (engagement rate, dwell time, friend bubble click-through) and offline metrics (NDCG, recall@k). Use a staged rollout and keep monitoring as exposure grows:

Shadow test → canary → 1% → 10% → global.
Monitor for filter bubbles: ensure variety of recommended Reels even when friends are highly active.
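For the offline side, recall@k and NDCG can be computed with a few lines of standard-library Python. This sketch uses the linear-gain form of DCG, and the function names and input shapes are choices made for this example.

```python
import math


def recall_at_k(ranked: list, relevant: set, k: int) -> float:
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def dcg(gains: list) -> float:
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranked: list, gains: dict, k: int) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    actual = dcg([gains.get(item, 0.0) for item in ranked[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

Here `gains` maps each Reel ID to a graded relevance label (for example, derived from logged engagement), and any Reel missing from the map is treated as irrelevant.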

Common Mistakes

Over-reliance on friend frequency

Assuming that the most active friends always provide the best signals can degrade diversity. Use surprise-based scoring instead (e.g., TF-IDF-style weighting that down-weights friends who interact with almost everything).
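One possible reading of surprise-based scoring is to treat each friend like a term and give them an IDF-style weight, so an interaction from a selective friend counts for more than one from a friend who engages with everything. The smoothing formula and data layout here are assumptions for this sketch.

```python
import math


def friend_idf(friend_reels: dict, total_reels: int) -> dict:
    """Smoothed inverse frequency per friend: the more Reels a friend
    interacts with, the less each individual interaction is worth."""
    return {
        friend: math.log((1 + total_reels) / (1 + len(reels))) + 1.0
        for friend, reels in friend_reels.items()
    }


def reel_score(reel_id, friend_reels: dict, idf: dict) -> float:
    """Sum the weights of every friend who interacted with this Reel."""
    return sum(idf[f] for f, reels in friend_reels.items() if reel_id in reels)


# "busy" touched every Reel; "picky" touched only one.
friend_reels = {"busy": {1, 2, 3, 4}, "picky": {4}}
idf = friend_idf(friend_reels, total_reels=4)
```

In this toy data, Reel 4 scores higher than Reels 1 to 3 mainly because the selective friend engaged with it, not because of raw interaction counts.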

Ignoring cold start for new friends

When a user adds a new friend, the signals available for that friend (recent watch history, affinity weight) may be sparse. Fall back to community-level trends in the same interest cluster until enough data accumulates.
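A simple way to implement this fallback is to shrink the friend-level score toward the cluster-level trend, trusting the friend signal more as evidence accumulates. The `prior_strength` parameter is a hypothetical tuning knob, not a value from the source.

```python
def blended_friend_score(friend_score: float, cluster_trend_score: float,
                         n_interactions: int, prior_strength: float = 5.0) -> float:
    """Weight the friend signal by how much history backs it up.
    With zero interactions the result is the cluster trend alone."""
    trust = n_interactions / (n_interactions + prior_strength)
    return trust * friend_score + (1.0 - trust) * cluster_trend_score
```

With the default setting, a brand-new friend contributes nothing of their own, a friend with 5 observed interactions is weighted equally with the cluster trend, and the friend signal takes over as the count grows.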

Platform uniformity

Treating iOS and Android identically can lead to suboptimal engagement. Always test platform-specific models even if the core algorithm is shared.

Latency vs. freshness tradeoff

Applying every update synchronously on the request path can drive up latency. Set a staleness budget (e.g., a maximum delay of 5 minutes) and merge incremental updates asynchronously.
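The staleness budget can be enforced with a small cache that always answers immediately and reports whether the entry is still within budget, leaving the caller to schedule an asynchronous refresh. The class and method names are hypothetical, and the injectable clock exists only to make the behavior easy to test.

```python
import time


class StalenessBudgetCache:
    """Serve cached values instantly and flag entries that exceed the budget."""

    def __init__(self, max_staleness_s: float = 300.0, clock=time.monotonic):
        self._max_staleness_s = max_staleness_s
        self._clock = clock
        self._entries = {}

    def put(self, key, value) -> None:
        self._entries[key] = (value, self._clock())

    def get(self, key):
        """Return (value, is_fresh); (None, False) if the key is missing."""
        entry = self._entries.get(key)
        if entry is None:
            return None, False
        value, written_at = entry
        is_fresh = (self._clock() - written_at) <= self._max_staleness_s
        return value, is_fresh
```

A request handler would serve `value` right away and, when `is_fresh` is false, enqueue a background refresh rather than blocking the response on recomputation.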

Summary

Building a social discovery feature that scales to billions demands careful orchestration of ML modeling, platform-aware tuning, and near-real-time infrastructure. The key takeaways are: favor recency (time-decayed signals) over raw frequency, account for iOS/Android behavioral differences, and use a retrieval-reranking pattern to balance accuracy and speed. By following the steps outlined here (starting with signal definition, progressing through model architecture, and validating with staged rollouts), you can implement a Friend Bubbles-like feature that feels both personal and performant at massive scale.