How I took Pathao Food's homepage from a one-size-fits-all carousel to a personalised recommendation engine that surfaces the right restaurant for each user - and how iterating from V1 to V2 made it measurably better across every metric that mattered.
Pathao Food's homepage had a section called "Picks For You" - a curated carousel of recommended restaurants displayed prominently when users opened the app. On paper, it was prime real estate: the first thing a hungry user sees when they open the app at lunchtime.
In practice, it was doing something far less interesting. The restaurants shown were essentially the same for everyone - popular restaurants in the user's delivery zone, maybe with some light recency weighting, but nothing that felt personal. For a returning user who orders biryani three times a week, seeing a burger joint they've never touched ranked above their regular spot wasn't discovery - it was friction.
The data reflected this. Homepage conversion was lower than it should be for a feature occupying the most valuable vertical space on the screen. Users were scrolling past the carousel to search manually - which meant the carousel wasn't doing its job. This is what made Picks For You a P0 initiative.
The brief I gave myself: Make "Picks For You" earn its name. If the collection says "for you," the restaurants in it should actually reflect something true about the person looking at it.
When I dug into the problem, what looked like one issue was actually three separate failure modes stacked on top of each other. Solving only one of them would leave the other two dragging on results.
I identified two distinct user modes that the recommendation engine needed to handle well - not two different user segments, but two different states the same user could be in on any given day.
The critical insight from framing it this way: both states require the same underlying infrastructure - a rich user taste profile and a restaurant similarity model - they just apply it differently. The Regular needs profile-to-familiar-restaurant matching; the Explorer needs profile-to-lookalike-restaurant matching across zones. V1 solved the first problem. V2 solved both.
Before writing a single requirement, I made four strategic calls that defined the shape of the solution. Each one had a meaningful trade-off and I want to be transparent about why I landed where I did.
V1 was the ground-up build. Before V1, there was no taste profile infrastructure, no user-user similarity model, and no restaurant diversity logic. Everything in this version was built from scratch.
I defined four input signals for the recommendation engine, ordered by the specificity of personal information they carry:
| Signal | What it captures | Why it matters |
|---|---|---|
| User Order History | Cuisine types, order frequency, basket size, meal timing, delivery zones | The highest-fidelity signal - revealed preference, not stated preference |
| Lookalike User Pool | Restaurants ordered by users with similar behavioural vectors | Extends recommendations beyond the user's own history; helps cold-start |
| Taste Diversity Score | How varied or repetitive the user's recent orders have been | High repetition = show more familiar; high variety = increase discovery weight |
| Restaurant Metadata | Cuisine, pricing tier, delivery time, MPF tier, rating | Enables budget and preference filtering; feeds the similarity model |
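One way to make the Taste Diversity Score concrete is sketched below. It uses a Simpson-style index (one minus the sum of squared cuisine shares) over a user's recent orders. The formula and the function name `taste_diversity_score` are assumptions for illustration, not the definition used in the PRD.

```python
from collections import Counter


def taste_diversity_score(recent_cuisines):
    """Return a score in [0, 1): 0.0 means every recent order was the same cuisine.

    Hypothetical formula: 1 - sum(share^2) over cuisine shares (Simpson-style index).
    """
    counts = Counter(recent_cuisines)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no history: treat as no measurable variety
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# A user who only orders biryani scores 0.0; an even split across two cuisines scores 0.5.
repetitive = taste_diversity_score(["biryani"] * 6)
varied = taste_diversity_score(["biryani", "pizza", "thai", "burger"])
```

A low score would push the carousel towards familiar restaurants and a high score towards discovery, as described in the table.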
Having a ranked list of personalised restaurants is a necessary but not sufficient condition for a good carousel. I wrote four explicit diversity constraints that the algorithm had to respect:
After diversification filtering, the final ranking combined five signals into a composite score:
Ranking factors (in priority order): Proximity → Cuisine match to taste profile → Restaurant rating → Frequency of occurrence in lookalike pool → Location match with user's delivery area. A restaurant scoring high on all five would rank at the top; the diversity rules ensured no single cuisine dominated even when multiple high-scorers shared a cuisine type.
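The priority order above can be sketched as a weighted sum in which earlier factors carry larger weights. The weight values, the signal names, and the assumption that every signal is already normalised to the 0-1 range are all hypothetical; the source only states the order of the factors.

```python
# Hypothetical weights, descending in the priority order given in the text.
WEIGHTS = {
    "proximity": 0.30,
    "cuisine_match": 0.25,
    "rating": 0.20,
    "lookalike_frequency": 0.15,
    "zone_match": 0.10,
}


def composite_score(signals):
    """Weighted sum of the five signals; a missing signal counts as 0.0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


def rank(candidates):
    """Sort candidate restaurants by composite score, highest first."""
    return sorted(candidates, key=lambda c: composite_score(c["signals"]), reverse=True)


candidates = [
    {"id": "near_but_generic", "signals": {"proximity": 1.0}},
    {"id": "good_match", "signals": {"cuisine_match": 1.0, "rating": 1.0}},
]
ranked_ids = [c["id"] for c in rank(candidates)]
```

A weighted sum is only one reading of "priority order"; a strict lexicographic sort on the five factors would be another.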
I defined explicit fallback behaviour for the cases where personalised data wasn't available - new users with thin history, areas with insufficient order data, or DS pipeline failures. In all cases the carousel degraded silently to trending/popular restaurants in the user's zone, ranked purely by popularity. I specified this not as an afterthought but as a first-class requirement - a feature that fails gracefully is more trustworthy than one that breaks visibly.
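A minimal sketch of that fallback path, assuming two injected callables (`personalised_ranker` and `popular_in_zone`, both hypothetical names) and a thin-history threshold of 5 orders borrowed from the "fewer than 5 orders" wording in US.4. Returning the source alongside the picks is an assumption that makes the fallback trigger rate easy to log.

```python
import logging

logger = logging.getLogger("picks_for_you")

MIN_ORDERS_FOR_PERSONALISATION = 5  # assumption, taken from the US.4 wording


def build_carousel(user, zone, personalised_ranker, popular_in_zone):
    """Return (restaurants, source), where source is 'personalised' or 'fallback'."""
    if user["order_count"] >= MIN_ORDERS_FOR_PERSONALISATION:
        try:
            picks = personalised_ranker(user, zone)
            if picks:  # an empty result is treated like missing data
                return picks, "personalised"
        except Exception as exc:
            # Pipeline failure: log it, but never surface an error state to the user.
            logger.warning("personalised ranking failed: %s", exc)
    return popular_in_zone(zone), "fallback"
```

The caller renders the returned list either way, so the user never sees a difference beyond the content itself.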
V1 solved the core personalisation problem within familiar zones. But V1's data revealed two remaining gaps: popular restaurants within the collection still appeared in a generic order rather than a personalised one, and users who browsed from a new area saw the personalisation collapse. V2 addressed both.
In V1, the "popular in zone" pool that fed the collection was sorted by aggregate popularity - the same ranking every user saw. User A, who loves biryani, and User B, who loves pizza, would see the same popular restaurants in the same order within the personalised results. Personalisation decided which restaurants were included, but not the order in which they appeared.
When a user opened the app in a delivery zone with no order history, V1 had nothing to work with. The system fell back to generic popularity for that zone - showing the user restaurants that had no connection to their demonstrated preferences. Personalisation was essentially absent for the "Explorer" use case.
The V2 acceptance criteria I wrote: (1) Popular restaurants in Picks For You sorted using personalised ranking. (2) New-area browsing shows cuisine/type lookalike recommendations from the user's other zones. (3) Recommendations adapt dynamically when area changes. (4) Fallback to top-rated nearby restaurants when no cross-area match exists. All four had to pass for V2 to ship.
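Criteria (2) and (4) can be illustrated with a small sketch: build a cuisine profile from orders placed in the user's other zones, rank the new zone's restaurants against it, and fall back to top-rated restaurants when nothing matches. Matching on cuisine share alone is a simplifying assumption; the field names and function names are hypothetical.

```python
from collections import Counter


def cuisine_profile(order_history, exclude_zone):
    """Share of orders per cuisine, using only orders from the user's other zones."""
    counts = Counter(o["cuisine"] for o in order_history if o["zone"] != exclude_zone)
    total = sum(counts.values())
    return {cuisine: n / total for cuisine, n in counts.items()} if total else {}


def recommend_in_new_zone(order_history, zone, zone_restaurants, limit=10):
    profile = cuisine_profile(order_history, exclude_zone=zone)
    matches = [r for r in zone_restaurants if r["cuisine"] in profile]
    if not matches:
        # Criterion (4): no cross-area match, so fall back to top-rated nearby.
        return sorted(zone_restaurants, key=lambda r: r["rating"], reverse=True)[:limit]
    # Criterion (2): strongest cuisine affinity first, rating as the tie-breaker.
    return sorted(
        matches, key=lambda r: (profile[r["cuisine"]], r["rating"]), reverse=True
    )[:limit]


history = [{"zone": "A", "cuisine": "biryani"}] * 3 + [{"zone": "A", "cuisine": "pizza"}]
zone_b = [
    {"id": "burger_1", "cuisine": "burger", "rating": 4.9},
    {"id": "biryani_1", "cuisine": "biryani", "rating": 4.2},
    {"id": "pizza_1", "cuisine": "pizza", "rating": 4.5},
]
picks = [r["id"] for r in recommend_in_new_zone(history, "B", zone_b)]
```

Because the profile is recomputed per zone, calling the function again after an area change covers criterion (3) in this simplified form.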
Personalisation quality is only part of the conversion equation. Two external factors - where the carousel sits on the homepage and whether it auto-scrolls - could independently affect whether users even engaged with the personalised results. I designed an A/B testing framework to measure both.
Testing whether "Picks For You" performs better in the 1st carousel slot versus the 3rd slot on the food homepage. Hypothesis: personalised content earns first position; placing it there lifts CTR by reducing the scroll required to reach the most relevant recommendation. Multiple cohorts with different position assignments, all controlled through the A/B framework.
Testing auto-scroll ON versus OFF within the collection. The risk with auto-scroll is that it skips the highest-ranked, most personalised restaurants before the user sees them. When auto-scroll is OFF, users immediately see the top-ranked options - exactly the ones with the highest similarity score. The hypothesis was that removing auto-scroll would increase engagement with high-similarity restaurants.
Why this mattered: A personalised recommendation engine that's buried on the homepage or auto-scrolled past before users notice it doesn't convert. These experiments were about making sure the engineering work we put into personalisation could actually be measured. Good personalisation in a bad position gives you noisy data and gives real users a worse experience.
I specified the A/B framework requirements in the same PRD as the recommendation engine itself - because you can't retrofit testability. The framework needed to support:
Multiple cohorts simultaneously receiving different carousel positions (1st, 3rd, etc.), with consistent assignment across sessions for the same user.
Server-side toggle for auto-scroll ON/OFF, configurable per cohort without requiring an app release. Essential for running the experiment at speed.
Tracking CTR and conversion attributable specifically to Picks For You interactions, isolated from other homepage carousels to avoid contamination.
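The first two requirements can be sketched with deterministic hash-based bucketing: hashing the user ID gives the same cohort on every session, and the cohort's settings live in server-side config rather than in the app. The cohort names, the experiment key, and the choice to cross position with auto-scroll in a single 2x2 grid are assumptions; the source describes the two tests separately.

```python
import hashlib

# Hypothetical server-side config: editable without an app release.
COHORTS = {
    "slot1_autoscroll_off": {"position": 1, "auto_scroll": False},
    "slot1_autoscroll_on": {"position": 1, "auto_scroll": True},
    "slot3_autoscroll_off": {"position": 3, "auto_scroll": False},
    "slot3_autoscroll_on": {"position": 3, "auto_scroll": True},
}


def assign_cohort(user_id, experiment="pfy_layout"):
    """Map a user to a cohort deterministically, so assignment is stable across sessions."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    names = sorted(COHORTS)  # fixed order so the mapping does not depend on dict order
    return names[int(digest, 16) % len(names)]


def carousel_config(user_id):
    """Settings the homepage would apply for this user's Picks For You carousel."""
    return COHORTS[assign_cohort(user_id)]
```

Salting the hash with the experiment key keeps cohort membership in this test independent of any other experiment that buckets on the same user ID.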
North Star: Uplift in Picks For You collection CTR - the single number that tells you whether personalisation is earning user attention on the homepage. Everything else either explains why CTR moved, or guards against gaming it.
| Layer | Metric | Why it's on the dashboard |
|---|---|---|
| Engagement | CTR on Picks For You collection | Primary signal - did the recommendation earn a tap? If a user sees the carousel and scrolls past, the recommendation failed. |
| Engagement | Zero-click homepage sessions on Picks For You | Diagnostic metric. A high zero-click rate signals a systematic recommendation quality problem - not variance. Tracking the reduction tells you personalisation is working, not just that CTR improved for an unrelated reason. |
| Conversion | Order conversion rate from Picks For You tap | Closes the loop from carousel tap to placed order. CTR without conversion means we're surfacing restaurants users find interesting but don't actually want to order from - clicks without revenue. |
| Conversion | Repeat order rate on recommended restaurants | Measures whether recommendations surface restaurants users come back to independently - a signal that the recommendation matched genuine preference, not just triggered a one-off curiosity click. |
| Discovery | % of orders from restaurants never previously ordered from | Measures whether the 40% "new but aligned" discovery slot is actually working. If this stays flat, the diversity rules aren't generating genuine new discovery - they're just cycling through the same pool. |
| Discovery | Cuisine diversity index per user (V2) | Tracks whether users are ordering from a broader range of cuisines over time after personalisation launches. A narrowing index would be an early filter-bubble warning signal. |
| System | Recommendation freshness - % of sessions served from updated model | Confirms the DS pipeline is refreshing taste profiles and similarity models on schedule. Stale models produce increasingly irrelevant recommendations over time without any visible error. |
| System | Fallback trigger rate | The % of sessions where personalisation data wasn't available and the system fell back to popular-in-zone. A rising fallback rate signals a pipeline problem before users report degraded quality. |
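Several of the dashboard metrics reduce to simple ratios over session records. The sketch below assumes one record per homepage session with hypothetical boolean fields (`carousel_seen`, `tapped`, `ordered`, `fallback`); the real event schema is not given in the source.

```python
def carousel_metrics(sessions):
    """Compute CTR, zero-click rate, tap-to-order conversion and fallback rate."""
    shown = [s for s in sessions if s["carousel_seen"]]
    if not shown:
        return {"ctr": 0.0, "zero_click_rate": 0.0, "tap_to_order": 0.0, "fallback_rate": 0.0}
    taps = sum(1 for s in shown if s["tapped"])
    orders = sum(1 for s in shown if s["tapped"] and s["ordered"])
    fallbacks = sum(1 for s in shown if s["fallback"])
    return {
        "ctr": taps / len(shown),
        "zero_click_rate": (len(shown) - taps) / len(shown),
        "tap_to_order": orders / taps if taps else 0.0,
        "fallback_rate": fallbacks / len(shown),
    }


sessions = [
    {"carousel_seen": True, "tapped": True, "ordered": True, "fallback": False},
    {"carousel_seen": True, "tapped": True, "ordered": False, "fallback": False},
    {"carousel_seen": True, "tapped": False, "ordered": False, "fallback": True},
    {"carousel_seen": True, "tapped": False, "ordered": False, "fallback": False},
    {"carousel_seen": False, "tapped": False, "ordered": False, "fallback": False},
]
metrics = carousel_metrics(sessions)
```

Sessions where the carousel was never seen are excluded from every denominator, which keeps the numbers attributable to Picks For You rather than to the homepage as a whole.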
Time spent on homepage. A feature that improves recommendation relevance should reduce browsing time, not increase it. If Picks For You is working, users find what they want faster and place orders sooner. Optimising for session duration would mean optimising for friction.
Aggregate homepage impressions. Impressions go up when quality goes down - a confusing carousel generates more scroll behaviour, which inflates impression counts. Tracking impressions as a success signal would reward a worse product.
A/B experiment metrics (separate layer): The position test tracked CTR differential between cohorts (1st vs. 3rd carousel slot). The auto-scroll test tracked tap rate on the top-3 ranked restaurants - the hypothesis being that turning off auto-scroll would increase engagement with the highest-similarity results, since users would see them immediately rather than having them scroll past. Both were instrumented independently of the core recommendation metrics to avoid signal contamination.
The actual requirement artefacts behind the product - user stories, acceptance criteria, and the algorithm decision flow as written in the PRD.
| ID | As a… | I want… | So that… |
|---|---|---|---|
| US.1 | Returning user with order history | the Picks For You carousel to surface restaurants that match my cuisine preferences and budget | I spend less time scrolling and find something to order faster |
| US.2 | User visiting a new delivery zone | to see restaurant recommendations that reflect my preferences from other areas - not a generic popular list | personalisation follows me across the app, even in zones where I have no order history |
| US.3 | User who orders the same 3–4 restaurants repeatedly | to be shown at least some restaurants I haven't tried before - but ones that fit my taste profile | I can discover new options without feeling like the app is showing me random noise |
| US.4 | New user with fewer than 5 orders | to see a sensible carousel even before the system knows my preferences | the app doesn't feel broken or irrelevant during my first few sessions |
| # | Criteria | Owner | Verified by |
|---|---|---|---|
| AC.1 | Picks For You must use the user's order history (cuisine, basket size, delivery zone, meal time) as primary ranking signal - not aggregate popularity | DSE | QA: two users with different taste profiles must receive different carousel outputs in the same zone |
| AC.2 | No more than 3 restaurants from the same cuisine type may appear consecutively in the output | DSE | QA: run output for a user with a single dominant cuisine - verify cuisine cap applies |
| AC.3 | At least 40% of recommended restaurants must be restaurants the user has not previously ordered from but that match their taste profile | DSE | Analytics: discovery % metric tracked per session from day one |
| AC.4 | If no personalised data is available (new user, pipeline failure), carousel must silently fall back to trending/popular-in-zone - no error state shown to user | BE | QA: simulate pipeline timeout - verify fallback content renders within SLA |
| AC.5 | The same restaurant must not appear in more than one homepage carousel at a time (de-duplication pass required) | BE | QA: verify no duplicate restaurant IDs across all carousels rendered on the same homepage load |
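AC.2 and AC.5 lend themselves to a short sketch. The greedy re-rank below is one possible way to enforce the consecutive-cuisine cap, and the de-duplication pass assumes that carousels are processed in display order so the earlier carousel keeps a shared restaurant. Both choices, and the function names, are assumptions rather than the PRD's specified behaviour.

```python
def enforce_cuisine_cap(ranked, max_run=3):
    """Re-order a ranked list so no cuisine appears more than max_run times in a row."""
    remaining = list(ranked)
    output = []
    while remaining:
        tail = [r["cuisine"] for r in output[-max_run:]]
        run_is_full = len(tail) == max_run and len(set(tail)) == 1
        pick = 0
        if run_is_full:
            # Promote the best-ranked restaurant of a different cuisine.
            pick = next(
                (i for i, r in enumerate(remaining) if r["cuisine"] != tail[0]), None
            )
            if pick is None:
                break  # only the capped cuisine is left: stop rather than violate the cap
        output.append(remaining.pop(pick))
    return output


def dedupe_carousels(carousels):
    """Keep each restaurant ID only in the first carousel (in display order) that lists it."""
    seen = set()
    result = {}
    for name, restaurant_ids in carousels:
        kept = []
        for rid in restaurant_ids:
            if rid not in seen:
                seen.add(rid)
                kept.append(rid)
        result[name] = kept
    return result


ranked = [{"id": f"b{i}", "cuisine": "biryani"} for i in range(1, 6)]
ranked.append({"id": "p1", "cuisine": "pizza"})
capped_ids = [r["id"] for r in enforce_cuisine_cap(ranked)]
```

With five biryani restaurants ranked ahead of one pizza place, the pizza place is promoted to fourth position so the run of biryani stops at three.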
How the system decides what goes into Picks For You for a given user session.