Personalisation · ML · Recommendation · Growth · Pathao Food · 2025

Picks For You
Restaurant Personalisation Engine

How I took Pathao Food's homepage from a one-size-fits-all carousel to a personalised recommendation engine that surfaces the right restaurant for each user - and how iterating from V1 to V2 made it measurably better across every metric that mattered.

CTR
Picks For You collection click-through rate post-launch
CVR
Order conversion from carousel tap - clicks turning into placed orders
0-click
Homepage sessions where Picks For You received zero engagement
Discovery
Orders placed at restaurants users had never previously ordered from
01 — Background

The homepage carousel that wasn't earning its position.

Pathao Food's homepage had a section called "Picks For You" - a curated carousel of recommended restaurants displayed prominently when users opened the app. On paper, it was prime real estate: the first thing a hungry user sees when they open the app at lunchtime.

In practice, it was doing something far less interesting. The restaurants shown were essentially the same for everyone - popular restaurants in the user's delivery zone, maybe with some light recency weighting, but nothing that felt personal. For a returning user who orders biryani three times a week, seeing a burger joint they've never touched ranked above their regular spot wasn't discovery - it was friction.

The data reflected this. Homepage conversion was lower than it should have been for a feature occupying the most valuable vertical space on the screen. Users were scrolling past the carousel to search manually - which meant the carousel wasn't doing its job. This is what made Picks For You a P0 initiative.

The brief I gave myself: Make "Picks For You" earn its name. If the collection says "for you," the restaurants in it should actually reflect something true about the person looking at it.


02 — Problem

Three failure modes, not one.

When I dug into the problem, what looked like one issue was actually three separate failure modes stacked on top of each other. Solving only one of them would leave the other two dragging on results.

01
Generic ranking with no personal signal
The most fundamental issue. Picks For You was essentially a popular-in-zone list. Every user in Banani at noon saw the same restaurants in the same order. There was no cuisine preference weighting, no order history influence, no budget consideration. A user who exclusively orders rice-based meals would see pizza restaurants ranked above their regular spots simply because those restaurants had more aggregate orders in the zone.
02
No discovery mechanism - just repetition
Even when users did find restaurants they liked, the carousel tended to resurface the same options repeatedly. Choice fatigue was building. If you're seeing the same 8 restaurants every time you open the app, you stop looking at the carousel. The feature needed to balance familiarity (restaurants the user has ordered from before) with genuine discovery (new restaurants they'd likely enjoy based on their taste profile).
03
Area-switching broke personalisation entirely
This was the subtler issue, exposed as we looked more carefully at the data. When a user opened the app in a new delivery zone - visiting a different neighbourhood, or travelling to a different city - the personalisation completely collapsed. The system had no order history for that zone, so it fell back to generic popularity ranking, showing the user restaurants that bore no connection to what they'd demonstrated they actually like. A user who loves biryani in Dhaka visiting Chittagong would see completely unrelated restaurants.

03 — Users

Two types of users the feature had to serve simultaneously.

I identified two distinct user modes that the recommendation engine needed to handle well - not two different user segments, but two different states the same user could be in on any given day.

AK
The Regular
Ordering from a familiar zone, familiar cuisine
Akash orders biryani or kebabs from Banani 4–5 times a week, mostly at lunch. He's not looking for discovery - he wants speed and confidence. The right recommendation for him surfaces his go-to spots at the top, immediately. He'll skip the carousel if it shows him things unrelated to his taste. For this user, personalisation means familiar favourites, ranked right.
SR
The Explorer
Ordering from a new zone, or bored with their usual options
Same user, different day. Akash is visiting Uttara for a meeting, or he's just tired of his usual spots. He wants the app to make an intelligent suggestion - not popularity-driven noise, but something that fits his known preferences even in unfamiliar territory. For this user, personalisation means: "I know you like biryani in Banani - here's the equivalent in Uttara."

The critical insight from framing it this way: both states require the same underlying infrastructure - a rich user taste profile and a restaurant similarity model - they just apply it differently. The Regular needs profile-to-familiar-restaurant matching; the Explorer needs profile-to-lookalike-restaurant matching across zones. V1 solved the first problem. V2 solved both.


04 — My Approach

Four decisions that shaped the entire product.

Before writing a single requirement, I made four strategic calls that defined the shape of the solution. Each one had a meaningful trade-off and I want to be transparent about why I landed where I did.

01
Build a taste profile, not just an order history
The simplest version of personalisation would have been: "show them restaurants they've ordered from before." That's not personalisation - it's a browsing history. I pushed for a layer of abstraction: tag each user with cuisine preferences, budget range, order timing, and frequency patterns. This "taste profile" is what makes the system generalise - it's how you can recommend a restaurant the user has never tried but would probably love. Engineering pushback was real (it required a separate DS pipeline), but it was the right call.
02
Use lookalike users, not just similar restaurants
I wanted two complementary signals: user-user similarity (find users with similar behaviour and see what they've ordered) and restaurant-restaurant similarity (find restaurants similar to the user's past orders by cuisine, pricing, and dish profile). Many recommendation systems rely on only one of these; combining both gives a more robust signal. Restaurant similarity alone would over-index on familiar cuisine; user similarity alone would be too weak for new users, whose thin history makes lookalike matching unreliable.
03
Mandate diversity - don't just rank by predicted preference
Left to itself, a recommendation system that ranks purely by predicted match score will surface the same three or four dominant restaurants in every session. I wrote explicit diversity constraints into the spec: no more than three restaurants from the same cuisine in sequence, a 60/40 split between familiar and new, and budget spectrum coverage. This was a deliberate decision to cap relevance in the name of discovery quality - and it was the right trade-off.
04
Ship V1 clean, then iterate to V2 - not the other way around
Cross-area personalisation (the "Explorer" problem) was the harder engineering problem. I could have held the feature until it was solved, but that would have delayed the bigger, more impactful core personalisation by months. I shipped V1 with strong within-zone personalisation and a clear V2 commitment for cross-area extension. This gave us real-world signal from V1 to calibrate V2, and meant the core user experience improved on a faster timeline.

05 — V1 Design

Building the personalisation foundation - user signals, lookalikes, and diversity rules.

V1 was the ground-up build. Before V1, there was no taste profile infrastructure, no user-user similarity model, and no restaurant diversity logic. Everything in this version was built from scratch.

The signal stack

I defined four input signals for the recommendation engine, ordered by the specificity of personal information they carry:

Signal | What it captures | Why it matters
User Order History | Cuisine types, order frequency, basket size, meal timing, delivery zones | The highest-fidelity signal - revealed preference, not stated preference
Lookalike User Pool | Restaurants ordered by users with similar behavioural vectors | Extends recommendations beyond the user's own history; helps cold-start
Taste Diversity Score | How varied or repetitive the user's recent orders have been | High repetition = show more familiar; high variety = increase discovery weight
Restaurant Metadata | Cuisine, pricing tier, delivery time, MPF tier, rating | Enables budget and preference filtering; feeds the similarity model
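
The write-up does not say how the Taste Diversity Score is calculated. As a minimal sketch, one plausible definition is the normalised Shannon entropy of cuisines across a user's recent orders. The function name `taste_diversity_score` and the entropy formula are illustrative assumptions, not the production definition.

```python
import math
from collections import Counter


def taste_diversity_score(recent_cuisines: list[str]) -> float:
    """Normalised Shannon entropy of cuisines in recent orders, in [0, 1].

    0.0 means every recent order was the same cuisine (highly repetitive);
    1.0 means orders were spread evenly across the cuisines that appear.
    Illustrative assumption only: the real score may be defined differently.
    """
    counts = Counter(recent_cuisines)
    if len(counts) <= 1:
        return 0.0  # no orders, or a single cuisine: no variety to measure
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return entropy / math.log(len(counts))  # divide by the maximum possible entropy


# A repetitive user scores low; a varied user scores high.
print(taste_diversity_score(["biryani"] * 5))
print(taste_diversity_score(["biryani", "biryani", "biryani", "kebab"]))
print(taste_diversity_score(["biryani", "pizza", "thai", "burger"]))
```

A low score would push the carousel towards familiar restaurants and a high score towards discovery, matching the "why it matters" column above. One limitation of this particular formula is that it ignores how many distinct cuisines there are, so an even split across two cuisines scores the same as an even split across five.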

The diversification rules

Having a ranked list of personalised restaurants is a necessary but not sufficient condition for a good carousel. I wrote four explicit diversity constraints that the algorithm had to respect:

🍽️
Cuisine cap
No more than 3 restaurants from the same cuisine type appearing consecutively in the ranked output. Prevents the list from feeling monotonous.
⚖️
Familiar vs. new balance
60% of shown restaurants should be familiar (ordered before or from lookalike pool); 40% should be new but aligned with taste profile.
💰
Budget spectrum
Include both budget-friendly and premium options within the user's demonstrated price range. Avoid clustering entirely at one price point.
📍
Locality awareness
Prioritise restaurants from the user's home delivery zone. Proximity is a strong signal for conversion - a great recommendation too far away is useless.
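
To make the cuisine cap concrete, here is a minimal sketch of a greedy re-rank that enforces "no more than 3 consecutive restaurants of the same cuisine". The `Restaurant` type and `apply_cuisine_cap` function are hypothetical names for illustration; the production algorithm had to satisfy all four rules at once and is not reproduced here.

```python
from typing import NamedTuple, Optional


class Restaurant(NamedTuple):
    name: str
    cuisine: str


def apply_cuisine_cap(ranked: list[Restaurant], max_run: int = 3) -> list[Restaurant]:
    """Greedy re-rank so at most `max_run` consecutive items share a cuisine."""
    remaining = list(ranked)
    output: list[Restaurant] = []
    while remaining:
        tail = output[-max_run:]
        blocked: Optional[str] = None
        if len(tail) == max_run and len({r.cuisine for r in tail}) == 1:
            blocked = tail[0].cuisine  # one more of this cuisine would break the cap
        # Take the highest-ranked remaining restaurant that keeps the cap.
        # If none exists, take the next one anyway (the cap cannot be met).
        pick = next((i for i, r in enumerate(remaining) if r.cuisine != blocked), 0)
        output.append(remaining.pop(pick))
    return output


ranked = [
    Restaurant("B1", "biryani"),
    Restaurant("B2", "biryani"),
    Restaurant("B3", "biryani"),
    Restaurant("B4", "biryani"),
    Restaurant("P1", "pizza"),
    Restaurant("B5", "biryani"),
]
print([r.name for r in apply_cuisine_cap(ranked)])
# -> ['B1', 'B2', 'B3', 'P1', 'B4', 'B5']
```

The greedy approach preserves the original ranking wherever it can and only promotes a lower-ranked restaurant when the next pick would create a fourth consecutive entry of the same cuisine.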

The ranking algorithm

Before the diversity pass was applied, candidate restaurants were ranked by a composite score combining five signals:

Ranking factors (in priority order): Proximity → Cuisine match to taste profile → Restaurant rating → Frequency of occurrence in lookalike pool → Location match with user's delivery area. A restaurant scoring high on all five would rank at the top; the diversity rules ensured no single cuisine dominated even when multiple high-scorers shared a cuisine type.
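
The exact scoring formula is not given here. One way to read "composite score in priority order" is a weighted sum whose weights decrease down the priority list. The sketch below assumes that reading; the weights, the `Candidate` fields, and the 0-1 normalisation of each factor are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights, descending in the priority order listed above.
WEIGHTS = {
    "proximity": 0.30,
    "cuisine_match": 0.25,
    "rating": 0.20,
    "lookalike_frequency": 0.15,
    "area_match": 0.10,
}


@dataclass(frozen=True)
class Candidate:
    name: str
    proximity: float            # 1.0 = closest, 0.0 = edge of delivery radius
    cuisine_match: float        # fit between restaurant cuisine and taste profile
    rating: float               # restaurant rating rescaled to 0..1
    lookalike_frequency: float  # how often lookalike users ordered here, 0..1
    area_match: float           # 1.0 if inside the user's delivery area, else 0.0


def composite_score(c: Candidate) -> float:
    """Weighted sum of the five ranking factors (all assumed to be in 0..1)."""
    return sum(weight * getattr(c, factor) for factor, weight in WEIGHTS.items())


def rank(candidates: list[Candidate]) -> list[Candidate]:
    return sorted(candidates, key=composite_score, reverse=True)


candidates = [
    Candidate("Nearby Generic", 1.0, 0.2, 0.9, 0.1, 1.0),
    Candidate("Taste Match", 0.9, 0.9, 0.8, 0.7, 1.0),
]
print([c.name for c in rank(candidates)])
# -> ['Taste Match', 'Nearby Generic']
```

With these assumed weights, a slightly farther restaurant that fits the taste profile outranks the closest restaurant with a weak cuisine match. A strict lexicographic sort on the five factors would be another valid reading of "priority order".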

A concrete example

User: Akash - frequent biryani and kebab orders from Banani, lunch-heavy, mid-budget
User-user similarity: System identifies 40 users with similar behavioural vectors (rice-based meals, Banani zone, mid-basket). Pulls restaurants they frequently ordered.
Restaurant-restaurant similarity: Finds restaurants similar to Akash's past orders by cuisine and pricing - e.g. "Biryani Express," "Spice N Rice."
Merge and rank: Combines both lists, applies composite score. "The Kebab Factory" and "Dhaka Biryani House" rise to the top.
Diversity pass: Ensures the top 8 slots include at least 2 cuisine types and at least 2 restaurants Akash hasn't ordered from before.
Final output: Akash opens the app at 12:30 PM and sees a carousel led by familiar favourites, followed by discovery recommendations that actually fit his taste.

Fallback logic

I defined explicit fallback behaviour for the cases where personalised data wasn't available - new users with thin history, areas with insufficient order data, or DS pipeline failures. In all cases the carousel degraded silently to trending/popular restaurants in the user's zone, with no error state shown to the user. I specified this not as an afterthought but as a first-class requirement - a feature that fails gracefully is more trustworthy than one that breaks visibly.
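
The fallback behaviour can be sketched as a small wrapper around the personalised call. The 5-order threshold comes from the history check in the decision flow; the function name `picks_for_you`, the reason strings, and the callable interface are illustrative assumptions.

```python
from typing import Callable, Optional

MIN_ORDERS_FOR_PERSONALISATION = 5  # history-check threshold (>= 5 orders)


def picks_for_you(
    order_count: int,
    personalised: Callable[[], Optional[list[str]]],
    popular_in_zone: list[str],
) -> tuple[list[str], str]:
    """Return (restaurants, source). Never raises: every failure degrades silently."""
    if order_count < MIN_ORDERS_FOR_PERSONALISATION:
        return popular_in_zone, "fallback:thin_history"
    try:
        results = personalised()
    except Exception:
        # Pipeline timeout or error: the user sees popular-in-zone, not an error state.
        return popular_in_zone, "fallback:pipeline_error"
    if not results:
        return popular_in_zone, "fallback:no_data"
    return results, "personalised"


def broken_pipeline() -> list[str]:
    raise TimeoutError("simulated DS pipeline timeout")


popular = ["Popular A", "Popular B"]
print(picks_for_you(2, lambda: ["Biryani Express"], popular))
print(picks_for_you(12, broken_pipeline, popular))
print(picks_for_you(12, lambda: ["Biryani Express"], popular))
```

Returning the source alongside the list is a design choice for the sketch: logging it per session would make a fallback trigger rate straightforward to compute.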


06 — V2 Design

Making it work everywhere - personalised sorting and cross-area recommendations.

V1 solved the core personalisation problem within familiar zones. But V1's data revealed two remaining gaps: popular restaurants within the collection still appeared in a generic order rather than a personalised one, and users who browsed from a new area saw the personalisation collapse. V2 addressed both.

V1 Limitation
Popular restaurants ranked generically

In V1, the "popular in zone" pool that fed the collection was sorted by aggregate popularity - the same ranking every user saw. User A who loves biryani and User B who loves pizza would see the same popular restaurants in the same order within the personalised results. The personalisation was at the inclusion level but not the sort order level.

V2 Fix
Personalised sorting within the collection
Apply the same personalised ranking algorithm from the "Personalised Restaurant Ranking" feature to sort restaurants within Picks For You.
Sorting signals: user's order history (cuisine, basket size, frequency, meal timing) combined with restaurant features (rating, cuisine match, similarity score).
Outcome: every user's Picks For You feed shows the same eligible restaurants in a different personalised order. Akash's biryani spots rise; a pizza-first user sees different restaurants ranked at the top.
V1 Limitation
New area = personalisation collapse

When a user opened the app in a delivery zone with no order history, V1 had nothing to work with. The system fell back to generic popularity for that zone - showing the user restaurants that had no connection to their demonstrated preferences. Personalisation was essentially absent for the "Explorer" use case.

V2 Fix
Cross-area personalisation via cuisine mapping
When a user browses a new area with no local history, map their taste profile from other zones to restaurants in the new zone via cuisine and dish similarity.
Akash loves biryani in Banani → visiting Uttara → system finds biryani restaurants in Uttara matching similar cuisine/price profile.
Context sensitivity: recommendations auto-refresh when the user's delivery area changes. Area change is the trigger for re-running the cross-area mapping.
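
As a minimal sketch of the cross-area mapping, the example below reduces the taste profile to the (cuisine, pricing tier) pairs a user has ordered in other zones and matches restaurants in the current zone on both fields. The `Restaurant` fields, the integer price tiers, the restaurant names, and `cross_area_picks` are hypothetical; the real similarity model also used dish-level similarity, which is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    name: str
    zone: str
    cuisine: str
    price_tier: int  # assumed encoding: 1 = budget, 2 = mid, 3 = premium
    rating: float


def cross_area_picks(
    history: list[Restaurant],
    catalogue: list[Restaurant],
    current_zone: str,
) -> list[Restaurant]:
    """Map taste from other zones onto restaurants in the current zone."""
    # Taste profile reduced to (cuisine, price tier) pairs seen in other zones.
    liked = {(r.cuisine, r.price_tier) for r in history if r.zone != current_zone}
    local = [r for r in catalogue if r.zone == current_zone]
    # Both cuisine AND pricing tier must match, not cuisine alone.
    matches = [r for r in local if (r.cuisine, r.price_tier) in liked]
    # No cross-area match: fall back to top-rated restaurants in the zone.
    pool = matches or local
    return sorted(pool, key=lambda r: r.rating, reverse=True)


history = [Restaurant("Dhaka Biryani House", "Banani", "biryani", 2, 4.5)]
catalogue = [
    Restaurant("Uttara Biryani", "Uttara", "biryani", 2, 4.2),
    Restaurant("Premium Biryani", "Uttara", "biryani", 3, 4.8),
    Restaurant("Uttara Pizza", "Uttara", "pizza", 2, 4.9),
    Restaurant("Mirpur Pizza", "Mirpur", "pizza", 1, 4.0),
]
print([r.name for r in cross_area_picks(history, catalogue, "Uttara")])
# -> ['Uttara Biryani']
```

In a real client, this function would be re-run whenever the delivery area changes, which is the trigger described for the auto-refresh.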

The V2 acceptance criteria I wrote: (1) Popular restaurants in Picks For You sorted using personalised ranking. (2) New-area browsing shows cuisine/type lookalike recommendations from the user's other zones. (3) Recommendations adapt dynamically when area changes. (4) Fallback to top-rated nearby restaurants when no cross-area match exists. All four had to pass for V2 to ship.


07 — A/B Testing

Testing two questions at once - position and behaviour.

Personalisation quality is only part of the conversion equation. Two external factors - where the carousel sits on the homepage and whether it auto-scrolls - could independently affect whether users even engaged with the personalised results. I designed an A/B testing framework to measure both.

Experiment 1

Carousel Position

Testing whether "Picks For You" performs better in the 1st carousel slot versus the 3rd slot on the food homepage. Hypothesis: personalised content earns first position; placing it there lifts CTR by reducing the scroll required to reach the most relevant recommendation. Multiple cohorts with different position assignments, all controlled through the A/B framework.

Experiment 2

Auto-scroll Behaviour

Testing auto-scroll ON versus OFF within the collection. The risk with auto-scroll is that it skips the highest-ranked, most personalised restaurants before the user sees them. When auto-scroll is OFF, users immediately see the top-ranked options - exactly the ones with the highest similarity score. The hypothesis was that removing auto-scroll would increase engagement with high-similarity restaurants.

Why this mattered: A personalised recommendation engine that's buried on the homepage, or auto-scrolled past before users notice it, doesn't convert. These experiments were about making sure the impact of the engineering work we put into personalisation could actually be measured. Good personalisation in a bad position gives you noisy data and gives users a worse experience.

How I set it up

I specified the A/B framework requirements in the same PRD as the recommendation engine itself - because you can't retrofit testability. The framework needed to support:

Cohort Control

Multiple cohorts simultaneously receiving different carousel positions (1st, 3rd, etc.), with consistent assignment across sessions for the same user.

Behaviour Toggle

Server-side toggle for auto-scroll ON/OFF, configurable per cohort without requiring an app release. Essential for running the experiment at speed.

Clean Attribution

Tracking CTR and conversion attributable specifically to Picks For You interactions, isolated from other homepage carousels to avoid contamination.
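
Consistent assignment across sessions is commonly achieved by hashing the user ID rather than storing a random draw. The sketch below shows that technique; the experiment names, variant labels, and `assign_cohort` function are hypothetical and do not describe the framework that was actually built.

```python
import hashlib


def assign_cohort(user_id: str, experiment: str, variants: list[str]) -> str:
    """Deterministically map a user to a variant, stable across sessions.

    Salting the hash with the experiment name keeps the position test and
    the auto-scroll test independent of each other.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


position = assign_cohort("user-123", "carousel_position", ["slot_1", "slot_3"])
auto_scroll = assign_cohort("user-123", "auto_scroll", ["on", "off"])
print(position, auto_scroll)

# The same user always lands in the same cohort, session after session.
assert position == assign_cohort("user-123", "carousel_position", ["slot_1", "slot_3"])
```

Because the assignment is a pure function of the user ID and experiment name, it needs no lookup table and can be evaluated server-side on every homepage load, which fits a per-cohort toggle that changes without an app release.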


08 — Metrics

What success looked like - and what we measured to know.

North Star: Uplift in Picks For You collection CTR - the single number that tells you whether personalisation is earning user attention on the homepage. Everything else either explains why CTR moved, or guards against gaming it.

Layer | Metric | Why it's on the dashboard
Engagement | CTR on Picks For You collection | Primary signal - did the recommendation earn a tap? If a user sees the carousel and scrolls past, the recommendation failed.
Engagement | Zero-click homepage sessions on Picks For You | Diagnostic metric. A high zero-click rate signals a systematic recommendation quality problem - not variance. Tracking the reduction tells you personalisation is working, not just that CTR improved for an unrelated reason.
Conversion | Order conversion rate from Picks For You tap | Closes the loop from carousel tap to placed order. CTR without conversion means we're surfacing restaurants users find interesting but don't actually want to order from - clicks without revenue.
Conversion | Repeat order rate on recommended restaurants | Measures whether recommendations surface restaurants users come back to independently - a signal that the recommendation matched genuine preference, not just triggered a one-off curiosity click.
Discovery | % of orders from restaurants never previously ordered from | Measures whether the 40% "new but aligned" discovery slot is actually working. If this stays flat, the diversity rules aren't generating genuine new discovery - they're just cycling through the same pool.
Discovery | Cuisine diversity index per user (V2) | Tracks whether users are ordering from a broader range of cuisines over time after personalisation launches. A narrowing index would be an early filter-bubble warning signal.
System | Recommendation freshness - % of sessions served from updated model | Confirms the DS pipeline is refreshing taste profiles and similarity models on schedule. Stale models produce increasingly irrelevant recommendations over time without any visible error.
System | Fallback trigger rate | The % of sessions where personalisation data wasn't available and the system fell back to popular-in-zone. A rising fallback rate signals a pipeline problem before users report degraded quality.
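
For illustration, the engagement, conversion, and discovery metrics can be derived from per-session records along the lines below. The `Session` fields and the session-level definitions (for example, CTR as the share of sessions with at least one tap) are assumptions; the actual event schema and metric definitions are not specified in this write-up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    saw_carousel: bool           # Picks For You was rendered in this session
    tapped: bool                 # at least one tap on a Picks For You restaurant
    ordered: bool                # an order was placed following that tap
    first_time_restaurant: bool  # the order was at a never-before-ordered restaurant


def ratio(numerator: int, denominator: int) -> float:
    return numerator / denominator if denominator else 0.0


def carousel_metrics(sessions: list[Session]) -> dict[str, float]:
    shown = [s for s in sessions if s.saw_carousel]
    taps = [s for s in shown if s.tapped]
    orders = [s for s in taps if s.ordered]
    return {
        "ctr": ratio(len(taps), len(shown)),
        "zero_click_rate": ratio(len(shown) - len(taps), len(shown)),
        "cvr": ratio(len(orders), len(taps)),
        "discovery_share": ratio(sum(s.first_time_restaurant for s in orders), len(orders)),
    }


sessions = [
    Session(True, True, True, True),
    Session(True, True, False, False),
    Session(True, False, False, False),
    Session(True, False, False, False),
    Session(False, False, False, False),  # carousel never shown: excluded
]
print(carousel_metrics(sessions))
# -> {'ctr': 0.5, 'zero_click_rate': 0.5, 'cvr': 0.5, 'discovery_share': 1.0}
```

Counting only sessions where the carousel was actually shown keeps the denominators attributable to Picks For You rather than to homepage traffic in general.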

What I deliberately excluded

Time spent on homepage. A feature that improves recommendation relevance should reduce browsing time, not increase it. If Picks For You is working, users find what they want faster and place orders sooner. Optimising for session duration would mean optimising for friction.

Aggregate homepage impressions. Impressions can go up when quality goes down - a confusing carousel generates more scroll behaviour, which inflates impression counts. Tracking impressions as a success signal would risk rewarding a worse product.

A/B experiment metrics (separate layer): The position test tracked CTR differential between cohorts (1st vs. 3rd carousel slot). The auto-scroll test tracked tap rate on the top-3 ranked restaurants - the hypothesis being that turning off auto-scroll would increase engagement with the highest-similarity results, since users would see them immediately rather than having them scroll past. Both were instrumented independently of the core recommendation metrics to avoid signal contamination.


09 — Risks

What could go wrong - and how I designed against it.

High
Filter bubble: personalisation reinforces existing taste rather than expanding it
A pure preference-matching engine would show users increasingly narrow options over time - the biryani user only ever sees biryani. This is a long-term retention risk. Mitigated directly in the spec by the 40% "new but aligned" discovery slot and the cuisine diversity cap. The feature is designed to surface novelty, not just familiarity. The diversity rules were non-negotiable.
High
Cold-start for new users produces a generic experience
Users with fewer than ~5 orders have insufficient history for the taste profile and lookalike models to produce good signals. The result would be a personalised-looking carousel with generic content - which feels worse than an openly generic carousel. Mitigated by the fallback specification: below a history threshold, degrade gracefully to trending/popular-in-zone with no pretence of personalisation.
High
Stale taste profiles make recommendations irrelevant over time
A taste profile built on order history from 6 months ago may not reflect current preferences. Seasonal changes, lifestyle changes, or simply evolving taste could make the profile a liability rather than an asset. Mitigated by specifying recency weighting in the profile generation - recent orders carry more signal than old ones - and by including order recency in the "taste diversity score" that modulates how much weight the system gives to historical vs. recent behaviour.
Medium
Cross-area mapping surfaces irrelevant lookalikes
The quality of cross-area recommendations depends entirely on how well the cuisine and dish similarity model generalises. A "biryani restaurant in Gulshan" and a "biryani restaurant in Uttara" may be very different in quality, price point, and style. If the mapping is too loose, users get recommendations that feel generic anyway. Mitigated by specifying that cross-area matching must use cuisine AND pricing tier - both must match, not just cuisine alone.
Medium
A/B test results confounded by the A/B framework itself
If cohort assignment isn't stable across sessions, users may flip between variants, contaminating results. A user who sees position 1 one day and position 3 the next doesn't produce clean data for either cohort. Mitigated by requiring user-level (not session-level) cohort assignment in the framework specification, with consistency checks in the analytics tracking.
Low
Restaurant de-duplication across carousels
If the same restaurant appears in both Picks For You and another carousel on the same homepage, it reduces the perceived freshness of the recommendations and wastes carousel real estate. Mitigated by specifying a de-duplication pass across all homepage carousels before final rendering - a restaurant that appears in Picks For You should not appear in any other carousel below it.
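
The recency weighting named as the mitigation for stale taste profiles can be sketched with exponential decay, so that each order's influence halves after a fixed number of days. The 30-day half-life and the `cuisine_profile` function are hypothetical choices for illustration; the decay scheme actually used is not specified in this write-up.

```python
from collections import defaultdict


def cuisine_profile(
    orders: list[tuple[str, float]],
    half_life_days: float = 30.0,  # assumed value, not from the spec
) -> dict[str, float]:
    """Build cuisine weights from (cuisine, age_in_days) pairs.

    Each order contributes 0.5 ** (age / half_life), so recent orders
    carry more signal than old ones. Weights are normalised to sum to 1.
    """
    weights: dict[str, float] = defaultdict(float)
    for cuisine, age_days in orders:
        weights[cuisine] += 0.5 ** (age_days / half_life_days)
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()} if total else {}


# Three pizza orders from six months ago versus one biryani order today.
orders = [("pizza", 180.0), ("pizza", 180.0), ("pizza", 180.0), ("biryani", 0.0)]
profile = cuisine_profile(orders)
print(max(profile, key=profile.get))
# -> biryani
```

By raw count the profile would be 75% pizza; with decay, the single recent biryani order dominates, which is the behaviour the mitigation is after.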

10 — Lessons

What building this taught me about personalisation and shipping in phases.

01
Personalisation without diversity is just a preference echo chamber
The temptation in a recommendation system is to maximise relevance - show users more of what they've already demonstrated they like, ranked by similarity score. I resisted this, and I'm glad I did. The 40% discovery slot and the cuisine diversity cap weren't compromises on quality - they were quality. A carousel that shows you the same five restaurants with higher confidence is less useful than one that shows you four familiar and four genuinely good new options. Relevance and discovery are not opposing forces in a well-designed recommendation system.
02
Phasing by iteration boundary, not by feature scope, gave us cleaner data
I could have tried to ship V1 and V2 together - building personalised sorting and cross-area mapping at the same time as the base recommendation engine. Instead I drew the line at a natural product boundary: V1 solves the core problem (personalisation in familiar zones), V2 solves the harder extension (personalisation anywhere). This meant V1 metrics reflected the base recommendation quality in isolation, which made V2 impact attributable to the specific V2 changes rather than confounded with everything else. Phased delivery is often presented as a resourcing trade-off; it's also an analytical trade-off that's worth making deliberately.
03
Testability has to be designed in, not bolted on
I included the A/B testing framework requirements in the same PRD as the recommendation engine. The PM instinct to ship the feature first and instrument it later is a trap - by the time you want clean A/B data, the feature is already live and you're trying to retrofit cohort assignment into a running system. Specifying the experiment framework alongside the feature meant we could measure carousel position and auto-scroll effects from day one, which gave the business actual evidence rather than intuition about what configuration to commit to.
04
The fallback is a first-class product requirement, not an edge case
Every state where personalisation data isn't available - new users, new areas, pipeline failures - needed an explicit, specified fallback. I wrote these as acceptance criteria, not as footnotes. A recommendation carousel that breaks or degrades visibly for 15% of sessions is not a feature that's 85% shipped - it's a feature with a 15% failure rate that users will remember. Fallback quality is part of product quality.
05
Cross-functional alignment on scope is the hardest and most important work
This feature required real coordination across Data Science Engineering (for the similarity models and taste profile pipeline), Backend (for the API layer, caching, and fallback logic), and Product (for the diversity rules, ranking logic, and A/B framework). The biggest risk to the timeline was not engineering complexity - it was scope misalignment between teams. Writing clear dependencies, writing explicit acceptance criteria that each team could verify against, and holding joint reviews before each phase moved to implementation was the work that made the feature ship on time.

Appendix - The Evidence

Functional Spec Excerpts

The actual requirement artefacts behind the product - user stories, acceptance criteria, and the algorithm decision flow as written in the PRD.

Appendix A - User Stories
ID | As a… | I want… | So that…
US.1 | Returning user with order history | the Picks For You carousel to surface restaurants that match my cuisine preferences and budget | I spend less time scrolling and find something to order faster
US.2 | User visiting a new delivery zone | to see restaurant recommendations that reflect my preferences from other areas - not a generic popular list | personalisation follows me across the app, even in zones where I have no order history
US.3 | User who orders the same 3–4 restaurants repeatedly | to be shown at least some restaurants I haven't tried before - but ones that fit my taste profile | I can discover new options without feeling like the app is showing me random noise
US.4 | New user with fewer than 5 orders | to see a sensible carousel even before the system knows my preferences | the app doesn't feel broken or irrelevant during my first few sessions
Appendix B - Acceptance Criteria (V1)
# | Criteria | Owner | Verified by
AC.1 | Picks For You must use the user's order history (cuisine, basket size, delivery zone, meal time) as primary ranking signal - not aggregate popularity | DSE | QA: two users with different taste profiles must receive different carousel outputs in the same zone
AC.2 | No more than 3 restaurants from the same cuisine type must appear consecutively in the output | DSE | QA: run output for a user with a single dominant cuisine - verify cuisine cap applies
AC.3 | At least 40% of recommended restaurants must be restaurants the user has not previously ordered from but match their taste profile | DSE | Analytics: discovery % metric tracked per session from day one
AC.4 | If no personalised data is available (new user, pipeline failure), carousel must silently fall back to trending/popular-in-zone - no error state shown to user | BE | QA: simulate pipeline timeout - verify fallback content renders within SLA
AC.5 | The same restaurant must not appear in more than one homepage carousel at a time (de-duplication pass required) | BE | QA: verify no duplicate restaurant IDs across all carousels rendered on the same homepage load
Appendix C - Algorithm Decision Flow

How the system decides what goes into Picks For You for a given user session.

1
Input collection
Receive: User ID · Current location (lat/long) · Previously ordered restaurants · Session timestamp
2
History check
Does the user have ≥ 5 orders? YES → proceed to taste profile. NO → serve trending/popular fallback. END.
3
Source A - User-user similarity
Cluster users with similar behavioural vectors (cuisine type, basket size, meal timing, delivery zones). Pull restaurants frequently ordered by the 40 nearest lookalike users.
4
Source B - Restaurant-restaurant similarity
Identify restaurants similar to the user's past orders by cuisine type, pricing tier, and dish category. Excludes restaurants already in Source A to avoid duplication before merge.
5
Merge + composite score
Combine Sources A and B. Apply composite ranking: Proximity → Cuisine match → Rating → Lookalike frequency → Delivery area match.
6
Diversity pass
Apply 4 diversity rules: cuisine cap (max 3 consecutive same cuisine) · 60/40 familiar/new split · budget spectrum coverage · locality filter. Re-rank output to enforce all four rules simultaneously.
7
De-duplication pass
Remove any restaurant already appearing in another homepage carousel. Ensures Picks For You contains exclusively unique recommendations not visible elsewhere on the screen.
Final output
Scored, ranked, diversity-enforced list of personalised restaurants → served to "Picks For You" carousel. Cached per user session for homepage load performance.
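
Step 3 of the flow (Source A) can be sketched as a nearest-neighbour lookup over behavioural vectors. The flow does not name a similarity measure, so cosine similarity is an assumption here, as are the function names, the toy two-dimensional vectors, and the restaurant "Pizza Place".

```python
import math
from collections import Counter


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookalike_restaurants(
    user_id: str,
    vectors: dict[str, list[float]],  # user -> behavioural vector
    orders: dict[str, list[str]],     # user -> restaurants ordered from
    k: int = 40,                      # lookalike pool size from the flow
    top_n: int = 10,
) -> list[str]:
    """Restaurants most frequently ordered by the k most similar users."""
    target = vectors[user_id]
    neighbours = sorted(
        (u for u in vectors if u != user_id),
        key=lambda u: cosine(target, vectors[u]),
        reverse=True,
    )[:k]
    counts = Counter(r for u in neighbours for r in orders.get(u, []))
    return [restaurant for restaurant, _ in counts.most_common(top_n)]


vectors = {"akash": [1.0, 0.0], "u1": [0.9, 0.1], "u2": [0.8, 0.2], "u3": [0.0, 1.0]}
orders = {
    "u1": ["Biryani Express", "Spice N Rice"],
    "u2": ["Biryani Express"],
    "u3": ["Pizza Place"],
}
print(lookalike_restaurants("akash", vectors, orders, k=2))
# -> ['Biryani Express', 'Spice N Rice']
```

A full scan like this is only practical for small user bases; at production scale the same idea would typically sit behind precomputed clusters or an approximate nearest-neighbour index.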