Pathao Food Personalised Restaurant Ranking

01 — Background

Every collection showed the same restaurants, in the same order, to everyone.

Pathao Food organises its restaurant catalogue into curated collections - "Lunch Specials," "Budget Meals," "Best Nearby," and dozens more. These collections are prime discovery real estate: a user who taps "Budget Meals" has already told us something valuable about their intent. They're in a budget-conscious mindset right now.

The problem was that we weren't using any of that context. Every user who opened "Budget Meals" at noon saw the exact same ranked list. The office worker who orders rice-based meals at lunch and the student who prefers wraps saw the same first result. The ranking inside each collection was driven by aggregate restaurant popularity - a global signal that said nothing about the specific person looking at it.

We had a growing library of user order history, meal-time behavioural data, and restaurant feature data. None of it was being used to influence what a user saw first when they opened a collection. This was the gap the Personalised Restaurant Ranking project was built to close.

The brief I set: Make the order of restaurants inside every collection reflect something true about the person opening it - their cuisine preferences, their usual basket size, how they behave at this specific meal time. The best restaurant for Akash at 12:30 PM on a Tuesday should rank higher for Akash than it does for anyone else.

02 — Problem

Four compounding failures in the existing ranking model.

When I broke down why the current ranking wasn't working, I found not one problem but four distinct failures. Any solution that addressed only some of them would leave measurable value on the table.

01

Popularity bias crowded out relevance

The existing ranking was dominated by aggregate order counts - the most ordered-from restaurants in the city floated to the top of every collection, regardless of whether they matched the user's actual preferences. A user who has never ordered pizza would see a popular pizza restaurant ranked above their preferred biryani spot inside "Best Nearby." The most popular answer was rarely the most personally relevant answer.

02

No meal-time context

A user's preferences change by meal time in ways that are highly predictable from their history. The person who orders a light sandwich at breakfast is often the same person who wants a full rice meal at lunch. Showing the same restaurants in the same order at 8 AM and 1 PM treats completely different intent states as identical. The ranking had no time dimension at all - it was static across the entire day.

03

Repeat-order restaurants weren't rewarded

If a user has ordered from a restaurant three times in the past month, that's the clearest signal of preference we have - stronger than any similarity score. But the existing ranking treated a frequently ordered restaurant the same as one the user had never touched. Demonstrated loyalty had no weight in the ranking. Users had to scroll past unfamiliar options to reach restaurants they already trusted.

04

New users got the worst experience by default

For users with no order history, the ranking had no personal signals to work with at all. The fallback was undefined - which meant the ranking was essentially arbitrary for new users. A new user's first collection-browsing experience was the least curated experience in the app, at exactly the moment when first impressions matter most. There was no explicit fallback logic, no graceful degradation.

03 — My Approach

Five decisions that shaped the product before a line was written.

Before writing functional requirements, I made five strategic calls. Each one had trade-offs and I want to be direct about how I reasoned through them.

01

Build per-slot taste profiles, not a single user profile

The naive approach would have been to build one aggregate preference profile per user. I pushed for five separate profiles - one per meal slot (breakfast, lunch, snacks, dinner, late night) - based on the insight that the same user behaves very differently at different meal times. A single profile would blur these signals together, letting breakfast habits distort the lunch ranking and making it less accurate than the slot-level history supports. The DS engineering cost was real, but the relevance improvement justified it.

02

Use similarity scoring, not just history replay

Pure history replay - "show them restaurants they've ordered from" - is too narrow. It doesn't handle restaurants the user has never tried but would likely enjoy. I specified a user-restaurant similarity score built from four features: cuisine match, basket size range, delivery speed preference, and minimum rating threshold. This is what makes the ranking generalise beyond familiar ground.

03

Give repeat-order restaurants a hard priority boost

I wrote an explicit rule: if a user has ordered from a restaurant more than once, that restaurant receives elevated priority in the ranking - above its similarity score alone. Revealed preference through repeat orders is the strongest behavioural signal we have. No amount of similarity modelling can produce a signal as clean as "this user chose this restaurant again." This was a non-negotiable design decision.

04

Specify the fallback as a first-class requirement

Rather than treating new users as a corner case, I specified an explicit fallback path: popularity-sorted by total orders from the last 6 months, with random tie-breaking. This gave new users a sensible, well-defined experience from day one - not an undefined arbitrary list. Graceful degradation was written into the spec at the same priority level as the personalisation logic itself.

05

Ship V1 without a frontend change - ranking only

I made a deliberate scoping call: V1 would change only the sort order of restaurants inside collections. No UI changes, no new surfaces, no additional features. This let us get real-world ranking quality data before committing to a broader product investment. It also made V1 easier to instrument and attribute - if metrics moved, they moved because of the ranking algorithm, full stop.

04 — V1 Design

Building the ranking foundation - profiles, similarity scores, and the sort logic.

V1 comprised six functional requirements, all classified P00 (critical to have). None were optional - each was a necessary piece of the ranking pipeline. Here is what I specified and why.

The five meal-time slots

The first design decision was how to define meal slots. I specified five, balancing granularity against practical data density:

Slot	Hours
☀️ Breakfast	6 AM – 10 AM
🍱 Lunch	10 AM – 3 PM
🍵 Snacks	3 PM – 6 PM
🍛 Dinner	6 PM – 10 PM
🌙 Late Night	10 PM – 6 AM
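As a rough illustration, the slot lookup could be sketched like this in Python. The function name `meal_slot`, the slot identifiers, and the choice to treat each window as half-open (so 10:00 AM belongs to Lunch, not Breakfast) are assumptions; the spec only lists the five windows.

```python
from datetime import time

# Slot windows as listed above; each is treated as [start, end).
# Late Night wraps past midnight, so it is handled as the default case.
SLOTS = [
    ("breakfast", time(6, 0), time(10, 0)),
    ("lunch", time(10, 0), time(15, 0)),
    ("snacks", time(15, 0), time(18, 0)),
    ("dinner", time(18, 0), time(22, 0)),
]


def meal_slot(t: time) -> str:
    """Return the meal slot for a local time of day."""
    for name, start, end in SLOTS:
        if start <= t < end:
            return name
    return "late_night"  # 10 PM to 6 AM


print(meal_slot(time(13, 15)))  # lunch
```

Handling Late Night as the fall-through case avoids a special comparison for the window that crosses midnight.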

The six V1 functional requirements

1

P00 · Critical

Generate user food preference profiles per time slot

Build and store a preference profile for each user × meal slot combination using order history from the last 6 months. Profile captures: preferred cuisine tags, average basket size, average delivery speed preference, and minimum restaurant rating accepted.

Acceptance criteria

Profiles must be generated and stored separately for each of the 5 meal slots
If a user has no order data, or too little, for a specific slot, fall back to combined history from all other slots
Profile attributes: cuisine tags, avg. basket size, avg. delivery speed, min. rating threshold
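A minimal sketch of how per-slot profiles with the cross-slot fallback might be built. Everything named here is hypothetical: the `Order` fields, the `MIN_ORDERS_PER_SLOT` threshold of 3 (the spec does not state a number), keeping the top three cuisines, and reading "minimum rating accepted" as the lowest rating among restaurants the user actually ordered from.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean

MIN_ORDERS_PER_SLOT = 3  # hypothetical threshold; the spec does not state one


@dataclass
class Order:
    slot: str               # "breakfast", "lunch", "snacks", "dinner" or "late_night"
    cuisine: str
    basket_value: float
    delivery_minutes: float
    restaurant_rating: float


def build_profile(orders):
    """Summarise a list of orders into the four profile attributes."""
    cuisine_counts = Counter(o.cuisine for o in orders)
    return {
        "cuisines": [c for c, _ in cuisine_counts.most_common(3)],
        "avg_basket": mean(o.basket_value for o in orders),
        "avg_delivery_minutes": mean(o.delivery_minutes for o in orders),
        "min_rating": min(o.restaurant_rating for o in orders),
    }


def slot_profile(orders, slot):
    """Profile for one slot; thin slots fall back to the user's combined history."""
    in_slot = [o for o in orders if o.slot == slot]
    if len(in_slot) >= MIN_ORDERS_PER_SLOT:
        return build_profile(in_slot)
    # With zero orders in the slot, "all orders" equals "all other slots".
    return build_profile(orders) if orders else None
```

Returning `None` for a user with no orders at all leaves the decision to the caller, which would hand that user to the popularity fallback.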

2

P00 · Critical

Map restaurant features for similarity comparison

Generate a feature vector for every restaurant from the last 6 months of activity - the same four dimensions used in user profiles - so that user-restaurant similarity can be computed on a like-for-like basis.

Acceptance criteria

Every restaurant must have a scored feature vector covering cuisine types, avg. basket size, avg. delivery time, and rating
Feature vectors refreshed on a rolling 6-month window

3

P00 · Critical

Calculate user–restaurant similarity score per time slot

For the active meal slot, compute a similarity score between the current user's slot profile and every restaurant's feature vector. This score drives the ranking order inside each collection.

Acceptance criteria

Score must incorporate: cuisine match, delivery proximity, basket size range match, rating threshold, order frequency
Score recalculated dynamically based on current time-of-day slot
Tie-breaking: if two restaurants have identical scores, one is selected randomly - no deterministic tie-break that could create a persistent bias
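One way the score and the random tie-break could fit together, sketched under stated assumptions. The five inputs come from the acceptance criteria; the weights, the 60-minute proximity scale, the cap of five past orders, and the dictionary field names are all hypothetical, since the spec names the dimensions but not how they are combined.

```python
import random

# Hypothetical weights that sum to 1.0, so the score stays in [0, 1].
WEIGHTS = {"cuisine": 0.35, "proximity": 0.15, "basket": 0.20,
           "rating": 0.15, "frequency": 0.15}


def similarity(profile, restaurant, past_orders=0):
    """Weighted score from the five dimensions in the acceptance criteria."""
    shared = set(profile["cuisines"]) & set(restaurant["cuisines"])
    gap = abs(restaurant["avg_basket"] - profile["avg_basket"]) / profile["avg_basket"]
    parts = {
        "cuisine": 1.0 if shared else 0.0,
        "proximity": max(0.0, 1.0 - restaurant["delivery_minutes"] / 60.0),
        "basket": max(0.0, 1.0 - gap),
        "rating": 1.0 if restaurant["rating"] >= profile["min_rating"] else 0.0,
        "frequency": min(past_orders, 5) / 5.0,
    }
    return sum(WEIGHTS[k] * parts[k] for k in WEIGHTS)


def rank(scored, rng=random):
    """Sort (restaurant_id, score) pairs by score descending, ties in random order."""
    shuffled = list(scored)
    rng.shuffle(shuffled)  # randomise first...
    # ...then rely on the stable sort to keep tied items in that random order.
    return sorted(shuffled, key=lambda pair: pair[1], reverse=True)
```

Shuffling before a stable sort means no restaurant wins ties persistently through insertion order or ID order. The sketch assumes `profile["avg_basket"]` is positive.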

4

P00 · Critical

Prioritise restaurants ordered multiple times by the user

Apply a hard priority boost to restaurants a user has ordered from more than once - elevating them above their base similarity score. Repeat orders are the strongest behavioural signal available; they must outrank any similarity-only recommendation.

Why this matters

Similarity scores are estimates; repeat orders are confirmed preference
Users should not have to scroll past unfamiliar restaurants to reach their regulars
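The hard boost can be pictured as a partition of the similarity-ordered list rather than a change to the scores themselves. This is an illustrative sketch only; `apply_repeat_boost` and its inputs are hypothetical names.

```python
def apply_repeat_boost(ranked_ids, order_counts):
    """Move restaurants ordered more than once ahead of the rest.

    ranked_ids is already sorted by similarity, so each group keeps
    its similarity order internally.
    """
    repeats = [r for r in ranked_ids if order_counts.get(r, 0) > 1]
    others = [r for r in ranked_ids if order_counts.get(r, 0) <= 1]
    return repeats + others


ranked = ["dhaka_biryani_house", "spice_n_rice", "pizza_place"]
counts = {"spice_n_rice": 4, "pizza_place": 1}
print(apply_repeat_boost(ranked, counts))
# ['spice_n_rice', 'dhaka_biryani_house', 'pizza_place']
```

A restaurant ordered exactly once stays in the similarity-ranked group, matching the "more than once" wording of the rule.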

5

P00 · Critical

Sort restaurants inside each collection by similarity score

The similarity scores computed above drive the visible ranking inside every collection. High-to-low similarity = top-to-bottom in the list. The sort updates dynamically as the user's meal slot changes through the day.

Acceptance criteria

Visible restaurant order inside any collection must reflect similarity score descending, after the repeat-order boost from requirement 4 has been applied
Ranking updates automatically when time crosses a slot boundary
No frontend card format changes required - ranking change only

6

P00 · Critical

Handle users with no or insufficient order history

For new users or low-activity users where profile generation is not viable, fall back gracefully to a popularity-sorted list - defined as restaurants ranked by total orders in the last 6 months within the relevant collection.

Acceptance criteria

Fallback triggered when user has insufficient history for reliable slot profile generation
Fallback ranking = order count from last 6 months (descending)
Tie in popularity: random selection - no persistent ordering bias
Fallback degrades silently - no user-facing indicator that personalisation is absent
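A sketch of the fallback path, assuming a hypothetical `MIN_ORDERS_FOR_PROFILE` threshold for "insufficient history" (the spec leaves the number open) and a plain mapping of restaurant ID to order count over the last 6 months.

```python
import random

MIN_ORDERS_FOR_PROFILE = 3  # hypothetical; the spec does not define "insufficient"


def popularity_ranking(order_counts_6m, rng=random):
    """Rank restaurant IDs by 6-month order count, descending, ties in random order."""
    ids = list(order_counts_6m)
    rng.shuffle(ids)  # random order first, then a stable sort keeps it for ties
    return sorted(ids, key=lambda r: order_counts_6m[r], reverse=True)


def rank_collection(user_order_count, personalised_ranking, order_counts_6m):
    """Serve the popularity ranking when the user has too little history."""
    if user_order_count < MIN_ORDERS_FOR_PROFILE:
        return popularity_ranking(order_counts_6m)
    return personalised_ranking
```

Both branches return the same shape, a list of restaurant IDs, which is what lets the fallback stay invisible to the frontend.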

A concrete example

Akash opens "Lunch Specials" at 1:15 PM on a weekday

→ Slot detection: 1:15 PM falls in the Lunch slot. System retrieves Akash's lunch preference profile: biryani/kebab dominant, mid-basket size, avg. delivery 25 min, min. rating 4.0.

→ Similarity scoring: Every restaurant in "Lunch Specials" is scored against Akash's lunch profile. "Dhaka Biryani House" (cuisine match, rating 4.3, similar basket) scores 0.87. A popular but cuisine-mismatched restaurant scores 0.41.

→ Repeat-order boost: Akash has ordered from "Spice N Rice" 4 times. It receives priority elevation, placing it above where its raw similarity score of 0.79 alone would rank it.

→ Final ranking: Spice N Rice (repeat-order boost) → Dhaka Biryani House (0.87) → other high-similarity matches → low-similarity restaurants at the bottom.

→ Result: Akash sees his trusted spot first, with relevant discovery options immediately below - not after scrolling past ten popularity-driven but personally irrelevant results.
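The walkthrough above can be replayed with its own numbers. This is a worked illustration, not the production logic: "Popular Mismatch" is a placeholder name for the unnamed cuisine-mismatched restaurant, and tie-breaking is left out because no two scores are equal here.

```python
# Similarity scores and order history taken from the example above.
scores = {
    "Spice N Rice": 0.79,
    "Dhaka Biryani House": 0.87,
    "Popular Mismatch": 0.41,  # placeholder name for the mismatched restaurant
}
akash_order_counts = {"Spice N Rice": 4}


def final_ranking(scores, order_counts):
    """Similarity descending, then repeat-order restaurants lifted to the front."""
    by_score = sorted(scores, key=scores.get, reverse=True)
    repeats = [r for r in by_score if order_counts.get(r, 0) > 1]
    rest = [r for r in by_score if order_counts.get(r, 0) <= 1]
    return repeats + rest


print(final_ranking(scores, akash_order_counts))
# ['Spice N Rice', 'Dhaka Biryani House', 'Popular Mismatch']
```

On similarity alone, Dhaka Biryani House (0.87) would lead; the repeat-order rule is what puts Spice N Rice (0.79) first.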

05 — V2 Design

Extending the algorithm - additional signals for a richer ranking model.

V1 established the core ranking infrastructure: slot profiles, similarity scoring, repeat-order boosting, and the fallback path. V1's post-launch data surfaced a clear gap: the ranking still defaulted heavily toward high-scoring-but-familiar restaurants, with insufficient weight for contextual factors like active promotions and real-time restaurant availability. V2 extended the algorithm with these signals.

V1 Limitation

Algorithm relied solely on preference and history signals

V1's ranking combined similarity scores and repeat-order boosts - both of which are backward-looking signals derived from historical behaviour. They said nothing about what was actively happening on the platform right now: which restaurants had live promotions, which had recently improved their delivery performance, which were experiencing a surge in orders. A restaurant that was offering 20% off was ranked identically to one that wasn't.

V2 Extension

Add popularity momentum and discount signals

Popularity momentum: Incorporate a recency-weighted order velocity signal - restaurants trending upward in recent orders receive a ranking boost, capturing real-time demand signals that history alone misses.

Discount signal: Restaurants with active promotions receive a contextual boost in the ranking - surfacing deals to users at the moment they're deciding what to order, rather than requiring them to browse separately to a deals section.

Blended scoring: V2 combines the V1 similarity-based rank with the new signals into a weighted composite score, preserving the personalisation benefits of V1 while adding real-time context.
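To make the three V2 signals concrete, here is a hedged sketch of a recency-weighted momentum value and a weighted composite. The 7-day half-life, the three weights, and the assumption that momentum is normalised to [0, 1] before blending are illustrative choices, not values from the spec.

```python
def momentum(order_ages_days, half_life_days=7.0):
    """Recency-weighted order velocity: an order counts half as much per half-life of age."""
    return sum(0.5 ** (age / half_life_days) for age in order_ages_days)


# Hypothetical weights; similarity stays the dominant factor.
W_SIMILARITY, W_MOMENTUM, W_DISCOUNT = 0.7, 0.2, 0.1


def blended_score(similarity, momentum_norm, has_discount):
    """Weighted composite of the V1 similarity score and the two V2 signals.

    momentum_norm is assumed to be scaled to [0, 1], for example by dividing
    by the highest momentum value within the collection.
    """
    discount = 1.0 if has_discount else 0.0
    return (W_SIMILARITY * similarity
            + W_MOMENTUM * momentum_norm
            + W_DISCOUNT * discount)


print(momentum([0, 7, 14]))  # 1.75
```

With these weights, a discount adds at most 0.1 to the composite, so it can reorder restaurants with close similarity scores but cannot lift a poor match over a strong one.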

Why I didn't add these in V1: Both signals required additional data pipelines - real-time order velocity tracking and a live promotions API feed - that weren't available when V1 shipped. Rather than delay V1 to build these dependencies, I scoped them explicitly as V2 requirements with a clear handoff once V1 data confirmed the core ranking model was working. V1 gave us the proof of concept; V2 gave us the full model.

06 — Metrics

What success looked like - and the full framework I used to measure it.

North Star: Reduction in time from collection open to checkout initiation - the clearest signal that a user found what they were looking for faster, directly attributable to ranking quality.

Layer	Metric	Why it's on the dashboard
Engagement	CTR on top-3 ranked restaurants in collection	Directly measures whether the ranking is surfacing relevant restaurants at the positions users actually look at. If CTR on position 1–3 increases, personalisation is working at the top of the list where it matters most.
Engagement	Scroll depth before first tap within collection	Good personalisation should reduce how far users scroll before finding something they want. Decreasing scroll depth = the right restaurant appeared earlier in the ranked list.
Speed	Time from collection open to checkout initiation	North Star metric. Captures the full browsing-to-decision journey. A shorter time means less cognitive load - the user didn't have to work hard to find a relevant option.
Speed	Time from collection open to order request	Extends the north star to full conversion. Collection open → order placed is the complete funnel, and ranking quality should compress it.
Conversion	Order conversion rate from collection pages	Guardrail metric. Personalisation should increase conversion, not just speed. If CVR stays flat while time decreases, users are finding options faster but still not satisfied with them - a signal to revisit ranking quality.
Conversion	Repeat-order rate on ranked restaurants	Measures whether users are returning to restaurants they discovered via the ranked collection - confirmation that the recommendations generated lasting preference, not just one-off clicks.
System	Profile coverage - % of active users with slot profiles	Operational health check. If coverage is low, many users are being served the fallback popularity ranking. Tracking this ensures the personalisation pipeline is running and coverage is expanding as order history accumulates.
System	Fallback trigger rate per slot	Shows which meal slots have insufficient data density. A persistently high fallback rate for "Breakfast" tells us breakfast ordering behaviour is thin - actionable input for the DS team on data collection priorities.
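As an illustration of how two of these metrics might be computed from session logs, the sketch below assumes a hypothetical record shape with `open_ts` and `checkout_ts` in seconds and a `first_tap_position` field. Using the median for the north star is also an assumption; the framework does not say which aggregate was used.

```python
from statistics import median


def time_to_checkout_seconds(sessions):
    """Median seconds from collection open to checkout initiation.

    Only sessions that reached checkout are counted.
    """
    durations = [s["checkout_ts"] - s["open_ts"]
                 for s in sessions if s.get("checkout_ts") is not None]
    return median(durations) if durations else None


def top3_ctr(sessions):
    """Share of collection opens whose first tap landed on positions 1 to 3."""
    if not sessions:
        return 0.0
    hits = sum(1 for s in sessions if s.get("first_tap_position") in (1, 2, 3))
    return hits / len(sessions)
```

Sessions without a checkout are left out of the timing metric but still count in the CTR denominator, so abandoned sessions do not distort the timing figure.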

What I deliberately excluded

Session length. A ranking system that works should reduce session length, not increase it. If users spend more time browsing after personalisation launches, the ranking may be showing them more options without making the right one obvious. Optimising for session length would reward the wrong outcome.

Total page impressions. High impressions with no CTR increase means users are looking but not engaging. Tracking impressions as a success metric would obscure whether personalisation was actually creating relevance or just attracting passive attention.

Guardrail metric: Conversion rate was tracked as a guardrail, not a primary success metric. The goal was to improve speed-to-decision; conversion was expected to follow. If conversion dropped while time-to-checkout improved, that would be a signal to investigate - it shouldn't happen, and tracking it as a guardrail means we'd catch it immediately if it did.

07 — Risks

What could go wrong - and how I designed against it.

High

Sparse slot profiles for low-frequency meal times produce irrelevant rankings

Not every user orders breakfast or late-night meals regularly. A profile built on 2 breakfast orders in 6 months is unreliable. If this profile drives the ranking, users get personalisation that's worse than the fallback. Mitigated by specifying that slots with insufficient order density fall back to the user's combined cross-slot history - a richer signal than an unreliable sparse profile.

High

Repeat-order boost creates a filter bubble over time

If frequently-ordered restaurants are always ranked first, users may never see anything new in a collection they open regularly. The best-match result becomes increasingly identical across sessions. Mitigated by capping the repeat-order boost - it elevates but doesn't monopolise the top positions - so discovery candidates always appear in the visible ranking.
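A capped version of the boost could look like the sketch below. The cap of two boosted positions is hypothetical, as is the choice to boost the repeat-order restaurants with the highest similarity first; the mitigation only says the boost is capped.

```python
MAX_BOOSTED = 2  # hypothetical cap on how many regulars can be lifted to the top


def apply_capped_boost(ranked_ids, order_counts, max_boosted=MAX_BOOSTED):
    """Lift at most max_boosted repeat-order restaurants to the front.

    ranked_ids is sorted by similarity, so the regulars chosen are the
    best-scoring ones; the rest keep their similarity position.
    """
    repeats = [r for r in ranked_ids if order_counts.get(r, 0) > 1]
    boosted = repeats[:max_boosted]
    boosted_set = set(boosted)
    rest = [r for r in ranked_ids if r not in boosted_set]
    return boosted + rest


ranked = ["a", "b", "c", "d", "e"]
counts = {"c": 3, "d": 2, "e": 5}
print(apply_capped_boost(ranked, counts))
# ['c', 'd', 'a', 'b', 'e']
```

Here the third regular, `e`, is not lifted, which leaves positions three and four for the discovery candidates `a` and `b`.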

High

Stale profiles cause ranking quality to degrade silently

If the DS pipeline stops refreshing slot profiles, users receive rankings based on months-old behaviour that no longer reflects their preferences. This fails silently - there's no visible error, just increasingly irrelevant results. Mitigated by the "profile coverage" and "fallback trigger rate" metrics, which would show anomalies before users report quality issues.
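One possible shape for that monitoring, sketched with hypothetical names: fallback rates per slot are computed from ranking logs and compared against a baseline, and a slot is flagged when its rate rises by more than an assumed tolerance of 10 percentage points.

```python
from collections import defaultdict


def fallback_rate_by_slot(requests):
    """requests: iterable of (slot, used_fallback) pairs from ranking logs."""
    totals, fallbacks = defaultdict(int), defaultdict(int)
    for slot, used_fallback in requests:
        totals[slot] += 1
        if used_fallback:
            fallbacks[slot] += 1
    return {slot: fallbacks[slot] / totals[slot] for slot in totals}


def slots_to_investigate(current, baseline, tolerance=0.10):
    """Slots whose fallback rate rose more than `tolerance` above the baseline."""
    return sorted(slot for slot, rate in current.items()
                  if rate - baseline.get(slot, 0.0) > tolerance)
```

A check like this only catches staleness that shows up as lost coverage; profiles that still exist but have stopped updating would need a separate freshness timestamp.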

Medium

Similarity model penalises restaurants in mismatched but relevant collections

A user who rarely orders pizza but opens "Pizza Specials" explicitly is signalling current intent that overrides their historical profile. A strict similarity score might rank their profile's cuisine preferences above the collection's explicit theme. Mitigated by ensuring the similarity model scores within the collection's restaurant set - not globally - so collection context constrains the ranking appropriately.

Medium

V2 discount signal inflates ranking for promoted restaurants at the expense of relevance

If the discount signal is weighted too heavily, collections become effectively a promotions list - ranking restaurants by who's offering deals rather than who's actually relevant. This damages trust in the personalisation. Mitigated by treating the discount signal as a tiebreaker or secondary boost, with similarity score remaining the dominant ranking factor.

Low

Random tie-breaking creates inconsistent experience across sessions

When two restaurants have identical similarity scores, the random tie-break means the same user opening the same collection twice might see different orderings. For most users this is imperceptible, but for power users who notice, it could feel inconsistent. Accepted as a deliberate trade-off over deterministic tie-breaking, which would create a persistent ordering bias that advantages specific restaurants unfairly.

08 — Lessons

What building this taught me about ranking systems and PM craft.

01

Context-awareness is not a feature - it's a prerequisite for relevance

The biggest insight from this project was that a single user profile produces mediocre personalisation. The same person has genuinely different preferences at breakfast and dinner, and treating them as one static entity produces recommendations that are half-wrong all the time. Building five slot profiles was more complex than one aggregate profile, but the ranking quality gain was not marginal - it was structural. Any personalisation system that ignores context is working with one hand tied behind its back.

02

Demonstrated preference beats modelled preference every time

The repeat-order priority rule was the simplest thing in the spec and arguably the most impactful. No similarity score is more reliable than "this user chose this restaurant again." The lesson I took from this: when designing any recommendation or ranking system, identify the clearest revealed-preference signals first. Build the complex model second. The simple rule often does more work than the model.

03

The fallback IS the product for a material percentage of users

In a two-sided marketplace like Pathao Food, new users are constantly arriving. During launch and for weeks afterward, a significant share of users would hit the fallback path. I specified the fallback at P00 priority - the same as the core personalisation logic - because a poorly designed fallback affects real users at real scale. The lesson: never design a fallback as an afterthought. Design it as deliberately as the main path.

04

Phasing by data dependency, not by feature scope, gave us cleaner attribution

I could have tried to include V2's popularity and discount signals in the initial release. The reason I didn't was pipeline readiness, not feature desirability. But the side effect was valuable: V1's clean launch gave us unambiguous attribution - any metric movement was caused by the similarity ranking, full stop. When V2 launched, we could compare against V1's baseline rather than trying to untangle multiple simultaneous changes. Phasing by dependency produced better analytical clarity than scope alone would have.

05

Speed-to-decision is the right north star for a ranking feature

I chose time-to-checkout as the north star rather than CTR or conversion rate. This was intentional. A ranking feature that works should make decisions easier and faster - the user shouldn't need to scroll as far or think as hard. CTR and conversion are downstream of that. If time-to-checkout decreases, CTR and conversion should follow. Choosing the more direct signal over the more familiar one forced cleaner thinking about what the feature was actually trying to do.

Functional Spec Excerpts

ID	As a…	I want…	So that…
US.1	Returning user with lunch-time order history	to see my preferred cuisine types ranked at the top of any collection I open at lunchtime	I find a relevant restaurant faster and spend less time scrolling through options that don't match my taste
US.2	User who orders from the same 2–3 restaurants regularly	my trusted regular spots to appear at the top of collections, above restaurants I've never tried	I don't have to search for or scroll to restaurants I already know I like
US.3	User browsing at dinner time whose breakfast preferences are very different	the collection ranking to reflect my dinner preferences, not my breakfast behaviour	the app feels contextually aware of when I'm ordering, not just what I've ordered in aggregate
US.4	New user with no order history	to see a sensible, well-ordered collection even before the system knows my preferences	my first browsing experience isn't a random or arbitrary list that makes the app feel unpolished

#	Criteria	Owner	Verified by
AC.1	User preference profiles must be generated separately for each of the 5 meal slots. Cross-slot fallback applies only when a specific slot has insufficient data.	DSE	QA: users with orders concentrated in one slot must have distinct profiles for that slot vs. others
AC.2	Similarity score must incorporate all 5 dimensions: cuisine match, delivery proximity, basket size range, rating threshold, and historical order frequency with the restaurant.	DSE	QA: confirm scoring model outputs distinct scores for restaurants varying on each dimension independently
AC.3	Restaurants ordered more than once by a user must rank above their similarity-score position in the collection. Repeat-order boost is a hard override, not a score modifier.	DSE	QA: test user with repeat-order history - verify those restaurants appear before higher-scored but never-ordered alternatives
AC.4	Restaurant ranking inside each collection must update dynamically when the user's active meal slot changes (i.e., at slot boundary times). No manual refresh required.	BE	QA: simulate time crossing a slot boundary mid-session, verify ranking updates without page reload
AC.5	Fallback for users with insufficient history: rank by total orders in the last 6 months, descending. Random tie-breaking. No user-facing indicator that personalisation is absent.	BE	QA: new test account sees popularity-ranked collection; no UI difference visible vs. personalised experience
