Price drop tracker system design
Design a system that tracks price changes for products listed on Amazon.com and alerts users when a price change meets their criteria.
Users can watch for the following kinds of price change:
- Relative price change: the user sets a price drop percentage and is notified when the price drops by that percentage.
- Absolute price drop to x: the user sets a target price value and is notified when the price drops to that level.
Users should be notified as soon as a product price change is detected.
There are two ways to design the price drop criteria:
#1 Server driven price drop criteria
- The user subscribes to a product's price drops.
- The server tracks price drops and notifies all subscribed users.
#2 User driven price drop criteria
- The user subscribes to a product's price drops with their own criteria.
- The server evaluates every user's criteria and notifies only the users whose conditions match.
In this design we will focus on #2, but we will keep the design flexible enough to support both.
We should design for the following scale:
- Support 1+ million active users.
- Each user can subscribe to up to 100 products.
- Globally, support 10+ million products.
High level System design
Following is the high level design for the price drop checker system.
At a high level we have the following components:
- Crawler: crawls products for price changes.
- Price Checker System: detects product price drops per user config and generates events.
- Product Subscription Service: exposes an API that lets users subscribe to products.
- Alerts System: takes care of sending alerts to users.
Following is the high level flow:
- The user requests a product price drop subscription.
- The crawler fetches product prices every hour and publishes events to ProductPriceQ.
- The Price Checker System fans out to each subscribed user and checks whether the price drop matches the user's subscription request. If it does, it publishes an event to AlertsQ.
- The AlertsQ handler takes care of sending alerts to the user.
Let's go through each system in detail.
Product Subscription Service
It exposes the following API, which allows a user to subscribe to product price drop alerts.

POST /products/<product_id>/subscription
Request {
  UserId: xxx
  PriceChangeThreshold: x%
  PriceValueThreshold: xxx
}
Response {
  Status: OK
}
It writes the user's subscription request into the ProductSubscriptions table. Following is the schema for this table.

ProductId | UserId | PriceChangeThreshold | PriceValueThreshold | CreateTimestamp
xxx | xxx | xxx | xx | xxx

Primary Key: ProductId + UserId
Shard Key: ProductId
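The write path of this service can be sketched in Python. This is a minimal sketch under assumptions: the names `SubscriptionRequest` and `build_subscription_row` are made up for illustration, and a real implementation would persist the row to the sharded ProductSubscriptions store instead of returning a dict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubscriptionRequest:
    user_id: str
    price_change_threshold: Optional[float]  # e.g. 10.0 means "alert on a 10% drop"
    price_value_threshold: Optional[float]   # e.g. 25.0 means "alert when price <= 25"


def build_subscription_row(product_id: str, req: SubscriptionRequest, now_ts: int) -> dict:
    """Validate the request and build the ProductSubscriptions row
    keyed by (ProductId, UserId), sharded by ProductId."""
    if req.price_change_threshold is None and req.price_value_threshold is None:
        raise ValueError("at least one threshold must be set")
    if req.price_change_threshold is not None and not (0 < req.price_change_threshold <= 100):
        raise ValueError("percentage threshold must be in (0, 100]")
    return {
        "ProductId": product_id,  # shard key
        "UserId": req.user_id,
        "PriceChangeThreshold": req.price_change_threshold,
        "PriceValueThreshold": req.price_value_threshold,
        "CreateTimestamp": now_ts,
    }
```

Validating that at least one threshold is present keeps the downstream checker from fanning out messages it can never match.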
Product Crawler
The product crawler takes care of scanning for product price changes. We have two choices here:
- Call the eCommerce API to get the product details.
- Have eCommerce call a pre-registered callback whenever a price changes.
If eCommerce supports option #2, the overall design is cleaner and we can produce alerts in near real time. If #2 is not available, we have to call the eCommerce API on a fixed interval. Let's design our system based on #1.
Following are the key features of the crawler:
- Be polite while making HTTP calls to the target eCommerce server.
- Handle duplicate products to avoid hitting the target server twice for the same product.
- Rescan products to get the latest prices.
- Clean up products that no longer exist.
- To discover new products, leverage the eCommerce server's similar-products API.
Following is the overall design of the product price checker crawler.
Product Price Queue
There are two ways to initialize this queue:
- On demand: only scan a product once a user has shown interest in it.
- Scan all products: scan every product and have the price data ready.
#1 works well at small scale, but as the user base grows, #2 becomes the better solution. For the sake of discussion, we will go with the #2 approach here.
We can initialize ProductPriceQueue with the initial list of products we would like to start crawling. The handler for this queue does the following:
- Makes an RPC call to eCommerce to get the product's price details.
- Writes the new price into the ProductPriceHistories table only if it differs from the previously captured price.
- If it discovers that the price has changed, publishes a message to ProductPriceChangedQueue so the Price Checker System can handle the change.
- On success, publishes a message to RelatedProductsQueue to discover new products.
- Enqueues a new message for the same product into ProductPriceQueue with a delay of x hours, to add a gap before the same product is processed again.
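The handler steps above can be sketched as one function. This is a sketch under assumptions: the function name is made up, and the fetch/publish collaborators are injected callables standing in for the real RPC and queue clients, with a plain dict standing in for the last-price store.

```python
def handle_product_price_message(product_id, fetch_price, last_price_store,
                                 publish_price_changed, publish_related, requeue_self,
                                 rescan_delay_hours=3):
    """One pass of the ProductPriceQueue handler (collaborators injected for testability)."""
    current = fetch_price(product_id)        # RPC to the eCommerce API
    last = last_price_store.get(product_id)  # last captured price, if any
    if last != current:
        # Record only when the price differs from the previously captured one
        # (a real implementation also appends to ProductPriceHistories).
        last_price_store[product_id] = current
        if last is not None:
            # Notify the Price Checker System about the change.
            publish_price_changed(product_id, last, current)
    publish_related(product_id)  # feed RelatedProductsQueue for product discovery
    # Re-enqueue the same product with a delay to space out rescans.
    requeue_self(product_id, delay_hours=rescan_delay_hours)
```

Injecting the collaborators keeps the control flow (compare, record, fan out, reschedule) testable without any real queue or RPC stack.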
Following is the schema for the ProductPriceHistories table.

ProductId | Timestamp | Price
xxx | xxx | xxx

Primary Key: ProductId + Timestamp
Shard Key: ProductId
How do we discover that the price changed?
The handler can run the following query to read the last price seen for this product.

SELECT Price
FROM ProductPriceHistories
WHERE ProductId = xxx
ORDER BY Timestamp DESC
LIMIT 1

It then compares this lastPrice with the current price to detect whether there is any change.
But the query above is not efficient, since it scans all the historical prices for the product. Because our goal is only to read the last price, we can add a ProductLastPrice table and always use it for the comparison. Following is the schema for this table.

ProductId | CurrentPrice | UpdatedAt
xxx | xxx | xxx

Primary Key: ProductId
Shard Key: ProductId
Now we can run a primary-key-based query to read the last scanned price.

SELECT CurrentPrice
FROM ProductLastPrice
WHERE ProductId = xxx
Note: we must make the write to this table idempotent to handle retries after errors. This can be done by including the timestamp as a condition while updating the table.
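The timestamp-conditioned, idempotent write can be sketched as follows. This is a sketch under assumptions: the function name is made up, and an in-memory dict stands in for the ProductLastPrice table; the equivalent SQL condition is shown in the docstring.

```python
def update_last_price(table, product_id, new_price, price_ts):
    """Idempotent ProductLastPrice write: apply only if this observation is newer
    than what is stored, so a retried (duplicate) message cannot clobber fresher data.

    SQL equivalent:
      UPDATE ProductLastPrice SET CurrentPrice = ?, UpdatedAt = ?
      WHERE ProductId = ? AND UpdatedAt < ?
    """
    row = table.get(product_id)
    if row is not None and row["UpdatedAt"] >= price_ts:
        return False  # stale or duplicate message; no-op, safe to ack
    table[product_id] = {"CurrentPrice": new_price, "UpdatedAt": price_ts}
    return True
```

Because the condition compares the message's own timestamp against the stored one, replaying the same message (or an older one) is a harmless no-op.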
Related Products Queue
The handler for this queue takes care of discovering new products. Following is the high level implementation of this handler:
- Makes an RPC call to get the details of similar products.
- Looks up the DuplicateProducts table to check whether any newly discovered products need scanning.
- If it discovers a new product, publishes it to ProductPriceQueue with a predefined delayed delivery.
Politeness design while making RPC calls to the eCommerce server
Since there is a single eCommerce server, we need to be very careful about the traffic we add to it. We can take the following approaches:
- User-Agent identification: identify ourselves as a crawler in the user-agent, so the server understands who is accessing it and can potentially tailor its responses.
- Delay between requests: leverage delayed message delivery to add a gap before the same product is processed again.
- Handle errors gracefully: implement error handling for HTTP status codes like 404 (Not Found), 500 (Internal Server Error), and rate-limiting responses. If a request fails, first check the error code; retry only if the error is retryable, and only after a suitable delay.
- Rate limiting: use a fixed number of queue partitions with a controlled per-product processing rate, so the outbound traffic to the eCommerce system is deterministic.
Following is the crawler design to handle politeness.
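The rate-limiting idea can be sketched as a per-worker throttle. This is a sketch under assumptions: the class name `PoliteFetcher` is made up, and the clock/sleep functions are injectable only so the behavior can be tested deterministically.

```python
import time


class PoliteFetcher:
    """Caps outbound QPS to the eCommerce server by enforcing a minimum gap
    between requests from this worker. With N workers, the total outbound
    traffic is at most N * max_qps_per_worker, which is deterministic."""

    def __init__(self, max_qps_per_worker: float, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = 1.0 / max_qps_per_worker  # seconds between requests
        self.clock = clock
        self.sleep = sleep
        self._last = None  # monotonic timestamp of the previous request

    def fetch(self, do_request):
        """Run do_request(), sleeping first if the previous call was too recent."""
        now = self.clock()
        if self._last is not None and now - self._last < self.min_gap:
            self.sleep(self.min_gap - (now - self._last))
        self._last = self.clock()
        return do_request()
```

Pinning the per-worker rate (rather than relying on queue backpressure alone) is what makes the load on the target server predictable regardless of queue depth.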
Let’s estimate these numbers.
- Number of products = 10 million
- QPS supported by eCommerce = 1000 QPS
- Number of workers = 1000
Time to scan all the products = 10 million / 1000 / (60*60) = 2.7 hours ~ 3 hours (delayed delivery will also add some gap before the next call).
It means that, in the worst case, our server will have up to 3 hours of stale data.
Do we need to worry here?
Yes, if we don't design to match the target server's architecture. An eCommerce site like Amazon doesn't have a concept of global QPS. Amazon has partitioned its infrastructure across geographic locations. User access patterns also differ due to time zone differences, product availability, user traffic, etc.
For example, 1+ million users doesn't mean all of them access our system at the same time. Maybe x% of users are from us-east-zone, y% from us-west-zone, z% from eu-zone, p% from south-america-zone, q% from africa-zone, s% from asia-zone, etc.
This means we should also match our scan targets to these geo-located users. Say one of the hot zones serves 30% of traffic; then we only need to focus on scanning 3 million products.
Time to scan all the products = 3 million / 1000 / (60*60) ~ 50 minutes ~ 1 hour
As you can see, we can reduce the scan time per zone by matching the target server's geo-location sharding. Following is the design showcasing zone-based partitions.
Price Checker System
The Price Checker System calculates whether there is a drop in price and, if so, notifies the alerts system. Following is the design for this system.
Product Price Changed Queue
A message is published to this queue when the crawler detects a price change. Since our system calculates drop alerts based on user-requested thresholds, we first need to fan out this message to all users subscribed to the product.
Following is the schema of a message in this queue.

{
  ProductId: xxx
  PublishTimestamp: xxx
  Payload: {
    ProductId: xxx
    CurrentPrice: $xxx
    PriceTimestamp: xxxx
    LastPrice: $xxx
  }
}
The handler for this queue does the following:
- Reads the list of subscribed users from the ProductSubscriptions table.
- Publishes a message to UserPriceDropQueue for each user, for further processing.
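The fan-out step can be sketched as follows. This is a sketch under assumptions: the function name is made up, `get_subscribers` stands in for the ProductSubscriptions lookup (by product id, the shard key), and `publish_user_msg` stands in for the UserPriceDropQueue producer.

```python
def handle_price_changed(message, get_subscribers, publish_user_msg):
    """Fan out one price-changed message to every user subscribed to the product."""
    payload = message["Payload"]
    for sub in get_subscribers(payload["ProductId"]):
        publish_user_msg({
            "UserId": sub["UserId"],
            "PublishTimestamp": message["PublishTimestamp"],
            "Payload": {
                **payload,  # ProductId, CurrentPrice, PriceTimestamp, LastPrice
                "UserId": sub["UserId"],
                "PriceChangeThreshold": sub["PriceChangeThreshold"],
                "PriceValueThreshold": sub["PriceValueThreshold"],
            },
        })
```

Copying each user's thresholds into the fanned-out payload lets the downstream handler evaluate the criteria without another subscription-table read.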
User Price Drop Queue
Following is the schema for this queue.

{
  UserId: xxx
  PublishTimestamp: xxx
  Payload: {
    ProductId: xxx
    CurrentPrice: $xxx
    PriceTimestamp: xxxx
    LastPrice: $xxx
    UserId: xxx
    PriceChangeThreshold: x%
    PriceValueThreshold: xxx
  }
}
The handler for this queue does the following:
- Calculates the absolute (price value) drop.
- Calculates the percentage (price change threshold) drop.
- If either satisfies the user's threshold, publishes a message to AlertsQueue; otherwise drops it.
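The threshold check above can be sketched as a small pure function. The function name is an assumption; the two criteria are the ones defined at the start of the design.

```python
def should_alert(last_price, current_price, pct_threshold=None, value_threshold=None):
    """Return True if the observed price drop satisfies either user criterion:
    - a percentage drop of at least pct_threshold percent, or
    - a current price at or below value_threshold."""
    if current_price >= last_price:
        return False  # the price did not drop
    if pct_threshold is not None:
        drop_pct = (last_price - current_price) / last_price * 100
        if drop_pct >= pct_threshold:
            return True
    if value_threshold is not None and current_price <= value_threshold:
        return True
    return False
```

Keeping this as a pure function of the message payload means the queue handler itself stays trivial: evaluate, then publish to AlertsQueue or drop.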
Alerts system
The alerts system takes care of notifying the user about the price drop. Based on the user's preference, it notifies via email, mobile push notification, or real-time browser notification.
Hope you have enjoyed this design. Feel free to share your feedback. Enjoy learning :-)