Ticketmaster system design

Jul 19, 2024

Requirements

Design an online booking system for purchasing tickets to various events. Users should be able to search and browse events, and to view and select seats. A user should be able to claim a ticket and hold it for 5 minutes, after which the hold expires unless the purchase has been completed.

Scale

A popular event can attract more than 1 million users.
Every day, 100 new events are posted for ticket sale.
Events can also be watched online, so users are geo-distributed.
Tickets sell fast, so the client must be kept up to date as ticket availability changes.
Users can select their seats.

Complexity of system

If multiple users try to claim the same seat, only one claim should succeed; the rest should be rejected.
There should be a mechanism to notify rejected users when tickets become available.
The app should keep updating ticket availability as seats are freed by cancellation or rejection.
The system should be fair in how it handles ticket reservations.

High level system design

The following is the high-level system design.

At a high level, the ticket booking flow is as follows.

Search: The user searches for events. The search returns a list of EventIds together with the minimal data needed to show the results as a list.
Avail: The Search service can make a one-time call to the backend to get the seat availability to show on the search results page.
Details: The user clicks on an event to learn more about it. This calls the Events service to get the data. The Events service also makes a one-time call to the backend to get the seat availability to show on the event details page.
Seat Selection: The Seats service is used to show the seats. At this point the client establishes a streaming connection to the Session server, which is needed to push seat status updates to connected clients in real time. We also capture the User-to-Event mapping at this point so that the server can fan out seat status changes.
Payment: Up to payment, the flow is called shopping; from payment onwards, it is called booking. Before payment, we start the price lock (this could be an external or an internal call to lock the price). Payment processing is generally handled by a third-party API. The third party either provides a poll API that we call at a fixed interval, or a callback mechanism through which it calls our API once payment processing has completed or failed.
Booking is a two-step process: HOLD and BOOK. During the price check, inventory is put on HOLD. If the price changes, we show the new price to the user and ask them to accept it. Once payment completes successfully, the BOOK step starts to complete the booking. Generally, the system is designed with a longer HOLD and a much shorter payment window, so that payment always completes before the HOLD expires. The entire booking stays on hold if the payment confirmation has not come back from the external payment system.

Search Service

To support full-text search, we will maintain a separate search index. The following fields will be indexed for a better search experience.

Event title
Event description
City
Celebrity

Users will also be able to apply a date filter to narrow the search to events in their desired time frame.

At a high level, the following is the API for searching events.
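As a rough illustration of the indexed fields and the date filter, here is a minimal in-memory sketch. It is not how a production search index works; the `Event` and `SearchIndex` names, the whitespace tokenization and the "all query words must match" rule are assumptions made for this example only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    event_id: int
    title: str
    description: str
    city: str
    celebrity: str
    show_date: date


class SearchIndex:
    """Toy inverted index over title, description, city and celebrity."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> set of event ids
        self._events = {}

    def add(self, event):
        self._events[event.event_id] = event
        text = " ".join([event.title, event.description, event.city, event.celebrity])
        for token in text.lower().split():
            self._postings[token].add(event.event_id)

    def search(self, query, date_from=None, date_to=None):
        tokens = query.lower().split()
        if not tokens:
            return []
        # An event matches only if it contains every query token.
        ids = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        results = []
        for event_id in sorted(ids):
            show_date = self._events[event_id].show_date
            if date_from is not None and show_date < date_from:
                continue
            if date_to is not None and show_date > date_to:
                continue
            results.append(event_id)
        return results
```

A real deployment would use a dedicated search engine; the sketch only shows which fields feed the index and where the date filter applies.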

GET /events/search
Request {
  Title: xx
  Description: xxx
  City: xxx
  DateFrom: xxx
  DateTo: xxx
}
Response {
  data: [
    {
      EventId: xxx
      EventImage: xxx
      Title: xxx
      Description: xxx
      City: xxx
      ShowTime: xxx
      SeatsAvailable: xxx
    }
  ],
  nextPage: {
    size: xxx
    start: xxx
  }
}

To return the latest seat inventory status, rating and similar details, the Search API will also query other systems.

Events Service

Once the user sees the list of events, they click on one to see more details. The Events service is used to show more details about the selected event, which it reads from the Events database table. The following is the API needed by this service.

GET /events/details/<event_id>
Response {
    EventId: xxx
    EventImage: xxx
    Title: xxx
    Description: xxx
    City: xxx
    ShowTime: xxx
    SeatsAvailable: xxx
    Rating: xxx
    ConcurrentUsers: xxxx
}

The following is the schema for the Events table.

EventId Title Description City Image ShowTime Rating 
xx      xx    xx          xx   xx     xxx     xx

The Events table will have EventId as its primary key. We can use City as the shard key to keep events belonging to one city on the same shard server.

To return the latest seat inventory status, rating and similar details, the EventDetails API will also query other systems.

Seats Service

The Seats service exposes an API that lets the user view the seating arrangement and choose seats to start the reservation process. The following are the API details.

GET /events/<event_id>/seats
Response {
  data: [
    {
      SeatId: xxx
      EventId: xxx
      Row: xx
      Col: xx
      Label: xxx
      Status: FREE/HOLD/BOOKED
    }
  ],
  nextPage: {
    size: xxx
    start: xxx
  }
}

This service uses the Seats table to get the seat details for the event. The Seats table will have SeatId as its primary key. EventId will be used as the shard key to keep all seats for the same event on the same shard. The following is the schema for this table.

SeatId EventId RowId ColId Label 
xx      xx      x      x    xx

Seat availability is based on the Reservations table. When an event is launched for sale, this table is populated from the Seats table with every seat's status set to FREE.

SeatId EventId Status HoldExpirationTime  UserId
1      1       FREE    
2      1       FREE
3      1       FREE
4      1       FREE

Status takes the following enum values.

FREE
HOLD
BOOKED

The following is the query to get the seat details.

SELECT S.SeatId, S.RowId, S.ColId, S.Label, R.Status
FROM Seats S
JOIN Reservations R ON S.SeatId = R.SeatId AND S.EventId = R.EventId
WHERE S.EventId = xxxx

At this point the client establishes a streaming connection with the SessionServer. To manage the fan-out, we maintain an EventSessions table with the following schema.

EventId  UserId Heartbeat ExpiryTime(TTL)
xx       xxx    xxx

The client keeps sending heartbeats; each heartbeat refreshes the ExpiryTime column, which is used as the row retention policy.
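The heartbeat and expiry behaviour can be sketched with an in-memory stand-in for the EventSessions table. The class name, the 30-second TTL and the injectable clock are assumptions for illustration; a real table would rely on the database's own TTL support.

```python
import time


class EventSessions:
    """In-memory stand-in for the EventSessions table (hypothetical)."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds  # assumed TTL, not specified in the design
        self._clock = clock
        self._rows = {}  # (event_id, user_id) -> expiry time

    def heartbeat(self, event_id, user_id):
        # Every heartbeat pushes the row's expiry time forward.
        self._rows[(event_id, user_id)] = self._clock() + self._ttl

    def active_users(self, event_id):
        now = self._clock()
        # Rows whose expiry time has passed are dropped, like a TTL policy.
        self._rows = {key: exp for key, exp in self._rows.items() if exp > now}
        return sorted(user for (event, user) in self._rows if event == event_id)
```

A client that stops sending heartbeats simply ages out of the table, so the fan-out never targets stale sessions.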

Please refer to the Session Server section for more details.

Reservation Service

The Reservation service is used to book tickets. A user can claim multiple seats in one request.

Hold claim

Each request is processed in a database transaction. If multiple users request the same seat, only one transaction succeeds and the others fail. A successful transaction updates the Status, HoldExpirationTime and UserId columns.

SeatId EventId Status  HoldExpirationTime UserId
1      1       HOLD    t1+5mins           1
2      1       HOLD    t1+5mins           1
3      1       FREE
4      1       FREE

The following is the query to perform the write.

UPDATE Reservations
SET UserId = xxxx, Status = 'HOLD', HoldExpirationTime = <now + 5 mins>
WHERE SeatId = xxx AND Status = 'FREE'

To make the above query idempotent, we have to combine it with a read-modify-write transaction.
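The hold claim can be sketched with SQLite standing in for the real database. This is only an illustration under assumptions: the `make_db` and `claim_seats` helpers are hypothetical, timestamps are plain numbers, and a multi-seat claim is treated as all-or-nothing.

```python
import sqlite3


def make_db(seat_count):
    """Create an in-memory Reservations table with every seat FREE."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Reservations (SeatId INTEGER, EventId INTEGER, Status TEXT, "
        "HoldExpirationTime REAL, UserId INTEGER, PRIMARY KEY (EventId, SeatId))"
    )
    conn.executemany(
        "INSERT INTO Reservations (SeatId, EventId, Status) VALUES (?, 1, 'FREE')",
        [(seat_id,) for seat_id in range(1, seat_count + 1)],
    )
    conn.commit()
    return conn


def claim_seats(conn, event_id, seat_ids, user_id, now, hold_seconds=300):
    """Put all requested seats on HOLD in one transaction; True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for seat_id in seat_ids:
                cur = conn.execute(
                    "UPDATE Reservations "
                    "SET UserId = ?, Status = 'HOLD', HoldExpirationTime = ? "
                    "WHERE EventId = ? AND SeatId = ? AND Status = 'FREE'",
                    (user_id, now + hold_seconds, event_id, seat_id),
                )
                if cur.rowcount != 1:
                    # Seat is already held or booked: abort the whole claim.
                    raise LookupError(f"seat {seat_id} is not free")
        return True
    except LookupError:
        return False
```

The `Status = 'FREE'` condition is what lets only one claimant win: the second update matches zero rows, and the rollback releases any seats that claim had already taken.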

Notify Seat status

After a successful transaction, our goal is to notify the other active clients interested in that event so that they can show the seat map with the updated status.

We have two approaches:

1. As part of the Reservations transaction, we can use a distributed transaction stack to also publish a message into NotificationQueue and into HoldExpirationQueue. The entry in HoldExpirationQueue is made with a 5-minute delivery delay, so the handler receives the message only after 5 minutes.
2. We can use a changelog stream on the Reservations table and publish a message into NotificationQueue whenever the Status column changes. We also publish to HoldExpirationQueue, with a 5-minute delivery delay, only when the status changes from FREE to HOLD.

For #1, we need a specialized database such as Google Spanner. Otherwise we can go with #2, as it allows a separate database stack for the Reservations table and a separate messaging queue.

We will go with #1, since it keeps the reservation write and the queue publishes in a single transaction.

NotificationQueue Handler

This handler takes care of fanning out the seat status based on EventSessions. It makes an rpc call to the SessionServer, which takes care of sending the seat status to the recipient user.
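A minimal sketch of the fan-out step, assuming the message carries `event_id`, `seat_id` and `status` fields and that `dispatch_event` stands in for the rpc call to the SessionServer. Both the message shape and the function names are hypothetical.

```python
from typing import Callable, Dict, List


def handle_notification(
    message: dict,
    event_sessions: Dict[int, List[str]],
    dispatch_event: Callable[[str, dict], None],
) -> int:
    """Send one seat-status update to every user with a session on the event."""
    payload = {"seat_id": message["seat_id"], "status": message["status"]}
    sent = 0
    for user_id in event_sessions.get(message["event_id"], []):
        dispatch_event(user_id, payload)  # stand-in for the SessionServer rpc
        sent += 1
    return sent
```

For a popular event the recipient list can be very large, so a real handler would likely batch or shard this loop.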

HoldExpirationQueue Handler

After ~5 mins, this message is delivered to the handler. Once the message is received, the handler does the following.

Check the Reservations table to see if the seat is still on hold.
If the seat is still on hold, mark it as FREE and unset UserId.
If the seat is no longer on hold, ignore the message.

Note: We can also prompt the user to ask whether they want to extend the hold, and perform the cleanup only if they decline.
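The steps above can be sketched with a toy delayed-delivery queue and an in-memory reservations map. Everything here is an assumption for illustration, including the `DelayedQueue` class and the message fields. The comparison of `hold_expiration` is an extra guard that is not part of the original design: it prevents an old message from freeing a seat that was released and then held again by someone else.

```python
import heapq


class DelayedQueue:
    """Toy queue whose messages become visible only at their due time."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so messages themselves are never compared

    def publish(self, message, deliver_at):
        heapq.heappush(self._heap, (deliver_at, self._seq, message))
        self._seq += 1

    def poll(self, now):
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


def handle_hold_expiration(message, reservations):
    """Free the seat if the hold that scheduled this message is still active."""
    row = reservations[message["seat_id"]]
    still_same_hold = (
        row["status"] == "HOLD"
        and row["hold_expiration"] == message["hold_expiration"]
    )
    if still_same_hold:
        row.update(status="FREE", user_id=None, hold_expiration=None)
        return True
    return False  # booked, already freed, or re-held: ignore the message
```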

How to handle rejected claim

If a claim is rejected because another user has already successfully held the seat, the user receives the new seat status as a streaming event.

Once the user sees that the seats they wanted are no longer free, they can either wait or try other seats.

There is an alternate way to handle booking where we don’t let the user choose a seat; instead, the system decides the seat.

Alternate booking: Doesn’t allow user to select seats

We can simplify the claim system if we don’t allow the user to select seats, and instead only let them request a number of seats with preferences and let the server allocate the seats. This approach is used in train booking systems, for example.

In this approach, the booking service simply enqueues the request into AsyncBookingQueue. No claim request is rejected. All requests are written into the queue concurrently, and the system processes them one by one (or in batches), allocating the next available seats based on the user’s choices.
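A rough sketch of the queue-driven allocation, assuming requests are `(user_id, seat_count)` pairs processed in arrival order. Seat preferences are ignored here to keep the example short, and the function name is hypothetical. An empty allocation means the event ran out of seats, not that the request lost a race.

```python
from collections import deque


def allocate_from_queue(requests, free_seats):
    """Serve queued requests in order, handing out the next free seats."""
    free = deque(sorted(free_seats))
    allocations = {}
    for user_id, seat_count in requests:
        if seat_count <= len(free):
            allocations[user_id] = [free.popleft() for _ in range(seat_count)]
        else:
            allocations[user_id] = []  # not enough seats left for this request
    return allocations
```

Because a single consumer hands out seats, no two requests can ever be given the same seat, which is why this variant needs no per-seat conflict handling.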

Session Server

The Session Server is used by the client and the server to establish a bi-directional streaming connection.

The client uses the Registration rpc to find an available target. The server allocates an available session server and returns its details to the client, which uses them to establish a streaming connection to the target server.

It also exposes a DispatchEvent rpc to send events to connected clients.
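The two rpcs can be sketched as methods on a small registry. The original does not say how a target server is chosen, so picking the least-loaded server is an assumption, as are the `SessionRegistry` name and the return shapes.

```python
class SessionRegistry:
    """Hypothetical registry behind the Registration and DispatchEvent rpcs."""

    def __init__(self, servers):
        self._load = {server: 0 for server in servers}  # server -> connections
        self._user_server = {}  # user_id -> assigned server

    def register(self, user_id):
        # Assumed policy: least-loaded server, ties broken by server name.
        target = min(self._load, key=lambda server: (self._load[server], server))
        self._load[target] += 1
        self._user_server[user_id] = target
        return target

    def dispatch_event(self, user_id, event):
        server = self._user_server.get(user_id)
        if server is None:
            return None  # user has no streaming connection
        return (server, event)  # stand-in for pushing over the open stream
```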

Payments Service

Once the user holds the seats, they use the Payments Service to start the payment flow. This service does two things:

Call the external payment system to begin the payment.
Publish a message into PaymentTrackerQueue, with a delivery delay of 2 minutes, to track the payment.

If the third-party payment system provides a callback mechanism, we register our /payment/callback API. Once the third party invokes this API, we update the Reservation status based on success or failure.

If the third-party payment system instead exposes an API that must be polled, the client-side code should call it at an interval to check the status, and at the end call our /payment/callback API to update the payment status.

The PaymentTrackerQueue handler looks at the Reservations table. If the payment has not yet completed, it asks the user whether they want more time. If they do, the client makes a /payment/extend rpc call, which simply enqueues a new message into PaymentTrackerQueue with a further two-minute delay. If the payment has completed, the handler marks the reservation as completed.
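What the /payment/callback handler does to the Reservations rows can be sketched as follows. The function name and the in-memory `reservations` map are assumptions, and so is the choice to leave seats untouched when the hold no longer belongs to the paying user.

```python
def apply_payment_result(reservations, user_id, seat_ids, succeeded):
    """Move the user's held seats to BOOKED on success, back to FREE on failure."""
    updated = []
    for seat_id in seat_ids:
        row = reservations[seat_id]
        if row["status"] != "HOLD" or row["user_id"] != user_id:
            continue  # hold expired or belongs to someone else: leave as is
        if succeeded:
            row["status"] = "BOOKED"
        else:
            row.update(status="FREE", user_id=None)
        updated.append(seat_id)
    return updated
```

A payment that succeeds after its hold has lapsed shows up as a seat missing from the returned list; how to compensate the user in that case is outside this sketch.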

Scale system to match the requirement

To support globally distributed users, we need to make sure that every request is served by co-located servers.

We can achieve this by partitioning servers and mapping users to partitions. We can come up with the following partitions.

1. Compute layer partitions: us-01, us-02 … us-0n, eu-01 … eu-0n, asia-01 … asia-0n.
2. Database layer partitions: corresponding shards are created for the storage layer: us-01, us-02 … us-0n.
3. A separate Homemap database maintains each user and their partition, based on geo location, load and other factors.

UserId  AppPartition DbPartition
xx      xxx            xxxx

4. A smart router is used that is aware of the UserId present in every request. Note: This forces the user to log in before interacting with the application. An alternative is to generate a unique session_id and use it as the user’s ID for partitioning.

5. Once the router allocates the compute layer shard, all requests for that user go to the same app layer partition.

6. The database sharding router also uses the same Homemap db to find the partitions for the data.
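The routing steps above can be sketched as a lookup against the Homemap with a sticky first-time assignment. The `Home` and `SmartRouter` names are hypothetical, and assigning new users purely by region is a simplification of the "geo location, load and other factors" rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Home:
    app_partition: str
    db_partition: str


class SmartRouter:
    """Hypothetical router that pins each user to one partition pair."""

    def __init__(self, homemap, default_by_region):
        self._homemap = dict(homemap)  # user_id -> Home
        self._default_by_region = default_by_region  # region -> Home

    def route(self, user_id, region):
        home = self._homemap.get(user_id)
        if home is None:
            # First request: assign by region and remember the choice.
            home = self._default_by_region[region]
            self._homemap[user_id] = home
        return home
```

Because the assignment is stored, a user who later connects from another region still lands on the same app and database partitions.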

This gives us the opportunity to scale with geo-location-based traffic. Based on API latency, we can add replicas in a partition to handle the desired traffic.

Do we need to match any global qps?

Scaling for global qps is a wasted exercise. Because of time zone differences, users in different geographies are online at different times. Therefore, instead of trying to scale our system globally, we scale within the geo boundaries discussed in the app partitions above.

Hope you have enjoyed this blog :-)