Comment reply system design

Dilip Kumar
3 min readJul 17, 2024

--

Designing a comment system, is a feature integrated into websites that allows users to post comments on published content, interact with other users through replies, upvote or downvote comments, and sort or filter comments.

Schema design to store threaded reply

Approach 1: Use parent child relationship

Following can be schema for Posts table.

PostId  Text       CreateTimestamp  UserId
1 Party? 1 1
2 Tennis? 2 2

Following can be schema for Comments table with CommentId as primary key.

CommentId PostId ParentCommentId Text      AuthorUserId  CreateTimestamp
1 1 NULL Hello 1 1
2 1 1 Hi 2 2
3 1 2 Meeting? 1 3
4 1 3 No 2 4

Following can be query to read all the comments of level1 and level2 on post #1.

SELECT 
PostId,
parent.Text as Level1,
child.Text as Level2,
FROM Comment AS parent
LEFT JOIN
Comment as child on child.ParentCommentId = parent.CommentId
WHERE PostId = 1;

But query become complex if need to find N level comments. Following is example of repeated self join query.

SELECT 
PostId,
p1.Text as Level1,
p2.Text as Level2,
p3.Text as Level3,
p4.Text as Level4,
p5.Text as Level5,
p6.Text as Level6,
FROM Comment AS p1
LEFT JOIN
Comment as p2 on p2.ParentCommentId = p1.CommentId
LEFT JOIN
Comment as p3 on p2.ParentCommentId = p2.CommentId
LEFT JOIN
Comment as p4 on p2.ParentCommentId = p3.CommentId
LEFT JOIN
Comment as p5 on p2.ParentCommentId = p4.CommentId
LEFT JOIN
Comment as p6 on p2.ParentCommentId = p5.CommentId
WHERE PostId = 1;

Obviously, this approach goes out of control quickly.

Approach 2: Parent path-style Identifiers

We can remove the ParentCommentId column from Comments table and add Path column to store the path hierarchies of comment.

CommentId PostId Path            Text      AuthorUserId  CreateTimestamp
1 1 NULL Hello 1 1
2 1 1/ Hi 2 2
3 1 1/2/ Meeting? 1 3
4 1 1/2/3/ No 2 4
5 1 1/ How are you? 3 5

We can now query parent comment by comparing the current row’s path to a pattern formed from path of another row.

For example, to find out ancestor of comment #4 whose path is 1/2/3/ we can run following query.

SELECT CommentId, Text
FROM Comments AS c
WHERE (Select Path from Comments Where CommentId=4) LIKE c.Path || '%'

This matches the patterns formed from paths of ancestor 1/2/3/% , 1/2/% and 1/% .

Drawback of this schema is that Database can’t perform any schema validation on Path column.

Approach 3: Tree style table

We can introduce Relationship table to store the comments relationship instead of using the Comments table.

Following will be schema for Comments table.

CommentId PostId Text      AuthorUserId  CreateTimestamp
1 1 Hello 1 1
2 1 Hi 2 2
3 1 Meeting? 1 3
4 1 No 2 4

Following will be the schema for Relationship table.

ParentCommentId    ChildCommentId
1 2
2 3
1 4

To get the all the child comments for #1 will be as below.

SELECT *
FROM Comments AS C
JOIN
Relationship p ON c.CommentId = p.ChildCommentId
WHERE
C.CommentId = 1

Similarly the query to print all the parents will be as below.

SELECT c.*
FROM Comment AS c
JOIN
Relationship AS p ON p.ParentCommentId = c.CommentId
WHERE
c.CommentId = 3

Approach 4: XML/JSON based schema

We can leverage either XML/JSON based NoSQL storage system and write entire comments thread into a single XML/JSON object. Everytime a reply is made will be modifying this XML/JSON file.

High level System design

On high level, we can have microservice for each components as below.

Enjoy system designing :-)

--

--

Dilip Kumar
Dilip Kumar

Written by Dilip Kumar

With 18+ years of experience as a software engineer. Enjoy teaching, writing, leading team. Last 4+ years, working at Google as a backend Software Engineer.

No responses yet