Design the Facebook post privacy functionality
Once a user makes a post on Facebook, they can restrict which other users are able to see it. This feature is called privacy. In technical terms, it is an authorization check.
The goal is to develop the backend system with the following key features:
- Enable a user to specify the different levels of privacy for a post so that it is only visible to a particular set of users on Facebook.
- Implement four levels of privacy: Public, Friends, Friends of friends, and Custom groups.
High level system design
The following is the high-level system design to implement the Facebook privacy feature.
When a user makes a post on Facebook, Posts Service writes the data into the Posts table. It also makes an RPC call to Authorization Service to record the post and its relationships with other users.
When another user asks Feed Service for their feed, Feed Service makes an RPC call to Authorization Service and drops any post that the user doesn't have permission to access.
Authorization Service
Authorization Service is the core of the privacy implementation. It is a generic service that handles authorization checks.
It allows applications to define their own relationship types and map them to the supported authorization levels, and it provides APIs to check the authorization level of a relationship.
It supports the following three authorization levels:
- OWNER
- EDITOR
- READER
At a high level, it provides the following:
- A configuration file format to define relationships.
- A Write API to establish a relationship between an ObjectId and a UserId.
- A Read API to validate an <ObjectId-Relationship-UserId> tuple.
Let's go deeper to understand this service.
Define relationships
Taking our privacy check as an example, we need to define the following relationships and what each one authorizes.
- post_owner
- post_reader
- user_friend
The following is an example configuration file that defines these relationships, along with the friend_of_friend relationship that user_friend refers to.
{
  name: "post_owner"
  type: OWNER
  allowed: {
    include: this
  }
}
{
  name: "post_reader"
  type: READER
  allowed: {
    include: this
    include: "user_friend"
    include: "post_owner"
  }
}
{
  name: "user_friend"
  type: READER
  allowed: {
    include: this
    compute_set: {
      $ref: "friend_of_friend"
    }
  }
}
{
  name: "friend_of_friend"
  type: READER
  allowed: {
    include: this
  }
}
include: this means that users related directly through the relationship itself are allowed. include: "xxx" means that users holding another defined relationship are allowed as well. For example, the post_reader relationship also allows users with the user_friend and post_owner relationships to read the post.
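To make the include semantics concrete, here is a minimal Python sketch, assuming the configuration has been loaded into a plain dictionary. The dictionary layout and the granting_relations helper are hypothetical, and treating compute_set as just another reference to follow is an assumption of this sketch rather than something the design spells out.

```python
# Hypothetical in-memory form of the relationship configuration above.
# "this" means a tuple stored directly under the relationship grants it;
# any other entry names another relationship that grants it as well.
CONFIG = {
    "post_owner": {"type": "OWNER", "include": ["this"]},
    "post_reader": {
        "type": "READER",
        "include": ["this", "user_friend", "post_owner"],
    },
    "user_friend": {
        "type": "READER",
        "include": ["this"],
        "compute_set": ["friend_of_friend"],  # assumed to behave like a reference
    },
    "friend_of_friend": {"type": "READER", "include": ["this"]},
}


def granting_relations(relation, config=CONFIG):
    """Return every relationship name that satisfies `relation`,
    following include and compute_set references transitively."""
    seen = set()
    stack = [relation]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        entry = config[name]
        refs = [r for r in entry.get("include", []) if r != "this"]
        refs += entry.get("compute_set", [])
        stack.extend(refs)
    return sorted(seen)
```

Under these assumptions, asking which relationships grant post_reader returns post_reader itself plus post_owner, user_friend, and friend_of_friend, while post_owner is granted only by itself.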
Write API
Authorization Service exposes a Set API to update the relationships between objects.
POST /authorization/relationships/objects
Request {
from: object_xxx <post_id>
relationship: xxxx <post_owner>
to: user_xxx <user_id>
}
In our example, when a user makes a post, Posts Service makes the above RPC call to record the relationship between the post and its owner, as below.
post_1:post_owner:user1
post_2:post_owner:user2
post_3:post_owner:user3
Similarly, when a user accepts a friend request, Friends Service uses this RPC to record the bi-directional relationship between the user and the friend, as below.
user1:user_friend:user2
user2:user_friend:user1
user1:user_friend:user3
user3:user_friend:user1
user1:user_friend:user4
user4:user_friend:user1
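The two kinds of writes above can be sketched in Python as follows. This is only an illustration: TupleStore is a hypothetical in-memory stand-in for the Authorization Service storage, and add_friendship is an assumed helper that Friends Service might use to write both directions.

```python
from collections import defaultdict


class TupleStore:
    """Hypothetical in-memory stand-in for the Authorization Service storage."""

    def __init__(self):
        # (from_id, relationship) -> set of target user ids
        self._tuples = defaultdict(set)

    def write(self, from_id, relationship, to_id):
        """Mirror of the Write API request fields: from, relationship, to."""
        self._tuples[(from_id, relationship)].add(to_id)

    def users(self, from_id, relationship):
        """Return a copy of the users related to from_id through relationship."""
        return set(self._tuples.get((from_id, relationship), set()))


def add_friendship(store, user_a, user_b):
    """Friendship is bi-directional, so both directions are written."""
    store.write(user_a, "user_friend", user_b)
    store.write(user_b, "user_friend", user_a)


store = TupleStore()
store.write("post_1", "post_owner", "user1")  # Posts Service: new post
add_friendship(store, "user1", "user2")       # Friends Service: request accepted
add_friendship(store, "user1", "user3")
```

Writing the friendship in both directions keeps a later lookup from either user a single key read, at the cost of storing two tuples per friendship.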
The following diagram shows an example of object and user relationships.
Read API
The Read API is used to perform the authorization check. The following are the details of this API.
GET /authorization/relationships/check
Request {
from: object_xxx
relationship: xxxx
to: user_xxx
}
Response {
allowed: boolean
}
Behind the scenes, Authorization Service runs an algorithm that looks for a path between the source and the target to decide whether the authorization check passes.
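One way to picture this path search is a bounded breadth-first search over the stored tuples. The sketch below is an assumption about how such a check could work, not the service's actual algorithm. The max_hops mapping (1 for the owner, 2 for friends, 3 for friends of friends) and the user5 tuples are added purely for illustration.

```python
from collections import deque

# (from, relationship, to) edges. The first three come from the example
# tuples; the user5 friendship is added here only for illustration.
TUPLES = [
    ("post_1", "post_owner", "user1"),
    ("user1", "user_friend", "user2"),
    ("user2", "user_friend", "user1"),
    ("user2", "user_friend", "user5"),
    ("user5", "user_friend", "user2"),
]


def check(tuples, from_id, to_id, max_hops):
    """Breadth-first search from the object to the user.

    max_hops bounds the path length: 1 reaches the owner, 2 the owner's
    friends, 3 friends of friends (an assumed mapping for this sketch).
    """
    graph = {}
    for src, _relationship, dst in tuples:
        graph.setdefault(src, set()).add(dst)

    queue = deque([(from_id, 0)])
    visited = {from_id}
    while queue:
        node, hops = queue.popleft()
        if node == to_id and hops > 0:
            return True
        if hops == max_hops:
            continue  # do not expand beyond the allowed distance
        for neighbor in graph.get(node, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1))
    return False
```

With these tuples, user2 can reach post_1 within two hops (as a friend of the owner), while user5 needs three hops and therefore passes only a friends-of-friends check.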
Instead of making Check API calls one by one, Authorization Service can also accept a batch request, as below.
GET /authorization/relationships/check/batch
Request {
batch: [{
from: object_xxx
relationship: xxxx
to: user_xxx
}]
}
Response {
data: [
{
from: object_xxx
relationship: xxxx
to: user_xxx
allowed: boolean
}
]
}
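The shape of the batch request and response can be sketched in Python as below. For brevity this hypothetical check_batch only looks for a directly stored tuple. A real service would run its full check for each item, so treat the lookup itself as a placeholder.

```python
def check_batch(tuples, batch):
    """Answer many checks in one call; each result echoes its request.

    Placeholder logic: an item is allowed only if the exact
    (from, relationship, to) tuple is stored.
    """
    index = set(tuples)
    results = []
    for item in batch:
        key = (item["from"], item["relationship"], item["to"])
        results.append({**item, "allowed": key in index})
    return {"data": results}


STORED = [("post_1", "post_owner", "user1")]
response = check_batch(STORED, [
    {"from": "post_1", "relationship": "post_owner", "to": "user1"},
    {"from": "post_1", "relationship": "post_owner", "to": "user2"},
])
```

Echoing the request fields in every result lets the caller match answers to questions without relying on ordering.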
User Timeline service with privacy support
In the user timeline, we can show the following posts.
- My own posts
- My friends' posts
- My friends' friends' posts
- Others' public posts
Feed Service, which takes care of returning the feed for a user, makes a Check RPC call for each PostId to validate whether post_id_xxx has the post_reader relationship with user_id_xxx. Based on the result, it decides whether to show the post to the user.
Feed Service can also make one batch call to get the authorization details and use them to decide which posts to exclude from the feed.
Please note: public posts can be viewed by any user, so there is no need to make an RPC call to Authorization Service for them. Feed Service simply includes all public posts by default.
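The filtering described above might look like the following Python sketch. The privacy field on each post, the filter_feed function, and the fake_check_batch stub are all hypothetical. The stub simply stands in for the batch Check RPC so the example can run on its own.

```python
def filter_feed(posts, user_id, check_batch):
    """Keep public posts unconditionally and ask the Authorization
    Service about the remaining posts in a single batch call."""
    batch = [
        {"from": p["post_id"], "relationship": "post_reader", "to": user_id}
        for p in posts
        if p["privacy"] != "PUBLIC"
    ]
    allowed_ids = set()
    if batch:  # skip the RPC entirely when every post is public
        response = check_batch(batch)
        allowed_ids = {r["from"] for r in response["data"] if r["allowed"]}
    return [
        p for p in posts
        if p["privacy"] == "PUBLIC" or p["post_id"] in allowed_ids
    ]


def fake_check_batch(batch):
    # Stand-in for the batch Check RPC: only post_2 is readable here.
    return {
        "data": [{**item, "allowed": item["from"] == "post_2"} for item in batch]
    }


posts = [
    {"post_id": "post_1", "privacy": "PUBLIC"},
    {"post_id": "post_2", "privacy": "FRIENDS"},
    {"post_id": "post_3", "privacy": "FRIENDS"},
]
visible = filter_feed(posts, "user2", fake_check_batch)
```

Here the public post_1 is kept without any check, post_2 is kept because the stub allows it, and post_3 is dropped.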
Authorization Server Architecture
Facebook serves billions of users across the globe. To match this scale, we need a globally scalable distributed transactional database. Distributed transactions are needed to propagate privacy updates to the impacted users in real time.
The following is the overall design of Authorization Service to handle this scale.
aclservers
aclservers are the main server type. They are organized in clusters and respond to Check and Write requests. Requests arrive at any server in a cluster, and that server fans out the work to other servers in the cluster as necessary. Those servers may in turn contact other servers to compute intermediate results. The initial server gathers the final result and returns it to the client.
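The fan-out and gather step can be illustrated with a small Python sketch. It assumes the sub-results are combined as a union (the request is allowed if any sub-check passes), and it uses local threads as a stand-in for calls to other servers in the cluster. The fan_out_check name is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out_check(sub_checks):
    """Run sub-checks concurrently and gather the results.

    Each sub-check is a zero-argument callable standing in for a request
    to another server. The union rule (any success allows the request)
    is an assumption of this sketch.
    """
    if not sub_checks:
        return False
    with ThreadPoolExecutor(max_workers=len(sub_checks)) as pool:
        results = list(pool.map(lambda fn: fn(), sub_checks))
    return any(results)


# Example: "is the user the owner?" fails, "is the user a friend?" passes.
allowed = fan_out_check([lambda: False, lambda: True])
```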
Globally distributed databases supporting shard and transaction
Authorization Server stores ACLs and their metadata in the following global databases. They are a key component of Authorization Server.
- One database to store relation tuples for each client project
- One database to hold all project configurations
- One changelog database shared across all projects
aclservers read and write those databases in the course of responding to client requests.
The Tuples table stores one row per tuple, identified by the primary key (shardID, objectID, relation, userId, commitTimestamp). The following is a sample of this table.
ShardId  ObjectId  Relation     UserId  CommitTimestamp
1        post_1    post_owner   user_1  t1
1        user_2    user_friend  user_1  t2
1        user_3    user_friend  user_1  t3
1        user_4    user_friend  user_5  t4
1        user_1    user_friend  user_2  t5
1        user_1    user_friend  user_3  t6
1        user_5    user_friend  user_4  t7
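As a rough illustration of this primary key, the Python sketch below models a row as a dataclass whose field order matches the key, so that sorting the rows sorts them by key. The TupleRow and users_for names are hypothetical, and the integer timestamps stand in for t1, t2, t5, and t6 from the sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class TupleRow:
    # Field order matches the primary key, so sorting rows sorts by key.
    shard_id: int
    object_id: str
    relation: str
    user_id: str
    commit_timestamp: int


ROWS = sorted([
    TupleRow(1, "post_1", "post_owner", "user_1", 1),
    TupleRow(1, "user_2", "user_friend", "user_1", 2),
    TupleRow(1, "user_1", "user_friend", "user_2", 5),
    TupleRow(1, "user_1", "user_friend", "user_3", 6),
])


def users_for(rows, shard_id, object_id, relation):
    """Return the users on rows sharing a (shard, object, relation) prefix."""
    prefix = (shard_id, object_id, relation)
    return [
        r.user_id for r in rows
        if (r.shard_id, r.object_id, r.relation) == prefix
    ]


friends_of_user_1 = users_for(ROWS, 1, "user_1", "user_friend")
```

Because rows with the same key prefix sort next to each other, a real database could serve this lookup as a range scan. The linear scan here only keeps the sketch short.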
watchservers
watchservers are a specialized server type that responds to Watch requests. They tail the changelog and serve a stream of namespace changes to clients in near real time.
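Tailing the changelog can be pictured as reading every entry committed after a cursor. The Python sketch below assumes a changelog held as a list of dictionaries with a commit_timestamp field. Both the entry layout and the tail_changelog name are hypothetical.

```python
def tail_changelog(changelog, after_timestamp):
    """Yield changelog entries committed after `after_timestamp`,
    oldest first, so a watcher can resume from its last seen entry."""
    for entry in sorted(changelog, key=lambda e: e["commit_timestamp"]):
        if entry["commit_timestamp"] > after_timestamp:
            yield entry


CHANGELOG = [
    {"commit_timestamp": 3, "object_id": "post_1", "op": "delete"},
    {"commit_timestamp": 1, "object_id": "post_1", "op": "write"},
    {"commit_timestamp": 2, "object_id": "post_2", "op": "write"},
]
new_entries = list(tail_changelog(CHANGELOG, after_timestamp=1))
```

A watcher that remembers the timestamp of the last entry it delivered can call this again later and receive only newer changes.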
Update the user's timeline in real time when privacy changes
When the owner of a post changes its privacy, we want the timelines of the impacted users to be updated without forcing the client to refresh its timeline page.
To support real-time updates, we need to implement a SessionServer concept as follows:
- Allow every client to establish a bidirectional streaming connection to SessionServer.
- Develop SyncService to use the Authorization Server watchserver to establish a streaming connection that watches the posts previously sent to users. Feed Service will make an RPC call to SyncService to send the list of PostIds sent to the user.
- If there is any change in the ACL for these PostIds and UserIds, the Authorization Server watchserver will emit an event, which will trigger SyncService to push the ACL change to the client.
- The client will then hide the posts if the changed ACL no longer allows the user to see them.
Note: This design still does not surface posts that newly become visible to a user after a change in ACL. Due to the complexity, for now we assume the user will refresh the browser to see them.
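The hide-on-change flow can be sketched in Python as below. This is a simplified illustration under several assumptions: SyncService keeps its bookkeeping in memory, push is a callable standing in for the streaming connection through SessionServer, is_allowed stands in for a Check call, and the event carries an object_id field. All of these names are hypothetical.

```python
from collections import defaultdict


class SyncService:
    """Hypothetical sketch: remembers which posts each user was shown
    and turns ACL change events into 'hide' pushes."""

    def __init__(self, push):
        self._push = push              # callable(user_id, message)
        self._sent = defaultdict(set)  # post_id -> users who were shown it

    def record_sent(self, user_id, post_ids):
        """Called by Feed Service after it returns posts to a user."""
        for post_id in post_ids:
            self._sent[post_id].add(user_id)

    def on_acl_change(self, event, is_allowed):
        """Handle one watchserver event by re-checking affected viewers."""
        post_id = event["object_id"]
        for user_id in sorted(self._sent.get(post_id, ())):
            if not is_allowed(post_id, user_id):
                self._push(user_id, {"action": "hide", "post_id": post_id})
                self._sent[post_id].discard(user_id)


pushed = []
sync = SyncService(push=lambda user_id, message: pushed.append((user_id, message)))
sync.record_sent("user2", ["post_1"])
sync.record_sent("user3", ["post_1"])
# After the owner restricts post_1, assume only user2 may still read it.
sync.on_acl_change(
    {"object_id": "post_1"},
    is_allowed=lambda post_id, user_id: user_id == "user2",
)
```

Only users who were actually shown the post are re-checked, which keeps the work proportional to the number of affected viewers rather than to all users.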
Reference
This design is based on Zanzibar, Google's ACL server. Feel free to go through the Zanzibar paper to learn more about it.
Hope you have enjoyed this post. Keep designing systems :-)