Big Data Workflow

Courtesy: https://blog.twitter.com/2015/handling-five-billion-sessions-a-day-in-real-time

Goal:

Design a system that receives events, archives them, performs both offline (batch) and real-time (stream) computations on them, and merges the results of those computations into coherent information. All of this needs to happen at a scale of millions of events per second.
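To make the "merge" step of the goal concrete, here is a minimal sketch in Python. It is not taken from the linked post: the event shape (`ts`, `key`), the `cutoff` timestamp, and the function names `batch_view`, `realtime_view`, and `merged_view` are all assumptions made for illustration. The idea is that the batch side recomputes exact counts from archived events up to a cutoff, the real-time side counts everything from the cutoff onward, and the two are added to give one answer.

```python
from collections import Counter

# Hypothetical event shape: {"ts": <int timestamp>, "key": <event name>}.
# All names here are illustrative assumptions, not from the linked post.

def batch_view(archive, cutoff):
    """Offline pass: recompute counts from archived events before the cutoff."""
    return Counter(e["key"] for e in archive if e["ts"] < cutoff)

def realtime_view(stream, cutoff):
    """Streaming pass: incrementally count events at or after the cutoff."""
    counts = Counter()
    for e in stream:
        if e["ts"] >= cutoff:
            counts[e["key"]] += 1
    return counts

def merged_view(batch, realtime):
    """Combine both views; each event falls on exactly one side of the cutoff."""
    return batch + realtime

events = [
    {"ts": 1, "key": "session_start"},
    {"ts": 2, "key": "session_start"},
    {"ts": 3, "key": "crash"},
    {"ts": 4, "key": "session_start"},
]
cutoff = 3  # assumed: the batch job has processed everything before ts=3

merged = merged_view(batch_view(events, cutoff), realtime_view(events, cutoff))
print(dict(merged))
```

Because the cutoff splits the events into two disjoint sets, no event is counted twice. A real system would also have to deal with late or duplicated events, which this sketch ignores.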

Workflow:

Event reception

[Figure: event reception architecture screenshot]

Archival

[Figure: archival architecture screenshot]

Batch computation

[Figure: batch computation architecture screenshot]

 

Stream computation

[Figure: stream computation architecture screenshot]

 

Note: Stream computation happens in parallel with archival.
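The parallelism in the note above can be sketched as a simple fan-out: every received event is handed to two independent consumers, one that archives it and one that updates a running count. This is only an illustration under stated assumptions. The in-process queues stand in for whatever messaging layer a real deployment would use, and the names `archiver`, `stream_counter`, `run`, and `SENTINEL` are hypothetical.

```python
import queue
import threading

SENTINEL = None  # assumed end-of-stream marker for this sketch

def archiver(q, archive):
    """Consume events and append them to the archive (a list standing in for storage)."""
    while True:
        event = q.get()
        if event is SENTINEL:
            break
        archive.append(event)

def stream_counter(q, counts):
    """Consume the same events and keep a running count per event key."""
    while True:
        event = q.get()
        if event is SENTINEL:
            break
        counts[event["key"]] = counts.get(event["key"], 0) + 1

def run(events):
    """Fan each event out to both consumers, which run in parallel threads."""
    archive_q, stream_q = queue.Queue(), queue.Queue()
    archive, counts = [], {}
    workers = [
        threading.Thread(target=archiver, args=(archive_q, archive)),
        threading.Thread(target=stream_counter, args=(stream_q, counts)),
    ]
    for w in workers:
        w.start()
    for event in events:
        archive_q.put(event)  # archival path
        stream_q.put(event)   # stream computation path
    archive_q.put(SENTINEL)
    stream_q.put(SENTINEL)
    for w in workers:
        w.join()
    return archive, counts

sample = [{"key": "session_start"}, {"key": "crash"}, {"key": "session_start"}]
archive, counts = run(sample)
print(len(archive), counts)
```

Neither consumer waits on the other, so a slow archive write does not delay the real-time counts, and the archive still ends up with every event for later batch processing.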

 
