Data Stack

Single-node data store

  • SQL – SQLite, MySQL, PostgreSQL
  • Key-value – Redis, BoltDB
  • Document-oriented – CouchDB
  • Graph – Cayley
  • File system – ext4, XFS, btrfs
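
As a rough illustration of how the SQL and key-value entries above differ at the API level, here is a minimal sketch using Python's built-in sqlite3 module, with a plain dict standing in for a key-value store such as Redis. The table name, keys, and sample rows are made up for the example.

```python
import sqlite3

# SQL model: declare a schema up front, then query by any column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    [(1, "ann", "Oslo"), (2, "bob", "Lima"), (3, "cy", "Oslo")],
)
sql_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM users WHERE city = ? ORDER BY id", ("Oslo",)
    )
]

# Key-value model: opaque values, looked up by key only.
# A dict is an assumed stand-in here, not a real Redis or BoltDB client.
kv = {}
kv["user:1"] = "ann|Oslo"
kv["user:2"] = "bob|Lima"
kv_value = kv.get("user:1")

print(sql_names)
print(kv_value)
```

The SQL side can filter on city without any extra work, while the key-value side would need a second index (for example a key per city) maintained by the application.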

Cluster-aware data store

  • SQL – MySQL (Galera, NDB), PostgreSQL
  • Key-value – Riak KV, Amazon Dynamo
  • Document-oriented – SequoiaDB, MongoDB, TokuMX, Couchbase
  • Graph – ArangoDB, OrientDB, Neo4j
  • Column family – Cassandra, InfluxDB
  • Column-oriented – Druid, Amazon Redshift, Google BigQuery
  • Distributed block storage – Ceph RBD
  • Distributed file system – HDFS, Ceph FS, ZFS, XtreemFS
  • Object storage – Amazon S3, Ceph Object Gateway, OpenStack Swift

API

  • Graph – Pregel (from Google), Apache Giraph (originated at Yahoo, used heavily at Facebook), GraphX (from Apache Spark), Apache TinkerPop
  • Distributed SQL – Dremel (from Google), Apache Drill (open-source system inspired by Dremel), Spark SQL (from Apache Spark), Presto (from Facebook)
  • Batch processing – MapReduce (Hadoop)
  • Stream processing – Spark Streaming, Apache Storm, Apache Samza
  • Unified batch and stream processing – Google Cloud Dataflow, Apache Flink
  • Machine learning – Spark MLlib, Apache Mahout, Amazon Machine Learning
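
The batch-processing entry above refers to the MapReduce model. A minimal single-process sketch of its three steps (map, shuffle, reduce) in plain Python follows. It does not use Hadoop, and the function names and sample input are assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["a b a", "b c"])
print(counts)
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; here everything runs in one process so the data flow is easy to follow.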