There are a huge number of choices, a huge number of benefits, competitive communities & multiple use-case scenarios. Finding the best possible match, whether it is Graph-Based, Column-Based or Document-Based, is never easy.
Every category has a list of options to choose from, each with its own advantages and disadvantages.
In this post, we will try to understand the advantages, disadvantages, examples & use cases for most of the popular databases available.
Three main keys to consider while choosing a database
Availability
Consistency
Partition Tolerance
According to the CAP theorem (Brewer’s theorem), when you are designing a distributed system you cannot achieve all three of Consistency, Availability and Partition tolerance. You can pick only two of the three. ~ wiki
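To make the trade-off concrete, here is a minimal sketch of a two-replica system during a network partition. The class names and the "CP"/"AP" mode strings are made up for this illustration and are not taken from any real database; real systems are far more nuanced.

```python
class Replica:
    """One node holding a copy of the data."""

    def __init__(self):
        self.data = {}


class TinyCluster:
    """Two replicas; mode is "CP" or "AP" (hypothetical labels for this sketch)."""

    def __init__(self, mode):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True when a and b cannot reach each other

    def write(self, key, value):
        if not self.partitioned:
            # Healthy network: update both copies.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.mode == "CP":
            # Prefer consistency: refuse the write rather than let copies diverge.
            return False
        # Prefer availability: accept the write on the reachable replica only.
        self.a.data[key] = value
        return True


cp, ap = TinyCluster("CP"), TinyCluster("AP")
for cluster in (cp, ap):
    cluster.write("price", 100)
    cluster.partitioned = True  # simulate the network split

cp_ok = cp.write("price", 120)
ap_ok = ap.write("price", 120)
print(cp_ok, cp.a.data == cp.b.data)  # False True
print(ap_ok, ap.a.data == ap.b.data)  # True False
```

The CP cluster stays consistent but rejects the write, while the AP cluster accepts it and its replicas disagree until the partition heals.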
Let’s get into the practicalities of selection. We have divided NoSQL databases into 4 categories.
Key Value
Document-Based
Column-Based
Graph-Based
Categorized Database Names:
Key Value: Riak, Redis Server, Memcached, Scalaris, Tokyo Cabinet.
Document-Based: MongoDB, CouchDB, OrientDB, RavenDB.
Column-Based: Cassandra, HBase, Hypertable, Bigtable.
Graph-Based: Neo4J, InfoGrid, InfiniteGraph, FlockDB.
Apart from the above, there is one more category called the Multi Model Database. We’ll get to it after covering these 4.
Key Value
Key-value stores are fast for lookups by key, but they are schema-less (unstructured).
Examples: URL shorteners, Pastebin, and e-commerce use cases such as temporary prices, user profiles, product recommendations, session information, etc.
Companies using:
Twitter uses Redis to deliver your timeline.
Pinterest uses a key-value store for followers, following, etc.
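As a rough illustration of the key-value model, the sketch below keeps everything in a plain Python dictionary and adds an optional expiry for temporary data. The `KeyValueStore` class, the key names and the URL are hypothetical; a real deployment would use a server such as Redis or Memcached rather than this in-process toy.

```python
import time


class KeyValueStore:
    """Toy in-memory key-value store with optional expiry (hypothetical class)."""

    def __init__(self):
        self._items = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._items[key] = (value, expires_at)

    def get(self, key, default=None):
        item = self._items.get(key)
        if item is None:
            return default
        value, expires_at = item
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]  # lazily drop expired entries
            return default
        return value


store = KeyValueStore()
# URL shortener: short code -> long URL, no expiry.
store.set("short:abc123", "https://example.com/a/very/long/url")
# Session information: expires after 30 minutes.
store.set("session:42", {"user": "alice", "cart": ["sku-1"]}, ttl_seconds=1800)
# Temporary price that has already expired.
store.set("temp_price:sku-1", 9.99, ttl_seconds=0)

print(store.get("short:abc123"))
print(store.get("temp_price:sku-1", "no temporary price"))
```

Every read and write goes through a single key, which is what makes this model fast, and also why there is no way to query by the contents of a value.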
Document Based
Designed for storing, retrieving and managing document-based information.
Advantages: Tolerant of incomplete or irregular data.
Disadvantages: Query performance; no standard structured query language.
Use Cases: Can be used as a scalable general-purpose database.
Examples: A famous weather app (iOS) delivers weather alerts to 40M users; SEGA uses MongoDB for handling 11M in-game accounts.
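The "data tolerant" point can be sketched with plain Python dictionaries standing in for documents. The account fields and the `find` helper below are invented for illustration and do not reflect MongoDB's actual API or any real game's data.

```python
# Each document is self-describing; documents in one collection
# do not need to share the same fields.
accounts = [
    {"_id": 1, "name": "alice", "level": 12, "items": ["sword", "shield"]},
    {"_id": 2, "name": "bob", "level": 3},  # no "items" field at all
    {"_id": 3, "name": "carol", "level": 20,
     "guild": {"name": "Owls", "rank": "leader"}},  # nested sub-document
]


def find(documents, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Missing fields are tolerated by reading them with a default.
high_level = [doc["name"] for doc in accounts if doc.get("level", 0) >= 10]
with_items = [doc["name"] for doc in accounts if "items" in doc]

print(find(accounts, name="bob"))
print(high_level)   # ['alice', 'carol']
print(with_items)   # ['alice']
```

Note that every query here scans the whole collection, which hints at why query performance is listed as a disadvantage unless the database maintains indexes.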
Column Based
Column-based stores offer very high performance and a highly scalable architecture, because data is fast to load and query. They organize the data into rows and groups of columns.
Excellent real-time usages:
– Tweet information for a user can be saved column-wise.
– Facebook: uses a column-based store (HBase) for nearby friends.
– Spotify: uses Cassandra to store user profile attributes like artists, songs, etc.
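One way to picture "rows and groups of columns" is a nested mapping from row key to column family to column. This is a simplified model with made-up names (`put`, `get_family`, the `user:` keys), not the API of Cassandra or HBase.

```python
from collections import defaultdict

# table[row_key][column_family][column] = value
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write one cell; rows may have completely different columns."""
    table[row_key][family][column] = value


def get_family(row_key, family):
    """Read one column family for a row without touching the others."""
    return dict(table.get(row_key, {}).get(family, {}))


put("user:alice", "profile", "name", "Alice")
put("user:alice", "profile", "country", "IN")
# Tweets stored column-wise: one column per tweet, named by timestamp.
put("user:alice", "tweets", "2024-01-01T10:00", "Hello world")
put("user:alice", "tweets", "2024-01-02T09:30", "Second tweet")
put("user:bob", "profile", "name", "Bob")  # bob has no "tweets" columns yet

alice_tweets = get_family("user:alice", "tweets")
bob_tweets = get_family("user:bob", "tweets")
print(alice_tweets)
print(bob_tweets)  # {}
```

Reading a user's tweets only touches the "tweets" group of that one row, which is the access pattern this layout is meant to make cheap.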
Graph Based
Graph-based databases are used for a variety of purposes by many well-known companies.
– LinkedIn: for showing connections.
– Google Knowledge Graph: for example, search for "Indian Prime Minister" and the first result box given by Google is an example of graph-based data.
– Walmart: uses Neo4J for personalized product recommendations for customers.
– Medium: uses Neo4J to build their social graph to enhance content personalization.
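The connection and recommendation use cases above boil down to walking relationships. Below is a small sketch using an in-memory adjacency map and breadth-first search; the people and function names are hypothetical, and a graph database would express the same idea in its own query language rather than in application code.

```python
from collections import deque

# Undirected "is connected to" relationships.
connections = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}


def degree_of_connection(graph, start, target):
    """Breadth-first search: number of hops from start to target, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        for neighbour in graph.get(person, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


def suggestions(graph, person):
    """Friends of friends who are not already direct connections."""
    direct = graph.get(person, set())
    candidates = set()
    for friend in direct:
        candidates |= graph.get(friend, set())
    return candidates - direct - {person}


print(degree_of_connection(connections, "alice", "erin"))  # 3
print(suggestions(connections, "alice"))                   # {'dave'}
```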
The picture below depicts where and when you can utilize each type.
Source: Martin Fowler
Multi Model Database
A Multi Model Database combines the capabilities of Column-Based, Graph-Based, Document-Based and Key Value databases.
Examples: Microsoft Azure Cosmos DB, OrientDB.
The Monolithic Database Approach
Issues in the monolithic approach:
Difficult to make schema changes
Vertical scaling
Single point of failure
Technology lock-in
Let’s split up this monolithic approach to resolve these issues.
Data Categorization
Transient Data
Information generated by the application/system.
1- Events, logs, signals
2- No persistent storage, so it should be highly available
Ephemeral Data
Temporary data whose sole purpose is to improve the user experience by serving information in real time, e.g. a cache.
Operational Data
Information gathered from user sessions, such as user clicks and cart data.
Transactional Data
Payment processing and order processing data.
I hope this is useful to you. Let me know your views.
Thanks for reading.
Read the Full Post here: http://becauseitsonelife.com/?p=106