More and more data is being created every day. You may have heard the often-quoted claim that 90% of the world’s data was created in the last two years alone. Ever wonder where and how that much data can be stored, or which database you should choose to store it?
Data is crucial for business. As data volumes grow, we need a mechanism that can store data for long periods, often indefinitely. Databases have been a reliable way to store data for decades. They not only store the data but also secure it and manage it efficiently.
Choose a database carefully. The goal is to pick the right database at the beginning of a project and stick with it until the end. Changing the decision later is difficult and expensive, and it wastes work the team has already done. Organizations cannot afford to get this wrong the first time.
Faced with this decision, organizations often pick the technology they know best. That can mean forcing a database to work even though it doesn’t meet your needs. The decision should be based on how well the database meets the needs of your organization, rather than on familiarity with the platform.
SQL and NoSQL – can they be friends?
The real question here is what scalability and availability your database must deliver.
Relational databases are the best fit for transactional applications, as they are versatile and guarantee atomic transactions. They are widely supported and, through normalization, minimize data redundancy. An RDBMS follows the ACID properties. That is, a database transaction must be Atomic, Consistent, Isolated and Durable.
- Atomicity : Each transaction is “all or nothing.” If one part of the transaction fails, the entire transaction fails and the database state is left unchanged.
- Consistency : The database is in a valid state before a transaction starts and is in a valid state again once it is over.
- Isolation : Concurrent transactions do not interfere with each other; the changes made by one transaction are not visible to others until it commits.
- Durability : Once a transaction has committed, its changes persist and are not undone, even after a crash or power failure.
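The “all or nothing” behaviour of atomicity can be sketched with Python's built-in sqlite3 module. This is a minimal illustration rather than a production pattern: the accounts table, the transfer helper, and the sample balances are hypothetical names made up for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            # Credit first, so a failing debit has to undo a change already made.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; nothing was kept.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical schema: balances may never go below zero.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

transfer(conn, "alice", "bob", 30)    # valid: both updates are committed
transfer(conn, "alice", "bob", 500)   # invalid: the credit to bob is rolled back
print(balances(conn))
```

The second transfer fails on the debit, after the credit has already been applied inside the transaction. Because both statements belong to one transaction, the rollback removes the credit as well, and the balances stay as the first transfer left them.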
In SQL databases, the amount of data that can be stored has traditionally been bounded by the resources (disk, memory, CPU) of a single server. When a relational database outgrows one server, it is no longer that easy to operate. RDBMSs are usually scaled vertically, i.e., capacity is increased by adding CPU, RAM, or storage to the same machine.
However, much of today’s data is unstructured or semi-structured, and a storage model other than relational tables is needed for it. This is where NoSQL comes in. Many NoSQL databases relax parts of ACID in order to gain other benefits such as availability and partition tolerance, often settling for eventual consistency. Their trade-offs are usually described in terms of Brewer’s CAP theorem.
The CAP theorem states that a distributed data store cannot guarantee all three of the following properties at the same time; it can provide at most two of them:
- Consistency : Once data is written, every later read returns that data (or a newer value), no matter which node answers
- Availability : Every request to a working node receives a response, even if it may not reflect the latest write
- Partition Tolerance : The system keeps operating even when a network failure cuts some nodes off from the others
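The trade-off can be made concrete with a toy model. The sketch below is an illustration only and is not modelled on any real database: TinyCluster, its "CP" and "AP" modes, and the partitioned flag are hypothetical names. It keeps two in-memory replicas and shows what each choice gives up once the replicas can no longer reach each other.

```python
class TinyCluster:
    """Two in-memory replicas of one key-value store (toy model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False  # True when the replicas cannot communicate

    def write(self, node, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for replica in self.replicas:
                replica[key] = value
        elif self.mode == "CP":
            # Stay consistent by refusing the write: availability is lost.
            raise RuntimeError("unavailable: cannot reach the other replica")
        else:
            # Stay available by accepting the write locally: replicas diverge.
            self.replicas[node][key] = value

    def read(self, node, key):
        return self.replicas[node].get(key)


ap = TinyCluster("AP")
ap.write(0, "cart", ["book"])
ap.partitioned = True
ap.write(0, "cart", ["book", "pen"])   # accepted by node 0 only
print(ap.read(0, "cart"), ap.read(1, "cart"))  # node 1 still holds the old cart

cp = TinyCluster("CP")
cp.write(0, "cart", ["book"])
cp.partitioned = True
try:
    cp.write(0, "cart", ["book", "pen"])
except RuntimeError as err:
    print(err)  # the write is rejected, so both replicas still agree
```

In AP mode the system keeps answering but the two nodes disagree until the partition heals; in CP mode the nodes always agree but some requests fail. Real systems add a reconciliation step after the partition ends, which this sketch leaves out.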
NoSQL databases are not bound by the capacity of a single machine, because the system can be scaled horizontally by adding more servers. They can usually spread data across different physical servers without the application needing to know which server holds the data it is looking for. If one machine fails, the query can be served by another replica.
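One common way to get this behaviour is to hash each key to a node and keep a copy on the next node as well. The following sketch assumes that scheme; ShardedStore and its methods are hypothetical names, the nodes are plain dictionaries standing in for servers, and no particular NoSQL product is being described.

```python
import hashlib

class ShardedStore:
    """Toy key-value store spread over several nodes, with replication."""

    def __init__(self, num_nodes, replicas=2):
        if replicas > num_nodes:
            raise ValueError("cannot keep more replicas than there are nodes")
        self.nodes = [{} for _ in range(num_nodes)]  # each dict stands for a server
        self.down = set()                            # indexes of failed nodes
        self.replicas = replicas

    def owners(self, key):
        """Return the node indexes that hold copies of `key`."""
        digest = hashlib.sha256(key.encode()).hexdigest()
        first = int(digest, 16) % len(self.nodes)
        return [(first + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        for n in self.owners(key):
            if n not in self.down:
                self.nodes[n][key] = value

    def get(self, key):
        # The caller never names a server; the hash decides where to look.
        for n in self.owners(key):
            if n not in self.down and key in self.nodes[n]:
                return self.nodes[n][key]
        raise KeyError(key)


store = ShardedStore(num_nodes=4)
store.put("user:42", "Ada")

primary = store.owners("user:42")[0]
store.down.add(primary)          # simulate the primary machine failing
print(store.get("user:42"))      # still answered, by the replica
```

The plain modulo placement is a simplification: with it, adding a node moves most keys to a different server, which is why production systems tend to use consistent hashing or a similar scheme instead.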
A Data Fairytale
Whether NoSQL or SQL should be used depends on the problem the organization is trying to solve; the right choice follows from the requirements and the use case. An application may have different requirements for storing and scaling different kinds of data. SQL and NoSQL are each good for specific applications. NoSQL databases are best seen as a complement to SQL in the database field, not a replacement for it.
There really isn’t a ‘one-system-fits-all’ approach, so why not use both?
NoSQL and SQL are very different, but their differences complement one another, with each delivering functionality the other cannot. Many organizations choose to use a hybrid of both.
For example, NoSQL is a great fit for the high-volume side of an online e-commerce platform, where some ACID guarantees can be sacrificed for flexibility and processing speed. Because of the high volume of traffic coming through its servers, the platform must be able to store incoming information (clickstream data, for example) quickly and reliably. That calls for a highly reliable, highly scalable key-value database.
For the transactional data of the same e-commerce application, on the other hand, data integrity is very important. Transactional data is well structured, and if one step of a transaction fails, all of its steps must be rolled back so that the database is left as if no changes had been made. An RDBMS is best suited to this situation. From this, we can conclude that SQL and NoSQL can coexist.