Introduction to NoSQL
NoSQL, as many of you may already know, refers to a family of databases used to manage huge sets of unstructured data, in which the data is not stored in tabular relations as it is in relational databases. Most existing relational databases struggle to solve some complex modern problems, such as:
- The continuously changing nature of data: structured, semi-structured, unstructured and polymorphic data.
- Applications now serve millions of users in different geographic locations and time zones, and have to be up and running all the time with data integrity maintained.
- Applications are becoming more distributed with many moving towards cloud computing.
NoSQL plays a vital role in enterprise applications that need to access and analyze massive data sets spread across multiple remote virtual servers in cloud infrastructure, especially when the data is not structured. NoSQL databases are therefore designed to overcome the performance, scalability, data modelling and distribution limitations seen in relational databases.
What is Structured Data?
Structured data usually comes as text files with defined column titles and data in rows. Such data can easily be visualized in the form of charts and processed using data mining tools.
What is Unstructured Data?
Unstructured data can be anything: a video file, an image file, a PDF, emails etc. What do these files have in common? Nothing. Structured information can be extracted from unstructured data, but the process is time consuming. As more and more modern data is unstructured, growing applications needed something to store such data, which paved the way for NoSQL.
NoSQL Database Types
Following are the main NoSQL database types:
- Document databases: In this type, a key is paired with a complex data structure called a document. Example: MongoDB.
- Graph stores: This type of database is usually used to store networked data, where one piece of data can be related to another through explicit connections.
- Key-value stores: These are the simplest NoSQL databases. Each item is stored with a key that identifies it. In some key-value databases, such as Redis, the type of the value can be saved along with it.
- Wide-column stores: Used to store large data sets by storing columns of data together. Examples: Cassandra (originally developed at Facebook), HBase etc.
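To make the four types above a little more concrete, here is a rough sketch that models each one with plain Python structures. Nothing here talks to a real database. The student names, keys and the classmates helper are made up purely for illustration.

```python
# Illustrative only: plain Python structures standing in for each NoSQL model.

# Document: a key paired with a nested, self-describing document.
document_store = {
    "student:1": {
        "name": "Asha",
        "courses": ["Math", "Physics"],
        "address": {"city": "Pune"},
    },
}

# Key-value: an opaque value looked up by its key.
key_value_store = {"session:42": "logged_in"}

# Wide-column: each row key maps to its own set of columns.
wide_column_store = {
    "student:1": {"name": "Asha", "city": "Pune"},
    "student:2": {"name": "Ben"},  # rows need not share the same columns
}

# Graph: nodes plus edges that describe relationships between them.
graph_store = {
    "nodes": {"asha", "ben", "math"},
    "edges": [("asha", "ENROLLED_IN", "math"), ("ben", "ENROLLED_IN", "math")],
}


def classmates(graph, course):
    """Return the sorted nodes that have an ENROLLED_IN edge to the course."""
    return sorted(
        src for src, rel, dst in graph["edges"]
        if rel == "ENROLLED_IN" and dst == course
    )


print(classmates(graph_store, "math"))  # ['asha', 'ben']
```

Real products add storage engines, indexes and query languages on top of these shapes, so treat this only as a mental model.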
Some Advantages of NoSQL Databases
Here we will be discussing some of the main advantages of NoSQL databases with examples.
Dynamic Schemas
You must be wondering what a dynamic schema means. In relational databases like Oracle and MySQL we define table structures, right? For example, if we want to save records of student data, we have to create a table named Student and add columns to it, like student_id, student_name etc. This is called a defined schema, where we define the structure before saving any data.
If in future we plan to add some more related data to our Student table, we will have to add a new column to the table. That is easy if we have little data in our tables, but what if we have millions of records? Migrating to the updated schema would be a hectic job. NoSQL databases solve this problem, because in a NoSQL database an upfront schema definition is not required.
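The idea can be sketched in a few lines of Python. This is a minimal illustration, assuming an in-memory list as a stand-in for a schemaless collection. The email field and the emails helper are hypothetical and not part of any real database API.

```python
# An in-memory list stands in for a schemaless collection (not a real database).
students = []

# The first record uses only the fields we knew about at the start.
students.append({"student_id": 1, "student_name": "Asha"})

# A later record carries a new field. No table alteration or migration is needed.
students.append(
    {"student_id": 2, "student_name": "Ben", "email": "ben@example.com"}
)


def emails(collection):
    """Collect the email of every record that has one, skipping older records."""
    return [doc["email"] for doc in collection if "email" in doc]


print(emails(students))  # ['ben@example.com']
```

The trade-off is that the application code, rather than the database, now has to cope with records that lack the newer field.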
Sharding
In sharding, a large database is partitioned into smaller, faster and more easily manageable databases.
The (classic) relational databases follow a vertical architecture, where a single server holds the data because all the data is related. Relational databases do not provide sharding by default. Achieving it takes a lot of effort, because transactional integrity (inserting/updating data in transactions), multi-table JOINs etc. cannot be easily achieved in a distributed architecture.
Many NoSQL databases support sharding out of the box, with little additional effort required. They automatically spread the data across servers and fetch it from an available server that holds it, while maintaining the integrity of the data.
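One common way to decide which server holds which record is to hash the key. The sketch below assumes simple hash-modulo placement across a fixed number of in-memory shards. The ShardedStore class is hypothetical, and real databases may use range-based or consistent-hashing schemes instead.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across a fixed number of shards."""

    def __init__(self, shard_count):
        # Each dict plays the role of one server.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # A stable hash means the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)


sharded = ShardedStore(shard_count=3)
for i in range(9):
    sharded.put(f"student:{i}", {"student_id": i})

print(sharded.get("student:4"))  # {'student_id': 4}
print([len(shard) for shard in sharded.shards])  # how many keys landed on each shard
```

Note that with plain modulo placement, changing the number of shards moves most keys to a different shard, which is one reason production systems tend to use more elaborate schemes.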
Replication
Automatic data replication is also supported by many NoSQL databases. Hence, if one DB server goes down, the data is still available from a copy kept on another server in the network.
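A rough sketch of the idea, assuming every write is copied synchronously to all live nodes and reads are served by the first node that is still up. The ReplicatedStore class and its fail method are hypothetical. Real systems also deal with consistency, elections and catching up a node after it recovers.

```python
class ReplicatedStore:
    """Toy store that copies every write to all live nodes."""

    def __init__(self, node_count):
        self.nodes = [{} for _ in range(node_count)]
        self.alive = [True] * node_count

    def put(self, key, value):
        # Write the value to every node that is currently up.
        for node, up in zip(self.nodes, self.alive):
            if up:
                node[key] = value

    def fail(self, index):
        # Simulate one server going down.
        self.alive[index] = False

    def get(self, key):
        # Serve the read from the first node that is still up.
        for node, up in zip(self.nodes, self.alive):
            if up:
                return node.get(key)
        raise RuntimeError("no live nodes")


replicated = ReplicatedStore(node_count=3)
replicated.put("student:1", "Asha")
replicated.fail(0)  # the first server goes down
print(replicated.get("student:1"))  # Asha
```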
Integrated Caching
Many NoSQL databases have support for integrated caching, where frequently requested data is kept in a cache to make queries faster.
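The effect of such a cache can be sketched as a read-through cache placed in front of a slower store. This is only an illustration: the CachedStore class is hypothetical, a plain dict stands in for the slower storage, and the eviction policies that real caches need are left out.

```python
class CachedStore:
    """Toy read-through cache in front of a slower backing store."""

    def __init__(self, backing):
        self.backing = backing  # stands in for slower storage
        self.cache = {}
        self.backing_reads = 0  # counts how often we had to go to the backing store

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: no backing read needed
        self.backing_reads += 1
        value = self.backing[key]
        self.cache[key] = value
        return value

    def put(self, key, value):
        self.backing[key] = value
        # Invalidate the cached copy so the next read sees the new value.
        self.cache.pop(key, None)


cached = CachedStore({"student:1": "Asha"})
cached.get("student:1")  # first read goes to the backing store
cached.get("student:1")  # second read is served from the cache
print(cached.backing_reads)  # 1
```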
MongoDB - NoSQL Database
MongoDB is a NoSQL database written in C++. Some of its drivers use the C programming language as the base. MongoDB is a document-oriented database that stores data in collections instead of tables. The best part of MongoDB is that drivers are available for almost all the popular programming languages.
In today's competitive technological world, many companies have started hosting their enterprise applications in the cloud in order to expand the business globally, provide faster services and personalise the customer's experience with the application and the overall business. NoSQL has become a popular choice of database technology for developing such applications.