Hi friends, this is my blog's first step into the Big Data skill set. Big Data, as we all know, is a fast-growing field and one that is in high demand. Today I will focus on providing some insights into What is Big Data And Its Characteristics – Four V of Big Data. In the coming days, I will share some hands-on tutorials on Big Data technologies like Hadoop, Spark, Pig, etc.
What is Big Data?
Big Data, as the name suggests, refers to the technologies that let us handle very high volumes of data with high performance. It gives us a framework in which we can move very large volumes of data into our Big Data landscape and run our computations on that data. The data can come from a variety of sources and can be in different formats.
Why do we need Big Data?
We may wonder why we need Big Data at all when we already have other technologies like databases and ETL/data warehousing solutions. What requirements made Big Data necessary? Before looking into the four Vs of Big Data, let's first see what our existing technologies cannot do, or find difficult to do.
In today's world, data is growing at a very fast rate. We are consuming data at a very high speed, and at the same time we are generating data too fast for traditional databases to handle. Here is a glimpse of the scale of activity on search and social media platforms:
- Google processes 100 billion searches a month. That's an average of roughly 40,000 search queries every second.
- Facebook has 2.603 billion monthly active users. It adds 500,000 new users every day, or about 6 new profiles every second.
- YouTube has 2 billion monthly active users.
- WhatsApp has 2 billion monthly active users.
- Facebook Messenger has 1.3 billion monthly active users.
- Instagram’s potential advertising reach is roughly 1.08 billion.
- Reddit has 430 million monthly active users.
- Pinterest has 367 million monthly active users.
- Twitter’s potential advertising reach is roughly 326 million. It took 3 years, 2 months and 1 day to go from the first Tweet to the billionth.
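As a quick sanity check on the per-second figures quoted for Google and Facebook in the list above, here is a minimal Python sketch of the arithmetic. The helper name `per_second` and the 30-day month are my own assumptions for illustration, not something taken from the original statistics.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_second(count, days):
    """Average number of events per second for `count` events spread over `days` days."""
    return count / (days * SECONDS_PER_DAY)


# Assumption: a "month" is treated as 30 days.
google_rate = per_second(100_000_000_000, days=30)
facebook_rate = per_second(500_000, days=1)

print(f"Google searches per second: {google_rate:,.0f}")          # 38,580
print(f"New Facebook users per second: {facebook_rate:.1f}")      # 5.8
```

Both results land close to the rounded numbers in the list (about 40,000 searches and about 6 new profiles per second).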
Now just imagine using a traditional RDBMS for this kind of data processing: it could take hours to analyze such huge data, and today's users do not want to wait for the output. Hence, we need a solution that is capable of handling huge data, fast enough to process it, and able to read and understand data in a variety of formats like text, images, videos, CSV, etc.
Big Data came into the picture to solve these issues, and the issues themselves became its characteristics. So now let's look at the characteristics of Big Data.
Four Vs of Big Data:
- Volume – Big Data is capable of handling a high volume of data, like the figures we saw just above.
- Variety – Apart from a high volume of data, we want Big Data to handle data of different varieties, like images, videos, GIFs, and the other formats that we share over social networking platforms.
- Velocity – Data is generated, and has to be processed, at a very high speed. In a single day, tens of millions of tweets are posted, and even more images are uploaded. In the stock market, prices change within seconds, so you cannot wait for data processing. Hence Big Data is designed to cater to such high-velocity data processing.
- Veracity – Apart from the above properties, another important aspect is Veracity, which describes the quality of the data being analyzed. We only need the data that is meaningful to us, so that we can make quality decisions from it. If you have 1000 records in front of you and only 500 are useful, the data is of low quality and hence has low Veracity. Big Data frameworks come with tools and technologies that help us get to the meaningful data.
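To make the Veracity example of 1000 records with only 500 useful ones concrete, here is a minimal Python sketch that measures the share of useful records. The function name `veracity_ratio`, the sample records, and the rule "a record is useful if its value is not missing" are all hypothetical and chosen only for illustration; real data quality checks depend on the data set.

```python
def veracity_ratio(records, is_useful):
    """Fraction of records that pass the `is_useful` check (0.0 for empty input)."""
    records = list(records)
    if not records:
        return 0.0
    useful = sum(1 for record in records if is_useful(record))
    return useful / len(records)


# Hypothetical data: 1000 records, every second one is missing its value.
records = [{"id": i, "value": i if i % 2 == 0 else None} for i in range(1000)]

# Assumed quality rule: a record is useful only if its value is present.
ratio = veracity_ratio(records, lambda record: record["value"] is not None)
print(f"Useful records: {ratio:.0%}")  # 50%
```

A ratio of 50% matches the example in the list: half of the records cannot support any decision, so the data set has low Veracity.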
These are the major characteristics of Big Data. I hope you now have some idea of the 'What and Why of Big Data'. In the next post, I will try to explain the framework and architecture of Big Data. Later on, we will move to hands-on work with Big Data, which will give us a better idea of these technologies.
You can read more about Big Data on this page: https://www.oracle.com/in/big-data/what-is-big-data.html