Data is made up of facts, figures or bits of information, but it is not in itself information. When data is processed, interpreted, organised, structured or presented so that it becomes meaningful or useful, it is called information; information provides context for data.
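As a small illustration of the difference, the sketch below takes a handful of raw figures and processes them into something a reader can act on. The sales records, the region names and the `summarise_by_region` helper are all hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical raw data: (region, sale amount) pairs that mean little on their own.
raw_sales = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 200.0),
    ("south", 40.0),
]


def summarise_by_region(records):
    """Turn raw (region, amount) records into total and average sales per region."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, amount in records:
        totals[region] += amount
        counts[region] += 1
    return {
        region: {
            "total": totals[region],
            "average": totals[region] / counts[region],
        }
        for region in totals
    }


# The summary is the "information": the same figures, organised to give context.
summary = summarise_by_region(raw_sales)
```

The list of tuples is data; the per-region totals and averages are information, because they have been organised so that regions can be compared.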
Companies around the world are now realising that access to large amounts of raw data is a valuable commodity, and that how the data is handled is just as important. Raw data by itself isn’t valuable, but once it is processed and analysed it can be turned into information that guides companies in specific market targeting, product development, organisational changes and so on.
This data can come from different sources, but it is classified as big data when it is collected in high volume, at great velocity (speed) and in wide variety. The definition most often quoted is:
“‘Big Data’ is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”
– Gartner (Gartner, Inc. (NYSE: IT) is the world’s leading information technology research and advisory company)
This definition introduces what many people refer to as the 3 V’s of Big Data:
- Volume – The amount of data available in the world takes huge leaps every year; a commonly quoted rule of thumb is that “the amount of the data in the world doubles every two years”. This can be seen in the evolution of web services such as Facebook, Netflix, Amazon and Google, which keep offering new services, adopting existing features and opening up new areas for people to upload and share their data with the world (and also with the hosting site).
- Velocity – The speed at which new data is created, stored and analysed can greatly affect data management, because having the most current and relevant information can have a drastic effect on how a company operates and on how quickly it can adapt to changing trends or shifts in the market.
- Variety – This refers to the wide array of data being collected from different sources: social websites such as Facebook and Twitter, and smartphones supplying SMS, GPS locations, photos, audio and video through the many apps used daily by a large share of the world’s population. There is also the wide use of office documents such as word-processing documents and spreadsheets, along with relational and unstructured databases, all of which are used in business daily. With information being created in all these different media, it can easily be said that the majority of the information created daily is unstructured. How to make this varied data useful to organisations, and how the different kinds of data relate to each other, continues to be explored.
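The claim quoted under Volume, that the amount of data doubles every two years, can be made concrete with a little arithmetic. The sketch below simply takes that claim at face value; the function name `projected_volume` and the starting figure of 1 unit are assumptions made for illustration, not measurements.

```python
def projected_volume(initial_volume, years, doubling_period_years=2.0):
    """Project data volume, assuming it doubles once every doubling period."""
    return initial_volume * 2 ** (years / doubling_period_years)


# Ten years at one doubling every two years is five doublings: 2 ** 5 = 32.
ten_year_growth = projected_volume(1.0, 10)
```

Under this assumption, whatever volume exists today would be 32 times larger in a decade, which is why storage and processing approaches that work now may not keep working for long.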
These 3 V’s help identify the key dimensions of Big Data, but they are certainly not the only V’s associated with it: some sources add two or three items to the list, while others add another 11. Each addition adapts and pulls from the three core elements.
Some of the most popular additions are:
- Value – How valuable is the information being gathered, or does the data being collected have any value at all? Having access to data does not mean that the data has any value.
- Viability – In an article in Wired, Neil Biehn stated that “we want to carefully select attributes and factors that are most likely to predict outcomes that matter most to businesses”. In short: how viable is the gathered data, and can its viability be assessed, given how many varieties of data and variables have to be taken into consideration when building an effective predictive model?
- Veracity – How confident can you be in the data? Is it reliable and accurate?
- Variability – How likely is the data to change? In the case of sales, is the information you are receiving a seasonal trend, or could the changes be part of a temporary flux in trends (such as the popularity of Ugg boots; see the references below)?
- Visualization – How can the data be presented in a manner that is easy to understand?
- Virality – How easily is the data used by others, and at what rate does this happen?
- Volatility – How stable is the data? Does the information have a deadline or a best-before date, after which it becomes useless or irrelevant?
Looking at these, my own opinion is that, along with the 3 core V’s, some of the additional V’s do have value and should be considered useful in breaking down the value of Big Data, while others could be considered unnecessary and serve simply to break other factors down further.
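Two of these V’s, Veracity and Volatility, translate fairly directly into checks that can be run on each record. The sketch below is only one possible reading: it treats a record with no missing fields as a crude stand-in for “reliable”, and a record collected within a chosen time window as still within its best-before date. The field names, the 30-day window and the `is_usable` helper are hypothetical.

```python
from datetime import datetime, timedelta


def is_usable(record, now, max_age, required_fields):
    """Return True if a record is complete (veracity) and fresh enough (volatility)."""
    # Crude veracity check: every required field must be present and not None.
    complete = all(record.get(field) is not None for field in required_fields)
    # Volatility check: the record must not be older than the allowed age.
    fresh = now - record["collected_at"] <= max_age
    return complete and fresh


now = datetime(2016, 9, 12)
records = [
    {"product": "boots", "units": 40, "collected_at": datetime(2016, 9, 10)},
    {"product": "boots", "units": None, "collected_at": datetime(2016, 9, 11)},
    {"product": "boots", "units": 55, "collected_at": datetime(2016, 1, 1)},
]

# Keep only records that pass both checks, using a 30-day freshness window.
usable = [
    r for r in records
    if is_usable(r, now, timedelta(days=30), ("product", "units"))
]
```

Here only the first record survives: the second has a missing value and the third is past its best-before date. Real veracity checks would go well beyond completeness, for example by comparing values against a trusted source.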
Biehn, N. and PROS (2013) The missing V’s in big data: viability and value. Available at: https://www.wired.com/insights/2013/05/the-missing-vs-in-big-data-viability-and-value/ (Accessed: 08 September 2016).
(No Date) Available at: http://blueshiftideas.com/reports/051504DownTrendinDeckersUGGBootsWillOnlyWorsen.pdf (Accessed: 12 September 2016).
MapR (2014) Top 10 big data challenges – A serious look at 10 big data V’s. Available at: https://www.mapr.com/blog/top-10-big-data-challenges-%E2%80%93-serious-look-10-big-data-v%E2%80%99s (Accessed: 10 September 2016).
Diffen (no date) Data vs information – difference and comparison. Available at: http://www.diffen.com/difference/Data_vs_Information (Accessed: 11 September 2016).