Data Management & Data Governance

Data management refers to how companies and organisations around the world handle the data they record every day, with more data being generated each year than the last. Having this information at hand is becoming more and more important, so maintaining uniformity across machines in different countries, time zones and organisations can be difficult. This is why more and more companies are looking to Master Data Management (MDM) to gain the maximum benefit from the available data and to use it in the most effective way possible.

According to a Gartner article from 2013:

“Master data management (MDM) is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise’s official shared master data assets.  

Master data is the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts.”

What this implies is that for a company to really use MDM, it must ensure that the right manner and methods are used when dealing with data and data entry. Handling master data correctly is important for companies because:

  • Master data is used to make decisions on all company levels.
  • Business processes throughout the entire company rely on master data.
  • Higher quality master data helps to improve the operational efficiency of a company.
  • With high-quality master data, costs can be reduced.

The goal of an MDM initiative is to provide processes for collecting, aggregating, matching, consolidating, quality-assuring and distributing critical data throughout an organisation, to ensure consistency and control in the ongoing maintenance and application use of this information.

There are different processes that can be followed in order to utilise Master Data Management:

  • ETL: Extract, Transform, Load. A database process that extracts data from different sources, transforms it into a consistent format and loads it into a consolidated location
  • EAI: Enterprise application integration is the term for the plans, methods and tools used to modernise, consolidate and coordinate data for use in a company’s systems
  • EII: Enterprise information integration is the ability to support a unified view of data and information for an entire organisation
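
As a rough illustration of the ETL idea, here is a minimal Python sketch using only the standard library. The two source extracts, their column names and the `customers` table are all made up for the example; a real pipeline would read from actual files, APIs or databases and would need far more validation.

```python
import csv
import io
import sqlite3

# Hypothetical source extracts with different layouts (assumed for illustration).
CRM_CSV = "id,name,country\n1, alice smith ,ie\n2,Bob Jones,GB\n"
BILLING_CSV = "customer_id;full_name;country_code\n3;carol WHITE;us\n"


def extract(text, delimiter):
    """Extract: read raw rows from one source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def transform(row, id_key, name_key, country_key):
    """Transform: map source-specific columns onto one uniform shape."""
    return (
        int(row[id_key]),
        row[name_key].strip().title(),
        row[country_key].strip().upper(),
    )


def load(conn, records):
    """Load: write the cleaned records into the consolidated table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run all three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    records = [transform(r, "id", "name", "country") for r in extract(CRM_CSV, ",")]
    records += [
        transform(r, "customer_id", "full_name", "country_code")
        for r in extract(BILLING_CSV, ";")
    ]
    load(conn, records)
    return conn.execute(
        "SELECT id, name, country FROM customers ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

The point of the sketch is that each source keeps its own quirks (delimiter, column names, capitalisation) until the transform step, after which every record has the same shape in one place.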

Data Governance


Data governance is how organisations set and share common company/corporate policies for defining data, enforcing those definitions, and communicating ideas and principles.

Because most company data is held in databases and on computers, it is often thought that responsibility for the data should fall to the IT department, but this is not always the case, and some people don’t see the need to look after data governance in a company on a continuing basis.

Data governance initiatives can improve data quality by having an assigned team responsible for the data’s accuracy, accessibility, consistency and completeness, among other metrics. The team commonly consists of project management, business managers and data stewards.
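
To make metrics such as completeness and consistency a little more concrete, the following Python sketch shows one simple way a team might measure them. The sample records, the `VALID_COUNTRIES` reference list and the two function names are assumptions for illustration, not a standard definition of either metric.

```python
# Hypothetical master-data records; None marks a missing value.
RECORDS = [
    {"id": 1, "email": "a@example.com", "country": "IE"},
    {"id": 2, "email": None, "country": "ie"},
    {"id": 3, "email": "c@example.com", "country": "Ireland"},
    {"id": 4, "email": "d@example.com", "country": "GB"},
]

# Assumed reference list of agreed values, e.g. maintained by the data stewards.
VALID_COUNTRIES = {"IE", "GB"}


def completeness(records, field):
    """Share of records in which the field is filled in."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def consistency(records, field, allowed):
    """Share of records whose value matches the agreed reference list exactly."""
    matching = sum(1 for r in records if r.get(field) in allowed)
    return matching / len(records)


print(completeness(RECORDS, "email"))                    # 3 of 4 emails are present
print(consistency(RECORDS, "country", VALID_COUNTRIES))  # 2 of 4 countries use the agreed codes
```

Scores like these give the responsible team a number to track over time, rather than a general feeling that the data is "good" or "bad".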

These are the people who drive the strategy and vision for the data, decide what data is stored, assign and manage the data stewards, and set “Best Practices” for the company to follow.

Some of the most popular tools for Data Governance are:

  • Onesoft Connect
  • A.K.A
  • Collibra
  • Acaveo
  • BigData
  • Fusion Platform

References


Gartner (2013) http://www.gartner.com/it-glossary/master-data-management-mdm [Accessed 16th September 2016]

Baum, D., (n.d.) ‘Masters of the Data’ Oracle. [Online]. Available from: http://www.oracle.com/us/c-central/cio-solutions/information-matters/importance-of-data/index.html [Accessed 17th September 2016]

Couture, N., (n.d.) ‘Implementing an enterprise Data Quality Strategy’. Business Intelligence Journal, vol. 18, no. 4.

David L., (n.d.) ‘Data Governance for Master Data Management and Beyond’ SAS The power to know. [Online]. Available from: http://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/data-governance-for-MDM-and-beyond-105979.pdf [Accessed 15th September 2016]

Big Data and the 3ish V’s

Data is made up of facts, figures and bits of information, but is not in itself information. When data is processed, interpreted, organised, structured or presented so that it becomes meaningful or useful, it is called information; information provides context for data.

Companies around the world are now realising that access to large amounts of raw data is a valuable commodity, and that how the data is handled is just as important. Raw data by itself isn’t valuable, but once it is processed and analysed it can be turned into information that guides companies in specific market targeting, product development, organisational changes and so on.

This data can come from different sources, but it is classified as big data when it is gathered in high volume, at great velocity (speed) and in a wide variety of forms. It is often quoted as:

“‘Big Data’ is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”

– Gartner (Gartner, Inc. (NYSE: IT) is the world’s leading information technology research and advisory company)

This introduces what many people refer to as the 3 V’s of Big Data:

  • Volume – The amount of data available in the world takes huge leaps every year, with the common theory being that “the amount of the data in the world doubles every two years”. This can be seen in the evolution of web services such as Facebook, Netflix, Amazon and Google, which offer new services, adopt existing features and open up new areas for people to upload and share their data with the world (and also with the hosting site).
  • Velocity – The speed at which new data is created, stored and analysed. This can greatly affect data management principles, as having the most current and relevant information can have drastic effects on how a company operates and how quickly it can adapt to changing trends or shifts in the market.
  • Variety – The wide array of data being collected from different sources: social websites such as Facebook and Twitter, and smartphones supplying SMS, GPS locations, photos, audio and video through the many apps used daily by a large part of the world’s population. There is also the wide use of office documents such as Word documents and spreadsheets, and of relational and unstructured databases, all of which are used in business daily. With information being created through so many different mediums, it can easily be said that the majority of the information created daily is unstructured. The question is how to make this varied data useful to organisations, and how the different kinds of data relate to each other is something that continues to be explored.
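
To get a feel for what “doubles every two years” would mean if the rule of thumb held exactly (which is an assumption, not a measured fact), the growth after a given number of years is 2 raised to the power of years divided by 2. A small Python sketch, with a function name chosen just for this example:

```python
def growth_factor(years, doubling_period_years=2):
    """How many times larger the data volume is after `years`,
    assuming it doubles exactly once every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


for years in (2, 4, 10):
    print(years, "years ->", growth_factor(years), "times the starting volume")
```

Under that assumption the volume would be 32 times larger after ten years, which shows how quickly storage and processing needs can outgrow systems sized for today’s data.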

These 3 V’s help to identify the key dimensions of Big Data, but they are certainly not the only V’s that have been associated with it: some sources add 2-3 items to the list, while others add another 11. Each addition adapts and draws on the three core elements.

Some of the most popular additions are:

  • Value – Does the data being collected have a value, and how valuable is it? Having access to data does not mean that the data has any value.
  • Viability – In an article in Wired, Neil Biehn stated that “we want to carefully select attributes and factors that are most likely to predict outcomes that matter most to businesses”. In short: how viable is the gathered information, and can that viability be assessed? With so many varieties of data and so many variables to take into consideration, this matters when building an effective predictive model.
  • Veracity – How confident can you be in the data? Is it reliable and accurate?
  • Variability – How likely is the data to change? In the case of sales, does the information you are receiving reflect seasonal trends, or could the changes be part of a temporary flux in trends (such as Ugg boots popularity* see references below)?
  • Visualization – How can the data be viewed in a manner that is easy to understand and comprehend?
  • Virality – How easily is the data used by others, and at what rate does this happen?
  • Volatility – How stable is the data? Does the information have a deadline or a best-before date after which it becomes useless or irrelevant?
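
The “best before date” idea behind Volatility can be sketched in a few lines of Python. The kinds of data and their shelf lives below are invented for the example; in practice they would be a business decision.

```python
from datetime import datetime, timedelta

# Assumed shelf lives per kind of data (hypothetical values).
SHELF_LIFE = {
    "stock_price": timedelta(minutes=15),
    "customer_address": timedelta(days=365),
}


def is_still_useful(kind, recorded_at, now):
    """Return True while the record is within its assumed shelf life."""
    return now - recorded_at <= SHELF_LIFE[kind]


# A fixed "now" keeps the example repeatable.
now = datetime(2016, 9, 12, 12, 0)

# A stock price recorded an hour ago is past its 15-minute shelf life.
print(is_still_useful("stock_price", datetime(2016, 9, 12, 11, 0), now))

# An address recorded about six months ago is still within its one-year shelf life.
print(is_still_useful("customer_address", datetime(2016, 3, 1, 9, 0), now))
```

The same check could be used to decide which records to refresh, archive or leave out of an analysis.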

Looking at these and adding my own opinion, it can be said that, along with the 3 core V’s, some of the additional V’s do have value and should be considered useful in breaking down the value of Big Data, but others could be considered unnecessary and serve simply to break the other factors down further.


References

Biehn, N. and PROS (2013) The missing V’s in big data: Viability and value. Available at: https://www.wired.com/insights/2013/05/the-missing-vs-in-big-data-viability-and-value/ (Accessed: 08 September 2016).

(No Date) Available at: http://blueshiftideas.com/reports/051504DownTrendinDeckersUGGBootsWillOnlyWorsen.pdf (Accessed: 12 September 2016).

Top 10 big data challenges – A serious look at 10 big data V’s (2014). Available at: https://www.mapr.com/blog/top-10-big-data-challenges-%E2%80%93-serious-look-10-big-data-v%E2%80%99s (Accessed: 10 September 2016).

Data vs information – difference and comparison (no date). Available at: http://www.diffen.com/difference/Data_vs_Information (Accessed: 11 September 2016).


Interesting Fact
During World War II, the crew of the British submarine HMS Trident kept a fully grown reindeer called Pollyanna aboard their vessel for six weeks (it was a gift from the Russians).