Five Essential Capabilities: Master Data Management
This is part four of our five-part series on the essential capabilities of the competitive data-driven enterprise.
Businesses have been deploying enterprise data governance (defining what the data should be) and master data management (ensuring the data is as defined) programs for decades. Even if your company doesn’t have a formal master data management program by name, chances are good that it is doing some form of master data management in its data warehouse, CRM, or ERP systems. As the trend toward decentralized data analysis continues, we see several forces in play that make the case for incorporating a master data management capability into your organizational roadmap:
The Power of a Single View
Organizations have acknowledged the benefits of bringing together all of their data from all of their disparate systems to maximize their data-driven problem-solving potential, identify new business opportunities, and increase the accuracy of machine learning models. This single view not only powers more accurate data analysis, but is also flexible enough to drive your operational business processes.
Increased Machine Learning Potential
Businesses that are investing in data science and machine learning are quickly realizing that some of their greatest opportunities for optimization are stifled by a lack of usable machine learning training data. This is most often due to insufficient data collection, variable execution of the business process (e.g. human error, governance), or both. An MDM program can benefit your data science initiatives by applying to your data the taxonomies and hierarchies needed to power machine learning.
The Democratization of Data Analysis
Domain and subject matter experts are becoming increasingly responsible for developing their own data value hypotheses (and modeling their own data), and they are doing so autonomously with highly accessible and capable self-service analytical tools. Often the most difficult, time-consuming, and error-prone data wrangling task is entity resolution: standardizing, deduplicating, cleansing, and keying data so that it resolves seamlessly down to the unique entity at the center of an analysis (e.g. Customer, Location, Product, Employee). In this context, mastered data is often vastly easier to blend, analyze, interpret, and trust, which reduces the time and cost for an individual to derive insight from business data.
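To make the entity resolution steps concrete, here is a minimal Python sketch of standardizing, keying, and deduplicating customer records. The field names (name, email, source) and the sample records are hypothetical, and the matching is deliberately simple: two records resolve to the same entity only when their standardized name and email are identical. Production MDM platforms typically layer fuzzy or probabilistic matching and steward review on top of this.

```python
import hashlib
import re


def standardize(record):
    """Normalize case, punctuation, and whitespace so equivalent values compare equal."""
    name = re.sub(r"[^a-z0-9 ]", "", record["name"].lower())
    name = re.sub(r"\s+", " ", name).strip()
    email = record["email"].strip().lower()
    return {"name": name, "email": email}


def entity_key(std):
    """Derive a stable surrogate key from the standardized match fields."""
    raw = f"{std['name']}|{std['email']}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


def resolve(records):
    """Group raw records by entity key; each group represents one resolved entity."""
    entities = {}
    for rec in records:
        key = entity_key(standardize(rec))
        entities.setdefault(key, []).append(rec)
    return entities


# Hypothetical records arriving from two source systems.
records = [
    {"name": "Acme Corp.", "email": "Info@Acme.com ", "source": "CRM"},
    {"name": "ACME  corp", "email": "info@acme.com", "source": "ERP"},
    {"name": "Globex LLC", "email": "sales@globex.com", "source": "CRM"},
]

resolved = resolve(records)
for key, group in resolved.items():
    print(key, [r["source"] for r in group])
```

In this sketch the two Acme records collapse to a single key even though they differ in case, punctuation, and spacing, which is the kind of blending work a mastered dataset spares the individual analyst.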
The Limits of Data Integration Silos
When master data management is implemented within specific functional business systems (e.g. CRM), it can limit the efficacy of the program because:
- Access to these systems is not always universal
- Not all relevant business data is integrated into such systems
- These systems are often not designed with universal data integration in mind, and the cost to master or integrate data that is not native to the business processes they manage can be very high, both in initial development and in accumulated long-term technical debt
The Importance of Data Privacy
As data privacy regulation is rolled out, organizations will be required to manage how key customer data is used in the business. They will need to manage and track consent and usage across all sources and ensure that information is only being utilized for purposes that were authorized by its owner. In addition, concepts such as data obfuscation and masking could enable broader business innovation through both internal and external crowd-sourcing.
Take, for example, Numerai, a hedge fund that has encrypted sensitive elements of the training data they use to power their trading algorithms and then published that data as part of an ongoing Kaggle-style data science competition, where anyone can compete to improve their performance and earn financial rewards. They’ve successfully used advanced data privacy techniques to both negate potential bias and crowd-source the engine that runs their business without revealing any of their most valuable intellectual property. What could your business do if it viewed data privacy as more than just risk management?
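As a much simpler illustration of the consent tracking and obfuscation ideas discussed in this section, the Python sketch below pseudonymizes a customer identifier, masks an email address, and releases a record only for purposes its owner has authorized. It is not a description of Numerai's technique. The field names, purpose labels, and secret key are all hypothetical, and a real program would keep keys in a managed secrets store and consent in a governed system of record.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; real keys belong in a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value, key=SECRET_KEY):
    """Keyed one-way hash: the same input always yields the same token, so joins still work."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask_email(email):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}{'*' * max(len(local) - 1, 0)}@{domain}"


def share_safe(record, consented_purposes, purpose):
    """Return an obfuscated copy only if the owner consented to this purpose."""
    if purpose not in consented_purposes:
        return None
    return {
        "customer_id": pseudonymize(record["customer_id"]),
        "email": mask_email(record["email"]),
        "lifetime_value": record["lifetime_value"],
    }


# Hypothetical customer record and consent list.
customer = {"customer_id": "C-1001", "email": "jane@example.com", "lifetime_value": 1250.0}
consent = {"analytics"}

print(share_safe(customer, consent, "analytics"))
print(share_safe(customer, consent, "marketing"))
```

Because the pseudonym is deterministic, analysts can still join and count by customer without ever seeing the underlying identifier, while the purpose check keeps usage within what was authorized.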
Master Data Management Essentials
If you agree that decentralized data analysis is the model toward which your organization is most likely to keep evolving, then a centralized master data management capability is a meaningful way to reduce the organization’s overall time and cost to produce valuable insight from data: it maximizes the efficacy of enterprise data assets and unburdens knowledge workers at the edge. Most centralized MDM programs, and the platforms that succeed in this pursuit, will demonstrate some or all of the following elements:
- Extends Existing Data Governance Program and Tools
- Cleansing and Correction of Erroneous Data
- Data Quality Monitoring and Reporting
- Business Taxonomy and Hierarchy Management
- Concept Standardization (e.g. Address)
- Deduplication, Matching and Unique Keying
- Easily Integrates and Coexists with Existing Business Systems (ETL, DW, CRM, ERP)
- Provides Interfaces for Workflows, Approvals, and Human Classification Task Delegation
- Provides Graph-based Data Structures for Relationship & Network Analysis
- Enables Data Obfuscation and Privacy Regulatory Compliance
In addition to classic steward-driven MDM approaches to standardizing corporate data, new options have emerged from the data science field for outsourcing the labor of creating good training data for machine learning models. We recommend considering these alternatives as well, since they can accelerate the creation of training data for time-sensitive business opportunities.
Next Up: Elastic Data Processing & Storage
Elastic Data Processing & Storage is the concept of how the cloud changes the economics of data storage and processing so that an organization can reduce the time and cost of problem solving. Moving your analytic query, data processing, and storage workloads to elastic infrastructure can deliver many benefits for the business. There are several features to consider when deciding how to implement this capability.
About Ironside
Ironside was founded in 1999 as an enterprise data and analytics solution provider and system integrator. Our clients hire us to acquire, enrich and measure their data so they can make smarter, better decisions about their business. No matter your industry or specific business challenges, Ironside has the experience, perspective and agility to help transform your analytic environment.