How Data Management and Governance Can Enable Successful Self-Service BI
Ironside’s Crystal Meyers was featured in Vol. 19, No. 4 of TDWI’s Business Intelligence Journal. Her points are particularly relevant as business analytics begins a more pronounced shift toward a bimodal model in 2016.
Reprinted with permission of 1105 Media, Inc. As published in TDWI’s Business Intelligence Journal.
Abstract
How can you build a dashboard quickly and efficiently without the risk of using inaccurate or outdated data? This article examines the role of data management and data governance in ensuring accurate and trustworthy reporting in a business intelligence solution. We introduce the power user model and explain how implementing data controls can strengthen the business-IT partnership.
Introduction
Consider this common scenario: the recently appointed sales director at ABC Retail wants a new dashboard. She places a request in the IT ticket system and three days later somebody schedules a requirements-gathering session. A week later there is a follow-up meeting. Two weeks after that, she is asked to approve a requirements document. A month after placing the initial request, she is given a project plan with an estimated implementation date three months down the road.
Frustrated with the pace of the project, she talks to a business analyst on her team who informs her that he can build the dashboard in two weeks with the new self-service BI tool he just downloaded to his PC. As promised, two weeks later, the analyst delivers a great-looking dashboard with accurate data; IT is written off as slow and difficult to work with, and self-service BI has saved the day.
Months later, after using the dashboard to drive strategy and send weekly updates to her superiors, the sales director notices that some of the numbers don’t make sense. She reaches out to the analyst to troubleshoot the problem, but there is no way to tell exactly when the issue began, because only the reports from the current day and each month-end are saved. All they know is that the problem started sometime during the current month.
The analyst knows nothing has changed on his end, so he starts investigating his data source: an Access database that connects to several spreadsheets, SharePoint lists, and data warehouse tables. After a few days of investigating with the help of a member of the data warehouse team, the analyst determines that one of the tables he was connecting to in his Access database was replaced with a new table and is no longer being updated. The analyst had been given access to this table in his role in a previous department; his permissions were never revoked, so he was still able to use the information. Because there was no formal access review policy in place, he was overlooked when users were informed of the change.
Obviously, many factors contributed to the data problem that occurred with the dashboard in this case. However, one common solution could have prevented all of them: an enterprise data management and governance program.
Establishing a data management and governance program allows your organization to define the policies, procedures, and standards necessary to ensure consistency in data ownership, access, quality, documentation (such as metadata, lineage, catalogs, and definitions), development and implementation processes, change management, technology standards, and data retention across the enterprise, as well as to ensure that users adhere to these policies, procedures, and standards.
With such a program in place, business users can utilize self-service BI against trusted, governed data sources with minimized risk. BI development can be moved out of IT, with the added bonus of freeing up technical resources to focus on improving the quality of the data infrastructure, data model, BI platform, and other fully technical tasks.
Data Management versus Data Governance
Before defining a data management and governance (DMG) program, we must clarify the difference between data management and data governance. Although these terms are often used together or even interchangeably, they describe two different components of an enterprise information management program. Put simply, data management comprises the policies, procedures, practices, and tools that are designed to enhance the use of data assets. Data governance is the enforcement of those policies, procedures, practices, and tools. Like the legislative and executive branches of the U.S. government, they are designed to be separate arms of a larger overall program that balance one another.
To get the most value from your enterprise information management program, both the data management and governance aspects must be addressed. It is worth noting that although many data management aspects will be addressed by technology, a DMG program should not be a technology initiative. It is the business that is ultimately going to benefit from the program, and so the business should be driving the effort to create data management policies and put governance in place.
Impact on BI
According to a 2012 study by Wayne Eckerson, BI professionals rate approximately 64 percent of self-service initiatives as having achieved only an “average” or lower level of success. A BI program, self-service or not, can fail quickly if the users don’t trust the information it provides. With self-service, end users with a wide range of technical skills and data familiarity can access, compile, and interpret data without input or interference from a centralized body (such as IT or a central BI team). At best, this can lead to inconsistencies in how metrics are defined across teams; at worst, the entirely wrong data may be used for business scenarios.
Many organizations that implement self-service BI without adequate data management and governance will find that their BI program quickly serves as a catalyst to implement DMG. Inconsistencies in reporting as well as duplicated data and reports will result in time wasted reconciling information as well as an overall mistrust of the data. Unfortunately, once users mistrust information in a report or BI application, it is incredibly difficult to regain that trust.
A DMG program should be set up before, or in parallel with, self-service BI: although DMG can exist on its own, self-service BI needs DMG to be successful. If you establish your DMG program first, you will set the stage for success for BI and other potential programs, such as master data management (MDM) and big data. DMG will augment your BI program by monitoring and improving data quality, establishing and ensuring consistency in data standards and metric definitions, and instituting a standard BI architecture as the foundation for your end users.
Data Management and Governance Scope
The scope of DMG can be a point of contention. The program itself should be considered an enterprise initiative. Its purpose is to ensure consistency across multiple areas of the business; therefore, it must be managed centrally and standardized across all business areas. Once the program standards are defined, implementing them within each business area can take a more localized approach.
Another scope question is which data outputs must adhere to the guidelines of the data management program. For example, what data delivery methods fall under the scope of an audit? The full extent of your DMG program will be dictated by the business model of your organization, but your BI program should certainly fall under this umbrella.
Managed reports, dashboards, scorecards, cubes, and analytical applications are all alternative methods of delivering data and should fall under the same DMG policies as the raw data itself. Self-service BI is an area where this line starts to blur. If a user is creating a report for their own use that they don’t plan to share with anyone else, does the report need to follow DMG standards? The answer lies in what users plan to do with the information they gather from that report. Will the data be used to drive strategy? Will it be leveraged to make decisions that could have a legal, regulatory, or other high impact on the organization? Will users rely heavily on this report to run their business? If the answer to any of these questions is yes, then the report should be governed.
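The three scoping questions above amount to a simple "any yes" rule. The following is a minimal Python sketch of that rule, not a prescribed implementation; the names ReportUse and needs_governance, and the two sample reports, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReportUse:
    """Answers to the three scoping questions for one self-service report.

    The class and field names are hypothetical; only the questions
    themselves come from the article.
    """
    drives_strategy: bool        # Will the data be used to drive strategy?
    high_impact_decisions: bool  # Legal, regulatory, or other high impact?
    business_critical: bool      # Will users rely heavily on it to run the business?


def needs_governance(use: ReportUse) -> bool:
    """Return True if the answer to any of the three questions is yes."""
    return any((use.drives_strategy,
                use.high_impact_decisions,
                use.business_critical))


# Hypothetical examples: a throwaway personal query and a dashboard
# that is used to set strategy and is relied on week to week.
personal_scratch = ReportUse(False, False, False)
weekly_sales_dashboard = ReportUse(True, False, True)

print(needs_governance(personal_scratch))        # False
print(needs_governance(weekly_sales_dashboard))  # True
```

In practice the answers would come from the report owner or data steward rather than from hard-coded values; the point is only that a single yes is enough to bring a report into scope.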
The DMG Program
What are the elements of a DMG program that will enable you to leverage self-service BI successfully? Many of them will vary based on the size and complexity of your organization, as well as the maturity and flexibility of your business model and the volume of data you manage. However, there are some basic elements of data management and governance that most organizations will want to leverage.
Data Management Elements at the Raw Data Level
To manage raw data, such as that held within operational systems, databases, data marts, or data warehouses, you will need to establish:
- Data ownership and stewardship
- Data quality standards
- Data access standards
- A common point of data access
- Data modeled into an easy-to-understand semantic layer
- Metadata and lineage
Data Management Elements at the Reporting Level
These are items you will leverage to manage metrics, reports, and other data outputs rather than the raw data itself:
- Data definitions
- Report dictionary/catalog
- Report documentation/workflows
- “Official” stamp of approval on reports
- Standard development and implementation process
Data Management Elements at Both Levels
Certain policies or procedures will address both the raw data and the data output or reporting levels. In these cases, there may be separate policies to address each level, or the same may apply to both, including:
- Access policies
- Archiving and data retention policies
- Change management
- Training plan(s)
- Naming standards
- Standard technologies
Once data management elements are in place, additional data governance elements will need to be established to measure compliance with the policies and standards that you set forth. You will need to define the roles and responsibilities of those who will interact with the data. Some of the components that should be put in place to address these needs include:
- Roles and responsibilities documentation
- Data quality monitoring
- Access reviews
- Report audits
- Auditing of metadata and lineage
Looking at this list, it is easy to see why it is often recommended that a DMG program be implemented iteratively. Many of the elements listed here will either start out very basic or may apply only to certain subsets of data, and they will grow over time—either organically with the data needs of the organization or because of a concentrated data governance effort.
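As one example of how basic a first iteration can be, an access review can start as a comparison between the department a grant was issued for and the department the user belongs to today. The sketch below assumes a hypothetical Grant record and made-up table and user names; a real review would read grants from the database or BI platform and involve the data owner in the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One access grant. All field names and values are hypothetical."""
    user: str
    table: str
    granted_for_department: str


def stale_grants(grants, current_department):
    """Return grants issued for a department the user no longer belongs to.

    `current_department` maps user -> department today. A user missing
    from the mapping is treated as having no department, so all of that
    user's grants are flagged for review.
    """
    return [
        g for g in grants
        if current_department.get(g.user) != g.granted_for_department
    ]


# Hypothetical data: an analyst who moved from finance to sales but
# kept a grant on a finance table.
grants = [
    Grant("analyst", "finance.orders_legacy", "finance"),
    Grant("analyst", "semantic.sales_facts", "sales"),
]
current_department = {"analyst": "sales"}

for g in stale_grants(grants, current_department):
    print(f"review: {g.user} still has access to {g.table}")
```

Flagged grants are candidates for review, not automatic revocation; the sketch only surfaces them so that someone responsible for the data can decide.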
Overcoming Limitations with the Power User Model
Proper data management and governance will help you lay the foundation for a successful self-service BI program. However, implementing self-service BI can be an intricate project, with several other factors contributing to its success. Differing levels of user ability and desired interaction with the data are a big factor, as is the training required to learn the self-service tool. Is it realistic for 500 users in an organization to familiarize themselves with DMG standards and ensure that every self-service report or query they create meets those standards, in addition to learning how to use the BI tool that will allow them to create those reports or queries? Unlikely. A more feasible solution is for a group of power users within each business line or department to help other end users with self-service requests when they lack technical prowess or time.
These power users should be fully trained in the organization’s DMG standards. Ideally, each business line will have a data steward who will work closely with these power users to help them develop a comprehensive understanding of the data and metrics used within their functional areas. The power user model eliminates the need to deploy full self-service capabilities to all 500 users. Instead, general users can be given capabilities that avoid the risks associated with manipulating data, such as combining elements from existing reports into personalized, custom dashboards, or running queries and reports with prompts that allow them to enter parameters that guide the output of the report. For needs beyond these capabilities, general users would work with the power users in their areas.
Another factor in the success of self-service BI is the business-technology partnership. Both data management and business intelligence are programs that bring value to the business and require executive sponsorship and support to be successful. However, the ongoing support and maintenance of the programs heavily involve IT. Without a strong partnership and commitment to maintaining that relationship from both sides, failure is likely. The power user model can also help maintain this partnership, as power users (individuals with an understanding of and appreciation for both the business and technology aspects) can serve as a bridge between the two groups. They can help communicate business needs to the technology teams as well as serve as evangelists of the technology to the business.
Finally, although you want your DMG program to be encompassing and robust enough to put the proper controls in place and minimize risk, you also want to ensure that you can maintain flexibility when necessary. One way to address flexibility needs in BI is with data sandboxes. Sandbox data typically does not reach the same level of governance as production data, although it should still be held to high standards. Sandboxes provide an area for users to explore their data, use it in prototypes, and determine how, or even if, it adds value to the production environment.
Putting Theory into Practice
Revisiting our initial scenario, let’s see what happens if we apply some of the DMG elements just outlined.
This time, because self-service BI is in place at ABC Retail, our sales director knows she doesn’t need to go to IT with her dashboard request. Instead, she goes directly to her business analyst, one of the designated power users in her area. Because IT is no longer focusing so much effort on report development, they have been able to create a semantic layer within the enterprise data warehouse consisting of fact and dimension tables containing all of the metrics the business has defined in its enterprise data definitions.
The sales business analyst, having been subject to monthly access reviews, has access only to this semantic layer, where any changes to the underlying data tables are transparent to end users. Thus, when the decision was made to stop updating the table that was originally used to build the dashboard, any objects in the semantic layer referencing that table would have been automatically updated to the replacement table and fully regression tested per the organization’s change management policy. All relevant stakeholders would have been involved in this change process, so even though the change is transparent to the business analyst, the sales data steward is well aware of the fact that the underlying tables were changed and was involved in the testing process.
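The indirection that makes this change transparent can be pictured as a lookup from a stable logical name to whatever physical table currently backs it. The sketch below is a simplified Python illustration under that assumption; SemanticLayer, dashboard_query, and the table names are hypothetical, and a real semantic layer would typically be implemented with database views or the BI platform's modeling layer rather than application code.

```python
class SemanticLayer:
    """Maps stable logical names to the physical tables behind them.

    A simplified, hypothetical stand-in for a semantic layer.
    """

    def __init__(self):
        self._physical = {}

    def bind(self, logical_name, physical_table):
        """Point a logical name at a physical table (or re-point it)."""
        self._physical[logical_name] = physical_table

    def resolve(self, logical_name):
        """Return the physical table currently behind a logical name."""
        return self._physical[logical_name]


def dashboard_query(layer):
    """Build the dashboard's query using only the logical name."""
    table = layer.resolve("sales_facts")
    return f"SELECT region, SUM(amount) FROM {table} GROUP BY region"


layer = SemanticLayer()
layer.bind("sales_facts", "dw.sales_v1")   # hypothetical original table
before = dashboard_query(layer)

# The warehouse team replaces the table. Only the binding changes;
# the dashboard code above is untouched.
layer.bind("sales_facts", "dw.sales_v2")   # hypothetical replacement table
after = dashboard_query(layer)

print(before)
print(after)
```

The dashboard never names a physical table, so the re-binding, once regression tested under the change management policy, reaches every dependent report at the same time.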
In this situation, we have the same benefit of getting the dashboard built quickly and efficiently. However, the risks of the data becoming corrupted, outdated, or inaccurate are eliminated in this case. We also have the added benefit of strengthening the business-technology partnership, as the sales director is given no reason to believe that IT is difficult to work with, but instead sees IT as enabling the power users across the organization to provide the most effective information to the right people in the most proficient manner.
References
Eckerson, Wayne. 2012. “Business-Driven BI,” BeyeRESEARCH, September. http://www.beyeresearch.com/study/16441
Ladley, John. Data Governance: How to Design, Deploy, and Sustain an Effective Data Governance Program, Morgan Kaufmann.