Stop the Wrangle Over Data Wrangling


According to The Economist, data is the new oil: it is now the world’s most valuable resource. The sheer volume of data that organizations can capture, store, and analyze has changed how they approach innovation, and analytics has become a true competitive differentiator.

Unfortunately, business analysts, data scientists, and other line-of-business users performing self-service analytics spend the majority of their time preparing data for analysis rather than garnering and sharing the insights to be found in it (1), even with the help of self-service data prep tools like Alteryx, Trifacta, and Tableau’s Maestro (coming soon).

Based on challenges that we have seen our clients struggle with, we have identified some common missteps to watch for and resolve in order to maximize the value of your self-service analytics rollout and reduce time to insight.

Are you making any of these five mistakes? Each one could be driving your analysts to devote additional time and resources to data wrangling, detracting from the value of their analytics.

1. Restricting or obfuscating access to specific tools or for specific groups.


Business analysts are under pressure to produce insights amid rising demand for innovation, increasingly complex business operations, growing urgency, and exponentially growing volumes of available data. Most of the time they must do so without IT assistance because of time and priority constraints, so they look to tools that let them acquire, integrate, and transform the data they need to answer their business questions. If a central team controls who can use these tools and what data they can reach, and exercises that control by restricting groups or tools it deems unworthy, the only result will be increased time to insight, because these analysts will find another way to get what they need. Their workaround will likely be less efficient, less scalable, and riskier than simply providing the requested tools and access.

When it comes to self-service, an approach built on collaboration between business analysts and the central teams overseeing reporting, analytics, and data management will produce the best results. With visibility into what business analysts are working to accomplish, IT teams can better focus their priorities on activities that may have more business impact.

2. Not cataloging your data (all of it).

A data catalog allows your organization to keep track of all of its data assets, regardless of where they live and what structure they have. Analysts don’t discriminate: just because data is not part of a data warehouse or enterprise star schema model does not mean that it won’t be the key to critical insights. A robust data catalog should be the first stop for any analyst starting out with a new business question. By capturing key metadata about each data set, your organization gives insight not only to the analyst, but also to those working with that analyst, so that everyone better understands what next steps to take in the data management process for that data set.
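To make the idea concrete, here is a minimal sketch of what a catalog entry and a keyword search over a catalog might look like. It is an illustration only, not how any particular catalog product works: the class names, field names, and sample data sets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata about one data asset (all field names are illustrative)."""
    name: str
    location: str      # where the data lives, e.g. a system or shared drive
    structure: str     # e.g. "star schema", "csv", "json"
    description: str = ""
    tags: set = field(default_factory=set)


class DataCatalog:
    """A toy in-memory catalog; a real one would persist its entries."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Any asset can be registered, modeled or not.
        self._entries[entry.name] = entry

    def search(self, keyword):
        # Case-insensitive match on name, description, or exact tag.
        kw = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if kw in e.name.lower()
            or kw in e.description.lower()
            or kw in {t.lower() for t in e.tags}
        )


catalog = DataCatalog()
catalog.register(CatalogEntry(
    "sales_orders", "warehouse", "star schema",
    "Modeled order facts", {"sales"}))
catalog.register(CatalogEntry(
    "web_clicks", "shared drive export", "csv",
    "Raw clickstream, not yet modeled", {"marketing", "web"}))

print(catalog.search("web"))
```

The point of the sketch is that the unmodeled CSV export is just as discoverable as the warehouse table, which is what lets an analyst start from the catalog rather than from whatever data they happen to know about.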

3. Not implementing a data stewardship program.

Although the objective of self-service data preparation and analytics is ultimately to minimize the need for outside intervention, there will be times when a data catalog or automated tool does not provide all of the information an analyst needs about a particular data set. When questions arise that cannot be answered by a catalog or other on-hand resources, analysts need to know where to go with their questions or concerns about the access, quality, completeness, frequency, and overall integrity of the data. A data stewardship program gives them that point of contact.

4. Not staging your data for prototyping and analytics prior to undertaking data modeling efforts. 


The volume of data that could be useful in analytics is simply too high, and traditional data warehousing activities too costly and time-intensive, to support including all data in a fully modeled enterprise data warehouse. Investing considerable effort in fully incorporating new data into enterprise star schema models for use in analytics and prototyping could prove to be a waste of valuable effort: the data may not be as useful as originally believed, or the use case may change even before the data integration activities are complete. By focusing on the minimum viable state of the data necessary to support the analytic use case, you can greatly reduce time to insight without compromising governance capabilities. As the analytics are proven and refined, the data can progress along the data management life cycle.

5. Not providing a path to operationalizing analytics and prototypes with demonstrated value.

This mistake may not affect the initial analytics. However, a seamless integration path from prototype and analysis to repeatable, shared reporting makes it faster to operationalize and distribute the valuable analytics that business analysts create in a self-service manner. Tremendous value comes from a continuous process of data management and analytics that does not have to be redesigned or redeveloped in order to scale out to the larger organization. One example is using a desktop-based self-service data preparation tool that also offers a server product with additional automation and integration capabilities, and then leveraging the workflows analysts created during the self-service process as requirements and initial design for future integration and modeling efforts.

With the right approach, not only to self-service analytics and data preparation but also to the data management that enables them, you can help drive your organization’s success with modern analytics.

 



References

1 Zaidi, Ehtisham, et al. Market Guide for Data Preparation. Gartner, 2017, www.gartner.com/doc/3838463/market-guide-data-preparation.

 

 

About Ironside

Ironside was founded in 1999 as an enterprise data and analytics solution provider and system integrator. Our clients hire us to acquire, enrich and measure their data so they can make smarter, better decisions about their business. No matter your industry or specific business challenges, Ironside has the experience, perspective and agility to help transform your analytic environment.