By 2022, 35% of large organizations will be either sellers or buyers of data via formal online data marketplaces, up from 25% in 2020. With AI and ML supplementing existing data sources, there is always more value to be derived from large quantities of data.

For years, the data management industry has been talking about the ever-growing volume, velocity, and variety of data. For traditional analytics, the challenge has been how to reduce the data used in reporting and BI: how to separate the signal from the noise, how to prioritize the most relevant and accurate data, and how to make a company’s universe of data usable to an increasingly self-service user population. This notion of having too much data is well-founded, since much of the data in an organization isn’t readily useful for traditional analytics. Data may be incomplete, inaccurate, too granular, unavailable, or simply not useful for a particular use case. In implementing AI and ML, however, it turns out that having more data, from as many sources as possible, is one of the most important ingredients in building a successful model.

In traditional analytics, the user decides which data is most useful to their analysis and, in so doing, taints their results through their own intentional omissions and unintentional biases. But in AI/ML, and especially when we’re leveraging Automated Machine Learning (AML) technologies, we really can’t have too much good data. We can throw massive amounts of data at the problem and let AML ascertain what’s relevant and helpful and what isn’t. We want lots of data, and unfortunately we usually don’t have enough.

In a recent project, we met a customer who, like most, believed they had all the data they needed to accurately predict insurance loss risk: they knew their customers, their properties, various demographics, payment histories, and so on. So we built a loss prediction model for them and got good results. The customer was very pleased.

Then we decided to train the model on a combination of internal and third-party data to see whether there would be a difference. We loaded several data sets that significantly enriched the customer’s already voluminous customer and property data. The result was a 25% increase in the efficacy of the AI model, which, as any data scientist will tell you, is a massive improvement. And the cost of that data was a drop in the bucket relative to the larger budget.
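To make the enrichment step concrete, here is a minimal sketch of joining third-party attributes onto internal records before training. Everything in it is an illustrative assumption rather than the project's actual data or pipeline: the field names (`property_id`, `flood_zone`, `km_to_fire_station`), the sample values, and the `enrich` helper are all hypothetical.

```python
# Hypothetical internal records, keyed by a property identifier.
internal_records = [
    {"property_id": "P-001", "late_payments": 0, "prior_claims": 1},
    {"property_id": "P-002", "late_payments": 3, "prior_claims": 0},
    {"property_id": "P-003", "late_payments": 1, "prior_claims": 2},
]

# Hypothetical third-party attributes for some of the same properties.
third_party_by_property = {
    "P-001": {"flood_zone": "high", "km_to_fire_station": 7.5},
    "P-002": {"flood_zone": "low", "km_to_fire_station": 1.2},
}


def enrich(records, external_by_key, key="property_id"):
    """Left-join external attributes onto internal records by a shared key."""
    # Collect every external field name so all output rows share one schema.
    external_fields = sorted(
        {field for attrs in external_by_key.values() for field in attrs}
    )
    enriched = []
    for record in records:
        match = external_by_key.get(record[key])
        row = dict(record)  # copy, so the internal data is left untouched
        for field in external_fields:
            # Unmatched records get None so a model can handle the gap explicitly.
            row[field] = match.get(field) if match else None
        row["has_external_match"] = match is not None
        enriched.append(row)
    return enriched


enriched_records = enrich(internal_records, third_party_by_property)
for row in enriched_records:
    print(row)
```

Keeping every internal record, matched or not, means the same model can be trained on the internal-only columns and on the enriched columns, so the two results can be compared on identical rows.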

My message to customers facing these issues has evolved; I now encourage them to seek out more data than they already have. The inclusion of external data at marginal cost can drive substantial improvements in the quality of models and outputs. And many data vendors have made it easier to test, acquire, and parse data to find where it is most impactful. The bottom line is that, in the area of AI, more is definitely better, and you can never be too data-rich.

Ironside and our partner Precisely recently published a white paper on data enrichment for data science, which you can download here.

Ironside, an Enterprise Data and Analytics firm, was featured in a Wall Street Journal article highlighting AI consultants who enable their clients to be self-sufficient with AI, rather than relying on their consulting counterparts to manage the model. A key part of Ironside’s strategy is maintaining a broad portfolio of technology partners and seeing how they fit together to provide more value to our mutual customers. Two partners that fit together well are Precisely, an industry-leading data provider, and AWS, an industry-leading cloud provider. Precisely and AWS are already technology partners: Precisely Data is offered through the AWS Marketplace and AWS Data Exchange. Ironside saw an additional opportunity for them to work together, and Precisely was very interested in moving forward.

Drawing on its combined expertise in Amazon QuickSight and Precisely’s expertly curated location data, Ironside designed an enhanced business user experience for Amazon QuickSight customers. In this solution, Ironside leveraged Amazon QuickSight’s ability to ingest multiple data sources (such as Precisely Points of Interest data stored in S3, Redshift, or Snowflake) alongside customer data repositories to create enriched Insights.

Exhibit A: Precisely Points of Interest Data providing Context to Amazon QuickSight Bike Sharing Ridership Dashboard
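As a rough illustration of how points-of-interest data can add context to ridership data, the sketch below counts the points of interest within walking distance of each bike station. It is not the Ironside solution or a QuickSight API: the station and POI coordinates, the 0.5 km radius, and the helper names are hypothetical, and in practice this kind of join would typically happen in the data source (for example, in SQL) before the dashboard reads it.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical bike-share stations (customer data).
stations = {
    "A": (42.3601, -71.0589),
    "B": (42.3702, -71.0590),
}

# Hypothetical points of interest (third-party data).
points_of_interest = [
    {"name": "cafe", "lat": 42.3605, "lon": -71.0585},
    {"name": "bakery", "lat": 42.3598, "lon": -71.0592},
    {"name": "museum", "lat": 42.3700, "lon": -71.0589},
]


def count_nearby_pois(stations, pois, radius_km=0.5):
    """Count the POIs within radius_km of each station."""
    counts = {}
    for station_id, (lat, lon) in stations.items():
        counts[station_id] = sum(
            1
            for poi in pois
            if haversine_km(lat, lon, poi["lat"], poi["lon"]) <= radius_km
        )
    return counts


poi_counts = count_nearby_pois(stations, points_of_interest)
print(poi_counts)
```

A per-station count like this can then sit next to ridership figures as an extra column, which is the sort of context the exhibit describes.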

Ironside’s Practice Lead for Business Intelligence, Scott Misage, shared, “As the diversity and volume of data increase, organizations need to find ways to harness this data explosion and find pathways to bringing additional insights and intelligence to visualizations and dashboards. By leveraging the capabilities of these two complementary partners, it’s easier than ever for organizations to accelerate time to value with analytics.”

For Precisely’s Sales and Consulting teams, the Amazon QuickSight environment created by Ironside will provide a user experience for demonstrating how Precisely data can enhance a customer’s AWS data platform strategy. Matt Reaves, Vice President, Channel Sales at Precisely shared, “Ironside’s vision and expertise in execution has given our customers and sales teams great tools to showcase how Precisely Data can increase the value of their adoption of AWS. Ironside’s work with the QuickSight platform helps to demonstrate context for our multiple datasets.”

Precisely’s Amazon QuickSight environment is supported by Ironside’s Managed Service team with additional expertise from their Business Intelligence Practice. As new features in Amazon QuickSight are made available, Ironside works to incorporate them into the environment and assists the Precisely teams with understanding how these releases impact a customer’s use of Precisely Data for enriched Insights.

ABOUT IRONSIDE
Ironside helps companies translate business goals and challenges into technology solutions that enable insightful analysis, data-driven decision making and continued success. We help you structure, integrate and augment your data, while transforming your analytic environment and improving governance.

ABOUT PRECISELY
Precisely is a new company with a remarkable heritage. We were formed when Syncsort and Pitney Bowes Software & Data combined, bringing together decades of experience and expertise in handling, processing and transforming data. Precisely data integration, data quality, location intelligence, and data enrichment products power better business decisions to create better outcomes.

ABOUT AWS
Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud platform for 14 years. AWS offers over 175 fully featured services for compute, storage, databases, networking, analytics, robotics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, virtual and augmented reality (VR and AR), media, and application development, deployment, and management from 77 Availability Zones (AZs) within 24 geographic regions, with announced plans for nine more Availability Zones and three more AWS Regions in Indonesia, Japan, and Spain. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—trust AWS to power their infrastructure, become more agile, and lower costs.