IDC has been using the phrase “data intelligence software” to describe a category of capabilities that provide intelligence about data, and the term “data intelligence” has caught on in the industry. But not all definitions of data intelligence are equal. Let’s take a closer look at how IDC defines the term, and some permutations that have emerged.
IDC’s definition of data intelligence is:
“Data intelligence leverages business, technical, relational and operational metadata to provide transparency of data profiles, classification, quality, location, lineage and context; Enabling people, processes and technology with trustworthy and reliable data.”
Data intelligence is intelligence about the data, as informed by metadata. Data intelligence is not intelligence informed by the data. As defined in several IDC publications, data intelligence helps organizations answer six fundamental questions:
- Who is using What data?
- Where is data, and where did it come from (lineage and provenance)?
- When is data being accessed, and when was it last updated?
- Why do we have data? Why do we need to keep (or discard) data?
- How is data being used, or perhaps more specifically – how should data be used?
- Relationships – what relationships are inherent within data and with data consumers?
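As a rough illustration of how those six questions can be answered from metadata alone, here is a minimal Python sketch. The `DatasetMetadata` record, its field names, and the sample values are all hypothetical and are not drawn from IDC's taxonomy or from any vendor's product; the point is only that every answer comes from information about the data, never from the data itself.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class DatasetMetadata:
    """Hypothetical catalog entry; every field name is illustrative."""
    name: str                   # what data
    users: List[str]            # who is using it
    location: str               # where the data lives
    upstream: List[str]         # where it came from (lineage and provenance)
    last_accessed: datetime     # when it was last accessed
    last_updated: datetime      # when it was last updated
    purpose: str                # why the data is kept
    permitted_uses: List[str]   # how the data should be used
    related: List[str] = field(default_factory=list)  # relationships


def describe(entry: DatasetMetadata) -> Dict[str, str]:
    """Answer the six questions using only the metadata record."""
    return {
        "who_what": f"{', '.join(entry.users)} use {entry.name}",
        "where": f"{entry.location}, derived from {', '.join(entry.upstream)}",
        "when": (
            f"accessed {entry.last_accessed:%Y-%m-%d}, "
            f"updated {entry.last_updated:%Y-%m-%d}"
        ),
        "why": entry.purpose,
        "how": ", ".join(entry.permitted_uses),
        "relationships": ", ".join(entry.related),
    }


# Sample values are invented for the example.
orders = DatasetMetadata(
    name="orders",
    users=["finance_team", "marketing_team"],
    location="warehouse.sales.orders",
    upstream=["crm_orders", "erp_invoices"],
    last_accessed=datetime(2020, 1, 15),
    last_updated=datetime(2020, 1, 10),
    purpose="revenue reporting",
    permitted_uses=["aggregate reporting"],
    related=["customers"],
)
answers = describe(orders)
```

Note that `describe` never reads a single row of the `orders` data; that is the distinction between intelligence about the data and intelligence from the data.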
I first came across this idea of data intelligence in 2016 when I was part of a webinar with ASG Technologies, who had started to use the term “Enterprise Data Intelligence” to refer to the suite of products they had for capturing and managing data lineage within an organization. We simply started calling it “data intelligence,” because what we can learn from metadata applies at all levels of the enterprise. It is also shorter by 10 characters, which is important for those of us who speak in hashtags and tweets.
Data Governance is Enabled by Data Intelligence Software
As I received more and more inquiries about where to buy a data governance solution, I had to tell clients that data governance is an organizational discipline with multiple dimensions: people, process and policy, supported by technology. Data intelligence software is part of the technology required to enable and support data governance disciplines.
Data intelligence informs data governance so that security and compliance risks can be lowered. It also adds value to organizations by enabling self-service analytics and decision support, by contributing to data literacy improvement and data quality management, and by providing context to machine learning (ML) and artificial intelligence (AI) to help avoid garbage-in, garbage-out situations.
Data intelligence is also beginning to help organizations understand how they work with data, and it promises to improve organizational efficiencies. In 2019, a survey of data workers found that we now spend 90% of our time looking for, preparing and protecting data, while only 10% of our time is spent in data analysis.
Data intelligence software as defined by IDC includes data definition, profiling and stewardship; master data intelligence (where are the systems of record, reference and entry, and what are the match and survivorship rules); data cataloging; and data lineage functionality. Within the IDC software market taxonomy, we count data intelligence software inside the data integration and intelligence software market, and as such it is primarily focused on intelligence about structured and semi-structured data. Data intelligence could also bring metadata about unstructured data into scope, which, when combined with metadata about structured data, would bring new opportunities for integration, analysis and insight.
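To make the data lineage piece concrete, here is a minimal Python sketch that treats lineage as metadata: a map from each dataset to its direct upstream sources, walked to find everything a given dataset ultimately depends on. The dataset names and the `upstream_sources` helper are hypothetical, and this is only one simple way to model lineage, not how any particular product implements it.

```python
from typing import Dict, List, Set

# Hypothetical lineage metadata: each dataset maps to its direct upstream sources.
LINEAGE: Dict[str, List[str]] = {
    "sales_dashboard": ["sales_mart"],
    "sales_mart": ["crm_orders", "erp_invoices"],
    "crm_orders": [],
    "erp_invoices": [],
}


def upstream_sources(dataset: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Return every dataset that feeds `dataset`, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(lineage.get(dataset, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        stack.extend(lineage.get(current, []))
    return seen


dashboard_sources = upstream_sources("sales_dashboard", LINEAGE)
```

A traversal like this is what lets a catalog answer "where did it come from?" for a report several hops away from its systems of record.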
Data Intelligence Definitions in the Market
Data intelligence has now caught on in the market. Technology buyers have responded positively in that they have a better understanding of what metadata is and how it can add value and reduce risk. Many software vendors with data intelligence capabilities have latched onto the term, including:
- Alation
- ASG Technologies
- Collibra (formerly the data governance company, now calling themselves “the data intelligence company”)
- erwin, Inc.
- Experian
- Infogix
- Informatica
- Qlik
- Reltio
- Talend
- Unifi Software
This is not an exhaustive list, but it demonstrates data intelligence’s momentum, as many of these suppliers are among the top 20 vendors in the data integration and analytics markets.
In every family or circle of friends, there are one or two who just must be different. This is the case with the definitions of data intelligence that have emerged from SAP and Teradata, where data intelligence is still about the metadata but is oriented towards the outcomes of data science and machine learning. In SAP’s case, it is about building pipelines in SAP Data Hub connected to a data science workbench, and operationalizing machine learning algorithms with trusted data in context and with confidence.
In Teradata’s case, it is about applying data science and machine learning to data in the enterprise data warehouse, without needing to take it out into a data lake or sandbox. In both of these cases, data intelligence seems to be more about intelligence from the data, not intelligence about the data. Everyone has the freedom to make their own choices and definitions – SAP and Teradata have expanded the definition, perhaps to add more value into the equation, or perhaps they were looking for a shorter hashtag for machine learning and artificial intelligence?
Recently I presented my perspectives on data intelligence at the 13th annual MIT CDOIQ Symposium at the Massachusetts Institute of Technology in Cambridge, MA, “Evolving Data Intelligence for Organizational Performance.” The conference brought together data executives to discuss and learn about advances in data management, governance and enablement practices, methods, technologies and organizational constructs.
In addition to receiving positive feedback from attendees on my definition and perspectives on data intelligence, I heard the same perspectives echoed in keynote sessions delivered by chief data officers from the healthcare, public sector, defense and financial services industries. Unless an expanded definition of data intelligence takes hold in the market, IDC will continue to hold onto our definition because it provides us with a taxonomy by which we can size and forecast this software market. Our definition has been embraced by the financial markets on both the sell and buy sides, and it provides a cornerstone for the future of intelligence research programs that IDC will be focused on going forward.
Explore the future of data intelligence with IDC’s FutureScape: Worldwide Data, Integration, and Analytics 2020 Predictions.