Organizations today are excited about the value and potential power of data, and most are focused on information management, quick wins, agility and fluidity. The new paradigm is to ingest data and, on the fly, do whatever they want with it. Unfortunately, leveraging data is not as simple as that. There are key building blocks and behind-the-scenes practices that prepare data to deliver quick wins and enable fluidity, and these are too often ignored.
Taxonomy and data ontology are key examples. Without proper definition and classification of data elements, components and artifacts, and of their relationships, organizations will grapple with making sense of their data, managing and processing it, reporting information, and engaging with customers across departments and across the market.
For example, a bank may not have standardized definitions of what constitutes a customer and the different types of customers: personal, private, business, or corporate. This could lead to inaccurate customer reporting, missed cross-sell or upsell opportunities, and a poor customer experience when the contact center tries to sell a product to a customer who already has it.
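As a minimal, hypothetical sketch (the type names and aliases below are illustrative, not any bank's actual model), a standardized customer taxonomy can be expressed as a controlled vocabulary that every source system maps its own labels onto:

```python
from enum import Enum

class CustomerType(Enum):
    """One agreed definition per customer type (hypothetical labels)."""
    PERSONAL = "personal"    # individual retail customer
    PRIVATE = "private"      # private banking / high-net-worth individual
    BUSINESS = "business"    # small or medium enterprise
    CORPORATE = "corporate"  # large corporate or institutional client

def classify(segment_label: str) -> CustomerType:
    """Map free-text labels from source systems onto the shared taxonomy."""
    aliases = {"retail": CustomerType.PERSONAL, "sme": CustomerType.BUSINESS}
    normalized = segment_label.strip().lower()
    if normalized in aliases:
        return aliases[normalized]
    return CustomerType(normalized)  # raises ValueError for labels outside the taxonomy

print(classify("Retail"))     # CustomerType.PERSONAL
print(classify("corporate"))  # CustomerType.CORPORATE
```

With a single agreed vocabulary like this, the contact centre, reporting and cross-sell processes all classify the same customer the same way.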
In some industries, common classification models are increasingly used to eliminate confusion, reduce costs, and support data-driven projects, including process automation, machine learning applications and artificial intelligence. Common standards and agreements underpin fields such as accounting and chemistry, where accuracy and consistency are crucial.
In the same way, data architecture must align with taxonomy and ontology standards across industries and within organizations. Many never get there: data projects are often complex and time-consuming, and many companies don't prioritize and formalize their data organization upfront, resulting in wasted time and resources later.
Putting the ontology in place
A taxonomy classifies data into categories, each with common or distinct definitions, terminology and semantics. It is a structured set of component definitions for data, processes and systems, and it is captured in a business glossary, a data dictionary and a metadata repository, which together serve as the central point of reference for an ontology.
An ontology organizes data and process elements, components, artifacts, definitions and everything else related to data; it assigns data (content) or placeholders (metadata) and relates or contextualizes them so that consuming systems and users can make sense of them for business applications. The ontology enables easier selection and distribution of information to the right channels and better use of PPT resources (people, processes, technology). It is a key enabler of data governance.
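To make the distinction concrete, here is a small, hypothetical sketch (illustrative terms only): the taxonomy supplies the agreed definitions in a glossary, and the ontology adds typed relationships between those terms so that systems and users can interpret data in context:

```python
# Hypothetical sketch: a glossary (taxonomy) plus typed relationships (ontology).
glossary = {
    "Customer": "A party that holds at least one product with the organization",
    "Account": "A contractual product instance belonging to a customer",
    "Branch": "A physical or digital service point",
}

# Ontology: triples that relate the defined terms (subject, predicate, object).
relationships = [
    ("Customer", "holds", "Account"),
    ("Account", "is_serviced_by", "Branch"),
    ("Customer", "is_classified_as", "CustomerType"),
]

def related_to(term: str):
    """Return every relationship that links the given term, in either direction."""
    return [(s, p, o) for (s, p, o) in relationships if term in (s, o)]

print(related_to("Account"))
# [('Customer', 'holds', 'Account'), ('Account', 'is_serviced_by', 'Branch')]
```

The glossary answers "what does this term mean?"; the relationships answer "how does it connect to everything else?", which is what downstream channels and applications rely on.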
Business indecision over conflicting definitions and semantics will lead to inefficient data, process and system designs, and to ambiguity in reporting. Weak or absent data architecture practices will result in chaos and wasted PPT resources.
The ontology is created by data and process solution architects, who are guided and bound by data policies and standards enforced by risk and compliance or data governance teams. It is implemented by software and data application engineers in consultation with business managers, administrators and data stewards. All actors along the data journey have a responsibility to uphold ontology rules and standards.
Retaking ontology from the edge
A best practice for organizations is to ingest data and organize it according to ontological and taxonomic rules as needed. This approach is finding its way into data curation and data quality processes where classification and corresponding decisions can be automated.
One approach is to organize (classify/curate) data on the fly, per data query, at the edge (the consumer side). This can mean repeated organization of the same data, resulting in a waste of PPT resources.
Instead, the taxonomy should be applied upfront, and to unstructured as well as structured data. Mapping (relating or linking) data, either the content itself or its metadata, not only reveals the "anatomy" (or ontology) of the data landscape, but also enables efficiencies such as reusing data, consolidating and eliminating duplicates, automating processes and evidencing lineage. It also reveals inefficiencies, overlaps and gaps.
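By way of illustration, here is a simplified sketch (hypothetical element names, sources and rules) of classifying and mapping data once at ingest against a metadata repository, rather than re-deriving the classification per query at the edge:

```python
# Simplified, hypothetical sketch: classify each data element once at ingest and
# record its lineage, instead of re-classifying the same data for every query.
metadata_repository = {}  # element name -> {"class": ..., "lineage": [...]}

def classify_element(element: str) -> str:
    """Toy classification rule; in practice the rules come from the taxonomy."""
    return "personal_data" if element in {"name", "id_number"} else "business_data"

def ingest(record: dict, source: str) -> None:
    """Register each element of an incoming record against the repository."""
    for element in record:
        entry = metadata_repository.setdefault(
            element, {"class": classify_element(element), "lineage": []}
        )
        entry["lineage"].append(source)  # lineage evidence: where the element came from

ingest({"name": "A. Customer", "balance": 1000}, source="core_banking")
ingest({"name": "A. Customer", "balance": 1000}, source="crm")

# Elements seen from more than one source are candidates for consolidation.
print(metadata_repository["name"]["lineage"])  # ['core_banking', 'crm']
```

Because the classification and the mapping are recorded once, downstream consumers reuse them instead of repeating the work, and duplicates and gaps become visible in the repository itself.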
Organizations looking to achieve greater value from their data should start with the discipline of data classification and taxonomy, in glossaries and dictionaries, to support the ontology. This is a gradual, almost organic transition, but as it gains momentum the organization will reach a point where most or all data elements, components, artifacts and definitions are linked.
- The author, Mervyn Mooi, is director of Knowledge Integration Dynamics
- Read more Knowledge Integration Dynamics articles on TechCentral
- This promoted content was paid for by the interested party