For too many years, people working with data have used this trite, and truly inexact, definition of metadata: metadata is data about data. It’s short, it’s sweet and it sounds clever. But it doesn’t help anyone understand metadata and what makes it distinct from the data itself. It’s quite likely that this imprecision about what constitutes metadata continually impedes good data management, integration initiatives, master data management design, and other data-focused processes.
To help clarify the definition, significance and roles of metadata, here are a few thoughts from various professionals:
Stanley Gaver looks at what metadata means in the context of enterprise architecture:
- Question: What’s the definition of “metadata”?
- Answer: The thing that causes most fights at data management conferences…
The immediate problem, of course, is that the generally used definition for metadata, “data about data”, is terribly imprecise and, as the prior anecdote illustrates, when used injudiciously can be stretched and convoluted so much that ultimately almost anything can be considered “metadata”.
Evan Levy in a Loraine Lawson interview:
What a lot of people don’t realize is the discipline of describing data and giving it rigor is called data management. Part of data management is creating metadata.
It’s only when you see things like product ID, product description and product code – those could be three different pieces of information. The metadata will offer “what do I call them,” “how are they represented,” and describe them to you. I can use the product ID to link two different files, so I can integrate data.
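To make Levy’s point concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the sample rows, and the `link_on` helper are hypothetical and do not come from the interview. It shows metadata answering “what do I call them” and “how are they represented”, and the product ID serving as the key that links two files.

```python
# Hypothetical metadata: what each field is called, how it is represented,
# and what it means. These entries are illustrative, not from the interview.
FIELD_METADATA = {
    "product_id": {
        "label": "Product ID",
        "type": "str",
        "description": "Unique key shared across files",
    },
    "product_code": {
        "label": "Product Code",
        "type": "str",
        "description": "Short catalog code",
    },
    "product_description": {
        "label": "Product Description",
        "type": "str",
        "description": "Free-text description of the product",
    },
}

# Two hypothetical "files", shown here as lists of rows.
catalog = [
    {"product_id": "P-100", "product_code": "WID", "product_description": "Widget"},
    {"product_id": "P-200", "product_code": "GAD", "product_description": "Gadget"},
]
sales = [
    {"product_id": "P-100", "units_sold": 12},
    {"product_id": "P-200", "units_sold": 5},
    {"product_id": "P-100", "units_sold": 3},
]


def link_on(key, left, right):
    """Join rows of `right` to rows of `left` that share the same key value."""
    index = {row[key]: row for row in left}
    return [{**index[row[key]], **row} for row in right if row[key] in index]


# The metadata tells us product_id is the shared key, so we link on it.
linked = link_on("product_id", catalog, sales)
```

Without the metadata, nothing in the raw rows says that `product_id` is the safe field to join on while `product_code` and `product_description` are separate pieces of information.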
Ned Batchelder’s ideas on the value metadata brings to users of data:
- Metadata is nothing new. Ned includes a far better definition of metadata than the standard “data about data” phrase: Metadata is information about a thing, apart from the thing itself.
- Metadata serves an incredibly valuable purpose: letting us step back from our information and talk about it rather than just use it.
From Marine Metadata Interoperability publications, The Difference between Metadata and Data: Metadata describe a data set sufficiently to permit searching and using the data. However, it is not always clear if a particular piece of information should be classified as data or metadata. Some information, such as geographic coordinates of observations, can be classified as both data and metadata. The distinction between metadata and data depends on the context and the needs of a given application or user.
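The coordinates example can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Observation` and `DataSet` classes, the sample values, and both helper functions are hypothetical and are not taken from the Marine Metadata Interoperability material. The same latitude and longitude values act as metadata when they help someone find a data set, and as data when they take part in the analysis.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    lat: float
    lon: float
    temperature_c: float


@dataclass
class DataSet:
    name: str
    observations: list

    def bounding_box(self):
        """Summarize coordinates as (min_lat, min_lon, max_lat, max_lon)."""
        lats = [o.lat for o in self.observations]
        lons = [o.lon for o in self.observations]
        return (min(lats), min(lons), max(lats), max(lons))


def covers(dataset, lat, lon):
    """Metadata use: decide whether a data set is relevant to a location."""
    min_lat, min_lon, max_lat, max_lon = dataset.bounding_box()
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def mean_temperature_north_of(dataset, lat):
    """Data use: the coordinates are part of the analysis itself."""
    temps = [o.temperature_c for o in dataset.observations if o.lat > lat]
    return sum(temps) / len(temps) if temps else None


# Hypothetical sample values, invented for this sketch.
buoys = DataSet(
    name="example-buoys",
    observations=[
        Observation(36.0, -122.0, 14.0),
        Observation(37.0, -123.0, 12.0),
        Observation(38.0, -121.5, 10.0),
    ],
)

is_relevant = covers(buoys, 36.5, -122.5)
northern_mean = mean_temperature_north_of(buoys, 36.5)
```

Which role the coordinates play depends entirely on the question being asked, which is the context-dependence the passage describes.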
So the point of all this:
- Truly understand why metadata matters: context and purpose play significant roles.
- Take seriously the importance of precise metadata for initiatives such as master data management and data governance.
- Metadata is not just about data integration and data warehousing. For other enterprise needs, metadata helps find data and points the way to interpreting and using it correctly.
- Quit using the useless “metadata is data about data” definition. It tends to confuse more than help, and let’s face it: it’s insipid.
2 Responses
In speaking with clients, I’ve found it challenging to speak about when data management is necessary (as compared to allowing unmanaged data) without also implying that only some data are important. Ned’s definition includes the word “information” — “Metadata is information about a thing” — and I agree; the word “information” implies that there’s some value there. It’s obvious that managed data, like the controlled vocabularies of product categories and access group permissions, have value, but so do all of the unmanaged data like modification dates, file locations, and social keywords. We’re not interested in data for data’s sake, it’s the meaning of all those data that matters. It’s not about what, it’s all about why.
Seth Maislin, Senior Taxonomist
Earley & Associates
seth.maislin@earley.com
http://www.earley.com
Hello Seth,
Thanks for taking time to read my post and to share your insights.
I agree that data shows its value when we consider its meaning. Besides showing the “why”, meaning also derives from context: the “how” of data being used in business processes, and so on.
Cheers,
Julie Hunt