Meta-data

8 September, 2015 - 12:22

To reiterate, meta-data are data about data. One key theme in this discussion of data modeling has been the problem of understanding the meaning of the values in a collection of data. The meaning of a datum can be understood in a set-theoretic sense. The number “10” is easily understood in terms of its membership in the domain Integer, for example. It is not possible, however, to tell from a value in isolation what its purpose is or what it represents. Human understanding might make use of attribute names, but as has been shown, even those are limited in explaining the contents of an attribute. For example, the attribute name “temperature” does not convey which scale its values are measured on. The purpose of meta-data is to fill these gaps in meaning.

Data can be explained at different levels. Thus, there are different levels of meta-data. At the most fundamental level are data about the physical storage properties of data. This is known as the physical schema. Meta-data at this level can include information about file formats in which data are stored, properties of the devices used for storage, and the bit-level and byte-level representations of data values.
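As a minimal sketch of why physical-level meta-data matter, the Python example below decodes the same four bytes under three different physical descriptions. The dictionary field names (`byte_order`, `format`) and the `decode` helper are assumptions made for illustration; only the `struct` module calls come from the Python standard library.

```python
import struct

# Four raw bytes as they might sit in a file or on a storage device.
raw = bytes([0x00, 0x00, 0x00, 0x0A])

# Hypothetical physical-level meta-data records (field names are illustrative).
# The codes follow Python's struct module:
# ">" big-endian, "<" little-endian, "i" 32-bit signed integer, "f" 32-bit float.
BIG_ENDIAN_INT = {"byte_order": ">", "format": "i"}
LITTLE_ENDIAN_INT = {"byte_order": "<", "format": "i"}
BIG_ENDIAN_FLOAT = {"byte_order": ">", "format": "f"}


def decode(raw_bytes, physical_meta):
    """Interpret raw bytes according to the given physical meta-data."""
    layout = physical_meta["byte_order"] + physical_meta["format"]
    return struct.unpack(layout, raw_bytes)[0]


as_big_int = decode(raw, BIG_ENDIAN_INT)        # 10
as_little_int = decode(raw, LITTLE_ENDIAN_INT)  # 167772160
as_float = decode(raw, BIG_ENDIAN_FLOAT)        # a very small positive number
```

The bytes never change; only the meta-data do. Without a record of byte order and representation, a reader has no way to choose among these three readings.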

In the discussions above, data have been explained at a logical level. This is a level that is free of considerations about the physical details of data formats and their storage. This is known as the logical schema. Meta-data at this level describe the data model: entities, attributes of entities, data types of attributes, value constraints on attributes, and relationships between entities.
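The components of a logical schema listed above can be sketched as plain Python data structures. The class names (`Attribute`, `Entity`, `Relationship`), the `validate` helper, and the example `Reading` and `Station` entities are all hypothetical; this is one possible encoding under those assumptions, not a standard one.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: type
    # Optional value constraint: returns True when a value is acceptable.
    constraint: Optional[Callable[[Any], bool]] = None


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: Tuple[Attribute, ...]


@dataclass(frozen=True)
class Relationship:
    name: str
    source: str  # name of the entity the relationship starts from
    target: str  # name of the entity the relationship points to


def validate(entity: Entity, record: dict) -> List[str]:
    """Check a record against the entity's logical meta-data."""
    errors = []
    for attr in entity.attributes:
        if attr.name not in record:
            errors.append(f"{attr.name}: missing")
            continue
        value = record[attr.name]
        if not isinstance(value, attr.data_type):
            errors.append(f"{attr.name}: expected {attr.data_type.__name__}")
        elif attr.constraint is not None and not attr.constraint(value):
            errors.append(f"{attr.name}: constraint violated")
    return errors


# Illustrative schema: a reading taken at a weather station.
reading = Entity(
    name="Reading",
    attributes=(
        Attribute("station_id", int, lambda v: v > 0),
        Attribute("temperature", float),
    ),
)
taken_at = Relationship(name="taken_at", source="Reading", target="Station")

ok_errors = validate(reading, {"station_id": 10, "temperature": 21.5})
bad_errors = validate(reading, {"station_id": -1, "temperature": "warm"})
```

Note that this schema can say that `temperature` is a float, but nothing in it says which scale the value is measured on.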

Higher levels of explanation can be given concerning data. As discussed above, meta-data are necessary for describing the meaning of data beyond what is usually described by a logical schema. Descriptions at these levels comprise what are sometimes called semantic data models (cf. [Hammer1978]). Meta-data at this level may provide human-readable descriptions of attributes and define the value system within which data should be interpreted (e.g. Celsius vs. Fahrenheit).
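To illustrate the Celsius vs. Fahrenheit case, the following sketch attaches a description and a unit to a temperature attribute and uses the unit to interpret a stored value. The `SemanticMeta` class, its fields, and the `to_celsius` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticMeta:
    description: str  # human-readable explanation of the attribute
    unit: str         # value system: "celsius" or "fahrenheit" in this sketch


def to_celsius(value: float, meta: SemanticMeta) -> float:
    """Interpret a stored temperature using its semantic meta-data."""
    if meta.unit == "celsius":
        return value
    if meta.unit == "fahrenheit":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown unit: {meta.unit}")


outdoor_f = SemanticMeta("Outdoor air temperature at noon", "fahrenheit")
outdoor_c = SemanticMeta("Outdoor air temperature at noon", "celsius")

# The same stored number means different things under different meta-data.
from_fahrenheit = to_celsius(50.0, outdoor_f)  # 10.0
from_celsius = to_celsius(50.0, outdoor_c)     # 50.0
```

The stored value 50.0 is a mild day under one description and an extreme one under the other, which is exactly the gap in meaning that a logical schema leaves open.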

There is a long-standing body of research that aims to facilitate levels of understanding of data above the logical schema, that is, an understanding of data that approaches that of artificial intelligence.