Assigning values

8 September, 2015 - 12:22

Another level of reality is represented by the values we choose to assign to the attributes of the entities in a schema. Typically, data management software, particularly database management systems, supports the storage of various basic data types. These include types that represent the well-known numerical and logical domains: integer, real, and boolean. We also need to be able to represent words in some way. For this, such systems allow the storage of character or string data; strings are sequences of characters. Most software systems used for data management also permit the representation of more specialized types, such as dates and times, or arbitrarily complex combinations of other data types. The data modeler must decide which types of values best suit their intended use in a data management system. For example, if it will be necessary to perform arithmetic calculations using an integer value, the data values should be integers and not string versions of integers (e.g., 10 vs. “10”).
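The difference between 10 and “10” can be illustrated with a minimal Python sketch. The variable names are hypothetical and chosen only for this illustration; other languages and database systems may behave differently when a string is used in arithmetic.

```python
# An integer value supports arithmetic directly.
count_as_int = 10
print(count_as_int + 5)        # 15

# A string version of the integer does not: "+" joins strings instead.
count_as_str = "10"
print(count_as_str + "5")      # 105 (a string), not 15

# The string must be converted before it can be used in a calculation.
print(int(count_as_str) + 5)   # 15
```

Storing the value as an integer in the first place avoids both the conversion step and the risk of a silent, incorrect result.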

EX. WM-12:

The reader should note that we used two data types across past examples to express geographic coordinates. One was the traditional degrees, minutes (and seconds) form, where North / South relationships and West / East relationships relative to the equator and prime meridian are represented with characters “N”, “S”, “W”, and “E” respectively. Values in this form look like the following:

latitude = 45° 52’ N, longitude = 66° 32’ W

Another form that is more common these days is a decimal degree representation of coordinates. The above coordinates in decimal degree form are:

latitude = +45.866667, longitude = +66.533333

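Coordinates in this form make it easier to perform mathematical calculations with the values, such as determining the distance between locations. North and South are represented by positive and negative latitude values, respectively. In the convention used here, West and East are represented by positive and negative longitude values, respectively. Note that many systems use the opposite sign for longitude, with East positive, so the convention in use should always be checked.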

Suppose, however, that we need to cross-reference coordinates in our data collection with existing documents that employ the traditional form. In this case, it may be advantageous to store our data values in the same format.

One alternative is to store the values in one chosen data type and derive the other form from it whenever it is needed.
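As a rough sketch of such a derivation, the following Python function produces the traditional degrees-and-minutes form from a stored decimal value. The function name and its parameters are assumptions made for this illustration. It follows the sign convention given above (North and West positive) and rounds to the nearest minute.

```python
def to_degrees_minutes(value, positive_char, negative_char):
    """Format a decimal coordinate in degrees-and-minutes form."""
    # The sign of the stored value selects the hemisphere character.
    hemisphere = positive_char if value >= 0 else negative_char
    # Work in whole minutes, rounded to the nearest minute.
    total_minutes = round(abs(value) * 60)
    degrees, minutes = divmod(total_minutes, 60)
    return f"{degrees}° {minutes}' {hemisphere}"


latitude = 45.866667
longitude = 66.533333

print(to_degrees_minutes(latitude, "N", "S"))    # 45° 52' N
print(to_degrees_minutes(longitude, "W", "E"))   # 66° 32' W
```

With this approach only the decimal values are stored, and the traditional form is produced on request for cross-referencing.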

Another data modeling problem, beyond the question of data types, is to determine how the values are to be interpreted. Another way of viewing this is as the type of units that a value represents. This is particularly the case when physical phenomena are being represented for which various scientific units apply. This data modeling issue relates to the earlier discussion of the meaning of data.

EX. WM-13:

If we examine a data value for an entity’s attribute in a data management system, the value alone will not tell us what it means. Suppose we decide to represent temperatures in our temperature_readings schema as integers. If we then examine the value of the temperature attribute in some record, we will see only an integer, say -10. What would -10 mean, though? We as humans know from the name of the attribute that it represents the physical quantity known as temperature.

The problem is that we also know that -10 could represent a temperature on one of several scales. The question is then: what type of units – beyond integers – does this value represent? Some geographic regions still use the Fahrenheit scale. Some use Celsius. Some scientific domains assume the use of the Kelvin scale. Thus, it is necessary in our schema to somehow indicate what type of units this attribute is to represent.

This is a complex issue in data modeling. Addressing it ultimately entails the use of what is called metadata, that is, data about data. This will be discussed in Section 2.6. For now, let’s focus on the representation of the data values themselves.

We could decide to have two attributes for the Fahrenheit and Celsius temperature scales and record the equivalent values in each attribute. This may not be the best approach, however. We will need twice the storage space now to record temperature values. We will also have to spend twice as much time recording values.

All decent modeling or design choices have trade-offs. That is, there are advantages and disadvantages to each. Some modeling choices are just bad!

What would be useful here is to be able to store the temperature in one chosen scale and to be able to ask to derive it in another scale.
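A minimal Python sketch of this idea follows. It assumes, purely for illustration, that Celsius is the scale chosen for storage. The function and variable names are hypothetical and not part of any particular data management system.

```python
def celsius_to_fahrenheit(celsius):
    """Derive the Fahrenheit equivalent of a Celsius temperature."""
    return celsius * 9 / 5 + 32


def celsius_to_kelvin(celsius):
    """Derive the Kelvin equivalent of a Celsius temperature."""
    return celsius + 273.15


# Only one value is stored; it is assumed here to be in Celsius.
stored_temperature = -10

print(celsius_to_fahrenheit(stored_temperature))   # 14.0
print(celsius_to_kelvin(stored_temperature))
```

Only one value is recorded for each reading, so the storage space and recording effort are not doubled, and the other scales are computed when they are requested.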