You are here

Data

7 September, 2015 - 12:26

In the typical case, data design consists of deciding how to organize the permanent data maintained by the application into a relational data base, made up of a series of tables—usually one for each entity identified in the entity-relationship model during the requirements analysis. Each line (record, tuple) of a table describes one particular occurrence of the entity type. For instance, in a customer table, there will be one record per customer; and each of these records will contain attributes or data items describing that customer: name, address, telephone number...

Each attribute is described with respect to its functional definition, its value domain, its physical representation on various media, and any restrictions on the operations that can be performed. This documentation decreases the number of programming errors and data base integrity problems during implementation and operation of the system.

For each entity, a key is chosen to identify it uniquely. A good key has no meaning outside of identifying the entity uniquely. In addition, a good key never changes its value for any given entity over the entire life of that entity.

For each relationship, an implementation is chosen. In most cases, relationships are implemented via foreign keys: an attribute of the entity is the key of another entity, serving as a pointer. For example, an order entity contains the customer number of the customer who placed the order. Then, when the order is processed, the application can access the customer data (for instance, to ascertain the delivery address).

One of the most important criteria for a good data design ids that the data items (attributes) are assigned to the right table (entity). The objective is to avoid storing data redundantly. For example, in the order process outlined above, you want to store the customer’s delivery address in the customer table rather than in each of the customer’s orders in the order table. This isn’t so much to save space as to insure consistency. If the same piece of data is stored in several places, there is a risk that the values at some point will come to differ. Then, you are faced with the dilemma of choosing one, but which one?

The process of assigning attributes to the correct tables is called normalization. There is a set of precisely defined steps and criteria that you can follow to make this process almost foolproof. (Need reference here). The acid test is to run down all the functions of the system and ask yourself whether any of them would require updating the same piece of data twice.

Even though the occasion of all this work is the implementation of a single system, one of the main purposes of data base design is to ensure coherence of definitions and consistency of data values across all the systems of an organization. Thus, much of this work is ideally performed by a data administration section, which covers the activities of the entire IS department, not only the current project. If there is no data administration section, then one of the analysts on the project plays the same role, covering only the project.