by Angela Guess
David Loshin recently posted his thoughts on how companies might standardize data rules. His proposal: “By categorizing different types of common rules, one could formulate a structure for each type of rule that would be abstract yet adaptable to different operational scenarios.” He continues, “Consider this scenario: data profiling tools can often highlight situations in which data attributes that are supposed to have values are actually null, or missing their values. This suggests a rule for verifying that a specific data element may not be null, and this is often noted within a data model description as an assertion. However, that assertion applies to each specific instance, not the data element concept in general, which allows for some degree of inconsistency to creep in.”
Loshin goes on, “For example, presuming that a customer’s record must have a telephone number, one can assert within each customer database that the telephone number may not be null. But managing that assertion may mean having yet another layer of detail documenting which models have the ‘not null’ assertion, and not all programmers and data analysts may be aware of this metadata layer.”
“Alternatively,” he continues, “creating a common rule format for conceptual models that map to (different) physical models allows an abstraction of the rule as long as there is an abstract representation of the conceptual model coupled with a mapping from the conceptual model to all instances of the physical representations. This allows one to define one rule: ‘customer.telephone_number not null’ is one example of a structured syntax (we can tinker with that structure as long as it remains automatically parsable). An internal representation not only can map the abstract structure to each physical instance, it can also map the validation of the assertion into different contexts.”
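To make the idea concrete, here is a minimal Python sketch of what Loshin describes: one rule written against the conceptual model, a mapping from the conceptual attribute to each physical instance, and two validation contexts (a SQL check per database and an in-memory record check). It is not taken from Loshin's post. The system, table, and column names in PHYSICAL_MAPPINGS, the NotNullRule class, and the functions parse_rule, to_sql_checks, and validate_record are all hypothetical, and the rule grammar is assumed to be exactly "entity.attribute not null".

```python
import re
from dataclasses import dataclass

# Hypothetical mapping from a conceptual attribute to its physical columns,
# given as (system, table, column). These names are illustrative assumptions.
PHYSICAL_MAPPINGS = {
    "customer.telephone_number": [
        ("crm", "customers", "phone"),
        ("billing", "cust_master", "tel_no"),
    ],
}

# Assumed structured syntax: "<entity>.<attribute> not null".
RULE_PATTERN = re.compile(r"^\s*(\w+)\.(\w+)\s+not\s+null\s*$", re.IGNORECASE)


@dataclass(frozen=True)
class NotNullRule:
    """A 'not null' assertion stated once, against the conceptual model."""
    entity: str
    attribute: str

    @property
    def concept(self):
        return f"{self.entity}.{self.attribute}"


def parse_rule(text):
    """Parse the structured rule text into a NotNullRule."""
    match = RULE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized rule: {text!r}")
    return NotNullRule(match.group(1), match.group(2))


def to_sql_checks(rule, mappings=PHYSICAL_MAPPINGS):
    """Return one violation-counting query per mapped physical instance."""
    return [
        f"SELECT COUNT(*) FROM {system}.{table} WHERE {column} IS NULL"
        for system, table, column in mappings.get(rule.concept, [])
    ]


def validate_record(rule, system, record, mappings=PHYSICAL_MAPPINGS):
    """Apply the same rule to an in-memory record from one system."""
    for mapped_system, _table, column in mappings.get(rule.concept, []):
        if mapped_system == system:
            return record.get(column) is not None
    raise KeyError(f"no mapping for {rule.concept} in system {system!r}")


rule = parse_rule("customer.telephone_number not null")
for query in to_sql_checks(rule):
    print(query)
print(validate_record(rule, "crm", {"phone": "555-0100"}))
```

Because the rule is stored once and the physical details live only in the mapping, adding another customer database in this sketch means adding a mapping entry rather than documenting a new assertion.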
photo credit: Rego – d4u.hu