by Angela Guess
A recent blog post takes a look at what constitutes a good data model: “Designing data models is fun – at least if you are a geek like me. But as much as I like the academic thrill of building something that is complex – I am aware that it is often humans that eventually must see and maintain my beautiful (data) model. How can we design something that humans can understand? Humans are buggy! In general, they don’t deal well with complexity. You can blame modern education, you can scream and shout, or languish on the fact that the IT industry is riddled with incompetence, you may even throw Kimball or Inmon books at the wall in anger. But the empirical tests all show the same: the wetware is the final test of the model.”
The writer goes on to define the four criteria of a good data model: “(1) Data in a good model can be easily consumed. (2) Large data changes in a good model are scalable. (3) A good model provides predictable performance. (4) A good model can adapt to changes in requirements, but not at the expense of 1-3.”
Commenting on the fourth point, the writer remarks, “I foresee that this will turn out to be one of the central contention point in the discussions to follow. Changes are a fact of life. If you have not yet experienced the intense pain of changing a beautifully designed data model half way through a project, I recommend you go work a few years for any organisation led by a guy with an MBA degree who believes 10% growth is maintainable forever, and that the organisation must react quarterly to ‘market changes’. I am sure you will find plenty such employers that will teach you the nature of change for good mammon. For now, I will assume that you believe that ‘change happens’ – though we shall later look more closely at what form such change typically takes. 1-3 must often be balanced with 4. The data model must be flexible in some way; it must remain agile.”

“Good data model” is used rather generically here, which is a problem because there are multiple levels of model, each aimed at a specific audience. A logical model, for example, should never have anything to do with criterion #3, performance. Its purpose is to establish a sound taxonomy and a shared understanding of the information in the business; it has nothing to do with specific physical implementations. Performance depends on the specific database technology and on the intended use of the data structures derived from a physical model, especially whether that use is operational or analytical. The logical model should establish the terminology and understanding of the data. How it is then used to design physical structures depends purely on the intended use and the scope of automation.