
Myth 1 – Normalization: Friend, Foe, or Frenemy The Survey

August 1, 2011  /  4 Comments

by Karen Lopez

I recently blogged the introduction to this series:  Normalization Myths that Really Make Me Crazy – Introduction to a Rant.  You should check that introduction out for the background on this post.  It has caveats and warnings that you’ll need to keep handy while you read this series.

I’m starting with the basis for many of the myths that I’ll be covering in this series.  Just last week I learned that Normalization is Evil.  Actually, I’ve heard that every week for the last twenty-some years.  The basis of this thought is that business users never ask for data quality and always ask for better speed.  Remember that stinking pile of poo from the introduction? I think it’s still nearby.

Most business users don’t say “We need the data to be correct” because they expect that as a given.  Are we really going to stand before them and say “Is it okay if some, maybe more, of the sales tax remittances are wrong so that we can get better performance out of the system?” Or maybe we could wow them with “Is it okay if we calculate many of the customer bills incorrectly so that developers don’t have to code as much?”

Think that last one is a laugh?  I was once asked to design a database that gave each developer a single table to work with. This table would hold all the data assigned to be coded by the developer during a single sprint.  This design was seen as a great way of maximizing developer productivity.  Triggers, stored procedures and other code would take care of the data integrity, I was assured.  Thankfully I was able to show just how much this design would harm performance.
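To make the integrity risk of a one-big-table design concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The table names, columns and tax rates are hypothetical, invented for illustration, and are not taken from the project described above. It contrasts a wide table, where a fact is repeated on every row and one code path can change only some of the copies, with a design that stores that fact once.

```python
import sqlite3


def update_anomaly_demo():
    """Return how many distinct tax rates region 'ON' has in each design."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Hypothetical wide table: the region's tax rate is repeated on every order.
    cur.execute(
        "CREATE TABLE orders_wide ("
        "order_id INTEGER PRIMARY KEY, region TEXT, tax_rate_pct INTEGER)"
    )
    cur.executemany(
        "INSERT INTO orders_wide VALUES (?, ?, ?)",
        [(1, "ON", 13), (2, "ON", 13), (3, "BC", 12)],
    )
    # One code path changes the rate but touches only a single row.
    cur.execute("UPDATE orders_wide SET tax_rate_pct = 15 WHERE order_id = 1")
    wide = cur.execute(
        "SELECT COUNT(DISTINCT tax_rate_pct) FROM orders_wide WHERE region = 'ON'"
    ).fetchone()[0]

    # Hypothetical normalized design: the rate is stored once per region.
    cur.execute(
        "CREATE TABLE regions ("
        "region TEXT PRIMARY KEY, tax_rate_pct INTEGER NOT NULL)"
    )
    cur.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "region TEXT NOT NULL REFERENCES regions (region))"
    )
    cur.executemany("INSERT INTO regions VALUES (?, ?)", [("ON", 13), ("BC", 12)])
    cur.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "ON"), (2, "ON"), (3, "BC")])
    # The same rate change is a single-row update that every order sees.
    cur.execute("UPDATE regions SET tax_rate_pct = 15 WHERE region = 'ON'")
    normalized = cur.execute(
        "SELECT COUNT(DISTINCT r.tax_rate_pct) "
        "FROM orders AS o JOIN regions AS r ON r.region = o.region "
        "WHERE o.region = 'ON'"
    ).fetchone()[0]

    conn.close()
    return wide, normalized
```

Calling update_anomaly_demo() returns (2, 1): the wide table now disagrees with itself about the rate for a single region, while the normalized design has only one place for that rate to live. Triggers or stored procedures could be written to keep the copies in step, but that is extra code standing in for what the structure would otherwise guarantee.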

To be fair, I work with data architects and database designers who think that all denormalizations are some sort of Sign of the Beast, as if there is a higher power up there watching over their designs, ready to strike them down for thinking of performance tradeoffs.  In fact, my friend Michael Swart (blog | twitter) has a great illustration of this concept:

[Cartoon: Ted Codd Hates That Thing You Just Did]

The Friend or Foe myth is based on the idea that you have to be for or against normalization as a concept.  I’ve seen speakers start their presentations by asking people in the audience if they are pro-normalization or anti-normalization. When people raise their hands to pro-normalization, the speaker will half-jokingly ask them to leave. The odd thing about this for-against mindset is that it seems to indicate that one can have a data structure that has no normalization to it at all or that one can design something without a normal form.  I suppose random meaningless numbers have no normal form, but we don’t need a design for that, right?

This got me thinking about people who are both for you and against you: the frenemy.

A “Frenemy” (alternately spelled “frienemy”) is a portmanteau of “friend” and “enemy” that can refer to either an enemy disguised as a friend or to a partner who is simultaneously a competitor and rival. [1]

Could it be that normalization has to be either a friend or a foe?  Do you have to choose between Team Normal and Team Denormal?  Or is it a frenemy, one of those things that you can pretend to like but have to hate when it comes down to getting things done?  I asked the Twitterverse what they felt about normalization.  The following represent the range of responses I received:

[Image: Normalization Quotes]

Not everyone was posting their own beliefs; some were quoting what they have heard in the wild. You can see that I got a variety of opinions that ranged from Normalization is Evil to Normalization is Our Only Hope.  My sample is biased because these came primarily from those who have an interest in the same topics I am interested in.  I think if I’d asked the general IT population I would have received many more negative thoughts about normalization and people who believe in it.

When I start a project, it’s common for project managers and others to question me about my friend/foe relationship with normalization.  I’ll get questions like:

PM: Do you believe in normalization?

Me: Yes.

I find this one hard to respond to without giggling.  It’s as if I’m being asked if I believe in Santa Claus. Or Ted Codd.

PM: [long pause] Okay…we’ll probably need to review your designs with the developers then.  They don’t take kindly to normalizers.

Me: That’s fine.  I love collaborating with the developers.  Together we can…

PM: [interrupting] How far do you go?

Me: [blushing] Um…what do you mean?

PM: What normal form? Third? Fifth?

Me: Oh. Well, it depends…

PM: We only go to third normal form here.  We are traditionalists.

This always makes me wonder if there is a Church of Normalization, Reformed.  This might also mean there’s a Church of Denormalization, Unformed.

Me: Okay.  What about First or Second Normal Form?

PM: Oh, we aren’t that radical.  Just Third.

I wonder if they might have some performance issues caused by their need to have every data structure at the same normalization level, issues that could be easily rectified.  I make a note.

I can tell from these interview questions that the PM thinks that normalization is his frenemy: something that someone in the IT world thinks is a “good thing” for a design, but that everyone on the project thinks is evil.

Our job as data architects is to help team members understand that all data structures have a normal level, that normalization is an important part of meeting business needs and that the evil parts can sometimes be exorcised.  If you need to denormalize to meet a business goal, then do it.  It’s not evil incarnate.  It’s how design works: cost, benefit and risk.

I keep my normalization friends close and my enemies closer.  As for frenemies, I don’t believe in them.



 [1]Wikipedia contributors, “Frenemy,” Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=Frenemy&oldid=437999108 (accessed July 27, 2011).

About the author

Karen Lopez is Sr. Project Manager and Architect at InfoAdvisors. She has 20+ years of experience in project and data management on large, multi-project programs. Karen specializes in the practical application of data management principles. She is a frequent speaker, blogger and panelist on data quality, data governance, logical and physical modeling, data compliance, development methodologies and social issues in computing. Karen is an active user on social media and has been named one of the top 3 technology influencers by IBM Canada and one of the top 17 women in information management by Information Management Magazine. She is a Microsoft SQL Server MVP, specializing in data modeling and database design. She’s an advisor to the DAMA, International Board and a member of the Advisory Board of Zachman, International. She’s known for her slightly irreverent yet constructive opinions and rants on information technology topics. She wants you to love your data. Karen is also moderator of the InfoAdvisors Discussion Groups at www.infoadvisors.com and dm-discuss on Yahoo Groups. Follow Karen on Twitter (@datachick).

  • Frank

    Sorry, I am a consultant, so my answer is “it depends”! What are you trying to accomplish? For OLTP or OLAP the answer is almost obvious; the problem arises when these two environments mix.

  • You don’t even have to be a consultant to know that every design decision comes down to “it depends.” All design considerations should be based on cost, benefit and risk.

    Thanks for the comment.

  • Hey Karen! Thanks for plugging my site! That drawing of Dr. Codd is still one of my favourites.

    I see where you’re coming from with this post. You know how everyone has a room temperature that they’re comfortable with? You know, not too hot, not too cold but just right. Thermostat wars can erupt when people disagree about what the perfect room temperature is.

    So for the past x years, I’ve been pushing the “Normalization thermostat” up in the direction of pro-normalization. It means a lot of explaining consequences sometimes. Consequences that aren’t always immediately obvious.

    It’s just that in my experience I’ve seen normalized databases solve (or avoid) so many problems. Performance gains attributed to denormalized models can also be realized with materialized views or indexed views.

    So Karen, put me down as pro-normalization. It’s not just a good idea from academia, it solves problems.

  • Jill

    Your comments about third normal form had me literally laughing out loud. I hear “3rd normal form only” all the time. Which to me is just as bad as saying I will only normalize or I am against normalization. It depends. And you are right on – to have them all the same form might be impacting performance. I love when I work with a developer and I can get them to understand WHY we should/should not normalize. I started as a developer, and I wish someone in the data world had worked with me to understand why. Now that I am on this side the past 5 years, my opinion on normalization has radically changed.
