
Fundamentals of Data Modeling in Agile Environments

By Jelani Harper  /  September 22, 2015

Numerous circles have lauded the agile process within Data Management for its inclusive, expeditious approach that supposedly involves different facets of the enterprise. However, according to the “ER/Studio and Data Modeling Special Interest Group” held at Enterprise Data World 2015 and hosted by Karen Lopez of InfoAdvisors and Ron Huizenga of Embarcadero, those circles generally do not include professionals specializing in Data Modeling.

The agile process regularly complicates Data Modeling, a pivotal component of the applications and databases it produces, in several key ways:

  • Exclusion: Oftentimes, data modelers are not brought into the agile process until the various developers, scrum masters and business analysts are at the stage in which they actually need some data. Worse, as noted by Lopez, “Because I’m a consultant, typically I’m brought in on a troubled project that’s Agile and they have no data modelers, no data architect, and they’ve built this database and they can’t get the data in or out or it’s not performing. It’s a mess, usually.”
  • Developers vs. Modelers: The objectives, approaches, and needs of developers differ from those of data modelers in numerous ways, and each difference is a potential point of conflict. Since the agile cycle typically begins and ends with developers, their tendencies can complicate the involvement of modelers.
  • Playing Catch-Up: Because they are excluded at the outset of agile projects, modelers often must play catch-up with a process in which they ideally should work ahead of sprints. It is not uncommon for modelers to work on several different agile teams simultaneously.

These issues and others were discussed in candid detail by the special interest group, which yielded a significant number of solutions and insights into the necessities of Data Modeling.

Data Modeling Similarities and Differences in Agile Environments

In theory, the fundamentals of Data Modeling are the same in agile environments as they are outside of them. Modelers are generally tasked with modeling data at the conceptual, logical, and physical levels while accounting for an Enterprise Data Model as well. In agile environments, however, they must also accommodate a project model, which can present critical differences. Stories replace the requirements that traditionally inform those models, and they frequently lack the detail such requirements provide. Lopez mentioned:

“Usually when I’m brought in I’m given the stories at the same time that the developers are and the DBAs are, and the developers are like, ‘where’s my tables’? And I’m like, ‘I haven’t even read the stories’. And by the way, the stories are always crap because they say something like, ‘and then we have to charge sales tax’ and that’s the extent of the requirement and I know that sales tax is complex and crazy. In a real Data Model it takes about 70 tables to do right.”

Many times, modelers can get sufficient requirements from business analysts, and even do so in a way that enables them to keep abreast of sprints and their goals. Huizenga observed:

“I’m not slamming developers or programmers, but quite often they’re shortsighted in knowing what they need to include. So I found if I can work with the business analyst or whoever was there to get a glimpse ahead…I’ve found that it smooths the road quite a bit.”

Developer Realities

More importantly, perhaps, modelers are often pulled into a developer-centric world where there are many misunderstandings between these two groups, including:

  • The Very Notion of Data Modeling: Many developers think there is only one Data Model that applies to an agile project, an assumption that does not hold in practice. According to Lopez: “They think there’s one Data Model, and that it matches all their production environment. And then if they use…their own dev environment on each of their laptops that’s even times 150, now. So I can’t compare the dev environment that they’ve been playing in, but I can compare what they’re proposing to do in QA.”
  • Target Modeling: Developers frequently don’t realize, or care, that modelers must target Data Modeling to multiple environments, including development, QA, preproduction, and production. These environments also require data modeled at various points in time, which considerably increases the complexity of modeling in agile processes.
  • Agility: Some modelers have even encountered situations in which the flexibility of agile conflicts with the practical need to reuse design patterns. This need grows with the delayed involvement of modelers and the number of projects they may be working on at once.
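
The kind of comparison Lopez describes, checking what is proposed for QA against the model, can be illustrated with a minimal sketch. It assumes a model is simplified to a mapping of table names to column names; the function name `diff_models` and the sample tables are hypothetical, and a real modeling tool would compare far more (data types, keys, constraints).

```python
from typing import Dict, Set

# Simplifying assumption: a model is {table_name: set of column names}.
Model = Dict[str, Set[str]]


def diff_models(reference: Model, target: Model) -> dict:
    """Report tables and columns that differ between two models."""
    ref_tables, tgt_tables = set(reference), set(target)
    changed = {}
    # Only tables present in both models can have column-level differences.
    for table in sorted(ref_tables & tgt_tables):
        missing = reference[table] - target[table]
        extra = target[table] - reference[table]
        if missing or extra:
            changed[table] = {"missing": sorted(missing), "extra": sorted(extra)}
    return {
        "missing_tables": sorted(ref_tables - tgt_tables),
        "extra_tables": sorted(tgt_tables - ref_tables),
        "changed_tables": changed,
    }


# Hypothetical sample data, not taken from any real project.
logical_model = {
    "customer": {"id", "name", "email"},
    "invoice": {"id", "customer_id", "total"},
}
qa_proposal = {
    "customer": {"id", "name"},
    "invoice": {"id", "customer_id", "total", "sales_tax"},
    "tax_rate": {"id", "rate"},
}

report = diff_models(logical_model, qa_proposal)
print(report)
```

Running the same comparison against each target environment (development, QA, preproduction, production) would give one report per environment, which is one way to picture why the number of models to track grows so quickly.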

Upfront Modeling

The practice of upfront modeling can help data modelers keep pace with agile environments, where the rapid cadence is compounded by the number of models for which these professionals are responsible. The models required for various environments are multiplied by the specific models that certain users require. According to Lopez: “Whether they’re physically separate models or snapshots or branches, I’m juggling all those versions of what’s really conceptually the same model. I might have 15 or 20 at the same time.” Upfront modeling and established modeling patterns can reduce the complexity of so many models while also reducing the time needed to create and implement them. Developers are “sometimes reluctant on that because they consider that big upfront modeling,” Lopez said. “Yes, it’s upfront, but it’s thinking that’s been done, just like your code patterns.”


Another recourse for Data Modeling in time-intensive agile environments is branching. Branching is often advisable when specific requirements mandate different versions of models and other aspects of data. Instead of creating an entirely separate model in such instances, modelers can simply ‘branch off’ of a current model and eventually merge back into the primary model. Modelers can also cope with the time pressure that agile processes create by working directly in developer sandboxes, which helps developers get an idea of model constraints and how to accommodate them. Such a tactic facilitates the sort of interactivity and collaboration for which agile methods are known. Huizenga reflected on this approach:

“I used to start with a skeleton working with the developers saying, ‘here’s what I think you need’. We would throw it into their developer sandboxes on their desktops. We would play around with it and see what could make it work. Then I would do a compare and merge and bring it back and say, ‘okay that works, that doesn’t, let’s merge this way’ and then we would just keep going back and forth.”
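
The branch, compare, and merge cycle described above can be sketched roughly as a three-way merge. This is only an illustration under stated assumptions: models are reduced to table names mapped to column names, `merge_branch` and the sample tables are hypothetical, and it is not a description of how ER/Studio or any other tool performs its compare and merge.

```python
from typing import Dict, Set

# Simplifying assumption: a model is {table_name: set of column names}.
Model = Dict[str, Set[str]]


def merge_branch(base: Model, primary: Model, branch: Model) -> Model:
    """Three-way merge of a branch back into the primary model.

    A column survives if both sides still have it, or if either side
    added it since the common base. A column that either side removed
    from the base is dropped.
    """
    merged: Model = {}
    for table in sorted(set(base) | set(primary) | set(branch)):
        b = base.get(table, set())
        p = primary.get(table, set())
        br = branch.get(table, set())
        columns = (p & br) | (p - b) | (br - b)
        if columns:  # a table with no surviving columns is dropped
            merged[table] = columns
    return merged


# Hypothetical sample data: the skeleton handed to developers (base),
# the modeler's current model (primary), and a developer sandbox (branch).
base = {"order": {"id", "total"}}
primary = {"order": {"id", "total", "currency"}}
sandbox = {"order": {"id", "total", "sales_tax"}, "tax_rate": {"id", "rate"}}

merged = merge_branch(base, primary, sandbox)
print(merged)
```

The sketch accepts every change automatically. In the back-and-forth Huizenga describes, the modeler decides which sandbox changes to accept, so a practical version would surface each difference for review rather than merge it silently.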

Breaking the Build—and Fixing It

Data Modeling is greatly challenged in agile environments because strict deadlines impose time constraints on everyone involved. Modelers can help to offset some of these issues, which largely exist due to assumptions, misunderstandings, and general ignorance on the part of developers, in several ways. These include getting clarification from the business about requirements and soliciting its involvement to broaden the scope of the project. They also include utilizing upfront modeling and branching in addition to working directly in developer sandboxes to give developers an idea of Data Modeling standards. In fact, working in developer sandboxes can help to create an ideal situation in which developers have near real-time insight into how well their work aligns with modeling needs. According to Huizenga:

“On one project I rescued, we took it to the point where we had five different teams going, and as soon as something got checked in, if it broke the build we actually had red flashing lights wired into the computers. Basically, everybody knew it was all hands on deck to figure out what was wrong, fix the build, get on with it, and away you go. And it’s amazing the level of collaboration that will drive. It’s just having everybody working together. And the business teams that were a part of that, they just loved it that this stuff was happening real time and they were a witness to what was going on.”

About the author

Jelani Harper has written for a number of publications, both online and in print. He was a staff writer at both the Oakland Tribune and the San Mateo Times. He has written extensively about various aspects of IT and finance including business intelligence, cloud computing and cloud-based data, GPS, architecture, data management, and ERP.
