Data modeling is arguably the most granular discipline in the world of Data Management. From the conceptual to the logical to the physical, models facilitate the analysis of data within any enterprise, enhancing processes including database design or application integration.
Always evolving in sync with the lifeblood of business, a data model is never written in stone. Modeling tools like ERwin or PowerDesigner assist in all aspects of the modeling process, including schema and ER diagram creation as well as the versioning and management of model libraries.
This article looks at the latest editions of the two most popular modeling tools, the aforementioned PowerDesigner and ERwin, while also taking a look at new approaches in the real-world application of data modeling at the enterprise level.
Sybase PowerDesigner 16 Adds Information Architecture Modeling
PowerDesigner is an esteemed veteran among data modeling tools. Sybase claims the new version of PowerDesigner sports the world’s first integrated Information Architecture functionality. Version 16 of the modeling application saw its release earlier this year, with a host of other new features sure to please data modelers throughout the industry.
PowerDesigner 16 comes in three different editions, subtitled Data Architect, Information Architect, and Enterprise Architect. The latter two contain tools for Information Architecture modeling, including data movement models, conceptual data models, and business process models. PowerDesigner Enterprise Architect goes even further, including modeling functionality for object-oriented models, as well as EA models and frameworks.
In addition to its modeling features, PowerDesigner 16 includes functionality to assist an organization in implementing Data Governance best practices by helping to manage the flow of information throughout the enterprise. The tool also allows for the easy capturing and sharing of corporate Metadata.
Other new features in PowerDesigner 16 include support for over 80 different databases, the most in the industry. The role-based user interface leverages Microsoft’s Windows 7 design standards. An Enterprise Glossary feature assists in the development and management of corporate ontologies. Finally, a thin-client portal interface facilitates the sharing of models within an organization.
All told, Sybase PowerDesigner 16 includes a host of attractive new features, with the additional Information Architecture functionality making the tool now worth a look for Enterprise Architects.
CA ERwin Hits the Cloud
Like a data modeling version of the perpetual Coke vs. Pepsi battle, PowerDesigner and ERwin continue to be the two dominant tools for modelers and data architects, each with its own boisterous collection of adherents. ERwin comes in a Standard Edition, suitable for most data modeling tasks. A free Community Edition limited to 25 objects per model is perfect for users who want to test drive the software. The Navigator Edition, with its free viewer, allows for easy distribution of models for review throughout an enterprise. For groups of modelers working as a team, ERwin Workgroup Edition provides repository, change management, and collaboration services functionality.
A new version of ERwin is aimed squarely at the Cloud. Microsoft Azure is the Redmond company’s Cloud computing service offering; ERwin Data Modeler for Microsoft SQL Azure is available either as a standalone product or as an add-on to an existing version of ERwin. The tool allows for easy visualization of an enterprise’s data resources, whether on-premises or Cloud-based. While focused on SQL Azure, this ERwin version also supports other databases, with the same 25-object limit as the free Community Edition unless it is purchased as an add-on to one of the other full ERwin editions, which allow fully defined models.
Another new ERwin product is the CA ERwin Web Portal, which appears to support a use case similar to that of the Navigator Edition, namely enterprise-wide read access to data models, but with a web-based interface. The tool features separate interfaces tailored to business and technical users, in addition to powerful visualization and reporting functionality.
The Web Portal comes in Standard and Enterprise editions. The Standard Edition supports only SQL Server Express, which makes it suitable only for evaluating the Enterprise Edition. The Enterprise Edition integrates with the same database servers supported by the full desktop versions of ERwin, as well as the Workgroup Edition’s Model Manager.
Agile Data Modeling
The Agile project management methodology has garnered much acclaim in the past few years as a good way to get work done quickly in the IT industry. An iterative process not that different from the Spiral Lifecycle methodology popular in the early 90s, Agile depends on both fast prototyping and interactive communication. Where does data modeling fit in this new project management paradigm?
At this year’s Enterprise Data World conference, Len Silverston of Universal Data Models LLC gave a tutorial focused on the Agile Data Modeling process. Silverston stresses that Agile Data Modeling means flexible, high-quality data models delivered as part of a fast, interactive process as opposed to quick and dirty modeling done on the fly. It is not an excuse to develop one model with an eye towards building a completely new version one month later.
Some basic principles for Agile modeling include defining correct models for their correct purposes, developing a quick “broad brush” data model, prioritization, reuse of universal patterns, and the acceptance of the human dynamics at play. The latter includes understanding people’s motivations as well as developing trust within the project team.
That last principle might be the most important. The Agile process in general depends on quality human interaction and understanding, whether in developing a conceptual model or an interactive feature on a website. The Agile Manifesto states: “The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.”
Data Modeling and UML
Unified Modeling Language (UML) is a standard modeling language widely used in the software development industry. Though normally seen when modeling activities and actors with use-case diagrams, business processes with state diagrams, or software components, UML is also useful for data and object modeling. In fact, UML incorporates many techniques inspired by data modeling, specifically ER diagrams.
Modelsoft Consulting Corporation’s Michael Blaha gave a presentation at EDW on the use of UML when data modeling. The UML class model is perfect for the initial conceptual data models seen at the beginning of a project. Since it abstracts the underlying details of the database design, the class model is great for sharing with the business stakeholders on a project.
While UML, in some cases, overemphasizes programming language constructs within modeling, it remains useful for communicating conceptual data model designs to developers. Once data models become more granular, however, most database designers prefer a more detailed modeling style.
Blaha also explored the similarities between UML and the notation used in the Information Engineering (IE) methodology. He used the comparison to explain how UML describes things typical of database designs, like attributes and one-to-many relationships. Blaha feels that UML is well-suited for front-end modeling, while IE or other modeling styles are better suited to the more detailed data design. He also feels UML is not appropriate for the modeling of data warehouses.
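As a rough illustration of what a class model captures, the sketch below expresses attributes and a one-to-many relationship using Python dataclasses. The Customer and Order classes and their fields are hypothetical examples chosen for this sketch, not taken from Blaha's presentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    # Attributes of the "many" side of the relationship (hypothetical).
    order_id: int
    total: float


@dataclass
class Customer:
    # Attributes of the "one" side of the relationship (hypothetical).
    customer_id: int
    name: str
    # One customer holds zero or more orders: a one-to-many relationship.
    orders: List[Order] = field(default_factory=list)

    def add_order(self, order: Order) -> None:
        """Attach an order to this customer."""
        self.orders.append(order)


customer = Customer(customer_id=1, name="Acme")
customer.add_order(Order(order_id=100, total=25.0))
customer.add_order(Order(order_id=101, total=40.0))
```

At this level the model says nothing about tables, keys, or indexes, which is the kind of abstraction that makes a conceptual model easy to share with business stakeholders.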
An Elegant Approach to Enterprise Modeling
Dave McComb of the boutique IT consultancy Semantic Arts emphasized a simplified, elegant approach to Enterprise Modeling in his EDW 2012 presentation. Complexity, in most forms, brings its own set of costs to any IT project: increased scale, interaction difficulties, and so on.
McComb presented a new concept, diseconomy of scale, as a twist on economies of scale, the principle describing the cost benefits achieved by companies as they continually expand. He used this to contrast the cases where increased complexity is good (number of users, amount of data) with the cases where it is bad (lines of code, tables in a schema).
Both table proliferation within applications and the dependence on SaaS or off-the-shelf software packages lead to increased complexity that adversely affects the bottom line. In many cases, redundancies in the data models of both types of applications highlight and exacerbate the problem. The solution lies in reducing the complexity in these shared schemas.
“Elegance” is an approach for simplifying model complexity in shared schemas. In many large organizations, there may be 100,000 to 1,000,000 elements residing in the collective data models of the firm’s full application inventory. But those one million data elements probably describe only one to two thousand essential concepts.
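The gap between element count and concept count can be pictured with a small sketch. The application names, element names, and concepts below are invented for illustration and do not come from McComb's presentation; the code simply groups application-specific elements under the shared concept each one describes.

```python
from collections import defaultdict

# Hypothetical (application, element) pairs mapped to a shared concept.
ELEMENT_TO_CONCEPT = {
    ("crm", "cust_name"): "Person Name",
    ("billing", "customer_nm"): "Person Name",
    ("support", "client_name"): "Person Name",
    ("crm", "cust_addr"): "Postal Address",
    ("billing", "bill_to_address"): "Postal Address",
}


def group_by_concept(mapping):
    """Group application-specific elements under the concept they describe."""
    groups = defaultdict(list)
    for element, concept in mapping.items():
        groups[concept].append(element)
    return dict(groups)


groups = group_by_concept(ELEMENT_TO_CONCEPT)
element_count = len(ELEMENT_TO_CONCEPT)  # redundant, application-level elements
concept_count = len(groups)              # distinct underlying concepts
ratio = element_count / concept_count    # average elements per concept
```

In this toy mapping, five elements collapse into two concepts. The figures quoted above suggest the ratio in a large organization could be in the hundreds.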
Paring down that shared schema redundancy leads to cost benefits from better model understanding, fewer interfaces and processes to manage data, and simplified project management. McComb showed how Elegance provided tangible benefits in the field at Sallie Mae, Procter & Gamble, and LexisNexis.
Semantic technology plays an important role in this process by helping to promote reuse of similar concepts through the understanding of both the similarities and differences within models. Focusing on core “real world” concepts serves as a common denominator between differing application views, helping to create an upper ontology to describe nearly everything done within a business. The artifact of this work is a simplified enterprise ontology with an associated reduction in redundancy throughout the data models in the systems of an organization.
Data modeling continues to mature as an IT discipline, with modeling tools adding new features to support the growing needs of the enterprise, and new concepts within the practice providing opportunities for improved organizational efficiencies.