by Doug Stacey
If you are anything like me, as a data professional you’ve experienced many frustrations in your career due primarily to things that have occurred or choices that were made long before you were around (over which you had no influence and with which you must now live). However, with the recent popularity of service oriented application development, we now have a chance to lay a solid foundation when it comes to providing data services: one that will eliminate ambiguity around sourcing and meaning, one that provides a framework for consistently applying data management principles, and one that will enable the promise of a service oriented architecture (SOA).
What This Is About
Service orientation is emerging as ‘the’ way to provide a platform-independent, loosely-coupled business function that enables a quick response to changing business demands. What’s not to like? From a data perspective, though, there are some basic tenets to be followed when creating the interface for services that deal with data. I liken it to doing a good logical design before you dive in and create the physical structures.
It is important that we recognize the need to take an interface-based development approach. Some refer to this as ‘contract first.’ All a consumer of your service is going to see is this interface, so it is important that it be well defined and stable. In effect you are making a contract with the consumer and that contract is defined by the interface. More about that later.
Why Do We Need It?
The alternative to explicitly designing the interface to your service is to let it be driven by the application development process. There are many development frameworks for web services. Their focus is on methods and parameters, not messages and elements. During the compilation of an object, a developer can choose to generate an interface that will be based on the structure of its methods. This is referred to as ‘code first’ and, while easy for a developer, it is not what you want from a data management perspective. The content of the interface becomes directly dependent on the methods exposed by the object. Change one of those methods and re-compile and you’ve just changed your contract with all of your consumers.
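To make the ‘code first’ risk concrete, here is a minimal Python sketch. It assumes a toy generator, derive_contract, that builds an interface description straight from a method signature, much as a web service framework might. The function names and parameters are hypothetical and are not taken from any particular framework.

```python
import inspect


def derive_contract(func):
    """Build a 'code first' interface description from a function signature."""
    return list(inspect.signature(func).parameters)


# Hypothetical first version of a developer's method.
def get_customer_v1(customer_id, include_address):
    return {}


# The same method after a routine refactoring.
def get_customer_v2(customer_key, region, include_address):
    return {}


contract_before = derive_contract(get_customer_v1)
contract_after = derive_contract(get_customer_v2)

# The generated interface changed even though nobody set out to change it.
contract_broken = contract_before != contract_after
print("before:", contract_before)
print("after: ", contract_after)
print("contract changed:", contract_broken)
```

Because the interface is a by-product of the code, every consumer inherits the refactoring. With a contract-first approach the published interface would stay fixed and only the mapping behind it would change.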
Again, this is not unlike a developer creating whatever data structures they need to accomplish their objective, then presenting them to the Data Administrator or DBA to be implemented ‘as is.’ They don’t have the broader, enterprise perspective that a Data Administrator or data management professional does.
When creating a service that accesses data, a developer is likely to focus on what needs to be retrieved to satisfy their short term need. If they create a service that retrieves attributes A, B, C, and D because that’s what they need, they’ll be happy and on their way. The next project that comes along may need attributes A, B, E, and F. I’m sure you can see where this is going. In a matter of a few projects you could have three to five services that could retrieve attributes A and B. Now multiply that out by 10 years. Which service is correct? Which service should be used as the authoritative source when the next project comes along? In no time at all this will result in what many data professionals today would refer to as ‘the mess’ of 20 or 30 years of application development without strong policies and governance in place.
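The overlap described above is easy to see once the services are listed side by side. The following sketch assumes a hypothetical inventory of three project-driven services and reports which attributes have more than one provider. The service names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical inventory: each project-driven service and the attributes it returns.
services = {
    "project1_service": {"A", "B", "C", "D"},
    "project2_service": {"A", "B", "E", "F"},
    "project3_service": {"A", "B", "G"},
}


def find_redundant_attributes(inventory):
    """Return every attribute that more than one service can retrieve."""
    providers = defaultdict(list)
    for service_name, attributes in inventory.items():
        for attribute in attributes:
            providers[attribute].append(service_name)
    return {
        attribute: sorted(names)
        for attribute, names in providers.items()
        if len(names) > 1
    }


redundant = find_redundant_attributes(services)
for attribute in sorted(redundant):
    print(attribute, "is served by", redundant[attribute])
```

Each attribute with several providers is a candidate for the "which service is authoritative?" question raised above.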
Developers are just doing what they are tasked to do by the enterprise: deliver a project as quickly as possible. The responsibility to prevent these types of situations belongs to the senior IT management, and ultimately the business. IT management should have the vision required to see that expedited delivery on projects without proper planning will lead to greatly increased costs in future maintenance and new project development. The business must be prepared to put in a little extra effort, and the resulting incremental expense, to craft and implement a strategy that will prevent the repetition of these types of mistakes.
What Are Data Services?
Data Services provide and update data in a service oriented architecture. They do not contain application logic; they do not implement process automation or workflow; they simply provide information.
A well-defined interface for a data service is one that has been written to specifications that can be understood and trusted by consumers. It is based on XML Schema Definition (XSD), which defines the data contract, that is, the precise format of the XML messages carried over SOAP, with Web Services Description Language (WSDL) layered over the XSD. This isn’t intended to be a technical discussion of data services so I’ll go no further than that in terms of a definition. Plenty of information can be had by searching the web on the terms above.
What Are Their Advantages?
The advantages of using data services are many. They can actually help us deal with those sins of the past. Consider a case where you have conflicting or redundant sources or systems of record for a given subject area. Once you provide an authoritative data service to present that data, you now have an abstraction layer behind which you can rationalize physical sources without affecting all of your consumers.
They can also provide a platform on which you can enforce data management policies. If a translation needs to be done to align coding structures that have evolved independently over the years, that can happen in the data service. Data cleansing is possible here as well, security rules can be applied, and auditability can be accomplished if required. Many things are possible when you’ve consolidated access to a common service.
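As a rough illustration of policy enforcement inside a data service, the sketch below translates source-specific status codes to a single enterprise value, applies a trivial cleansing step, and records each request for audit. The systems, code values, and function name are all hypothetical.

```python
# Hypothetical code sets that evolved independently in two legacy systems.
CODE_MAPS = {
    "billing": {"A": "ACTIVE", "I": "INACTIVE"},
    "crm": {"1": "ACTIVE", "0": "INACTIVE", "9": "INACTIVE"},
}

# In-memory stand-in for an audit trail.
audit_log = []


def get_customer_status(source_system, raw_code, requested_by):
    """Return the enterprise-standard status for a source-specific code."""
    code_map = CODE_MAPS[source_system]
    cleaned = raw_code.strip().upper()  # basic cleansing before translation
    if cleaned not in code_map:
        raise ValueError(f"unknown {source_system} status code: {raw_code!r}")
    standard = code_map[cleaned]
    audit_log.append((requested_by, source_system, standard))  # who asked for what
    return standard


print(get_customer_status("billing", " a ", "claims_app"))
print(get_customer_status("crm", "9", "marketing_app"))
print(len(audit_log), "requests audited")
```

Consumers only ever see ACTIVE or INACTIVE. The legacy codes, and any later cleanup of them, stay behind the service.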
Designing The Interface
In order to understand what interface you should provide via your data services, it’s back to the basics. Create a good logical model of the subject area without consideration of what services or means are required to retrieve the data, what methods they may expose, or how the data is physically stored. This model should represent the data required by any of your consumers. You can change it over time if new requirements arise, but typically it will be very stable. Make sure all elements of the model are clearly defined, as semantic ambiguity can and will cause problems for consumers.
Once the model is complete, you simply break up the access to the data it represents into like components. For instance, if the subject area you are dealing with is ‘customers,’ almost every consumer will want basic identifying information. You may find, however, that some will only be interested in billing and payment history, while others may be interested in interactions with customer service, and yet others in products purchased. The service itself can be designed broadly, with the consumer requesting which sections of the model they require or, alternatively, more discrete services that just present specific sections can be created.
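The two design options can be sketched side by side. In the following Python example the customer model, its section names, and the sample values are invented for illustration. get_customer is the broad service that takes a list of requested sections, and get_customer_billing is a narrower service built over the same model.

```python
# Hypothetical logical model for one customer, split into like components.
CUSTOMER_MODEL = {
    "identity": {"customer_id": 1001, "name": "Pat Example"},
    "billing": {"balance_due": 25.0, "last_payment_date": "2024-01-15"},
    "service_interactions": [{"date": "2024-02-01", "topic": "address change"}],
    "products": ["auto policy", "home policy"],
}


def get_customer(sections=()):
    """Broad service: identity is always returned, other sections on request."""
    unknown = set(sections) - set(CUSTOMER_MODEL)
    if unknown:
        raise ValueError(f"unknown sections: {sorted(unknown)}")
    result = {"identity": CUSTOMER_MODEL["identity"]}
    for section in sections:
        result[section] = CUSTOMER_MODEL[section]
    return result


def get_customer_billing():
    """Discrete service: presents only identity plus the billing section."""
    return get_customer(sections=("billing",))


print(sorted(get_customer(("products", "service_interactions"))))
print(get_customer_billing())
```

Either shape exposes the same logical model, so the choice is mainly about how much flexibility consumers need versus how simple each call should be.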
Either way, at this point you’ve made a ‘contract’ with your consumers. Without concern for physical storage or representation they can now consume the information from your service to accomplish their business process. Short of major new requirements, it should not be common to have to change this contract.
On the back end, then, you’ll map the data model to the physical data. This could require access to one or more objects or methods to be able to fulfill the interface requirements. It could even stretch across data stores and platforms with various legacy access techniques.
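A minimal sketch of that back-end mapping follows. It assumes two hypothetical physical stores, represented here as plain dictionaries, with legacy field names. The MAPPING table ties each logical attribute to a store, a physical field, and a type conversion.

```python
# Hypothetical physical stores with legacy field names and formats.
POLICY_DB = {1001: {"CUST_NM": "Pat Example", "CUST_ST_CD": "IL"}}
BILLING_FILE = {1001: {"BAL_AMT": "25.00"}}

# Logical attribute -> (physical store, physical field, converter).
MAPPING = {
    "name": (POLICY_DB, "CUST_NM", str),
    "state": (POLICY_DB, "CUST_ST_CD", str),
    "balance_due": (BILLING_FILE, "BAL_AMT", float),
}


def fetch_customer(customer_id):
    """Assemble the logical customer record from the physical stores."""
    record = {"customer_id": customer_id}
    for logical_name, (store, physical_name, convert) in MAPPING.items():
        record[logical_name] = convert(store[customer_id][physical_name])
    return record


print(fetch_customer(1001))
```

If a store is later consolidated or replaced, only the MAPPING table and the store access change. The record returned to consumers keeps the same logical attribute names.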
What to Watch Out for
Lack of data quality will certainly be a barrier to the adoption of your data services. Lack of semantic integrity across applications will also lead to problems. This should be resolved by the logical model, where the meaning of every attribute should be well documented. If, however, you are serving up data from one application and another application defines the data differently, satisfaction with the solution will be low.
As can often be the case when it comes to data management, packaged applications can also provide a wrinkle in your plans. Without access to their source code, you will likely not be able to make them consume your services. If a package is the system of record for the information you need to provide, hopefully it will be open enough to allow you access to its data model and the underlying data structures.
Lastly, the adoption rate of SOA in your organization could be an impediment. Your organization needs to be committed to moving forward with a service oriented architecture in order for your data service implementation to be a success.
ABOUT THE AUTHOR
Doug is currently a Data Architect in the Information Architecture group at Allstate Insurance Co. He has been in the field of data and data management for over 25 years. He has spoken at various conferences both nationally and internationally, has published numerous articles in trade publications, and has served as Technical Editor and Board of Directors member for the International DB2 Users Group (IDUG). More recently, Doug helped lead Allstate to two Wilshire Awards: High Commendation for Best Meta Data Practices and Outstanding Award for Warehouse Metadata Implementation.